next up previous http://data.fas.harvard.edu/jsekhon/pics/home.gif
Next: The Bootstrap Up: The GENetic optimization and Previous: The GENetic optimization and

The Optimization Problem

The gradient-based optimization algorithms most often used with structural equation models (Levenberg-Marquardt, Newton-Raphson, quasi-Newton) are inadequate because they too often fail to find the global maximum of the likelihood function. The discrepancy function is not globally convex. Multiple, local minima and saddlepoints often exist, so that there is no guarantee that gradient-based methods will converge to the global maximum. Indeed, saddlepoints and other complexities in the curvature of the likelihood function can make it difficult for gradient-based optimization methods to find any maximum at all. Such difficulties are intrinsic to linear structure models for two reasons. First, the LISREL likelihood is not globally concave. Second, linear structure models' identification conditions do not require and do not guarantee that the model (as a function of the data) will determine a unique set of parameter values outside a neighborhood of the true values. The derivatives of the likelihood function with respect to the parameters are not well defined outside of the neighborhood of the solution. Therefore, outside of the neighborhood of the solution, derivative based methods often have little or no information upon which to advance to the global maximum.

Bootstrap methodology accentuates optimization difficulties, because the bootstrap resampling distribution draws from the entire distribution of the parameter estimates. Even if optimization in the original sample is not problematic, one can expect to encounter difficulties in a significant number of bootstrap resamples. Even if the model being estimated is correctly specified, problematic resamples contain crucial information about the tails of the distribution of the parameter estimates. Indeed, what bootstrap methods primarily do is make corrections for skewness--for asymmetry between the tails of the distribution of each parameter estimate--that is ignored by normal-theory confidence limit estimates. Tossing out the tail information basically defeats the purpose of using the bootstrap to improve estimated confidence intervals. In general, any procedure of replacing problematic resamples with new resampling draws until optimization is easy must fail, as making such replacements would induce incomplete coverage of the parameter estimates' sampling distribution and therefore incorrect inferences. Ichikawa and Konishi (1995) make this mistake.

Because the nonexistence of good MLEs in bootstrap resamples is evidence of misspecification and because the occurrence of failures affects the coverage of the bootstrap confidence intervals, it is crucial to use an optimization method that finds the global minimum of the discrepancy function if one exists. In order to overcome the problems of local minima and nonconvergence from poor starting values, GENBLIS combines a gradient-based method with an evolutionary programming (EP) algorithm. Our EP algorithm uses a collection of random and homotopy search operators that combine members of a population of candidate solutions to produce a population that on average better fits the current data. Nix and Vose (1992; Vose 1993) prove that genetic algorithms are asymptotically correct, in the sense that the probability of converging to the best possible population of candidate solutions goes to one as the population size increases to infinity. Because they have a similar Markov chain structure, EP algorithms of the kind we use are asymptotically correct in the same sense. For a linear structure model and a data set for which a good MLE (global minimum) exists, the best possible population is the one in which all but a small fraction of the candidate solutions have that value. A fraction of the population will have different values because the algorithm must include certain random variations in order to have effective global search properties. The probability of not finding a good MLE when one exists can be made arbitrarily small by increasing the population size used in the algorithm.

The EP is very good at finding a neighborhood of the global minimum in which the discrepancy function is convex. But the search operators, which do not use derivatives, are quite slow at getting from an arbitrary point in that neighborhood to the global minimum value. We add the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton optimizer as an operator to do the final hill-climbing. We developed and implemented a general form of this EP-BFGS algorithm in a C program called Genetic Optimization Using Derivatives (GENOUD) . GENBLIS is a version of GENOUD specifically tuned to estimate linear structure models.

In our experience the program finds the global minimum solution for the LISREL estimation problem in all cases where the most widely used software fails, except where extensive examination suggests that a solution does not exist.


next up previous http://data.fas.harvard.edu/jsekhon/pics/home.gif
Next: The Bootstrap Up: The GENetic optimization and Previous: The GENetic optimization and
Jas S. Sekhon
1998-08-25