Next: Program Syntax Up: The GENetic optimization and Previous: The Optimization Problem

The Bootstrap

It is not necessary to rely on asymptotic normality when using sample statistics to make inferences about linear structure model parameter values or to test whether particular model specifications are adequate. MLEs derived using the multivariate normal-based likelihood do not require any particular conditions on third or higher moments in order to be consistent. All that is needed is that the parameters of interest are identified and the model is in the relevant sense correct--i.e., the sample covariance matrix converges in probability to the covariance matrix generated by the model (Anderson 1989, 97). In particular, for such MLEs to be consistent it is not necessary that fourth cumulants vanish asymptotically as is required for the normal-theory distributional results. So it is reasonable to think of using the multivariate normal-based MLEs to obtain parameter estimates while replacing the normal-theory methods for assessing the estimates' precision and the model's trustworthiness.

Bootstrap methods allow such assessments to be made given conditions much weaker than those needed for the normal-theory results to be reliable. While the normal-theory approach requires asymptotically vanishing fourth cumulants, for the bootstrap it is sufficient to have finite fourth moments (Beran and Srivastava 1985). In cases where the normal-theory approach would validly apply, bootstrap estimates of confidence intervals for the model parameters can be more accurate than estimates obtained in the usual way by inverting the expected information matrix. The better bootstrap methods achieve a higher order of accuracy by correcting for skewness phenomena (Hall 1992, 92-108) that can be problematic in small or moderately large samples even for fully multivariate normal data. So in using an appropriate bootstrap method one will not do worse than one would using normal-theory methods, and in general the bootstrap results will be better.

GENBLIS uses the BCa method (Efron 1987) to estimate parameter confidence intervals. In the context of linear structure models, Efron's BCa method is better than the also highly accurate percentile-t method (Hall 1992, 15). The percentile-t method requires accurate estimates of the variance matrix of the parameter estimates. Such estimates can be difficult to obtain for the linear structure model if one begins with doubts that the data are strictly multivariate normal or doubts about the adequacy of the normal-theory approximations. For the BCa method no such estimates are needed. Because the same covariance structure can in general be parameterized in a number of different ways, it is also important that the BCa method is "transformation-respecting" (Hall 1992, 128; Efron and Tibshirani 1993, 162). This property reduces the chances that formally inconsequential changes in the way a model is expressed will produce spurious variations in the accuracy with which model parameters appear to be estimated.
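The following is a minimal sketch of a BCa interval, included only to show the mechanics of the method. It is not GENBLIS code. It assumes a simple scalar statistic (the sample mean) rather than a linear structure model parameter, and the names `bca_interval`, `stat`, `n_boot`, and `seed` are hypothetical. It follows the usual textbook construction, with a bias-correction term computed from the bootstrap replicates and an acceleration term computed from jackknife values. Note that no estimate of the statistic's variance is required at any step.

```python
import math
import random
from statistics import NormalDist, mean

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and quantiles


def bca_interval(data, stat, alpha=0.05, n_boot=2000, seed=0):
    """Illustrative two-sided BCa interval for stat(data); data is a list."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = stat(data)

    # Bootstrap replicates: recompute the statistic on resamples drawn
    # with replacement from the original data.
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))

    # Bias correction z0: normal quantile of the share of replicates that
    # fall below the original estimate (clamped away from 0 and 1).
    share = sum(r < theta_hat for r in reps) / n_boot
    share = min(max(share, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = _STD_NORMAL.inv_cdf(share)

    # Acceleration from the leave-one-out (jackknife) values.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = mean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def endpoint(level):
        # Map the nominal level to an adjusted level, then read off the
        # corresponding empirical quantile of the sorted replicates.
        z = _STD_NORMAL.inv_cdf(level)
        adjusted = _STD_NORMAL.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))
        idx = min(n_boot - 1, max(0, math.ceil(adjusted * n_boot) - 1))
        return reps[idx]

    return endpoint(alpha / 2), endpoint(1 - alpha / 2)


# Hypothetical right-skewed sample, for illustration only.
sample = [0.3, 0.5, 0.7, 0.9, 1.1, 1.2, 1.4, 1.8, 2.3, 2.9,
          3.6, 4.8, 6.5, 9.1, 12.4]
lower, upper = bca_interval(sample, mean)
print("95% BCa interval for the mean:", lower, upper)
```

Because the endpoints are quantiles of the bootstrap replicates at adjusted levels, applying a monotone transformation to the statistic transforms the endpoints in the same way, which is the transformation-respecting property described above.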

In addition to confidence intervals for model parameter estimates, GENBLIS also reports upper confidence limits for the likelihood kernel statistic $F_{\text{ML}}$ (this is the discrepancy function, defined below). Under normal theory with $p$ observed variables in a sample of size $n$, the statistic $(n-1)F_{\text{ML}}$ has asymptotically a $\chi^2$ distribution with $p(p+1)/2-k$ degrees of freedom for a correctly specified model that has $k$ free parameters.

GENBLIS reports two sets of BCa confidence intervals and confidence limits. The first set is computed using results only from the bootstrap resamples in which the optimization converged successfully; the resamples in which convergence failed are simply ignored. To ignore the convergence failures is flagrantly erroneous, however. If the specification is correct, a good optimum exists in almost every sample of data, and in all but a vanishingly small proportion of the bootstrap resamples. A high proportion of convergence failures is therefore evidence of misspecification that should not be ignored.
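To make the degrees-of-freedom count in the normal-theory result concrete, consider a purely hypothetical model with $p=6$ observed variables, $k=13$ free parameters, and a sample of size $n=200$. These numbers are assumptions chosen for illustration and do not come from any GENBLIS example. The sample covariance matrix has $p(p+1)/2$ distinct elements, so

$$
\frac{p(p+1)}{2}-k=\frac{6\cdot 7}{2}-13=21-13=8,
$$

and under normal theory the statistic $(n-1)F_{\text{ML}}=199\,F_{\text{ML}}$ would be referred to a $\chi^2$ distribution with 8 degrees of freedom.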

Perhaps the most important feature of GENBLIS is that the power of the optimization algorithm makes it unreasonable to presume that a large number of convergence failures simply reflects the vagaries of numerical computing. By increasing the number of search operators and the number of generations in the EP algorithm, and by manipulating the region over which the algorithm is allowed to search, one can eliminate virtually all instances of convergence failures that are due to having a draw of data that is simply difficult numerically. Having in that way greatly reduced the proportion of convergence failures that are nothing but numerical artifacts, one can get useful information from the failures that remain.

GENBLIS uses the proportion of convergence failures in the set of bootstrap resamples to adjust the BCa results. These adjusted results are the second set of BCa confidence intervals and confidence limits that GENBLIS reports. The technique is to reduce the specified test level (or size) by the proportion of bootstrap resamples in which convergence failed. If the proportion of failures exceeds the specified level, then the model specification is judged to be incorrect at the specified level and GENBLIS reports that confidence intervals and limits cannot be computed.

The motivation for this method is easiest to see in the case of confidence limits used to test goodness of fit. The null hypothesis is that the model fits the data. Let $\tilde{F}_{\text{ML}}$ denote the value obtained in the original sample when $F_{\text{ML}}$ is computed using the maximum likelihood parameter estimates, and let $\alpha<0.5$ be the intended test level. Under the null hypothesis, there is a critical value $F_{\text{ML},1-\alpha}$ such that $\text{Pr}(\tilde{F}_{\text{ML}}>F_{\text{ML},1-\alpha})=\alpha$. If $\tilde{F}_{\text{ML}}>F_{\text{ML},1-\alpha}$ is observed, the null hypothesis is rejected and the model is judged not to fit the data. If convergence occurs in all bootstrap resamples, a bootstrap estimate $F^*_{\text{ML},1-\alpha}$ for $F_{\text{ML},1-\alpha}$ is readily computable: if $\tilde{F}_{\text{ML}}>F^*_{\text{ML},1-\alpha}$, the model is judged not to fit the data.

Suppose instead that convergence fails in a proportion $\beta>0$ of the bootstrap resamples. If $\beta>\alpha$, then the proportion of poor results (the convergence failures) already exceeds the specified test level, which is the proportion of poor results that should occur if the null hypothesis is true. There is therefore no need to compute $F^*_{\text{ML},1-\alpha}$ from the remaining bootstrap resamples. Indeed, with the model specification having already so clearly failed, it is not clear how $F^*_{\text{ML},1-\alpha}$ can be defined.

If $\beta<\alpha$, then the convergence failures can be treated as having generated infinitely large $F_{\text{ML}}$ values. These values occupy the top fraction $\beta$ of the bootstrap distribution, so the $1-\alpha$ quantile of the full set of resamples is the $1-\alpha^*$ quantile of the successfully converged resamples, where $\alpha^*$ satisfies $(1-\beta)(1-\alpha^*)=1-\alpha$. To estimate $F_{\text{ML},1-\alpha}$ from the converged resamples, therefore, use the adjusted value $\alpha^*=(\alpha-\beta)/(1-\beta)$ to find $F^*_{\text{ML},1-\alpha^*}$. The adjusted value $\alpha^*$ is the basis for the second set of confidence intervals and limits that GENBLIS reports.
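The adjustment can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GENBLIS code: the function names `adjusted_level` and `upper_limit` are hypothetical, a convergence failure is represented by `None`, and a plain empirical quantile stands in for the BCa computation that GENBLIS applies at the adjusted level. How GENBLIS handles the boundary case $\beta=\alpha$ is not stated in the text; the sketch treats it like $\beta<\alpha$, which gives $\alpha^*=0$.

```python
import math


def adjusted_level(alpha, beta):
    """Return alpha* = (alpha - beta) / (1 - beta), or None if beta > alpha."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if beta > alpha:
        # Failures alone exceed the test level: the specification is judged
        # incorrect and no limit is computed.
        return None
    return (alpha - beta) / (1.0 - beta)


def upper_limit(f_values, alpha):
    """Upper limit for F_ML from bootstrap values.

    f_values holds one F_ML value per resample, with None marking a
    resample in which the optimization failed to converge.
    """
    converged = sorted(v for v in f_values if v is not None)
    beta = (len(f_values) - len(converged)) / len(f_values)
    a_star = adjusted_level(alpha, beta)
    if a_star is None:
        return None
    # Empirical (1 - alpha*) quantile of the converged values. The product
    # is rounded before taking the ceiling to absorb floating-point noise.
    rank = math.ceil(round((1.0 - a_star) * len(converged), 9))
    return converged[max(rank, 1) - 1]


# Hypothetical example: 100 resamples, 2 of which failed to converge.
values = [i / 10 for i in range(1, 99)] + [None, None]
print(adjusted_level(0.05, 0.02))
print(upper_limit(values, 0.05))
```

The same limit is obtained by replacing each failure with an infinitely large value and taking the $1-\alpha$ quantile of all resamples, which is the equivalence the adjustment relies on.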


Jas S. Sekhon
1998-08-25