balanceMV {Matching} | R Documentation |

This function conducts the Kolmogorov-Smirnov tests for balance. The function does no matching. Matching is performed by the `Match` function. `balanceMV` is used to determine if `Match` was successful in achieving balance. Multivariate balance is determined by the use of a model. This approach can be used regardless of the algorithm used to do the original matching. For example, even if `Match` was told NOT to use a propensity score, `balanceMV` can be used to test balance. Output can be summarized by using the `summary.balanceMV` function. Generally, users should call `MatchBalance` and not this function directly.

balanceMV(formul, data = NULL, match.out=NULL, maxit = 1000, weights = rep(1,nrow(data)), nboots=100, nmc=nboots, print.level=0, ...)

`formul` |
A formula denoting the model for which balance should be determined. This model will be estimated by a binary logistic estimator. The dependent variable in the formula is usually the treatment indicator. Univariate balance tests will be conducted for each of the regressors included in this model, and the multivariate tests will be conducted on the predicted probabilities of treatment for both treated and control units based on this formula. The densities of the predicted probabilities for treated and control units should be indistinguishable if balance has been achieved. Note that the model defined by this formula is estimated separately for the matched and unmatched datasets. |

`data` |
A data frame which contains all of the variables in the formula. If a data frame is not provided, the variables are obtained via lexical scoping. |

`match.out` |
The output object from the `Match`
function. If this output is included, `balanceMV` will provide
balance tests for both before and after matching. Otherwise
balance tests will only be conducted for the raw data. |

`maxit` |
The maximum number of iterations for the glm logistic procedure. |

`weights` |
A vector of observation weights. |

`nboots` |
The number of bootstrap samples to be run. If zero, no
bootstraps are done. Bootstrapping is highly recommended because
the Kolmogorov-Smirnov test only provides correct coverage when
bootstrapped due to the existence of nuisance parameters. At least
500 `nboots` (preferably 1000) are recommended for publication
quality p-values. Also see the `nmc` option. |

`nmc` |
The number of Monte Carlo simulations to be conducted for
each Kolmogorov-Smirnov test calculated. Monte Carlo simulations
are highly recommended because the usual Kolmogorov-Smirnov test is not
consistent when the densities being compared contain point masses. At least
500 `nmc` (preferably 1000) are recommended for publication
quality p-values. Also see the `nboots` option. |

`print.level` |
The amount of printing to be done. If zero, there is no printing. If two, details are printed, such as the number of the bootstrap sample currently being estimated. |

`...` |
Further arguments passed to `glm.control`. |
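
The resampling idea behind the `nboots` and `nmc` recommendations can be illustrated with a small base R sketch. It pools two samples, redraws them with replacement, and counts how often the resampled KS statistic is at least as large as the observed one, which gives a p-value that tolerates ties and point masses. The helpers `ks_stat` and `boot_ks_pvalue` are hypothetical names introduced here for illustration; this is not the package's own implementation, and it does not re-estimate a logit model inside each replication.

```r
# Hypothetical helper: two-sample KS statistic, i.e. the largest gap between
# the two empirical CDFs, evaluated at every observed value.
ks_stat <- function(a, b) {
  grid <- sort(unique(c(a, b)))
  max(abs(ecdf(a)(grid) - ecdf(b)(grid)))
}

# Hypothetical helper: pooled-resampling p-value for the KS statistic.
# nboots must be at least 1.
boot_ks_pvalue <- function(a, b, nboots = 500) {
  observed <- ks_stat(a, b)
  pooled   <- c(a, b)
  na       <- length(a)
  exceed   <- 0
  for (i in seq_len(nboots)) {
    # Resample from the pooled data, then split into two pseudo-groups
    # with the original group sizes.
    draw <- sample(pooled, length(pooled), replace = TRUE)
    stat <- ks_stat(draw[seq_len(na)], draw[-seq_len(na)])
    if (stat >= observed) exceed <- exceed + 1
  }
  exceed / nboots
}

# Discrete (Poisson) data contain many ties, the case where the usual
# KS p-value is unreliable.
set.seed(2)
a <- rpois(100, 2)
b <- rpois(120, 2)
p <- boot_ks_pvalue(a, b, nboots = 200)
```

With only 200 replications the p-value is coarse; the argument descriptions above recommend at least 500 for publication-quality results.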

The function can be used to determine if matching was successful in achieving balance. Two multivariate tests are provided: the Kolmogorov-Smirnov (KS) test and the Chi-Square null deviance test. The KS test is to be preferred over the Chi-Square test because the Chi-Square test does not test the relevant hypothesis. The null hypothesis for the KS test is that the estimated probabilities of treatment have the same distribution in the treated and control groups. The null hypothesis for the Chi-Square test, however, is that all of the parameters are insignificant; it is a comparison of residual versus null deviance. If the covariates being considered are discrete, this KS test is asymptotically nonparametric as long as the logit model does not produce zero parameter estimates. The bootstrap-Monte Carlo version of the KS test is highly recommended because the usual KS test is not consistent when there are point masses in the distributions being compared, and the bootstrap is needed because parameters are being estimated in the logit model.
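
As a rough illustration of the two multivariate statistics described above, the following base R sketch fits a logistic model of treatment on simulated covariates, runs a plain KS test on the fitted probabilities, and computes the Chi-Square comparison of null versus residual deviance. The data and every variable name (`x1`, `x2`, `treat`, `fit`, `p_hat`) are assumptions made for the example, and the plain `ks.test` call omits the bootstrap correction that the text recommends.

```r
# Simulated data; all names are illustrative, not taken from the package.
set.seed(1)
n  <- 400
x1 <- rnorm(n)
x2 <- rnorm(n)
# Treatment depends on x1, so the raw data are imbalanced by construction.
treat <- rbinom(n, 1, plogis(0.8 * x1))

# Binary logistic model of treatment on the covariates.
fit   <- glm(treat ~ x1 + x2, family = binomial)
p_hat <- fitted(fit)

# KS test: do the predicted probabilities have the same distribution
# in the treated and control groups?
ks <- ks.test(p_hat[treat == 1], p_hat[treat == 0])

# Chi-Square test: null deviance minus residual deviance, which tests
# whether all slope parameters are jointly zero.
chisq_stat <- fit$null.deviance - fit$deviance
chisq_df   <- fit$df.null - fit$df.residual
chisq_p    <- pchisq(chisq_stat, df = chisq_df, lower.tail = FALSE)
```

The KS p-value is available as `ks$p.value`; the Chi-Square test answers a different question (whether the covariates predict treatment at all), which is why the text prefers the KS test for assessing balance.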

`pval.kboot.unmatched` |
The bootstrap p-value of the Kolmogorov-Smirnov test for the hypothesis that the probability densities for both the treated and control groups are the same (unmatched data). |

`pval.kboot.matched` |
The bootstrap p-value of the Kolmogorov-Smirnov test for the hypothesis that the probability densities for both the treated and control groups are the same (matched data). |

`logit.unmatched` |
Return object from glm estimating the model
defined by the user in `formul` on the unmatched data. |

`logit.matched` |
Return object from glm estimating the model
defined by the user in `formul` on the matched data. |

`ks.unmatched` |
Return object from `ks.test` before matching. |

`ks.matched` |
Return object from `ks.test` after
matching. |

`data` |
A list containing five variables: a treatment indicator
for the unmatched (`Tr`) and the matched data
(`Tr.matched`), weights for the matched data used for the glm
models, `index.treated`, and `index.control`. The two
indexes are defined in `Match`. The user should never
need to examine this object. |

Jasjeet S. Sekhon, UC Berkeley, sekhon@berkeley.edu, http://sekhon.berkeley.edu/.

Sekhon, Jasjeet S. 2007. "Multivariate and Propensity Score
Matching Software with Automated Balance Optimization."
*Journal of Statistical Software*.
http://sekhon.berkeley.edu/papers/MatchingJSS.pdf

Sekhon, Jasjeet S. 2006. "Alternative Balance Metrics for Bias Reduction in Matching Methods for Causal Inference." Working Paper. http://sekhon.berkeley.edu/papers/SekhonBalanceMetrics.pdf

Diamond, Alexis and Jasjeet S. Sekhon. 2005. "Genetic Matching for Estimating Causal Effects: A General Multivariate Matching Method for Achieving Balance in Observational Studies." Working Paper. http://sekhon.berkeley.edu/papers/GenMatch.pdf

Abadie, Alberto. 2002. "Bootstrap Tests for Distributional Treatment
Effects in Instrumental Variable Models." *Journal of the
American Statistical Association*, 97:457 (March) 284-292.

Hall, Peter. 1992. *The Bootstrap and Edgeworth Expansion*. New
York: Springer-Verlag.

Wilcox, Rand R. 1997. *Introduction to Robust Estimation*. San
Diego, CA: Academic Press.

Conover, William J. 1971. *Practical Nonparametric Statistics*.
New York: John Wiley & Sons. Pages 295-301 (one-sample
"Kolmogorov" test), 309-314 (two-sample "Smirnov" test).

Shao, Jun and Dongsheng Tu. 1995. *The Jackknife and Bootstrap*.
New York: Springer-Verlag.

Also see `summary.balanceMV`, `MatchBalance`, `balanceUV`, `qqstats`, `ks.boot`, `Match`, `GenMatch`, `GerberGreenImai`, `lalonde`

    data(lalonde)

    #
    # direct matching on some variables
    #
    X  <- cbind(lalonde$re74, lalonde$re75, lalonde$age, lalonde$hisp, lalonde$black)
    Y  <- lalonde$re78
    Tr <- lalonde$treat
    rr <- Match(Y=Y, Tr=Tr, X=X, M=1)

    # multivariate test for balance
    # 'nboots' and 'nmc' are set to small values in the interest of speed.
    # Please increase to at least 500 each for publication quality p-values.
    ks <- balanceMV(treat ~ age + I(age^2) + educ + I(educ^2) + black + hisp +
                      married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
                      u74 + u75,
                    data=lalonde, match.out=rr, nboots=10, nmc=10)
    summary(ks)