This website is for the distribution of "Matching", an
R package for
estimating causal effects by multivariate and propensity score
matching. The package provides functions for multivariate and
propensity score matching and for finding optimal balance based on
a genetic search
algorithm. A variety of univariate and multivariate tests to
determine if balance has been obtained are also provided. These tests
can also be used to determine if an experiment or quasi-experiment is
balanced on baseline covariates.
Match() is the fastest multivariate and propensity score matching
function I know of. Maximum speed is achieved with the replace=FALSE
and/or ties=FALSE options; see the Match() help for details. But the
most reliable estimates are obtained with the default settings:
replace=TRUE and ties=TRUE.
GenMatch() supports the use of multiple computers, CPUs or cores to
perform parallel computations. Examples are provided both for using
multiple chips on the same computer and for using multiple computers.
A Change Log is available which tracks changes across versions.
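The trade-off between the default Match() settings and the faster replace=FALSE, ties=FALSE options can be tried out with a short sketch like the one below. It assumes the lalonde dataset shipped with the package; the propensity score specification and the object names (ps_model, m_default, m_fast) are illustrative choices, not a recommended analysis.

```r
library(Matching)
data(lalonde)

# Illustrative propensity score model (the specification is an assumption)
ps_model <- glm(treat ~ age + educ + black + hisp + married + nodegr +
                  re74 + re75,
                family = binomial, data = lalonde)

X  <- ps_model$fitted.values  # estimated propensity scores
Y  <- lalonde$re78            # outcome
Tr <- lalonde$treat           # treatment indicator

# Default settings: matching with replacement, tied matches all kept
# and weighted (replace=TRUE, ties=TRUE)
m_default <- Match(Y = Y, Tr = Tr, X = X, estimand = "ATT", M = 1)

# Faster settings: matching without replacement, ties broken at random
m_fast <- Match(Y = Y, Tr = Tr, X = X, estimand = "ATT", M = 1,
                replace = FALSE, ties = FALSE)

summary(m_default)
summary(m_fast)
```

Comparing the two summaries on your own data is one way to judge whether the speed gain is worth the change in the estimate.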
The easiest way to install the latest version is to type in
an R
session:
> install.packages("Matching", dependencies=TRUE)
Also, make sure that the latest version of rgenoud is
installed:
> install.packages("rgenoud")
Alternatively, the package may be directly downloaded from CRAN.
The package includes the following main user-facing functions, two
replication datasets and three demos:
GenMatch():
finds optimal balance using multivariate matching where a genetic
search algorithm determines the weight each covariate is given.
The user can choose which function of covariate balance to
optimize from a list or provide one of her own.
Match():
performs multivariate and propensity score matching.
MatchBalance():
provides a variety of univariate and multivariate tests to determine if
balance exists.
Matchby():
This function is a wrapper for the Match()
function which separates the matching problem into subgroups
defined by a factor. This function is much faster for large
datasets than the Match()
function itself.
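A rough sketch of how these functions are typically combined is shown below: GenMatch() searches for covariate weights, Match() uses them, and MatchBalance() checks the result. It assumes the lalonde dataset shipped with the package. The covariate set and the small pop.size, max.generations and wait.generations values are chosen only so that the example runs quickly; a real analysis would normally use a much larger population size.

```r
library(Matching)
data(lalonde)

# Covariates to match on (an illustrative choice)
X <- cbind(lalonde$age, lalonde$educ, lalonde$black, lalonde$hisp,
           lalonde$married, lalonde$nodegr, lalonde$re74, lalonde$re75)
Y  <- lalonde$re78
Tr <- lalonde$treat

# Genetic search for covariate weights; tiny search settings for speed only
gen <- GenMatch(Tr = Tr, X = X, BalanceMatrix = X, estimand = "ATT", M = 1,
                pop.size = 16, max.generations = 10, wait.generations = 1,
                print.level = 0)

# Match using the weights found by GenMatch()
m <- Match(Y = Y, Tr = Tr, X = X, estimand = "ATT", Weight.matrix = gen)
summary(m)

# Balance before and after matching on the same covariates
mb <- MatchBalance(treat ~ age + educ + black + hisp + married + nodegr +
                     re74 + re75,
                   data = lalonde, match.out = m, nboots = 100)
```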
The package is under active development so please check back for
updates. Please cite the software as follows: Sekhon, Jasjeet
S. 2011. "Multivariate and Propensity Score Matching Software with
Automated Balance Optimization: The Matching package for R."
Journal of Statistical Software. 42(7): 1-52.
GenMatch()
can use multiple chips on the same computer, or multiple
computers, to perform parallel computations. Examples are provided for
how to use multiple chips on the same computer, and the Journal
of Statistical Software article gives examples of how to use multiple
computers.
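As a minimal sketch of the same-computer case, the code below passes a cluster object created with makeCluster() from the parallel package to the cluster argument of GenMatch(). The two-worker cluster, the covariate set and the small search settings are assumptions made so the example runs quickly; they are not tuned values.

```r
library(Matching)
library(parallel)
data(lalonde)

X <- cbind(lalonde$age, lalonde$educ, lalonde$black, lalonde$hisp,
           lalonde$married, lalonde$nodegr, lalonde$re74, lalonde$re75)
Tr <- lalonde$treat

# Two local worker processes (the count is an arbitrary choice)
cl <- makeCluster(2)

# GenMatch() distributes its balance evaluations across the cluster
gen_par <- GenMatch(Tr = Tr, X = X, BalanceMatrix = X, estimand = "ATT",
                    pop.size = 16, max.generations = 10,
                    wait.generations = 1, cluster = cl, print.level = 0)

# Always release the workers when finished
stopCluster(cl)
```

The overhead of starting workers means the benefit shows up mainly with larger datasets and population sizes than the ones used here.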
Also see my "Alternative
Balance Metrics for Bias Reduction in Matching Methods for Causal
Inference" paper, which critically reviews various ways to measure
balance. Cumulative probability distribution functions of
standardized statistics are advocated as balance metrics. Formal
hypothesis tests of balance, although common in the matching
literature, should not be conducted because no measure of balance is
a monotonic function of bias and because balance should be optimized
without limit. However, descriptive measures of discrepancy ignore
information related to bias which is captured by probability
distribution functions of standardized statistics.
The rbounds
package by Luke
Keele implements a number of Rosenbaum's methods of sensitivity
analysis for matched data. One can conduct sensitivity analyses for
matched data with binary, ordinal or continuous outcomes, and for
matched data with multiple control units matched to each treated
unit. The package is designed to work with the object returned by the Match()
function.
Significant performance enhancements were provided by Nate Begeman
(Mac OS X Performance Group at Apple). "Matching" also relies on a
modified version of the Scythe
Statistical Library developed by Andrew Martin, Kevin Quinn and
Daniel Pemstein. My modified version of the library is included in
the "Matching" package.