Marcella Vigneri introduces a new technique for matching data as part of the programme impact evaluation process.
This blog series is aimed at anyone with an interest in research, evaluation and adaptive learning.
Global Impact Evaluation Advisers routinely face the challenge of re-creating samples of units (be they individuals, families, farmers, or group members) with the same probability of being selected as recipients of an Oxfam-sponsored project, so that the intervention's effect is the only observed difference between those who received it (the 'Intervention' group) and those who didn't (the 'Comparison' group).
In our analysis of ex-post project data, the methodological challenge boils down to the selection of an appropriate matching procedure to apply to data collected after the project ended. Matching methods differ primarily in how they define approximate matching. The objective of matching is to retain only those observations in the project (intervention) and non-project (comparison) sampled groups which are wholly comparable (or balanced, in technical terms) on observed characteristics. Unmatched observations are pruned from the data set before further analysis is carried out. Lowering imbalance reduces the degree of model dependence in the subsequent statistical estimation of causal effects, and so reduces inefficiency and bias. The primary goal of any matching procedure is therefore to maximize both balance, i.e. the similarity between the multivariate distributions of the intervention and comparison observations, and the size of the matched data set. Any imbalance remaining after matching must be dealt with by statistical modelling assumptions.
Gary King of Harvard University offers a different take on matching procedures by means of a technique known as Coarsened Exact Matching (CEM); this is a fast, easy-to-use and easy-to-understand procedure, requiring fewer assumptions than, for example, Propensity Score Matching (PSM), and with a number of attractive statistical properties. We tested out this alternative matching algorithm with data collected to conduct the Effectiveness Review in Mali, which estimated the impact of the "Girls' Can" project. In this review both PSM and CEM matching techniques were adopted to create comparable groups for the evaluation exercise. It is our goal to keep exploring and experimenting with alternative methods and approaches, enabling us to expand our set of tools – and we thought it might be helpful to share our experience with CEM.
The central idea of CEM is to temporarily coarsen each observed variable into substantively meaningful groups. We then exact match on these coarsened data, and retain only the original (un-coarsened) values of the matched data. Many analysts know how to coarsen a variable into groups that preserve information; for instance, education or age can be measured in years, but many would be comfortable grouping observations into categories of school grade or age bracket, depending on the project context. The method works by exact matching on the distilled information in the covariates, as chosen by the user. CEM fixes the level of imbalance ex ante (by means of a number of block-level variables selected by the user based on her/his knowledge of the data and the underlying intervention), in the hope that the number of observations left by the procedure is sufficiently large.
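The coarsen-then-exact-match idea can be sketched in a few lines of pandas. The data, variable names and bin cut-offs below are purely illustrative (in practice the dedicated `cem` packages for R and Stata automate this), but the logic follows the steps just described: coarsen, exact match on the coarsened values, then keep the original values of the matched observations.

```python
import pandas as pd

# Hypothetical survey data: 'treated' flags intervention households.
df = pd.DataFrame({
    "treated":  [1, 1, 1, 0, 0, 0, 0],
    "age":      [23, 35, 47, 25, 36, 61, 33],
    "educ_yrs": [4, 9, 12, 5, 8, 2, 10],
})

# Temporarily coarsen each covariate into substantively meaningful bins
# (age brackets, schooling levels) chosen by the analyst.
df["age_bin"] = pd.cut(df["age"], bins=[0, 30, 45, 100])
df["educ_bin"] = pd.cut(df["educ_yrs"], bins=[0, 6, 9, 16])

# Exact match on the coarsened values: each unique combination of bins
# forms a stratum.
df["stratum"] = df.groupby(["age_bin", "educ_bin"], observed=True).ngroup()

# Keep only strata containing at least one intervention AND one comparison
# unit; the original (un-coarsened) variable values are retained.
ok = df.groupby("stratum")["treated"].transform(lambda t: t.nunique() == 2)
matched = df[ok].drop(columns=["age_bin", "educ_bin"])
```

In this toy example two comparison households fall in strata with no intervention counterpart, so they are pruned; the rest pass to the analysis stage with their original ages and years of education intact.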
Observations with the same values for all the coarsened variables are placed in a single stratum. Finally, for further analyses, comparison observations within each stratum are weighted to equal the number of intervention observations in that stratum. Strata without at least one intervention and one comparison unit are thereby weighted at zero, and thus pruned from the data set. (The weight for each intervention observation is 1; the weight for each comparison observation equals the number of intervention observations in its stratum divided by the number of comparison observations in the same stratum, normalized so that the sum of the weights equals the total matched sample size.) The un-pruned observations, with the original un-coarsened values of their variables, are passed on to the analysis stage. Bad matches are dropped as an integral part of the CEM procedure.
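The weighting rule in the parenthesis above can be written down directly. This is a minimal sketch with made-up data (a matched set where each row already carries its stratum label); it is not taken from any CEM package.

```python
import pandas as pd

# Illustrative matched data: each observation carries a stratum label.
matched = pd.DataFrame({
    "treated": [1, 0, 0, 1, 1, 0],
    "stratum": [0, 0, 0, 1, 1, 1],
})

g = matched.groupby("stratum")["treated"]
n_t = g.transform("sum")                      # intervention units per stratum
n_c = g.transform(lambda t: (t == 0).sum())   # comparison units per stratum

# Intervention observations get weight 1; comparison observations get their
# stratum's intervention/comparison ratio, and the weights are then
# normalised to sum to the matched sample size.
w = matched["treated"] + (1 - matched["treated"]) * (n_t / n_c)
matched["weight"] = w * len(matched) / w.sum()
```

With these weights, each stratum's comparison observations count as much as its intervention observations in the subsequent estimation, exactly as the rule states.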
The inherent trade-off of matching is reflected in CEM too: larger bins (more coarsening) used to coarsen the covariates result in fewer strata, and fewer strata mean more diverse observations within the same stratum and, thus, higher imbalance. With CEM the chosen measure of match quality is computed directly from the multivariate data sets of intervention and comparison observations.
Moreover, unlike PSM, which involves a full-scale (usually probit or logistic) estimation step, CEM does not require estimation. When the logit's performance is poor, such as in well-balanced data, PSM can approximate random matching.
The key to the productive use of modern matching methods is to compare many matching solutions, and choose a final solution from the diverse set of possibilities along the 'bias-variance' trade-off: in large data sets, one can afford to prune a large number of observations in return for lower imbalance (since confidence intervals will still be narrow), but in smaller data sets (or when an application requires especially narrow confidence intervals) one might prefer a matching solution which prunes fewer observations, at the cost of having to build a model to cope with the imbalance left after matching.
You can find more information in the report. We look forward to hearing your thoughts!