Hungary For More? Optimal 1-to-Many Matching for Causal Inference
The Hungarian algorithm (formally: Kuhn-Munkres) gives you the optimal 1-to-1 matching between treated and control units, minimizing total cost—usually Euclidean distance in covariate space. It's been used for estimating treatment effects via matching.
But it has a limitation: it's strictly one-to-one.
In many causal inference settings, especially when estimating the Average Treatment effect on the Treated (ATT), we have far more controls than treated units. Forcing each treated unit to pair with only one control discards useful data and limits efficiency.
When is 1-to-Many Optimal?
The choice between 1-to-1 and 1-to-many matching involves a fundamental bias-variance trade-off. 1-to-many matching is typically optimal when you have a large pool of control units relative to treated units, noisy outcomes that benefit from averaging, or when match quality degrades slowly (the 2nd and 3rd best matches are nearly as good as the 1st). In these settings, the variance reduction from using additional matches outweighs the bias from including slightly lower-quality controls. Conversely, 1-to-1 matching is preferred when match quality drops sharply after the first match, when you have abundant high-quality matches for all treated units, or when treatment effects are highly heterogeneous and require precise individual counterfactuals. The optimal choice depends on your specific data structure—control-to-treated ratios, covariate overlap, outcome noise levels—and can be evaluated empirically by examining match quality distributions across ranks and conducting sensitivity analyses across different matching ratios.
1-to-How Many?
We can select the optimal number of matches (k) in a data-driven way using:
- Cross-validation to select the k that minimizes prediction error on held-out data
- Adaptive matching where k varies by treated unit based on local match quality—units in dense covariate regions get fewer matches, those in sparse regions get more
- Caliper-based selection using all matches within a quality threshold rather than a fixed number, and
- Bias-variance decomposition, where you empirically estimate the bias-variance trade-off across different k values using simulation or leave-one-out validation.
Implementation of 1-to-many
Each treated unit can be matched to k distinct control units, with no replacement. We cast this as a min-cost flow problem:
- Each treated unit becomes a node with k units of supply.
- Each control unit becomes a node with capacity 1.
- The cost of matching is the distance between treated and control units.
- The goal: assign each treated unit to k controls while minimizing total cost.
We solve this using the min-cost max-flow algorithm (networkx.max_flow_min_cost
) (see here). This gives us a global optimum, just like the Hungarian algorithm—but with more flexibility.
Results from Basic Simulations
caliper_percentile | mean_caliper | bias_1to1 | bias_1tomany | se_1to1 | se_1tomany | quality_1to1 | quality_1tomany | n_sims |
---|---|---|---|---|---|---|---|---|
50 | 0.2954 | -0.0396 | -0.0528 | 0.1960 | 0.1822 | 0.1950 | 0.2123 | 30 |
60 | 0.3426 | -0.0465 | -0.0616 | 0.1790 | 0.1642 | 0.2156 | 0.2407 | 30 |
70 | 0.4041 | -0.0373 | -0.0436 | 0.1672 | 0.1508 | 0.2381 | 0.2748 | 30 |
75 | 0.4424 | -0.0488 | -0.0460 | 0.1627 | 0.1444 | 0.2505 | 0.2966 | 30 |
80 | 0.4893 | -0.0554 | -0.0515 | 0.1581 | 0.1403 | 0.2640 | 0.3163 | 30 |
90 | 0.6829 | -0.0458 | -0.0213 | 0.1514 | 0.1282 | 0.2988 | 0.3976 | 30 |
95 | 0.8635 | -0.0402 | -0.0092 | 0.1481 | 0.1254 | 0.3239 | 0.4590 | 30 |