A Difference-in-Differences Approach Using Mixed-Integer Programming Matching: An Application for the Effect of a Preferential School Voucher on Segregation.

Magdalena Bennett

Introduction: Difference-in-Differences (DD) is a commonly used approach in policy evaluation for identifying the impact of an intervention or treatment. Under a parallel trend assumption (PTA), we can recover a causal effect by comparing how outcomes change from before to after an intervention in a treatment group and in a control group. However, what can be done when the PTA does not appear to hold for our study sample? In this paper, I address this issue by identifying sub-samples within the study population, if they exist, for which the PTA is more likely to hold. Using a mixed-integer programming matching approach, I match units in the treatment and control groups on key baseline covariates, and check whether these sub-samples follow similar trends in the pre-intervention period. I also identify the properties that baseline covariates should have to avoid introducing additional bias through the matching procedure.
The use of matching as a method to recover parallel trends under violations of the PTA is a contentious topic. While some researchers argue that there are clear advantages to combining matching with a DD approach (Basu & Small, 2020; Ryan, 2015), others argue for a more cautious approach, given that matching can also bias estimates depending on the context (Zeldow & Hatfield, 2019; Daw & Hatfield, 2018; Chabé-Ferret, 2017). However, most of this discussion has focused on matching on outcomes rather than matching on covariates. One of the main drawbacks of matching on outcomes is that, by construction, the PTA will hold in the pre-intervention period, which eliminates one of the most sensible robustness checks in a difference-in-differences setting.
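As a point of reference for the DD comparison described in the introduction, the following is a minimal sketch of the basic two-group, two-period estimate. The function name `did_estimate` and the toy numbers are hypothetical and are not taken from the paper; the sketch ignores covariates, matching, and standard errors.

```python
from statistics import mean


def did_estimate(rows):
    """Two-group, two-period difference-in-differences estimate.

    rows: iterable of (treated: bool, post: bool, outcome: float).
    Returns (change in treated mean) minus (change in control mean).
    """
    cells = {(g, t): [] for g in (False, True) for t in (False, True)}
    for treated, post, y in rows:
        cells[(treated, post)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    change_treated = m[(True, True)] - m[(True, False)]
    change_control = m[(False, True)] - m[(False, False)]
    return change_treated - change_control


# Hypothetical toy data, for illustration only.
rows = [
    (True, False, 10.0), (True, False, 12.0),   # treated, pre
    (True, True, 16.0), (True, True, 18.0),     # treated, post
    (False, False, 5.0), (False, False, 7.0),   # control, pre
    (False, True, 8.0), (False, True, 10.0),    # control, post
]
effect = did_estimate(rows)
```

Under the PTA, the control group's change stands in for the change the treated group would have experienced without the intervention, which is why the second difference is subtracted from the first.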
By matching on covariates, and depending on their characteristics, we can both make a conceptual argument in favor of the PTA and provide a robustness check on whether pre-intervention trends are similar in the treatment and control groups. In this paper, I identify the contexts in which matching can help reduce such biases, and show how balancing covariates directly can yield better results. I focus particularly on the case of time-varying confounders, which could introduce bias through regression to the mean. I illustrate these results with simulations and a case study of the impact of a new voucher scheme on socioeconomic segregation in Chile.
Simulations: Using simulations, I test how covariate matching with a mixed-integer programming (MIP) approach performs under different data generating processes (DGPs), including scenarios where the PTA is violated. I replicate the simulation scenarios in Zeldow & Hatfield (2019) using both time-invariant and time-varying covariates, as follows: (1) Time-invariant covariate: (a) time-invariant covariate effect; (b) time-varying covariate effect; and (c) treatment-independent covariate; (2) Time-varying covariate: (a) parallel evolution; (b) evolution differs by group; and (c) evolution diverges in the post-intervention period. Results show that MIP matching on covariates recovers unbiased causal estimates in all but one scenario, and that I can bound the direct effect of the intervention on the outcome for the final DGP (Figure 1). I also find that covariate matching can be a solution for PTA violations when both groups are drawn from the same population and, more generally, when the matching covariates have high serial correlation.
Application: The preferential schooling subsidy (SEP) was introduced in 2008 with the objective of allocating more resources to vulnerable students.
The increase in the voucher was significant compared to the previous universal flat voucher: preferential vouchers represented approximately a 50% increase over the previous amount per child, from US$70 to US$105 in 2008, on average. In addition to the extra resources for each vulnerable student, schools received a concentration bonus tied to the percentage of such students that they enrolled. A key feature of the policy is that it was voluntary: schools could opt in to the program to receive the extra resources for the vulnerable students enrolled in their institution. However, to receive these funds, schools also had to comply with certain government requirements: (i) accountability, (ii) no discrimination, and (iii) educational quality. Given the design of the SEP policy, schools that opted in (especially for-profit ones) could have a clear incentive to concentrate as many SEP-eligible students as possible. Along these lines, I analyze the effect that the introduction of the SEP policy had on the socioeconomic composition of schools that adhered to SEP. In general, schools that subscribed to the policy were more vulnerable and had lower standardized test scores than those that did not. In fact, when I analyze both groups with a difference-in-differences approach on the complete sample, it is clear that the parallel trend assumption necessary to recover causal estimates does not hold in the pre-intervention period (Figure 2). However, when matching on time-constant covariates and other variables with high serial correlation (e.g., rural status, copayment, vulnerability status, and number of schools in the county, among others), I am able to recover pre-intervention parallel trends.
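One simple way to summarize the kind of pre-intervention check described here is to look at how the treated-minus-control gap in group means moves across pre-intervention periods. The sketch below is only an illustration: the helper `pre_trend_gap_slope`, the years, and all numbers are hypothetical and do not come from the paper's data.

```python
from statistics import mean


def pre_trend_gap_slope(treated_means, control_means):
    """Least-squares slope of the treated-minus-control gap over time.

    Both arguments map a pre-intervention period to a group mean outcome.
    A slope near zero is consistent with parallel pre-intervention trends.
    Requires at least two periods shared by both groups.
    """
    periods = sorted(set(treated_means) & set(control_means))
    gaps = [treated_means[p] - control_means[p] for p in periods]
    p_bar, g_bar = mean(periods), mean(gaps)
    sxx = sum((p - p_bar) ** 2 for p in periods)
    sxy = sum((p - p_bar) * (g - g_bar) for p, g in zip(periods, gaps))
    return sxy / sxx


# Hypothetical group means of an outcome in three pre-intervention years.
full_treated = {2005: 0.50, 2006: 0.53, 2007: 0.56}
full_control = {2005: 0.30, 2006: 0.31, 2007: 0.32}      # gap widens over time
matched_control = {2005: 0.45, 2006: 0.48, 2007: 0.51}   # gap stays constant

slope_full = pre_trend_gap_slope(full_treated, full_control)
slope_matched = pre_trend_gap_slope(full_treated, matched_control)
```

In this toy example the gap grows by about 0.02 per year in the full sample and stays flat in the matched sample. An actual analysis would also need to account for sampling uncertainty, for instance through an event-study regression, which this sketch does not attempt.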
Using the matched sample described above, I find that even though the introduction of this new voucher scheme reduced the test score gap between SEP and non-SEP schools, it also widened the difference in average income between the two types of schools, increasing segregation in an already highly segregated educational system.