ContinuousRD is an R package that implements regression discontinuity designs when the treatment variable is continuous, following Dong, Lee, and Gou (2022).
You can install the development version of ContinuousRD from GitHub with:
r
# install.packages("devtools")
devtools::install_github("kylebutts/ContinuousRD")
I will provide a quick (and simplified) overview of the methods from Dong, Lee, and Gou (2022). For the full details, please see the original text. I’m purposefully leaving out details since hopefully you are here after reading the paper :-)
Let $Y$ denote the outcome variable and $T$ a continuous treatment variable of interest. There exists a running variable $R$ and a cutoff $c$ (which we’ll normalize to zero) that affects the treatment $T$. The paper assumes that different quantiles of $T$ are differentially affected by the discontinuity.
For example, in the paper they discuss a minimum capital requirement for banks in cities with population over 3,000 in 1900. The running variable would be population and the treatment variable is banking capital. Clearly, only banks with counterfactual banking capital below the threshold are affected by the population cutoff. Therefore the jump in $T$ will be zero for most banks, but positive for banks at the lower end of the counterfactual distribution of $T$.
For this reason, the estimator proposed in the paper targets the LATE at different quantiles of the distribution of $T$, which they denote as the Quantile-LATE, or QLATE for short. The method proceeds as follows.
For a given quantile of $T$, denoted $u \in (0, 1)$, estimate the “first-stage” RD of $T$. Define $q_1(u)$ to be the $u$-th percentile of $T$ under treatment at $R = 0$ and $q_0(u)$ to be the $u$-th percentile of $T$ under no treatment at $R = 0$. The RD estimate of the $u$-th conditional quantile of $T$ would therefore estimate
$$\lim_{r \downarrow 0} Q_{T \mid R = r}(u) - \lim_{r \uparrow 0} Q_{T \mid R = r}(u).$$
This will estimate $q_1(u) - q_0(u)$. Intuitively, this gives the jump in the continuous treatment for units at the $u$-th percentile of the distribution.
Then, the reduced form involves running an RD of $Y$ using observations to the right of the cutoff with values of $T$ near $q_1(u)$ and to the left of the cutoff with values of $T$ near $q_0(u)$. The left and right limits in the RD would respectively identify
$$\lim_{r \uparrow 0} E[Y \mid T = q_0(u), R = r] \quad \text{and} \quad \lim_{r \downarrow 0} E[Y \mid T = q_1(u), R = r].$$
The difference between these two terms would estimate the jump in $Y$ for units near the $u$-th percentile of $T$. The QLATE estimate will then be the ratio of these two quantities. Intuitively, it’s the jump in $Y$ rescaled by the jump in $T$. Note that only quantiles that have a significant jump in $T$ are used to form estimates (since otherwise the denominator is effectively 0).
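In symbols, the ratio just described can be written roughly as follows. Here $q_1(u)$ and $q_0(u)$ are my shorthand for the $u$-th percentile of $T$ just to the right and just to the left of the cutoff, and $\tau(u)$ is named after the tau_u entry in the package output. The paper's own notation and regularity conditions may differ, so treat this as a summary sketch rather than the formal definition.

```latex
\tau(u) =
  \frac{\lim_{r \downarrow 0} E[Y \mid T = q_1(u), R = r]
        - \lim_{r \uparrow 0} E[Y \mid T = q_0(u), R = r]}
       {q_1(u) - q_0(u)}
```

The numerator is the reduced-form jump in $Y$ and the denominator is the first-stage jump in the $u$-th quantile of $T$, which is why quantiles with no jump in $T$ cannot be used.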
A visual example using a made-up DGP similar to the paper’s main example is shown here for a given quantile $u$. The left figure shows the estimated jump in $T$. The right figure shows the estimated jump in $Y$, comparing observations with $T$ near its $u$-th percentile on each side of the cutoff. The ratio of these two estimates is the estimated treatment effect and can be interpreted as the jump in $Y$ per unit increase in $T$.
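The same two-step logic can be sketched numerically in base R. This is only a crude local-constant illustration on a toy DGP that I made up for this purpose (it is not the package's sim_data, and not the bias-corrected local-polynomial estimator the package implements). The function name naive_qlate, the bandwidth h, and the window b are all hypothetical choices.

```r
# Toy DGP for illustration only (hypothetical; not the package's sim_data)
set.seed(42)
n  <- 50000
R  <- runif(n, -1, 1)                # running variable, cutoff at 0
U  <- runif(n)                       # each unit's rank in the distribution of the treatment
T0 <- U                              # treatment level left of the cutoff
T1 <- U + 0.5 * pmax(0.3 - U, 0)     # right of the cutoff: only ranks below 0.3 shift up
Tr <- ifelse(R >= 0, T1, T0)         # observed treatment ("Tr" avoids masking R's built-in T)
Y  <- 2 * Tr + rnorm(n, sd = 0.02)   # true effect of 2 per unit of treatment

# Crude local-constant QLATE at quantile u.
# h is a bandwidth in R and b is a window in the treatment; both are arbitrary here.
naive_qlate <- function(u, h = 0.1, b = 0.02) {
  right <- R >= 0 & R < h
  left  <- R < 0 & R > -h
  # First stage: jump in the u-th quantile of the treatment at the cutoff
  q1 <- unname(quantile(Tr[right], u))
  q0 <- unname(quantile(Tr[left], u))
  # Reduced form: mean outcome near each quantile, on its own side of the cutoff
  y1 <- mean(Y[right & abs(Tr - q1) < b])
  y0 <- mean(Y[left & abs(Tr - q0) < b])
  c(first_stage = q1 - q0,
    reduced_form = y1 - y0,
    qlate = (y1 - y0) / (q1 - q0))
}

est <- naive_qlate(u = 0.1)
print(round(est, 3))
```

In this toy DGP the 10th percentile of the treatment moves from 0.1 to 0.2 at the cutoff, so the ratio should land near the true effect of 2. At a quantile such as u = 0.5 the first-stage jump is essentially zero and the ratio becomes unstable, which is the reason such quantiles are dropped.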
I will use a set of simulated data to illustrate the method:
r
library(ContinuousRD)
#> Loading required package: data.table

# Load data
data(sim_data)
head(sim_data)
#> Key: <id>
#>       id         u           R     Z         T         Y
#>    <int>     <num>       <num> <num>     <num>     <num>
#> 1:     1 0.2002145 -0.71892650     0 0.6673815 0.6652777
#> 2:     2 0.6852186  0.19781138     1 1.3139139 1.3141626
#> 3:     3 0.9168758  0.03111523     1 1.4876568 1.4881608
#> 4:     4 0.2843995  0.64044390     1 1.0132996 1.0165819
#> 5:     5 0.1046501 -0.75736198     0 0.3488338 0.3474286
#> 6:     6 0.7010575  0.06593254     1 1.3257931 1.3291840
The data consists of 10,000 observations, each of which is a unique unit. The variable Y is the outcome variable, T is the continuous treatment of interest, and R is the running variable with a cutoff of 0. Units with positive values of the running variable are affected by the border, in this case a binding lower bound on T. The variable u is the (typically unobserved) percentile of each unit for the variable T.
The plot below shows the observed continuous treatment along the running variable. The color of each dot represents the quantile of T at that value of R. You can see a clear jump at the border where T is affected by the binding constraint.
r
library(ggplot2)

ggplot(sim_data) +
  geom_point(
    aes(x = R, y = T, color = u),
    alpha = 0.4
  ) +
  geom_vline(xintercept = 0, color = "red") +
  labs(
    x = "Running Variable",
    y = "Continuous Treatment"
  ) +
  theme_minimal()
However, this is only binding for units at the lower end of the distribution. To see this more clearly, let’s look at multiple RDs at different quantiles of T.
r
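library(rdrobust)

# Split the sample by the (simulated) quantile u of the treatment
low_q <- sim_data[sim_data$u > 0 & sim_data$u < 0.2, ]
# The upper bound is applied to u here, mirroring low_q
high_q <- sim_data[sim_data$u > 0.5 & sim_data$u < 0.7, ]

rdrobust::rdplot(
  y = low_q$T, x = low_q$R,
  title = "Low quantile of T",
  y.label = "Continuous Treatment",
  x.label = "Running Variable"
)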
r
rdrobust::rdplot(
  y = high_q$T, x = high_q$R,
  title = "High quantile of T",
  y.label = "Continuous Treatment",
  x.label = "Running Variable"
)
The data is simulated so that only units with quantile u below a fixed threshold are affected by the cutoff (the magnitude of the change in T depends on the quantile u), and those units experience a treatment effect of size 1. Let’s see if we recover that using the QLATE_bc_se function:
r
res <- ContinuousRD::QLATE_bc_se(
  sim_data,
  qlist = seq(0.1, 0.9, by = 0.1),
  yname = "Y", tname = "T", rname = "R", c = 0
)
#> `qlist`: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9
#> `tau_u`: 1, 1.001, NA, NA, NA, NA, NA, NA, and NA
#> `SEtau`: 0.005, 0.018, NA, NA, NA, NA, NA, NA, and NA
#> `tau_u_bc`: 0.992, 0.985, NA, NA, NA, NA, NA, NA, and NA
#> `SEtau_bc`: 2.384, 7.942, NA, NA, NA, NA, NA, NA, and NA
#> `pi`: 1.001
#> `SEpi`: 0.004
#> `pi_bc`: 0.991
#> `SEpi_bc`: 0.359
In the printed results, a value of NA appears when no significant jump in the treatment variable is found at the cutoff. In this case, since the jump around the 30th percentile is near zero, it doesn’t show up in the estimates. However, we see that the effect is around 1, as expected, for the 10th and 20th percentiles. The overall average effect pi is also approximately 1.
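To read the standard errors alongside the point estimates, one option is to form rough 95% intervals. The sketch below copies the non-NA tau_u and SEtau values from the printed output into a stand-in list called printed (a hypothetical object, not the structure QLATE_bc_se actually returns). It also assumes a normal approximation is reasonable here, which I have not verified for this estimator.

```r
# Values copied by hand from the printed output; `printed` is a hypothetical
# stand-in and not the object returned by QLATE_bc_se()
printed <- list(
  qlist = c(0.1, 0.2),
  tau_u = c(1, 1.001),
  SEtau = c(0.005, 0.018)
)

# Normal-approximation 95% intervals: estimate +/- z * standard error
z <- qnorm(0.975)
ci <- data.frame(
  quantile = printed$qlist,
  estimate = printed$tau_u,
  lower = printed$tau_u - z * printed$SEtau,
  upper = printed$tau_u + z * printed$SEtau
)
print(round(ci, 3))
```

The interval at the 20th percentile is wider than at the 10th, reflecting its larger standard error in the printed output.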