The didnpreg command contains tools for computing both heterogeneous and average treatment effects for the treated in a model-free difference-in-differences framework.

didnpreg(...)

# S3 method for formula
didnpreg(
  formula,
  data = stop("argument 'data' is missing"),
  subset,
  bwmethod = "opt",
  boot.num = 399,
  TTx = "TTa",
  print.level = 1,
  digits = 4,
  cores = 1,
  seed = 17345168,
  ...
)

# S3 method for default
didnpreg(
  outcome,
  regressors,
  id,
  time,
  treatment,
  treatment_period,
  weights = NULL,
  bwmethod = "opt",
  boot.num = 399,
  TTx = "TTa",
  print.level = 1,
  digits = 4,
  cores = 1,
  seed = 17345168,
  ...
)

Arguments

formula

an object of class formula (or one that can be coerced to that class): a symbolic description of the model. The details of model specification are given under 'Details'.

data

name of the data frame; must be specified if the 'formula' method is used

subset

NULL, optional subsample of 'data'

bwmethod

bandwidth type. Two options can be specified. "opt" is the default option and uses plug-in bandwidths: rule-of-thumb for continuous regressors and basic for categorical regressors. "CV" triggers the calculation of cross-validated bandwidths.

boot.num

number of bootstrap replications. Default is 399.

TTx

Conditional Treatment Effect on the Treated. Default is FALSE.

print.level

the level of printing; larger number implies more output is printed. Default is 1. 0 suppresses all printing.

cores

integer specifying the number of cores to be used for parallel computation. Default is 1.

seed

integer used to seed the random number generation, for replication purposes. Default is 17345168.

outcome

a vector, matrix, or data frame of length \(NT\). The outcome can be a continuous or dummy variable.

regressors

a data frame with \(NT\) rows that contains regressors. A data frame class is required to identify the type/class of each regressor.

id

a vector, matrix, or data frame of length \(NT\) that identifies the unit of observation.

time

a vector, matrix, or data frame of length \(NT\) that specifies in which period id is observed.

treatment

a vector, matrix, or data frame of length \(NT\) with zeros for the control and ones for the treated observations.

treatment_period

a vector, matrix, or data frame of length \(NT\) with zeros for the period before treatment and ones for the period of treatment and after.

weights

optional weights; can be omitted if not available. Default is NULL.

TTb

Unconditional Treatment Effect on the Treated. TTb is estimated by averaging over all treated observations, whereas TTa is estimated by averaging over the treated observations one time period after the treatment. Depending on the sample, calculating TTb may take some time. Default is FALSE.
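
As a rough illustration of the inputs the default method expects, the sketch below builds them from a toy long-format panel using base R only. All object and column names (toy, first_treated_year, age, sex, educ) are hypothetical and not part of didnp. Counting regressors by column class is an assumption based on the description of 'regressors' above, not the package's own code.

```r
# Toy long-format panel: 4 units observed in 3 years (NT = 12 rows).
# All object and column names here are made up for illustration.
toy <- data.frame(
  id   = rep(1:4, each = 3),
  year = rep(2010:2012, times = 4),
  elig = rep(c(1, 1, 0, 0), each = 3),  # 1 = treated unit, 0 = control unit
  age  = c(20, 21, 22, 19, 20, 21, 23, 24, 25, 18, 19, 20),
  sex  = factor(rep(c("f", "m", "f", "m"), each = 3)),
  educ = factor(rep(c("low", "high", "mid", "low"), each = 3),
                levels = c("low", "mid", "high"), ordered = TRUE)
)

# Assumed first year in which the treatment is in effect.
first_treated_year <- 2012

# 0 before the treatment, 1 in the treatment period and after.
treatment_period <- as.integer(toy$year >= first_treated_year)

# Keep the regressors as a data frame so each column retains its class.
regressors <- toy[, c("age", "sex", "educ")]

# Count columns by class: continuous, unordered factor, ordered factor.
is_ordered   <- vapply(regressors, is.ordered, logical(1))
is_unordered <- vapply(regressors, is.factor, logical(1)) & !is_ordered
is_numeric   <- vapply(regressors, is.numeric, logical(1))
type_counts <- c(continuous = sum(is_numeric),
                 unordered  = sum(is_unordered),
                 ordered    = sum(is_ordered))
print(type_counts)
```

Selecting several columns with toy[, c(...)] keeps the result a data frame, whereas as.matrix() would coerce everything to one type and lose the factor classes.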

Value

didnpreg returns a list containing:

NT

Total number of observations.

esample

A vector of TRUE/FALSE values identifying observations used in estimation. Relevant for the 'formula' method, but complete cases are also checked in the matrix method.

sample1

A vector of TRUE/FALSE values identifying treated observations.

sample11

A vector of TRUE/FALSE values identifying treated observations right after the treatment.

sample10

A vector of TRUE/FALSE values identifying treated observations just before the treatment.

sample01

A vector of TRUE/FALSE values identifying observations in the control group right after the treatment.

sample00

A vector of TRUE/FALSE values identifying observations in the control group just before the treatment.

regressor.type

A vector of length 3 with the number of continuous, unordered categorical, and ordered categorical regressors.

bwmethod

Bandwidth type.

bw.time

Time in seconds it took to calculate bandwidths. For bandwidth type "opt" it is 0.

bws

Data frame with variable names, type of the regressor, and bandwidths.

boot.time

Time in seconds it took to bootstrap the standard errors.

boot.num

Number of bootstrap replications.

bw11

Bandwidths calculated for the sample of treated observations right after the treatment.

bw10

Bandwidths calculated for the sample of treated observations just before the treatment.

bw01

Bandwidths calculated for the sample of observations in the control group right after the treatment.

bw00

Bandwidths calculated for the sample of observations in the control group just before the treatment.

do.TTb

TRUE/FALSE, whether TTb is calculated.

TTa.positions.in.TTb

Positions of TTa observations in TTb. Returned only if do.TTb is TRUE.

TTa

The DiD estimator of the average unconditional TT, averaging over the treated one time period after the treatment.

TTa.i

The DiD estimators of the unconditional TT underlying TTa.

TTb

The DiD estimator of the average unconditional TT, averaging over all treated.

TTb.i

The DiD estimators of the unconditional TT underlying TTb.

TTa.sd

The standard error of the DiD estimator TTa of the average unconditional TT.

TTb.sd

The standard error of the DiD estimator TTb of the average unconditional TT.

TTx

The DiD estimators of the conditional TT (also known as CATET).

TTa.i.boot

Matrix of size \(n_{11} \times boot.num\).

TTb.i.boot

Matrix of size \(n_{1} \times boot.num\).
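
To make the four subsample indicators more concrete, the sketch below shows one way such indicators could be derived from 'treatment' and 'treatment_period' in a two-period toy data set, followed by the textbook 2x2 difference-in-differences of group means. This is not the package's code: didnpreg uses a model-free, bandwidth-based estimator with bootstrapped standard errors. With more than two periods, "right after" and "just before" refer to specific periods, which this two-period sketch sidesteps. All names are hypothetical.

```r
# Two-period toy data; every name here is hypothetical.
treatment        <- c(1, 1, 1, 1, 0, 0, 0, 0)  # 1 = treated group
treatment_period <- c(0, 1, 0, 1, 0, 1, 0, 1)  # 1 = treatment period or later
y                <- c(2, 5, 3, 6, 1, 2, 2, 3)

# Logical indicators in the spirit of sample11, sample10, sample01, sample00.
s11 <- treatment == 1 & treatment_period == 1  # treated, after
s10 <- treatment == 1 & treatment_period == 0  # treated, before
s01 <- treatment == 0 & treatment_period == 1  # control, after
s00 <- treatment == 0 & treatment_period == 0  # control, before

# Textbook 2x2 DiD of group means (not the didnpreg estimator).
did <- (mean(y[s11]) - mean(y[s10])) - (mean(y[s01]) - mean(y[s00]))
print(did)
```

In this toy setting each observation falls into exactly one of the four cells, so the indicators partition the sample.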

Details

The formula should contain multiple parts separated by '|'. An example is

form1 <- y ~ x1 + x2 | id | time | treatment | treatment_period | weights

The weights part can be omitted if weights are not available:

form1 <- y ~ x1 + x2 | id | time | treatment | treatment_period
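
One quick check before estimation is to confirm that every variable named in such a formula exists in the data. The sketch below uses base R only. The data frame toy and its columns are hypothetical, and the check is a convenience rather than something didnpreg is known to require.

```r
# Hypothetical data with the columns named in the formula.
toy <- data.frame(
  y = c(1.2, 0.7, 1.9, 1.1),
  x1 = c(0.5, 0.1, 0.9, 0.3),
  x2 = factor(c("a", "b", "a", "b")),
  id = c(1, 1, 2, 2),
  time = c(2011, 2012, 2011, 2012),
  treatment = c(1, 1, 0, 0),
  treatment_period = c(0, 1, 0, 1)
)

# Multi-part formula without weights, parts separated by '|'.
form1 <- y ~ x1 + x2 | id | time | treatment | treatment_period

# all.vars() returns every variable name that appears in the formula.
vars_needed <- all.vars(form1)
missing_vars <- setdiff(vars_needed, names(toy))
print(vars_needed)
print(missing_vars)  # character(0) when nothing is missing
```

The same check works for the six-part version with weights, since all.vars() collects names from every part of the formula.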

Author

Oleg Badunenko oleg.badunenko@brunel.ac.uk,

Daniel J. Henderson djhender@cba.ua.edu,

Stefan Sperlich stefan.sperlich@unige.ch

Examples

if (FALSE) {
  data(DACAsub, package = "didnp")
  # will get a data frame 'DACAsub' with 330106 rows and 18 columns

  # get the subsample
  DACAsub$mysmpl <- mysmpl <-
    DACAsub$a1922==1 & !is.na(DACAsub$a1922) &
    DACAsub$htus==1 & !is.na(DACAsub$htus)

  # generate 'treatment_period'
  DACAsub$treatment_period <- ifelse(DACAsub[,"year"]>2011,1,0)

  # define formula with the weight
  form1 <- inschool ~ fem + race + var.bpl + state + age + yrimmig +
    ageimmig | inschool | year | elig | treatment_period | perwt

  # or without the weight
  form11 <- inschool ~ fem + race + var.bpl + state + age + yrimmig +
    ageimmig | inschool | year | elig | treatment_period

  ## Syntax using formula
  # suppress output
  tym1a <- didnpreg(
    form1,
    data = DACAsub,
    subset = mysmpl,
    bwmethod = "opt",
    boot.num = 399,
    TTb = FALSE,
    print.level = 0,
    cores = 4)

  # Print the summary
  summary(tym1a)

  ## Use CV bandwidths
  tym1aCV <- didnpreg(
    form1,
    data = DACAsub,
    subset = mysmpl,
    bwmethod = "CV",
    boot.num = 399,
    TTb = FALSE,
    print.level = 1,
    cores = 4)

  # Print the summary
  summary(tym1aCV)

  ## Calculate also TTb (will take longer)
  tym1bCV <- didnpreg(
    form1,
    data = DACAsub,
    subset = mysmpl,
    bwmethod = "CV",
    boot.num = 399,
    TTb = TRUE,
    print.level = 1,
    cores = 4)

  # Print the summary
  summary(tym1bCV)

  ## Syntax using matrices

  tym1aM <- didnpreg(
    outcome = DACAsub[mysmpl,"inschool"],
    regressors = DACAsub[mysmpl,c("fem", "race", "var.bpl", "state", "age", "yrimmig", "ageimmig")],
    id = DACAsub[mysmpl,"inschool"],
    time = DACAsub[mysmpl,"year"],
    treatment = DACAsub[mysmpl,"elig"],
    treatment_period = ifelse(DACAsub[mysmpl,"year"]>2011,1,0),
    weights = DACAsub[mysmpl,"perwt"],
    bwmethod = "opt",
    boot.num = 399,
    TTb = FALSE,
    print.level = 1,
    cores = 4)

  # Print the summary
  summary(tym1aM)

}