Monday, Aug 4: 8:30 AM - 10:20 AM
4036
Contributed Papers
Music City Center
Room: CC-205C
Main Sponsor
Business and Economic Statistics Section
Presentations
The analysis of panel data via linear regression is ubiquitous across disciplines. However, standard confidence intervals typically assume that the residuals are cluster-independent. This paper introduces a method called the mosaic permutation test that can be used to (a) test this assumption and (b) weaken it. We elaborate on these contributions below.
Testing: Our method can use flexible machine learning techniques to detect violations of the cluster-independence assumption while exactly controlling false positives under a mild "local exchangeability" condition. To illustrate our method, we survey the literature and assess whether cluster-independence assumptions are accurate.
Inference: Our method produces confidence intervals for linear models that are (i) finite-sample valid under a local exchangeability assumption and (ii) asymptotically valid under the cluster-independence assumption. In short, our method is valid under assumptions that are strictly weaker than classical methods. Experiments on real, randomly selected datasets from the literature show that many existing standard errors are up to ten times too small, whereas mosaic methods produce reliable results.
Keywords
Panel data
Permutation tests
Linear regression
Semiparametric models
Hypothesis tests
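The mosaic construction itself is not reproduced here, but the permutation logic behind such tests can be sketched with a generic cross-cluster independence check (the test statistic, shuffling scheme, and all names below are illustrative, not the authors' method):

```python
import random
import statistics

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def permutation_pvalue(res_a, res_b, n_perm=2000, seed=0):
    """Permutation p-value for dependence between two clusters' residual
    series. Statistic: |correlation| of the two series. If the clusters
    are independent, shuffling the time order of one cluster leaves the
    statistic's null distribution unchanged, so the test is exact."""
    rng = random.Random(seed)
    observed = abs(corr(res_a, res_b))
    b = list(res_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(b)
        if abs(corr(res_a, b)) >= observed:
            hits += 1
    return (1 + hits) / (1 + n_perm)
```

Residuals driven by a common shock yield a small p-value; independent residuals do not.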
We provide pairwise-difference (Gini-type) representations of higher-order central moments for both general random variables and empirical moments. Such representations do not require a measure of location. For third and fourth moments, this yields pairwise-difference representations of skewness and kurtosis coefficients. We show that all central moments possess such representations, so no reference to the mean is needed for moments of any order. This is done by considering i.i.d. replications of the random variables in question, by observing that central moments can be interpreted as covariances between a random variable and powers of the same variable, and by giving recursions that link the pairwise-difference representation of any moment to lower-order ones. Numerical summation identities are deduced. Finally, through a similar approach, we give analogues of the Lagrange and Binet-Cauchy identities for general random variables, along with a simple derivation of the classic Cauchy-Schwarz inequality for covariances.
Keywords
Moments
Covariance
Skewness
Kurtosis
Gini
Lagrange identity
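The second-order case of such a location-free representation is classical and easy to verify numerically; the sketch below shows it for the sample variance and covariance (the paper's higher-order recursions are not reproduced):

```python
from itertools import combinations

def pairwise_variance(xs):
    """Unbiased sample variance via pairwise differences:
    s^2 = sum_{i<j} (x_i - x_j)^2 / (n * (n - 1)).
    No estimate of the mean (location) is needed."""
    n = len(xs)
    return sum((a - b) ** 2 for a, b in combinations(xs, 2)) / (n * (n - 1))

def pairwise_covariance(pairs):
    """Sample covariance via pairwise differences:
    cov = sum_{i<j} (u_i - u_j) * (v_i - v_j) / (n * (n - 1))."""
    n = len(pairs)
    total = sum((u1 - u2) * (v1 - v2)
                for (u1, v1), (u2, v2) in combinations(pairs, 2))
    return total / (n * (n - 1))
```

Both quantities agree exactly with the usual mean-centered formulas (with the n - 1 denominator).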
This paper examines the consequences of model misspecification when data are generated from a trivariate probit model that accounts for recursive dependencies and sample selection. We investigate the estimation bias that arises under specifications that ignore the recursive structure, the sample selection, or both. In addition to providing theoretical results, we conduct Monte Carlo simulations to quantify the magnitude of the bias, not only in the parameters associated with the explanatory variables in the three equations, but also in the correlation parameters of the corresponding error terms. We highlight the risk of misinterpreting these correlation parameters, which could lead to invalid conclusions about the potential presence of selection bias. Our findings emphasize the importance of careful model specification in applications involving multiple binary outcomes, where selection bias and recursive structures can play an important role in shaping the results.
Keywords
trivariate probit models
sample selection
recursive models
simulated maximum likelihood estimation
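Setting the probit machinery aside, the selection mechanism at the heart of the bias can be illustrated with a deliberately simplified linear outcome equation (a toy stand-in for the paper's trivariate probit setting; the design, coefficients, and error correlation are invented):

```python
import math
import random

def ols_slope(xs, ys):
    """OLS slope of y on x (with intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

rng = random.Random(7)
n, rho = 20000, 0.8
xs_all, ys_all, xs_sel, ys_sel = [], [], [], []
for _ in range(n):
    x = rng.gauss(0, 1)
    e0 = rng.gauss(0, 1)
    # Outcome and selection errors are correlated (rho = 0.8),
    # so selection is non-ignorable.
    e1 = rho * e0 + math.sqrt(1 - rho * rho) * rng.gauss(0, 1)
    y = x + e1                      # outcome equation, true slope 1
    xs_all.append(x); ys_all.append(y)
    if 0.5 + x + e0 > 0:            # selection equation
        xs_sel.append(x); ys_sel.append(y)

slope_full = ols_slope(xs_all, ys_all)    # infeasible fit on all units
slope_naive = ols_slope(xs_sel, ys_sel)   # ignores selection: biased down
```

Truncating on the selection index induces negative correlation between x and the outcome error in the selected sample, so the naive slope is attenuated well below the true value of 1.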
This paper investigates how mis-specification of the short-memory dynamics affects estimation and prediction in a fractionally integrated model with an unknown mean. We derive the limiting distributions of three parametric estimators, namely exact Whittle, time-domain maximum likelihood, and conditional sum of squares, under common mis-specification of the short-memory dynamics. We show that, given a consistent estimator of the mean, these three estimators converge to the same pseudo-true value and their asymptotic distributions are identical to those of the frequency-domain maximum likelihood and discrete Whittle (DWH) estimators, which are mean-invariant. We analyze the properties of a linear predictor under mis-specification, demonstrating that it is biased unless the true mean is zero and that the mean squared forecast error depends on the true and pseudo-true fractional differencing parameters. To support our theoretical findings, we conduct an extensive numerical exploration of these estimation methods. Our simulations reveal that the DWH estimator performs best in terms of bias and mean squared error and provides superior forecast accuracy when combined with the sample mean.
Keywords
conditional sum of squares
linear predictor
long memory model
maximum likelihood
mis-specification
pseudo-true value
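The conditional-sum-of-squares (CSS) mechanics can be sketched for a correctly specified ARFIMA(0, d, 0) with the sample mean plugged in for the unknown mean; the mis-specified short-memory dynamics and the other estimators studied in the paper are not reproduced (sample size and grid resolution are arbitrary):

```python
import numpy as np

def fracdiff(y, d):
    """Apply the truncated (type II) fractional difference (1 - L)^d,
    treating pre-sample values as zero."""
    n = len(y)
    c = np.empty(n)
    c[0] = 1.0
    for j in range(1, n):
        c[j] = c[j - 1] * (j - 1 - d) / j   # binomial expansion of (1 - L)^d
    return np.array([np.dot(c[:t + 1], y[t::-1]) for t in range(n)])

def css_estimate(y, grid=None):
    """CSS estimator of d: minimize the sum of squared fractionally
    differenced residuals, after subtracting the sample mean."""
    if grid is None:
        grid = np.linspace(-0.45, 0.45, 91)   # step 0.01
    z = y - y.mean()
    sse = [np.sum(fracdiff(z, d) ** 2) for d in grid]
    return grid[int(np.argmin(sse))]
```

Simulating a fractionally integrated series with d = 0.3 around a nonzero mean and re-estimating recovers d, since the truncated differencing filter inverts the generating filter exactly.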
In a clustered interference setting, with networks collected within clusters and no interference between clusters, we introduce a general causal estimand for conditional spillover effects, offering flexible ways of aggregating unit-to-unit spillover effects. This estimand makes it possible to assess the heterogeneity of a unit's spillover effect on its neighbors with respect to the unit's characteristics. Two weighted regression-based estimators are proposed: i) at the individual level, taking neighbors' averages either in the outcomes or in the treatments within weights; and ii) at the dyadic level, where the outcome of one unit is regressed on the treatment of each neighbor. When the covariates driving the heterogeneity are categorical, we prove that the two regression-based estimators are equivalent to the non-parametric Hajek estimator. For continuous covariates, we demonstrate that both estimators consistently estimate the proposed estimands. Under a design-based perspective, we derive HAC variance estimators and establish a central limit theorem. We then apply our methods to a randomized experiment conducted in Honduras to evaluate the spillover effect of a behavioral intervention.
Keywords
causal inference in networks
design-based causal inference
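The non-parametric Hajek estimator referenced above can be sketched in its simplest form on a toy ring network with a Bernoulli design (the exposure definition and outcome model are invented; the dyadic regressions and HAC variance estimators are not reproduced):

```python
import random

def hajek_contrast(outcomes, exposed, p_exposed):
    """Hajek (self-normalized IPW) estimate of the mean outcome under
    exposure minus the mean outcome under no exposure, weighting each
    unit by the inverse probability of its realized exposure status."""
    num1 = den1 = num0 = den0 = 0.0
    for y, z, p in zip(outcomes, exposed, p_exposed):
        if z:
            num1 += y / p
            den1 += 1.0 / p
        else:
            num0 += y / (1.0 - p)
            den0 += 1.0 / (1.0 - p)
    return num1 / den1 - num0 / den0

# Toy design: n units on a ring, independent Bernoulli(1/2) treatments,
# exposure = "at least one of my two neighbors is treated" (prob 3/4).
rng = random.Random(11)
n = 4000
treat = [rng.random() < 0.5 for _ in range(n)]
exposed = [treat[(i - 1) % n] or treat[(i + 1) % n] for i in range(n)]
# Invented outcome model: the spillover raises the outcome by 2 on average.
outcomes = [1.0 + 2.0 * exposed[i] + rng.gauss(0, 1) for i in range(n)]
est = hajek_contrast(outcomes, exposed, [0.75] * n)
```

With a constant exposure probability the Hajek weights cancel and the estimator reduces to a difference of group means; unequal, covariate-dependent probabilities are where the weighting matters.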
Staggered rollout cluster randomized experiments (SR-CREs) are increasingly used for their practical feasibility and logistical convenience. These designs involve staggered treatment adoption across clusters, requiring analysis methods that account for an exhaustive class of dynamic causal effects, anticipation, and non-ignorable cluster-period sizes. Without imposing outcome modeling assumptions, we study regression estimators using individual data, cluster-period averages, and scaled cluster-period totals, with and without covariate adjustment, from a design-based perspective where only the treatment adoption time is random. We establish consistency and asymptotic normality of each regression estimator under a finite-population framework and formally prove that the associated variance estimators are asymptotically conservative in the Löwner ordering. Furthermore, we conduct a unified efficiency comparison of the estimators and provide practical recommendations. We highlight the efficiency advantage of estimators based on scaled cluster-period totals with covariate adjustment over their counterparts using individual-level data and cluster-period averages.
Keywords
covariate adjustment
causal inference
cluster-robust variance estimator
design-based inference
heteroskedasticity-consistent variance estimator
finite-population central limit theorem
Co-Author
Fan Li, Yale School of Public Health
First Author
Xinyuan Chen, Mississippi State University
Presenting Author
Xinyuan Chen, Mississippi State University
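Why non-ignorable cluster-period sizes matter can be seen in a deterministic two-cluster example: averaging cluster means weights clusters equally, whereas scaled cluster-period totals weight by size, so the two target different estimands whenever effects vary with cluster size (numbers invented):

```python
# Two clusters: sizes and per-participant treatment effects (invented).
sizes = [10, 100]
effects = [1.0, 2.0]   # the effect is larger in the larger cluster

# Equal-weight (cluster-average) estimand.
cluster_avg = sum(effects) / len(effects)                      # 1.5

# Size-weighted (participant-average) estimand, the target of
# estimators built from scaled cluster-period totals.
participant_avg = (sum(s * e for s, e in zip(sizes, effects))
                   / sum(sizes))                               # 21/11
```

When cluster-period sizes are unrelated to the effects, the two estimands coincide; when they are informative, as here, the choice of estimator implicitly chooses the estimand.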
Google Cloud uses A/B testing for launch decisions and relies on A/A tests to validate the A/B testing infrastructure. A key metric is initial page load latency: the time it takes each page to load all elements from start to finish. A series of A/A experiments revealed unexpectedly high false discovery rates (FDR) at the page-path level, even after common corrections such as the Bonferroni adjustment. Drawing on genomics methodology, we derived a new significance threshold using permutation tests: we randomly assigned users to "treatment" and "control" groups, calculated nonparametric p-values for the 75th-percentile latency, sorted the p-values, recorded the 1,000 smallest, and repeated this 10,000 times. This yielded the minimum p-value at which the cumulative distribution function approached 0.05, returning FDRs to expected levels. We also evaluated the trade-off between significance threshold and power by injecting hypothetical lifts. This solution was implemented in Google's internal A/B experiment tools.
Keywords
online A/B experimentation
false discovery rate (FDR)
permutation testing
power analysis
Google Cloud
multiple comparisons
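The threshold-calibration idea can be sketched as follows, with a simple mean-difference p-value standing in for the nonparametric 75th-percentile test and much smaller replication counts than the production numbers quoted above (all sizes illustrative):

```python
import math
import random

def z_pvalue(a, b):
    """Two-sided p-value for a difference in means (normal approximation);
    a stand-in for the nonparametric quantile test used in production."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return math.erfc(abs(z) / math.sqrt(2))

def minp_threshold(latencies_by_path, n_reps=200, alpha=0.05, seed=0):
    """Per-test significance threshold: the empirical alpha-quantile of the
    minimum p-value across page paths over repeated null A/A splits, in the
    spirit of Westfall-Young minP calibration."""
    rng = random.Random(seed)
    min_ps = []
    for _ in range(n_reps):
        ps = []
        for xs in latencies_by_path:
            xs = xs[:]
            rng.shuffle(xs)          # random null "treatment"/"control" split
            half = len(xs) // 2
            ps.append(z_pvalue(xs[:half], xs[half:]))
        min_ps.append(min(ps))
    min_ps.sort()
    return min_ps[int(alpha * n_reps)]
```

Because it is calibrated on the joint null distribution of all page-path p-values, the resulting threshold sits well below 0.05, which is what restores the expected false-positive behavior.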