Wednesday, Aug 6: 8:30 AM - 10:20 AM
0861
Topic-Contributed Paper Session
Music City Center
Room: CC-209A
Applied
Yes
Main Sponsor
Section on Bayesian Statistical Science
Co Sponsors
Section on Statistical Computing
Presentations
Estimating varying treatment effects in randomized trials with noncompliance is inherently challenging since variation comes from two separate sources: variation in the impact itself and variation in the compliance rate. In this setting, existing flexible machine learning methods are highly sensitive to the weak instruments problem, in which the compliance rate is (locally) close to zero. Our main methodological contribution is to present a Bayesian Causal Forest model for binary response variables in scenarios with noncompliance. By repeatedly imputing individuals' compliance types, we can flexibly estimate heterogeneous treatment effects among compliers. Simulation studies demonstrate the usefulness of our approach when compliance and treatment effects are heterogeneous. We apply the method to detect and analyze heterogeneity in the treatment effects in the Illinois Workplace Wellness Study, which not only features heterogeneous and one-sided compliance but also several binary outcomes of interest. We demonstrate the methodology on three outcomes one year after intervention. We confirm a null effect on the presence of a chronic condition, discover meaningful heterogeneity in the impact of the intervention on metabolic parameters though the average effect is null in classical partial effect estimates, and find substantial heterogeneity in individuals' perception of management prioritization of health and safety.
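As a rough illustration of the imputation step mentioned in this abstract, the sketch below computes the posterior probability that a control-arm individual is a complier under one-sided noncompliance with a binary outcome, then draws a compliance type. It is not the authors' model: the function names and the parameters `pi_c`, `p_c0`, and `p_n` are hypothetical, and they are treated here as given numbers, whereas a forest-based model would let them vary with covariates.

```python
import numpy as np

def complier_posterior(y, pi_c, p_c0, p_n):
    """P(complier | Y = y) for control-arm units under one-sided noncompliance.

    pi_c : prior probability of being a complier (assumed given)
    p_c0 : P(Y = 1 | complier, assigned to control) (assumed given)
    p_n  : P(Y = 1 | never-taker) (assumed given)
    """
    y = np.asarray(y)
    # Bernoulli likelihood of the observed outcome under each latent type.
    lik_c = np.where(y == 1, p_c0, 1.0 - p_c0)
    lik_n = np.where(y == 1, p_n, 1.0 - p_n)
    num = pi_c * lik_c
    return num / (num + (1.0 - pi_c) * lik_n)

def impute_types(y, pi_c, p_c0, p_n, rng):
    """Draw one imputed compliance indicator (True = complier) per unit."""
    return rng.random(len(y)) < complier_posterior(y, pi_c, p_c0, p_n)
```

In the treatment arm no imputation is needed in this setting, since take-up reveals the compliance type directly; in a sampler, a draw like `impute_types` would alternate with updates of the outcome and compliance parameters.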
Combat dynamics in no-gi Brazilian jiu-jitsu (NGJJ) can be organized as sequences of discrete events. Absorbing Markov chains have been used successfully to model such sequence data. Each sequence can carry a label identifying a subgroup, such as a weight category in NGJJ. We aim to estimate the transition probabilities of tactic progression in NGJJ while simultaneously exploring the effects of weight categories on this progression and the underlying groupings of these weight categories. To model these probabilities, we use a hierarchical absorbing Markov chain model with a random partition based on the Ewens-Pitman attraction distribution of Dahl, Day, and Tsai. With this model we can cluster the weight categories to borrow strength and estimate the variation between them even with sparse data. The hierarchical structure yields an overall mean estimate against which the individual weight-category estimates can be compared. We cover the literature on combat sports, Markov chains, and similar hierarchical models, the methodology used, and a discussion of the application.
Keywords
Sparse count data
dependent random partition models
Markov chain Monte Carlo
partial pooling
slice sampler
no-gi Brazilian jiu-jitsu
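For readers unfamiliar with absorbing Markov chains, the following minimal sketch shows the standard quantities they provide once transition probabilities are known: the probability of ending in each absorbing state and the expected number of steps before absorption. The matrix layout (`Q` for transient-to-transient, `R` for transient-to-absorbing) and the function names are assumptions for illustration; the hierarchical model and random partition described in the abstract are not implemented here.

```python
import numpy as np

def absorption_probabilities(Q, R):
    """Return B, where B[i, j] is the probability of being absorbed in
    absorbing state j when starting from transient state i.

    Q : (t, t) transition probabilities among transient states
    R : (t, a) transition probabilities from transient to absorbing states
    """
    Q = np.asarray(Q, dtype=float)
    R = np.asarray(R, dtype=float)
    # B = (I - Q)^{-1} R, computed with a linear solve instead of an inverse.
    return np.linalg.solve(np.eye(Q.shape[0]) - Q, R)

def expected_steps(Q):
    """Expected number of steps before absorption from each transient state."""
    Q = np.asarray(Q, dtype=float)
    return np.linalg.solve(np.eye(Q.shape[0]) - Q, np.ones(Q.shape[0]))
```

In a match-sequence setting, the transient states could be positions or tactics and the absorbing states the ways a match ends; each row of `Q` and `R` together must sum to one.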
Confidential data, such as electronic health records, activity data from wearable devices, and geolocation data, are becoming increasingly prevalent. Differential privacy provides a framework to conduct statistical analyses while mitigating the risk of leaking private information. Compositional data, which consist of vectors with positive components that add up to a constant, have received little attention in the differential privacy literature. This article proposes differentially private approaches for analyzing compositional data based on the Dirichlet distribution. We explore several methods, including Bayesian and bootstrap procedures. For the Bayesian methods, we consider posterior inference techniques based on Markov chain Monte Carlo, Approximate Bayesian Computation, and asymptotic approximations. We conduct an extensive simulation study to compare these approaches and make evidence-based recommendations. Finally, we apply the methodology to a dataset from the American Time Use Survey.
Speaker
Yilin Zhu, University of Massachusetts Amherst
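The abstract on compositional data does not state which privacy mechanism is used, so the following is only a generic sketch of one possibility: releasing the Dirichlet sufficient statistic (the column sums of log components) through the Laplace mechanism. The clipping bound `lower`, the function name, and the choice of mechanism are all assumptions made for illustration, not the presenters' procedure.

```python
import numpy as np

def private_log_sum(X, epsilon, lower=1e-3, rng=None):
    """Noisy column sums of log components for compositional data.

    X       : (n, k) array whose rows are compositions (positive, sum to 1)
    epsilon : privacy budget for this release
    lower   : assumed clipping bound so each log value lies in [log(lower), 0]
    """
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    # Clipping bounds each record's contribution, which bounds the sensitivity.
    logs = np.log(np.clip(X, lower, 1.0))
    # One record changes the sum by at most |log(lower)| in each of k coordinates.
    sensitivity = k * abs(np.log(lower))
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=k)
    return logs.sum(axis=0) + noise
```

The noisy statistic could then feed a downstream estimator of the Dirichlet parameters; the clipping introduces bias, which is one reason a comparison of alternative approaches, as described in the abstract, is useful.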