Contributed Poster Presentations: Caucus for Women in Statistics
Ryan Peterson
Chair
University of Colorado - Anschutz Medical Campus
Wednesday, Aug 7: 10:30 AM - 12:20 PM
6058
Contributed Posters
Oregon Convention Center
Room: CC-Hall CD
Main Sponsor
Caucus for Women in Statistics
Presentations
In the context of declining national college enrollment rates in recent years, this study focuses on the increasingly competitive recruitment of students from marginalized ethnic backgrounds to promote diversity. It uses machine learning to analyze enrollment decision-making data, addressing budgeting uncertainties related to the enrollment of these students. The dataset, obtained from an urban, non-profit, 4-year private university in the Midwest, spans seven years and includes 53,240 students from marginalized ethnic backgrounds, with 49 features used to predict enrollment decisions. The variance inflation factor was applied to mitigate multicollinearity, and stratified 10-fold cross-validation was applied to address the highly imbalanced nature of the enrollment decisions. Four machine learning models were evaluated using classification metrics (accuracy, sensitivity, specificity, precision, F-score, and areas under the ROC and PR curves) to determine the most effective model for predicting student enrollment. The study's implications extend to the practical application of machine learning in enrollment management and strategy development for a diverse student body in U.S. higher education.
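As a rough illustration of two techniques named in this abstract, the sketch below shows stratified fold assignment and the confusion-matrix metrics in plain Python. It is not the authors' pipeline: the function names `stratified_folds` and `classification_metrics`, the 5 percent positive rate, and the synthetic labels are all assumptions made for the example, and the variance inflation factor step and the four models are not reproduced.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign each index to one of k folds so that every fold keeps
    roughly the same class proportions as the full label list."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = enrolled)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f_score = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }


# Hypothetical imbalanced labels: 50 enrolled (1) out of 1,000 students.
labels = [1] * 50 + [0] * 950
folds = stratified_folds(labels, k=10)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]

# A classifier that always predicts "not enrolled" looks accurate
# but has zero sensitivity, which is why several metrics are reported.
always_negative = classification_metrics(labels, [0] * len(labels))
```

With stratification, each fold in this synthetic example receives the same number of enrolled students, so no fold is left without positives. The `always_negative` result shows why accuracy alone can mislead on imbalanced data.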
Keywords
Machine Learning
Higher Education Enrollment
Diversity
Ethnically Marginalized Students
Abstracts
Ranks of institutes are often estimated from estimates of certain latent features of the institutes, and because of sampling randomness it is of interest to quantify the uncertainty associated with the estimated ranks. This task is especially important in commonly seen "near tie" situations, in which the estimated latent features of some institutes are not well separated, resulting in a nonignorable portion of wrongly ordered estimated ranks. Uncertainty quantification can help mitigate some of these issues and give a fuller picture, but the task is very challenging because the ranks are discrete parameters and standard inference methods developed under regularity conditions do not apply. Bayesian methods are sensitive to prior choices, while large-sample methods do not work because the central limit theorem fails to hold for the estimated ranks. In this article, we propose a repro samples method to address this nontrivial irregular inference problem by developing a confidence set for the true rank of the institutes. The confidence set has a finite-sample coverage guarantee, and the method can handle difficult near-tie cases. The effectiveness of the proposed de
Keywords
Inference on discrete parameter space
Finite-sample performance guarantee
Discrete parameter space
Irregular inference problem
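The "near tie" problem described in the ranking abstract can be illustrated with a small simulation. This is only a sketch of why estimated ranks become unreliable when latent features are close together. It is not the repro samples method, and it does not construct a confidence set. The function name `misordering_rates`, the true means, the unit noise level, and the sample sizes are all hypothetical choices for the example.

```python
import random
import statistics


def misordering_rates(true_means, n_obs=20, n_reps=2000, seed=1):
    """Estimate how often adjacent institutes are ranked in the wrong order.

    true_means must be sorted in ascending order. Each replicate draws
    n_obs noisy scores per institute (normal noise, standard deviation 1)
    and compares the resulting sample means.
    """
    rng = random.Random(seed)
    swaps = [0] * (len(true_means) - 1)
    for _ in range(n_reps):
        estimates = [
            statistics.fmean(rng.gauss(mu, 1.0) for _ in range(n_obs))
            for mu in true_means
        ]
        for j in range(len(swaps)):
            # A swap occurs when the truly lower institute looks higher.
            if estimates[j] > estimates[j + 1]:
                swaps[j] += 1
    return [count / n_reps for count in swaps]


# Hypothetical latent features: the first two institutes are nearly
# tied, while the third is well separated from both.
rates = misordering_rates([0.0, 0.05, 1.0])
near_tie_rate, separated_rate = rates
```

Under these assumed settings, the nearly tied pair is misordered in a substantial share of replicates, while the well-separated pair is almost never swapped. A single point estimate of the ranks hides that difference.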
Abstracts