09 Machine Learning Utilization for Predicting US College Enrollment of Ethnically Marginalized Students

Anna Kye First Author
 
Anna Kye Presenting Author
 
Wednesday, Aug 7: 10:30 AM - 12:20 PM
1914 
Contributed Posters 
Oregon Convention Center 
Against a backdrop of declining national college enrollment rates in recent years, this study focuses on the increasingly competitive recruitment of students from marginalized ethnic backgrounds to promote diversity. It applies machine learning to enrollment decision data to address the budgeting uncertainty associated with these students' enrollment. The dataset, obtained from a private, non-profit, four-year urban university in the Midwest, spans seven years and includes 53,240 students from marginalized ethnic backgrounds, with 49 features used to predict enrollment decisions. The variance inflation factor was used to mitigate multicollinearity, and stratified 10-fold cross-validation was used to address the highly imbalanced distribution of enrollment decisions. Four machine learning models were evaluated on classification metrics (accuracy, sensitivity, specificity, precision, F-score, and the areas under the ROC and PR curves) to determine which was most effective at predicting student enrollment. The study's implications extend to the practical application of machine learning in enrollment management and strategy development for a diverse student body in U.S. higher education.
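
The abstract does not name the software, the four models, or the class ratio, so the following is only a minimal sketch of two of the steps it mentions: stratified k-fold assignment for imbalanced labels and the threshold-based classification metrics. It assumes binary labels (1 = enrolled, 0 = not enrolled); the function names `stratified_folds` and `classification_metrics` are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds that preserve the class ratio.

    Hypothetical helper: the study's actual tooling is not stated.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold receives a
        # near-equal share of that class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, precision, and F-score.

    Assumes binary labels with 1 as the positive (enrolled) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a class is absent.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall on the enrolled class
    specificity = ratio(tn, tn + fp)  # recall on the non-enrolled class
    precision = ratio(tp, tp + fp)
    f_score = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }
```

With made-up labels in which 10% of students enroll, each of the ten folds keeps that same 10% share, so every validation fold contains minority-class cases. The areas under the ROC and PR curves need predicted scores rather than hard labels and are not covered by this sketch.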

Keywords

Machine Learning

Higher Education Enrollment

Diversity

Ethnically Marginalized Students 

Main Sponsor

Caucus for Women in Statistics