Optimal Data Splitting
Thursday, Aug 7: 8:35 AM - 9:00 AM
Invited Paper Session
Music City Center
It is common to split a dataset into a training set and a testing set when building statistical and machine learning models. In this talk, we will discuss deterministic methods for optimally splitting a dataset. SPlit and Twinning are two such methods; both aim to split the dataset into subsets with similar distributional characteristics. We will propose a new method for creating a testing set that not only maintains the distribution but is also difficult to predict.
training set
testing set
validation
experimental design