Specific Source Machine Learning Score-based Likelihood Ratios for Forensic Evidence

Presented During: Data-Driven Justice: Transforming Forensic Science with Statistics, AI, and Data Science

Federico Veneri Co-Author
Iowa State University

Danica Ommen Speaker
Iowa State University

Monday, Aug 4: 11:55 AM - 12:15 PM
Topic-Contributed Paper Session

Music City Center

The specific source problem is a type of inference in forensic science in which the aim is to assess whether a particular source generated the evidence or whether it came from an alternative, unknown source. Score-based likelihood ratios (SLRs) quantify the relative likelihood of the evidence under the two propositions when the features are complex. This analysis requires a conditional inference, but data for the specific source (e.g., control items related to the person of interest) are often scarce, making the approach practically infeasible. Furthermore, the dependence structure created by the current procedure for generating training data for machine learning algorithms can reduce the performance of such SLR systems. To address both issues, we propose creating synthetic items to train machine learning algorithms for the specific source problem. Simulation results show that our approach achieves a high level of agreement with an ideal scenario in which data are not a limitation and are independent. We also present real-world applications in forensic science.
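The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes three simplifications: synthetic items are made by interpolating between random pairs of control items (a SMOTE-style shortcut that skips the nearest-neighbour step), the score is the Euclidean distance to the centroid of the control items (a stand-in for a machine learning score such as one from a random forest), and the two score densities are estimated with a Gaussian kernel density estimate with an arbitrary bandwidth.

```python
import math
import random


def smote_like(items, n_new, rng):
    """Synthetic items by linear interpolation between random pairs of real items.

    Simplification: real SMOTE interpolates towards nearest neighbours.
    """
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(items, 2)
        t = rng.random()
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


def centroid(items):
    """Component-wise mean of a list of feature vectors."""
    n = len(items)
    return [sum(col) / n for col in zip(*items)]


def score(item, reference):
    """Dissimilarity score (assumption: Euclidean distance, not an ML score)."""
    return math.dist(item, reference)


def kde(samples, x, bandwidth):
    """Gaussian kernel density estimate of the score density at x."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm


def slr(evidence_score, same_scores, diff_scores, bandwidth=0.5):
    """Score-based likelihood ratio: density under 'specific source' over
    density under 'alternative source', both evaluated at the evidence score."""
    num = kde(same_scores, evidence_score, bandwidth)
    den = kde(diff_scores, evidence_score, bandwidth)
    return num / max(den, 1e-300)  # floor guards against underflow to zero


# Toy data (made up): few control items from the specific source,
# plus items from an alternative-source population.
rng = random.Random(0)
control_items = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alternative_items = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [6.0, 6.0]]

ref = centroid(control_items)
synthetic_items = smote_like(control_items, 100, rng)
same_scores = [score(s, ref) for s in synthetic_items]
diff_scores = [score(a, ref) for a in alternative_items]

evidence = [0.4, 0.6]
print(slr(score(evidence, ref), same_scores, diff_scores))
```

In this sketch, an SLR above 1 supports the proposition that the specific source generated the evidence, and an SLR below 1 supports the alternative-source proposition. The synthetic items only enlarge the set of scores available for the numerator density; whether that reproduces the independence properties described in the abstract depends on the authors' actual procedure, which is not shown here.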

Keywords

forensics

random forest

SMOTE

resampling

data augmentation

handwriting