Building blocks of Efficient Initial Data Analysis and Data Quality Assessments – Best practice examples

Carsten Oliver Schmidt, Speaker
University Medicine Greifswald
 
Wednesday, Aug 6: 10:30 AM - 12:20 PM
Topic-Contributed Paper Session 
STRATOS Topic Group 3 (TG3) deals with all assessment steps performed on the data of a study between the end of data collection/entry and the start of the statistical analyses that address the research questions. These steps are referred to as Initial Data Analysis (IDA). Deficiencies in these preliminary steps may lead to the application of inappropriate statistical methods or to incorrect conclusions. Consequently, TG3 develops guidance for systematically planning and conducting IDA [1].
This presentation will first discuss the rationale for incorporating IDA into statistical analysis plans and outline how to integrate it effectively. Second, it will provide an overview of best practices for conducting IDA in the context of regression-type analyses [2,3]. Third, at the intersection of IDA and data quality assessment (DQA), approaches to data handling will be introduced to facilitate systematic data evaluation and checking [1,4]. The talk will highlight the critical role of effective metadata management, specifically the structured annotation of knowledge about the data, in supporting both IDA and DQA.

1. Huebner M, le Cessie S, Schmidt CO, Vach W. A Contemporary Conceptual Framework for Initial Data Analysis. Observational Studies 2018;4:171-92.
2. Heinze G, Baillie M, Lusa L, et al. Regression without regrets – initial data analysis is a prerequisite for multivariable regression. BMC Med Res Methodol 2024;24:178.
3. Lusa L, Proust-Lima C, Schmidt CO, et al. Initial data analysis for longitudinal studies to build a solid foundation for reproducible analysis. PLoS ONE 2024;19:e0295726.
4. Struckmann S, Marino J, Kasbohm E, Salogni E, Schmidt CO. dataquieR 2: An updated R package for FAIR data quality assessments in observational studies and electronic health record data. Journal of Open Source Software 2024;9:6581.

Keywords

Initial data analysis

Data quality

Regression modelling

Metadata