019 - Matching irregular electronic health record intensive care unit data to create an accurate and analyzable dataset
Conference: International Conference on Health Policy Statistics 2023
01/10/2023: 7:30 PM - 8:30 PM MST
Posters
Background
Electronic health record (EHR) data have been called the holy grail of health services research because they offer comprehensive access to clinical data for answering research questions. Such data are especially important in the intensive care setting, where ventilated subjects often rely on several life support and monitoring technologies that stream longitudinal data into the EHR.
Fortunately, many of these data are discretely structured as numerical values, along with units of measurement and a timestamp for when the measurement was taken or a ventilator setting was recorded. However, these data are often collected at irregular intervals, reflecting the dynamic nature of critically ill patients whose monitoring and support needs often fluctuate. This irregularity presents a challenge when analysts use these data to reconstruct a clinical picture and create an analyzable dataset: the timestamps of different measures do not align, with some measurements offset by seconds and others by hours.
Methodological Aim
Our team had to solve the problem of aligning a multitude of irregular, sparse data from ventilated subjects in an intensive care setting. To our knowledge, this problem has not been addressed for this clinical population, which demonstrates the innovation of this methodological approach. Although we implemented this method in SAS (using PROC SQL), it could readily be implemented in other statistical software such as R or Stata.
Methods
We created a long file in which each subject has one date/time entry for each minute of intubation, from the first to the last date/time of intubation, along with the subject identifier. For example, a subject intubated for 5 full days would have 7,200 rows (5 days × 24 hours × 60 minutes), each containing two columns: subject identifier and date/time. For each clinical measure we wished to add to the table, the original date/time of the measurement was rounded to the nearest minute. The tables were then left-joined using PROC SQL by subject identifier and date/time.
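The grid-and-join step can be sketched outside SAS as well. The following is a minimal Python illustration using only the standard library, not the authors' SAS code. The function names (`round_to_minute`, `minute_grid`, `left_join`) are hypothetical. Two assumptions are made that the abstract does not specify: the grid excludes the final minute, so that 5 full days gives exactly 7,200 rows, and when two readings of a measure round to the same minute, the later one is kept.

```python
from datetime import datetime, timedelta

ONE_MINUTE = timedelta(minutes=1)


def round_to_minute(ts):
    """Round a timestamp to the nearest minute (30 seconds or more rounds up)."""
    floored = ts.replace(second=0, microsecond=0)
    if ts - floored >= timedelta(seconds=30):
        floored += ONE_MINUTE
    return floored


def minute_grid(subject_id, start, end):
    """Return one (subject_id, minute) row per minute of intubation.

    Assumption: the interval is end-exclusive, so 5 full days gives 7,200 rows.
    """
    rows = []
    t = round_to_minute(start)
    stop = round_to_minute(end)
    while t < stop:
        rows.append((subject_id, t))
        t += ONE_MINUTE
    return rows


def left_join(grid, measurements):
    """Left-join one clinical measure onto the grid by subject and rounded minute.

    measurements: iterable of (subject_id, timestamp, value).
    Grid rows without a matching measurement get None.
    Assumption: if several readings round to the same minute, the latest wins.
    """
    lookup = {}
    for subject_id, ts, value in sorted(measurements, key=lambda m: m[1]):
        lookup[(subject_id, round_to_minute(ts))] = value
    return [(sid, t, lookup.get((sid, t))) for sid, t in grid]
```

Calling `minute_grid` once per subject and `left_join` once per clinical measure would build up the long table one column at a time. A true SQL left join would instead return duplicate rows when several readings share a minute, so the tie-breaking rule is worth settling explicitly in any real implementation.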
After ample discussion with our clinical collaborators, we felt confident that a value could safely be assumed to remain unchanged between recorded measurements until a change was documented, especially for ventilation parameters. Therefore, we used last observation carried forward to fill in the values between observed measures for each variable, yielding a complete dataset. As an example, this allowed us to calculate a time-weighted mean ventilator setting and analyze its impact on ICU outcomes; prior clinical trials and observational studies operationalized this exposure as a single ventilator setting at 10 AM.
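Last observation carried forward and the resulting time-weighted mean can be sketched in a few lines. This is a minimal Python illustration rather than the authors' SAS code. The function names (`locf`, `time_weighted_mean`) are hypothetical, and the sketch assumes that minutes before a subject's first recorded value stay missing and are excluded from the mean.

```python
def locf(values):
    """Last observation carried forward over a minute-by-minute column.

    values: one entry per minute, None where nothing was recorded.
    Minutes before the first observation remain None (assumption).
    """
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def time_weighted_mean(filled):
    """Mean of a filled minute-by-minute column.

    Every row stands for exactly one minute, so the plain mean of the
    filled rows weights each setting by how long it was in effect.
    Returns None if no minute has a value.
    """
    present = [v for v in filled if v is not None]
    if not present:
        return None
    return sum(present) / len(present)
```

For a made-up column `[None, 5.0, None, None, 8.0, None]`, `locf` carries 5.0 across three minutes and 8.0 across two, so the time-weighted mean is (3 × 5.0 + 2 × 8.0) / 5 = 6.2. A single reading taken during the 5.0 stretch would report 5.0 instead.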
Discussion
This method yields a long table containing a minute-by-minute record of mechanical ventilation for each subject, with all available clinical measurements present. This allows the researcher to ask a multitude of research questions that would previously have been impossible, such as quantifying exposure time to lung-protective ventilation, identifying spontaneous breathing trials, and assessing adherence to site-specific clinical pathways or clinical best practices. Full SAS code and synthetic example data will be provided to attendees to allow for learning.
How presentation adds to diversity mission of the conference:
Presenter diversity: I am a first-generation college graduate. I am also a service-disabled Veteran of the United States Coast Guard.
Presentation diversity: Presentations such as this, which are hands-on and immediately applicable, are especially helpful to junior researchers and to those venturing into the world of EHR data research, particularly in an intensive care setting. The details of how to think and work through these complex data issues are often omitted from published research, but understanding them is essential to moving our field forward in the current era of observational data.
Methodology
Electronic health records
EHR
Presenting Author
Daniel Brinton, Medical University of South Carolina
First Author
Daniel Brinton, Medical University of South Carolina
CoAuthor(s)
Annie Simpson, Medical University of South Carolina
Andrew Goodwin, Medical University of South Carolina