Enhancing OMOP Vocabulary Mapping with a Transformer-Based Semantic-Hierarchical Framework
Dian Zhou
Co-Author
University of Illinois Urbana-Champaign
Enshuo Hsu
Co-Author
University of Texas MD Anderson Cancer Center
Jiefei Wang
First Author
University of Texas Medical Branch
Jiefei Wang
Presenting Author
University of Texas Medical Branch
Monday, Aug 4: 11:05 AM - 11:20 AM
2288
Contributed Papers
Music City Center
Limited interoperability across electronic health record (EHR) systems, driven by inconsistent medical terminologies, is a critical barrier to leveraging healthcare data for policy and research. The OMOP Common Data Model (CDM) offers a standardized framework to harmonize data across platforms. However, traditional rule-based mapping to OMOP is labor-intensive, which disproportionately burdens underserved hospitals with limited resources. Existing tools, such as USAGI, alleviate this burden by automating the mapping process, but they struggle with semantic complexity. For example, mapping "Leukemia" to its superclass "Hematologic neoplasm" requires understanding hierarchical relationships that go beyond surface-level text similarity.
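The "Leukemia" example can be made concrete with a minimal sketch of a purely lexical scorer. The character-trigram Jaccard measure and the distractor term "Leukopenia" are assumptions chosen for illustration; this is not how USAGI or any other tool scores candidates.

```python
def char_trigrams(text):
    """Return the set of character trigrams of a padded, lowercased string."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def jaccard(a, b):
    """Jaccard overlap between the trigram sets of two terms."""
    ta, tb = char_trigrams(a), char_trigrams(b)
    return len(ta & tb) / len(ta | tb)


source_term = "Leukemia"
# "Hematologic neoplasm" is the superclass named in the abstract;
# "Leukopenia" is an illustrative distractor (an assumption, not from the abstract).
candidates = ["Hematologic neoplasm", "Leukopenia"]

scores = {c: jaccard(source_term, c) for c in candidates}
best = max(scores, key=scores.get)

for term, score in scores.items():
    print(f"{term}: {score:.2f}")
print("Lexical best match:", best)
```

Under this measure the superclass shares no trigrams with the source term, so the lexically similar but clinically unrelated distractor ranks first.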
In this talk, we propose a novel transformer-based model for automated OMOP terminology mapping that integrates OMOP's vocabulary structure and relational hierarchy. Two special tokens are added to guide the model's focus during training, and the resulting dual-task training approach captures ontology-based dependencies beyond surface-level semantics. Preliminary evaluation on the unseen CIEL vocabulary (condition domain) shows improved accuracy and scalability compared with existing methods.
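One way to picture special tokens in a dual-task setup is as prefixes that mark which task a training pair belongs to. The sketch below is a guess at the general idea, not the authors' implementation: the token strings `[SEM]` and `[HIER]`, the toy synonym and parent tables, and the function `build_pairs` are all hypothetical.

```python
# Hypothetical token strings; the abstract does not name the two tokens.
SEMANTIC_TOKEN = "[SEM]"
HIERARCHY_TOKEN = "[HIER]"

# Toy stand-ins for OMOP vocabulary content (illustrative only).
synonyms = {"Leukemia": ["Leukaemia"]}           # alternate spelling of a concept name
parents = {"Leukemia": ["Hematologic neoplasm"]}  # superclass example from the abstract


def build_pairs(synonyms, parents):
    """Build (anchor, positive) text pairs, prefixing each anchor with a task token."""
    pairs = []
    # Semantic task: a concept name should sit close to its synonyms.
    for concept, names in synonyms.items():
        for name in names:
            pairs.append((f"{SEMANTIC_TOKEN} {name}", concept))
    # Hierarchical task: a concept name should sit close to its superclass.
    for concept, ancestors in parents.items():
        for ancestor in ancestors:
            pairs.append((f"{HIERARCHY_TOKEN} {concept}", ancestor))
    return pairs


training_pairs = build_pairs(synonyms, parents)
for anchor, positive in training_pairs:
    print(anchor, "->", positive)
```

Pairs of this shape could then be fed to a contrastive objective in a sentence-transformer training loop, with the prefix telling the encoder whether to attend to synonymy or to hierarchy.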
sentence transformer
OMOP Common Data Model
semantic similarity
hierarchical relationships
terminology mapping
healthcare data integration
Main Sponsor
Health Policy Statistics Section