A Guide to the OMOP Common Data Model


The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) is a standardized framework designed by the Observational Health Data Sciences and Informatics (OHDSI) community. This open-science community aims to improve the quality of healthcare by providing guidelines for a more harmonized approach to data science.

Hospitals use disparate software systems, each with its own proprietary data model. This poses a significant challenge to research studies and complicates data analysis. Even when the data is structured, the data models are often not compatible. The OMOP CDM addresses this challenge by offering a common standardized model for representing health-related data from diverse sources. This facilitates observational research and analysis across varied healthcare organizations.

OMOP CDM for data standardization

The data model is organized as a relational database structure. It comprises tables representing various aspects of patient information, clinical events (such as diagnoses and procedures), and healthcare observations (such as vital signs and other clinical measurements). This relational structure allows a wide range of healthcare data to be represented in a standardized manner. The tables have fixed columns and are linked to each other, so researchers can query and analyze data across multiple dimensions, with standardized vocabularies further enhancing consistency and interoperability.

An overview of the tables in the OMOP CDM v5.4*

Let’s look at a simplified example of a patient’s hospital encounter to illustrate the links between tables.

In the architecture below, a patient (PERSON) can have certain conditions (CONDITION_OCCURRENCE) and is hospitalized (VISIT_OCCURRENCE) to have surgery to cure one of these conditions. The procedure (PROCEDURE_OCCURRENCE) is performed by a clinician (PROVIDER) working in a specific healthcare facility (CARE_SITE). The observation (OBSERVATION) linked to the patient could be their BMI at a specific moment.

Simplified example of a patient’s hospital encounter
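
To make the links between these tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the CDM v5.4 naming as we understand it, but each table is cut down to a handful of columns, and every ID, name, date, and concept ID below is a made-up placeholder rather than real data or a real vocabulary entry.

```python
import sqlite3

# In-memory database with a heavily reduced subset of the CDM tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE care_site (
    care_site_id INTEGER PRIMARY KEY, care_site_name TEXT);
CREATE TABLE provider (
    provider_id INTEGER PRIMARY KEY, provider_name TEXT,
    care_site_id INTEGER REFERENCES care_site(care_site_id));
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
CREATE TABLE visit_occurrence (
    visit_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER REFERENCES person(person_id),
    visit_start_date TEXT, visit_end_date TEXT,
    care_site_id INTEGER REFERENCES care_site(care_site_id));
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER REFERENCES person(person_id),
    condition_concept_id INTEGER, condition_start_date TEXT,
    visit_occurrence_id INTEGER REFERENCES visit_occurrence(visit_occurrence_id));
CREATE TABLE procedure_occurrence (
    procedure_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER REFERENCES person(person_id),
    procedure_concept_id INTEGER, procedure_date TEXT,
    provider_id INTEGER REFERENCES provider(provider_id),
    visit_occurrence_id INTEGER REFERENCES visit_occurrence(visit_occurrence_id));
CREATE TABLE observation (
    observation_id INTEGER PRIMARY KEY,
    person_id INTEGER REFERENCES person(person_id),
    observation_concept_id INTEGER, observation_date TEXT,
    value_as_number REAL);
""")

# Placeholder rows: one patient, one hospital stay, one surgery, one BMI value.
conn.execute("INSERT INTO care_site VALUES (1, 'Example General Hospital')")
conn.execute("INSERT INTO provider VALUES (10, 'Dr. Example', 1)")
conn.execute("INSERT INTO person VALUES (1, 1980)")
conn.execute("INSERT INTO visit_occurrence VALUES (100, 1, '2024-03-01', '2024-03-05', 1)")
conn.execute("INSERT INTO condition_occurrence VALUES (1000, 1, 2000001, '2024-02-10', 100)")
conn.execute("INSERT INTO procedure_occurrence VALUES (2000, 1, 2000002, '2024-03-02', 10, 100)")
conn.execute("INSERT INTO observation VALUES (3000, 1, 2000003, '2024-03-01', 27.4)")

# Follow the links: procedure -> person, visit, condition, provider, care site.
encounter = conn.execute("""
    SELECT p.person_id, v.visit_start_date, co.condition_concept_id,
           po.procedure_concept_id, pr.provider_name, cs.care_site_name
    FROM procedure_occurrence po
    JOIN person p ON p.person_id = po.person_id
    JOIN visit_occurrence v ON v.visit_occurrence_id = po.visit_occurrence_id
    JOIN condition_occurrence co ON co.visit_occurrence_id = v.visit_occurrence_id
    JOIN provider pr ON pr.provider_id = po.provider_id
    JOIN care_site cs ON cs.care_site_id = pr.care_site_id
""").fetchall()

# The BMI observation hangs off the same person_id.
bmi_values = conn.execute(
    "SELECT value_as_number FROM observation WHERE person_id = 1"
).fetchall()

print(encounter)
print(bmi_values)
```

Because every clinical table carries a person_id, and most carry a visit_occurrence_id, a single query can assemble the whole encounter without any site-specific knowledge of how the source system stored it.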

OMOP CDM for terminology

The OMOP CDM uses standardized vocabularies that ensure a systematic approach to representing clinical data. This fosters consistency and interoperability across diverse datasets.

The standardized vocabularies also serve as a comprehensive meta-code system, encompassing a wide array of medical terminologies. Key among these are globally recognized coding systems such as ICD-10, SNOMED-CT, and RxNorm. What sets this approach apart is the mapping of these terminologies to a unified set of standard concepts within the OMOP framework.

The OMOP CDM also supports the inclusion of local code systems. Organizations can retain their original coding systems for certain data elements while mapping them to standard concepts for broader interoperability.
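
As a rough illustration of this mapping, the sketch below resolves both an international code and a local hospital code to the same standard concept. It assumes the vocabulary is stored in CONCEPT and CONCEPT_RELATIONSHIP tables linked by a 'Maps to' relationship, which is how we understand the CDM vocabulary tables to work. The tables are reduced to a few columns, and the concept IDs, the HOSPITAL_A_LOCAL vocabulary, and the DM2 code are hypothetical placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    vocabulary_id TEXT,
    concept_code TEXT,
    standard_concept TEXT      -- 'S' marks a standard concept
);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER,      -- source concept
    concept_id_2 INTEGER,      -- target concept
    relationship_id TEXT
);
""")

# Placeholder vocabulary rows; the concept IDs are illustrative only.
conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?, ?)", [
    (1001, "Type 2 diabetes mellitus", "SNOMED", "44054006", "S"),
    (1002, "Type 2 diabetes mellitus", "ICD10", "E11", None),
    (1003, "Diabetes type II (local)", "HOSPITAL_A_LOCAL", "DM2", None),
])
conn.executemany("INSERT INTO concept_relationship VALUES (?, ?, ?)", [
    (1002, 1001, "Maps to"),
    (1003, 1001, "Maps to"),
])


def to_standard(conn, vocabulary_id, concept_code):
    """Return (concept_id, concept_name) of the mapped standard concept, or None."""
    return conn.execute("""
        SELECT std.concept_id, std.concept_name
        FROM concept src
        JOIN concept_relationship cr
          ON cr.concept_id_1 = src.concept_id
         AND cr.relationship_id = 'Maps to'
        JOIN concept std
          ON std.concept_id = cr.concept_id_2
         AND std.standard_concept = 'S'
        WHERE src.vocabulary_id = ? AND src.concept_code = ?
    """, (vocabulary_id, concept_code)).fetchone()


from_icd = to_standard(conn, "ICD10", "E11")
from_local = to_standard(conn, "HOSPITAL_A_LOCAL", "DM2")
unmapped = to_standard(conn, "HOSPITAL_A_LOCAL", "UNKNOWN")

print(from_icd, from_local, unmapped)
```

The original code is not lost in this process: the source code stays available alongside the standard concept, so an organization can still query by its own coding system when needed.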

OMOP CDM for federated research

The OMOP CDM excels in accommodating federated approaches, enhancing the flexibility and scalability of collaborative research initiatives. A federated model in healthcare data refers to the decentralized storage (e.g., within each hospital’s IT infrastructure) and processing of data across multiple independent sources, ensuring that each entity retains control over its data while enabling collaborative analyses.

The OMOP CDM’s federated capabilities stem from its inherent design, allowing the model to be implemented across various data infrastructures without centralization. Each participating organization or data partner can maintain its own instance of the OMOP CDM.

The federated nature of the OMOP CDM enables researchers to conduct analyses across disparate databases, gaining insights from diverse populations and healthcare contexts. This approach aligns with the principles of the OHDSI community, promoting inclusivity and collaboration while respecting the autonomy and privacy of each contributing entity.

A simplified example of OHDSI federated data network model with the OMOP CDM **
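
The federated pattern can be sketched in a few lines of Python. This is a simplified illustration, not OHDSI's actual network tooling: two in-memory SQLite databases stand in for two hospitals' own CDM instances, the same query runs at each site, and only an aggregate count leaves the site. The site names, the helper functions, and the concept ID 1001 are hypothetical.

```python
import sqlite3


def make_site(rows):
    """Create an in-memory stand-in for one data partner's own CDM instance."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence ("
        "person_id INTEGER, condition_concept_id INTEGER)"
    )
    conn.executemany("INSERT INTO condition_occurrence VALUES (?, ?)", rows)
    return conn


# The same study query is shipped to every site; this works because all
# sites share the same table and column names.
STUDY_QUERY = """
    SELECT COUNT(DISTINCT person_id)
    FROM condition_occurrence
    WHERE condition_concept_id = ?
"""


def run_locally(conn, concept_id):
    """Runs inside the data partner's infrastructure; returns only an aggregate."""
    return conn.execute(STUDY_QUERY, (concept_id,)).fetchone()[0]


# Placeholder patient-level data that never leaves each site.
sites = {
    "hospital_a": make_site([(1, 1001), (2, 1001), (2, 1001), (3, 5555)]),
    "hospital_b": make_site([(10, 1001), (11, 7777)]),
}

# The coordinator only ever sees one number per site.
site_counts = {name: run_locally(conn, 1001) for name, conn in sites.items()}
total_patients = sum(site_counts.values())

print(site_counts, total_patients)
```

The key design point is that the query travels to the data rather than the data travelling to the query, which is only practical when every site exposes the same schema and vocabulary.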

OMOP CDM for customization

Customization within the OMOP CDM is possible but not desirable. While introducing additional tables or columns doesn’t inherently break the model, it will impact cross-organizational compatibility. Queries involving these extra fields will not function across multiple organizations, particularly in network studies where data standardization is essential.

Furthermore, ATLAS, a tool for designing and executing analyses on the OMOP CDM data, may not leverage these additional fields. Users should be aware that any analyses or studies relying on these extra columns may encounter challenges in terms of interoperability and consistency, especially when collaborating with diverse datasets.
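
A small sketch shows why queries on custom fields do not travel well. Here one site has added a hypothetical smoking_status_local column to its PERSON table and another site has not; the column name and the helper functions are invented for illustration, and SQLite stands in for whatever database each site actually runs.

```python
import sqlite3


def make_person_table(extra_columns=""):
    """Create a reduced PERSON table, optionally with site-specific columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person ("
        "person_id INTEGER PRIMARY KEY, year_of_birth INTEGER"
        f"{extra_columns})"
    )
    return conn


# Site A customized its model; site B kept the standard columns only.
site_a = make_person_table(", smoking_status_local TEXT")
site_a.execute("INSERT INTO person VALUES (1, 1980, 'current')")
site_b = make_person_table()
site_b.execute("INSERT INTO person VALUES (1, 1975)")

# A network study query that depends on the custom column.
QUERY = "SELECT COUNT(*) FROM person WHERE smoking_status_local = 'current'"


def try_query(conn):
    """Run the study query and report whether this site could execute it."""
    try:
        return ("ok", conn.execute(QUERY).fetchone()[0])
    except sqlite3.OperationalError as exc:
        return ("failed", str(exc))


result_a = try_query(site_a)
result_b = try_query(site_b)

print(result_a)
print(result_b)
```

The query succeeds at the site that added the column and fails at the site that did not, so the study can no longer run unchanged across the network.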

Despite these considerations, there are legitimate use cases for customization within the OMOP CDM. Organizations may have specific data elements crucial to their research or clinical context that the standard CDM does not cover. In such cases, customization allows for the incorporation of these unique elements, supporting a more comprehensive representation of their healthcare data.

If the standard OMOP CDM lacks certain data elements that hold significant importance within specific research or clinical contexts, users can add a proposal for their inclusion in newer versions of the model[1]. This flexibility allows for a continuous improvement of the CDM, ensuring that it remains adaptable to emerging needs and standards within the ever-evolving healthcare landscape.

OMOP CDM tooling

OHDSI provides a robust arsenal of open-source tools designed to support various data analytics use cases on observational patient-level data. These tools share a common thread: they interact seamlessly with databases using the CDM. The standardization introduced by these tools simplifies analyses, enhances reproducibility, and ensures transparency in healthcare data analytics.

OMOP CDM vs. FHIR

The OMOP CDM and FHIR (Fast Healthcare Interoperability Resources) are two standards that are often compared to each other. Both are widely used to structure healthcare data, and both can represent similar medical concepts. FHIR is a standard for exchanging healthcare information between different systems in real time, while the OMOP CDM is a standard for representing observational medical data in a common format for large-scale research purposes. FHIR and the OMOP CDM are not necessarily competing standards; they serve different purposes and can be used together to achieve specific healthcare data targets. A more detailed comparison between these two standards can be found in our blog post ‘How FHIR and OMOP CDM are competing toward Healthcare Data Interoperability’.
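
One way the two standards can work together is to receive data as FHIR resources and load it into OMOP tables for research. The sketch below converts a minimal FHIR Patient resource into a PERSON row. It is an illustration under assumptions, not an official mapping: the function name is hypothetical, only a few PERSON columns are filled, and the gender concept IDs (8507, 8532) and the use of 0 for "no matching concept" reflect our understanding of the OMOP conventions and should be checked against the vocabulary in use.

```python
# Assumed OMOP gender concept IDs; verify against your vocabulary release.
GENDER_TO_CONCEPT = {"male": 8507, "female": 8532}


def fhir_patient_to_person(patient, person_id):
    """Map a FHIR Patient resource (as a dict) to a reduced OMOP PERSON row."""
    # FHIR allows partial birth dates (YYYY or YYYY-MM), so split by hand.
    parts = [int(p) for p in patient["birthDate"].split("-")]
    gender = patient.get("gender")
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_TO_CONCEPT.get(gender, 0),
        "year_of_birth": parts[0],
        "month_of_birth": parts[1] if len(parts) > 1 else None,
        "day_of_birth": parts[2] if len(parts) > 2 else None,
        # Keep the original values so the source record stays traceable.
        "person_source_value": patient["id"],
        "gender_source_value": gender,
    }


# Placeholder FHIR Patient resources.
full_date = fhir_patient_to_person(
    {"resourceType": "Patient", "id": "pat-1", "gender": "female",
     "birthDate": "1980-05-17"},
    person_id=1,
)
year_only = fhir_patient_to_person(
    {"resourceType": "Patient", "id": "pat-2", "gender": "unknown",
     "birthDate": "1975"},
    person_id=2,
)

print(full_date)
print(year_only)
```

FHIR carries the record between systems, while the OMOP row is the form in which it becomes analyzable alongside data from other sites.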

OMOP CDM in Belgium

In Belgium, the OHDSI community has seen significant growth and engagement, culminating in the establishment of the Belgian OHDSI Node in 2023. It is supported by various stakeholders, including data partners and certified enterprises, and aims to align Belgium’s health data ecosystem. The Node builds awareness, facilitates collaboration, and promotes best practices through regular meetings, workshops, and a robust online presence. Follow the OHDSI Belgium LinkedIn page to stay tuned for upcoming events.



Sources: 
* https://ohdsi.github.io/CommonDataModel/
** ‘Integrating real-world data from Brazil and Pakistan into the OMOP common data model and standardized health analytics framework to characterize COVID-19 in the Global South’, Journal of the American Medical Informatics Association, Volume 30(10234), October 2022
[1] https://www.ohdsi.org/2021-global-symposium-showcase-18/