Bell Eapen MD, PhD.

Bringing Digital health & Gen AI research to life!

OHDSI OMOP CDM ETL Tools in Python, .NET and Go

TL;DR Here are a few OHDSI OMOP CDM tools that may save you time if you are developing ETL tools!

Python: pyomop | pypi
.NET: omopcdmlib | NuGet
Golang: gocdm

OHDSI OMOP CDM Libraries

The COVID-19 pandemic brought to light many of the vulnerabilities in our data collection and analytics workflows. The lack of uniform data models limits the analytical capabilities of public health organizations, and many of them have to reinvent the wheel even for basic analysis. While many other sectors embrace big data and machine learning, many healthcare analysts are still stuck with basic data wrangling in Excel.

The OHDSI OMOP CDM (Common Data Model) for observational data is a popular initiative for bringing data into a common format that allows for collaborative research, large-scale analytics, and sharing of sophisticated tools and methodologies. Though the OHDSI OMOP CDM is primarily intended for patient-centred observational analysis, mostly in clinical research, it can be used with minor tweaks for public health and epidemiologic data as well. We have written about some of the technical details here.

The OHDSI OMOP CDM is simpler and more intuitive for clinical teams than emerging standards such as FHIR. Though the relational database approach and some of the software tools associated with the OHDSI OMOP CDM are a bit old-fashioned, the data model is clinically motivated. There is an ecosystem of software tools that cover many analytics tasks and can be used out of the box. The Observational Medical Outcomes Partnership (OMOP) CDM, now in its version 6.0, has simple but powerful vocabulary management. The OHDSI OMOP CDM is a good choice for healthcare organizations moving towards health data warehousing and OLAP.

One weakness of OHDSI is the lack of tools for efficient ETL from existing EHR and HIS. Converting existing EHR data to the CDM is still a complex task that requires technical expertise. With the additional “home time” during the COVID pandemic, I created three software libraries for ETL tool developers. These libraries in Python, .NET and Golang encapsulate the V6.0 CDM and help in writing and reading data from a variety of databases with the V6.0 tables. The libraries also support creating the CDM tables for new databases and loading the vocabulary files.
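
To give a feel for the kind of mapping an ETL script has to perform, here is a minimal sketch using only the Python standard library. It does not use pyomop or reproduce its API. The source record layout (`id`, `mrn`, `dob`, `sex`) is a hypothetical EHR export, and the PERSON table is trimmed to a handful of columns for illustration. The gender concept IDs (8507 for male, 8532 for female, 0 for no matching concept) are the standard OMOP values.

```python
import sqlite3
from datetime import date

# Standard OMOP gender concept IDs; 0 means "no matching concept".
GENDER_CONCEPTS = {"M": 8507, "F": 8532}

# Trimmed-down PERSON table: only a few CDM columns, for illustration.
PERSON_DDL = """
CREATE TABLE IF NOT EXISTS person (
    person_id           INTEGER PRIMARY KEY,
    gender_concept_id   INTEGER NOT NULL,
    year_of_birth       INTEGER NOT NULL,
    month_of_birth      INTEGER,
    day_of_birth        INTEGER,
    person_source_value TEXT,
    gender_source_value TEXT
)
"""


def to_person_row(source):
    """Map one hypothetical EHR patient record (a dict) to a PERSON row."""
    dob = date.fromisoformat(source["dob"])
    sex = (source.get("sex") or "").upper()
    return (
        source["id"],
        GENDER_CONCEPTS.get(sex, 0),  # unmapped values fall back to 0
        dob.year,
        dob.month,
        dob.day,
        source["mrn"],      # keep the original identifier for traceability
        source.get("sex"),  # keep the original code alongside the concept
    )


def load_persons(conn, records):
    """Create the table if needed, insert the rows, return the row count."""
    conn.execute(PERSON_DDL)
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?, ?, ?, ?, ?)",
        [to_person_row(r) for r in records],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

Calling `load_persons(sqlite3.connect(":memory:"), records)` with a list of such dicts fills an in-memory database. A real ETL would cover the full CDM table definitions and resolve concepts through the loaded vocabulary tables rather than a hard-coded dictionary.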

Python: pyomop | pypi
.NET: omopcdmlib | NuGet
Golang: gocdm

These libraries might save you some time if you are building scripts for ETL to CDM. They are all open-source and free to use in your tools. Do give me a shout if you find these libraries useful and please star the repositories on GitHub.

OSCAR in a BOX – Virtualized and fault-tolerant OSCAR EMR

TL;DR: OSCAR in a BOX is a fault-tolerant OSCAR instance that you can use out of the box and is virtually maintenance-free!

Image credit DarkoStojanovic @ Pixabay

OSCAR EMR is an open-source Electronic Medical Record (EMR) for Canadian family physicians. The official OSCAR repository is available here: https://bitbucket.org/oscaremr/

OSCAR is a Spring Java application deployed in a Tomcat container with a MySQL database backend. The OSCAR project, being relatively old and having few users outside Canada, has struggled to keep pace with developments in the electronic health records domain. However, OSCAR is still useful and popular among family physicians and some public health organizations as it is free and well supported.

OSCAR is known for its support for the billing workflow, data collection forms (eForms) and comprehensive patient charts (eCharts). Its limitations include a lack of scalability beyond a handful of users and limited support for data analytics. OSCAR, by design, is hard to virtualize as a Docker container. The availability of a Docker container is crucial for sustainable and fault-tolerant deployment on the cloud and on distributed systems such as Kubernetes.

Docker is the world’s leading software container platform, used mostly for DevOps. Docker is also useful for developers to set up a development environment in a few easy steps. I was one of the first few who worked on virtualizing OSCAR. Thanks to all those who forked (and hopefully used) this repository.

I have continued my work on the OSCAR Docker container and have been successful in creating a (reasonably stable) container. It is now available on Docker Hub. I am now working on a fault-tolerant deployment of OSCAR on customized hardware. I (and some of my friends who know about and encouraged this project) call it OSCAR in a BOX! It has multiple instances of OSCAR, with each instance capable of self-healing when a Java process hangs (fairly common for OSCAR). The database is replicated, and both the database and documents are incrementally backed up to an additional disk.
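
The self-healing idea can be illustrated with a small watchdog loop. This is a sketch under stated assumptions and not the mechanism OSCAR in a BOX actually uses: the `Watchdog` class, the `http_probe` helper, the failure threshold and the restart callback are all hypothetical names chosen for illustration.

```python
import time
import urllib.request


def http_probe(url, timeout=5.0):
    """Return True if the URL answers successfully within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except Exception:
        # Timeouts, refused connections and HTTP errors all count as failures.
        return False


class Watchdog:
    """Restart a service after several consecutive failed health checks."""

    def __init__(self, probe, restart, max_failures=3):
        self.probe = probe          # callable returning True when healthy
        self.restart = restart      # callable that restarts the instance
        self.max_failures = max_failures
        self.failures = 0
        self.restarts = 0

    def tick(self):
        """Run one health check; return True if a restart was triggered."""
        if self.probe():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.restart()
            self.restarts += 1
            self.failures = 0
            return True
        return False

    def run(self, interval=30.0):
        """Check forever, sleeping `interval` seconds between checks."""
        while True:
            self.tick()
            time.sleep(interval)
```

In practice the probe could be `lambda: http_probe(url)` pointed at the instance's login page, and the restart callback could shell out to the container runtime. Requiring several consecutive failures avoids restarting an instance that is merely slow for a moment. Kubernetes liveness probes work along similar lines.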

OSCAR in a BOX is ideal for family physicians who wish to adopt OSCAR but do not have the technical support for maintaining the system. OSCAR in a BOX is plug-and-play and is virtually maintenance-free. The virtualization workflow will also be useful for existing bigger user groups reeling under the sluggish pace of OSCAR. Please let me know if anybody is interested in collaborating.

BTW, did you check out Drishti?

Dockerized OSCAR EMR for developers

I have created a simple docker-compose script to set up OSCAR for developers. The script checks out the master branch from the OSCAR repository, compiles it with Maven, creates the Docker containers and deploys them.