Bell Eapen MD, PhD.

Bringing Digital health & Gen AI research to life!

Kedro for multimodal machine learning in healthcare 

Healthcare data is heterogenous with several types of data like reports, tabular data, and images. Combining multiple modalities of data into a single model can be challenging due to several reasons. One challenge is that the diverse types of data may have different structures, formats, and scales which can make it difficult to integrate them into a single model. Additionally, some modalities of data may be missing or incomplete, which can make it difficult to train a model effectively. Another challenge is that different modalities of data may require different types of pre-processing and feature extraction techniques, which can further complicate the integration process. Furthermore, the lack of large-scale, annotated datasets that have multiple modalities of data can also be a challenge. Despite these challenges, advances in deep learning, multi-task learning and transfer learning are making it possible to develop models that can effectively combine multiple modalities of data and achieve reliable performance. 

Kedro for multimodal machine learning in healthcare 
Pipelines Kedro for multimodal machine learning

Kedro for multimodal machine learning

Kedro is an open-source Python framework that helps data scientists and engineers organize their code, increase productivity and collaboration, and make it easier to deploy their models to production. It is built on top of popular libraries such as Pandas, TensorFlow and PySpark, and follows best practices from software engineering, such as modularity and code reusability. Kedro supplies a standardized structure for organizing code, handling data and configuration, and running experiments. It also includes built-in support for version control, logging, and testing, making it easy to implement reproducible and maintainable pipelines. Additionally, Kedro allows to easily deploy the pipeline on cloud platforms like AWS, GCP or Azure. This makes it a powerful tool for creating robust and scalable data science and data engineering pipelines. 

I have built a few kedro packages that can make multi-modal machine learning easy in healthcare. The packages supply prebuilt pipelines for preprocessing images, tabular and text data and build fusion models that can be trained on multi-modal data for easy deployment. The text preprocessing package currently supports BERT and CNN-text models. There is also a template that you can copy to build your own pipelines making use of the preprocessing pipelines that I have built. Any number and combination of data types are supported. Additionally, like any other kedro pipeline, these can be deployed on kubeflow and VertexAI. Do comment below if you find these tools useful in your research. 

Link to the repository below.

Dark Mode

kedro-multimodal (this link opens in a new window) by dermatologist (this link opens in a new window)

Template for multi-modal machine learning in healthcare using Kedro. Combine reports, tabular data and image using various fusion methods.

OHDSI OMOP CDM ETL Tools in Python, .Net and Go

TL;DR Here are few OHDSI OMOP CDM tools that may save you time if you are developing ETL tools!

Python: pyomop | pypi
.NET: omopcdmlib | NuGet
Golang: gocdm

OHDSI OMOP CDM Libraries

The COVID-19 pandemic brought to light many of the vulnerabilities in our data collection and analytics workflows. Lack of uniform data models limits the analytical capabilities of public health organizations and many of them have to re-invent the wheel even for basic analysis. As many other sectors embrace big data and machine learning, many healthcare analysts are still stuck with the basic data wrenching with Excel.

The OHDSI OMOP CDM (Common data model) for observational data is a popular initiative for bringing data into a common format that allows for collaborative research, large-scale analytics, and sharing of sophisticated tools and methodologies. Though OHDSI OMOP CDM is primarily for patient-centred observational analysis, mostly for clinical research, it can be used with minor tweaks for public health and epidemiologic data as well. We have written about some of the technical details here.

The OHDSI OMOP CDM is relatively simple and intuitive for clinical teams than emerging standards such as FHIR. Though the relational database approach and some of the software tools associated with OHDSI OMOP CDM are a bit old-fashioned, the data model is clinically motivated. There is an ecosystem of software tools for many of the analytics tools that can be used out of the box. The Observational Medical Outcomes Partnership (OMOP) CDM, now in its version 6.0, has simple but powerful vocabulary management. OHDSI OMOP CDM is a good choice for healthcare organizations moving towards health data warehousing and OLAP.

One weakness of OHDSI is the lack of tools for efficient ETL from existing EHR and HIS. Converting existing EHR data to the CDM is still a complex task that requires technical expertise. During the additional “home time” during the COVID pandemic, I have created three software libraries for ETL tool developers. These libraries in Python, .NET and Golang encapsulated the V6.0 CDM and helps in writing and reading data from a variety of databases with the V6.0 tables. The libraries also support creating the CDM tables for new databases and loading the vocabulary files.

Python: pyomop | pypi
.NET: omopcdmlib | NuGet
Golang: gocdm

These libraries might save you some time if you are building scripts for ETL to CDM. They are all open-source and free to use in your tools. Do give me a shout if you find these libraries useful and please star the repositories on GitHub.

Serverless on FHIR: Management guidelines for the semi-technical clinician!

Serverless is the new kid on the block with services such as AWS Lambda, Google Cloud Functions or Microsoft Azure Functions. Essentially it lets users deploy a function (Function As A Service or FaaS) on the cloud with very little effort. Requirements such as security, privacy, scaling, and availability are taken care of by the framework itself. As healthcare slowly yet steadily progress towards machine learning and AI, serverless is sure to make a significant impact on Health IT. Here I will explain serverless (and some related technologies) for the semi-technical clinicians and put forward some architectural best practices for using serverless in healthcare with FHIR as the data interchange format.

artificial intelligence
Serverless on FHIR

Let us say, your analyst creates a neural network model based on a few million patient records that can predict the risk for MI from BP, blood sugar, and exercise. Let us call this model r = f(bp, bs, e). The model is so good that you want to use it on a regular basis on your patients and better still, you want to share it with your colleagues. So you contact your IT team to make this happen.

This is what your IT guys currently do: First, they create a web application that can take bp, bs and e as inputs using a standard interface such as REST and return r. Next, they rent a virtual machine (VM) from a cloud provider (such as DigitalOcean). Then they convert this application into a container (docker) and deploy it in the VM. You now can use this as an application from your browser (chrome) or your EMR (such as OpenMRS or OSCAR) can directly access this function. You can share it with your colleagues and they can access it in their browsers and you are happy. The VM can support up to 3 users at a time.

In a couple of months, your algorithm becomes so popular that at any one time hundreds of users try to access it and your poor VM crashes most of the time or your users have to wait forever. So you call your IT guys again for a solution. They make 100 copies of your container, but your hospital is reluctant to give you the additional funding required.

Your smart resident notices that your application is being used only in the morning hours and in the night all the 100 containers are virtually sleeping. This is not a good use of the funding dollars. You contact your IT guys again, and they set up Kubernetes for orchestrating the containers according to usage. So, what is Serverless? Serverless is a framework that makes all these so easy that you may not even need your IT guys to do this. (Well, maybe that is an exaggeration)

My personal favourite serverless toolset (if you care) is Kubernetes + Knative + riff. I don’t try to explain what the last two are or how to use them. They are so new that they keep changing every day. In essence, your IT team can complete all the above tasks with few commands typed on the command line on the cloud provider of your choice. The application (function rather) can even scale to zero! (You don’t pay anything when nobody uses it and add more containers as users increase, scaling down in the night as in your case).

Best Practices

What are the best practices when you design such useful cloud-based ‘functions’ for healthcare that can be shared by multiple users and organizations? Well, here are my two cents!

First, you need a standard for data exchange. As JSON is the data format for most APIs, FHIR wins hands down here.

Next, APIs need a mechanism to expose their capabilities and properties to the world. For example, r = f(bp, bs, e) needs to tell everyone what it accepts (bp, bs, e) and what it returns (at the bare minimum). FHIR has a resource specifically for this that has been (not so creatively) named as an Endpoint. So, a function endpoint should return a FHIR Endpoint resource with information about itself if there is no payload.

What should the payload be? Payload should be a FHIR Bundle that has all the FHIR Resources that the function needs (bp, bs and e as FHIR Observations in your case). The bundle should also include a FHIR Subscription resource that points to the receiving system (maybe your EMR) for the response ( r ).

So, what next?

Take the phone and call your IT team. Tell them to take
Kubernetes + Knative + riff for a spin! I might do the same and if I do, I will share it here. And last but not the least, click on the blue buttons below! 🙂

10 points to consider before adopting open-source software in eHealth

Open-source software (hereafter OSS) is a phenomenon that has revolutionized the software industry. OSS is supported by voluntary programmers, who regularly dedicate their time and energy for the common good of all. The question that immediately comes to mind is how is it sustainable? Will they continue to contribute their social hours forever? Read the programmers perspective here. But does it make sense for healthcare organizations to accept their charity always? And, how do these organizations that adopt OSS improve the sustainability of these projects? These are some of the factors to consider:

artificial intelligence

Do you have enough funding?

OSS supporters are humanists with an emancipatory worldview. OSS is fundamentally not designed for an organization that can sustain a paid product. Firstly, there is the ethical problem of exploiting the OSS community. But more importantly, healthcare organizations with enough funding tend to spend more on the long-term maintenance and customization of OSS. Hence, OSS is generally designed to be an option when you have no other option.

Does the project have a regional focus?

OSS projects generally aim to solve global problems. So be careful when you hear Canadian OSS or Danish OSS. Regional OSS is mostly just cheaper local products masquerading as OSS for funding or for other reasons. They are unlikely to have the support of the global OSS community and is prone to burnout.

Is the OSS really OSS?

Any OSS worth its salt will be on GitHub. If you cannot find the project on GitHub, you should definitely ask why.

Is it really popular?

Some OSS that masquerade as OSS claim that they have a worldwide network of developers. The GitHub stars and forks would be a reasonable indicator of the popularity. Consider an OSS for your organization only if it has a thousand stars on the GitHub sky.

Are you looking for a specific workflow support?

Is your workflow generic enough to be supported by a global network of volunteers? Well, OHIP billing workflow may not be the right process to seek OSSM support.

Do you need customization?

If you need a lot of customizations to support your workflow, then OSS may not be the ideal solutions. OSS is ideally suited for situations where you can use it out of the box.

Do you have the time?

Remember that OSS is supported by voluntary programmers. So if you need a feature, you make a request and wait. If your organization is used to demanding, then OSS is not for you. OSS project is not owned by anyone, so their priorities may be different from yours.

Do you have internal expertise?

It is far easier to use an OSS if you have someone supporting the project in your organization. OSS community tends to respect one of their own more than an organization.

Supporting Open-Source Software?

It is crucial for organizations that depend on an OSS for your day to day operations to support the project. If the project becomes unsustainable, it affects the organization too. You can support the project in many ways such as donations, coding support and infrastructural support.

Do you know what OSS means and stands for?

Does the higher management know what OSS means and stands for? It is common in healthcare organizations to adopt OSS focusing on the free aspect.

“Free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer”.

Personally, I think the first point is the most important. OSS is designed and intended for use in areas where a paid option is not viable. In other scenarios in healthcare, you are likely to spend more for an open-source product than you spend for a regular product.

Finally, a quick mention of some noteworthy OSS in healthcare. OpenMRS is an open-source EMR started with the mission to improve healthcare delivery in resource-constrained environments. DHIS2 is web-based open-source public health information system with awesome visualization features including GIS, charts and pivot tables.

Intelligent Federated Model of Health Information Exchange Clinical Viewers IFCV

I have blogged about federated search clinical viewers before. Essentially such viewers query source health information systems in real time and provide the user with a consolidated view. There is no data repository, ensuring data integrity and data privacy. Though the system can be slow because of the real-time search, there are many successful regional implementations of this type.

There is an obvious disadvantage for federated model of health information exchange viewers. Intelligence cannot be built into such viewers. Since there is no server side data storage, there is no scope for server side processing. The data comes together only in the rendered view. Some  mixed systems that have a data-repository provide some crude warning flags that are not-real time. But clinically useful alerts are beyond the capabilities of federated systems. So how do we build intelligent federated clinical viewers?

Intelligent Federated Clinical Viewer #IFCV

As HTML5 specifications mature, intelligence can be built into clinical viewers by client side processing and storage. Though local database storage implemented by Safari is too insecure at this stage for this purpose, it might become viable in the future. Some useful functionality can be built with the Session storage functionality that has been expanded to store up to 4MB of data depending on browser implementations. This is much more than the conventional cookie storage. The Sessions object persist only till the window is active and is reasonably secure. I have listed some typical use cases below.

A typical pharmacy module of a clinical viewer brings together all the medications the patient is on from all source hospitals. A client side script could identify drug interactions by analysing the view. This is beyond the local EHR system as disparate systems do not talk to each other.

If all the active drugs can be stored locally during the pharmacy module view, contraindications such as steroids in hyperglcemia can be alerted when the lab module is accessed. These intelligent alerts could be clinically invaluable.

The utopian dream of cross-communicating EHRs are still a long way in the future. Regional federated clinical viewers are going to rule the roost for some time. So intelligent federated clinical viewers may be worth a consideration. I don’t know whether this is already in the pipeline for vendors such as Mulesoft, Medseek or Mirth. If not, a link here (and a mail) will be highly appreciated when they do.

Twitter hashtag for this topic: #IFCV

Bring Your Own EMR (#BYOE)

The e-Health lessons from healthcare.gov debacle is being debated widely. The idea of applying large scale IT initiatives in clinical domains has its own risks. As we relentlessly move towards a fully digital healthcare ecosystem, is it possible to hide some of its complexities from the clinicians?

Patient empowerment is the buzzword in eHealth now and clinicians are generally viewed with some skepticism. EHealth has learnt over the years (in the hard way) that the clinicians may be reluctant to relinquish their firm grip on clinical data. After all they generated the data and they are the custodians though they do not own it!

One of the approaches worth trying is to give the clinicians control and freedom over their end of things. In other words, separate enterprise EMR from the physician EMR. However the key to success in this scenario is interoperability.

Interoperability of EMRs are being actively explored by many research teams and organizations. However the emphasis is on better standardization. As interoperability emerges as a global paradigm, the standardization strategy that has failed for the last decade or so, still seems impractical?

Bring Your Own EMR

I am working on an interoperability solution that segregates physician EMR solutions logically and physically from Enterprise EMR solutions. I would like to call this Bring Your Own EMR (#BYOEpronounced as ‘bio’. The general framework is shown above. If you would like to join the #BYOE initiative or give feedback, shoot me an email!!

Creative Commons Licence
Bring Your Own EMR by Bellraj Eapen is licensed under a Creative Commons Attribution 4.0 International License.
Based on a work at http://nuchange.ca/?p=21.

Please cite this page as: Eapen BR. Bring Your Own EMR (#BYOE) . Available from: http://nuchange.ca/?p=21