Bell Eapen MD, PhD.


Locally hosted LLMs

TL;DR: From my personal experiments (on an 8-year-old i5 laptop with 16 GB RAM), locally hosted LLMs are extremely useful for many tasks that do not require much model-captured knowledge.

Ollama

Image credit: I, Luc Viatour, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

The era of relying solely on large language models (LLMs) for all-encompassing knowledge is evolving. As technology advances, the focus shifts towards more specialized and integrated systems. These systems combine the strengths of LLMs with real-time data access, domain-specific expertise, and interactive capabilities. This evolution aims to provide more accurate, context-aware, and up-to-date information, saving us time and addressing the limitations of static model knowledge.

I have started to realize that LLMs are more useful as language assistants that can summarize documents, write discharge summaries, and find relevant information from a patient’s medical record. The last one still has several unsolved limitations, and reliable diagnostic (or other) decision-making is still in the (distant?) future. In short, LLMs are becoming increasingly useful in healthcare as time-saving tools, but they are unlikely to replace us doctors as decision-makers soon. That raises an important question: do locally hosted LLMs (or even the smaller models) have a role to play? I believe they do!

Locally hosted large language models (LLMs) offer several key benefits. First, they provide enhanced data privacy and security, as all data remains on your local infrastructure, reducing the risk of breaches and unauthorized access. Second, they allow for greater customization and control over the hardware, software, and data used, enabling more tailored solutions. Additionally, locally hosted LLMs can operate offline, making them valuable in areas with unreliable internet access. Finally, they can reduce latency and potentially lower costs if you already have the necessary hardware. These advantages make locally hosted LLMs an attractive option for many users.  

Modern, user-friendly platforms such as Ollama are significantly lowering the technical barriers for individuals who want to self-host large language models (LLMs). The availability of a range of open-source models on Hugging Face lowers the barrier even further.

I have been doing some personal experiments with Ollama (on Docker), Microsoft’s phi3:mini (language model), and all-minilm (embedding model) on an 8-year-old i5 laptop with 16 GB RAM, and I must say I am pleasantly surprised by the results! I am using this setup as part of a project for democratizing Gen AI in healthcare, especially for resource-deprived areas (more about it here), and it does a decent job of vectorizing health records and answering questions based on RAG (retrieval-augmented generation). I also made a helpful personal writing assistant that is RAG-based. I am curious to know if anybody else in my network is doing similar experiments with locally hosted LLMs on personal hardware.
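
To give a concrete idea of the setup, here is a minimal RAG sketch in Python against a local Ollama server on its default port (11434). It assumes the phi3:mini and all-minilm models have already been pulled; the sample notes and the retrieval logic are purely illustrative and are not the actual project code.

# Minimal RAG sketch against a local Ollama server (default port 11434).
# Assumes `ollama pull phi3:mini` and `ollama pull all-minilm` have been run.
import requests
import numpy as np

OLLAMA = "http://localhost:11434"

def embed(text):
    # all-minilm produces a dense vector for the supplied text
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": "all-minilm", "prompt": text})
    return np.array(r.json()["embedding"])

notes = [
    "2023-04-11: Visual foot exam performed, no ulcers noted.",
    "2023-05-02: HbA1c 7.9 percent, metformin dose increased.",
]
note_vectors = [embed(n) for n in notes]

def answer(question):
    q = embed(question)
    # cosine similarity to pick the most relevant note as context
    sims = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
            for v in note_vectors]
    context = notes[int(np.argmax(sims))]
    prompt = (f"Answer using only this context:\n{context}\n\n"
              f"Question: {question}")
    r = requests.post(f"{OLLAMA}/api/generate",
                      json={"model": "phi3:mini", "prompt": prompt,
                            "stream": False})
    return r.json()["response"]

print(answer("Was a foot exam done recently?"))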

LLM-in-the-Loop CQL execution

TL;DR: I have added experimental support for using LLMs to interpret DocumentReference-based definitions in CQL of the type:

define HasVisualFootExamThisMonth:
  exists(
    [DocumentReference] D
      where D.status.value = 'current'
  )

 (Read Part I here)

Clinical Quality Language (CQL)

Image credit: Grufo, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

CQL is a domain-specific language that allows clinicians and researchers to express queries and retrieve data from electronic health records (EHRs) in a standardized and interoperable way. CQL can be used to define clinical quality measures, decision support rules, cohort identification criteria, and other clinical logic. Though CQL can be based on the HL7 FHIR standard, which defines a common data model and terminology for health information, the language itself is schema-independent. CQL aims to improve the quality and efficiency of healthcare by enabling the reuse and sharing of clinical knowledge across different systems and platforms. Here is an example.

CQL is a high-level language that needs to be translated into a lower-level language that can be executed. To facilitate this process, CQL supports a mechanism for transforming CQL expressions into an intermediate representation called the Expression Logical Model (ELM). ELM is a platform-independent XML/JSON format that preserves the semantics and structure of CQL expressions but removes the syntactic variations and ambiguities of the human-readable language. ELM can then be converted into executable formats such as SQL or FHIRPath, depending on the target data source and system. This way, CQL-to-ELM translation enables the portability and interoperability of clinical queries across different platforms and environments. Here is an example.

CQL can leverage FHIRPath to express queries over FHIR resources in a consistent and interoperable way. However, FHIRPath alone is not sufficient to handle the semantic variations and complexity of clinical data. For example, different systems may use different codes or terminologies to represent the same concept, such as diabetes or foot exam. To address this issue, CQL supports the use of terminology services, which are external services that provide mappings and translations between different code systems and value sets. CQL can invoke terminology services to resolve the codes and values used in the queries and align them with the ones used in the data source. This way, CQL can execute queries over FHIR data using both syntactic and semantic interoperability. 

One limitation of FHIRPath-based CQL execution is that it cannot evaluate assertions contained in the unstructured documents that a FHIR DocumentReference resource points to. I have forked the Node.js CQL execution engine to add a hook that can call an LLM when it encounters a DocumentReference; the fork is here:

https://github.com/dermatologist/cql-execution 

I will post a link to an end-to-end application that uses this hook. Also, I have added experimental support for other FHIR servers to the VSAC-enabled code service engine. Do comment below if you find this useful and use it in your project!
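
To illustrate the idea only (the actual hook lives in the forked Node.js engine, and its API may differ), here is a hypothetical Python sketch of the kind of check such a hook performs: a locally hosted model is asked whether the text behind a DocumentReference asserts a visual foot exam, and the answer is returned as a boolean that the CQL expression can use.

# Conceptual sketch only; the model name and the yes/no protocol are assumptions.
import requests

OLLAMA = "http://localhost:11434"

def asserts_visual_foot_exam(note_text):
    prompt = ("Does the following clinical note state that a visual foot exam "
              "was performed? Answer strictly 'yes' or 'no'.\n\n" + note_text)
    r = requests.post(f"{OLLAMA}/api/generate",
                      json={"model": "phi3:mini", "prompt": prompt,
                            "stream": False})
    return r.json()["response"].strip().lower().startswith("yes")

print(asserts_visual_foot_exam("Visual foot exam performed; no ulcers noted."))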

Chatting with a FHIR endpoint

FHIR is an emerging standard for exchanging healthcare information electronically. Searching for resources is fundamental to the mechanics of FHIR. Search operations traverse an existing set of resources, filtering by the parameters supplied to the search operation. Health information systems convert clinicians’ interactions into the search string.

With the growing importance (and intelligence) of chatbots, it is possible to converse with the physician and retrieve what they want by converting their needs into a FHIR search string. This can make clinicians’ lives easier, as most of them do not like entering complex search terms into text boxes.

Rasa is an open-source machine learning framework for building AI assistants and chatbots. You can create custom actions with Rasa to support various use-cases. The Microsoft Bot Framework SDK allows you to create and develop bots for the Azure Bot Service. Starting with version 4.7 of the Bot Framework SDK, you can extend a bot using another bot (a skill). A skill can be consumed by various other bots, facilitating reuse.
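
As a rough illustration of how a Rasa custom action might assemble a FHIR search string from extracted slots, here is a hypothetical Python sketch; the slot names, search parameters, and resulting query are assumptions for illustration and are not taken from either project mentioned below.

# Hypothetical Rasa custom action (actions.py) mapping slots to a FHIR search.
from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionFhirSearch(Action):
    def name(self):
        return "action_fhir_search"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker, domain):
        # Slots filled by the NLU pipeline from an utterance such as
        # "show me Jane Doe's recent lab results"
        resource = tracker.get_slot("fhir_resource") or "Observation"
        patient = tracker.get_slot("patient_name")
        params = {"category": "laboratory", "_sort": "-date", "_count": 5}
        if patient:
            params["subject:Patient.name"] = patient
        query = "&".join(f"{k}={v}" for k, v in params.items())
        dispatcher.utter_message(text=f"FHIR search: /{resource}?{query}")
        return []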

During the COVID break, I created a couple of experimental projects to make chatting with electronic health records possible. One is a Rasa project for mapping conversations to FHIR search.

https://github.com/dermatologist/rasaonfhir

The other is a FHIR search skill for Microsoft Bot SDK that more or less does the same thing.

These are experimental for now and pull requests are welcome!

NLP for Clinical Notes – Tools and Techniques

Clinicians add clinical notes to the EMR at each visit. These notes are unstructured in most cases; some are created by dictation software or by medical scribes. Family physicians and family practice-centric EMRs such as OSCAR EMR rely heavily on unstructured clinical notes, which can benefit from NLP (natural language processing) tools and techniques.

NLP for Clinical Notes

Clinical notes, because of their unstructured nature, are difficult to analyze for statistical insights. Besides, the notes may require further processing for billing and for generating problem charts. Such analysis is becoming increasingly important for quality assessments as well.

NLP can be useful for the automated analysis of clinical notes. Here I have listed some of the open-source tools (some maintained by me) for such analysis.

Apache cTAKES for NLP

Apache cTAKES (clinical Text Analysis and Knowledge Extraction System) is one of the first open-source NLP systems for extracting clinical information from unstructured electronic health record text. Though it is relatively slow, it is still widely used. I have packaged it as a Quarkus application, which is fast. Quarkus (Supersonic Subatomic Java) is designed primarily for Docker containers, and Quarkus-based containers are easy to deploy and scale on platforms such as Kubernetes.

SpaCy and related tools for NLP

SpaCy is an open-source Python library for NLP. It features NER, POS tagging, dependency parsing, and word vectors, and is widely used. But spaCy is not designed for clinical workflows and may not be directly usable. ScispaCy provides spaCy pipelines and models for scientific/biomedical documents, trained on biomedical data. MedaCy is a healthcare-specific NLP framework built on spaCy to support the fast prototyping, training, and application of medical NLP models. One of the advantages of MedaCy is that it is fast and lightweight.
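
For example, a minimal scispaCy sketch might look like the following; it assumes the en_core_sci_sm model has been installed separately (scispaCy models are distributed outside PyPI), and the sample sentence is made up.

# Biomedical entity extraction with scispaCy's small model (en_core_sci_sm).
import spacy

nlp = spacy.load("en_core_sci_sm")
doc = nlp("Patient with type 2 diabetes mellitus; visual foot exam performed, "
          "no ulcers noted. Metformin 500 mg twice daily.")

for ent in doc.ents:
    print(ent.text, ent.label_)  # entity spans detected by the model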

UMLS

The Unified Medical Language System (UMLS) is a set of files and software that brings together biomedical vocabularies for health information systems. UMLS provides a set of RESTful APIs for licensed users. I have created a JavaScript wrapper for the UMLS APIs that is easy to call from JavaScript programs. It is available from the npm package repository. See the update on UmlsBERT below.

MedCAT

The Medical Concept Annotation Tool (MedCAT) is a relatively new tool for extracting and linking terms from vocabularies such as UMLS and SNOMED CT in free text from EMRs. The paper describing MedCAT is here. MedCAT models can be further refined by training on a domain-specific corpus of text. MedCAT is fast and very useful.
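
A minimal MedCAT sketch, assuming the v1.x model-pack API and a model pack you have obtained yourself (the path below is a placeholder; the public packs require a UMLS/SNOMED licence):

# Concept extraction and linking with MedCAT (model pack path is a placeholder).
from medcat.cat import CAT

cat = CAT.load_model_pack("path/to/medcat_model_pack.zip")
text = "Patient has a history of diabetes mellitus and peripheral neuropathy."
entities = cat.get_entities(text)

for ent in entities["entities"].values():
    print(ent["pretty_name"], ent["cui"])  # linked concept name and UMLS CUI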

Word Embeddings for NLP

A word embedding represents words as numeric vectors such that words with similar meanings have similar vectors. It is one of the most popular deep learning methods for NLP problems. Word2Vec is a method for constructing embeddings, and a word2vec model trained on the entire Wikipedia corpus is available for use. This paper describes the creation of a clinical concept embedding based on a large corpus of clinical documents. I have created a gensim wrapper for this model that can be used for concept similarity search in Python.
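
A concept-similarity lookup with gensim could look roughly like this; the file name is a placeholder, and whether you look up surface words or concept identifiers (CUIs) depends on how the embedding was built.

# Nearest-neighbour lookup in a word2vec-format clinical embedding with gensim.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("clinical_concepts.bin", binary=True)

for token, score in vectors.most_similar("diabetes", topn=5):
    print(token, round(score, 3))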

BERT and related

Bidirectional Encoder Representations from Transformers (BERT) is a technique for NLP pre-training developed by Google. Here is the highly cited official paper. BERT has largely replaced static word embeddings as the most successful NLP technique in most domains, including healthcare. Some of the refined BERT models used in healthcare are BioBERT and ClinicalBERT.
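
As an illustration, loading a clinical BERT model with the Hugging Face transformers library and producing a sentence vector might look like this (Bio_ClinicalBERT is one publicly available ClinicalBERT checkpoint; mean pooling is just one simple choice):

# Sentence embedding from a clinical BERT model via Hugging Face transformers.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "emilyalsentzer/Bio_ClinicalBERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

inputs = tokenizer("Visual foot exam performed, no ulcers noted.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

sentence_vector = outputs.last_hidden_state.mean(dim=1)  # shape: (1, 768)
print(sentence_vector.shape)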

It is vital to deploy these models in a scalable and maintainable manner so that they are available for use within EMR systems. We are working on such a framework, called ‘Serverless on FHIR’. Give me a shout if you want to know more.

Update (2022): Tools for building multi-modal models.

UPDATE: May 30, 2021: The library (ckblib) is now available under MPL 2.0 license (see below). Feel free to use it in your research.


ckblib by dermatologist

Tools to create a clinical knowledge graph from biomedical literature. Includes wrappers for NCBI Eutils, cTakes annotator and Neo4J

Update (Dec 2020):

Researchers from the University of Waterloo have introduced the novel concept of UmlsBERT. Current clinical embeddings such as BioBERT, described above, are generic models trained further on clinical corpora using transfer learning. Most biomedical ontologies, such as UMLS, define hierarchies among the concepts they contain. UmlsBERT makes use of this hierarchical group information at the pre-training stage to augment the clinical concept embeddings. Table 3 in the paper compares the results with other embeddings, and it is quite impressive. The GitHub repo is here.
Way to go George Michalopoulos and team!

Update (Mar 2021):

Create a chatbot to talk to a FHIR endpoint using conversational AI!

Update (May 2022):

ICDBigBird: A Contextual Embedding Model for ICD Code Classification: https://arxiv.org/pdf/2204.10408.pdf

Kickstart NLP with UMLS

The UMLS, or Unified Medical Language System, is a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems.

Natural Language Processing (NLP) on the vast amount of data captured by electronic medical records (EMR) is gaining popularity. The recent advances in machine learning (ML) algorithms and the democratization of high-performance computing (HPC) have reduced the technical challenges in NLP. However, the real challenge is not the technology or the infrastructure, but the lack of interoperability — in this case, the inconsistent use of terminology systems.

UMLS for NLP

NLP tasks start with recognizing medical terms in a corpus of text and converting them into a standard terminology space such as SNOMED CT or ICD. This requires a terminology mapping service that can do the mapping in an easy and consistent manner. The Unified Medical Language System (UMLS) terminology server is the most popular resource for integrating and distributing key terminology, classification, and coding standards. The consistent use of UMLS resources leads to effective and interoperable biomedical information systems and services, including EMRs.

To make things easier, UMLS provides both REST-based and SOAP-based services that can be integrated into software applications. A high-level library that encapsulates these services and makes the REST calls easy for the user is required for the efficient use of these resources. Umlsjs is one such high-level library for the UMLS REST web services in JavaScript. It is free, open source, and available on npm, making it easy to integrate into any JavaScript (browser) or Node.js application.
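
For context, the kind of raw REST call that a wrapper like umlsjs encapsulates looks roughly like this (shown in Python for illustration rather than JavaScript); it assumes a free UTS account and API key, and the exact response fields may vary.

# Raw UMLS search call (the wrapper hides this plumbing); needs a UTS API key.
import requests

UMLS_SEARCH = "https://uts-ws.nlm.nih.gov/rest/search/current"

def search_umls(term, api_key):
    r = requests.get(UMLS_SEARCH, params={"string": term, "apiKey": api_key})
    r.raise_for_status()
    return r.json()["result"]["results"]  # each hit carries a CUI (ui) and name

for hit in search_umls("diabetic foot", "YOUR_UTS_API_KEY")[:5]:
    print(hit["ui"], hit["name"])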

The umlsjs package is available on GitHub and npm. It is still a work in progress, and any coding/documentation contributions are welcome. Please read the CONTRIBUTING.md file in the repository for instructions. If you use it and find any issues, please report them on GitHub.

OSCAR in a BOX – Virtualized and fault-tolerant OSCAR EMR

TL;DR: OSCAR in a BOX is a fault-tolerant OSCAR instance that you can use out of the box and is virtually maintenance-free!

Image credit DarkoStojanovic @ Pixabay

OSCAR EMR is an open-source Electronic Medical Record (EMR) for Canadian family physicians. The official OSCAR repository is available here: https://bitbucket.org/oscaremr/

OSCAR is a Spring Java application deployed in a Tomcat container with a MySQL database backend. The OSCAR project, being relatively old and with few users outside Canada, has struggled to keep pace with developments in the electronic health records domain. However, OSCAR is still useful and popular among family physicians and some public health organizations, as it is free and well supported.

OSCAR is known for its support for the billing workflow, data collection forms (eForms), and comprehensive patient charts (eCharts). Some of its limitations include a lack of scalability beyond a handful of users and limited support for data analytics. OSCAR, by design, is hard to virtualize as a Docker container. The availability of a Docker container is crucial for sustainable and fault-tolerant deployment on the cloud and on distributed systems such as Kubernetes.

Docker is the world’s leading software container platform, used mostly for DevOps. It is also useful for developers to set up a development environment in a few easy steps. I was one of the first to work on virtualizing OSCAR. Thanks to all those who forked (and hopefully used) this repository.

I have continued my work on the OSCAR Docker container and have succeeded in creating a (reasonably stable) container. It is now available on Docker Hub. I am now working on a fault-tolerant deployment of OSCAR on customized hardware. I (and some of my friends who know about and encouraged this project) call it OSCAR in a BOX! It runs multiple instances of OSCAR, each capable of self-healing when a Java process hangs (fairly common for OSCAR). The database is replicated, and both the database and documents are incrementally backed up to an additional disk.

OSCAR in a BOX is ideal for family physicians who wish to adopt OSCAR but do not have the technical support to maintain the system. It is plug-and-play and virtually maintenance-free. The virtualization workflow will also be useful for existing, bigger user groups reeling under OSCAR’s sluggish pace. Please let me know if you are interested in collaborating.

BTW, did you check out Drishti?

DHIS2 and longitudinal health records: Connecting systems to get the best of both worlds!

DHIS2 is a health information system that revolutionized the way healthcare data is managed. It is open source and is a byproduct of a multinational action research project initiated from Oslo and first implemented in India [1]. Currently, DHIS2 is the world’s largest health management information system (HMIS) platform, in use by 67 low- and middle-income countries. 2.28 billion people (30% of the world’s population) live in countries where DHIS2 is used.

DHIS2 is a public health information system (PHIS) where the unit of management is a group or a geographical region, not an individual. It is unfortunate that this distinction between a typical EMR (a longitudinal health record) and a public health information system that manages population health is not clear to many policymakers.

The growing popularity of machine learning and artificial intelligence applications makes public health (PH) agencies rethink their data management strategies. A longitudinal health record is essential for most ML and AI applications that create complex predictive models. PH agencies are gradually realizing the importance of data warehouses in managing changing healthcare data applications and workflows. Hence, the next generation of public health information systems should be able to efficiently handle longitudinal as well as group/cross-sectional data.

The easiest strategy to adopt may be to make existing PHIS systems talk to each other by leveraging recent advances in health information exchange. HL7 may not be ideal for this purpose as it relies on a patient-centric model. FHIR may be more capable of dealing with this, but the underlying REST interface may not support real-time data exchange.

RabbitMQ and Apache Kafka are industry-standard open-source messaging frameworks that can be leveraged for real-time communication between disparate systems such as DHIS2 and OSCAR EMR / OpenMRS. DHIS2 supports both out of the box, and I have modified the DHIS2 Docker container to optimize it for message exchange. A sample Java client is also available from my fork. The repo is here.
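
As a rough illustration of the messaging pattern (not the repo’s actual client, which is in Java), here is a Python sketch that publishes an aggregate event from an EMR-side process to a RabbitMQ queue that a DHIS2 integration consumer could read; the queue name and payload shape are assumptions.

# Publish an aggregate event to RabbitMQ with pika; queue and payload are illustrative.
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="dhis2.events", durable=True)

event = {
    "orgUnit": "clinic-001",
    "period": "202401",
    "dataElement": "visits_diabetes",
    "value": 12,
}
channel.basic_publish(
    exchange="",
    routing_key="dhis2.events",
    body=json.dumps(event),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()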

If you have ideas/want to work on creating DHIS2 connectors for EMRs like OSCAR EMR or OpenMRS, please comment below. OpenMRS has an existing module that can pull certain reports from DHIS2.

References

1.
Braa J, Monteiro E, Sahay S. Networks of Action: Sustainable Health Information Systems across Developing Countries. MIS Quarterly. 2004;28(3):337. doi:10.2307/25148643

Dockerized OSCAR EMR for developers

I have created a simple docker-compose script to set up OSCAR for developers. The script checks out the master branch from the OSCAR repository, compiles it with Maven, creates Docker containers, and deploys them.