Bell Eapen MD, PhD.

Bringing Digital health & Gen AI research to life!

Medprompt: How to architect LLM solutions for healthcare.

Leveraging the power of advanced machine learning, particularly large language models (LLMs), has increasingly become a transformative element in healthcare and medicine. The applications of LLMs in healthcare are multifaceted, showing immense potential to improve patient outcomes, streamline administrative tasks, and foster medical research and innovation.

Medprompt: How to architect LLM solutions for healthcare.
Image Credit: David S. Soriano, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

Architecting LLM solutions in the healthcare domain is challenging because of the intricacies associated with healthcare data and the complex nature of healthcare applications. In this post, I will give some recommendations based on the widely popular LangChain library, giving some examples.

The first step is to define the overarching problem you are trying to solve. It can be broad as in getting the right information about a patient to the doctor. Next, subdivide the problem into subproblems that can be tackled separately. For example in the above case, we need to find the patient’s health record, convert it into an easily searchable form (embedding), find areas of interest in the record and generate a summary or an answer to the specific question. Next, find solutions for each problem that may or may not require an LLM. Finally, design the orchestrator that can stitch everything together.

LangChain has some useful abstractions that will help in the last two steps. If the solution does not involve an LLM and mostly involves data retrieval and transformations, use the tool abstraction. If you need one or more LLM calls to achieve it, use the chain abstraction. Agents are the orchestrators that can stitch everything together. It is important to carefully craft the prompts for the chains and agents. Rigorous testing is vital. This includes technical performance and validation of the model’s recommendations by healthcare professionals to ensure they are accurate and clinically relevant.

Named Entity Recognition using LLMs: a cTakes alternative?

TLDR: The targeted distillation method described may be useful for creating an LLM-based cTakes alternative for Named Entity Recognition. However, the recipe is not available yet. 

Image credit: Wikimedia

Named Entity Recognition is essential in clinical documents because it enhances patient safety, supports efficient healthcare workflows, aids in research and analytics, and ensures compliance with regulations. It enables healthcare organizations to harness the valuable information contained in clinical documents for improved patient care and outcomes. 

Though Large Language Models (LLMs) can perform Named Entity Recognition (NER), the capability can be improved by fine-tuning, where you provide the model with input text that contains named entities and their associated labels. The model learns to recognize these entities and classify them into predefined categories. However, as described before fine-tuning Large Language Models (LLMs) is challenging due to the need for substantial, high-quality labelled data, the risk of overfitting on limited datasets, complex hyperparameter tuning, the requirement for computational resources, domain adaptation difficulties, ethical considerations, the interpretability of results, and the necessity of defining appropriate evaluation metrics. 

Targeted distillation of Large Language Models (LLMs) is a process where a smaller model is trained to mimic the behaviour of a larger, pre-trained LLM but only for specific tasks or domains. It distills the essential knowledge of the LLM, making it more efficient and suitable for particular applications, reducing computational demands.  

This paper described targeted distillation with mission-focused instruction tuning to train student models that can excel in a broad application class. The authors present a general recipe for such targeted distillation from LLMs and demonstrate that for open-domain NER. Their recipe may be useful for creating efficient distilled models that can perform NER on clinical documents, a potential alternative to cTakes. Though the authors have open-sourced their generic UniversalNER model, they haven’t released the distillation recipe code yet. 

REF: Zhou, W., Zhang, S., Gu, Y., Chen, M., & Poon, H. (2023). UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition. ArXiv. /abs/2308.03279