Bell Eapen MD, PhD.

Bringing Digital health & Gen AI research to life!

Medprompt: How to architect LLM solutions for healthcare.

Leveraging the power of advanced machine learning, particularly large language models (LLMs), has increasingly become a transformative element in healthcare and medicine. The applications of LLMs in healthcare are multifaceted, showing immense potential to improve patient outcomes, streamline administrative tasks, and foster medical research and innovation.

Medprompt: How to architect LLM solutions for healthcare.
Image Credit: David S. Soriano, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

Architecting LLM solutions in the healthcare domain is challenging because of the intricacies associated with healthcare data and the complex nature of healthcare applications. In this post, I will give some recommendations based on the widely popular LangChain library, giving some examples.

The first step is to define the overarching problem you are trying to solve. It can be broad as in getting the right information about a patient to the doctor. Next, subdivide the problem into subproblems that can be tackled separately. For example in the above case, we need to find the patient’s health record, convert it into an easily searchable form (embedding), find areas of interest in the record and generate a summary or an answer to the specific question. Next, find solutions for each problem that may or may not require an LLM. Finally, design the orchestrator that can stitch everything together.

LangChain has some useful abstractions that will help in the last two steps. If the solution does not involve an LLM and mostly involves data retrieval and transformations, use the tool abstraction. If you need one or more LLM calls to achieve it, use the chain abstraction. Agents are the orchestrators that can stitch everything together. It is important to carefully craft the prompts for the chains and agents. Rigorous testing is vital. This includes technical performance and validation of the model’s recommendations by healthcare professionals to ensure they are accurate and clinically relevant.

MEDPrompt coming soon!

Deploy a fastai image classifier using OpenFaaS for serverless on DigitalOcean in 5 easy steps!

Fastai is a python library that simplifies training neural nets using modern best practices. See the fastai website and view the free online course to get started. Fastai lets you develop and improve various NN models with little effort. Some of the deployment strategies are mentioned in their course, but most are not production-ready.

OpenFaaS® (Functions as a Service) is a framework for building Serverless functions easily with Docker that can be deployed on multiple infrastructures including Docker swarm and Kubernetes with little boilerplate code. Serverless is a cloud-computing model in which the cloud provider runs the server, and dynamically manages the allocation of machine resources and can scale to zero if a service is not being used. It is interesting to note that OpenFaaS has the same requirements as the new Google Cloud Run and is interoperable. Read more about OpenFaaS (and install the CLI) from their website.

DigitalOcean: I host all my websites on DigitalOcean (DO) which offers good (in my opinion) cloud services at a low cost. They have data centres in Canada and India. DO supports Kubernetes and Docker Swarm, but they offer a One-Click install of OpenFaaS for as little as $5 per month (You can remove the droplet after the experiment if you like, and you will only be charged for the time you use it.) If you are new to DO, please sign up and setup OpenFaaS as shown here:

In fastai class, Jeremy creates a dog breed classifier.

As STEP 1, export the model to .pkl as below

learn.export()

This creates the export.pkl file that we will be using later. To deploy we need a base container to run the prediction workflow. I have created one with Python3 along with fastai core and vision dependencies (to keep the size small). It is available here: https://hub.docker.com/r/beapen/fastai-vision But you don’t have to directly use this container. My OpenFaaS template will make this easy for you.

STEP 2: Using the OpenFaaS CLI (How to Install) pull my template as below:

mkdir dog-classifier
cd dog-classifier
faas-cli template pull https://github.com/dermatologist/python3-ml --prefix your-docker-uname
faas-cli new --lang fastai-vision dog-classifier

STEP 3: Copy export.pkl to the model folder

STEP 4: Add http://digitaloceanIP:8080 to dog-classifier.yml

provider:
  name: openfaas
  gateway: http://digitaloceanIP:8080

and finally in STEP 5:

faas-cli up -f dog-classifier.yml

That’s it! Your predictor is up and running! Access it at http://digitaloceanIP:8080/function/dog-classifier

The template has a builtin image uploader interface! If you get stuck at any stage, give me a shout below. More to follow on using OpenFaaS for deploying machine learning workflows!

Running Tomcat as www-data

For last couple of days I have been working on a web based application for skin color measurement as I blogged here. The core of the application is implemented in php as it is much easier to handle. (May be I am more comfortable with it). But the GD image library for php has limited functionality. So I had to rely on Java for border detection and segmentation. So I implemented part of the functionality as a servlet running on tomcat6.

Today's latte, Apache Tomcat.
Today’s latte, Apache Tomcat. (Photo credit: yukop)

Both php and servlet had to read and write same files. But standard configuration does not allow this as php creates files as www-data (apache user) and tomcat creates files as tomcat6 user. I searched the internet for solutions. One suggestion was to create an appropriate umask for both which I could never completely understand. Finally I realised that the easiest solution is to make tomcat run as user www-data so that all created files will have same ownership.

I found this webpage with detailed instructions on how to do this. I would like to add two more steps:
1. Edit /etc/init.d/tomcat6 also to add the user ID and group ID
2. chown var/cache/tomcat6 directory also.

Nina Jablonski - The Evolution of Human Skin Color
Nina Jablonski – The Evolution of Human Skin Color (Photo credit: wagnerfreeinstitute)

BTW the application is hosted on my laptop. If you want to check it out click on the banner below if you see it. You will see it only when my laptop is on.

Negative N to Unknown U

The identification of disease specific genes is pivotal in clinical informatics. This paper describes an improved algorithm for machine learning in which the negative N is classified more appropriately as Unknown U.

English: Weka Data Mining Open Software in Java
English: Weka Data Mining Open Software in Java (Photo credit: Wikipedia)

Peng Yang, Xiao-Li Li, Jian-Ping Mei, Chee-Keong Kwoh, and See-Kiong Ng. Positive-Unlabeled Learning for Disease Gene Identification
Bioinformatics first published online August 24, 2012 doi:10.1093/bioinformatics/bts504

SVMs are an important tool in bioinformaticians armamentarium. Weka is a collection of machine learning algorithms for data mining tasks.