Bell Eapen MD, PhD.

Bringing Digital health & Gen AI research to life!

When GenAI Ideas translate to practice with DHTI

If you’ve ever worked in healthcare, you know the feeling: you have a brilliant idea—something that would save time, reduce frustration, or make patient care smoother—and then… nothing happens. Not because the idea is bad, but because turning it into real software feels like trying to build a spaceship out of sticky notes.

That’s the gap vibe coding is trying to close. And with tools like DHTI, that gap is finally starting to shrink: building software begins to resemble a conversation, like the one below!

npx dhti-cli copilot --model gpt-5.3-codex --skill elixir-generator --prompt "Generate an elixir glycemic_advisor that summarizes diabetic patients' latest lab results and medications"

npx dhti-cli copilot --model gpt-5.3-codex --skill start-dhti --prompt "Start the glycemic_advisor elixir and display in CDS-Hooks sandbox"

Let’s walk through what vibe coding is, why it matters, and how DHTI makes it surprisingly doable—even if you’ve never written a line of code in your life.


So… what exactly is vibe coding?

Think of vibe coding as building software the same way you’d brainstorm with a colleague over coffee. You don’t start with code. You start with the vibe of what you want.

Instead of saying:

“I need a function that queries a FHIR endpoint and transforms the JSON.”

You say:

“I want a little helper that pulls a patient’s meds and tells me if anything looks risky.”

And the system starts shaping that idea into something real.

Vibe coding is:

  • Talking to the computer like you’d talk to a person
  • Iterating as you go
  • Letting the AI handle the technical scaffolding
  • Staying focused on the idea, not the syntax

It’s not magic. It’s just finally letting people who understand healthcare shape the tools they need—without having to become software engineers first.


Why healthcare needs this more than anyone

Healthcare is full of smart, creative people. But it’s also full of complexity: clinical workflows, privacy rules, specialized language, and data standards that feel like they were designed by a committee of cryptographers.

Even when clinicians know exactly what they want, translating that into something a developer can build is… hard. And developers, for their part, often spend more time deciphering clinical nuance than writing code.

Vibe coding cuts out the translation layer.
It lets clinicians express ideas in their own words.
It lets AI turn those ideas into working prototypes.
And it lets developers focus on polishing and deploying—not guessing.

But vibe coding alone isn’t enough. Healthcare needs structure. It needs guardrails. It needs standards.

That’s where DHTI comes in.


Meet DHTI: the “let’s actually build this” engine

DHTI is an open‑source reference architecture built specifically for healthcare GenAI applications. If vibe coding is the conversation, DHTI is the workshop where the ideas get shaped into something sturdy.

DHTI gives you a ready‑made foundation for building GenAI healthcare tools. It understands healthcare standards, provides synthetic data, supports agentic workflows, and helps you turn natural‑language ideas into real, testable applications.

In plain English:
DHTI makes vibe‑coded ideas actually work in healthcare environments.

Here’s how.


DHTI speaks healthcare, so you don’t have to

Most AI tools can generate code, but they don’t understand the rules of healthcare. They don’t know what FHIR is supposed to look like. They don’t know how CDS‑Hooks cards plug into clinical workflows. They don’t know what’s safe, what’s allowed, or what’s interoperable.

DHTI does.

So when someone says:

“Can you build something that checks whether a patient with diabetes is overdue for an A1c?”

DHTI can assemble the pieces:

  • A FHIR query
  • A little reasoning chain
  • A card that could show up in the EHR
  • A test environment to try it out

All without the user needing to know any of those words.
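
To make that concrete, here is a minimal Python sketch of the kind of piece DHTI could assemble: a check of the latest A1c observation that produces a CDS‑Hooks‑style card. The function name, the default lookback window, and the card text are illustrative, not DHTI's actual generated code.

```python
from datetime import datetime, timedelta, timezone

A1C_LOINC = "4548-4"  # LOINC code for Hemoglobin A1c

def a1c_overdue_card(observations, max_age_days=180):
    """Return a CDS-Hooks-style card dict if the latest A1c is older than max_age_days.

    `observations` is a list of FHIR Observation resources (as dicts) already
    scoped to one patient; in a generated elixir these would come from a FHIR
    query such as GET /Observation?patient=<id>&code=http://loinc.org|4548-4.
    """
    dates = [
        datetime.fromisoformat(o["effectiveDateTime"])
        for o in observations
        if (o.get("code", {}).get("coding") or [{}])[0].get("code") == A1C_LOINC
        and "effectiveDateTime" in o
    ]
    latest = max(dates, default=None)
    if latest is None or datetime.now(timezone.utc) - latest > timedelta(days=max_age_days):
        return {
            "summary": "Patient may be overdue for an A1c",
            "indicator": "warning",
            "source": {"label": "glycemic_advisor (example)"},
        }
    return None  # up to date: no card needed
```

In a real elixir, the card returned here would surface directly in the EHR via CDS‑Hooks, tied to the patient in context.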


It lets non‑technical users build real workflows

Healthcare tasks aren’t simple. They involve multiple steps, multiple data sources, and multiple decisions. DHTI is built for that.

A clinician might say:

“I want something that looks at a patient’s skin images, compares them to previous ones, and drafts a note.”

DHTI can turn that into:

  • A workflow that loads images
  • A reasoning step that describes changes
  • A draft note
  • A preview card

It’s not just generating text—it’s building a mini‑application.


It makes experimentation safe and fast

One of the biggest barriers in healthcare innovation is simply being able to try things. Real patient data is locked down (as it should be). EHR systems are hard to access. And IT teams are stretched thin.

DHTI solves this by including:

  • Synthetic data that looks realistic but contains no PHI
  • A ready‑to‑use FHIR server
  • Prebuilt agent templates
  • A local environment you can spin up quickly

This means you can test ideas without waiting for approvals, access, or integration.

You can play.
You can explore.
You can see what works.

And that’s where the best ideas come from.


It smooths the path from prototype to production

Prototyping is fun. Deploying is not.

Healthcare IT teams have to think about:

  • Security
  • Compliance
  • Standards
  • Maintenance
  • Integration
  • Auditing

DHTI is built with these realities in mind. Because everything is structured, modular, and standards‑aligned from the start, IT teams don’t have to rebuild the prototype from scratch. They can refine it, secure it, and deploy it.

This is the difference between “cool demo” and “something translatable to practice.”


The Copilot SDK: your agent, packed right into the app

One of the most exciting pieces of this ecosystem is the Copilot SDK, making the AI agent directly available in DHTI—no external tools, no switching windows, no juggling platforms.

You can:

  • Build the agent
  • Test it
  • Tweak it

All in one place.

For vibe coding, this is huge. It means the conversation that creates the tool can happen inside the tool itself. Clinicians can test ideas in the same interface where they’ll eventually use them. Developers can refine behavior without rebuilding infrastructure.

It’s a tight, elegant loop.


Why this moment matters

Healthcare has always been full of ideas. What it hasn’t had is a way to turn those ideas into working software without months of meetings, requirements documents, and integration headaches.

Vibe coding changes the front end of innovation.
DHTI changes the back end.

Together, they make it possible for:

  • Clinicians to prototype ideas
  • Researchers to test hypotheses
  • Developers to build faster
  • IT teams to deploy safely
  • Organizations to innovate sustainably

It’s not just a new tool.
It’s a new way of building.


Last but not least, thank you, Hanson Professional Services, for supporting this project! Version 1 will debut at the Medical Informatics Europe Conference 2026 in Genoa, Italy, taking place May 26–28, 2026. Read more about DHTI and try it today! It is free and open-source. See the repository link below. Please comment/share if you find this useful!

Hanson - DHTI

How DHTI Makes MCP Practical for Healthcare Through “Docktor” (Part IV)

The previous post in this series explained why LLMs need tools, why the agentic pattern matters, and how standards like MCP and A2A make tool‑calling safe and interoperable. But standards alone don’t guarantee usability—especially in healthcare, where clinicians and researchers need systems that “just work.” This is where DHTI steps in, transforming the complexity of MCP into something deployable, maintainable, and clinician‑friendly.

A key part of this transformation is DHTI’s integration with MCPX, a production‑ready gateway for managing MCP servers at scale. MCPX is powerful, but on its own it still requires engineering expertise. DHTI removes that barrier by packaging MCPX inside its own container environment and extending it with a new feature called docktor, which makes installing healthcare algorithms as simple as running a single command.

Let’s unpack how this works.

Image credit: sOER Frank, CC BY 2.0 https://creativecommons.org/licenses/by/2.0, via Wikimedia Commons


What MCPX Is and Why It Matters

MCPX is an open‑source, production‑grade gateway created by Lunar.dev to orchestrate and manage multiple MCP servers. It provides a unified gateway for an entire MCP ecosystem, giving teams centralized control over which tools are exposed to which agents, how they are configured, and how they perform. MCPX acts as an aggregator, meaning it can connect to many MCP servers and present them as a single, coherent interface to an LLM or agent.

Several key capabilities stand out:

  • MCPX dynamically manages multiple MCP servers through simple configuration changes, enabling zero‑code integration with MCP‑compatible services.
  • It centralizes tool discovery, access control, call prioritization, and usage tracking, making it suitable for production‑grade agentic systems.
  • It is designed for environments where the number of tools and servers grows rapidly, providing visibility and governance across all interactions.

In short, MCPX solves the “tool sprawl” problem: instead of wiring dozens of MCP servers manually, you point everything to MCPX and let it orchestrate the rest.

But MCPX is still a developer‑oriented platform. It assumes familiarity with Node.js, configuration files, environment variables, and container orchestration. That’s where DHTI changes the game.


How DHTI Makes MCPX Easy

DHTI embeds MCPX directly into its Docker‑based architecture. Instead of requiring users to install MCPX manually, configure it, and manage its lifecycle, DHTI:

  • Deploys MCPX automatically inside its container environment
  • Configures MCPX to work with DHTI’s FHIR, CDS‑Hooks, and agentic components
  • Exposes MCPX to agents without requiring any user‑side setup
  • Handles the installation of algorithms/calculators packaged as MCP servers with a single command

For healthcare teams, this is transformative. MCPX becomes invisible infrastructure: powerful, flexible, and standards‑compliant, but never something clinicians need to touch.


Docker‑in‑Docker: Deploying Healthcare Algorithms as Tools

Healthcare algorithms and calculators increasingly ship as Docker containers. This is already common in research environments and many FHIR‑related tools. Docker packaging ensures reproducibility, version control, and portability across systems.

DHTI extends this model by enabling Docker‑in‑Docker deployments inside MCPX. In practice, this means:

  1. A research team packages an algorithm—say, a stroke‑risk calculator or EEG classifier—into a Docker container.
  2. They follow a simple standard:
    • The container must accept a patient ID as input
    • It must be able to access a FHIR server for structured data
    • It must be able to read from a local folder for unstructured data (EEG, radiology images, PDFs, etc.)
  3. DHTI installs this container inside MCPX, where it becomes an MCP tool.
  4. Agents can now call the tool using MCP, passing only the patient ID.

This architecture ensures that algorithms remain isolated, reproducible, and easy to update—just replace the container with a new version.
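
As a rough illustration of that container contract, the entrypoint of such an algorithm container might look like the following. The environment variable names (`FHIR_BASE_URL`, `DATA_DIR`) and the argument convention are assumptions made for this sketch; DHTI's actual contract may differ.

```python
import json
import os
import sys
from urllib.request import urlopen

def fetch_patient(fhir_base, patient_id):
    """Fetch the Patient resource over the FHIR REST API (structured data)."""
    with urlopen(f"{fhir_base}/Patient/{patient_id}") as resp:
        return json.load(resp)

def run(patient_id, patient, data_dir=""):
    """Core algorithm step, separated from I/O so it is easy to test.

    A real algorithm would run its model here; this version just echoes
    the inputs it was given.
    """
    path = os.path.join(data_dir, patient_id)
    # Unstructured data (EEG, images, PDFs) staged in a local folder per patient.
    files = sorted(os.listdir(path)) if data_dir and os.path.isdir(path) else []
    return {"patient_id": patient_id, "resource": patient.get("resourceType"), "files": files}

if __name__ == "__main__":
    # Hypothetical contract: patient ID as argv[1], FHIR base and data dir from env.
    pid = sys.argv[1]
    patient = fetch_patient(os.environ["FHIR_BASE_URL"], pid)
    print(json.dumps(run(pid, patient, os.environ.get("DATA_DIR", ""))))
```

Keeping the I/O at the edges means the same container can be exercised against a synthetic FHIR server during development and a real one in production.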


Why Standards Matter for Healthcare Algorithms

Healthcare algorithms are not arbitrary scripts. They must follow predictable patterns so they can be safely integrated into clinical workflows. The most common requirements include:

  • FHIR server access for structured data such as vitals, labs, medications, and encounters.
    • Many FHIR servers—including Microsoft’s open‑source implementation—run as Docker containers, making them easy to integrate into DHTI’s environment.
  • Local folder access for unstructured data such as EEG files, radiology images, or waveform data.
  • Patient‑ID‑based invocation, which keeps the interface simple and consistent across tools.

By standardizing these expectations, DHTI ensures that any algorithm packaged by a research team can be installed and used by clinicians without custom engineering.


Introducing “Docktor”: One‑Command Installation for Clinicians

The most exciting part of this new architecture is docktor, a DHTI feature that lets clinicians install algorithms with a single command.

Instead of:

  • cloning GitHub repositories
  • configuring environment variables
  • wiring MCP servers manually
  • setting up Docker networks
  • mapping volumes
  • writing MCP configuration files

…a doctor simply runs:

dhti-cli docktor install <algorithm-name>

Behind the scenes, DHTI:

  1. Pulls the Docker image
  2. Deploys it inside MCPX using Docker‑in‑Docker
  3. Registers it as an MCP tool
  4. Grants it access to FHIR and local data folders
  5. Makes it available to agents immediately

The installed algorithm becomes a first‑class MCP tool, callable by any agent in the system.

This is the democratization of GenAI in action: clinicians gain the ability to extend their AI stack without needing a DevOps team.


Seamless Data Access and Agent Output

Once installed, these tools behave like native components of the agentic workflow. An agent can call a tool with:

patient_id: "12345"

The tool retrieves the necessary data from the FHIR server or local folders, runs the algorithm, and returns structured output to the agent. The agent can then:

  • summarize results
  • generate recommendations
  • integrate findings into a CDS‑Hooks card
  • write notes back into the EMR (if permitted)

The entire workflow becomes smooth, modular, and maintainable.


Why This Matters for Healthcare

Healthcare AI has long been held back by deployment friction. Researchers build algorithms, but clinicians struggle to use them. EMR vendors offer APIs, but integrating new tools requires engineering resources most hospitals don’t have.

DHTI + MCPX + docktor changes that equation:

  • Researchers package algorithms as Docker containers
  • DHTI installs them with one command
  • MCPX orchestrates them as standardized MCP tools
  • Agents call them using patient IDs
  • Clinicians get actionable results inside their workflows

This is the missing layer that turns agentic AI from a prototype into a practical clinical platform.


What’s Next

In the next post, I’ll walk through practical examples of docktor in action—how a clinician might install a stroke‑risk calculator, a dermatology classifier, or a genomics pipeline, and how agents use these tools to deliver meaningful clinical insights.

Try the following command today!

npx dhti-cli docktor --help

LLMs, Agentic Patterns, and Practical Healthcare: Why Tools Matter (Part III)

TL;DR Large language models (LLMs) are powerful at reasoning and language but cannot perform real-world actions on their own; the agentic pattern—exposing callable tools or functions—is the practical solution that lets LLMs drive systems safely and reliably. Try DHTI — help us democratize GenAI.

Image credit: JPxG, Public domain, via Wikimedia Commons


LLMs excel at understanding, summarizing, and generating text, but they are not actuators: they cannot click buttons, run code, or update records by themselves. To bridge that gap, engineers use the agentic design pattern in which an LLM is paired with tools—well-defined functions that actually perform actions (search, execute code, call APIs, update databases). This pattern is widely discussed in recent engineering guides and industry posts about tool use for agents.

How the agentic pattern works
An agentic system exposes a catalog of tools (functions) with clear schemas. The LLM decides which tool to call and with what parameters; the tool executes the action and returns structured results the model can reason over. This separation keeps the model focused on decision-making while delegating side effects to auditable, testable code.

In healthcare, clinical calculators and scoring algorithms (e.g., eGFR, SOFA, CHA2DS2-VASc) can be exposed as callable services so the model can compute and return validated numeric outputs rather than guessing formulas.
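
As a hedged example of such a callable service, here is the 2021 race‑free CKD‑EPI creatinine equation written as a plain Python function. The function name is mine, not a standard API, and the coefficients should be verified against the published equation before any clinical use.

```python
def egfr_ckd_epi_2021(scr_mg_dl, age_years, female):
    """Estimate GFR (mL/min/1.73 m^2) with the 2021 race-free CKD-EPI
    creatinine equation.

    Exposing a validated formula like this as a tool lets the LLM request
    a computed value instead of reproducing the arithmetic itself.
    """
    kappa = 0.7 if female else 0.9      # sex-specific creatinine threshold
    alpha = -0.241 if female else -0.302  # sex-specific low-range exponent
    egfr = (142
            * min(scr_mg_dl / kappa, 1.0) ** alpha
            * max(scr_mg_dl / kappa, 1.0) ** -1.200
            * 0.9938 ** age_years)
    return egfr * 1.012 if female else egfr
```

An agent framework would publish this function with a schema (name, parameter types, units) so the model can call it with patient parameters and reason over the returned number.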

Standards that make agentic systems interoperable
Two emerging standards are central to scaling agentic AI: Model Context Protocol (MCP) and Agent2Agent (A2A). MCP is an open protocol designed to connect models to data sources and tools via a standardized server interface; it defines how tools are described, invoked, and secured so models can access files, APIs, and computations consistently. A2A is an open agent-to-agent communication protocol that enables different agents to coordinate, delegate, and stream messages in multi-agent ecosystems—think of it as the networking layer for autonomous agents.

MCP in healthcare: exposing calculators and models as tools
Medical calculators and algorithmic models can be packaged as MCP servers so any compliant model can call them with patient parameters and receive validated outputs. Open implementations and marketplaces already demonstrate medical-calculator MCP servers that expose dozens of clinical tools for integration into EMRs and workflows. Projects like AgentCare show how MCP servers can connect LLMs to EMRs (SMART on FHIR) to fetch vitals, labs, and run clinical workflows.

Practical barriers for clinicians
Despite the promise, non-technical clinicians face friction: setting up MCP servers, wiring tools into EMRs, and maintaining models and calculators requires engineering resources. Algorithms and models also evolve—guidelines change, new equations are published—so a one-time integration can become stale quickly.

Where DHTI fits
This is where DHTI steps in: by lowering the technical barrier, managing tool deployments, and keeping clinical algorithms up to date, DHTI helps healthcare teams adopt agentic GenAI without deep engineering overhead. In the next post, I will explain exactly how DHTI handles deployment, governance, and lifecycle updates.

Why DHTI Chains Matter: Moving Beyond Single LLM Calls in Healthcare AI (Part II)

Large Language Models (LLMs) are powerful, but a single LLM call is rarely enough for real healthcare applications. Out of the box, LLMs lack memory, cannot use tools, and cannot reliably perform multi‑step reasoning—limitations highlighted in multiple analyses of LLM‑powered systems. In clinical settings, where accuracy, context, and structured outputs matter, relying on a single prompt‑response cycle is simply not viable.

Healthcare workflows require the retrieval of patient data, contextual reasoning, validation, and often the structured transformation of model output. A single LLM call cannot orchestrate these steps. This is where chains become essential.


Image credit: FASING Group, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

What Are Chains, and Why Do They Matter?

A chain is a structured workflow that connects multiple steps—LLM calls, data transformations, retrieval functions, or even other chains—into a coherent pipeline. LangChain describes chains as “assembly lines for LLM workflows,” enabling multi‑step reasoning and data processing that single calls cannot achieve.

Chains allow developers to:

  • Break complex tasks into smaller, reliable steps
  • Enforce structure and validation
  • Integrate external tools (e.g., FHIR APIs, EMR systems)
  • Maintain deterministic flow in safety‑critical environments

In healthcare, this is crucial. For example, generating a patient‑specific summary may require:

  1. retrieving data from an EMR,
  2. cleaning and structuring it,
  3. generating a clinical narrative, and
  4. validating the output.

A chain handles this entire pipeline.
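
A minimal sketch of that four‑step pipeline, using plain Python functions composed in sequence. All step names and the sample data are illustrative; in a real chain, the retrieval step would hit a FHIR API and the narration step would be an LLM call.

```python
def retrieve(patient_id):
    # Placeholder for a FHIR/EMR query.
    return {"patient_id": patient_id, "labs": [{"name": "A1c", "value": 8.1}]}

def structure(record):
    # Clean and reshape raw results into a simple name -> value map.
    return {lab["name"]: lab["value"] for lab in record["labs"]}

def narrate(labs):
    # Placeholder for an LLM call that would draft the clinical narrative.
    return "; ".join(f"{k} is {v}" for k, v in sorted(labs.items()))

def validate(text):
    # Guardrail step: reject empty or malformed output before it leaves the chain.
    assert text, "empty narrative"
    return text

def summarize(patient_id):
    # Each step's output feeds the next -- the essence of a sequential chain.
    return validate(narrate(structure(retrieve(patient_id))))
```

The value of the chain is exactly this separation: each step can be tested, swapped, or validated independently.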


Sequential, Parallel, and Branch Flows

Modern LLM applications often require more than linear processing. LangChain supports three major flow types:

✅ Sequential Chains

Sequential chains run steps in order, where the output of one step becomes the input to the next. They are ideal for multi‑stage reasoning or data transformation pipelines.

✅ Parallel Chains

Parallel chains run multiple tasks at the same time—useful when extracting multiple data elements or generating multiple outputs concurrently. LangChain’s RunnableParallel enables this pattern efficiently.

✅ Branching Chains

Branch flows allow conditional logic—different paths depending on model output or data state. This is essential for clinical decision support, where logic often depends on patient‑specific conditions.

Together, these patterns allow developers to build robust, production‑grade AI systems that go far beyond simple prompt engineering.


Implementing Chains in LangChain and Hosting Them on LangServe

LangChain provides a clean, modular API for building chains, including prompt templates, LLM wrappers, and runnable components. LangServe extends this by exposing chains as FastAPI‑powered endpoints, making deployment straightforward.

This combination—LangChain + LangServe—gives developers a scalable, observable, and maintainable way to deploy multi‑step GenAI workflows.


DHTI: A Real‑World Example of Chain‑Driven Healthcare AI

DHTI embraces these patterns to build GenAI applications that integrate seamlessly with EMRs. DHTI uses:

  • Chains for multi‑step reasoning
  • LangServe for hosting GenAI services
  • FHIR for standards‑based data retrieval
  • CDS‑Hooks for embedding AI output directly into EMR workflows

This standards‑based approach ensures interoperability and makes it easy to plug GenAI into clinical environments without proprietary lock‑in. DHTI makes sharing chains remarkably simple by packaging each chain as a modular, standards‑based service that can be deployed, reused, or swapped without touching the rest of the system. Because every chain is exposed through LangServe endpoints and integrated using FHIR and CDS‑Hooks conventions, teams can share, version, and plug these chains into different EMRs or projects with minimal friction.

Explore the project here:


Try DHTI and Help Democratize GenAI in Healthcare

DHTI is open‑source, modular, and built on widely adopted standards. Whether you’re a researcher, developer, or clinician, you can use it to prototype safe, interoperable GenAI workflows that work inside real EMRs.

More examples of chains


✅ 1. Clinical Note → Problem List → ICD-10 Coding

Why chaining helps

A single LLM call struggles because:

  • The task is multi‑step: extract problems → normalize → map to ICD‑10.
  • Each step benefits from structured intermediate outputs.
  • Errors compound if the model tries to do everything at once.

Sequential Runnable Example

Step 1: Extract the structured problem list from the free‑text note
Step 2: Normalize problems to standard clinical terminology
Step 3: Map each normalized problem to ICD‑10 codes

This mirrors real clinical coding workflows and allows validation at each step.

Sequential chain sketch

  1. extract_problems(note_text)
  2. normalize_terms(problem_list)
  3. map_to_icd10(normalized_terms)
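
An executable toy version of those three steps might look like this. The synonym and ICD‑10 lookups are tiny illustrative dictionaries; in practice the extraction and normalization steps would be LLM calls with validation against a real terminology service.

```python
# Tiny illustrative lookup, not a real ICD-10 source.
ICD10 = {"type 2 diabetes mellitus": "E11.9", "essential hypertension": "I10"}

def extract_problems(note_text):
    # Stand-in for an LLM extraction step returning one problem per line.
    return [line.strip() for line in note_text.splitlines() if line.strip()]

def normalize_terms(problems):
    # Map common abbreviations to standard clinical terminology.
    synonyms = {"t2dm": "type 2 diabetes mellitus", "htn": "essential hypertension"}
    return [synonyms.get(p.lower(), p.lower()) for p in problems]

def map_to_icd10(terms):
    # Unmapped terms are flagged rather than silently dropped.
    return {t: ICD10.get(t, "UNMAPPED") for t in terms}

def code_note(note_text):
    # The sequential chain: extract -> normalize -> map.
    return map_to_icd10(normalize_terms(extract_problems(note_text)))
```

Because each stage emits structured output, a reviewer (or a validation step) can inspect the problem list before codes are assigned.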

✅ 2. Clinical Decision Support: Medication Recommendation With Safety Checks

Why chaining helps

A single LLM call might hallucinate or skip safety checks. A chain allows:

  • Independent verification steps
  • Parallel evaluation of risks
  • Branching logic based on findings

Parallel Runnable Example

Given a patient with multiple comorbidities:

Parallel tasks:

  • Evaluate renal dosing requirements
  • Check drug–drug interactions
  • Assess contraindications
  • Summarize guideline‑based first‑line therapies

All run simultaneously, then merged.

Parallel chain sketch

{
  renal_check: check_renal_function(patient),
  ddi_check: check_drug_interactions(patient),
  contraindications: check_contraindications(patient),
  guideline: summarize_guidelines(condition)
}
→ combine_and_recommend()

This mirrors how pharmacists and CDS systems work: multiple independent checks feeding into a final recommendation.
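
The same fan‑out/merge shape can be sketched in plain Python with a thread pool; LangChain's RunnableParallel plays this role in a real chain. The check functions here are simplified placeholders, not clinical logic.

```python
from concurrent.futures import ThreadPoolExecutor

def check_renal_function(patient):
    # Placeholder renal check keyed on an assumed "egfr" field.
    return "reduce dose" if patient.get("egfr", 90) < 30 else "standard dose"

def check_drug_interactions(patient):
    # Placeholder interaction check against one known risky pair.
    meds = set(patient.get("meds", []))
    return "interaction" if {"warfarin", "fluconazole"} <= meds else "none found"

def recommend(patient):
    with ThreadPoolExecutor() as pool:
        # Both checks run concurrently; neither depends on the other's output.
        renal = pool.submit(check_renal_function, patient)
        ddi = pool.submit(check_drug_interactions, patient)
        # Merge step: results feed a final recommendation.
        return {"renal": renal.result(), "interactions": ddi.result()}
```

Independent checks running in parallel and merging at the end is precisely what makes this pattern auditable: each check can fail, log, or be versioned on its own.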


✅ 3. Triage Assistant: Symptom Intake → Risk Stratification → Disposition

Why chaining helps

Triage requires conditional logic:

  • If red‑flag symptoms → urgent care
  • If moderate risk → telehealth
  • If low risk → self‑care

A single LLM call tends to blur risk categories. A branching chain enforces structure.

Branch Runnable Example

Step 1: Extract structured symptoms
Step 2: Risk stratification
Branch:

  • High risk → generate urgent-care instructions
  • Medium risk → generate telehealth plan
  • Low risk → generate self‑care guidance

Branch chain sketch

symptoms = extract_symptoms(input)
risk = stratify_risk(symptoms)

if risk == "high":
    return urgent_care_instructions(symptoms)
elif risk == "medium":
    return telehealth_plan(symptoms)
else:
    return self_care_plan(symptoms)

This mirrors real triage protocols (e.g., Schmitt/Thompson).


✅ Summary Table

Scenario | Why a Chain Helps | Best Runnable Pattern
Clinical note → ICD‑10 coding | Multi-step reasoning, structured outputs | Sequential
Medication recommendation with safety checks | Independent safety checks, guideline lookup | Parallel
Triage assistant | Conditional logic, different outputs based on risk | Branch

Bringing Generative AI Into the EHR: Why DHTI Matters (Part I)

Large Language Models (LLMs) are transforming how we think about clinical decision support, documentation, and patient engagement. Yet despite their impressive capabilities, LLMs have a fundamental limitation that becomes especially important in healthcare: LLMs are stateless. They do not remember prior interactions unless that information is explicitly included in the prompt. For clinical use, this means that patient‑specific data must be added to every prompt if we want the model to generate relevant, safe, and context‑aware output.

This is where the real challenge begins.

Image credit: Grzegorz W. Tężycki, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

Why Patient Context Matters for Generative AI in Healthcare

Healthcare workflows depend on rich, longitudinal patient data—medications, allergies, labs, imaging, diagnoses, and more. To generate clinically meaningful output, an LLM must be given this context. Without it, the model is essentially guessing.

But adding patient data to prompts is not as simple as it sounds. Extracting structured, reliable data from Electronic Medical Records (EMRs) is notoriously difficult. EMRs were not originally designed with AI integration in mind. Data may be siloed, inconsistently structured, or locked behind proprietary interfaces. Even when APIs exist, authentication, authorization, and data‑mapping complexities can slow down innovation.

FHIR: The Standard That Makes Interoperability Possible

Fortunately, the healthcare ecosystem has rallied around a modern interoperability standard: HL7® FHIR® (Fast Healthcare Interoperability Resources). FHIR provides a consistent, web‑friendly way to represent clinical data, making it easier for external applications—including AI systems—to retrieve patient information.

Most major EMRs now expose FHIR APIs that allow authorized systems to query patient‑specific data such as demographics, medications, conditions, and lab results. This shift has been transformative. Instead of custom integrations for each EMR vendor, developers can rely on a shared standard.

FHIR also underpins many modern interoperability frameworks, including SMART on FHIR and CDS‑Hooks. These standards are now widely adopted across the industry, with CDS‑Hooks explicitly designed to connect EMRs to external decision‑support services using FHIR data.

Displaying AI Output Inside the EMR: The Role of CDS‑Hooks

Retrieving data is only half the problem. Once an AI model generates insights, the output must be displayed inside the clinician’s workflow—not in a separate window, not in a separate app, and not in a place where it will be ignored.

This is where CDS‑Hooks comes in.

CDS‑Hooks is a standard that allows EMRs to call external decision‑support services at specific points in the clinical workflow. When a clinician opens a chart, writes an order, or reviews a medication list, the EMR can trigger a “hook” that sends key context—including the patient ID—to a backend service. That backend can then use FHIR APIs to retrieve the necessary patient data, run AI models, and return actionable “cards” that appear directly inside the EMR interface.

This pattern is powerful because:

  • It keeps clinicians in their workflow
  • It ensures AI output is tied to real‑time patient context
  • It avoids sending large amounts of PHI directly from the EMR to the AI model

In short, CDS‑Hooks is the bridge between EMRs and modern AI‑powered decision support.
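
To make the hook round trip concrete, here is a minimal sketch of a CDS‑Hooks service handler. The card field names (summary, indicator, source) follow the CDS‑Hooks specification; the handler logic and card text are illustrative.

```python
def handle_patient_view(request):
    """Handle a patient-view hook request and return CDS-Hooks cards.

    The EMR sends the hook context, including the patient ID; a real
    service would now query the FHIR server for that patient and run
    its model before composing the card.
    """
    patient_id = request["context"]["patientId"]
    card = {
        "summary": f"AI summary ready for patient {patient_id}",
        "indicator": "info",  # one of: info, warning, critical
        "source": {"label": "example GenAI service"},
    }
    return {"cards": [card]}
```

Note that only the patient ID crosses from the EMR to the service; the service pulls any further data it needs via FHIR, which is what keeps PHI flow minimal.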

DHTI: A Reference Architecture for GenAI in Healthcare

As interest in generative AI grows, developers and researchers need a framework that brings all these pieces together—LLMs, FHIR, CDS‑Hooks, EMR integration, and modular AI components. DHTI (Distributed Health Technology Interface) is one such open‑source project.

DHTI embraces the standards that matter:

  • FHIR for structured data exchange
  • CDS‑Hooks for embedding AI output in the EMR
  • LangServe for hosting modular GenAI applications
  • Ollama for local LLM hosting
  • OpenMRS as an open‑source EMR environment

The project’s documentation highlights how CDS‑Hooks is used to send patient context (including patient ID) and how backend services retrieve additional data using FHIR before generating AI‑driven insights. DHTI’s architecture is intentionally modular, allowing developers to prototype new GenAI “elixirs” (backend services) and UI “conches” (frontend components) that plug directly into an EMR environment.

You can explore the project here:

Why This Matters for the Future of Clinical AI

Healthcare AI must be:

  • Context‑aware
  • Integrated into clinical workflows
  • Standards‑based
  • Secure and privacy‑preserving
  • Interoperable across EMRs

LLMs alone cannot meet these requirements. But LLMs combined with FHIR, CDS‑Hooks, and frameworks like DHTI can.

This is how we move from isolated AI demos to real, production‑ready clinical tools.

Try DHTI and Help Democratize GenAI in Healthcare!

Loading MIMIC dataset onto a FHIR server in two easy steps

The integration of generative AI into healthcare has the potential to revolutionize the industry, from drug discovery to personalized medicine. However, the success of these applications hinges on the availability of high-quality, curated datasets such as MIMIC. These datasets are crucial for training and testing AI models to ensure they can perform tasks accurately and reliably. 

MIMIC dataset on FHIR server
Free Clip Art, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

The Medical Information Mart for Intensive Care (MIMIC) dataset is a comprehensive, freely accessible database developed by the Laboratory for Computational Physiology at MIT. It includes deidentified health data from over 40,000 critical care patients admitted to the Beth Israel Deaconess Medical Center between 2001 and 2012. The dataset encompasses a wide range of information, such as demographics, vital signs, laboratory test results, medications, and caregiver notes. MIMIC is notable for its detailed and granular data, which supports diverse research applications in epidemiology, clinical decision-making, and the development of electronic health tools. The open nature of the dataset allows for reproducibility and broad use in the scientific community, making it a valuable resource for advancing healthcare research. 

MIMIC-IV has been converted into the Fast Healthcare Interoperability Resources (FHIR) format and exported as newline-delimited JSON (ndjson). FHIR provides a structured way to represent healthcare data, ensuring consistency and reducing the complexity of data integration. However, importing the ndjson export of FHIR resources into a FHIR server can be challenging. Having the MIMIC-IV dataset loaded onto a FHIR server could be incredibly valuable: it would provide a consistent and reproducible environment for testing and developing generative AI applications. Researchers and developers could leverage this setup to create and refine AI models, ensuring they work effectively with standardized healthcare data. This could ultimately lead to more robust and reliable AI applications in healthcare. Here is how to do it in two easy steps using Docker and the MIMIC-IV demo dataset.

STEP 1: Start the FHIR server

Use Docker Compose to spin up the latest HAPI FHIR server with bulk data import enabled, using a docker-compose.yml file like the one below.

version: "3.7"

services:
  fhir:
    image: hapiproject/hapi:latest
    ports:
      - 8080:8080
    restart: "unless-stopped"
    environment:
      - hapi.fhir.bulkdata.enabled=true
      - hapi.fhir.bulk_export_enabled=true
      - hapi.fhir.bulk_import_enabled=true
      - hapi.fhir.cors.enabled=true
      - hapi.fhir.cors.allow_origin=*
      - hapi.fhir.enforce_referential_integrity_on_write=false
      - hapi.fhir.enforce_referential_integrity_on_delete=false
      - "spring.datasource.url=jdbc:postgresql://postgres-db:5432/postgres"
      - "spring.datasource.username=postgres"
      - "spring.datasource.password=postgres"
      - "spring.datasource.driverClassName=org.postgresql.Driver"
      - "spring.jpa.properties.hibernate.dialect=ca.uhn.fhir.jpa.model.dialect.HapiFhirPostgres94Dialect"

  postgres-db:
    image: postgis/postgis:16-3.4
    restart: "unless-stopped"
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=postgres
    ports:
      - 5432:5432
    volumes:
      - postgres-db:/var/lib/postgresql/data

volumes:
  postgres-db: ~

 

Please note that referential integrity on write is disabled: the ndjson files are loaded in batches, so a resource may reference another resource that has not been imported yet.

Run docker compose up to start the server at the following base URL: http://localhost:8080/fhir

STEP 2: Send a POST request to the $import endpoint. 

The full MIMIC-IV dataset is available here for credentialed users; the demo dataset used in the request below is available here. You don’t have to download the dataset: the request references the demo data sources by URL, and anyone can access the files as long as they conform to the terms of the license specified on this page. All the Docker environment needs is an internet connection. The FHIR $import operation performs a bulk data import into a FHIR server. The request body is a Parameters resource that specifies the FHIR resource types to be imported along with the URL of each type’s data file. I use the VS Code REST Client extension to make the request, and the format below follows its conventions; however, you can make the POST request in any way you prefer.

###
POST http://localhost:8080/fhir/$import HTTP/1.1
Prefer: respond-async
Content-Type: application/fhir+json

{
  "resourceType": "Parameters",
  "parameter": [ {
    "name": "inputFormat",
    "valueCode": "application/fhir+ndjson"
  }, {
    "name": "inputSource",
    "valueUri": "http://example.com/fhir/"
  }, {
    "name": "storageDetail",
    "part": [ {
      "name": "type",
      "valueCode": "https"
    }, {
      "name": "credentialHttpBasic",
      "valueString": "admin:password"
    }, {
      "name": "maxBatchResourceCount",
      "valueString": "500"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Observation"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/ObservationLabevents.ndjson"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Medication"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/Medication.ndjson"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Procedure"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/Procedure.ndjson"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Condition"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/Condition.ndjson"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Patient"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/Patient.ndjson"
    } ]
  } ]
}

That’s it! It takes a few minutes for the bulk import to complete, depending on your system resources. 
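To confirm the import worked, you can count the imported resources with a `_summary=count` search. The sketch below is illustrative, not part of any official tooling: `bundle_total` simply reads the `total` field of a searchset Bundle, and the commented lines show how it could be pointed at the live server from Step 1 (the expected count depends on the demo files, so none is assumed here).

```python
FHIR_BASE = "http://localhost:8080/fhir"  # base URL from Step 1

def bundle_total(bundle: dict) -> int:
    """Extract the matching-resource count from a searchset Bundle."""
    return int(bundle.get("total", 0))

# Live check (requires the server from Step 1 to be running):
#   import json, urllib.request
#   with urllib.request.urlopen(f"{FHIR_BASE}/Patient?_summary=count") as resp:
#       print(bundle_total(json.load(resp)))

# Offline example with a minimal Bundle of the shape the server returns
sample_bundle = {"resourceType": "Bundle", "type": "searchset", "total": 100}
print(bundle_total(sample_bundle))  # → 100
```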

Feel free to reach out if you’re interested in collaborating on developing a gold QA dataset for testing clinician-facing GenAI applications. My research is centered on creating and validating clinician-facing chatbots.

Cite this article as: Eapen BR. (November 20, 2024). Loading MIMIC dataset onto a FHIR server in two easy steps. Retrieved March 22, 2026, from https://nuchange.ca/2024/11/loading-mimic-dataset-onto-a-fhir-server-in-two-easy-steps.html.

Locally hosted LLMs

TL;DR: From my personal experiments (on an 8-year-old i5 laptop with 16 GB RAM), locally hosted LLMs are extremely useful for many tasks that do not require much model-captured knowledge.

Image credit: I, Luc Viatour, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

The era of relying solely on large language models (LLMs) for all-encompassing knowledge is evolving. As technology advances, the focus shifts towards more specialized and integrated systems. These systems combine the strengths of LLMs with real-time data access, domain-specific expertise, and interactive capabilities. This evolution aims to provide more accurate, context-aware, and up-to-date information, saving us time and addressing the limitations of static model knowledge.

I have started to realize that LLMs are more useful as language assistants that can summarize documents, write discharge summaries, and find relevant information in a patient’s medical record. The last task still has several unsolved limitations, and reliable diagnostic (or other) decision-making is still in the (distant?) future. In short, LLMs are becoming increasingly useful in healthcare as time-saving tools, but they are unlikely to replace us doctors as decision-makers soon. That raises an important question: do locally hosted LLMs (or even the smaller models) have a role to play? I believe they do!

Locally hosted large language models (LLMs) offer several key benefits. First, they provide enhanced data privacy and security, as all data remains on your local infrastructure, reducing the risk of breaches and unauthorized access. Second, they allow for greater customization and control over the hardware, software, and data used, enabling more tailored solutions. Additionally, locally hosted LLMs can operate offline, making them valuable in areas with unreliable internet access. Finally, they can reduce latency and potentially lower costs if you already have the necessary hardware. These advantages make locally hosted LLMs an attractive option for many users.  

Modern, user-friendly platforms like Ollama are significantly lowering the technical barriers to self-hosting large language models (LLMs). The availability of a range of open-source models on Hugging Face lowers the barrier even further.

I have been doing some personal experiments with Ollama (on Docker), Microsoft’s phi3:mini (language model), and all-minilm (embedding model), and I must say I am pleasantly surprised by the results! I have been running them on an 8-year-old i5 laptop with 16 GB RAM, as part of a project for democratizing Gen AI in healthcare, especially for resource-deprived areas (more about it here). The setup does a decent job of vectorizing health records and answering questions based on RAG. I also made a helpful personal writing assistant that is RAG-based. I am curious to know if anybody else in my network is doing similar experiments with locally hosted LLMs on personal hardware.
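The retrieval step of such a RAG setup boils down to cosine similarity over embedding vectors. The sketch below is a toy illustration: the 3-d vectors stand in for real embeddings that a model like all-minilm would produce (via Ollama’s embeddings endpoint), and `top_k` is a hypothetical helper name, not an Ollama API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=1):
    """Return indices of the k document chunks most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 3-d vectors standing in for all-minilm embeddings of record chunks
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.05, 0.0]
print(top_k(query, docs, k=2))  # → [0, 2]
```

The retrieved chunks would then be pasted into the prompt for the language model, which is the "augmented generation" half of RAG.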

Come, join us to make generative AI in healthcare more accessible! 

ChatGPT captured the imagination of the healthcare world, though it also fostered the rather misguided belief that all healthcare needs is a chatbot application that can make API calls. A more realistic and practical way to leverage generative AI in healthcare is to focus on specific problems that can benefit from its ability to synthesize and augment data, generate hypotheses and explanations, and enhance communication and education.

Generative AI Image credit: Bovee and Thill, CC BY 2.0 https://creativecommons.org/licenses/by/2.0, via Wikimedia Commons

One of the main challenges of applying generative AI in healthcare is that it requires a high level of technical expertise and resources to develop and deploy solutions. This creates a barrier for many healthcare organizations, especially smaller ones, that do not have the capacity or the budget to build or purchase customized applications. As a result, generative AI applications are often limited to large health systems that can invest in innovation and experimentation. Needless to say, this has widened the already large digital healthcare disparity.

One of my goals is to use some of the experience that I have gained as part of an early adopter team to increase the use and availability of Gen AI in regions where it can save lives. I think it is essential to incorporate this mission in the design thinking itself if we want to create applications that we can scale everywhere. What I envision is a platform that can host and support a variety of generative AI applications that can be easily accessed and integrated by healthcare organizations and professionals. The platform would provide the necessary infrastructure, tools, and services to enable developers and users to create, customize, and deploy generative AI solutions for various healthcare problems. The platform would also foster a community of practice and collaboration among different stakeholders, such as researchers, clinicians, educators, and patients, who can share their insights, feedback, and best practices. 

I have done some initial work, guided by my experience with OpenMRS, and I have been greatly inspired by Bhamini. The focus is on modular design at both the UI and API layers; the OpenMRS O3 and LangServe templates show promise here. I hope to release the first iteration on GitHub in late August 2024.

Do reach out in the comments below if you wish to join this endeavour, and together we can shape the future of healthcare with generative AI.

Read Part II

Why is RAG not suitable for all Generative AI applications in healthcare?

Retrieval-augmented generation (RAG) is a method of generating natural language that leverages external knowledge sources, such as large-scale text corpora. RAG first retrieves a set of relevant documents for a given input query or context and then uses these documents as additional input for a neural language model that generates the output text. RAG aims to improve the factual accuracy, diversity, and informativeness of the generated text by incorporating knowledge from the retrieved documents. 

RAG applications
Image credit: Nomen4Omen with relabelling by Felix QW, CC BY-SA 3.0 DE https://creativecommons.org/licenses/by-sa/3.0/de/deed.en, via Wikimedia Commons

However, it may not be suitable for all healthcare applications because of the following reasons: 

– RAG relies on the quality and relevance of the retrieved documents, which may not always be available or accurate for specific healthcare domains or tasks. For example, if the task is to generate a personalized treatment plan for a patient based on their medical history and symptoms, RAG may not be able to retrieve any relevant documents from a general-domain corpus, or it may retrieve outdated or inaccurate information that could harm the patient’s health.

– RAG may not be able to capture the complex and nuanced context of healthcare scenarios, such as the patient’s preferences, values, goals, emotions, or social factors. These aspects may not be explicitly stated in the retrieved documents, or they may require additional knowledge and reasoning to infer. For example, if the task is to generate empathetic and supportive messages for a patient who is diagnosed with a terminal illness, RAG may not be able to consider the patient’s psychological state, coping strategies, or family situation, and may generate generic or inappropriate responses that could worsen the patient’s distress.

– RAG may fall short when summarizing a patient’s medical history, as it may not extract the most relevant and important information from the retrieved documents, which can contain a lot of noise, redundancy, or inconsistency. For example, if the task is to generate a concise summary of a patient’s chronic conditions, medications, allergies, and surgeries, RAG may not be able to filter out irrelevant or outdated information, such as the patient’s demographics, vital signs, test results, or minor complaints, or it may include conflicting or duplicate information from different sources. This could lead to a confusing or inaccurate summary that could misinform the patient or the healthcare provider.

Therefore, RAG is not suitable for all Generative AI applications in healthcare, and it may require careful design, evaluation, and adaptation to ensure its safety, reliability, and effectiveness in specific healthcare contexts and tasks. 
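The retrieve-then-generate pattern described above can be sketched in a few lines. This is a deliberately toy illustration of the control flow only: the word-overlap retriever stands in for a real embedding search, and `generate` stands in for the LLM call; all names and the two-document corpus are made up for the example.

```python
def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for the LLM call: the answer is grounded in the context."""
    return f"Q: {query}\nContext: {' | '.join(context)}"

corpus = [
    "Metformin is a first-line therapy for type 2 diabetes.",
    "Warfarin requires regular INR monitoring.",
]
question = "first-line therapy for type 2 diabetes"
print(generate(question, retrieve(question, corpus)))
```

The failure modes listed above all live in the `retrieve` step: if the corpus lacks the needed document, or the retriever surfaces noisy or outdated text, the generator faithfully grounds its answer in the wrong context.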

Cite this article as: Eapen BR. (May 11, 2024). Why is RAG not suitable for all Generative AI applications in healthcare?. Retrieved March 22, 2026, from https://nuchange.ca/2024/05/why-is-rag-not-suitable-for-all-generative-ai-applications-in-healthcare.html.