eHealth Archives

Bringing Generative AI Into the EHR: Why DHTI Matters (Part I)

Large Language Models (LLMs) are transforming how we think about clinical decision support, documentation, and patient engagement. Yet despite their impressive capabilities, LLMs have a fundamental limitation that becomes especially important in healthcare: LLMs are stateless. They do not remember prior interactions unless that information is explicitly included in the prompt. For clinical use, this means that patient‑specific data must be added to every prompt if we want the model to generate relevant, safe, and context‑aware output.

This is where the real challenge begins.

Image credit: Grzegorz W. Tężycki, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

Why Patient Context Matters for Generative AI in Healthcare

Healthcare workflows depend on rich, longitudinal patient data—medications, allergies, labs, imaging, diagnoses, and more. To generate clinically meaningful output, an LLM must be given this context. Without it, the model is essentially guessing.
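To make the statelessness point concrete, here is a minimal sketch of assembling patient context into a prompt on every call. The `PatientContext` class and `build_prompt` function are hypothetical names used only for illustration; they are not part of any library or of DHTI.

```python
from dataclasses import dataclass, field


@dataclass
class PatientContext:
    """Hypothetical container for the patient data a prompt needs."""
    patient_id: str
    medications: list = field(default_factory=list)
    allergies: list = field(default_factory=list)
    conditions: list = field(default_factory=list)


def build_prompt(question: str, ctx: PatientContext) -> str:
    """Return a prompt that carries the patient context with it.

    The model keeps no memory between calls, so this text has to be
    rebuilt and sent in full with every request.
    """
    def section(title, items):
        body = "\n".join(f"- {item}" for item in items) or "- none recorded"
        return f"{title}:\n{body}"

    parts = [
        "You are assisting a clinician. Use only the patient data below.",
        section("Medications", ctx.medications),
        section("Allergies", ctx.allergies),
        section("Conditions", ctx.conditions),
        f"Question: {question}",
    ]
    return "\n\n".join(parts)
```

Writing "none recorded" for an empty section is a deliberate choice in this sketch: it tells the model that the data was checked and is empty, rather than leaving the gap open to guessing.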

But adding patient data to prompts is not as simple as it sounds. Extracting structured, reliable data from Electronic Medical Records (EMRs) is notoriously difficult. EMRs were not originally designed with AI integration in mind. Data may be siloed, inconsistently structured, or locked behind proprietary interfaces. Even when APIs exist, authentication, authorization, and data‑mapping complexities can slow down innovation.

FHIR: The Standard That Makes Interoperability Possible

Fortunately, the healthcare ecosystem has rallied around a modern interoperability standard: HL7® FHIR® (Fast Healthcare Interoperability Resources). FHIR provides a consistent, web‑friendly way to represent clinical data, making it easier for external applications—including AI systems—to retrieve patient information.

Most major EMRs now expose FHIR APIs that allow authorized systems to query patient‑specific data such as demographics, medications, conditions, and lab results. This shift has been transformative. Instead of custom integrations for each EMR vendor, developers can rely on a shared standard.
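As a rough sketch of what "a shared standard" means in practice, the snippet below builds the kind of FHIR REST URLs an authorized client might request. The base URL is a placeholder and the helper names are hypothetical; authentication is omitted, and which resources and search parameters a given server supports should be checked against its capability statement.

```python
from urllib.parse import urlencode


def fhir_search_url(base_url: str, resource_type: str, **params: str) -> str:
    """Build a FHIR search URL such as [base]/Condition?patient=123."""
    url = f"{base_url.rstrip('/')}/{resource_type}"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def patient_queries(base_url: str, patient_id: str) -> dict:
    """Return the URLs for a few common patient-specific queries."""
    return {
        # A direct read of the Patient resource by its id.
        "demographics": f"{base_url.rstrip('/')}/Patient/{patient_id}",
        "medications": fhir_search_url(
            base_url, "MedicationRequest", patient=patient_id),
        "conditions": fhir_search_url(
            base_url, "Condition", patient=patient_id),
        "labs": fhir_search_url(
            base_url, "Observation", patient=patient_id,
            category="laboratory"),
    }
```

The same four URLs would work, in principle, against any conformant FHIR server, which is the point of the standard: only the base URL and the authorization step change between vendors.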

FHIR also underpins many modern interoperability frameworks, including SMART on FHIR and CDS‑Hooks. These standards are now widely adopted across the industry, with CDS‑Hooks explicitly designed to connect EMRs to external decision‑support services using FHIR data.

Displaying AI Output Inside the EMR: The Role of CDS‑Hooks

Retrieving data is only half the problem. Once an AI model generates insights, the output must be displayed inside the clinician’s workflow—not in a separate window, not in a separate app, and not in a place where it will be ignored.

This is where CDS‑Hooks comes in.

CDS‑Hooks is a standard that allows EMRs to call external decision‑support services at specific points in the clinical workflow. When a clinician opens a chart, writes an order, or reviews a medication list, the EMR can trigger a “hook” that sends key context—including the patient ID—to a backend service. That backend can then use FHIR APIs to retrieve the necessary patient data, run AI models, and return actionable “cards” that appear directly inside the EMR interface.

This pattern is powerful because:

It keeps clinicians in their workflow
It ensures AI output is tied to real‑time patient context
It avoids sending large amounts of PHI directly from the EMR to the AI model

In short, CDS‑Hooks is the bridge between EMRs and modern AI‑powered decision support.
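The request and response flow described above can be sketched as a single function. This is an illustration under assumptions, not DHTI's implementation: `handle_patient_view` is a hypothetical name, and the `summarize` callback stands in for the FHIR retrieval and model call that a real backend would perform. The field names (`context.patientId`, `fhirServer`, `cards`, `summary`, `indicator`, `source`, `detail`) follow the CDS Hooks specification.

```python
def handle_patient_view(request: dict, summarize) -> dict:
    """Turn a CDS Hooks request into a response containing one card.

    `summarize(patient_id, fhir_server)` is a caller-supplied function
    (an assumption of this sketch) that fetches data and returns text.
    """
    patient_id = request["context"]["patientId"]
    fhir_server = request.get("fhirServer")  # optional in the request

    text = summarize(patient_id, fhir_server)

    card = {
        # The specification asks for a summary under 140 characters.
        "summary": text[:139],
        "indicator": "info",  # one of: info, warning, critical
        "source": {"label": "Example GenAI service"},
        "detail": text,
    }
    return {"cards": [card]}
```

Note that the EMR sends only identifiers and a server address; the backend decides which patient data to pull, which is what keeps the hook payload small.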

DHTI: A Reference Architecture for GenAI in Healthcare

As interest in generative AI grows, developers and researchers need a framework that brings all these pieces together—LLMs, FHIR, CDS‑Hooks, EMR integration, and modular AI components. DHTI (Distributed Health Technology Interface) is one such open‑source project.

DHTI embraces the standards that matter:

FHIR for structured data exchange
CDS‑Hooks for embedding AI output in the EMR
LangServe for hosting modular GenAI applications
Ollama for local LLM hosting
OpenMRS as an open‑source EMR environment

The project’s documentation highlights how CDS‑Hooks is used to send patient context (including patient ID) and how backend services retrieve additional data using FHIR before generating AI‑driven insights. DHTI’s architecture is intentionally modular, allowing developers to prototype new GenAI “elixirs” (backend services) and UI “conches” (frontend components) that plug directly into an EMR environment.

You can explore the project here:

DHTI: a reference architecture for Gen AI in healthcare and a skill platform for vibe coding!
https://github.com/dermatologist/dhti

Why This Matters for the Future of Clinical AI

Healthcare AI must be:

Context‑aware
Integrated into clinical workflows
Standards‑based
Secure and privacy‑preserving
Interoperable across EMRs

LLMs alone cannot meet these requirements. But LLMs combined with FHIR, CDS‑Hooks, and frameworks like DHTI can.

This is how we move from isolated AI demos to real, production‑ready clinical tools.

Try DHTI and Help Democratize GenAI in Healthcare

Published by Bell Eapen on January 7, 2026 | Permalink

Design Science Research in Healthcare: Bridging the Gap Between Ideas and Impact

In the world of healthcare research, the dominant paradigm has long been empirical and observational—studies that measure, compare, and validate phenomena to uncover truths. But what if the goal isn’t just to understand the world, but to change it? That’s where Design Science Research (DSR) comes in—a paradigm that’s less about observing and more about building, solving, and transforming. As a health informatics researcher working at the intersection of consultancy and academia, I’ve found DSR to be the most powerful lens for translating ideas into practice.

Image Credit: kevineriley, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

What Is Design Science Research?

Design Science Research, as defined by Hevner et al. (2004), focuses on the creation and evaluation of artifacts—tools, frameworks, models, or systems—that solve identified problems. Unlike traditional behavioral science, which seeks to explain phenomena, DSR aims to design solutions and assess their utility. In healthcare, this means building decision support tools, workflow optimizers, or data integration platforms that directly improve clinical or operational outcomes.

Hevner’s seminal work laid out seven guidelines for DSR, including the need for a clear problem definition, artifact relevance, rigorous evaluation, and contribution to knowledge. These principles have since guided a wave of innovation in health informatics, where the complexity of real-world systems demands more than just theoretical insight—it demands actionable design.

Why Traditional Research Falls Short in Healthcare Innovation

Traditional healthcare research often struggles with the “translational gap”—the chasm between discovery and implementation. A novel algorithm might predict sepsis with high accuracy, but without integration into clinical workflows, it remains a paper exercise. Similarly, a new policy framework might promise equity, but without tools to operationalize it, it’s just words.

This is where DSR shines. It doesn’t stop at the idea; it builds the bridge. It asks: What artifact can embody this idea? How will it be used? What constraints must it navigate? And most importantly: Does it work in practice?

Living at the Intersection: Consultancy Meets Research

My work often begins with a specific organizational challenge—say, a hospital struggling with fragmented dermatology referrals. This is the consultancy mode: solving a particular problem for a particular client. But as a researcher, I’m also asking: Is this problem part of a broader class? Can the solution be generalized?

This dual lens allows me to design artifacts that are both context-sensitive and theoretically grounded. For example, a referral triage tool built for one clinic might evolve into a modular framework for dermatology decision support across multiple institutions. That’s the essence of DSR: solving classes of problems through iterative design, evaluation, and abstraction.

The Role of Translational Designers in Health Research Teams

Most health research teams are rich in domain expertise—clinicians, epidemiologists, policy analysts. But they often lack what I call translational designers: people who can take a promising idea and make it work in the messy, constraint-laden world of healthcare delivery.

These designers are fluent in both theory and practice. They understand stakeholder needs, data limitations, regulatory constraints, and user experience. They build prototypes, test them in real settings, and refine them based on feedback. Without them, even the best ideas risk dying in the valley of death between research and implementation.

Making DSR Accessible and Impactful

One challenge with DSR is that it can feel abstract or overly technical. But at its heart, it’s a human-centered approach. It starts with people—their needs, frustrations, and goals—and builds solutions that fit their world. It’s not about perfect algorithms; it’s about useful artifacts.

To make DSR more accessible, I often use metaphors. I tell students: “Traditional research is like mapping the terrain. DSR is like building the bridge.” Or: “Empirical studies ask ‘what is?’ DSR asks ‘what could be?’” These reframings help shift the mindset from passive observation to active creation.

Final Thoughts: Designing the Future of Healthcare

As healthcare becomes more complex, data-rich, and digitally mediated, we need more than observational studies. We need designers—people who can build, test, and refine solutions that work in practice. Design Science Research offers a rigorous, impactful way to do just that.

Whether you’re a clinician with an idea, a researcher with a model, or a technologist with a prototype, DSR provides the scaffolding to turn insight into innovation. And if you’re looking for inspiration, check out DHTI—it’s a living example of how design can drive transformation.

DHTI: a reference architecture for Gen AI in healthcare and a skill platform for vibe coding!
https://github.com/dermatologist/dhti

Published by Bell Eapen on November 11, 2025 | Permalink

Locally hosted LLMs

TL;DR: From my personal experiments (on an 8-year-old i5 laptop with 16 GB RAM), locally hosted LLMs are extremely useful for many tasks that do not require much model-captured knowledge.

Image credit: I, Luc Viatour, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

The era of relying solely on large language models (LLMs) for all-encompassing knowledge is evolving. As technology advances, the focus shifts towards more specialized and integrated systems. These systems combine the strengths of LLMs with real-time data access, domain-specific expertise, and interactive capabilities. This evolution aims to provide more accurate, context-aware, and up-to-date information, saving us time and addressing the limitations of static model knowledge.

I have started to realize that LLMs are more useful as language assistants that can summarize documents, write discharge summaries, and find relevant information in a patient’s medical record. The last one still has several unsolved limitations, and reliable diagnostic (or other) decision-making is still in the (distant?) future. In short, LLMs are becoming increasingly useful in healthcare as time-saving tools, but they are unlikely to replace us doctors as decision-makers soon. That raises an important question: do locally hosted LLMs (or even the smaller models) have a role to play? I believe they do!

Locally hosted large language models (LLMs) offer several key benefits. First, they provide enhanced data privacy and security, as all data remains on your local infrastructure, reducing the risk of breaches and unauthorized access. Second, they allow for greater customization and control over the hardware, software, and data used, enabling more tailored solutions. Additionally, locally hosted LLMs can operate offline, making them valuable in areas with unreliable internet access. Finally, they can reduce latency and potentially lower costs if you already have the necessary hardware. These advantages make locally hosted LLMs an attractive option for many users.

Modern, user-friendly platforms like Ollama are significantly lowering the barrier for people without deep technical expertise to self-host large language models (LLMs). The availability of a range of open-source models on Hugging Face lowers it even further.

I have been doing some personal experiments with Ollama (on Docker), Microsoft’s phi3:mini (language model) and all-minilm (embedding model), and I must say I am pleasantly surprised by the results! I have been using an 8-year-old i5 laptop with 16 GB RAM. The setup is part of a project for democratizing Gen AI in healthcare, especially for resource-deprived areas (more about it here), and it does a decent job of vectorizing health records and answering questions based on RAG. I also made a helpful personal writing assistant that is RAG-based. I am curious to know if anybody else in my network is doing similar experiments with locally hosted LLMs on personal hardware.
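For readers unfamiliar with RAG, the retrieval step can be sketched in a few lines. This is not the code from my experiments: the vectors below are hand-written toy values standing in for what an embedding model such as all-minilm would produce, and the function names are hypothetical. The payload follows the shape of Ollama's `/api/generate` request (`model`, `prompt`, `stream`), but nothing is sent over the network here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, store, k=2):
    """Return the k passages whose vectors are closest to the query.

    `store` is a list of (text, vector) pairs.
    """
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


def build_generate_payload(question, passages, model="phi3:mini"):
    """Build a request body for a locally hosted model.

    The retrieved passages are placed in the prompt, so the model
    answers from the record rather than from its own stored knowledge.
    """
    context = "\n".join(f"- {p}" for p in passages)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return {"model": model, "prompt": prompt, "stream": False}
```

This is why small local models can be enough for the task: the knowledge comes from the retrieved passages, and the model mainly has to read and rephrase.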

Published by Bell Eapen on July 14, 2024 | Permalink

Kedro for multimodal machine learning in healthcare

Healthcare data is heterogeneous, spanning reports, tabular data, and images. Combining multiple modalities into a single model is challenging for several reasons. First, the different types of data may have different structures, formats, and scales, which makes them difficult to integrate. Second, some modalities may be missing or incomplete, which makes it difficult to train a model effectively. Third, different modalities may require different pre-processing and feature extraction techniques, which further complicates integration. Finally, large-scale, annotated datasets that cover multiple modalities are scarce. Despite these challenges, advances in deep learning, multi-task learning and transfer learning are making it possible to develop models that effectively combine multiple modalities and achieve reliable performance.

Kedro for multimodal machine learning

Kedro is an open-source Python framework that helps data scientists and engineers organize their code, increase productivity and collaboration, and deploy their models to production more easily. It works with popular libraries such as pandas, TensorFlow and PySpark, and follows best practices from software engineering, such as modularity and code reusability. Kedro supplies a standardized structure for organizing code, handling data and configuration, and running experiments. It also includes built-in support for versioning, logging, and testing, making it easier to implement reproducible and maintainable pipelines. Additionally, Kedro pipelines can be deployed on cloud platforms like AWS, GCP or Azure. This makes it a powerful tool for creating robust and scalable data science and data engineering pipelines.

I have built a few Kedro packages that make multimodal machine learning in healthcare easier. The packages supply prebuilt pipelines for preprocessing image, tabular and text data, and build fusion models that can be trained on multimodal data for easy deployment. The text preprocessing package currently supports BERT and CNN-text models. There is also a template that you can copy to build your own pipelines using the preprocessing pipelines that I have built. Any number and combination of data types are supported. Additionally, like any other Kedro pipeline, these can be deployed on Kubeflow and Vertex AI. Do comment below if you find these tools useful in your research.
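To give a feel for what "fusion" means, here is a small, framework-free sketch of one simple method: concatenating per-modality feature vectors, with a flag marking whether each modality was present. It is an illustration only and does not reproduce the code in the packages; `concat_fusion` and its arguments are hypothetical names.

```python
def concat_fusion(features: dict, dims: dict) -> list:
    """Concatenate per-modality features into one vector.

    `dims` maps each modality name to its expected vector length and
    fixes the order of the output. A missing modality (None or absent)
    is zero-filled, and a 0.0/1.0 presence flag follows each block so
    a downstream model can tell "missing" apart from "all zeros".
    """
    fused = []
    for name, size in dims.items():
        vec = features.get(name)
        if vec is None:
            fused.extend([0.0] * size)
            fused.append(0.0)  # modality absent
        else:
            if len(vec) != size:
                raise ValueError(
                    f"{name}: expected {size} values, got {len(vec)}")
            fused.extend(float(v) for v in vec)
            fused.append(1.0)  # modality present
    return fused
```

For example, with `dims = {"image": 2, "tabular": 3, "text": 2}` and no text report for a patient, the fused vector still has a fixed length of 10, so the same model can be trained on complete and incomplete records.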

Template for multi-modal machine learning in healthcare using Kedro. Combine reports, tabular data and images using various fusion methods.
https://github.com/dermatologist/kedro-multimodal

Published by Bell Eapen on January 25, 2023 | Permalink

Six things data scientists in healthcare should know

Healthcare, like most other fields, is eager to get on the data science bandwagon. Data scientists can make a huge difference in the way big data is utilized for clinical decision-making. However, there are paradigmatic differences in the way data scientists from quantitative fields view the world, compared to their clinical counterparts. This is especially true in the emerging fields of machine learning and artificial intelligence. This may lead to considerable inefficiencies. As a person trained in both fields, here is my take on this.

Data scientists should focus on the problem and not the solutions

Data scientists are excited about the latest GPT or BERT. Data scientists tend to refine the model a bit more using 10 more GPUs! In the process, they tend to solve problems that do not exist. From my experience practicing medicine in extremely resource-poor areas, simple solutions are valued more than BERT running on Kubernetes! This is true in the developed world as well, and many teams may have fundamental data needs that need to be tackled first.

Explanation comes before prediction

Emerging machine learning methods prioritize prediction accuracy, compromising on explainability in the process. Clinicians, in most cases, can neither use nor trust a model that arrives at a conclusion without showing how it got there. Hence, in the clinical domain, a simple logistic regression model may be more acceptable than a deep neural network. Parsimony is key, and a bit of feature selection to ensure parsimony will always be appreciated.

You need to know the clinical terminologies

A basic understanding of clinical terminologies and terminology systems such as SNOMED and ICD is vital. It helps in understanding the clinical community better. Any healthcare analytics effort needs to consider variations in terminology and adopt a standard system for consistency. Any tool that data scientists build for the clinical community should support terminology systems.

Biostatistics is more pervasive than you think

Most healthcare professionals are trained in biostatistics. Hence, their thinking leans towards populations, sampling, randomization, blinding and showing a ‘statistically significant’ difference. Moving towards machine learning needs a paradigmatic shift. It may be useful to have a discussion on this at the outset.

Classes are of unequal importance

In healthcare, finding one class (e.g. cancer) is more important than finding the other (e.g. no cancer). One class may need active intervention to save lives. Hence, sensitivity and specificity matter more than accuracy!
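A small worked example shows why accuracy alone misleads when classes are unequal. The numbers are made up for illustration: 1,000 patients, 10 of whom have cancer, and a model that labels everyone as "no cancer".

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, sensitivity and specificity from counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: share of true cases the model catches.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of non-cases correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical model that predicts "no cancer" for all 1,000 patients:
# 0 true positives, 0 false positives, 990 true negatives, 10 missed.
metrics = screening_metrics(tp=0, fp=0, tn=990, fn=10)
print(metrics)
# {'accuracy': 0.99, 'sensitivity': 0.0, 'specificity': 1.0}
```

The model is 99% accurate and misses every single cancer. Sensitivity exposes that failure immediately; accuracy hides it.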

Life is precious!

In healthcare, there is no room for error. Some decisions may have disastrous consequences, while others may save lives. As a data scientist in the healthcare domain, you should be cognizant of the fact that healthcare data is different from banking or airline data.

Published by Bell Eapen on November 3, 2021 | Permalink

Open-source for healthcare

This post is meant to be an instruction guide for healthcare professionals who would like to join my projects on GitHub.

What is a contribution?

Contribution is not always coding. You can clean things up, or add documentation and instructions for others to follow. Issues and feature requests should be posted under the ‘Issues’ tab and general discussions under the ‘Discussions’ tab if one is available.

How do I contribute?

Typically, you fork the repo and open a pull request (PR).
Leave ‘Allow edits by maintainers’ checked.
Submit your pull requests to the develop branch.
Star the repository and follow me 🙂

How do I develop?

The .devcontainer folder has the configuration for the Docker container used for development.
The version bump action (if present) automatically bumps the version based on the following terms in a commit message: major/minor/patch. Avoid these words in the commit message unless you want to trigger the action.
Most repositories have GitHub Actions to generate and deploy documentation and the changelog.

What do I do next?

My repositories (so far) are small enough for beginners to get the big picture and make meaningful contributions.
Don’t be discouraged if you make mistakes. That is how we all learn.

There’s no better time than now to choose a repo to contribute to!

Published by Bell Eapen on July 28, 2021 | Permalink

Are you an eHealth programmer or developer?

An eHealth programmer is someone who has written one or more computer programs that are used by someone else to solve a real-world problem.

Published by Bell Eapen on January 2, 2017 | Permalink

HL10: From Model to Framework

HL10 (Hamilton) is an attempt to take the BIT model and the sense-plan-act paradigm to the next level of a software framework.

Published by Bell Eapen on July 2, 2015 | Permalink

Physician-patient encounter for HMIS & eHealth

I have tried to summarize basic steps in a physician-patient encounter.

Published by Bell Eapen on May 7, 2015 | Permalink

Interoperability for Doctors and Healthcare professionals Part I

It is important for health information systems to talk to each other. Unfortunately they speak different languages. This article is not for eHealth experts to understand the nuances of interoperability (HIE), but for health care professionals to have an idea about what is out there and what can be expected in the future.

When we consider HIE, we have to think about what is being exchanged (package), how the information is organized (format) and how it is being transported (protocol). Though it is not essential to know them, a few terms that you might recognize are HL7 and XML for format, and HTTP and TCP/IP for protocol. (Have you heard of MLLP? Google it!) The donor has the information in a certain format and protocol, while the recipient expects it in a particular format and protocol.

At the core of all HIE platforms, such as Mirth Connect or Orion Health’s Rhapsody, is an engine that does this conversion: from the format and protocol of the donor to the format and protocol of the recipient. Simple, eh?
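For the curious, here is a toy sketch of the format half of that conversion: reading a pipe-delimited HL7 v2 lab message and producing a structure a web-based recipient could accept as JSON. The message is a made-up example, the function names are hypothetical, and real engines handle far more (escaping, repetitions, acknowledgements, and the transport itself).

```python
def parse_hl7_v2(message: str) -> dict:
    """Split an HL7 v2 message into segments and fields.

    Segments are separated by carriage returns and fields by "|".
    Each segment is kept as a list whose index matches the HL7 field
    number (index 0 is the segment name). MSH is the exception: its
    first field is the separator itself, so its indexes are off by one.
    """
    segments = {}
    for line in message.replace("\n", "\r").split("\r"):
        if line.strip():
            fields = line.split("|")
            segments.setdefault(fields[0], []).append(fields)
    return segments


def lab_results(message: str) -> dict:
    """Extract the patient id and lab values in a JSON-friendly shape."""
    parsed = parse_hl7_v2(message)
    pid = parsed["PID"][0]
    return {
        "patient_id": pid[3].split("^")[0],          # PID-3
        "results": [
            {
                "test": obx[3].split("^")[1],        # OBX-3 display name
                "value": obx[5],                     # OBX-5
                "unit": obx[6],                      # OBX-6
            }
            for obx in parsed.get("OBX", [])
        ],
    }


# A made-up message, for illustration only.
SAMPLE = (
    "MSH|^~\\&|LAB|HOSP|EMR|HOSP|202401010830||ORU^R01|MSG0001|P|2.5\r"
    "PID|1||12345^^^HOSP||DOE^JANE\r"
    "OBX|1|NM|GLU^Glucose||5.4|mmol/L\r"
)
```

Calling `lab_results(SAMPLE)` gives a dictionary with the patient id and one glucose result, which `json.dumps` can then serialize for delivery over HTTP.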

Is pragmatic interoperability the best solution?- M. Martineau @eHealthMusings explores a pragmatic Rhapsody approach http://t.co/XMkMXAEdt7

— Orion Health Canada (@OrionHealthCA) December 11, 2014

Most of these platforms have a user interface or IDE for making this connection. You can also introduce certain filters at this stage. Enterprise systems like Rhapsody present an attractive visual interface, whereas open-source solutions may not be as user-friendly.

What else can the engine do? It usually keeps a log of all package deliveries and whether each delivery was successful. If a delivery fails, it can try again and alert the maintenance team through a console. The console can even be mobile, as in Rhapsody. Though the engine can store the package itself for a limited time, storing the package is not really its job.
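The log-and-retry behaviour can be sketched in a few lines. This is a simplified illustration, not how any particular engine works: `deliver_with_retry` is a hypothetical name, `send` is whatever function actually transports the package, and a real engine would also wait between attempts and raise an alert on final failure.

```python
def deliver_with_retry(send, package, max_attempts=3):
    """Try to deliver a package, logging every attempt.

    Returns (delivered, log), where log has one entry per attempt.
    `send(package)` is expected to raise ConnectionError on failure.
    """
    log = []
    for attempt in range(1, max_attempts + 1):
        try:
            send(package)
        except ConnectionError as exc:
            log.append({"attempt": attempt, "ok": False, "error": str(exc)})
        else:
            log.append({"attempt": attempt, "ok": True, "error": None})
            return True, log
    # All attempts failed; this is where the console alert would go.
    return False, log
```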

The donors could be:

A single department in a hospital sending lab reports.
All departments in a hospital sending all sorts of information.
Several hospitals in a region.

The recipient could be:

Another department in the same hospital expecting a lab test report.
A family physician who wants real-time access to the lab reports for their patients admitted to the hospital.
A researcher who wants to know the blood sugar status of all patients with diabetes (population health).

We need a separate layer between the engine and the recipient to support all these use cases. Let us call this layer the mediator.

The mediator can pull data in real time from donors or store it in a local database. The first is the federated model, while the second is the centralized model. Federated is slow but up to date, while centralized is fast but may not be current. The mixed model has both and is preferred. The so-called clinical viewers are federated mediators with a web interface.
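One way to picture the mixed model is a mediator that answers from its local store when the copy is fresh enough, and goes back to the donor otherwise. The sketch below is an illustration under assumptions: the class name, the freshness window, and the `fetch_from_donor` callback are all hypothetical.

```python
import time


class MixedMediator:
    """Serve recent data locally; pull from the donor when it is stale."""

    def __init__(self, fetch_from_donor, max_age_seconds=300.0,
                 clock=time.monotonic):
        self._fetch = fetch_from_donor   # federated path (slow, current)
        self._max_age = max_age_seconds
        self._clock = clock
        self._store = {}                 # centralized path (fast, may lag)

    def get(self, patient_id):
        now = self._clock()
        cached = self._store.get(patient_id)
        if cached is not None and now - cached[0] <= self._max_age:
            return cached[1]             # fresh enough: answer locally
        try:
            record = self._fetch(patient_id)
        except ConnectionError:
            if cached is not None:
                return cached[1]         # donor down: stale beats nothing
            raise
        self._store[patient_id] = (now, record)
        return record
```

The freshness window is the design knob: set it to zero and the mediator behaves like a purely federated one; set it very high and it behaves like a centralized store.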

Emerging paradigms like NoSQL and RDF may be ideal for data representation in the mixed model. I have discussed RDF before. I will discuss NoSQL soon!

Published by Bell Eapen on December 13, 2014 | Permalink

Bell Eapen MD, PhD.
