FHIR Archives

Building a Modular Framework for Generative AI in Healthcare

A reference architecture designed to accelerate experimentation, deployment, and collaboration in healthcare AI.

Published by Bell Eapen on July 18, 2025 | Permalink

Loading MIMIC dataset onto a FHIR server in two easy steps

The integration of generative AI into healthcare has the potential to revolutionize the industry, from drug discovery to personalized medicine. However, the success of these applications hinges on the availability of high-quality, curated datasets such as MIMIC. These datasets are crucial for training and testing AI models to ensure they can perform tasks accurately and reliably.


The Medical Information Mart for Intensive Care (MIMIC) dataset is a comprehensive, freely accessible database developed by the Laboratory for Computational Physiology at MIT. Its MIMIC-III release includes deidentified health data from over 40,000 critical care patients admitted to the Beth Israel Deaconess Medical Center between 2001 and 2012. The dataset covers demographics, vital signs, laboratory test results, medications, and caregiver notes. MIMIC is notable for its detailed and granular data, which supports diverse research applications in epidemiology, clinical decision-making, and the development of electronic health tools. The open nature of the dataset allows for reproducibility and broad use in the scientific community, making it a valuable resource for advancing healthcare research.

MIMIC-IV has been converted into the Fast Healthcare Interoperability Resources (FHIR) format and exported as newline-delimited JSON (ndjson), in which each line of a file is one complete FHIR resource. FHIR provides a structured way to represent healthcare data, ensuring consistency and reducing the complexity of data integration. However, importing the ndjson export of FHIR resources into a FHIR server can be challenging.

Having the MIMIC-IV dataset loaded onto a FHIR server provides a consistent and reproducible environment for developing and testing generative AI applications against standardized healthcare data, which could ultimately lead to more robust and reliable AI applications in the healthcare sector. Here I show you how to do it in two easy steps using Docker and the MIMIC-IV demo dataset.
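To make the ndjson format concrete, here is a minimal Python sketch that parses newline-delimited JSON into one dictionary per line. The two sample resources are made up for illustration and are not taken from MIMIC; the helper name `parse_ndjson` is my own.

```python
import json


def parse_ndjson(text):
    """Return one dict per non-empty line of newline-delimited JSON."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Two made-up, minimal resources (not MIMIC data), one per line.
sample = (
    '{"resourceType": "Patient", "id": "example-1", "gender": "female"}\n'
    '{"resourceType": "Patient", "id": "example-2", "gender": "male"}\n'
)

resources = parse_ndjson(sample)
print([resource["id"] for resource in resources])
```

Because every line is independent, a server (or a script like this) can stream a large export line by line instead of loading one huge JSON document.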

STEP 1: Start the FHIR server

Use Docker Compose to spin up the latest HAPI FHIR server with bulk data import enabled, using the docker-compose.yml file below.

version: "3.7"

services:
  # HAPI FHIR server, exposed on port 8080
  fhir:
    image: hapiproject/hapi:latest
    ports:
      - 8080:8080
    restart: "unless-stopped"
    environment:
      # Bulk data export and import
      - hapi.fhir.bulkdata.enabled=true
      - hapi.fhir.bulk_export_enabled=true
      - hapi.fhir.bulk_import_enabled=true
      # Allow browser clients from any origin (development only)
      - hapi.fhir.cors.enabled=true
      - hapi.fhir.cors.allow_origin=*
      # Do not reject resources whose references are not loaded yet
      - hapi.fhir.enforce_referential_integrity_on_write=false
      - hapi.fhir.enforce_referential_integrity_on_delete=false
      # Use the PostgreSQL container below as the database
      - "spring.datasource.url=jdbc:postgresql://postgres-db:5432/postgres"
      - "spring.datasource.username=postgres"
      - "spring.datasource.password=postgres"
      - "spring.datasource.driverClassName=org.postgresql.Driver"
      - "spring.jpa.properties.hibernate.dialect=ca.uhn.fhir.jpa.model.dialect.HapiFhirPostgres94Dialect"

  # PostgreSQL database backing the FHIR server
  postgres-db:
    image: postgis/postgis:16-3.4
    restart: "unless-stopped"
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=postgres
    ports:
      - 5432:5432
    volumes:
      - postgres-db:/var/lib/postgresql/data

# Named volume so the database survives container restarts
volumes:
  postgres-db: ~

Please note that referential integrity on write is set to false, so that a resource can be imported before the resources it references have been loaded.

Run docker compose up to start the server. The FHIR base URL is http://localhost:8080/fhir
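Before sending the import request, it can help to confirm that the server is answering. FHIR servers publish a CapabilityStatement at the `metadata` path under the base URL, so a quick check might look like the sketch below. The helper names are my own, and the sample statement at the end is a made-up stand-in so the snippet can be tried without a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/fhir"  # base URL of the server started above


def metadata_url(base_url):
    """Build the URL of the server's CapabilityStatement."""
    return base_url.rstrip("/") + "/metadata"


def describe(capability_statement):
    """Summarize a CapabilityStatement, or fail if it is something else."""
    if capability_statement.get("resourceType") != "CapabilityStatement":
        raise ValueError("not a CapabilityStatement")
    return "FHIR version " + capability_statement.get("fhirVersion", "unknown")


def fetch_capability_statement(base_url=BASE_URL):
    """GET [base]/metadata; call this only once the containers are up."""
    request = urllib.request.Request(
        metadata_url(base_url), headers={"Accept": "application/fhir+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Offline demonstration with a made-up, minimal statement.
sample = {"resourceType": "CapabilityStatement", "fhirVersion": "4.0.1"}
print(metadata_url(BASE_URL))
print(describe(sample))
```

Once the containers are running, `describe(fetch_capability_statement())` performs the real check. HAPI can take a little while to start, so a connection error right after startup usually just means it is not ready yet.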

STEP 2: Send a POST request to the $import endpoint.

The full MIMIC-IV dataset is available here for credentialed users. The demo dataset used in the request below is available here. You don’t have to download the dataset: the request below contains the URLs of the demo data files. Anyone can access the files, as long as they conform to the terms of the license specified in this page. All you need is an internet connection for the Docker environment.

The FHIR $import operation performs a bulk data import into a FHIR server. The request body is a Parameters resource that lists the resource types to be imported and their respective data files. I use the VS Code REST Client extension to make the request, and the format below follows its conventions. However, you can make the POST request in any way you prefer.

###
POST http://localhost:8080/fhir/$import HTTP/1.1
Prefer: respond-async
Content-Type: application/fhir+json

{
  "resourceType": "Parameters",
  "parameter": [ {
    "name": "inputFormat",
    "valueCode": "application/fhir+ndjson"
  }, {
    "name": "inputSource",
    "valueUri": "http://example.com/fhir/"
  }, {
    "name": "storageDetail",
    "part": [ {
      "name": "type",
      "valueCode": "https"
    }, {
      "name": "credentialHttpBasic",
      "valueString": "admin:password"
    }, {
      "name": "maxBatchResourceCount",
      "valueString": "500"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Observation"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/ObservationLabevents.ndjson"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Medication"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/Medication.ndjson"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Procedure"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/Procedure.ndjson"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Condition"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/Condition.ndjson"
    } ]
  }, {
    "name": "input",
    "part": [ {
      "name": "type",
      "valueCode": "Patient"
    }, {
      "name": "url",
      "valueUri": "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/Patient.ndjson"
    } ]
  } ]
}
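If you would rather script the request than use the REST Client, the same Parameters body can be assembled in Python. This is a sketch using only the standard library; the function names are my own, and the values are copied from the request above. Nothing is sent until you call `start_import()` yourself.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/fhir"
DEMO_ROOT = "https://physionet.org/files/mimic-iv-fhir-demo/2.0/mimic-fhir/"

# (resource type, file name) pairs copied from the request above.
INPUTS = [
    ("Observation", "ObservationLabevents.ndjson"),
    ("Medication", "Medication.ndjson"),
    ("Procedure", "Procedure.ndjson"),
    ("Condition", "Condition.ndjson"),
    ("Patient", "Patient.ndjson"),
]


def build_import_parameters(inputs, root=DEMO_ROOT):
    """Build the Parameters resource: fixed settings plus one 'input' per file."""
    parameters = [
        {"name": "inputFormat", "valueCode": "application/fhir+ndjson"},
        {"name": "inputSource", "valueUri": "http://example.com/fhir/"},
        {"name": "storageDetail", "part": [
            {"name": "type", "valueCode": "https"},
            {"name": "credentialHttpBasic", "valueString": "admin:password"},
            {"name": "maxBatchResourceCount", "valueString": "500"},
        ]},
    ]
    for resource_type, file_name in inputs:
        parameters.append({"name": "input", "part": [
            {"name": "type", "valueCode": resource_type},
            {"name": "url", "valueUri": root + file_name},
        ]})
    return {"resourceType": "Parameters", "parameter": parameters}


def start_import(base_url=BASE_URL):
    """POST the Parameters resource to $import; requires the running server."""
    body = json.dumps(build_import_parameters(INPUTS)).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/$import",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/fhir+json",
            "Prefer": "respond-async",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.status, response.read().decode("utf-8")


print(json.dumps(build_import_parameters(INPUTS[:1]), indent=2))
```

Keeping the file list in `INPUTS` makes it easy to add further resource files from the demo dataset without touching the rest of the request.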

That’s it! It takes a few minutes for the bulk import to complete, depending on your system resources.
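One way to see whether resources have arrived is to ask the server for counts. A FHIR search with `_summary=count` returns a Bundle whose `total` field holds the number of matches. The sketch below assumes the base URL from Step 1; the helper names are my own, and the sample Bundle is made up so the parsing can be tried offline.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/fhir"


def count_url(base_url, resource_type):
    """Search URL that asks only for the number of matching resources."""
    return f"{base_url.rstrip('/')}/{resource_type}?_summary=count"


def total_from_bundle(bundle):
    """Read the match count from a searchset Bundle (None if absent)."""
    if bundle.get("resourceType") != "Bundle":
        raise ValueError("expected a Bundle")
    return bundle.get("total")


def fetch_count(resource_type, base_url=BASE_URL):
    """Query the running server; call this after the import has started."""
    request = urllib.request.Request(
        count_url(base_url, resource_type),
        headers={"Accept": "application/fhir+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return total_from_bundle(json.load(response))


# Offline demonstration with a made-up response (3 is a placeholder value).
sample = {"resourceType": "Bundle", "type": "searchset", "total": 3}
print(count_url(BASE_URL, "Patient"))
print(total_from_bundle(sample))
```

Calling `fetch_count("Patient")` a few times while the job runs shows the count growing until the import settles.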

Feel free to reach out if you’re interested in collaborating on developing a gold QA dataset for testing clinician-facing GenAI applications. My research is centered on creating and validating clinician-facing chatbots.

Cite this article as: Eapen BR. (November 20, 2024). Loading MIMIC dataset onto a FHIR server in two easy steps. Retrieved August 7, 2025, from https://nuchange.ca/2024/11/loading-mimic-dataset-onto-a-fhir-server-in-two-easy-steps.html.

Published by Bell Eapen on November 20, 2024 | Permalink

OHDSI OMOP to FHIR mapper

TL;DR Below is an open-source command-line tool for converting an OHDSI OMOP cohort (defined in ATLAS) to a FHIR bundle and vice versa.


OHDSI OMOP CDM is one of the most popular clinical data models for health data warehouses. Its simple but clinically motivated data structure is intuitively appealing to clinicians, which has led to good adoption. In this respect, it has overtaken HL7 v3, which is more robust but has a steeper learning curve, especially for clinicians. The OHDSI OMOP CDM is widely used in the pharmaceutical industry for drug monitoring.

FHIR is emerging as the de facto standard for health system interoperability, owing largely to its simplicity and its use of existing, popular standards such as REST. As NoSQL databases become more and more popular in healthcare, FHIR can also be a good persistence schema. It aligns well with search technologies such as Elasticsearch.

As both standards are popular, conversion from one to the other may be commonly required. Researchers at Georgia Tech have an open-source tool, GT-FHIR2, for exposing an existing OHDSI OMOP CDM database as a FHIR endpoint. However, conversion between existing systems may not be easy with a full-stack solution.

I have a simpler solution that I believe will be useful in the following scenarios:

To export a cohort to a FHIR based analytics tool.
To load new resources to OMOP CDM databases for incremental ETL.

Omopfhirmap is a command-line tool for mapping an OHDSI cohort, defined in ATLAS, to a FHIR bundle that can optionally be submitted to a FHIR server for processing. Conversely, it can process a FHIR bundle and add resources to an existing CDM database, ignoring duplicates. Unlike GT-FHIR2 (the OMOP on FHIR project at Georgia Tech), omopfhirmap does not expose the OMOP database as FHIR endpoints.
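To give a feel for what such a mapping involves, here is a small Python sketch that turns rows shaped like the OMOP `person` table into FHIR Patient resources inside a Bundle. It is an illustration only, not omopfhirmap's actual mapping (the tool itself is written in Java); the function names and the sample row are hypothetical. The gender concept ids 8507 (male) and 8532 (female) are the standard OMOP concepts.

```python
# Standard OMOP gender concept ids mapped to FHIR administrative gender codes.
GENDER_BY_CONCEPT_ID = {8507: "male", 8532: "female"}


def person_to_patient(person):
    """Map one OMOP person row (a dict) to a minimal FHIR Patient."""
    patient = {
        "resourceType": "Patient",
        "id": str(person["person_id"]),
        "gender": GENDER_BY_CONCEPT_ID.get(person.get("gender_concept_id"), "unknown"),
    }
    year = person.get("year_of_birth")
    if year:
        # FHIR dates may be partial: YYYY, YYYY-MM or YYYY-MM-DD.
        birth_date = f"{year:04d}"
        month = person.get("month_of_birth")
        day = person.get("day_of_birth")
        if month:
            birth_date += f"-{month:02d}"
            if day:
                birth_date += f"-{day:02d}"
        patient["birthDate"] = birth_date
    return patient


def cohort_to_bundle(persons):
    """Wrap the mapped patients in a FHIR Bundle of type 'collection'."""
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [{"resource": person_to_patient(person)} for person in persons],
    }


# Hypothetical row for illustration.
row = {"person_id": 1, "gender_concept_id": 8532,
       "year_of_birth": 1980, "month_of_birth": 5, "day_of_birth": None}
print(cohort_to_bundle([row]))
```

A real mapper also has to translate OMOP concept ids for conditions, drugs and measurements into the coding systems FHIR expects, which is where most of the effort goes.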

I have used Spring Boot and JPA for easy wiring of services and abstraction of the database, and HAPI FHIR, as it is an obvious choice for any Java-based FHIR application. It is still a work in progress and any help will be appreciated (refer to CONTRIBUTING.md).

OMOP <-> FHIR mapper
https://github.com/dermatologist/omopfhirmap

Published by Bell Eapen on July 22, 2020 | Permalink

FHIR and public health data warehouses

First posted on CanEHealth.com

The provincial government is building a connected health care system centred around patients, families and caregivers through the newly established OHTs. As disparate healthcare and public health teams move towards a unified structure, there is a growing need to reconsider our information system strategy. Most off-the-shelf solutions are pricey, while open-source solutions such as DHIS2 are not popular in Canada. Some of the public health units have existing systems, and it would be too resource-intensive to switch to another system. The interoperability challenge needs an innovative solution, beyond finding a single, provincial EMR.

We have written about the theoretical aspects, especially the need to envision public health information systems separate from an EMR. In this working paper, we propose a maturity model for PHIS and offer some pragmatic recommendations for dealing with the common challenges faced by public health teams.

Below is a demo project on GitHub from the data-intel lab that showcases a potential solution for a scalable data warehouse for health information system integration. Public health databases are vital to the community for efficient planning, surveillance and effective interventions. Public health data needs to be integrated at various levels for effective policymaking. PHIS-DW adopts FHIR as the data model for storage, with an integrated Elasticsearch stack. Kibana provides the visualization engine. PHIS-DW can support complex algorithms for disease surveillance, such as machine learning methods, hidden Markov models, and Bayesian and multivariate analytics. PHIS-DW is a work in progress and code contributions are welcome. We intend to use Bunsen to integrate PHIS-DW with Apache Spark for big data applications.

Public Health Data Warehouse Framework on FHIR
https://github.com/E-Health/fhir-server-phis-dw

FHIR has some advantages as a data persistence schema for public health. Apart from its popularity, the FHIR bundle makes it possible to send observations to FHIR servers without the associated patient resource, thereby ensuring reasonable privacy. This is especially useful in the surveillance of pandemics such as COVID-19. Some useful yet complicated integrations with OSCAR EMR and DHIS2 are under consideration. If any of the OHTs find our approach interesting, give us a shout.
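The idea of sending observations without the patient can be sketched as follows. This is an illustration in Python, not code from PHIS-DW: the function names are my own and the temperature values are made up. The LOINC code 8310-5 denotes body temperature, and each Observation simply omits the `subject` reference.

```python
def anonymous_observation(value_celsius, effective):
    """A body-temperature Observation with no 'subject' reference."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{
            "system": "http://loinc.org",
            "code": "8310-5",
            "display": "Body temperature",
        }]},
        "effectiveDateTime": effective,
        "valueQuantity": {
            "value": value_celsius,
            "unit": "Cel",
            "system": "http://unitsofmeasure.org",
            "code": "Cel",
        },
    }


def transaction_bundle(resources):
    """Wrap resources in a transaction Bundle that creates each one."""
    return {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [
            {"resource": resource,
             "request": {"method": "POST", "url": resource["resourceType"]}}
            for resource in resources
        ],
    }


# Made-up readings for illustration.
bundle = transaction_bundle([
    anonymous_observation(38.4, "2020-04-01"),
    anonymous_observation(36.9, "2020-04-01"),
])
print(len(bundle["entry"]))
```

Dropping the patient reference is only a first step: dates, locations and rare values can still identify people, so what goes into such a bundle needs the usual privacy review.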

BTW, have you seen Drishti, our framework for FHIR based behavioural intervention?

Published by Bell Eapen on April 28, 2020 | Permalink

Drishti: An mHealth platform for pervasive health monitoring

TL;DR: Here is an open-source mHealth framework based on FHIR, and here is the paper and my presentation at ICSE!

Pervasive health monitoring is becoming less and less intrusive with better sensors, and more and more useful with machine learning and predictive analytics.

mHealth (mobile health) could play an important part in pervasive health monitoring. It is difficult for clinicians to efficiently use the data from disparate apps that do not communicate with each other. For example, if a clinician has to monitor a patient’s blood sugar, blood pressure and physical activity, the clinician may have to check data from multiple apps. It is also difficult to communicate clinical requirements to app developers, and to test and approve the clinical validity of the resulting apps. Besides, there are always privacy and security concerns with personal health information.

Open mHealth is an open-source framework introduced to manage the problem of interoperability between apps. It provides interfaces for cloud services such as Google Fit and Fitbit and converts their data into a common data format. The BIT model deals with the communication problem between clinicians and developers during app development. Drishti incorporates the Open mHealth framework into the BIT model using FHIR as the common data model.

The BIT model is based on the Sense-Plan-Act paradigm from robotics. The BIT model encourages conceptualizing mHealth apps as three distinct components: Profilers that sense data on various physiological parameters such as blood pressure, planners that create a clinical intervention plan and actors that deliver the plan to the users as alerts or messages on their mobile devices. Drishti adopts the BIT model as a design model with all components sharing a central data repository. Drishti makes sharing of information with the doctors easy, by integrating it into an EMR. The central data repository also makes big data applications possible.
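The three components can be pictured as three small functions chained together. The Python sketch below is a toy illustration of the sense-plan-act split, not Drishti's code: the function names, the blood pressure readings and the limit of 140 are made up for the example and are not clinical guidance.

```python
def profiler(readings):
    """Sense: wrap raw (systolic, diastolic) readings as simple observations."""
    return [{"systolic": s, "diastolic": d} for s, d in readings]


def planner(observations, systolic_limit=140):
    """Plan: create one message for every observation above the limit."""
    return [
        f"Systolic pressure {o['systolic']} is above {systolic_limit}; "
        "please contact your clinic."
        for o in observations
        if o["systolic"] > systolic_limit
    ]


def actor(plan, deliver=print):
    """Act: deliver each planned message and return how many were sent."""
    for message in plan:
        deliver(message)
    return len(plan)


readings = [(118, 76), (152, 95)]  # made-up example readings
sent = actor(planner(profiler(readings)))
```

Because each stage only depends on the data handed to it, a profiler, planner or actor can be replaced independently, which is what the shared central repository makes possible.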

The central data repository in Drishti uses the FHIR schema for storage. FHIR is a schema for health data created by HL7 that defines ‘Resources’ that can be exchanged as JSON or XML using RESTful interfaces. Resources support 80% of common use cases and the rest can be supported using extensions. For example, birth date and gender are defined for a Patient resource, while skin type, which is not commonly used, is defined through an extension if required. Drishti uses the ‘Observation’ resource for storing data from profilers and the ‘CarePlan’ resource for the planner and actor components.
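As a concrete picture of the skin type example, the Python sketch below shows a Patient with its standard elements plus one extension. The extension URL is a hypothetical placeholder (a real project would publish its own definition), and the helper `get_extension_value` is my own name.

```python
# Hypothetical extension URL, for illustration only.
SKIN_TYPE_URL = "http://example.org/fhir/StructureDefinition/skin-type"

patient = {
    "resourceType": "Patient",
    "id": "example",
    "gender": "female",         # defined by the Patient resource
    "birthDate": "1980-05-17",  # defined by the Patient resource
    "extension": [              # anything outside the core definition
        {"url": SKIN_TYPE_URL, "valueString": "Fitzpatrick type III"},
    ],
}


def get_extension_value(resource, url):
    """Return the value[x] of the first extension with the given URL."""
    for extension in resource.get("extension", []):
        if extension.get("url") == url:
            for key, value in extension.items():
                if key.startswith("value"):
                    return value
    return None


print(get_extension_value(patient, SKIN_TYPE_URL))
```

Systems that do not know the extension can ignore it and still read the rest of the Patient, which is what keeps the core resources small.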

Open mHealth is the profiler in Drishti. All data from the various cloud services are converted to FHIR Observations and stored in the Drishti-Cog. The Drishti-Planner can take data stored in the cog and create a care plan, and the actor can deliver it to the patient. Drishti uses OpenMRS EMR for managing access, both for clinicians and patients. We have developed an OpenMRS module for integration with Drishti. The JavaScript visualization library hGraph gives the clinician a consolidated view of the data pulled from sensors.

In the current implementation, the cog is a FHIR server based on the HAPI Java library. Planner and actor components are just stubs that can be extended for several use cases. The planner is a Python Flask app and the viewer is a Vue app that can be used as a native mobile app. Both are templates that can be extended. The entire stack is available on GitHub along with pre-built Docker containers for quick prototyping.

Here is a typical use case. Depression is a common mental health problem, characterized by loss of interest in activities that you normally enjoy. Patients with depression are typically treated with anti-depressant drugs. The clinicians need to track the patient’s activity to assess progress along with medication compliance. The patient can use an activity tracker app and a medication tracker app, both sending data to the cog as FHIR observations. The clinicians can have a consolidated view in their EMR and create alerts or messages (plan) that can be delivered to the patient’s mobile device. The interventions can also be created by AI systems.

Drishti was presented at the Software Engineering for Healthcare workshop (SEH ’19) in Montreal and selected for FHIR DevDays. Please cite Drishti as below:

Bell Raj Eapen, Norm Archer, Kamran Sartipi, and Yufei Yuan. 2019. Drishti: a sense-plan-act extension to open mHealth framework using FHIR. In Proceedings of the 1st International Workshop on Software Engineering for Healthcare (SEH ’19). IEEE Press, Piscataway, NJ, USA, 49-52. DOI: https://doi.org/10.1109/SEH.2019.00016

@inproceedings{Eapen:2019:DSE:3353963.3353974,
 author = {Eapen, Bell Raj and Archer, Norm and Sartipi, Kamran and Yuan, Yufei},
 title = {Drishti: A Sense-plan-act Extension to Open mHealth Framework Using FHIR},
 booktitle = {Proceedings of the 1st International Workshop on Software Engineering for Healthcare},
 series = {SEH '19},
 year = {2019},
 location = {Montreal, Quebec, Canada},
 pages = {49--52},
 numpages = {4},
 url = {https://doi.org/10.1109/SEH.2019.00016},
 doi = {10.1109/SEH.2019.00016},
 acmid = {3353974},
 publisher = {IEEE Press},
 address = {Piscataway, NJ, USA},
 keywords = {FHIR, interoperability, mHealth},
}

Published by Bell Eapen on August 13, 2019 | Permalink

Serverless on FHIR: Management guidelines for the semi-technical clinician!

Serverless is the new kid on the block with services such as AWS Lambda, Google Cloud Functions or Microsoft Azure Functions. Essentially it lets users deploy a function (Function as a Service or FaaS) on the cloud with very little effort. Requirements such as security, privacy, scaling, and availability are taken care of by the framework itself. As healthcare slowly yet steadily progresses towards machine learning and AI, serverless is sure to make a significant impact on Health IT. Here I will explain serverless (and some related technologies) for semi-technical clinicians and put forward some architectural best practices for using serverless in healthcare with FHIR as the data interchange format.

Let us say, your analyst creates a neural network model based on a few million patient records that can predict the risk for MI from BP, blood sugar, and exercise. Let us call this model r = f(bp, bs, e). The model is so good that you want to use it on a regular basis on your patients and better still, you want to share it with your colleagues. So you contact your IT team to make this happen.

This is what your IT guys currently do: First, they create a web application that can take bp, bs and e as inputs using a standard interface such as REST and return r. Next, they rent a virtual machine (VM) from a cloud provider (such as DigitalOcean). Then they package this application as a container (Docker) and deploy it in the VM. You can now use this as an application from your browser (Chrome), or your EMR (such as OpenMRS or OSCAR) can directly access this function. You can share it with your colleagues and they can access it in their browsers and you are happy. The VM can support up to 3 users at a time.

In a couple of months, your algorithm becomes so popular that at any one time hundreds of users try to access it and your poor VM crashes most of the time or your users have to wait forever. So you call your IT guys again for a solution. They make 100 copies of your container, but your hospital is reluctant to give you the additional funding required.

Your smart resident notices that your application is being used only in the morning hours, and at night all the 100 containers are virtually sleeping. This is not a good use of the funding dollars. You contact your IT guys again, and they set up Kubernetes for orchestrating the containers according to usage. So, what is Serverless? Serverless is a framework that makes all of this so easy that you may not even need your IT guys to do it. (Well, maybe that is an exaggeration.)

My personal favourite serverless toolset (if you care) is Kubernetes + Knative + riff. I won’t try to explain what the last two are or how to use them. They are so new that they keep changing every day. In essence, your IT team can complete all the above tasks with a few commands typed on the command line on the cloud provider of your choice. The application (function rather) can even scale to zero! (You pay nothing when nobody uses it, and more containers are added as users increase, scaling down at night as in your case.)

Best Practices

What are the best practices when you design such useful cloud-based ‘functions’ for healthcare that can be shared by multiple users and organizations? Well, here are my two cents!

First, you need a standard for data exchange. As JSON is the data format for most APIs, FHIR wins hands down here.

Next, APIs need a mechanism to expose their capabilities and properties to the world. For example, r = f(bp, bs, e) needs to tell everyone what it accepts (bp, bs, e) and what it returns (at the bare minimum). FHIR has a resource specifically for this that has been (not so creatively) named Endpoint. So, a function endpoint should return a FHIR Endpoint resource with information about itself if there is no payload.
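A handler that follows this convention could look like the Python sketch below. It is only an illustration of the idea: the function names, the address and the payload description are hypothetical, and the actual computation of r is left out. The `hl7-fhir-rest` connection type comes from the standard FHIR endpoint-connection-type code system.

```python
def describe_function():
    """Self-description of the function as a FHIR Endpoint resource."""
    return {
        "resourceType": "Endpoint",
        "status": "active",
        "name": "MI risk function r = f(bp, bs, e)",
        "connectionType": {
            "system": "http://terminology.hl7.org/CodeSystem/endpoint-connection-type",
            "code": "hl7-fhir-rest",
        },
        # Free-text description of what the function accepts (hypothetical).
        "payloadType": [{"text": "Bundle of Observations: bp, bs, e"}],
        # Hypothetical address where the function is deployed.
        "address": "https://functions.example.org/mi-risk",
    }


def handle(payload=None):
    """Return the Endpoint when called without a payload."""
    if not payload:
        return describe_function()
    # A real function would compute r here; this sketch only acknowledges.
    return {
        "resourceType": "OperationOutcome",
        "issue": [{
            "severity": "information",
            "code": "informational",
            "diagnostics": "Payload received",
        }],
    }


print(handle()["resourceType"])
```

A caller can therefore discover what a function expects with an empty request before sending any patient data to it.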

What should the payload be? The payload should be a FHIR Bundle that has all the FHIR Resources that the function needs (bp, bs and e as FHIR Observations in your case). The bundle should also include a FHIR Subscription resource that points to the receiving system (maybe your EMR) for the response (r).
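Such a payload might be assembled and read back as in the Python sketch below. Everything in it is illustrative: the function names, the values, the units, the Subscription criteria and the reply URL are assumptions, and the Observations are identified by plain text rather than real terminology codes to keep the example short.

```python
def observation(name, value, unit):
    """A minimal Observation identified by a plain-text code."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"text": name},
        "valueQuantity": {"value": value, "unit": unit},
    }


def subscription(reply_url):
    """A Subscription telling the function where to send the response r."""
    return {
        "resourceType": "Subscription",
        "status": "requested",
        "reason": "Return the computed risk r",
        "criteria": "RiskAssessment",  # placeholder criteria for this sketch
        "channel": {
            "type": "rest-hook",
            "endpoint": reply_url,
            "payload": "application/fhir+json",
        },
    }


def build_payload(bp, bs, e, reply_url):
    """Bundle the three inputs with the reply Subscription."""
    resources = [
        observation("bp", bp, "mmHg"),
        observation("bs", bs, "mmol/L"),
        observation("e", e, "min/day"),
        subscription(reply_url),
    ]
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [{"resource": resource} for resource in resources],
    }


def read_inputs(bundle):
    """Function side: collect the Observation values by name."""
    return {
        entry["resource"]["code"]["text"]: entry["resource"]["valueQuantity"]["value"]
        for entry in bundle["entry"]
        if entry["resource"]["resourceType"] == "Observation"
    }


def reply_endpoint(bundle):
    """Function side: find where the response should be sent."""
    for entry in bundle["entry"]:
        if entry["resource"]["resourceType"] == "Subscription":
            return entry["resource"]["channel"]["endpoint"]
    return None


# Made-up values and a hypothetical EMR address.
payload = build_payload(130, 5.6, 30, "https://emr.example.org/fhir")
print(read_inputs(payload), reply_endpoint(payload))
```

Carrying the reply address inside the bundle keeps the function stateless: it needs no prior knowledge of the caller, which suits functions that scale to zero between requests.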

So, what next?

Pick up the phone and call your IT team. Tell them to take Kubernetes + Knative + riff for a spin! I might do the same and if I do, I will share it here.

Published by Bell Eapen on February 12, 2019 | Permalink

SMART CDS-Hooks

The CDS Hooks specification describes a “hook”-based pattern for invoking decision support from within a clinician’s EHR workflow.
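In that pattern, a service advertises the hooks it listens to in a discovery document, the EHR calls it when the hook fires, and the service answers with "cards" to display. The Python sketch below shows the shape of those two JSON documents for the `patient-view` hook; the service id, titles and card text are made up, and the function names are my own.

```python
def discovery():
    """Discovery document: which hooks this service responds to."""
    return {
        "services": [{
            "hook": "patient-view",
            "id": "example-reminder",  # hypothetical service id
            "title": "Example reminder",
            "description": "Returns a single informational card.",
        }]
    }


def handle_patient_view(request):
    """Build the response for one hook invocation sent by the EHR."""
    patient_id = request.get("context", {}).get("patientId", "unknown")
    return {
        "cards": [{
            "summary": f"Example card for patient {patient_id}",
            "indicator": "info",
            "source": {"label": "Example CDS service"},
        }]
    }


# A made-up invocation, shaped like the request an EHR would send.
request = {
    "hook": "patient-view",
    "hookInstance": "00000000-0000-0000-0000-000000000000",
    "context": {"userId": "Practitioner/example", "patientId": "example"},
}
print(handle_patient_view(request)["cards"][0]["summary"])
```

In a deployed service these two functions would sit behind HTTP routes, with the discovery document served from the `cds-services` path and each service invoked by POSTing to its id beneath it.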

Published by Bell Eapen on March 21, 2016 | Permalink