Bell Eapen MD, PhD.

Bringing Digital health & Gen AI research to life!

R&D and Innovation in IT; to or not to combine both

R&D and innovation are two related but distinct concepts. My aim is not to delve into the subtle semantic differences between the two but to explore, as an information systems researcher, some organizational factors that may impact individual innovators. My focus is exclusively on information technology and information systems innovation within a corporate setting. 

Image credit: Petrovskyz and Jahobr, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

In my view, Research & Development (R&D) is a systematic process of exploring existing methods, techniques, and processes within an organization to improve upon them or discover new applications. This involves thorough investigation, experimentation, analysis, and refinement aimed at creating solutions that can enhance productivity, efficiency, quality, and competitiveness in the marketplace. The methods, techniques, and processes are either available internally or can be procured free or bought. The focus is on finding the organizational fit for a known solution. 

The R&D process typically comprises several stages:  

1. Identification of the pain points. (Problem space) 

2. Identification of potential solutions. (Solution space) 

3. Gap analysis and research objectives. 

4. Experimental design and execution. 

5. Interpretation. 

6. Reporting back to stakeholders and decision-makers. 

R&D is potentially scalable by increasing the team size. Individuals work as a team to solve problems. It is easy to track and monitor progress. Documentation is the key to externalizing the gained knowledge to the organizational memory for the use and reuse of knowledge, ensuring transparency and continuous learning within the organization. Moreover, utilizing collaboration tools and project management software will streamline communication between team members, facilitating effective knowledge sharing and reducing potential bottlenecks. As the R&D department grows, it is essential to maintain an agile mindset.  

As the focus is on finding the organizational fit for a known IT artifact, the recommendations have little relevance outside the organization and, as such, are not publishable. This is not to discount any attempt to tease out generalizable knowledge from R&D initiatives and publish it as papers. The notion of “failed R&D” applies only if you treat the finding that an IT artifact is unsuitable for the organization as a failed R&D project. I do not believe you should! 

In contrast, innovation is the pursuit of the unknown. The idea, method or process is either not obvious (it may be obvious in hindsight), or it is outright novel and disruptive. It involves pushing boundaries, challenging pre-existing norms and beliefs, and exploring uncharted territories to create something new. Innovation often requires creativity, critical thinking, and a willingness to take calculated risks. It is driven by curiosity, passion for learning, and an unyielding desire to find better solutions to complex problems. 

Innovation is risky with a high failure rate. Innovation teams are typically small, and members often work in isolation. Most innovation teams maintain some secrecy as the potential worth of some of the artifacts generated is not immediately apparent. Successful innovation projects offer substantial rewards such as competitive advantage, market differentiation, and the opportunity for disruptive breakthroughs that can revolutionize industries or create entirely new ones. Innovation artifacts are often valuable outside the organization and publishable as new knowledge sources. However, it is uncommon to publish or even document the interim artifacts. Most innovation artifacts are “ideas” in the innovator’s mind. 

Many organizations (knowingly or unknowingly) club R&D and innovation teams together and try to blur the boundaries. This may be due to many reasons: 

1. Innovation has a high failure rate. Combining both teams can hedge against the risk of failure. 

2. Combining both teams may encourage knowledge sharing. 

3. True innovation teams are expensive to maintain, and turnover rates are high. 

4. Many innovators are averse to structured organizational norms and culture. 

5. Innovation teams may be unaware of organizational facilities, needs, and requirements. 

Though all these reasons are valid, combining R&D and innovation teams reduces the chances of disruptive innovation. The decision whether or not to combine R&D and innovation teams depends on the organization’s aspirations. 

Translational Research in Digital Health and Gen AI 

Translational research is the process of turning scientific discoveries into practical applications that can benefit society. It involves bridging the gap between different stages of research, from basic to applied, and between different stakeholders, such as researchers, clinicians, policy makers, and industry. Translational research aims to accelerate the transfer of knowledge and technology from the laboratory to the bedside, from the bench to the market, and from the ivory tower to the community. 

Translational Research

Image credit: DoD Architecture Framework Working Group, Public domain, via Wikimedia Commons.

One of the key features of translational research is pragmatism, which means focusing on real-world problems and solutions, rather than abstract theories and models. Pragmatism also implies being flexible and adaptable to the changing needs and contexts of the target users and environments. Translational researchers are not satisfied with publishing papers in academic journals; they want to see their work make a difference in people’s lives and health outcomes. 

Translational Research Methods & Techniques

To achieve this goal, translational researchers need to adopt a variety of methods and techniques that can help them design, develop, evaluate, and implement digital health solutions in an effective and efficient way. These methods and techniques include: 

  • User-centered design, which involves understanding the needs, preferences, and behaviors of the potential users and stakeholders of a digital health solution and involving them in the co-creation and evaluation of the solution. 
  • Rapid prototyping, which involves creating low-fidelity or high-fidelity prototypes of a digital health solution and testing them with the users and stakeholders in an iterative way, to obtain feedback and improve the solution. 
  • Pilot testing, which involves conducting a small-scale trial of a digital health solution in a real-world setting, to assess its feasibility, acceptability, usability, and preliminary effectiveness. 
  • Randomized controlled trials, which involve comparing the effects of a digital health solution with a control condition (such as usual care or another intervention) in a large and representative sample of participants, to determine its efficacy, safety, and cost-effectiveness. 
  • Implementation science, which involves studying the factors and strategies that influence the adoption, integration, and sustainability of a digital health solution in a real-world setting and developing and evaluating interventions to enhance these processes. 
  • Health economics, which involves analyzing the costs and benefits of a digital health solution from different perspectives, such as the users, the providers, the payers, and the society. 

Translational researchers need to be aware of the latest advances and trends in digital health and related fields. One of the emerging paradigms in digital health is Gen AI. Gen AI refers to the development of artificial intelligence systems that can perform any intellectual task that a human can do, such as reasoning, learning, planning, decision making, and creativity. Gen AI has the potential to revolutionize digital health by enabling personalized, predictive, preventive, and participatory medicine, as well as enhancing the quality and efficiency of health care delivery and management. 

Translational researchers play a crucial role in shaping the future of digital health and Gen AI. They act as translators, mediators, facilitators, and innovators between different disciplines, sectors, and domains. They also work as consultants for companies, organizations, and startups that want to develop, test, and implement digital health and Gen AI solutions. Translational researchers provide expert advice and guidance on the best practices and methods for designing, developing, evaluating, and implementing digital health and Gen AI solutions, as well as identifying and addressing the potential challenges and risks involved. Translational researchers also help to disseminate and communicate the results and impacts of digital health and Gen AI solutions to various audiences, such as academics, practitioners, policy makers, industry, and the public. 

In summary, translational research is a vital and exciting field that aims to turn research papers into working artifacts, and to bridge the gap between digital health and Gen AI research and practice. Translational researchers adopt pragmatism as their guiding principle and use a variety of methods and techniques to design, develop, evaluate, and implement digital health and Gen AI solutions in real-world settings. Translational research is a practical endeavor that can make a positive difference in people’s lives and health outcomes.

Do you have a Gen AI research project that you need help with?

Six things data scientists in healthcare should know

Healthcare, like most other fields, is eager to get on the data science bandwagon. Data scientists can make a huge difference in the way big data is utilized for clinical decision-making. However, there are paradigmatic differences in the way data scientists from quantitative fields view the world, compared to their clinical counterparts. This is especially true in the emerging fields of machine learning and artificial intelligence. This may lead to considerable inefficiencies. As a person trained in both fields, here is my take on this.

Data scientists
Credit: Dasaptaerwin, CC0, via Wikimedia Commons

Data scientists should focus on the problem and not the solutions

Data scientists are excited about the latest GPT or BERT. Data scientists tend to refine the model a bit more using 10 more GPUs! In the process, they tend to solve problems that do not exist. From my experience practicing medicine in extremely resource-poor areas, simple solutions are valued more than BERT running on Kubernetes! This is true in the developed world as well, and many teams may have fundamental data needs that need to be tackled first.

Explanation comes before prediction

Emerging machine learning methods prioritize prediction accuracy, compromising on explainability in the process. Clinicians, in most cases, can neither use nor trust a model that arrives at a conclusion without showing how it got there. Hence, in the clinical domain, a simple logistic regression model may be more acceptable than a deep neural network. Parsimony is the key, and a bit of feature selection to ensure parsimony will always be appreciated.
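To illustrate why a logistic regression model is easy to explain, here is a minimal sketch that turns its coefficients into odds ratios, the form most clinicians are used to reading. The feature names, coefficient values, and intercept are hypothetical and chosen only for illustration; they do not come from any real study or fitted model.

```python
import math

# Hypothetical coefficients (log-odds) of an already-fitted logistic regression.
# These names and numbers are illustrative assumptions, not real clinical data.
COEFFICIENTS = {
    "age_per_decade": 0.40,
    "current_smoker": 0.85,
    "on_statin": -0.30,
}
INTERCEPT = -3.0


def odds_ratios(coefficients):
    """Convert log-odds coefficients to odds ratios: OR = exp(beta)."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}


def predicted_probability(features, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(beta_i * x_i))))."""
    log_odds = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Each coefficient can be read on its own: an odds ratio above 1 raises the
# odds of the outcome, an odds ratio below 1 lowers them.
for name, ratio in odds_ratios(COEFFICIENTS).items():
    print(f"{name}: odds ratio {ratio:.2f}")

# A hypothetical 60-year-old smoker who is not on a statin.
patient = {"age_per_decade": 6, "current_smoker": 1, "on_statin": 0}
print(f"predicted risk: {predicted_probability(patient):.2f}")
```

Every number the model uses is visible and can be questioned by a clinician, which is much harder to offer with a deep network.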

You need to know the clinical terminologies

A basic understanding of clinical terminologies and terminology systems such as SNOMED and ICD is vital. It helps in understanding the clinical community better. Any healthcare analytics should consider variations in terminologies and adopt a standard system for consistency. Any tool that data scientists build for the clinical community should support terminology systems.

Biostatistics is more pervasive than you think

Most healthcare professionals are trained in biostatistics. Hence, their thinking leans towards populations, sampling, randomization, blinding, and showing a ‘statistically significant’ difference. Moving towards machine learning needs a paradigm shift. It may be useful to have a discussion on this at the outset.

Classes are of unequal importance

In healthcare, finding one class (e.g. cancer) is often more important than finding the other (e.g. no cancer). One class may need active intervention to save lives. Hence, sensitivity and specificity are of more vital importance than accuracy!
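A small worked example may make this concrete. The sketch below computes sensitivity, specificity, and accuracy from confusion-matrix counts for two hypothetical screening models; the counts are invented for illustration.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true cases that were detected
    specificity = tn / (tn + fp)  # share of non-cases correctly ruled out
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical screening of 1000 people, 10 of whom have cancer.
# Model A labels everyone "no cancer".
always_negative = screening_metrics(tp=0, fp=0, tn=990, fn=10)

# Model B finds 9 of the 10 cancers at the cost of 90 false alarms.
useful = screening_metrics(tp=9, fp=90, tn=900, fn=1)

for label, (sens, spec, acc) in [("A", always_negative), ("B", useful)]:
    print(f"Model {label}: sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Model A has the higher accuracy yet misses every cancer, while Model B gives up some accuracy to detect nine of the ten cases. Accuracy alone would rank them the wrong way round.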

Life is precious!

In healthcare, there is no room for error. Some decisions may have disastrous consequences, while a few others may save lives. As a data scientist in the healthcare domain, you should be cognizant of the fact that healthcare data is different from banking/airline data.

Chatting with FHIR endpoint

FHIR is an emerging standard for exchanging healthcare information electronically. Searching for resources is fundamental to the mechanics of FHIR. A search operation traverses an existing set of resources, filtering by the parameters supplied to it. Health information systems convert the clinicians’ interactions into the search string.
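As a rough sketch of what such a search string looks like, the snippet below assembles a FHIR search URL of the form [base]/[resource type]?parameter=value. The server base URL is a hypothetical placeholder; `family`, `birthdate`, and `gender` are standard Patient search parameters, and `ge` is the FHIR prefix for "greater than or equal to".

```python
from urllib.parse import urlencode

# Hypothetical FHIR server base URL, used only for illustration.
FHIR_BASE = "https://fhir.example.org/baseR4"


def build_search_url(resource_type, params, base=FHIR_BASE):
    """Build a FHIR search URL: [base]/[resource type]?name=value&name=value."""
    query = urlencode(params, doseq=True)  # percent-encodes values where needed
    url = f"{base}/{resource_type}"
    return f"{url}?{query}" if query else url


# Female patients named Smith born on or after 1970-01-01.
url = build_search_url(
    "Patient",
    {"family": "Smith", "birthdate": "ge1970-01-01", "gender": "female"},
)
print(url)
```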

With the growing importance (and intelligence) of chatbots, it is possible to converse with physicians and retrieve what they want by converting their needs into a FHIR search string. This can make clinicians’ lives easier, as most of them do not like entering complex search terms into text boxes.
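The conversion step can be pictured as a lookup from what the chatbot understood to FHIR search parameters. The sketch below assumes the chatbot has already extracted entities from the physician's sentence; the entity names (`last_name`, `sex`, `born_since`) and the mapping table are hypothetical and are not taken from any particular chatbot framework.

```python
from urllib.parse import urlencode

# Hypothetical mapping from chatbot entity names to FHIR Patient search parameters.
ENTITY_TO_PARAM = {
    "last_name": "family",
    "first_name": "given",
    "sex": "gender",
    "born_since": "birthdate",
}


def entities_to_search(resource_type, entities):
    """Translate extracted entities into a FHIR search string (without the base URL)."""
    params = {}
    for entity, value in entities.items():
        param = ENTITY_TO_PARAM.get(entity)
        if param is None:
            continue  # skip anything we do not know how to map
        # FHIR date searches accept prefixes; "ge" means greater than or equal to.
        params[param] = f"ge{value}" if entity == "born_since" else value
    return f"{resource_type}?{urlencode(params)}"


# Entities a chatbot might extract from:
# "Show me female patients named Smith born since 1970"
entities = {"sex": "female", "last_name": "Smith", "born_since": "1970"}
search = entities_to_search("Patient", entities)
print(search)
```

A real assistant would need far more than a lookup table (synonyms, dates in free text, ambiguous names), but the output it must produce has this shape.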

Rasa is an open-source machine learning framework for building AI assistants and chatbots. You can create custom actions with Rasa to support various use-cases. The Microsoft Bot Framework SDK allows you to create and develop bots for the Azure Bot Service. Starting with version 4.7 of the Bot Framework SDK, you can extend a bot using another bot (a skill). A skill can be consumed by various other bots, facilitating reuse.

During the COVID break, I created a couple of experimental projects to make chatting with electronic health records possible. One is a Rasa project for mapping conversations to FHIR search.

https://github.com/dermatologist/rasaonfhir

The other is a FHIR search skill for Microsoft Bot SDK that more or less does the same thing.

These are experimental for now and pull requests are welcome!

FHIR and public health data warehouses

First posted on CanEHealth.com

The provincial government is building a connected health care system centred around patients, families and caregivers through the newly established OHTs. As disparate healthcare and public health teams move towards a unified structure, there is a growing need to reconsider our information system strategy. Most off-the-shelf solutions are pricey, while open-source solutions such as DHIS2 are not popular in Canada. Some of the public health units have existing systems, and it will be too resource-intensive to switch to another system. The interoperability challenge needs an innovative solution, beyond finding the single, provincial EMR.


We have written about the theoretical aspects, especially the need to envision public health information systems separate from an EMR. In this working paper, we propose a maturity model for PHIS and offer some pragmatic recommendations for dealing with the common challenges faced by public health teams. 

Below is a demo project on GitHub from the data-intel lab that showcases a potential solution for a scalable data warehouse for health information system integration. Public health databases are vital for the community for efficient planning, surveillance and effective interventions. Public health data needs to be integrated at various levels for effective policymaking. PHIS-DW adopts FHIR as the data model for storage with the integrated Elasticsearch stack. Kibana provides the visualization engine. PHIS-DW can support complex algorithms for disease surveillance, such as machine learning methods, hidden Markov models, and Bayesian and multivariate analytics. PHIS-DW is a work in progress and code contributions are welcome. We intend to use Bunsen to integrate PHIS-DW with Apache Spark for big data applications. 

FHIR has some advantages as a data persistence schema for public health. Apart from its popularity, the FHIR bundle makes it possible to send observations to FHIR servers without the associated patient resource, thereby ensuring reasonable privacy. This is especially useful in the surveillance of pandemics such as COVID-19. Some useful yet complicated integrations with OSCAR EMR and DHIS2 are under consideration. If any of the OHTs find our approach interesting, give us a shout. 
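The idea of sending observations without a patient resource can be sketched as follows. This is a minimal illustration, not PHIS-DW code: it builds a FHIR transaction Bundle whose Observation entries carry no `subject` reference (the element is optional in FHIR R4). The temperature reading and date are invented, and the helper names are hypothetical.

```python
import json


def observation_entry(loinc_code, display, value, unit, effective):
    """One Bundle entry that creates an Observation with no link to a Patient."""
    return {
        "resource": {
            "resourceType": "Observation",
            "status": "final",
            "code": {
                "coding": [
                    {"system": "http://loinc.org", "code": loinc_code, "display": display}
                ]
            },
            "effectiveDateTime": effective,
            "valueQuantity": {
                "value": value,
                "unit": unit,
                "system": "http://unitsofmeasure.org",
                "code": unit,
            },
            # Deliberately no "subject" element: nothing ties this reading to a person.
        },
        "request": {"method": "POST", "url": "Observation"},
    }


def surveillance_bundle(entries):
    """Wrap entries in a transaction Bundle, ready to POST to a FHIR server's base URL."""
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}


# An invented body-temperature reading (LOINC 8310-5) in degrees Celsius.
bundle = surveillance_bundle(
    [observation_entry("8310-5", "Body temperature", 38.6, "Cel", "2020-04-01")]
)
print(json.dumps(bundle, indent=2))
```

Omitting the subject does not by itself guarantee anonymity. Dates, locations, and rare values can still identify people, so this is the "reasonable privacy" of the text rather than a formal de-identification method.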

BTW, have you seen Drishti, our framework for FHIR based behavioural intervention? 

Deploy a fastai image classifier using OpenFaaS for serverless on DigitalOcean in 5 easy steps!

Fastai is a Python library that simplifies training neural nets using modern best practices. See the fastai website and view the free online course to get started. Fastai lets you develop and improve various NN models with little effort. Some of the deployment strategies are mentioned in their course, but most are not production-ready.

OpenFaaS® (Functions as a Service) is a framework for building serverless functions easily with Docker. The functions can be deployed on multiple infrastructures, including Docker Swarm and Kubernetes, with little boilerplate code. Serverless is a cloud-computing model in which the cloud provider runs the server and dynamically manages the allocation of machine resources, scaling to zero if a service is not being used. It is interesting to note that OpenFaaS has the same requirements as the new Google Cloud Run and is interoperable. Read more about OpenFaaS (and install the CLI) from their website.

DigitalOcean: I host all my websites on DigitalOcean (DO), which offers good (in my opinion) cloud services at a low cost. They have data centres in Canada and India. DO supports Kubernetes and Docker Swarm, but they also offer a One-Click install of OpenFaaS for as little as $5 per month. (You can remove the droplet after the experiment if you like, and you will only be charged for the time you use it.) If you are new to DO, please sign up and set up OpenFaaS as shown here:

In the fastai class, Jeremy creates a dog breed classifier.

STEP 1: Export the model to .pkl as below:

learn.export()

This creates the export.pkl file that we will be using later. To deploy, we need a base container to run the prediction workflow. I have created one with Python3 along with fastai core and vision dependencies (to keep the size small). It is available at https://hub.docker.com/r/beapen/fastai-vision but you don’t have to use this container directly. My OpenFaaS template will make this easy for you.

STEP 2: Using the OpenFaaS CLI (How to Install) pull my template as below:

mkdir dog-classifier
cd dog-classifier
faas-cli template pull https://github.com/dermatologist/python3-ml --prefix your-docker-uname
faas-cli new --lang fastai-vision dog-classifier

STEP 3: Copy export.pkl to the model folder

STEP 4: Add http://digitaloceanIP:8080 to dog-classifier.yml

provider:
  name: openfaas
  gateway: http://digitaloceanIP:8080

and finally in STEP 5:

faas-cli up -f dog-classifier.yml

That’s it! Your predictor is up and running! Access it at http://digitaloceanIP:8080/function/dog-classifier
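If you would rather call the function from a script than from a browser, a sketch along the following lines may help. OpenFaaS functions are invoked over HTTP at /function/<name> on the gateway. How the handler expects the image is an assumption here: the sketch sends the raw image bytes in the POST body, so check the template's handler before relying on it. The helper name `build_invoke_request` is hypothetical.

```python
import urllib.request


def build_invoke_request(gateway, function_name, image_bytes):
    """Prepare (but do not send) a POST request to an OpenFaaS function."""
    url = f"{gateway.rstrip('/')}/function/{function_name}"
    return urllib.request.Request(
        url,
        data=image_bytes,  # ASSUMPTION: the handler accepts raw image bytes
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


# Replace digitaloceanIP with your droplet's address and the bytes with a real image,
# e.g. open("dog.jpg", "rb").read().
request = build_invoke_request(
    "http://digitaloceanIP:8080", "dog-classifier", b"placeholder image bytes"
)
print(request.get_method(), request.full_url)

# To actually send it:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```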

The template has a built-in image uploader interface! If you get stuck at any stage, give me a shout below. More to follow on using OpenFaaS for deploying machine learning workflows!