H2O is an open-source, distributed and scalable machine learning platform written in JAVA. H2O supports many of the statistical & machine learning algorithms, including gradient boosted machines, generalized linear models, deep learning and more. OpenFaaS® (Functions as a Service) is a framework for building Serverless functions easily with Docker. Read my previous post to learn more about OpenFaaS and DO.
H2O has a module aptly named sparkling water that allows users to combine the machine learning algorithms of H2O with the capabilities of Spark. Integrating these two open-source environments provides a seamless experience for users who want to make a query using Spark SQL, feed the results into H2O to build a model and make predictions, and then use the results again in Spark. For any given problem, better interoperability between tools provides a better experience.
H2O Driverless AI is a commercial package for automatic machine learning that automates some of the most difficult data science and machine learning workflows such as feature engineering, model validation, model tuning, model selection, and model deployment. H2O also has a popular open-source module called AutoML that automates the process of training a large selection of candidate models. H2O’s AutoML can be used for automating the machine learning workflow, which includes automatic training and tuning of many models within a user-specified time-limit. AutoML makes hyperparameter tuning accessible to everyone.
H2O allows you to convert the models to either a Plain Old Java Object (POJO) or a Model Object or an Optimized (MOJO) that can be easily embeddable in any Java environment. The only compilation and runtime dependency for a generated model is the h2o-genmodel.jar file produced as the build output of these packages. You can read more about deploying h2o models here.
I have created an OpenFaaS template for deploying the exported MOJO file using a base java container and the dependencies defined in the gradle build file. Using the OpenFaaS CLI (How to Install) pull my template as below:
mkdir watersplash
cd watersplash
faas-cli template pull https://github.com/dermatologist/java-ext --prefix your-docker-uname
faas-cli new --lang java-h2o watersplash
Copy the exported MOJO zip file to the root folder along with build.gradle and settings.gradle. Make appropriate changes to handle.java as per the needs of the model, as explained here. Add http://digitaloceanIP:8080 to watersplash.yml
provider:
name: openfaas
gateway: http://digitaloceanIP:8080
and finally:
faas-cli up -f watersplash.yml
That’s it! Congratulations! Your model is up and running! Access it at http://digitaloceanIP:8080/function/watersplash
If you get stuck at any stage, give me a shout below.
- Locally hosted LLMs - July 14, 2024
- LLM-in-the Loop CQL execution - June 30, 2024
- Come, join us to make generative AI in healthcare more accessible! - June 15, 2024
Pingback: Using OpenFaaS containers in Kubeflow - Bell Eapen