.. ===============LICENSE_START======================================================= .. Acumos CC-BY-4.0 .. =================================================================================== .. Copyright (C) 2017-2018 AT&T Intellectual Property & Tech Mahindra. All rights reserved. .. =================================================================================== .. This Acumos documentation file is distributed by AT&T and Tech Mahindra .. under the Creative Commons Attribution 4.0 International License (the "License"); .. you may not use this file except in compliance with the License. .. You may obtain a copy of the License at .. .. http://creativecommons.org/licenses/by/4.0 .. .. This file is distributed on an "AS IS" BASIS, .. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. .. See the License for the specific language governing permissions and .. limitations under the License. .. ===============LICENSE_END========================================================= ========================================== On-Boarding H2o.ai and Generic Java Models ========================================== The Acumos Java Client Library command line utility is used to on-board H2o.ai and Generic Java models. This library creates artifacts from an H2o or Generic Java model and pushes the artifacts to the on-boarding server for the H2o Model runner to be able to use them. High-Level Flow =============== #) The Modeler creates a model in H2o and exports it in the MOJO model format (.zip file) using any interface (eg.Python, Flow, R) provided by H2o. For Generic Java, the Modeler creates a model and exports it in the .jar format. #) The Modeler runs the JavaClient jar, which creates a Protobuf (default.proto) file for the Model, creates the required metadata.json file and an artifact called modelpackage.zip. #) Depending on the choice of the Modeler, she can manually upload these generated artifacts to the Acumos Marketplace via its Web interface. This is Web-based on-boarding. We will see how to do this in this article. #) Or the Java client library itself, on-boards the model onto the on-boarding server if the modeler provides the on-boarding server URL. This is CLI-based on-boarding. The Model Runner provides a wrapper around the ML model, packages it as a containerized microservice and exposes a predict method as a REST endpoint. When the model is onboarded and deployed, this method (REST endpoint) can then be called by other external applications to request predictions off of the model. Prerequisites ============= - Java 1.8 - The following Released components: - `Java Client `_ v1.11.0 (java_client-1.11.0.jar) - `Generic Model Runner `_ v2.2.3 (h2o-genericjava-modelrunner-2.2.3.jar) Preparing to On-Board your H2o or a Generic Java Model ====================================================== a. Place Java Client jar in one folder locally. This is the folder from which you intend to run the jar. After the jar runs, the created artifacts will also be available in this folder. You will use some of these artifacts if you are doing Web-based onboarding. We will see this later. Note: the versions of the libraries in the screenshots may be outdated. |image0| b. Prepare a supporting folder with the following contents. Items of this folder will be used as input for the java client jar. |image1| It will contain: #. Models - In case of H2o, your model will be a MOJO zip file. In case of Generic Java, the model will be a jar file. #. Model runner or Service jar - For H2O rename downloaded h2o-genericjava-modelrunner.jar as per the first section to H2OModelService.jar or to GenericModelService.jar for Java model and Place it in this folder. #. CSV file used for training the model - Place the csv file (with header having the same column names used for training but without the quotes (“ ”) ) you used for training the model here. This is used for autogenerating the .proto file. If you don’t have the .proto file, you will have to supply the .proto file yourself in the supporting folder. Make sure you name it default.proto. #. default.proto - This is only needed If you don't have sample csv data for training, then you will have to provide the proto file yourself. In this case, Java Client cannot autogenerate the .proto file. You will have to supply the .proto file yourself in the supporting folder. Make sure you name it default.proto Also make sure, the default.proto file for the model is in the following format. You need to appropriately replace the data and datatypes under DataFrameRow and Prediction according to your model. .. code-block:: python syntax = "proto3"; option java_package = "com.google.protobuf"; option java_outer_classname = "DatasetProto"; message DataFrameRow { string sepal_len = 1; string sepal_wid = 2; string petal_len = 3; string petal_wid = 4; } message DataFrame { repeated DataFrameRow rows = 1; } message Prediction { repeated string prediction= 1; } service Model { rpc transform (DataFrame) returns (Prediction); } #. application.properties file - Mention the port number on which the service exposed by the model will finally run on. .. code-block:: python server.contextPath=/modelrunner # IF WORKING WITH MODEL CONNECTOR AND COMPOSITE SOLUTION, THE #server.contextPath will be / # NOTE: THIS WILL TAKE AWAY SWAGGER # This is the port number you want to run the service on. User may select a convenient port. server.port=8336 spring.http.multipart.max-file-size=100MB spring.http.multipart.max-request-size=100MB # Linux version # if model_type is Generic Java, then default_model will be /models/model.jar # if model_type is H2o, then the default_model will be /models/Model.zip #default_model=/models/model.jar default_model=/models/Model.zip default_protofile=/models/default.proto logging.file = ./logs/modelrunner.log # The value of model_type can be H or G # if model is Generic java model, then model_type is G. # if model is H2o model, then model_type is H. And the /predict method will use H2O model; otherwise, it will use generic Model # if model_type is not present, then the default is H #model_type=G model_type=H model_config=/models/modelConfig.properties # Linux some properties are specific to java generic models # The plugin_root path has to be outside of ModelRunner root or the code won't work # Default proto java file, classes and jar # DatasetProto.java will be in $plugin_root\src # DatasetProto$*.classes will be in $plugin_root\classes # pbuff.jar will be in $plugin_root\classes plugin_root=/tmp/plugins #. modelConfig.properties - Add this file only in case of Generic Java model onboarding. This file contains the modelMethod and modelClassName of the model. .. code-block:: python modelClassName=org.acumos.ml.XModel modelMethod=predict Create your modeldump.zip file ============================== Java Client jar is the executable client jar file. For Web-based onboarding of H2o models, the parameters to run the client jar are: #. Current Folder path : Full folder path in which Java client jar is placed and run from #. Model Type : H for H2o, G for Generic Java #. Supporting folder path : Full Folder path of the supporting folder which contains items. #. Name of the model : For h2o just the name of the model without the .zip extension. Make sure this matches name of the supplied MOJO model file exactly. #. Input csv file : csv file that was used for training the model. Include the .csv extension in the csv file name. This will be used to autogenerate the default.proto file. This parameter will be empty if you yourself have supplied a default.proto for your model. For CLI-based onboarding, the parameters to run the client jar are: #. Onboarding server url. #. Pass the authentication API url for onboarding - This API returns jwtToken for authenticated users. e.g http://:8090/onboarding-app/v2/auth #. Model Type : H for H2o, G for Generic Java. #. Supporting folder path : Full Folder path of the supporting folder which contains items. #. Name of the model : For h2o just the name of the model without the .zip extension. Make sure this matches name of the supplied MOJO model file exactly. #. Username of the Portal MarketPlace account. #. Password of the Portal MarketPlace account. #. Input csv file : csv file that was used for training the model. Include the .csv extension in the csv file name. This will be used to autogenerate the default.proto file. This parameter will be empty if you yourself have supplied a default.proto for your model. See example below for how to run the client jar and how the modeldump.zip artifact appears after its successful run: |image2| |image3| Onboarding to the Acumos Portal =============================== - If you used CLI-based onboarding, you don't need to perform the steps outlined just below. The Java client has done it for you. You will see a message on the terminal that states the model onboarded successfully. - If you use Web-based onboarding, you must complete the following steps: #. After you run the client, you will see a modeldump.zip file generated in the same folder where we ran the Java Client for. #. Upload this file in the Web based interface (drap and drop). See :doc:`portal-onboarding-web` #. You will be able to see a success message in the Web interface. you will be able to see a success method in the Web interface. The needed TOSCA artifacts and docker images are produced when the model is onboarded to the Portal. You and your teammates can now see, rate, review, comment, collaborate on your model in the Acumos marketplace. When requested and deployed by a user, your model runs as a dockerized microservice on the infrastructure of your choice and exposes a predict method as a REST endpoint. This method can be called by other external applications to request predictions off of your model. Addendum : Creating a model in H2o ================================== You must have H2o 3.14.0.2 installed on your machine. For instructions on how to install visit the H2o web site: https://www.h2o.ai/download/. H2o provides different interfaces to create models and use H2o for eg. Python, Flow GUI, R, etc. As an example, below we show how to create a model using the Python innterface of H2o and also using the H2o Flow GUI. You can use the other interfaces too which have comparable functions to train a model and download the model in a MOJO format. Here is a sample H2o iris program that shows how a model can be created and downloaded as a MOJO using the Python interface: .. code-block:: python import h2o import pandas as pd import numpy as np import matplotlib.pyplot as plt import seaborn as sns # for jupyter notebook plotting, %matplotlib inline sns.set_context("notebook") h2o.init() # Load data from CSV iris = h2o.import_file('https://raw.githubusercontent.com/h2oai/h2o-3/master/h2o-r/h2o-package/inst/extdata/ iris_wheader.csv') Iris data set description ------------------------- 1. sepal length in cm 2. sepal width in cm 3. petal length in cm 4. petal width in cm 5. class: Iris Setosa Iris Versicolour Iris Virginica iris.head() iris.describe() # training parameters training_columns = ['sepal_len', 'sepal_wid', 'petal_len', 'petal_wid'] # response parameter response_column = 'class' # Split data into train and testing train, test = iris.split_frame(ratios=[0.8]) train.describe() test.describe() from h2o.estimators import H2ORandomForestEstimator model = H2ORandomForestEstimator(ntrees=50, max_depth=20, nfolds=10) # Train model model.train(x=training_columns, y=response_column, training_frame=train) print (model) # Model performance performance = model.model_performance(test_data=test) print (performance) # Download the model in MOJO format. Also download the h2o-genmodel.jar file modelfile = model.download_mojo(path="/home/deven/Desktop/", get_genmodel_jar=True) predictions=model.predict(test) predictions Here is a sample H2o iris example program that shows how a model can be created and downloaded as a MOJO using the H2o Flow GUI. |image4| |image5| |image6| |image7| |image8| .. |image0| image:: ../images/java-client/downloaded_java_client.png .. |image1| image:: ../images/java-client/supporting_folder.PNG .. |image2| image:: ../images/java-client/running_the_java_client.PNG .. |image3| image:: ../images/java-client/after_running_java_client.PNG .. |image4| image:: ../images/java-client/1.png .. |image5| image:: ../images/java-client/2.png .. |image6| image:: ../images/java-client/3.png .. |image7| image:: ../images/java-client/4.png .. |image8| image:: ../images/java-client/5.png .. |image9| image:: ../images/java-client/upload_modeldump.png