.. ===============LICENSE_START=======================================================
.. Acumos CC-BY-4.0
.. ===================================================================================
.. Copyright (C) 2017-2018 AT&T Intellectual Property & Tech Mahindra. All rights reserved.
.. ===================================================================================
.. This Acumos documentation file is distributed by AT&T and Tech Mahindra
.. under the Creative Commons Attribution 4.0 International License (the "License");
.. you may not use this file except in compliance with the License.
.. You may obtain a copy of the License at
..
..      http://creativecommons.org/licenses/by/4.0
..
.. This file is distributed on an "AS IS" BASIS,
.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. See the License for the specific language governing permissions and
.. limitations under the License.
.. ===============LICENSE_END=========================================================

====================================
Generic Model Runner Developer Guide
====================================

Overview
========

In the application.properties file under the resources directory, the model_type property defines which type of model runner this is. If model_type is defined and its value is G, this is a generic Java Model Runner that invokes generic models internally; otherwise, it is an H2O Model Runner that runs H2O models.

The model runner takes a proto string, extracts the attribute information from it, and writes the string to a dataset.proto file. It then invokes the protoc compiler on dataset.proto to generate the DatasetProto.java file. Finally, it invokes the javac compiler to compile that Java file into the corresponding class files.

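
The protoc and javac steps described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Model Runner's actual code: the class and method names are hypothetical, protoc is assumed to be on the PATH, and the classpath argument is assumed to point at the protobuf Java runtime jar that the generated source depends on. The ``--proto_path`` and ``--java_out`` flags are standard protoc options, and ``javax.tools.JavaCompiler`` is the JDK's in-process compiler API.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper, not part of the Model Runner code base.
class CompilePipelineSketch {

    // Builds the protoc command line that generates Java source from a .proto file.
    static List<String> protocCommand(Path protoFile, Path srcDir) {
        Path abs = protoFile.toAbsolutePath();
        return Arrays.asList(
                "protoc",
                "--proto_path=" + abs.getParent(),
                "--java_out=" + srcDir,
                abs.toString());
    }

    // Runs protoc as an external process and returns its exit code (0 on success).
    static int runProtoc(Path protoFile, Path srcDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(protocCommand(protoFile, srcDir))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    // Compiles the generated Java file in-process and returns the compiler's exit code.
    // The classpath must include the protobuf Java runtime jar (an assumption here).
    static int runJavac(Path javaFile, Path classesDir, String classpath) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found; a JDK is required, not a JRE");
        }
        return compiler.run(null, null, null,
                "-classpath", classpath,
                "-d", classesDir.toString(),
                javaFile.toString());
    }

    public static void main(String[] args) {
        // Prints the command only; the paths are placeholders for illustration.
        System.out.println(String.join(" ",
                protocCommand(Paths.get("/tmp/plugin/dataset.proto"), Paths.get("/tmp/plugin/src"))));
    }
}
```

Running the compiler in-process avoids depending on a javac binary on the PATH, but it does require that the service itself runs on a JDK.
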

The Model Runner exposes the following POST endpoints: /{operation}, /transformCSV, /transformCSVDefault, /getBinary, /getBinaryDefault, /transformJSON, /transformJSONDefault, /getBinaryJSON and /getBinaryJSONDefault. It also exposes three PUT endpoints: /model, /proto, and /model/configuration.

For the /{operation} API, the request body contains a binary string that the model runner must parse before passing it to the predictor. To do so, the model runner dynamically loads, at run time, all the relevant DatasetProto$*** classes generated by the protoc and javac compilers so that it can use their deserialization methods. Once the binary string has been deserialized, the result is used to construct the row data in the format that the predictor accepts. The Model Runner then re-serializes the results and sends them back to the client.

The /transformCSV and /transformCSVDefault APIs allow users to directly upload a .csv file whose columns match what is specified in the default.proto file. The first one, /transformCSV, also allows users to upload the corresponding XXX.proto file and modelXXX.zip file. The second one, /transformCSVDefault, uses the default.proto and model.zip in the directory specified in the application.properties file. The model runner builds the binary representation of the .csv file using the DatasetProto$*** classes, which saves users from having to convert the .csv file to a binary string themselves. Both endpoints return the prediction results.

The /getBinary and /getBinaryDefault APIs are utilities that allow users to upload a .csv file and get back its binary representation as a byte array. Users can pass the returned byte string as input data to the /{operation} API.

Users can call the three PUT endpoints, /model, /model/configuration, and /proto, to replace the currently uploaded model, the model configuration file, and the default proto file, respectively.

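
The dynamic loading and deserialization step described for the /{operation} API can be sketched with a ``URLClassLoader`` and reflection. This is a minimal illustration, not the Model Runner's implementation: the class and method names are hypothetical, and the nested message name in the usage note is made up. The only protobuf-specific fact relied on is that generated message classes expose a static ``parseFrom(byte[])`` method.

```java
import java.lang.reflect.Method;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Model Runner code base.
class DynamicLoadSketch {

    // Loads a class by binary name (e.g. "Outer$Nested") from a directory of .class files.
    // The directory should already exist: URLClassLoader treats a URL as a directory
    // only when it ends with '/', which Path.toUri() guarantees for existing directories.
    static Class<?> loadClass(Path classesDir, String className) {
        try {
            URL[] urls = { classesDir.toUri().toURL() };
            // The loader is intentionally left open; it must outlive the classes it defines.
            ClassLoader loader = new URLClassLoader(urls, DynamicLoadSketch.class.getClassLoader());
            return Class.forName(className, true, loader);
        } catch (MalformedURLException | ClassNotFoundException e) {
            throw new IllegalStateException("Cannot load " + className + " from " + classesDir, e);
        }
    }

    // Calls the static parseFrom(byte[]) method that protobuf generates on message classes.
    static Object parse(Class<?> messageClass, byte[] payload) {
        try {
            Method parseFrom = messageClass.getMethod("parseFrom", byte[].class);
            // Cast to Object so the byte[] is passed as one argument, not spread as varargs.
            return parseFrom.invoke(null, (Object) payload);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot deserialize with " + messageClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        // JDK classes resolve through the parent loader, so this works without generated code.
        System.out.println(loadClass(Paths.get("."), "java.lang.String"));
    }
}
```

With generated classes in place, a call might look like ``parse(loadClass(classesDir, "DatasetProto$DataFrame"), requestBytes)``, where ``DataFrame`` stands in for whatever message the uploaded proto actually defines.
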

Requirements
============

For the Model Runner to be able to dynamically load the plugin jar that contains the proto classes at run time, the plugin jar must be outside the project directories. The application.properties file specifies the default plugin root directory, ${plugin_root}, which is created when the Model Runner starts if it does not already exist. When the Model Runner receives a POST request, it puts the generated Java code under the ${plugin_root}/src directory and the generated class files under the ${plugin_root}/classes directory. These two directories, ${plugin_root}/src and ${plugin_root}/classes, must therefore also be present. If they are not, the model runner creates them.

Supported Methods and Objects
=============================

The microservice methods and objects are documented using Swagger. A running server documents itself at a URL like the following, but consult the server's application.properties file for the exact port number ("8334") and context root ("modelrunner") in use::

    http://localhost:8334/modelrunner/swagger-ui.html

Build Prerequisites
===================

The build machine needs the following:

1. Java version 1.8
2. Maven version 3
3. Connectivity to Maven Central (for most jars)
4. protoc compiler for Java

Build and Package
=================

Use Maven to build and package the service into a single "fat" jar using this command::

    mvn clean install

Launch Prerequisites
====================

1. Java version 1.8
2. A valid application.properties file
3. protoc compiler for Java
4. protobuf Java Runtime Library > 3.4.0

Launch Instructions
===================

Start the microservice for development and testing like this::

    mvn clean spring-boot:run

To launch from Eclipse, run the class org.acumos.modelrunner.Application.

To launch from the command line with an external configuration file, use a command like this::

    java -jar ./target/modelrunner-0.0.1-SNAPSHOT.jar --spring.config.location=./application.properties
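
Because the launch command above points at an external application.properties file, it can be useful to check the plugin directory layout from the Requirements section before starting the service. The following is a minimal sketch, assuming the property key is literally ``plugin_root`` (the guide only shows the ``${plugin_root}`` placeholder). The class and method names are hypothetical and are not part of the Model Runner.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical pre-launch check, not part of the Model Runner code base.
class PluginDirsSketch {

    // Reads a .properties file such as ./application.properties.
    static Properties loadProperties(Path propertiesFile) {
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(propertiesFile)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read " + propertiesFile, e);
        }
        return props;
    }

    // Creates <plugin_root>/src and <plugin_root>/classes if they are missing
    // and returns the plugin root. The key name "plugin_root" is an assumption.
    static Path ensurePluginDirs(Properties props) {
        String root = props.getProperty("plugin_root");
        if (root == null || root.trim().isEmpty()) {
            throw new IllegalArgumentException("plugin_root is not set");
        }
        Path pluginRoot = Paths.get(root.trim());
        try {
            // createDirectories also creates missing parents and is a no-op if present.
            Files.createDirectories(pluginRoot.resolve("src"));
            Files.createDirectories(pluginRoot.resolve("classes"));
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot create plugin directories under " + pluginRoot, e);
        }
        return pluginRoot;
    }

    public static void main(String[] args) {
        Path file = Paths.get(args.length > 0 ? args[0] : "./application.properties");
        System.out.println("Plugin root ready: " + ensurePluginDirs(loadProperties(file)));
    }
}
```

The Model Runner is described as creating these directories itself, so a check like this is mainly a way to catch a missing or unwritable plugin root early.
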