Acumos C++ Client User Guide

Target Users

The target users of this guide are modelers with sufficient C++ knowledge to write and build C++ applications.

Overview

This guide describes the steps needed to onboard a C++ model. Basically the following steps are needed:

  1. Train the model
  2. Serialize the trained model
  3. Create the microservice
  4. Define the gRPC service
  5. Use the onboarding-cpp client to save the model_bundle locally or to onboard it to Acumos by CLI

The model_bundle is a package that contains all the materials required by Acumos to on-board the model and use it in Acumos. At the end you can follow a tutorial in which the provided iris-classifier example is prepared for onboarding.

Architecture

In Acumos a model is packed as a dockerized microservice exposing a gRPC interface, which is specified using Google protobuf.

In order to achieve that, the modeler must write a short C++ program that connects the trained model to the generated gRPC stub, building an executable that contains both the gRPC server and the trained model. This executable is then started in the Docker container.

The minimum C++ standard level is C++11 and the recommended compiler is gcc 7.4.

Prerequisites

The examples were developed in the following environment:

  1. Ubuntu 18.04
  2. g++ 7.4 (default version on Ubuntu 18.04)
  3. gRPC 1.20, which should be installed from source (https://github.com/grpc/grpc) into /usr/local, including all plugins
  4. python 3.6
  5. cmake

Set the two following environment variables (note that shell variable assignments must not have spaces around the equals sign):

export ACUMOS_HOST=my.acumos.instance.org
export ACUMOS_PORT=my_acumos_port

These values can be found in the installation folder of Acumos: system-integration/AIO/acumos_env.sh. Please look at the following Acumos local environment variables: ACUMOS_DOMAIN, ACUMOS_PORT or ACUMOS_ORIGIN.

Tutorial: Onboard the Iris Kmeans Classifier

To perform the following steps you have to clone the acumos-c-client repository from gerrit (https://gerrit.acumos.org). Browse the repositories to acumos-c-client, then retrieve the SSH or HTTPS commands. You can also clone it from GitHub (https://github.com/acumos/acumos-c-client).

In the acumos-c-client repository, the “examples” directory contains the complete steps to onboard the well-known Iris Classifier using a KMeans implementation.

Add Kmeans clustering in acumos-c-client repo

After cloning the acumos-c-client repository, do the following:

cd acumos-c-client
git submodule update --init

This will add the DKM algorithm (a generic K-means clustering implementation written in C++) in the /examples/iris-kmeans/dkm folder.

Then you have to build the basic executables:

cmake .
make

Train the model and save it in serialized format

The targeted microservice needs to load the serialized trained model at startup. It is completely up to the developer how this is done. The example uses protobuf, because it fits in the technology lineup of the whole example.

First, create the protobuf model definition that will be used to save and load the trained model:

cd examples/iris-kmeans/step2_serialize_model/
protoc --cpp_out=. centroids.proto

Then build the training binary:

cmake .
make

and finally train the model and save it in serialized format:

cd ..
./step2_serialize_model/bin/save-iris-kmeans

The file iris-kmeans/src/iris-kmeans.cpp trains the iris classifier model by finding a centroid for each of the three iris species. The classify method then finds the closest centroid to the given data point and returns it as the most probable species. Thus in this case, the three centroids make up the trained model.

Now the model is serialized and the binary file is saved in /iris-kmeans/data/.

Create protobuf Microservice

To create the protobuf microservice do the following:

cd step3_model_microservice/
cmake .
make

The microservice must be implemented; first, it reads the serialized model from step 2. The example implementation can be found in the file iris-kmeans/step3_model_microservice/run-microservice.cpp. Then, the service interface of the microservice must be specified using protobuf. In our example, it is the classify method, whose input and output parameters are defined in iris-kmeans/step3_model_microservice/model.proto.

Launch the Acumos on-boarding cpp client

Create a lib directory

cd ..
mkdir lib

and launch the cpp client from the command line or with your preferred Python IDE. It is recommended to call the onboarding script from the /examples/iris-kmeans folder.

python3 ../../cpp-client.py

The Acumos on-boarding cpp client will ask you the following questions:

  • Name of the model
  • Path to model.proto
  • Path to the data, lib and executable (bin) directories
  • Name of the dump directory (where you want to save the model bundle)
  • CLI onboarding? [yes/no]
    • If no, the model bundle will be saved locally in the dump directory and you will be able to on-board it later by Web-onboarding.
    • If yes, you must answer the following questions (if the environment variables haven't been set previously, as requested in the prerequisites, the cpp client will ask you to fill in the values at this step):
      • Do you want to create a microservice?
      • Do you want to add a license?
      • User Name (your Acumos login)
      • Password (your Acumos password)

Then the on-boarding starts; it will take more or less time depending on whether you chose to create the microservice during on-boarding or not. Once the onboarding is finished you can retrieve your model in Acumos.

How to on-board your own model

In the following we describe all the steps we followed to build the previous tutorial. You must follow these steps to be able to on-board your own C++ model in Acumos.

Step 1: Train model

We assume that you have a cpp file like src/iris-kmeans.cpp to train your own model.

The src/iris-kmeans.cpp trains the iris classifier model by finding a centroid for each of the three iris species. The classify method then finds the closest centroid to the given data point and returns it as the most probable species. Thus in this case, the three centroids make up the trained model.
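The classify logic described above can be sketched without the dkm library. The snippet below is a minimal illustration, not the code from the repository: it returns the index of the centroid nearest to the query point by squared Euclidean distance, which is what dkm::predict computes over the trained centroids in the real example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Squared Euclidean distance between a query point and a centroid
// (four features: sepal length/width, petal length/width).
static float sq_dist(const std::array<float, 4>& a, const std::array<float, 4>& b) {
    float d = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        float diff = a[i] - b[i];
        d += diff * diff;
    }
    return d;
}

// Return the index of the centroid closest to the query point.
// With three centroids (one per iris species) the result is 0, 1 or 2,
// i.e. the most probable species.
std::size_t classify(const std::vector<std::array<float, 4>>& centroids,
                     const std::array<float, 4>& query) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < centroids.size(); ++i) {
        if (sq_dist(centroids[i], query) < sq_dist(centroids[best], query)) {
            best = i;
        }
    }
    return best;
}
```

The centroid values themselves come from the training run in step 1; the sketch only shows the lookup performed at classification time.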

Step 2: Serialize trained model

The targeted microservice needs to load the serialized trained model at startup. It is completely up to the developer how this is done. The example uses protobuf, because it fits in the technology lineup of the whole example. To save and load the trained model, the tutorial uses a protobuf definition that can be found in step2_serialize_model/centroids.proto:

syntax = "proto3";
package cppsample;

message Centroid {
  float sepal_length = 1;
  float sepal_width = 2;
  float petal_length = 3;
  float petal_width = 4;
}

message CentroidList {
  repeated Centroid centroid = 1;
}

Then, generate the respective C++ code using the protobuf compiler:

protoc --cpp_out=. centroids.proto

And use a small code snippet to save the data to a file:

#include <fstream>
using namespace std;

string model_file = "data/iris-model.bin";
fstream output(model_file, ios::out | ios::binary);
centroids.SerializePartialToOstream(&output);

In the tutorial, the two examples to load and save the iris model must be run from the iris-kmeans directory to get all file paths right: they expect the data directory in the cwd and will write the model to data/iris-model.bin.
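The save and load halves form a round trip. The sketch below illustrates that pattern with a raw block of floats instead of the generated protobuf class; the real examples call SerializePartialToOstream on a cppsample::CentroidList when saving and the matching parse call when the microservice loads data/iris-model.bin at startup.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Save a flat vector of floats to a binary file (stand-in for the
// protobuf serialization used in the real example).
void save_model(const std::string& path, const std::vector<float>& values) {
    std::ofstream output(path, std::ios::out | std::ios::binary);
    output.write(reinterpret_cast<const char*>(values.data()),
                 static_cast<std::streamsize>(values.size() * sizeof(float)));
}

// Load the same number of floats back (stand-in for parsing the
// serialized model at microservice startup).
std::vector<float> load_model(const std::string& path, std::size_t count) {
    std::vector<float> values(count);
    std::ifstream input(path, std::ios::in | std::ios::binary);
    input.read(reinterpret_cast<char*>(values.data()),
               static_cast<std::streamsize>(count * sizeof(float)));
    return values;
}
```

Whatever format you choose, the important constraint is the one stated above: the microservice must be able to reconstruct the trained model from the file at startup.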

Step 3: Create Microservice

The microservice must be implemented; first, it reads the serialized model from step 2. The example implementation can be found in the file run-microservice.cpp.

Then, the service interface of the microservice must be specified using protobuf. In our example, it is the classify method; its input and output parameters must be defined in a file that should be named model.proto:

syntax = "proto3";
package cppservice;

service Model {
  rpc classify (IrisDataFrame) returns (ClassifyOut);
}

message IrisDataFrame {
  repeated double sepal_length = 1;
  repeated double sepal_width = 2;
  repeated double petal_length = 3;
  repeated double petal_width = 4;
}

message ClassifyOut {
  repeated int64 value = 1;
}

Step 4: Define gRPC service

From model.proto, the necessary code fragments and gRPC stubs can be generated like this:

protoc --cpp_out=. model.proto
protoc --grpc_out=. --plugin=protoc-gen-grpc=/usr/local/bin/grpc_cpp_plugin model.proto

After that, the gRPC service method has to be implemented:

Status classify(ServerContext *context, const IrisDataFrame *input, ClassifyOut *response) override {
    cout << "enter classify service" << endl;
    std::array<float, 4> query;
    query[0]=input->sepal_length(0);
    query[1]=input->sepal_width(0);
    query[2]=input->petal_length(0);
    query[3]=input->petal_width(0);
    auto cluster_index = dkm::predict<float, 4>(means, query);
    cout << "data point classified as cluster " << cluster_index << endl;
    response->add_value(cluster_index);

    return Status::OK;
}

And finally, the gRPC server has to be started:

string server_address("0.0.0.0:"+port);
ServerBuilder builder;
builder.AddListeningPort(server_address, grpc::InsecureServerCredentials());
builder.RegisterService(&iris_model);
unique_ptr<Server> server(builder.BuildAndStart());
cout << endl << "Server listening on " << server_address << endl;
server->Wait();

To prepare for packaging, specific folders are expected:

  1. the data folder, where all files of the serialized model are stored
  2. the lib folder, which should contain the shared libraries that are not part of the g++ base installation
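For the iris example, the layout can be prepared like this (run from the model folder; the file name iris-model.bin is the one produced in step 2):

```shell
# Create the two folders the packaging step expects
mkdir -p data lib
# data/ holds the serialized model, e.g. data/iris-model.bin
# lib/ holds the extra shared libraries your executable links against
ls -d data lib
```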

Step 5: Use the onboarding-cpp client to save model_bundle locally or to onboard it to Acumos by CLI

Please refer to the tutorial above to use the onboarding-cpp client.