
XGBoost

Let's go through a simple example of integrating the Aporia SDK with an XGBoost model.

STEP 1: Add Model

Click the Add Model button on the Models page.


Enter the model name and optionally a description. Click Next.

STEP 2: Initialize the Aporia SDK

First, we initialize the Aporia SDK and load a dataset for training the model.

import uuid
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

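import aporia
# Replace the placeholder token and environment with your own values
aporia.init(token='123', environment='example')

# Load the dataset and separate the features from the label column
data = pd.read_csv("./path_to_real_file_with_data.csv")
features = data.drop(["will_buy_insurance"], axis=1)
labels = data[["will_buy_insurance"]]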
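
The split above keeps the labels as a one-column DataFrame (note the double brackets) rather than a Series. The following minimal sketch shows the resulting shapes on a hypothetical in-memory dataset. The `age` and `income` columns and all values are invented for illustration; only the `will_buy_insurance` column name comes from the example.

```python
import pandas as pd

# Hypothetical toy data standing in for the CSV file; only the
# "will_buy_insurance" column name is taken from the example above.
data = pd.DataFrame({
    "age": [25, 40, 33, 58],
    "income": [30000, 82000, 54000, 61000],
    "will_buy_insurance": [False, True, False, True],
})

# Everything except the label column becomes a feature
features = data.drop(["will_buy_insurance"], axis=1)
# Double brackets keep the labels as a DataFrame with one column
labels = data[["will_buy_insurance"]]

print(list(features.columns))
print(type(labels).__name__, labels.shape)
```

Keeping the labels as a DataFrame preserves the column name, so the same object can be used wherever a named column is expected.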

STEP 3: Create Model Version

Next, we'll define a version for the new model:

aporia.create_model_version(
  model_id="my-model",
  model_version="v1",
  model_type="binary",
  features=aporia.pandas.infer_schema_from_dataframe(features),
  predictions=aporia.pandas.infer_schema_from_dataframe(labels)
)

STEP 4: Train Model

Now, let's train an XGBoost model and log the training data:

dtrain = xgb.DMatrix(features, labels["will_buy_insurance"].values)
xgb_model = xgb.train({"objective": "binary:logistic"}, dtrain)

apr_model = aporia.Model(model_id="my-model", model_version="v1")
apr_model.log_training_set(features=features, labels=labels)

STEP 5: Predict

The last step is to log the predictions made by the model.

# pred_features is a DataFrame containing the features for the predictions
prediction = xgb_model.predict(xgb.DMatrix(pred_features))

apr_model = aporia.Model(model_id="my-model", model_version="v1")
apr_model.log_prediction(
  id=str(uuid.uuid4()),
  features=aporia.pandas.pandas_to_dict(pred_features),
  predictions={
    # Convert the NumPy boolean to a plain Python bool
    "will_buy_insurance": bool(prediction[0] > 0.7)
  }
)
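
The logged prediction above is a boolean obtained by comparing the model's output probability with a threshold of 0.7. Here is a minimal sketch of that step in isolation, using made-up probabilities in place of real model output. The helper `to_logged_prediction` is hypothetical and not part of the Aporia SDK; it only builds the id and the predictions dictionary.

```python
import uuid

THRESHOLD = 0.7  # same cutoff as in the example above

def to_logged_prediction(probability, threshold=THRESHOLD):
    """Build an id and a predictions dict for one probability (hypothetical helper)."""
    return {
        # A fresh random UUID so each prediction has a unique id
        "id": str(uuid.uuid4()),
        # bool() yields a plain Python bool even if the input is a NumPy scalar
        "predictions": {"will_buy_insurance": bool(probability > threshold)},
    }

# Made-up probabilities standing in for the output of xgb_model.predict(...)
for p in [0.12, 0.70, 0.93]:
    print(p, to_logged_prediction(p)["predictions"])
```

Because the comparison is strict, a probability exactly equal to the threshold is logged as False.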