Python SDK

Analyze your data in a more flexible and familiar notebook environment.

The SDK lets you access data in Gantry programmatically, without going through the dashboard. Accessing this data enables:

  1. Building off the SDK to create new visualizations and analyses.
  2. Taking custom actions, such as triggering retraining if performance dips below a threshold (a sketch of this pattern appears after the metrics section below).

Before running the examples below, install the SDK (see the installation docs) and grab an API key.

Note that Gantry is global
The Gantry module is initialized globally, once per Python process. That means all logging calls in a process must share a single API key, while the other parameters are set at each logging call site.

The first step to take when using the SDK will always be to initialize Gantry with an API key.

import gantry

gantry.init(
    api_key="YOUR_API_KEY",  # see above docs
)
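
To avoid hard-coding the key, a common pattern (plain Python, not a Gantry-specific feature) is to read it from an environment variable. The variable name GANTRY_API_KEY below is an arbitrary choice for this sketch, not one the SDK looks for automatically.

import os

import gantry

# Read the API key from the environment so it is not committed to source control.
gantry.init(api_key=os.environ["GANTRY_API_KEY"])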

Creating a query

import datetime
import gantry.query as gquery

# Get all available applications
applications = gquery.list_applications()

# Create a window for the last 30 minutes of data.
end = datetime.datetime.utcnow()
start = end - datetime.timedelta(minutes=30)
window = gquery.query(
  application="my-awesome-app",
  start_time=start,
  end_time=end,
  version="1.2.3",
)

# If you have a saved view, you can create a window with it as well
window = gquery.query(
  application="my-awesome-app",
  view="my-saved-view",
)
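
If you want a recent window for every application, list_applications and query can be combined. The sketch below assumes list_applications returns application names as strings; if your SDK version returns richer objects, adjust accordingly.

import datetime
import gantry.query as gquery

# Build a one-hour window for each available application.
end = datetime.datetime.utcnow()
start = end - datetime.timedelta(hours=1)

windows = {
    app: gquery.query(application=app, start_time=start, end_time=end)
    for app in gquery.list_applications()
}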

Viewing data and statistics

The Gantry dataframe object supports many of the familiar pandas DataFrame operations.

# See the first 5 rows of your data: inputs, outputs & labels
>>> window.head()
# This is a pandas dataframe
inputs.feature_1      timestamp
0                      Wed, 12 Jan 2022 21:54:25 GMT
1                      Wed, 12 Jan 2022 21:54:30 GMT
2                      Wed, 12 Jan 2022 21:54:35 GMT
3                      Wed, 12 Jan 2022 21:54:40 GMT
4                      Wed, 12 Jan 2022 21:54:45 GMT


# Compute the mean value of a column.
>>> window["inputs.feature_1"].mean()
10

# Compute the [0.1, 0.5, 0.9] quantiles for all columns.
>>> window.quantile([0.1, 0.5, 0.9])

# Get a filtered window
>>> filtered_window = window[window["inputs.feature_1"] > 100]
# Combine multiple filters
>>> filtered_window = window[(window["inputs.feature_2"] < 100) & (window["inputs.feature_1"] > 100)]

# Get a stat from that window
>>> filtered_window["inputs.feature_1"].mean()
101

Computing statistics using group_by:

# Compute the mean value of a column.
>>> window["inputs.feature_1"].mean()
10
# Compute the mean of a column grouped by another column.
>>> window["inputs.feature_1"].mean(group_by="inputs.feature_2")
    inputs.feature_2   mean
0             value1    8.0
1             value2    9.0
2             value3   10.0
3             value4   11.0
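
Filters and group_by compose. The sketch below reuses the hypothetical feature columns from above and assumes a filtered window accepts the same group_by argument as an unfiltered one.

# Mean of feature_1 grouped by feature_2, over rows where feature_1 > 100 only.
filtered_window = window[window["inputs.feature_1"] > 100]
filtered_window["inputs.feature_1"].mean(group_by="inputs.feature_2")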

Computing metrics on predictions and feedback

# For categorical models:

# Compute the model's accuracy.
gquery.metric.accuracy(window["outputs"], window["feedback.label"])

# Get the confusion matrix.
gquery.metric.confusion_matrix(window["outputs"], window["feedback.label"])

# For regression models:

# Compute the model's mean squared error.
gquery.metric.mean_squared_error(window["outputs"], window["feedback.label"])

# Or get the max error in the window.
gquery.metric.max_error(window["outputs"], window["feedback.label"])
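
As noted at the top of this page, these metrics can drive custom actions such as retraining. The sketch below is one way to wire that up: it computes accuracy over the last day and calls a placeholder trigger_retraining() function that stands in for your own retraining entry point; the 0.9 threshold is arbitrary.

import datetime
import gantry.query as gquery

# Query the last 24 hours of predictions and feedback.
end = datetime.datetime.utcnow()
start = end - datetime.timedelta(hours=24)
window = gquery.query(
    application="my-awesome-app",
    start_time=start,
    end_time=end,
)

# Kick off retraining if accuracy dips below the threshold.
accuracy = gquery.metric.accuracy(window["outputs"], window["feedback.label"])
if accuracy < 0.9:
    trigger_retraining()  # placeholder for your own retraining pipeline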

Computing distribution distances

Distribution distances measure how similar the distributions of two series of data are, for example the same feature across two windows, or two features within one window.

# Compute the d1 distance between two features.
gquery.distance.d1(window["inputs.feature_1"], window["inputs.feature_2"])

# Compute the Kolmogorov-Smirnov distance.
gquery.distance.ks(window["inputs.feature_1"], window["inputs.feature_2"])

# Compute the Kullback-Leibler (KL) divergence.
gquery.distance.kl(window["inputs.feature_1"], window["inputs.feature_2"])
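
To check for drift over time, compare the same feature in two different windows, such as a reference window against the most recent data. This sketch reuses the query and distance calls shown above; the window sizes and application name are placeholders.

import datetime
import gantry.query as gquery

now = datetime.datetime.utcnow()

# A one-day reference window from a week ago and a window covering the last day.
reference = gquery.query(
    application="my-awesome-app",
    start_time=now - datetime.timedelta(days=8),
    end_time=now - datetime.timedelta(days=7),
)
recent = gquery.query(
    application="my-awesome-app",
    start_time=now - datetime.timedelta(days=1),
    end_time=now,
)

# A larger distance suggests the feature's distribution has shifted.
gquery.distance.d1(reference["inputs.feature_1"], recent["inputs.feature_1"])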