Query your data using the Gantry SDK
Gantry provides two ways to interact with your application’s data: the dashboard and the SDK query module.
To start using the SDK you’ll need an API key.
Using the Gantry SDK
The goal of the Gantry SDK is to provide an elegant way to:
Programmatically access your data in Gantry, without going through a dashboard
Build on the SDK to create new visualizations and analyses (and let us know what you’re making!)
Take custom actions based on Gantry data, such as triggering retraining if performance dips below a threshold
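As a rough illustration of the last point, the sketch below shows one way a threshold check might be wired up. It is plain Python and makes no Gantry calls: `should_retrain`, `check_and_act`, and the `trigger_retraining` callback are hypothetical names, and the accuracy readings are assumed to come from a metric you computed with the SDK.

```python
from typing import Callable, Sequence


def should_retrain(recent_accuracy: Sequence[float], threshold: float, patience: int = 2) -> bool:
    """Return True when the last `patience` readings are all below `threshold`.

    Requiring several consecutive low readings avoids reacting to one noisy window.
    """
    if patience < 1:
        raise ValueError("patience must be at least 1")
    if len(recent_accuracy) < patience:
        return False
    return all(value < threshold for value in recent_accuracy[-patience:])


def check_and_act(
    recent_accuracy: Sequence[float],
    threshold: float,
    trigger_retraining: Callable[[], None],
) -> bool:
    """Call the (hypothetical) retraining hook if the check fires; report whether it did."""
    if should_retrain(recent_accuracy, threshold):
        trigger_retraining()
        return True
    return False
```

The `patience` parameter is a design choice rather than anything Gantry prescribes; set it to 1 to act on a single low reading.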
Connect to Gantry
Use your API key to initialize the query module:
import gantry.query as gquery
gquery.init(api_key="<your-api-key>")
Create a query
Let’s start querying some data:
import datetime
# Get all available applications
applications = gquery.list_applications()
# Create a window for the last 30 minutes of data.
end = datetime.datetime.utcnow()
start = end - datetime.timedelta(minutes=30)
window = gquery.query(application="my-awesome-app", start_time=start, end_time=end, version="1.2.3")
# If you have a saved view, you can create a window with it as well
window = gquery.query(application="my-awesome-app", view="my-saved-view")
The output of .query() is a lightweight object that has metadata about your query. Because the computation happens server side, you don’t use unnecessary memory on your own machine.
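The 30-minute window shown earlier generalizes to several back-to-back windows, which is handy when you want to compare one period against the next. The helper below is a minimal sketch using only the standard library; `consecutive_windows` is a hypothetical name, and each `(start, end)` pair is assumed to be passed on as `start_time` and `end_time`.

```python
import datetime


def consecutive_windows(end, width, count):
    """Return `count` back-to-back (start, end) pairs, oldest first, finishing at `end`."""
    if count < 1:
        raise ValueError("count must be at least 1")
    bounds = []
    for i in range(count, 0, -1):
        window_end = end - (i - 1) * width
        bounds.append((window_end - width, window_end))
    return bounds


# Fixed timestamp so the example is reproducible.
end = datetime.datetime(2022, 1, 12, 22, 0, 0)
for start_time, end_time in consecutive_windows(end, datetime.timedelta(minutes=30), 3):
    print(start_time.isoformat(), "->", end_time.isoformat())
```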
Viewing data and stats
Our custom dataframe object supports many of the pandas dataframe operations.
# See the first 5 rows of your data: inputs, outputs & labels.
>>> window.head()
# The result is a pandas DataFrame.
inputs.feature_1 timestamp
0 Wed, 12 Jan 2022 21:54:25 GMT
1 Wed, 12 Jan 2022 21:54:30 GMT
2 Wed, 12 Jan 2022 21:54:35 GMT
3 Wed, 12 Jan 2022 21:54:40 GMT
4 Wed, 12 Jan 2022 21:54:45 GMT
# Compute the mean value of a column.
>>> window["inputs.feature_1"].mean()
10
# Compute the [0.1, 0.5, 0.9] quantiles for all columns.
>>> window.quantile([0.1, 0.5, 0.9])
# Get a filtered window.
>>> filtered_window = window[window["inputs.feature_1"] > 100]
# You can combine filters as well.
>>> filtered_window = window[(window["inputs.feature_2"] < 100) & (window["inputs.feature_1"] > 100)]
# Get a stat from that window.
>>> filtered_window["inputs.feature_1"].mean()
101
Check out more of the available stats here.
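For readers less familiar with the pandas idioms the window object mirrors, here is the same filter-then-aggregate pattern on an ordinary pandas DataFrame. This is a local sketch only: the values are made up, nothing here touches Gantry, and the server-side object may differ in details such as return types.

```python
import pandas as pd

# Made-up values standing in for rows of a window.
frame = pd.DataFrame(
    {
        "inputs.feature_1": [90, 110, 130, 150],
        "inputs.feature_2": [50, 80, 120, 95],
    }
)

# Combine two conditions with &; each condition needs its own parentheses.
mask = (frame["inputs.feature_2"] < 100) & (frame["inputs.feature_1"] > 100)
filtered = frame[mask]

# Mean of one column over the filtered rows.
print(filtered["inputs.feature_1"].mean())

# Quantiles of one column over all rows.
print(frame["inputs.feature_1"].quantile([0.1, 0.5, 0.9]))
```

The parentheses around each comparison matter because `&` binds more tightly than `<` and `>` in Python.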
Computing metrics
You can also compute metrics on your predictions and feedback using the SDK.
# For categorical models:
# Compute the model's accuracy.
gquery.metric.accuracy(window["outputs"], window["feedback.label"])
# Or get the confusion matrix.
gquery.metric.confusion_matrix(window["outputs"], window["feedback.label"])
# For regression models:
# Compute the model's mean squared error.
gquery.metric.mean_squared_error(window["outputs"], window["feedback.label"])
# Or get the max error in the window.
gquery.metric.max_error(window["outputs"], window["feedback.label"])
Check out more of the available metrics here.
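To make clear what these four metrics measure, the sketch below gives their textbook definitions in plain Python. These are reference definitions over local lists, not Gantry's implementation; the function names simply echo the SDK calls above, and `confusion_counts` is a hypothetical stand-in that returns raw pair counts rather than a matrix.

```python
from collections import Counter


def _check(predictions, labels):
    """Both sequences must be non-empty and the same length."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equal in length")


def accuracy(predictions, labels):
    """Fraction of predictions that equal their label."""
    _check(predictions, labels)
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def confusion_counts(predictions, labels):
    """Count occurrences of each (label, prediction) pair."""
    _check(predictions, labels)
    return Counter(zip(labels, predictions))


def mean_squared_error(predictions, labels):
    """Average of the squared differences between prediction and label."""
    _check(predictions, labels)
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)


def max_error(predictions, labels):
    """Largest absolute difference between a prediction and its label."""
    _check(predictions, labels)
    return max(abs(p - y) for p, y in zip(predictions, labels))
```

Checking a handful of rows by hand against definitions like these can be a quick sanity test before relying on a metric computed over a large window.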
Computing distribution distances
Distribution distances measure how similar the distributions of a feature are between two windows of data.
# Compute the d1 distance between two features.
gquery.distance.d1(window["inputs.feature_1"], window["inputs.feature_2"])
# Compute the Kolmogorov-Smirnov distance.
gquery.distance.ks(window["inputs.feature_1"], window["inputs.feature_2"])
# Compute the Kullback-Leibler divergence.
gquery.distance.kl(window["inputs.feature_1"], window["inputs.feature_2"])
Check out more of the available distance metrics here.
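For intuition about two of these measures, the sketch below implements the standard two-sample Kolmogorov-Smirnov statistic and the discrete Kullback-Leibler divergence in plain Python. It is an illustration of the textbook definitions, not Gantry's implementation: how the SDK bins continuous features or handles empty bins is not stated here, and the d1 distance is left out because its definition is not given on this page.

```python
import bisect
import math


def ks_distance(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The gap between two step functions is largest at one of the observed points.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same bins."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            if q_i == 0:
                return math.inf  # p puts mass where q has none
            total += p_i * math.log(p_i / q_i)
    return total
```

Note that the KL divergence is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ, whereas the KS statistic is symmetric and always lies between 0 and 1.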