Analyze your unstructured data using Projections#
Unstructured data such as text and images can be hard to analyze in raw form. Projections map those fields to scalars that can act as proxies for possible model quality issues, giving you monitorable summaries of the raw data. They are computed on Gantry’s infrastructure after the data is logged, so they don’t slow down inference and you can iterate on them without redeploying your model.
For example, suppose you are building a machine learning application to classify support tickets and want to detect changes in the text being passed to the model. To monitor the input text, you can add Length and Sentiment projections. You can use these projections like any other field to compute metrics, set alerts, or slice your data. If your application suddenly starts receiving many long support tickets with unusually negative sentiment, Gantry will let you know.
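To make the idea of "mapping text to a scalar" concrete, here is a minimal sketch in plain Python. It is not how Gantry's built-in Length or Sentiment projections are implemented; the function names `text_length` and `negative_word_ratio` and the word list are hypothetical, and the word-list ratio is only a crude stand-in for a real sentiment model.

```python
def text_length(text: str) -> float:
    """Number of characters in the text, returned as a float scalar."""
    return float(len(text))


# Tiny illustrative word list (an assumption for this sketch);
# a real sentiment projection would use a trained model.
NEGATIVE_WORDS = {"broken", "angry", "refund", "terrible", "worst"}


def negative_word_ratio(text: str) -> float:
    """Fraction of whitespace-separated words that appear in NEGATIVE_WORDS."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    return sum(w in NEGATIVE_WORDS for w in words) / len(words)


ticket = "My order arrived broken and I want a refund!"
print(text_length(ticket), negative_word_ratio(ticket))
```

Each function turns one raw ticket into a single number, which is what lets a projection be charted, alerted on, or used as a slicing field.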
Gantry comes with a number of projections pre-defined. You can also define your own.
Configuring projections#
To add a projection, go to the projections sidebar. Click the + button and select the projection you want to add and the field(s) you want to compute it on. Note: adding a projection only affects new data you log to Gantry.
Define custom projections#
Gantry allows you to define your own metrics and projections to run on Gantry’s infrastructure.
Custom Projections#
To create a custom projection, you’ll need to provide (i) a projection definition, and (ii) some basic information about how to run it.
A projection definition is a Python function that takes in the model inputs and returns a scalar value. You’ll provide the function in a Python file, along with a YAML file that gives Gantry additional context on how to run it.
Example#
Suppose we want to use spaCy to detect whether an input contains a proper noun. We can define the following function:
import spacy

def contains_proper_noun(text):
    # Return 1.0 if any token in the text is tagged as a proper noun, else 0.0.
    nlp = spacy.load('en_core_web_sm')
    pos = [token.pos_ for token in nlp(text)]
    if 'PROPN' in pos:
        return 1.0
    return 0.0
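The function above calls `spacy.load` on every invocation, which can be slow. The variant below is a sketch that caches the loaded pipeline. It assumes the process running the projection is reused across calls, which this page does not confirm. The helper `_load_nlp` and the optional `nlp` parameter are additions for illustration; the parameter only exists so the function can be exercised locally with a stand-in pipeline.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def _load_nlp(model_name="en_core_web_sm"):
    # Imported lazily so this module can be imported (and tested with a
    # stand-in pipeline) even where spaCy is not installed.
    import spacy
    return spacy.load(model_name)


def contains_proper_noun(text, nlp=None):
    """Return 1.0 if any token is tagged PROPN, else 0.0."""
    if nlp is None:
        nlp = _load_nlp()  # loaded once, then served from the cache
    return 1.0 if any(tok.pos_ == "PROPN" for tok in nlp(text)) else 0.0
```

Because `nlp` defaults to `None`, the function can still be called with a single `text` argument, as in the original version.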
Now we need to give Gantry some basic information about how to run the function, which we include in a YAML file. In particular, we need to tell Gantry:
- where the Python file, specified by entrypoint, is relative to the YAML file. In this case it is custom_projections.py, in the same directory.
- the name of the model, model_name, to compute the projection on.
- the parameters the function will take, specified by function_args. In this case it is the text feature from the model inputs.
- the dependencies, specifically the Python dependencies listed in requirements, required to run the function. In this case spacy is used to tag the parts of speech in the text.
Here is the full YAML:
proper_noun_detector:
  model_name: "my_model"
  projection_name: "Proper Noun detect"
  entrypoint: custom_projections.py
  function_name: contains_proper_noun
  function_args:
    - inputs.text
  requirements:
    - spacy
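Before registering, it can help to sanity-check that the YAML and the Python file agree. The sketch below is a hypothetical local check, not part of Gantry: it takes the parsed config as a plain dict plus the entrypoint's source text, and the list of required keys is simply the set of keys used in the example above, not an official schema.

```python
import ast

# Keys taken from the example YAML above (an assumption, not a Gantry schema).
REQUIRED_KEYS = ("model_name", "projection_name", "entrypoint",
                 "function_name", "function_args", "requirements")


def check_projection_config(config: dict, source: str) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in config]

    # Collect the top-level functions defined in the entrypoint source.
    funcs = {n.name: n for n in ast.parse(source).body
             if isinstance(n, ast.FunctionDef)}
    name = config.get("function_name")
    if name not in funcs:
        problems.append(f"function not found in entrypoint: {name}")
        return problems

    # The number of function_args should fit the function's positional parameters.
    args = funcs[name].args
    total = len(args.args)
    required = total - len(args.defaults)
    given = len(config.get("function_args", []))
    if not required <= given <= total:
        problems.append(
            f"{name} takes {required}-{total} arguments, config passes {given}")
    return problems


config = {
    "model_name": "my_model",
    "projection_name": "Proper Noun detect",
    "entrypoint": "custom_projections.py",
    "function_name": "contains_proper_noun",
    "function_args": ["inputs.text"],
    "requirements": ["spacy"],
}
source = "def contains_proper_noun(text):\n    return 0.0\n"
print(check_projection_config(config, source))
```

In practice you would read `source` from the file named by `entrypoint` and build `config` from the entry under `proper_noun_detector` in the YAML.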
You can then register the custom projection with Gantry as follows, substituting in your API key:
gantry-cli projection register --filename custom-projections.yaml --api_key <YOUR API KEY>
Gantry will then use the YAML file to determine the custom projection’s dependencies and build a Docker container to run it. Once your projection is registered with your Gantry instance, it will be computed on the model inputs as Gantry receives them, as well as on any reference data that is uploaded to Gantry.