Analyze your unstructured data using Projections#

Unstructured data such as text and images can be hard to analyze in raw form. Projections map those fields to scalars that can act as proxies for possible model quality issues, giving you monitorable summaries of your unstructured data. Projections are computed on Gantry’s infrastructure after the data is logged, so they don’t slow down inference, and you can iterate on them without redeploying your model.

For example, suppose you are building a machine learning application to classify support tickets and want to detect changes in the text being passed to the model. To monitor the input text, you can add Length and Sentiment projections. You can use these projections like any other field to compute metrics, set alerts, or slice your data. If your application suddenly starts receiving many long support tickets with unusually negative sentiment, Gantry will let you know.
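To make the mapping concrete, here is a minimal sketch of the kind of text-to-scalar functions that Length and Sentiment projections correspond to. The function names and the tiny word list are illustrative assumptions, not Gantry’s built-in implementations:

# Illustrative only: toy stand-ins showing the text -> scalar mapping a projection performs.
# Gantry's built-in Length and Sentiment projections are computed on its own infrastructure.

NEGATIVE_WORDS = {"broken", "terrible", "refund", "angry", "worst"}  # assumed toy lexicon

def length_projection(text):
    # Scalar proxy for how long a support ticket is.
    return float(len(text))

def sentiment_projection(text):
    # Crude negative-word ratio as a stand-in for a real sentiment score.
    words = [w.strip(".,!?") for w in text.lower().split()]
    if not words:
        return 0.0
    return -sum(word in NEGATIVE_WORDS for word in words) / len(words)

print(length_projection("My order arrived broken and support was terrible."))     # 50.0
print(sentiment_projection("My order arrived broken and support was terrible."))  # negative score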

Gantry comes with a number of projections pre-defined. You can also define your own.

Configuring projections#

To add a projection, go to the projections sidebar. Click the + button and select the projection you want to add and the field(s) you want to compute it on. Note: adding a projection only affects new data you log to Gantry.

Create Projection

Define custom projections#

Gantry allows you to define your own metrics and projections to run on Gantry’s infrastructure.

To create a custom projection, you’ll need to provide (i) a projection definition, and (ii) some basic information about how to run it.

A projection definition is a Python function that takes in the model inputs and returns a scalar value. You’ll provide the function in a Python file, along with a YAML file that gives Gantry additional context on how to run it.

Example#

Suppose we want to use spaCy to detect whether an input contains a proper noun. We can define the following function:

import spacy

def contains_proper_noun(text):
    # Load the small English spaCy pipeline (for simplicity, reloaded on each call).
    nlp = spacy.load('en_core_web_sm')
    # Collect each token's part-of-speech tag.
    pos = [token.pos_ for token in nlp(text)]
    # Return 1.0 if any token is a proper noun (PROPN), 0.0 otherwise.
    if 'PROPN' in pos:
        return 1.0
    return 0.0
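
Before registering the function with Gantry, it can be worth sanity-checking it locally (this assumes the en_core_web_sm model has already been downloaded, for example with python -m spacy download en_core_web_sm):

# Quick local check of the projection on sample ticket text.
print(contains_proper_noun("Alice was charged twice for her order."))  # expected 1.0: "Alice" is tagged PROPN
print(contains_proper_noun("my package never arrived"))                # expected 0.0: no proper nouns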

Now we need to give Gantry some basic information about how to run the function, which we include in a YAML file. In particular we need to tell Gantry:

  • where the Python file, specified by entrypoint, is located relative to the YAML file

    • in this case custom_projections.py, so in the same directory

  • the name of the model, model_name, to compute the projection on

  • the arguments, specified by function_args, that the function will take

    • in this case the text field from the model inputs

  • the Python dependencies, specified by requirements, needed to run the function

    • in this case spacy, which is used to detect proper nouns in the text

Here is the full YAML:

proper_noun_detector:
  model_name: "my_model"
  projection_name: "Proper Noun detect"
  entrypoint: custom_projections.py
  function_name: contains_proper_noun
  function_args:
    - inputs.text
  requirements:
    - spacy

You can then register the custom projection with Gantry as follows, substituting in your API key:

gantry-cli projection register --filename custom-projections.yaml --api_key <YOUR API KEY>

Gantry will then use the YAML file to determine the custom projection’s dependencies, and build a Docker container to run it. Once your projection is registered with your Gantry instance, it will be computed on the model inputs as Gantry receives them, as well as on any reference data uploaded to Gantry.