Projections

Projections are computed columns mainly used to help interpret unstructured data.

Projections are essentially functions that take one or more columns from a record as inputs and append a new column containing the function's output. They are mainly used for projecting (hence the name) higher-dimensional inputs and outputs to scalars that can be monitored more tractably. For example, the word_count projection gives a feel for how long the outputs produced by a model are.
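The idea can be sketched in plain Python. This is only an illustration, not Gantry's implementation: the record layout, the field names, and the apply_projection helper are assumptions made up for the example.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated tokens in a piece of text."""
    return len(text.split())


def apply_projection(records, func, input_fields, output_field):
    """Return copies of the records with one extra column holding func's output.

    Hypothetical helper: it only mimics "append a new column" on a list of dicts.
    """
    projected = []
    for record in records:
        args = [record[field] for field in input_fields]
        projected.append({**record, output_field: func(*args)})
    return projected


# Made-up records standing in for logged data.
records = [
    {"inputs.prompt": "Summarize this ticket", "outputs.text": "The customer cannot log in."},
    {"inputs.prompt": "Summarize this ticket", "outputs.text": "Refund requested."},
]

with_counts = apply_projection(records, word_count, ["outputs.text"], "word_count")
```

Each record in with_counts keeps its original fields and gains a scalar word_count column, which is far easier to chart or alert on than the raw text.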


Motivation

Unstructured data such as text and images can be hard to analyze in raw form. Projections map those fields to scalars that can act as proxies for possible model quality issues. They are computed on Gantry’s infrastructure after the data is logged, so they don’t slow down inference and you can iterate on them without redeploying your model.

For example, suppose you are building a machine learning application to classify support tickets and want to detect changes to the text being passed in to the model. Projections allow you to define monitorable summaries of the support tickets. To monitor the input text, you can add Length and Sentiment projections. You can then use these projections like you would any other field to compute metrics, set alerts, or slice your data. If your application suddenly starts receiving many long support tickets with unusually negative sentiment, Gantry will let you know.

Gantry comes with a number of projections pre-defined. You can also define your own custom Python function as a projection.

Built-in Projections

To add a projection, go to the projections sidebar. Click the + button and select the projection you want to add and the field(s) you want to compute it on.


You can choose from the many built-in projections to add a column to your data table.

Note: When you add a new projection, it is used to process new data you log to Gantry. You can optionally trigger a backfill to run the projection on existing historical data.

Custom Projections

To create a custom projection, you’ll need to provide (i) a projection function, and (ii) a YAML config file that contains some basic information about how to package and run the projection function.

A projection function is a Python function, defined in a Python file, that takes fields from a logged record and returns a single scalar value. The scalar output can be of type float, int, or str.
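Before packaging a function, it can help to confirm locally that it really returns one of the allowed scalar types. The sketch below is an assumption-laden illustration: check_scalar_output and the sample functions are hypothetical names invented here and are not part of Gantry's SDK or CLI.

```python
ALLOWED_TYPES = (float, int, str)


def check_scalar_output(func, *sample_args):
    """Call func on sample arguments and verify the result is a float, int, or str."""
    value = func(*sample_args)
    # Note: bool is a subclass of int in Python, so True/False also pass this check.
    if not isinstance(value, ALLOWED_TYPES):
        raise TypeError(
            f"{func.__name__} returned {type(value).__name__}; expected float, int, or str"
        )
    return value


def char_count(text):
    return len(text)  # int output


def uppercase_ratio(text):
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)  # float output


def first_word(text):
    words = text.split()
    return words[0] if words else ""  # str output


def tokens(text):
    return text.split()  # list output: not a single scalar
```

Calling check_scalar_output(char_count, "Hello World") returns the value unchanged, while check_scalar_output(tokens, "Hello World") raises a TypeError because a list is not a scalar.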

Are you interested in defining custom projections in other languages? Let us know!

Step 1: Prerequisites

Create a directory for your projection, for example:

mkdir my-custom-projection

Step 2: Write the Projection

Suppose we want to use spaCy to detect whether an input contains a proper noun. Create a Python file in your projection directory with the projection function, for example custom_projections.py:

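import spacy

# Load the model once at import time so every call reuses it.
nlp = spacy.load('en_core_web_sm')

def contains_proper_noun(text):
    """Return 1.0 if the text contains a proper noun, otherwise 0.0."""
    # 'PROPN' is the coarse-grained part-of-speech tag for proper nouns.
    if any(token.pos_ == 'PROPN' for token in nlp(text)):
        return 1.0
    return 0.0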

Step 3: Configure the Projection

Create a YAML file config.yaml in your projection directory:

function_definition:
  application_name: my-awesome-app
  projection_name: proper_noun_detection
  entrypoint: custom_projections.py
  function_name: contains_proper_noun
  dtype: float  # optional, defaults to float
  function_args:
    - inputs.text
  function_arg_types: # optional
    - str
  requirements:
    - spacy
    - https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.1/en_core_web_sm-3.4.1.tar.gz
lambda_definition: # optional
  runtime_lang: python
  runtime_version: "3.9"
  memory_size: 256

The function_definition section tells Gantry to create a custom projection called proper_noun_detection for the latest version of the my-awesome-app application, using the function contains_proper_noun found in the custom_projections.py Python file defined in step 2.

The projection takes one input, inputs.text, of type str from my-awesome-app, and outputs a scalar of type float.

The additional Python libraries needed to build and run the function are listed under requirements.
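Mismatches between the config and the function, such as a misspelled function_name, are easy to introduce. The sketch below shows one way such mistakes could be caught locally before submitting. It is not a Gantry feature: find_config_problems is a hypothetical helper, the config is written as a plain dict mirroring the function_definition section (a YAML parser would be needed to load config.yaml itself), and the set of accepted dtype values is assumed from the scalar types listed earlier.

```python
import inspect

# Assumed from the documented scalar output types; not an official list.
ALLOWED_DTYPES = {"float", "int", "str"}


def find_config_problems(function_definition, func):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []

    if function_definition.get("function_name") != func.__name__:
        problems.append("function_name does not match the Python function")

    args = function_definition.get("function_args", [])
    n_params = len(inspect.signature(func).parameters)
    if len(args) != n_params:
        problems.append(f"{len(args)} function_args given for {n_params} parameters")

    arg_types = function_definition.get("function_arg_types")
    if arg_types is not None and len(arg_types) != len(args):
        problems.append("function_arg_types and function_args differ in length")

    # dtype is optional and defaults to float.
    if function_definition.get("dtype", "float") not in ALLOWED_DTYPES:
        problems.append("dtype must be float, int, or str")

    return problems


def contains_proper_noun(text):
    """Stand-in with the same signature as the step 2 function (no spaCy needed)."""
    return 0.0


function_definition = {
    "application_name": "my-awesome-app",
    "projection_name": "proper_noun_detection",
    "entrypoint": "custom_projections.py",
    "function_name": "contains_proper_noun",
    "dtype": "float",
    "function_args": ["inputs.text"],
    "function_arg_types": ["str"],
}

problems = find_config_problems(function_definition, contains_proper_noun)
```

For the config shown in this step, problems is empty. Changing function_name to a misspelled value, or adding a second entry to function_args, produces a non-empty list.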

Note that the lambda_definition section is optional. If it is not provided in the config file, we will use the default values:

lambda_definition:
  runtime_lang: python
  runtime_version: "3.9"
  memory_size: 256
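The fallback can be pictured as follows. This is a sketch of the documented behavior only, with resolve_lambda_definition as a hypothetical name. How Gantry treats a partially specified lambda_definition is not stated here, so the sketch returns a provided section as-is rather than merging it with the defaults.

```python
# Default values copied from the documentation above.
DEFAULT_LAMBDA_DEFINITION = {
    "runtime_lang": "python",
    "runtime_version": "3.9",
    "memory_size": 256,
}


def resolve_lambda_definition(config):
    """Return the lambda_definition from a parsed config, or the defaults if absent."""
    if "lambda_definition" in config:
        return config["lambda_definition"]
    # Return a copy so callers cannot mutate the shared defaults.
    return dict(DEFAULT_LAMBDA_DEFINITION)


# A parsed config that omits the optional section falls back to the defaults.
resolved = resolve_lambda_definition({"function_definition": {}})
```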

Step 4: [Optional] Install private packages

If you have any private packages that can only be accessed on your machine, you can install them to extra_deps before submitting the custom projection, like so:

cd my-custom-projection
pip install --target ./extra_deps requests --index-url https://your-private-pypi/

Step 5: Submit custom projection

Use the CLI to submit the custom projection definition to Gantry. Gantry will take it from here, building the function on its infrastructure.

GANTRY_API_KEY=$MY_API_KEY gantry-cli projection create --projection-dir my-custom-projection/

The console will show you the build progress and display any error messages.

Once the build succeeds, you should see the custom projection in the Gantry dashboard, under Projections for your application.