Projections

Projections are computed columns used to help interpret unstructured data.

Projections overview

Projections are functions that take one or more columns from a record as inputs and append a new column containing the function's output. As the name suggests, they are mainly used to project higher-dimensional inputs and outputs onto values that can be monitored more tractably. Gantry comes with a number of predefined projections and also supports custom projections defined as Python functions. For example, the built-in word_count projection shows how many words are in the input to the model.
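
To make the idea concrete, the sketch below shows what a projection like word_count does conceptually: it reads one field from each record and appends the result as a new column. The record layout, the apply_projection helper, and the whitespace-based word count are assumptions for illustration, not Gantry's implementation.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated tokens in a string."""
    return len(text.split())


def apply_projection(records, func, input_field, output_field):
    """Return copies of the records with func(record[input_field]) appended."""
    return [{**r, output_field: func(r[input_field])} for r in records]


# Hypothetical logged records with a single text input field.
records = [
    {"inputs.text": "My order never arrived"},
    {"inputs.text": "Thanks"},
]
projected = apply_projection(records, word_count, "inputs.text", "word_count")
print([r["word_count"] for r in projected])  # [4, 1]
```

The original records are left untouched; the projected column lives alongside the logged fields, which is what makes it usable for monitoring without changing the model.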

Projections help make sense of unstructured data such as text and images. They can act as proxies for possible model quality issues by mapping fields to scalars. Projections are computed on Gantry’s infrastructure; they don't slow down inference and can be iterated on without model redeployment.

As an example, consider the following scenario:
Suppose we are building a machine learning application to classify support tickets and want to detect changes in the text being passed to the model. We can define summary values to monitor by adding Length and Sentiment projections, then use these projections to set alerts or slice our data. If our application suddenly starts receiving many long support tickets with unusually negative sentiment, Gantry will let us know.
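
This scenario can be sketched in plain Python. The two functions below are simplified stand-ins (a character count and a tiny word-list sentiment score), and the thresholds and the should_alert helper are hypothetical. Gantry's own Length and Sentiment projections and its alerting are configured in the product rather than written like this.

```python
from statistics import mean

NEGATIVE_WORDS = {"broken", "refund", "angry", "terrible", "worst"}  # toy lexicon


def length(text: str) -> int:
    """Number of characters in the ticket."""
    return len(text)


def sentiment(text: str) -> float:
    """Toy score in [-1, 0]: minus the fraction of words found in the lexicon."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    return -sum(w in NEGATIVE_WORDS for w in words) / len(words)


def should_alert(texts, max_mean_length=80, min_mean_sentiment=-0.2):
    """Flag a non-empty batch of tickets that is long and negative on average."""
    return (
        mean(length(t) for t in texts) > max_mean_length
        and mean(sentiment(t) for t in texts) < min_mean_sentiment
    )


quiet = ["Thanks, all good.", "How do I reset my password?"]
angry = [
    "Terrible, broken, worst! Refund refund refund, angry angry, "
    "terrible broken worst refund, I am angry about this."
]
print(should_alert(quiet), should_alert(angry))  # False True
```

Because each projection reduces a ticket to a single number, the alert condition is an ordinary comparison on averages rather than anything specific to text.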

Note: By default, a newly added projection is computed only on new data. To run it on existing historical data as well, trigger a backfill (click "backfill all projections" in the UI).

Built-in Projections

Projections can be added in the projection sidebar. Click the + button, select the projection to add, and define it with the field(s) on which it should be computed.

Custom Projections

There are two parts to a custom projection:

  1. A projection function defined in a Python file that takes one or more fields from a logged record and returns a single output of any supported type.
  2. A YAML config file that contains some basic information about how to package and run the projection function.

Are you interested in defining custom projections in other languages? Let us know!

Step 1: Prerequisites
Create a directory for your projection, for example:

mkdir my-custom-projection

Step 2: Write the Projection
Create a Python file named custom_projections.py in your projection directory. The example below uses spaCy to detect whether an input contains a proper noun:

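import spacy

# Load the model once at import time so it is reused across calls.
nlp = spacy.load('en_core_web_sm')

def contains_proper_noun(text):
    """Return 1.0 if the text contains a proper noun, otherwise 0.0."""
    # any() stops at the first token tagged as a proper noun (PROPN).
    if any(token.pos_ == 'PROPN' for token in nlp(text)):
        return 1.0
    return 0.0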
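
Before submitting a projection, it can help to call the function locally on a few sample inputs and confirm that it returns a float, the output type used in this guide's example. The sketch below does this for a hypothetical dependency-free projection, uppercase_ratio (not part of Gantry), so that it runs without spaCy installed. The same kind of check applies to contains_proper_noun.

```python
def uppercase_ratio(text: str) -> float:
    """Fraction of alphabetic characters that are uppercase (0.0 if no letters)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


samples = ["WHERE IS MY ORDER", "where is my order", "1234"]
results = [uppercase_ratio(s) for s in samples]

# Every result should be a float, including the edge case with no letters.
assert all(isinstance(r, float) for r in results)
print(results)  # [1.0, 0.0, 0.0]
```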

Step 3: Configure the Projection
Create a YAML file config.yaml in your projection directory:

version: 1
function_definition:
  projection_name: proper_noun_detection
  entrypoint: custom_projections.py
  function_name: contains_proper_noun
  output:
    type: Float
  inputs:
    - type: Text
  requirements:
    - spacy # optionally pin a version compatible with the 3.4.x model below, e.g. spacy==3.4.4
    - https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.4.1/en_core_web_sm-3.4.1.tar.gz
lambda_definition:
  runtime_lang: python
  runtime_version: "3.9"
  memory_size: 256

The function_definition section tells Gantry to create a custom projection called proper_noun_detection for the latest version of the application (my-awesome-app in this example), using the function contains_proper_noun found in the custom_projections.py file from Step 2.

The projection takes one input of type Text from my-awesome-app and outputs a scalar of type Float.

The additional Python libraries needed to build and run the function are listed under requirements.
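
A quick consistency check between the config and the function can catch mistakes before a build. The sketch below mirrors the function_definition section as a Python dict and compares the number of declared inputs with the number of parameters of the function. It assumes that declared inputs map one-to-one onto function parameters, which the example suggests but this page does not state. check_inputs is a hypothetical helper, and the function body is a stub standing in for the spaCy version.

```python
import inspect

# Mirror of the function_definition section above, written as a dict.
function_definition = {
    "projection_name": "proper_noun_detection",
    "entrypoint": "custom_projections.py",
    "function_name": "contains_proper_noun",
    "output": {"type": "Float"},
    "inputs": [{"type": "Text"}],
}


def contains_proper_noun(text):
    """Stub with the same signature as the Step 2 function."""
    return 0.0


def check_inputs(definition, func):
    """Return True if the function name and parameter count match the config."""
    n_params = len(inspect.signature(func).parameters)
    return (
        func.__name__ == definition["function_name"]
        and n_params == len(definition["inputs"])
    )


print(check_inputs(function_definition, contains_proper_noun))  # True
```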

Note that the lambda_definition section is optional. If it is not provided in the config file, Gantry will automatically assign the following defaults:

lambda_definition:
  runtime_lang: python # currently only python is supported
  runtime_version: "3.9"
  memory_size: 256 # memory limit in MB
  timeout: 5 # execution timeout in seconds
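
The effect of these defaults can be pictured as a dictionary merge. The sketch below is an illustration only: it assumes that a key omitted from a partial lambda_definition also falls back to its default, which this page states only for the case where the whole section is missing. resolve_lambda_definition is a hypothetical helper, not part of Gantry.

```python
DEFAULTS = {
    "runtime_lang": "python",
    "runtime_version": "3.9",
    "memory_size": 256,  # memory limit in MB
    "timeout": 5,        # execution timeout in seconds
}


def resolve_lambda_definition(user_definition=None):
    """Overlay user-supplied settings on the documented defaults."""
    return {**DEFAULTS, **(user_definition or {})}


print(resolve_lambda_definition())                      # all defaults
print(resolve_lambda_definition({"memory_size": 512}))  # one override
```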

Step 4: [Optional] Install private packages
If the custom projection relies on private packages that can only be accessed locally, they must be installed in the extra_deps directory before the custom projection is submitted:

cd my-custom-projection
pip install --target ./extra_deps requests --index-url https://your-private-pypi/

Step 5: Submit custom projection

Use the CLI to submit the custom projection definition to Gantry.

GANTRY_API_KEY=$MY_API_KEY gantry-cli projection create --projection-dir my-custom-projection/

The build process, including any error messages, can be monitored from the console. Once the build succeeds, the custom projection function appears in the Add projection list and can be added like any other projection.
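
The submission can also be scripted, for example from a CI job. The sketch below checks that the projection directory contains the files used in this guide and then builds the same gantry-cli command shown above. The helper names are hypothetical; only the command and the GANTRY_API_KEY variable come from this page.

```python
import os
import subprocess
from pathlib import Path


def build_submit_command(projection_dir):
    """Return the gantry-cli invocation shown above as an argument list."""
    return ["gantry-cli", "projection", "create", "--projection-dir", str(projection_dir)]


def missing_files(projection_dir, required=("config.yaml", "custom_projections.py")):
    """List the expected files that are absent from the projection directory."""
    root = Path(projection_dir)
    return [name for name in required if not (root / name).is_file()]


def submit(projection_dir, api_key):
    """Run the CLI with the API key passed through the environment."""
    missing = missing_files(projection_dir)
    if missing:
        raise FileNotFoundError(f"missing from {projection_dir}: {missing}")
    env = {**os.environ, "GANTRY_API_KEY": api_key}
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    return subprocess.run(build_submit_command(projection_dir), env=env, check=True)
```

Calling submit("my-custom-projection/", api_key) runs the same command as the shell line above, after first failing fast if config.yaml or custom_projections.py is missing.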

