Deploy Application

After we're satisfied with the performance of our Application, we can deploy it and use it in our product. In this section, we'll deploy our grammar correction application and interact with it using a simple Gradio UI.

Deploy your Application

From the evaluation report in the previous section, we can deploy our version by clicking "Deploy Version 1" at the bottom of the report.

We can also deploy our application by navigating to the Versions section.

Click the action menu for the version you want to deploy to show the deployment options.

📘

Understanding deployment in Gantry

Gantry provides two application deployment environments: prod and test.

When an application is deployed to one of those environments, the application version instantly becomes available in the namespace for that environment.

Using the Gantry API and SDK, we can call the latest prod and test endpoints directly, which will fill out the prompt template and forward our request to OpenAI. This option is meant for small deployments, like testing or internal apps.

For production-scale applications, it is recommended to pull down the information that describes the version and use it to call OpenAI directly.

In either case, when an application is deployed, the version tagged prod or test is instantly updated and available to endpoints, without needing to change production code.
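
As a rough illustration of the production-scale pattern, the sketch below fills a prompt template locally, which is the step that would come just before calling OpenAI directly. The `version_config` dictionary, its field names, the model name, and the `{{user_input}}` placeholder syntax are all assumptions made for illustration; the real shape of the version information comes from the Gantry SDK and is not shown in this guide.

```python
import re

# Hypothetical version description; in practice this would be pulled from Gantry.
# Field names, the model name, and the {{...}} placeholder syntax are assumptions.
version_config = {
    "model": "example-model",
    "prompt": "Correct the grammar of the following text:\n\n{{user_input}}",
    "params": {"temperature": 0.0},
}

# Matches placeholders such as {{user_input}} or {{ user_input }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_prompt(template, values):
    """Replace each {{name}} placeholder in the template with values[name]."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing prompt value: {name}")
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)


prompt = fill_prompt(version_config["prompt"], {"user_input": "She go to school."})
print(prompt)
```

The filled prompt, together with the model name and parameters from the version description, would then be passed to the OpenAI client of your choice. Raising on a missing value keeps a half-filled template from being sent by accident.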

Click "Deploy to prod".

Interacting with the deployed application

To make it easy to interact with our grammar correction app, we'll run a simple Gradio UI.

First, install Gradio:

pip install gradio

Then, copy the following code and save it as a new Python file called app.py:

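import os
import gradio as gr
import requests

# Name of the Gantry Application to call; replace with your own.
APP_NAME = "my-app"
# The API key is read from the environment rather than hard-coded.
API_KEY = os.getenv("GANTRY_API_KEY")

def generate(text):
    # Call the completions endpoint for the version deployed to prod.
    api_url = "https://app.gantry.io"
    url = api_url + "/api/v1/applications/{}/envs/prod/completions".format(APP_NAME)
    headers = {
        "accept": "application/json",
        "content-type": "application/json",
        "X-Gantry-Api-Key": API_KEY
    }
    # The text is passed as the user_input value of the prompt template.
    payload = {"prompt_values": {"user_input": text}}
    response = requests.post(url, json=payload, headers=headers)
    return response.json()["choices"][0]["text"]

demo = gr.Interface(
    fn=generate,
    # gr.Textbox replaces gr.inputs.Textbox, which newer Gradio releases no longer provide.
    inputs=gr.Textbox(lines=5, label="Input Text"),
    outputs="text",
    allow_flagging="never"
)

demo.launch(share=False)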
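
If a request fails, `generate` as written raises a `KeyError` that hides the real cause. The sketch below is one possible hardening, not part of the Gantry SDK: it separates building the request from parsing the response so that each step can fail with a clearer message. The helper names `build_request` and `extract_text`, the 30-second timeout, and the idea that other environments follow the same URL pattern as prod are assumptions.

```python
import os

API_URL = "https://app.gantry.io"


def build_request(app_name, text, api_key, env="prod"):
    """Return (url, headers, payload) for the completions endpoint."""
    if not api_key:
        raise ValueError("GANTRY_API_KEY is not set")
    # The prod path is taken from app.py; reusing it for other envs is an assumption.
    url = f"{API_URL}/api/v1/applications/{app_name}/envs/{env}/completions"
    headers = {
        "accept": "application/json",
        "content-type": "application/json",
        "X-Gantry-Api-Key": api_key,
    }
    payload = {"prompt_values": {"user_input": text}}
    return url, headers, payload


def extract_text(body):
    """Pull the completion text out of a parsed JSON response body."""
    try:
        return body["choices"][0]["text"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"unexpected response body: {body!r}") from exc


def generate(text, app_name="my-app"):
    """Drop-in alternative to generate() in app.py with clearer failures."""
    import requests  # imported here so the helpers above work without it

    url, headers, payload = build_request(app_name, text, os.getenv("GANTRY_API_KEY"))
    response = requests.post(url, json=payload, headers=headers, timeout=30)
    response.raise_for_status()  # surface HTTP errors instead of a KeyError
    return extract_text(response.json())
```

Because `build_request` and `extract_text` make no network calls, they can be checked on their own before the UI is started.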

Make sure your API key is saved in your environment as GANTRY_API_KEY. You can check by running:

env | grep GANTRY_API_KEY

Now run the app:

gradio app.py

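We can now interact with the version deployed to prod in the browser by visiting http://127.0.0.1:7861. If the port differs on your machine, use the local URL that Gradio prints in the terminal when it starts.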

🚧

If you run into trouble interacting with your app, make sure:

  1. You are using the correct name for your Gantry Application
  2. You have a version in your Application deployed to production
  3. You're using a valid Gantry API key
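
The first and third checks can be partly automated before any request is sent. The sketch below is a hypothetical helper, not a Gantry feature: it only confirms that an application name and an API key are present and well-formed locally. Whether the key is valid and whether a version is deployed to prod can only be confirmed by the API itself.

```python
import os


def preflight(app_name, environ=os.environ):
    """Return a list of human-readable problems found before calling the API."""
    problems = []
    if not app_name or not app_name.strip():
        problems.append("APP_NAME is empty")
    key = environ.get("GANTRY_API_KEY", "")
    if not key.strip():
        problems.append("GANTRY_API_KEY is not set")
    elif key != key.strip():
        # A stray newline or space often sneaks in when a key is copied.
        problems.append("GANTRY_API_KEY has leading or trailing whitespace")
    return problems


# Prints nothing when both local checks pass.
for problem in preflight("my-app"):
    print("Problem:", problem)
```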

Try out your application on a few examples so you have some data for the next step.


What’s Next

Learn how to analyze the performance of your production app and use production data to iterate on your evaluation dataset.