Analyze production performance

Now that we've created an application, evaluated it, deployed it to production, and interacted with it, let's explore how to use Gantry to analyze model behavior and use production data to improve performance.

The analytics view

Navigate to the analytics view in the dashboard.

The analytics view displays metrics and stats regarding application performance.

It includes metrics logged directly to Gantry (such as latency and spend), as well as derived metrics, called Projections, such as completion length, sentiment, toxicity, fluidity, language, and coverage.
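Gantry computes Projections for you, but conceptually a derived metric is just a function applied to each logged record. A minimal sketch in plain Python, using completion length as the example (the record shape and field names here are illustrative assumptions, not Gantry's actual schema):

```python
# Hypothetical logged records; field names are illustrative, not Gantry's schema.
records = [
    {"inputs": {"prompt": "Summarize the report."},
     "outputs": {"completion": "The report covers Q3 revenue."}},
    {"inputs": {"prompt": "Translate to French: hello"},
     "outputs": {"completion": "bonjour"}},
]

def completion_length(record: dict) -> int:
    """A derived metric: word count of the model's completion."""
    return len(record["outputs"]["completion"].split())

# Compute the projection for every record, as an analytics view might.
lengths = [completion_length(r) for r in records]
print(lengths)  # → [5, 1]
```

The same pattern applies to the other Projections: each is a per-record function whose values can then be charted and filtered in the dashboard.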

To change the time range or add comparison data to the charts, use the controls in the upper left. Data can be filtered by clicking "Filter" or by clicking any bar in a chart.

Next, let's look at the underlying data that has been logged.

Production logs

Click "View data" or scroll down to see the examples that have been logged as users interact with the model.

Suppose we discover some examples for which the model doesn't perform as expected. That means it's time to iterate on our evaluation set.

Select the rows, click "Add to dataset", and choose our existing dataset (or create a new one).

Now, every time we change our application and run an evaluation, that evaluation will run on the production data that caused problems in the past. No more one-step-forward, two-steps-back prompt development!
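Once the flagged production examples are in the dataset, every evaluation run covers them alongside the original cases. A minimal sketch of that loop in plain Python (the dataset shape, `model`, and `evaluate` are hypothetical stand-ins, not Gantry APIs):

```python
# Hypothetical evaluation set: the original cases plus rows added from production.
original_cases = [{"prompt": "2+2?", "expected": "4"}]
flagged_from_production = [{"prompt": "2+2*2?", "expected": "6"}]
eval_set = original_cases + flagged_from_production

def model(prompt: str) -> str:
    """Stand-in for the deployed application; replace with a real model call."""
    return "4" if prompt == "2+2?" else "6"

def evaluate(cases: list) -> float:
    """Fraction of cases where the model output matches the expectation."""
    passed = sum(model(c["prompt"]) == c["expected"] for c in cases)
    return passed / len(cases)

# Every future run now re-checks the examples that failed in production.
score = evaluate(eval_set)
print(score)  # → 1.0
```

Because the eval set only grows, a prompt change that regresses on a past production failure shows up immediately in the score.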

What’s Next

That's the basics of using Gantry to build, evaluate, deploy, and analyze LLM-powered applications. Learn more in the in-depth guides.