Lena Folk: Language model monitoring

Monitoring LLMs

I designed an interface for monitoring multiple language models side by side and tracking how much money is spent on each of them.

Role

Solo designer, working with a Product Owner

Duration

2 weeks

Platform

Web, B2B SaaS

The problem

When teams started integrating LLMs into their products, they immediately hit a visibility gap. There was no standard way to track what a model was doing, how it was performing, or what it was costing.

The challenge was designing a monitoring interface for a new category of software: one where the conventions hadn't been established yet.

Decisions

Cost as a first-class metric

Unlike traditional infrastructure components, LLMs have a direct per-request cost tied to token usage. I treated cost tracking not as a secondary detail but as a primary dashboard metric alongside latency and error rates. For teams running multiple models, understanding spend per model is as important as understanding performance.
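
To make the link between token usage and spend concrete, here is a minimal sketch of how per-model cost could be aggregated for a dashboard like this one. The model names, the prices, and the `Request` shape are all hypothetical and chosen for illustration only; real rates depend on the provider and model.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens (assumed values, not real rates).
PRICES_PER_1K = {
    "model-a": {"input": 0.50, "output": 1.50},
    "model-b": {"input": 0.10, "output": 0.30},
}


@dataclass
class Request:
    """One logged LLM call; field names are assumptions for this sketch."""
    model: str
    input_tokens: int
    output_tokens: int


def request_cost(req: Request) -> float:
    """Cost of a single request, derived from its token counts."""
    price = PRICES_PER_1K[req.model]
    return (
        req.input_tokens * price["input"]
        + req.output_tokens * price["output"]
    ) / 1000


def spend_per_model(requests: list[Request]) -> dict[str, float]:
    """Sum request costs per model, the figure shown next to latency and errors."""
    totals: dict[str, float] = defaultdict(float)
    for req in requests:
        totals[req.model] += request_cost(req)
    return dict(totals)


log = [
    Request("model-a", input_tokens=1000, output_tokens=2000),
    Request("model-b", input_tokens=2000, output_tokens=1000),
    Request("model-a", input_tokens=1000, output_tokens=0),
]
totals = spend_per_model(log)
```

Because each request carries its own cost, spend can be grouped by model in the same way as latency or error rate, which is what makes it usable as a top-level metric.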

Two views for chat history

Chat history serves two different needs. When an engineer is investigating a specific conversation, a dialogue view is far more readable. When they need to scan across many interactions, compare patterns, or find a specific exchange, a dense table view is faster. Both views exist, and the user chooses based on what they're trying to do.

Multi-model

The dashboard was designed to monitor multiple language models simultaneously, not just one. This was a deliberate structural choice: a monitoring tool that requires switching between separate views per model creates blind spots. Comparison across models is built into the default experience.

Chat history

Chat history as table