A practical workflow to debug production inference issues and optimize performance using Claude Code and Graphsignal debug context.
Production inference systems fail in ways that are hard to debug with generic logs and dashboards. The fastest path to root cause is to combine inference observability with an AI coding agent that can investigate the right time window, summarize profiles/errors/traces, and suggest concrete fixes.
This is the new way to debug, optimize, and troubleshoot production inference:
Install the Graphsignal debug skill for Claude Code:
git clone https://github.com/graphsignal/graphsignal-debug ~/.claude/skills/graphsignal-debug
Install and authenticate the CLI:
pip install graphsignal-debug
graphsignal-debug login
This enables Claude Code to run:
graphsignal-debug fetch --start <ISO_UTC> --end <ISO_UTC> --tags "env:prod"
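The fetch command takes an ISO 8601 UTC window. As a minimal sketch of how an agent (or a script) might construct that command for yesterday's 14:00-16:00 UTC incident window — the `fetch`, `--start`, `--end`, and `--tags` arguments come from the command above; the helper function itself is hypothetical:

```python
from datetime import datetime, timezone, timedelta
import shlex

def fetch_command(start: datetime, end: datetime, tags: str = "env:prod") -> str:
    """Build a graphsignal-debug fetch command line for a UTC time window.

    Hypothetical helper: it only assembles the command string shown in the
    docs above; it does not invoke the CLI.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # ISO 8601 UTC, e.g. 2024-06-01T14:00:00Z
    return " ".join([
        "graphsignal-debug", "fetch",
        "--start", start.strftime(fmt),
        "--end", end.strftime(fmt),
        "--tags", shlex.quote(tags),  # quote in case tags contain spaces
    ])

# Yesterday's incident window, 14:00-16:00 UTC
day = datetime.now(timezone.utc).date() - timedelta(days=1)
start = datetime(day.year, day.month, day.day, 14, 0, tzinfo=timezone.utc)
end = start + timedelta(hours=2)
print(fetch_command(start, end))
```

Pinning the window in UTC keeps the fetched context aligned with what dashboards and logs report, regardless of the machine's local timezone.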
When there is an incident, ask a direct question tied to a real time window.
Example prompt:
What was the cause of the spike yesterday? Fetch Graphsignal debug context for 14:00-16:00 UTC in production and summarize root cause with evidence.
A typical investigation flow: pin down the incident time window, have the agent fetch the Graphsignal debug context for that window, let it read profiles, errors, and traces together, and then ask it to summarize the root cause with supporting evidence and propose concrete fixes.
This approach is faster than manually pivoting across multiple dashboards because the agent can interpret profiles, errors, and traces together in one pass.
The same workflow applies to continuous optimization, not just incident response.
Use prompts like:
From those findings, optimize the stack in priority order, starting with the highest-impact changes.
Production inference optimization is not a one-time benchmark exercise. It is an always-on loop powered by observability plus AI-assisted debugging.
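One way to run that always-on loop is to generate a rolling daily review prompt for the agent. A minimal sketch, assuming a 24-hour lookback window; the helper name and prompt wording are illustrative, not part of the Graphsignal tooling:

```python
from datetime import datetime, timezone, timedelta
from typing import Optional

def daily_review_prompt(now: Optional[datetime] = None) -> str:
    """Build a Claude Code prompt covering the previous 24 hours.

    Hypothetical helper for a scheduled (e.g. daily cron) optimization
    review; the prompt text is an illustrative example.
    """
    now = now or datetime.now(timezone.utc)
    end = now.replace(minute=0, second=0, microsecond=0)  # round to the hour
    start = end - timedelta(hours=24)
    fmt = "%Y-%m-%dT%H:%MZ"
    return (
        f"Fetch Graphsignal debug context for {start.strftime(fmt)} to "
        f"{end.strftime(fmt)} in production, summarize the dominant latency "
        "and error contributors, and propose optimizations in priority order."
    )

print(daily_review_prompt())
```

Feeding a prompt like this to the agent on a schedule turns the incident-response workflow into the continuous optimization loop described above.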
See AI Debugging and Quick Start for setup details.