Introduction
What is Graphsignal
Graphsignal is a production-scale inference profiling platform that helps engineers optimize AI performance across models, engines, GPUs, and other accelerators. It provides essential visibility across the inference stack, including:
- Continuous, high-resolution profiling timelines exposing operation durations and resource utilization across inference workloads.
- LLM generation tracing with per-step timing, token throughput, and latency breakdowns for major inference frameworks.
- System-level metrics for inference engines and hardware (CPU, GPU, accelerators).
- Error monitoring for device-level failures and inference errors.
- Inference telemetry for AI agents to identify bottlenecks and drive targeted improvements across the inference stack.
The name Graphsignal blends "graph" (the structure underlying inference) with "signal" (the telemetry and profiling data emitted during execution).
How it works
The Graphsignal Profiler runs as a sidecar process alongside your inference workload. It is started either with the graphsignal-run CLI or by calling graphsignal.watch() from Python.
The profiler sends recorded performance data to Graphsignal servers, where it is post-processed and made available for analysis at app.graphsignal.com.
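To make the sidecar idea concrete, here is a minimal sketch of the general pattern: one process launches a workload and records samples while it runs. It uses only the Python standard library and is not Graphsignal's implementation or API. The helper name run_with_observer, the sample format, and the polling interval are all assumptions made for illustration.

```python
import subprocess
import sys
import time


def run_with_observer(cmd, interval=0.05):
    """Launch cmd as a child process and record samples while it runs.

    Hypothetical helper for illustration only; not part of Graphsignal.
    """
    samples = []
    start = time.monotonic()
    proc = subprocess.Popen(cmd)
    # Poll until the workload exits, recording one sample per interval.
    while proc.poll() is None:
        samples.append({"elapsed_s": time.monotonic() - start})
        time.sleep(interval)
    return proc.returncode, samples


if __name__ == "__main__":
    # Stand-in workload: a short Python process that sleeps briefly.
    workload = [sys.executable, "-c", "import time; time.sleep(0.3)"]
    code, samples = run_with_observer(workload)
    print(f"exit code {code}, {len(samples)} samples recorded")
```

A real profiler would collect far richer data (operation timings, resource utilization) and upload it for post-processing rather than keep it in a local list. The sketch only shows that observation happens outside the workload's own code path.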
Getting started
- Sign up for an account.
- See the Quick Start guide to learn how to add Graphsignal to your application.