dstack Profiling, Tracing, and Monitoring

See the Quick Start guide for how to install and configure Graphsignal.

dstack runs inference workloads as services (for example SGLang on dstack). Use Graphsignal the same way as for a bare-metal SGLang server: install Graphsignal in the service image or at startup, pass GRAPHSIGNAL_API_KEY, and start the server through graphsignal-run.

For GPU profiling with SGLang on Linux, install the CUPTI extra matching your CUDA version: pip install graphsignal[cu12] (CUDA 12.x) or pip install graphsignal[cu13] (CUDA 13.x).
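For illustration, the version-to-extra mapping can be expressed as a small helper. This function is hypothetical (not part of the Graphsignal API); it only encodes the rule stated above, taking a CUDA version string such as the one reported by `nvcc --version`:

```python
def cupti_extra(cuda_version: str) -> str:
    """Return the Graphsignal pip extra matching a CUDA version string.

    Hypothetical helper for illustration; encodes the documented mapping:
    CUDA 12.x -> graphsignal[cu12], CUDA 13.x -> graphsignal[cu13].
    """
    major = int(cuda_version.split(".")[0])
    if major == 12:
        return "graphsignal[cu12]"
    if major == 13:
        return "graphsignal[cu13]"
    raise ValueError(f"No CUPTI extra known for CUDA {cuda_version}")


print(cupti_extra("12.4"))  # graphsignal[cu12]
```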

Graphsignal automatically instruments and profiles SGLang when it wraps the launch command.

What is captured

  • Profiling: SGLang millisecond-level operations.
  • Tracing: SGLang OTEL spans.
  • Metrics: SGLang Prometheus metrics.

Integration into a Python application

Call graphsignal.configure(...) in your app and run your workload normally.

import graphsignal

graphsignal.configure(api_key='my-api-key')
# or pass the API key via the GRAPHSIGNAL_API_KEY environment variable

dstack inference service (SGLang) with Graphsignal

The dstack SGLang example starts the server with python3 -m sglang.launch_server. Below is an equivalent configuration with Graphsignal installed at startup, the API key supplied via the environment, and the server started through graphsignal-run.

Set GRAPHSIGNAL_API_KEY in your dstack secrets or environment as you prefer; the service only needs the variable present at runtime.

type: service
name: deepseek-r1

image: lmsysorg/sglang:latest
env:
  - MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  - GRAPHSIGNAL_API_KEY

commands:
  - |
    pip install --no-cache-dir 'graphsignal[cu12]' && \
    graphsignal-run python3 -m sglang.launch_server \
      --model-path $MODEL_ID \
      --port 8000 \
      --trust-remote-code

port: 8000
model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B

resources:
  gpu: 24GB

Deploy with dstack apply, for example:

dstack apply -f service.dstack.yml

If you use the sglang CLI instead of launch_server, you can swap the wrapped command for graphsignal-run sglang serve ... (see the SGLang integration page).
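As a sketch, the commands section of the service above would then look like the following; the trailing arguments are elided here because the sglang CLI's flags may differ from launch_server's, so consult the SGLang integration page for the exact invocation:

```yaml
commands:
  - |
    pip install --no-cache-dir 'graphsignal[cu12]' && \
    graphsignal-run sglang serve ...
```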

Add Graphsignal to a custom Docker image used by dstack

If your image already includes Graphsignal (and optionally CUPTI), you only need env with GRAPHSIGNAL_API_KEY and a commands entry that wraps your server command in graphsignal-run; no pip install step is needed in commands.

If the image does not include Graphsignal, keep the pip install step before graphsignal-run as in the example above, or bake pip install 'graphsignal[cu12]' into the image for faster cold starts.
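A minimal Dockerfile for that second option might look like the following. This is a sketch, assuming the same lmsysorg/sglang:latest base image and CUDA 12.x as the service example above; adjust the base image tag and CUPTI extra to your deployment:

```dockerfile
# Sketch: bake Graphsignal into the SGLang image used by dstack.
FROM lmsysorg/sglang:latest

# Install Graphsignal with the CUPTI extra for CUDA 12.x so the
# service's commands section can go straight to graphsignal-run.
RUN pip install --no-cache-dir 'graphsignal[cu12]'
```

With this image, the commands entry in the dstack service reduces to the single graphsignal-run line.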