Cost and Usage Monitoring

Automatic cost monitoring

Graphsignal offers cost estimation for the OpenAI API, enabling users to aggregate and view costs by deployment or by any other tag, whether automatic, such as model and endpoint, or custom, such as user_id and session_id.

When using an API gateway or reverse proxy to route OpenAI API requests, provide the api_provider=openai or api_provider=azure tag in your application. Tags can also be set in an environment variable, at the request context level, or at the operation level.
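
As an illustration, tags can be set at the application and request-context levels as follows. This is a minimal sketch assuming the graphsignal.set_tag() and graphsignal.set_context_tag() helpers; the API key, deployment name, and tag values are placeholders. Operation-level tags are shown via the tags argument in the tracing examples below.

Python
import graphsignal

graphsignal.configure(api_key='my-api-key', deployment='chat-bot-prod')

# Application-level tag, e.g. when a gateway routes requests to OpenAI.
graphsignal.set_tag('api_provider', 'openai')

# Request-context-level tag, set per request, e.g. in a web handler.
graphsignal.set_context_tag('user_id', 'u-12345')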

When streaming is used for completion or chat requests, the Graphsignal tracer does not count prompt tokens by default, so such requests will be undercounted in cost metrics. To enable token counting for streaming, install the tiktoken package (pip install tiktoken); the tracer will then use it to count tokens.
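
For example, the following sketch traces a streamed chat completion, assuming Graphsignal's automatic instrumentation of the openai client and tiktoken being installed; the API key, deployment, model, and prompt are placeholders.

Python
import graphsignal
import openai

graphsignal.configure(api_key='my-api-key', deployment='chat-bot-prod')

client = openai.OpenAI()

# Streamed responses arrive in chunks without usage data; with tiktoken
# installed, the tracer can count tokens itself, keeping cost metrics accurate.
stream = client.chat.completions.create(
    model='gpt-4',
    messages=[{'role': 'user', 'content': 'Explain token counting.'}],
    stream=True)

for chunk in stream:
    print(chunk.choices[0].delta.content or '', end='')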

Manual usage recording

When manually tracing LLM calls or other operations, Graphsignal can record usage metrics at both the span and payload levels. Below is an example of recording usage data at the payload level:

Python
import graphsignal

with graphsignal.trace('my-generation', tags=dict(model_type='chat')) as span:
    ...
    # Attach the payload along with its token count and estimated cost.
    span.set_payload(
        name='input',
        content=input_data,
        usage=dict(
            token_count=my_token_count,
            cost_usd=my_token_count * token_price))

See Span.set_payload() for more information.
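
For instance, the my_token_count and token_price placeholders above could be derived with tiktoken, as in the sketch below; the per-token price is a hypothetical value, not a published rate.

Python
import tiktoken

input_data = 'Summarize the following text ...'

# Count tokens using the encoding for the target model.
encoding = tiktoken.encoding_for_model('gpt-4')
my_token_count = len(encoding.encode(input_data))

# Hypothetical per-token price in USD; substitute the current rate
# for your model and provider.
token_price = 0.00003
cost_usd = my_token_count * token_price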

And here is an example of span-level usage recording:

Python
import graphsignal

with graphsignal.trace('my-operation') as span:
    ...
    # Record a per-call cost directly on the span.
    span.set_usage('call_cost', price_per_call)

See Span.set_usage() for more information.

The usage and cost data will be presented as time series on the Metrics Dashboard, allowing for aggregation by tags. Additionally, usage details will be accessible within the traces.