# DeepSpeed Inference Monitoring
See the Quick Start guide for how to install and configure Graphsignal.
```python
from transformers import pipeline
import deepspeed
import graphsignal

# Graphsignal is assumed to be configured as described in the Quick Start guide.

generator = pipeline(task="text-generation")

# `config` holds the DeepSpeed inference configuration,
# e.g. dtype and tensor-parallelism settings.
ds_engine = deepspeed.init_inference(generator.model, config=config)
generator.model = ds_engine.module

input_text = "Example prompt"  # example input

with graphsignal.start_trace(endpoint='predict') as trace:
    trace.set_data('input', input_text)
    output = generator(input_text, do_sample=False, min_length=50, max_length=50)
    trace.set_data('output', output)
```
## Examples
The DeepSpeed GPT Neo example illustrates a distributed inference use case.
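As a sketch of how a distributed run is started, a traced inference script (saved as, say, `inference.py`, a hypothetical filename) can be launched across GPUs with the DeepSpeed launcher:

```shell
# Launch the inference script on 2 local GPUs;
# the DeepSpeed launcher spawns one process per GPU.
deepspeed --num_gpus 2 inference.py
```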
## Model serving
Graphsignal provides built-in support for server applications. See the Model Serving guide for more information.