DeepSpeed Inference Monitoring

See the Quick Start guide to learn how to install and configure Graphsignal.

from transformers import pipeline
import deepspeed
import graphsignal

# Graphsignal is assumed to be configured as described in the Quick Start guide.

generator = pipeline(task="text-generation")

# Example DeepSpeed inference config; adjust dtype and parallelism to your setup.
config = {
    "tensor_parallel": {"tp_size": 1},
    "dtype": "fp16",
    "replace_with_kernel_inject": True
}

# Wrap the Hugging Face model with the DeepSpeed inference engine.
ds_engine = deepspeed.init_inference(generator.model, config=config)
generator.model = ds_engine.module

input_text = "DeepSpeed is"

# Trace the inference call and record its input and output.
with graphsignal.start_trace(endpoint='predict') as trace:
    trace.set_data('input', input_text)
    output = generator(input_text, do_sample=False, min_length=50, max_length=50)
    trace.set_data('output', output)
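To see what the trace captures without installing DeepSpeed or Graphsignal, the context-manager pattern above can be exercised with a minimal stand-in tracer. The `Trace` class, `start_trace` function, and dummy `generator` below are hypothetical sketches for illustration only, not the Graphsignal API:

```python
import time
from contextlib import contextmanager

class Trace:
    """Hypothetical stand-in for a trace object: holds named data payloads."""
    def __init__(self, endpoint):
        self.endpoint = endpoint
        self.data = {}
        self.duration = None

    def set_data(self, name, value):
        # Record a named payload (e.g. model input or output) on the trace.
        self.data[name] = value

@contextmanager
def start_trace(endpoint):
    # Time the traced block and expose the trace object inside it.
    trace = Trace(endpoint)
    start = time.monotonic()
    try:
        yield trace
    finally:
        trace.duration = time.monotonic() - start

# Dummy generator standing in for the Hugging Face pipeline.
def generator(text):
    return [{"generated_text": text + " ..."}]

with start_trace("predict") as trace:
    trace.set_data("input", "Hello")
    output = generator("Hello")
    trace.set_data("output", output)

print(trace.endpoint, sorted(trace.data))  # → predict ['input', 'output']
```

The real `graphsignal.start_trace` additionally reports the recorded data and timings to the Graphsignal dashboard; the structure of the calling code is the same.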


The DeepSpeed GPT Neo example illustrates a distributed inference use case.

Model Serving

Graphsignal provides built-in support for server applications. See the Model Serving guide for more information.