Natively supported frameworks and technologies
Machine learning profiler for full visibility
Benchmark training speed
Compare changes in parameters, speed and compute across runs.
Analyze time distribution
Understand where the most time and resources are spent.
Analyze operation and kernel statistics
Understand which ML operations and compute kernels consume the most time.
See detailed device utilization
Make sure all CPUs and GPUs are utilized as expected.
Monitor speed and usage metrics
Monitor run metrics to catch issues such as memory leaks.
Get visibility into distributed workloads
See all distributed training or inference performance data in one place.
Enable team access
Easily share improvements with team members and others.
Ensure data privacy
Keep data private. No code or data is sent to the Graphsignal cloud; only run statistics and metadata are.
Read more about ML profiling
Benchmarking, Profiling and Monitoring PyTorch Lightning With Graphsignal
Learn how to benchmark, profile and monitor PyTorch Lightning training using Graphsignal.
Benchmarking and Profiling Hugging Face Training With Graphsignal
Learn how to monitor, benchmark and profile Hugging Face training using Graphsignal.