Blog
2026
Agentic Workloads for Inference Evaluation
Why simple chat benchmarks fall short for evaluating inference performance, and how to model agentic workloads with branching, prefix reuse, bursty timing, token heterogeneity, and reproducible synthetic sessions.
2021
Introducing CompOFA
Fast & Efficient Training of Once-For-All (OFA) models.