Blog

2026

Agentic Workloads for Inference Evaluation

March 17, 2026 April 12, 2026

Why simple chat benchmarks are not enough for inference performance evaluation, and how to model agentic workloads with branching, prefix reuse, bursty timing, token heterogeneity, and reproducible synthetic sessions.

2021

Introducing CompOFA

Alind Khare, Past PhD Student

Alexey Tumanov, Principal Investigator

April 28, 2021 April 12, 2026

Fast & Efficient Training of Once-For-All (OFA) models.