The System for AI Lab (SAIL) at Georgia Tech, led by Prof. Alexey Tumanov, specializes in advancing systems support and resource management for machine learning (ML) to democratize large-scale AI systems. Our research encompasses the entire AI infrastructure stack, from foundational system design to the development of efficient ML training and inference algorithms. By focusing on managing the complete ML lifecycle, SAIL aims to enhance accessibility and efficiency in AI technologies.
Recent News
- Our papers on Medha for efficient multi-million context LLM inference and Maya for optimizing deep learning training workloads are now public.
- Congratulations to Prof. Tumanov on being awarded tenure and promotion to the position of associate professor 🎉.
- RocketKV 🚀, fast long-context inference with KV-cache compression, is now on arXiv.
- Mnemosyne, our paper on efficient inference at context lengths of up to 10M tokens, is now public.
- DynaQuant, our paper on a DL training checkpoint compression system, has been accepted at SoCC’24.
- SuperServe has been accepted at NSDI’25.
- Metron 📐, our LLM inference system benchmark, is now public.
- DεpS and SuperFedNAS have been accepted at ECCV’24.
- Congratulations to Prof. Alexey Tumanov on receiving the College of Computing Outstanding Junior Faculty Teaching Award.
- Sarathi-Serve ☸️, our paper on efficient LLM inference, has been accepted at OSDI’24.
- Vidur 👳🏽, our paper on large-scale LLM inference cluster simulation, has been accepted at MLSys’24.
- Payman Behnam awarded NVIDIA Graduate Fellowship 2024 for advancing machine learning and systems with high-performance, low-latency, and energy-efficient hardware designs.
- Payman Behnam receives Qualcomm Innovation Fellowship 2023 for his work on Hardware-Software Co-Design for DNN inference systems.
- Amey Agrawal secures CRNCH PhD Fellowship 2023 for his research in LLM inference.