Research
2025
2024
SuperFedNAS: Cost-Efficient Federated Neural Architecture Search for On-Device Inference
18th European Conference on Computer Vision (ECCV 2024), Milano, Italy, Oct 2024
·
12 Jul 2024
·
arxiv:2301.10879
DεpS: Delayed ε-Shrinking for Faster Once-For-All Training
18th European Conference on Computer Vision (ECCV 2024), Milano, Italy, Oct 2024
·
09 Jul 2024
·
arxiv:2407.06167
2023
SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads
22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI'25), Philadelphia, USA, 2025
·
29 Dec 2023
·
arxiv:2312.16733
Hardware–Software Co-Design for Real-Time Latency–Accuracy Navigation in Tiny Machine Learning Applications
IEEE Micro
·
01 Nov 2023
·
doi:10.1109/MM.2023.3317243
ABKD: Graph Neural Network Compression with Attention-Based Knowledge Distillation
arXiv
·
25 Oct 2023
·
arxiv:2310.15938
Subgraph Stationary Hardware-Software Inference Co-Design
Proc. of Sixth Conference on Machine Learning and Systems (MLSys'23)
·
03 Jul 2023
·
arxiv:2306.17266
2022
UnfoldML: Cost-Aware and Uncertainty-Based Dynamic 2D Prediction for Multi-Stage Classification
Proc. of 36th Conference on Neural Information Processing Systems (NeurIPS'22)
·
31 Oct 2022
·
arxiv:2210.15056
Enabling Real-time DNN Switching via Weight-Sharing
Proc. of 2nd Architecture, Compiler, and System Support for Multi-model DNN Workloads Workshop
·
01 Jun 2022
2021
CompOFA: Compound Once-For-All Networks for Faster Multi-Platform Deployment
Proc. of International Conference on Learning Representations (ICLR'21)
·
27 Apr 2021
·
arxiv:2104.12642
2020
HOLMES: Health OnLine Model Ensemble Serving for Deep Learning Models in Intensive Care Units
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
·
20 Aug 2020
·
doi:10.1145/3394486.3403212