MLSys Papers - Short Notes
[2019 arXiv] Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
[2019 MLSys] BlueConnect: Decomposing All-Reduce for Deep Learning on Heterogeneous Network Hierarchy

[2020 MLSys] Blink: Fast and Generic Collectives for Distributed ML


[2021 ICML] Boosting the Throughput and Accelerator Utilization of Specialized CNN Inference Beyond Increasing Batch Size

[2021 arXiv] Synthesizing Collective Communication Algorithms for Heterogeneous Networks with TACCL

[2021 SC] Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines


[2022 OSDI] Looking Beyond GPUs for DNN Scheduling on Multi-Tenant Clusters



