[2019 SOSP] ByteScheduler: A Generic Communication Scheduler for Distributed DNN Training Acceleration
Summary

Background & Motivation
Design & Implementation
Which layer should ByteScheduler be implemented in to make it more general?

Unified abstraction for communication tasks
Interaction with framework engines and crossing the global barrier


Auto-tuning partition size and credits using Bayesian Optimization

Comparisons with P3 and TicTac
Evaluation

Links & References