Technical Program

Paper Detail

Paper IDL.6.3
Paper Title On Byzantine-Resilient High-Dimensional Stochastic Gradient Descent
Authors Deepesh Data, Suhas Diggavi, University of California, Los Angeles, United States
Session L.6: Gradient-Based Distributed Learning
Presentation Lecture
Track Statistics and Learning Theory
Manuscript  Click here to download the manuscript
Virtual Presentation  Click here to watch in the Virtual Symposium
Abstract We study stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. Building upon the recent advances in algorithmic high-dimensional robust statistics, in each SGD iteration, master employs a non-trivial decoding to estimate the true gradient from the unbiased stochastic gradients received from workers, some of which may be corrupt. We provide convergence analyses for both strongly-convex and non-convex smooth objectives under standard SGD assumptions. We can control the approximation error of our solution in both these settings by the mini-batch size of stochastic gradients; and we can make the approximation error as small as we want, provided that workers use a sufficiently large mini-batch size. Our algorithm can tolerate less than $\frac{1}{3}$ fraction of Byzantine workers. It can approximately find the optimal parameters in the strongly-convex setting {\em exponentially fast}, and reaches to an approximate stationary point in the non-convex setting with {\em linear speed}, i.e., with a rate of $\frac{1}{T}$, thus, matching the convergence rates of vanilla SGD in the Byzantine-free setting.

Plan Ahead


2021 IEEE International Symposium on Information Theory

11-16 July 2021 | Melbourne, Victoria, Australia

Visit Website!