signSGD via Zeroth-Order Oracle





In this paper, we design and analyze a new zeroth-order (ZO) stochastic optimization algorithm, ZO-signSGD, which combines the advantages of gradient-free operation and signSGD. The latter requires only the sign information of gradient estimates, yet can achieve a convergence speed comparable to, or even better than, that of SGD-type algorithms. Our study shows that ZO-signSGD requires $\sqrt{d}$ times more iterations than signSGD, leading to a convergence rate of $O(\sqrt{d}/\sqrt{T})$ under mild conditions, where $d$ is the number of optimization variables and $T$ is the number of iterations. In addition, we analyze the effects of different types of gradient estimators on the convergence of ZO-signSGD, and propose two variants of ZO-signSGD that achieve a convergence rate of at least $O(\sqrt{d}/\sqrt{T})$. On the application side, we explore the connection between ZO-signSGD and black-box adversarial attacks in robust deep learning. Our empirical evaluations on the image classification datasets MNIST and CIFAR-10 demonstrate the superior performance of ZO-signSGD in generating adversarial examples from black-box neural networks.
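To make the idea concrete, here is a minimal NumPy sketch of a sign-based zeroth-order update. It is an illustration under stated assumptions, not the authors' implementation: the forward-difference estimator with random unit directions, the function names `zo_gradient` and `zo_signsgd`, the toy objective `quadratic`, and all parameter values (`mu`, `q`, `lr`, `steps`) are choices made here for the example.

```python
import numpy as np


def zo_gradient(f, x, mu=1e-3, q=10, rng=None):
    """Estimate the gradient of f at x using only function evaluations.

    Assumed estimator: average of q forward differences along random
    directions drawn uniformly from the unit sphere, scaled by d / mu.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    fx = f(x)
    g = np.zeros(d)
    for _ in range(q):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # normalize to a unit direction
        g += (f(x + mu * u) - fx) * u
    return (d / (mu * q)) * g


def zo_signsgd(f, x0, steps=500, lr=0.01, mu=1e-3, q=10, seed=0):
    """Sign-based descent driven by the zeroth-order gradient estimate."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        # Only the sign of each estimated gradient coordinate is used.
        x -= lr * np.sign(zo_gradient(f, x, mu, q, rng))
    return x


def quadratic(x):
    """Toy objective (hypothetical), minimized at the all-ones vector."""
    return float(np.sum((x - 1.0) ** 2))


x_final = zo_signsgd(quadratic, np.zeros(5))
```

Because every coordinate moves by exactly `lr` per step regardless of the estimate's magnitude, the iterate settles into a small neighborhood of the minimizer rather than converging exactly; a decaying step size would be the natural refinement.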

Please cite our work using the BibTeX below.

@inproceedings{liu2019signsgd,
  title={sign{SGD} via Zeroth-Order Oracle},
  author={Sijia Liu and Pin-Yu Chen and Xiangyi Chen and Mingyi Hong},
  booktitle={International Conference on Learning Representations},
  year={2019}
}