Authors

Published on

09/28/2021

Categories

This work develops a novel framework for communication-efficient distributed learning where the models to be learned are overparameterized. We focus on a class of kernel learning problems (which includes the popular neural tangent kernel (NTK) learning as a special case) and propose a novel {\it multi-agent kernel approximation} technique that allows the agents to distributedly estimate the full kernel function, and subsequently perform decentralized optimization, without directly exchanging any local data or parameters. The proposed framework is a significant departure from the classical consensus-based approaches, because the agents do not exchange problem parameters, and no consensus is required. We analyze the optimization and the generalization performance of the proposed framework for the ℓ2 loss. We show that with M agents and N total samples when certain generalized inner-product kernels (resp. the random features kernel) are used, each agent needs to communicate O(N2/M) bits (resp. O(NN/M) real values) to achieve minimax optimal generalization performance. We validate the theoretical results on 90 UCI benchmarking datasets (with average data size N≈1000) and show that each agent needs to share a total of 200N/M bits (resp. 3N/M real values) to closely match the performance of the centralized algorithms, and these numbers are independent of parameter and feature dimensions.

Please cite our work using the BibTeX below.

@inproceedings{ khanduri2022decentralized, title={Decentralized Learning for Overparameterized Problems: A Multi-Agent Kernel Approximation Approach}, author={Prashant Khanduri and Haibo Yang and Mingyi Hong and Jia Liu and Hoi To Wai and Sijia Liu}, booktitle={International Conference on Learning Representations}, year={2022}, url={https://openreview.net/forum?id=oj2yn1Q4Ett} }