On-Device Training Under 256KB Memory
Authors
- Song Han
- Chuang Gan
- Ji Lin
- Ligeng Zhu
- Wei-Ming Chen
- Wei-Chen Wang
Published on
12/04/2022
Abstract
On-device training enables a model to adapt to new data collected from sensors by fine-tuning a pre-trained model. Users can benefit from customized AI models without having to transfer the data to the cloud, protecting privacy. However, the training memory consumption is prohibitive for IoT devices, which have tiny memory resources. We propose an algorithm-system co-design framework to make on-device training possible with only 256KB of memory. On-device training faces two unique challenges: (1) the quantized graphs of neural networks are hard to optimize due to low bit-precision and the lack of normalization; (2) the limited hardware resources (memory and computation) do not allow full backpropagation. To cope with the optimization difficulty, we propose Quantization-Aware Scaling to calibrate the gradient scales and stabilize 8-bit quantized training. To reduce the memory footprint, we propose Sparse Update to skip the gradient computation of less important layers and sub-tensors. The algorithmic innovations are implemented by a lightweight training system, Tiny Training Engine, which prunes the backward computation graph to support sparse updates and offloads runtime auto-differentiation to compile time. Our framework is the first practical solution for on-device transfer learning of visual recognition on tiny IoT devices (e.g., a microcontroller with only 256KB SRAM), using less than 1/1000 of the memory of PyTorch and TensorFlow while matching the accuracy. Our study enables IoT devices not only to perform inference but also to continuously adapt to new data for on-device lifelong learning. A video demo can be found here.
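For intuition, here is a minimal NumPy sketch of the gradient-scaling rule behind Quantization-Aware Scaling, assuming symmetric per-tensor quantization scales; the function and variable names are ours, not part of the released code.

```python
import numpy as np

def qas_rescale(grad_w_bar, s_w, grad_b_bar=None, s_x=1.0):
    """Quantization-Aware Scaling (QAS) for one quantized layer (sketch).

    With int8 weights stored as W_bar = W / s_w, the chain rule gives
    grad(W_bar) = grad(W) * s_w, so the weight-to-gradient ratio of the
    quantized graph deviates from the fp32 graph by a factor of s_w**-2.
    Multiplying the gradient by s_w**-2 restores the fp32 ratio without
    introducing extra hyper-parameters; the int32 bias, stored as
    b_bar = b / (s_w * s_x), gets the analogous (s_w * s_x)**-2 correction.
    """
    grad_w = grad_w_bar * s_w ** -2
    grad_b = None if grad_b_bar is None else grad_b_bar * (s_w * s_x) ** -2
    return grad_w, grad_b

# Toy usage with illustrative magnitudes (scales are assumptions):
g_w, g_b = qas_rescale(np.array([1e-4, -2e-4]), s_w=0.01,
                       grad_b_bar=np.array([3e-4]), s_x=0.05)
```

Because the correction only depends on the quantization scales already stored with the model, it adds no tuning burden on top of standard SGD.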
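Likewise, a hedged PyTorch sketch of the layer-level half of Sparse Update: freeze everything, then re-enable gradients only for a small, offline-selected set of layers (the paper additionally updates only subsets of channels within some tensors, which Tiny Training Engine realizes by pruning the backward graph at compile time, beyond what this eager-mode sketch shows). The helper and layer names are illustrative.

```python
import torch.nn as nn

def configure_sparse_update(model: nn.Module, update_layers, bias_only_layers=()):
    """Approximate Sparse Update in an eager framework (sketch).

    `update_layers` / `bias_only_layers` stand in for the layer sets found
    by the paper's offline contribution analysis; requires_grad flags only
    skip gradient *computation* here, whereas Tiny Training Engine also
    removes the pruned backward ops from the graph entirely.
    """
    for param in model.parameters():
        param.requires_grad = False            # skip gradients everywhere
    for name, module in model.named_modules():
        if name in update_layers:              # full weight + bias update
            for param in module.parameters(recurse=False):
                param.requires_grad = True
        elif name in bias_only_layers:         # cheaper bias-only update
            if getattr(module, "bias", None) is not None:
                module.bias.requires_grad = True
```

A call such as `configure_sparse_update(model, update_layers={"blocks.15.conv"}, bias_only_layers={"blocks.14.conv"})` (hypothetical names) then leaves the vast majority of the backward pass, and its activation memory, unneeded.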
Please cite our work using the BibTeX below.
@inproceedings{lin2022ondevice,
  title={On-Device Training Under 256{KB} Memory},
  author={Ji Lin and Ligeng Zhu and Wei-Ming Chen and Wei-Chen Wang and Chuang Gan and Song Han},
  booktitle={Advances in Neural Information Processing Systems},
  editor={Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
  year={2022},
  url={https://openreview.net/forum?id=zGvRdBW06F5}
}