Research

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

CVPR

Authors

  • Song Han
  • Tianzhe Wang
  • Kuan Wang
  • Han Cai
  • Ji Lin
  • Zhijian Liu
  • Hanrui Wang
  • Yujun Lin

Published on

06/19/2020

We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that separately optimize the neural network architecture, pruning policy, and quantization policy, we design to optimize them in a joint manner. To deal with the larger design space it brings, we devise to train a quantization-aware accuracy predictor that is fed to the evolutionary search to select the best fit. Since directly training such a predictor requires time-consuming quantization data collection, we propose to use predictor-transfer technique to get the quantization-aware predictor: we first generate a large dataset of pairs by sampling a pretrained unified supernet and doing direct evaluation; then we use these data to train an accuracy predictor without quantization, further transferring its weights to train the quantization-aware predictor, which largely reduces the quantization data collection time. Extensive experiments on ImageNet show the benefits of this joint design methodology: the model searched by our method maintains the same level accuracy as ResNet34 8-bit model while saving 8x BitOps; we obtain the same level accuracy as MobileNetV2+HAQ while achieving 2x/1.3x latency/energy saving; the marginal search cost of joint optimization for a new deployment scenario outperforms separate optimizations using ProxylessNAS+AMC+HAQ by 2.3% accuracy while reducing 600x GPU hours and CO2 emission.

Please cite our work using the BibTeX below.

@InProceedings{Wang_2020_CVPR,
author = {Wang, Tianzhe and Wang, Kuan and Cai, Han and Lin, Ji and Liu, Zhijian and Wang, Hanrui and Lin, Yujun and Han, Song},
title = {APQ: Joint Search for Network Architecture, Pruning and Quantization Policy},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}
Close Modal