PROVEN: Verifying Robustness of Neural Networks with a Probabilistic Approach


Published on


With deep neural networks providing state-of-the-art machine learning models for numerous machine learning tasks, quantifying the robustness of these models has become an important area of research. However, most of the research literature merely focuses on the \textit{worst-case} setting where the input of the neural network is perturbed with noises that are constrained within an p ball; and several algorithms have been proposed to compute certified lower bounds of minimum adversarial distortion based on such worst-case analysis. In this paper, we address these limitations and extend the approach to a \textit{probabilistic} setting where the additive noises can follow a given distributional characterization. We propose a novel probabilistic framework PROVEN to PRObabilistically VErify Neural networks with statistical guarantees — i.e., PROVEN certifies the probability that the classifier’s top-1 prediction cannot be altered under any constrained p norm perturbation to a given input. Importantly, we show that it is possible to derive closed-form probabilistic certificates based on current state-of-the-art neural network robustness verification frameworks. Hence, the probabilistic certificates provided by PROVEN come naturally and with almost no overhead when obtaining the worst-case certified lower bounds from existing methods such as Fast-Lin, CROWN and CNN-Cert. Experiments on small and large MNIST and CIFAR neural network models demonstrate our probabilistic approach can achieve up to around 75% improvement in the robustness certification with at least a 99.99% confidence compared with the worst-case robustness certificate delivered by CROWN.

Please cite our work using the BibTeX below.

  doi = {10.48550/ARXIV.1812.08329},
  url = {},
  author = {Weng, Tsui-Wei and Chen, Pin-Yu and Nguyen, Lam M. and Squillante, Mark S. and Oseledets, Ivan and Daniel, Luca},
  keywords = {Machine Learning (cs.LG), Cryptography and Security (cs.CR), Machine Learning (stat.ML), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {PROVEN: Certifying Robustness of Neural Networks with a Probabilistic Approach},
  publisher = {arXiv},
  year = {2018},
  copyright = { perpetual, non-exclusive license}
Close Modal