The Missing Invariance Principle found — the Reciprocal Twin of Invariant Risk Minimization

NeurIPS

Authors

Dongsung Huh, Avinash Baidya

Published on

12/04/2022

Machine learning models often generalize poorly to out-of-distribution (OOD) data because they rely on features that are spuriously correlated with the label during training. Recently, Invariant Risk Minimization (IRM) was proposed to learn predictors that use only invariant features by conserving the feature-conditioned label expectation E_e[y|f(x)] across environments. However, more recent studies have demonstrated that IRM-v1, a practical version of IRM, can fail in various settings. Here, we identify a fundamental design flaw in the IRM formulation that causes this failure. We then introduce a complementary notion of invariance, MRI, based on conserving the label-conditioned feature expectation E_e[f(x)|y], which is free of this flaw. Further, we introduce a simplified, practical version of the MRI formulation called MRI-v1. We prove that for general linear problems, MRI-v1 guarantees invariant predictors given a sufficient number of environments. We also empirically demonstrate that MRI-v1 strongly outperforms IRM-v1 and consistently achieves near-optimal OOD generalization in image-based nonlinear problems.
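
To make the label-conditioned quantity E_e[f(x)|y] more concrete, the following is a minimal NumPy sketch that estimates it per environment and measures how much it varies across environments. It is only an illustration under stated assumptions: the function names, the variance-based gap, and the synthetic two-environment data are all hypothetical, and this is not the MRI-v1 objective, whose exact form is not given on this page.

```python
import numpy as np

def class_conditional_means(features, labels, classes):
    """Estimate E[f(x) | y = c] for each class c; returns (n_classes, n_features)."""
    return np.stack([features[labels == c].mean(axis=0) for c in classes])

def conditional_mean_gap(envs, classes=(0, 1)):
    """Illustrative discrepancy (an assumption, not the paper's penalty):
    variance across environments of E_e[f(x) | y], summed over classes and feature dims."""
    means = np.stack([class_conditional_means(f, y, classes) for f, y in envs])
    return float(means.var(axis=0).sum())

rng = np.random.default_rng(0)

def make_env(n, spurious_shift):
    """Hypothetical environment: column 0 tracks the label the same way everywhere,
    column 1 tracks it with an environment-dependent sign and scale."""
    y = rng.integers(0, 2, size=n)
    invariant = y + 0.1 * rng.standard_normal(n)
    spurious = spurious_shift * y + 0.1 * rng.standard_normal(n)
    return np.column_stack([invariant, spurious]), y

envs = [make_env(1000, 1.0), make_env(1000, -1.0)]

# Evaluate the gap separately for each feature column.
gap_invariant = conditional_mean_gap([(f[:, :1], y) for f, y in envs])
gap_spurious = conditional_mean_gap([(f[:, 1:], y) for f, y in envs])
print(f"invariant feature gap: {gap_invariant:.5f}")
print(f"spurious feature gap:  {gap_spurious:.5f}")
```

In this toy setup the gap is close to zero for the feature whose relationship to the label is shared across environments and large for the feature whose relationship flips, which is the kind of signal a label-conditioned invariance criterion would rely on.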

Please cite our work using the BibTeX below.

@inproceedings{
huh2022the,
title={The Missing Invariance Principle found -- the Reciprocal Twin of Invariant Risk Minimization},
author={Dongsung Huh and Avinash Baidya},
booktitle={Advances in Neural Information Processing Systems},
editor={Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
year={2022},
url={https://openreview.net/forum?id=zz0FC7qBpkh}
}