Model Fusion with Kullback–Leibler Divergence

ICML

We propose a method to fuse posterior distributions learned from heterogeneous datasets. Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors and proceeds using a simple assign-and-average approach. The components of the dataset posteriors are assigned to the proposed global model components by solving a regularized variant of the assignment problem. The global components are then updated based on these assignments by their mean under a KL divergence. For exponential family variational distributions, our formulation leads to an efficient non-parametric algorithm for computing the fused model. Our algorithm is easy to describe and implement, efficient, and competitive with state-of-the-art on motion capture analysis, topic modeling, and federated learning of Bayesian neural networks.
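
The assign-and-average loop described above can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions that are not taken from the paper: each component is a univariate Gaussian given as a `(mean, variance)` pair, the number of global components is fixed (the paper's method is non-parametric), the assignment step is a plain brute-force one-to-one matching rather than the regularized variant, and the "mean under a KL divergence" is taken to be the minimizer of the sum of KL(local || global), which for Gaussians reduces to moment matching. All function names are hypothetical.

```python
import itertools
import math


def kl_gauss(mu_q, var_q, mu_p, var_p):
    """KL(q || p) between two univariate Gaussians."""
    return 0.5 * (math.log(var_p / var_q)
                  + (var_q + (mu_q - mu_p) ** 2) / var_p
                  - 1.0)


def assign(local, global_):
    """Brute-force one-to-one assignment of local to global components.

    Assumption: unregularized matching, feasible only for a handful of
    components; it stands in for the paper's regularized assignment problem.
    """
    best, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(global_)), len(local)):
        cost = sum(kl_gauss(*local[i], *global_[g]) for i, g in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best


def kl_mean(components):
    """Gaussian p minimizing sum_i KL(q_i || p): match first two moments."""
    n = len(components)
    mu = sum(m for m, _ in components) / n
    second_moment = sum(v + m * m for m, v in components) / n
    return mu, second_moment - mu * mu


def fuse(datasets, n_iters=10):
    """Alternate assignment and averaging for a fixed number of iterations."""
    global_ = list(datasets[0])  # assumption: initialize from the first dataset
    for _ in range(n_iters):
        groups = [[] for _ in global_]
        for local in datasets:
            for i, g in enumerate(assign(local, global_)):
                groups[g].append(local[i])
        # Keep the old component if nothing was assigned to it.
        global_ = [kl_mean(grp) if grp else old
                   for grp, old in zip(groups, global_)]
    return global_
```

For example, `fuse([[(0.0, 1.0), (5.0, 1.0)], [(5.2, 1.0), (0.2, 1.0)]])` matches the components that lie close together across the two datasets, even though they are listed in a different order, and returns one fused Gaussian per matched pair.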

This paper was published at ICML 2020.

Please cite our work using the BibTeX below.

@InProceedings{pmlr-v119-claici20a,
  title = 	 {Model Fusion with Kullback-Leibler Divergence},
  author =       {Claici, Sebastian and Yurochkin, Mikhail and Ghosh, Soumya and Solomon, Justin},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {2038--2047},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/claici20a/claici20a.pdf},
  url = 	 {http://proceedings.mlr.press/v119/claici20a.html},
  abstract = 	 {We propose a method to fuse posterior distributions learned from heterogeneous datasets. Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors and proceeds using a simple assign-and-average approach. The components of the dataset posteriors are assigned to the proposed global model components by solving a regularized variant of the assignment problem. The global components are then updated based on these assignments by their mean under a KL divergence. For exponential family variational distributions, our formulation leads to an efficient non-parametric algorithm for computing the fused model. Our algorithm is easy to describe and implement, efficient, and competitive with state-of-the-art on motion capture analysis, topic modeling, and federated learning of Bayesian neural networks.}
}