Research

Unsupervised Hierarchical Matching with Optimal Transport over Hyperbolic Spaces

AISTATS

Authors

  • David Alvarez-Melis
  • Youssef Mroueh, Tommi Jaakkola

Published on

08/28/2020

Categories

AISTATS

This paper focuses on the problem of unsupervised alignment of hierarchical data such as ontologies or lexical databases. This problem arises across areas, from natural language processing to bioinformatics, and is typically solved by appeal to outside knowledge bases and label-textual similarity. In contrast, we approach the problem from a purely geometric perspective: given only a vector-space representation of the items in the two hierarchies, we seek to infer correspondences across them. Our work derives from and interweaves hyperbolic-space representations for hierarchical data, on one hand, and unsupervised word-alignment methods, on the other. We first provide a set of negative results showing how and why Euclidean methods fail in this hyperbolic setting. We then propose a novel approach based on optimal transport over hyperbolic spaces, and show that it outperforms standard embedding alignment techniques in various experiments on cross-lingual WordNet alignment and ontology matching tasks.

This paper has been published at AISTATS 2020

Please cite our work using the BibTeX below.

@InProceedings{pmlr-v108-alvarez-melis20a,
  title = 	 {Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces},
  author =       {Alvarez-Melis, David and Mroueh, Youssef and Jaakkola, Tommi},
  booktitle = 	 {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1606--1617},
  year = 	 {2020},
  editor = 	 {Chiappa, Silvia and Calandra, Roberto},
  volume = 	 {108},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {26--28 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v108/alvarez-melis20a/alvarez-melis20a.pdf},
  url = 	 {http://proceedings.mlr.press/v108/alvarez-melis20a.html},
  abstract = 	 {This paper focuses on the problem of unsupervised alignment of hierarchical data such as ontologies or lexical databases. This problem arises across areas, from natural language processing to bioinformatics, and is typically solved by appeal to outside knowledge bases and label-textual similarity. In contrast, we approach the problem from a purely geometric perspective: given only a vector-space representation of the items in the two hierarchies, we seek to infer correspondences across them. Our work derives from and interweaves hyperbolic-space representations for hierarchical data, on one hand, and unsupervised word-alignment methods, on the other. We first provide a set of negative results showing how and why Euclidean methods fail in this hyperbolic setting. We then propose a novel approach based on optimal transport over hyperbolic spaces, and show that it outperforms standard embedding alignment techniques in various experiments on cross-lingual WordNet alignment and ontology matching tasks. }
}
Close Modal