Unsupervised Hierarchical Matching with Optimal Transport over Hyperbolic Spaces
Authors
Authors
- David Alvarez-Melis
- Youssef Mroueh, Tommi Jaakkola
Authors
- David Alvarez-Melis
- Youssef Mroueh, Tommi Jaakkola
Published on
08/28/2020
Categories
This paper focuses on the problem of unsupervised alignment of hierarchical data such as ontologies or lexical databases. This problem arises across areas, from natural language processing to bioinformatics, and is typically solved by appeal to outside knowledge bases and label-textual similarity. In contrast, we approach the problem from a purely geometric perspective: given only a vector-space representation of the items in the two hierarchies, we seek to infer correspondences across them. Our work derives from and interweaves hyperbolic-space representations for hierarchical data, on one hand, and unsupervised word-alignment methods, on the other. We first provide a set of negative results showing how and why Euclidean methods fail in this hyperbolic setting. We then propose a novel approach based on optimal transport over hyperbolic spaces, and show that it outperforms standard embedding alignment techniques in various experiments on cross-lingual WordNet alignment and ontology matching tasks.
Please cite our work using the BibTeX below.
@InProceedings{pmlr-v108-alvarez-melis20a,
title = {Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces},
author = {Alvarez-Melis, David and Mroueh, Youssef and Jaakkola, Tommi},
booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
pages = {1606--1617},
year = {2020},
editor = {Chiappa, Silvia and Calandra, Roberto},
volume = {108},
series = {Proceedings of Machine Learning Research},
month = {26--28 Aug},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v108/alvarez-melis20a/alvarez-melis20a.pdf},
url = {http://proceedings.mlr.press/v108/alvarez-melis20a.html},
abstract = {This paper focuses on the problem of unsupervised alignment of hierarchical data such as ontologies or lexical databases. This problem arises across areas, from natural language processing to bioinformatics, and is typically solved by appeal to outside knowledge bases and label-textual similarity. In contrast, we approach the problem from a purely geometric perspective: given only a vector-space representation of the items in the two hierarchies, we seek to infer correspondences across them. Our work derives from and interweaves hyperbolic-space representations for hierarchical data, on one hand, and unsupervised word-alignment methods, on the other. We first provide a set of negative results showing how and why Euclidean methods fail in this hyperbolic setting. We then propose a novel approach based on optimal transport over hyperbolic spaces, and show that it outperforms standard embedding alignment techniques in various experiments on cross-lingual WordNet alignment and ontology matching tasks. }
}