Authors
- Yada Zhu
- Jinjun Xiong
- Lecheng Zheng
- Jingrui He
Published on
08/18/2022
With the advent of big data across multiple high-impact applications, we often face the challenge of complex heterogeneity. Newly collected data usually consist of multiple modalities and are characterized by multiple labels, thus exhibiting the coexistence of multiple types of heterogeneity. Although state-of-the-art techniques are good at modeling complex heterogeneity when sufficient label information is available, such label information can be quite expensive to obtain in real applications. Recently, researchers have paid great attention to contrastive learning due to its prominent performance in utilizing rich unlabeled data. However, existing work on contrastive learning is not able to address the problem of false negative pairs, i.e., some ‘negative’ pairs may have similar representations if they have the same label. To overcome these issues, in this paper, we propose a unified heterogeneous learning framework, which combines both the weighted unsupervised contrastive loss and the weighted supervised contrastive loss to model multiple types of heterogeneity. We first provide a theoretical analysis showing that the vanilla contrastive learning loss easily leads to a sub-optimal solution in the presence of false negative pairs, whereas the proposed weighted loss can automatically adjust the weight based on the similarity of the learned representations to mitigate this issue. Experimental results on real-world data sets demonstrate the effectiveness and the efficiency of the proposed framework in modeling multiple types of heterogeneity.
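To make the idea of similarity-based weighting concrete, the following is a minimal NumPy sketch, not the formulation from the paper. It compares a vanilla InfoNCE-style loss with a variant in which each negative pair is scaled by a weight that shrinks as the two representations become more similar, so that likely false negatives contribute less. The function names, the weight formula `(1 - sim) / 2`, and the temperature value are all assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def contrastive_loss(z1, z2, temperature=0.5, weighted=True):
    """InfoNCE-style loss over two views z1, z2 of shape (n, d).

    Row i of z1 and row i of z2 form the positive pair; every other
    row of z2 is treated as a negative for row i of z1.
    """
    z1, z2 = l2_normalize(z1), l2_normalize(z2)
    sim = z1 @ z2.T  # cosine similarities, shape (n, n)

    if weighted:
        # Hypothetical weighting (an assumption, not the paper's formula):
        # similar negatives get a weight near 0, dissimilar ones near 1.
        weights = np.clip((1.0 - sim) / 2.0, 0.0, 1.0)
    else:
        # Vanilla loss: every negative counts equally.
        weights = np.ones_like(sim)
    np.fill_diagonal(weights, 0.0)  # positives are not negatives

    pos = np.exp(np.diag(sim) / temperature)
    neg = (weights * np.exp(sim / temperature)).sum(axis=1)
    return float(np.mean(-np.log(pos / (pos + neg))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(8, 16))
    view_b = view_a + 0.1 * rng.normal(size=(8, 16))  # noisy second view
    print("vanilla :", contrastive_loss(view_a, view_b, weighted=False))
    print("weighted:", contrastive_loss(view_a, view_b, weighted=True))
```

Because every weight in this sketch lies in [0, 1], the weighted loss can never exceed the vanilla one on the same inputs. In practice the weights would more likely be treated as constants during backpropagation or learned jointly with the encoder, and that choice is left open here.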
Please cite our work using the BibTeX below.
@misc{https://doi.org/10.48550/arxiv.2105.09401,
doi = {10.48550/ARXIV.2105.09401},
url = {https://arxiv.org/abs/2105.09401},
author = {Zheng, Lecheng and Xiong, Jinjun and Zhu, Yada and He, Jingrui},
keywords = {Machine Learning (cs.LG), Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
title = {Heterogeneous Contrastive Learning},
publisher = {arXiv},
year = {2021},
copyright = {arXiv.org perpetual, non-exclusive license}
}