Consider a general machine learning setting where the output is a set of labels or sequences. This output set is unordered, and its size varies with the input. Although multi-label classification methods seem a natural first resort, they are not readily applicable to set-valued outputs: the output space grows too quickly, and conventional sequence generation does not reflect the order-free nature of sets. In this paper, we propose a unified framework, sequential set generation (SSG), that can handle output sets of labels and sequences. SSG is a meta-algorithm that leverages any probabilistic learning method for label or sequence prediction but employs a proper regularization such that a new label or sequence is generated repeatedly until the full set is produced. Although SSG is sequential in nature, it does not penalize the order in which set elements appear and can be applied to a variety of set-output problems, such as predicting a set of classification labels or a set of sequences. We perform experiments on both benchmark and synthetic data sets and demonstrate SSG's strong performance over baseline methods.
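The generate-until-the-set-is-complete loop described above can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the predictor interface, the stop symbol, and the masking of already-emitted elements are our assumptions.

```python
# Illustrative sketch of a sequential set generation loop: a probabilistic
# predictor is queried repeatedly for the next set element until it emits
# a stop symbol. Names (STOP, predict_next) are hypothetical, not from
# the paper.

STOP = "<stop>"

def generate_set(predict_next, x, max_size=10):
    """Repeatedly query a predictor for the next element, collecting
    elements into an (unordered) set until STOP is emitted."""
    out = set()
    for _ in range(max_size):
        label = predict_next(x, out)  # predictor sees what was already emitted
        if label == STOP:
            break
        out.add(label)
    return out

# Toy predictor: emits the target labels one at a time (any order would
# yield the same set), then STOP once the target set is exhausted.
def toy_predictor(x, emitted):
    remaining = sorted(set(x) - emitted)
    return remaining[0] if remaining else STOP

print(generate_set(toy_predictor, {"cat", "dog"}))  # → {'cat', 'dog'}
```

Because the output is accumulated as a set, the order in which elements happen to be emitted does not affect the final prediction, mirroring the order-free property SSG aims for.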
This work was published in AAAI 2019.
Please cite our work using the BibTeX below.
@article{Gao_Chen_Chenthamarakshan_Witbrock_2019,
  title   = {A Sequential Set Generation Method for Predicting Set-Valued Outputs},
  author  = {Gao, Tian and Chen, Jie and Chenthamarakshan, Vijil and Witbrock, Michael},
  journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume  = {33},
  number  = {01},
  pages   = {2835--2842},
  year    = {2019},
  month   = jul,
  url     = {https://ojs.aaai.org/index.php/AAAI/article/view/4136},
  doi     = {10.1609/aaai.v33i01.33012835}
}