RepMet: Representative-based metric learning for classification and one-shot object detection

Computer Vision



Distance metric learning (DML) has been successfully applied to object classification, both in the standard regime of rich training data and in the few-shot scenario, where each category is represented by only a few examples. In this work, we propose a new method for DML that simultaneously learns the backbone network parameters, the embedding space, and the multi-modal distribution of each of the training categories in that space, in a single end-to-end training process. Our approach outperforms state-of-the-art methods for DML-based object classification on a variety of standard fine-grained datasets. Furthermore, we demonstrate the effectiveness of our approach on the problem of few-shot object detection, by incorporating the proposed DML architecture as a classification head into a standard object detection model. We achieve the best results on the ImageNet-LOC dataset compared to strong baselines, when only a few training examples are available. We also offer the community a new episodic benchmark based on the ImageNet dataset for the few-shot object detection task.
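
As a rough illustration of what classifying against a multi-modal class distribution in an embedding space can look like, here is a minimal sketch. It is not the authors' implementation. The function name `class_scores`, the isotropic Gaussian form, the shared width `sigma`, and the toy 2-D representatives are all assumptions made for this example. In the method described above the embedding and the representatives are learned jointly with the backbone; here they are fixed by hand.

```python
import math

def class_scores(embedding, representatives, sigma=0.5):
    """Score each class by its best-matching representative (mode).

    embedding:        list of D floats, the embedded feature vector
    representatives:  list of N classes, each a list of K modes (D floats each);
                      hypothetical values, fixed by hand in this sketch
    sigma:            assumed isotropic Gaussian width shared by all modes
    """
    scores = []
    for modes in representatives:
        best = 0.0
        for rep in modes:
            # Squared Euclidean distance between the embedding and this mode
            sq_dist = sum((e - r) ** 2 for e, r in zip(embedding, rep))
            # Unnormalized Gaussian likelihood of the embedding under this mode
            best = max(best, math.exp(-sq_dist / (2.0 * sigma ** 2)))
        # A class scores as high as its closest mode
        scores.append(best)
    return scores

# Toy setup: two classes, two modes each, in a 2-D embedding space
reps = [
    [[0.0, 0.0], [1.0, 1.0]],  # class 0
    [[4.0, 4.0], [5.0, 3.0]],  # class 1
]
query = [0.9, 1.1]
scores = class_scores(query, reps)
predicted = max(range(len(scores)), key=scores.__getitem__)
```

Taking the maximum over modes lets a class cover several separate regions of the embedding space, which is one simple way to read "multi-modal" here. A detector would also need a background score, which this sketch leaves out.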

Please cite our work using the BibTeX below.

  author    = {Eli Schwartz and
               Leonid Karlinsky and
               Joseph Shtok and
               Sivan Harary and
               Mattias Marder and
               Sharathchandra Pankanti and
               Rog{\'{e}}rio Schmidt Feris and
               Abhishek Kumar and
               Raja Giryes and
               Alexander M. Bronstein},
  title     = {RepMet: Representative-based metric learning for classification and
               one-shot object detection},
  journal   = {CoRR},
  volume    = {abs/1806.04728},
  year      = {2018},
  url       = {},
  archivePrefix = {arXiv},
  eprint    = {1806.04728},
  timestamp = {Wed, 16 Oct 2019 14:14:57 +0200},
  biburl    = {},
  bibsource = {dblp computer science bibliography,}
