Salient Object Subitizing

People can immediately and precisely identify that an image contains 1, 2, 3 or 4 items at a glance. This phenomenon, known as subitizing, inspires us to pursue the task of Salient Object Subitizing, i.e., predicting the existence and the number of salient objects in a scene using holistic cues.

Paper

Jianming Zhang, Shugao Ma, Mehrnoosh Sameki, Stan Sclaroff, Margrit Betke, Zhe Lin, Xiaohui Shen, Brian Price and Radomír Měch. "Salient Object Subitizing." To appear in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. [PDF] [Supplementary]

Jianming Zhang, Shugao Ma, Mehrnoosh Sameki, Stan Sclaroff, Margrit Betke, Zhe Lin, Xiaohui Shen, Brian Price and Radomír Měch. "Salient Object Subitizing." Journal version under review, 2016. [arXiv] [Supplementary]

 

[BibTex]

@inproceedings{zhang2015salient,
  title={Salient Object Subitizing},
  author={Zhang, Jianming and Ma, Shugao and Sameki, Mehrnoosh and Sclaroff, Stan and Betke, Margrit and Lin, Zhe and Shen, Xiaohui and Price, Brian and M\v{e}ch, Radom\'{i}r},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2015},
}

 

@article{zhang2016salient,
  title={Salient Object Subitizing},
  author={Zhang, Jianming and Ma, Shugao and Sameki, Mehrnoosh and Sclaroff, Stan and Betke, Margrit and Lin, Zhe and Shen, Xiaohui and Price, Brian and M\v{e}ch, Radom\'{i}r},
  journal={arXiv preprint arXiv:1607.07525},
  year={2016},
}

Contact: jmzhang AT bu.edu

The SOS Dataset

Extended version: [Download (2G)] (see the arXiv paper for details)

Initial version: [Download (965M)] [Bounding Box Annotations (training split only)]

We have collected an image dataset for salient object subitizing. The source images are from four public image datasets: COCO, VOC07, ImageNet and SUN. Each image is labeled as containing 0, 1, 2, 3 or 4+ salient objects by Amazon Mechanical Turk workers.
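The five-way labeling scheme can be illustrated with a minimal Python sketch. It is not part of the released dataset tools; the names `SOS_LABELS`, `count_to_label` and `label_histogram` are hypothetical, and the example only assumes that any count of four or more objects falls into the "4+" category, as described above.

```python
from collections import Counter

# The five subitizing categories used for the SOS labels.
# The tuple name and ordering are assumptions made for this illustration.
SOS_LABELS = ("0", "1", "2", "3", "4+")


def count_to_label(num_salient_objects: int) -> str:
    """Map a raw object count to one of the five subitizing categories."""
    if num_salient_objects < 0:
        raise ValueError("object count must be non-negative")
    # Counts of four or more all share the last category, "4+".
    return SOS_LABELS[min(num_salient_objects, 4)]


def label_histogram(counts):
    """Count how many images fall into each category, keeping empty ones."""
    hist = Counter(count_to_label(c) for c in counts)
    return {label: hist.get(label, 0) for label in SOS_LABELS}
```

For example, `label_histogram([0, 1, 1, 5, 4])` groups the counts 5 and 4 together under "4+".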

The MSO Dataset

[Download (176M)]

We assembled a Multi-Salient-Object (MSO) dataset. Images of the MSO dataset are taken from the test set of the SOS dataset. We removed images with severely overlapping salient objects, as well as images for which labeling the indicated number of salient objects was ambiguous. This leaves us with 1224 of the 1380 images in our SOS test set. As shown in the table below, more than half of the images in our MSO dataset contain either zero salient objects or multiple salient objects. We believe that this dataset provides a more realistic setting for evaluating salient object detection methods. Currently only bounding box annotations are available, but we will share the object segmentation annotations in the near future.

The CNN Subitizing Model

[GoogleNet]

We provide our CNN Caffe models for salient object subitizing. The models are trained as described in our arXiv paper and provide a significant improvement over our initial model ([AlexNet]).
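As a rough illustration of how the output of such a classifier might be read, the sketch below converts five raw class scores into a predicted category and its probability. It is plain Python and does not load the Caffe models; the assumption that the network emits one score per category in the order 0, 1, 2, 3, 4+ is ours, and `softmax` and `decode_prediction` are hypothetical helper names.

```python
import math

# Assumed output ordering of the five subitizing categories.
SOS_LABELS = ("0", "1", "2", "3", "4+")


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(scores):
    """Return (label, probability) of the most likely category."""
    if len(scores) != len(SOS_LABELS):
        raise ValueError("expected one score per subitizing category")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SOS_LABELS[best], probs[best]
```

Calling `decode_prediction([0.1, 2.0, 0.3, -1.0, 0.0])` selects the category "1", since the second score is the largest.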

We visualize the fc7 layer of our fine-tuned CNN model (based on AlexNet) using the 2D t-SNE embedding method by Andrej Karpathy and Laurens van der Maaten. We find that images with similar content and composition tend to be close to each other in the embedding space.

Change log:

05/01/2015: project page created.

05/18/2015: updated download links.

04/09/2016: added the link to bounding box annotations.

07/28/2016: added the journal version.

08/01/2016: updated the data and software for the journal version.