This dataset contains images suitable for object detection and segmentation. It provides 5 annotation types (Object Detection, Keypoint Detection, Stuff Segmentation, Panoptic Segmentation and Image Captioning), all explained in detail in the data format section of the dataset page.

Here is some information regarding the latest version of this dataset:

  • Number of images in the dataset: 330,000 images, of which more than 200,000 are labeled (split roughly in half between training and validation+test)

  • Number of classes: 80 object categories, 91 stuff categories

  • Image resolution: about 640×480 (individual image sizes vary)
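
To make the object detection annotations more concrete, here is a minimal sketch of how a COCO-style annotation file can be read and summarized with the Python standard library only. It assumes the usual layout with top-level `images`, `categories` and `annotations` lists, and bounding boxes stored as `[x, y, width, height]`. The sample records, file names and the helper names `summarize` and `bbox_to_corners` are made up for illustration; they are not part of the dataset or of any official tooling.

```python
import json
from collections import Counter

# Tiny hand-written sample in the COCO object detection layout.
# These records are illustrative only and are NOT taken from the dataset.
SAMPLE = """
{
  "images": [
    {"id": 1, "file_name": "example_1.jpg", "width": 640, "height": 480},
    {"id": 2, "file_name": "example_2.jpg", "width": 640, "height": 480}
  ],
  "categories": [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 18, "name": "dog", "supercategory": "animal"}
  ],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1,
     "bbox": [100.0, 50.0, 80.0, 200.0], "area": 16000.0, "iscrowd": 0},
    {"id": 11, "image_id": 1, "category_id": 18,
     "bbox": [300.0, 240.0, 120.0, 90.0], "area": 10800.0, "iscrowd": 0},
    {"id": 12, "image_id": 2, "category_id": 1,
     "bbox": [20.0, 30.0, 60.0, 150.0], "area": 9000.0, "iscrowd": 0}
  ]
}
"""


def summarize(coco):
    """Count images, labeled images, and boxes per category name."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    per_category = Counter(names[a["category_id"]] for a in coco["annotations"])
    labeled_images = {a["image_id"] for a in coco["annotations"]}
    return {
        "num_images": len(coco["images"]),
        "num_labeled_images": len(labeled_images),
        "boxes_per_category": dict(per_category),
    }


def bbox_to_corners(bbox):
    """Convert [x, y, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


coco = json.loads(SAMPLE)
summary = summarize(coco)
print(summary)
print(bbox_to_corners(coco["annotations"][0]["bbox"]))
```

For a real annotation file, the `json.loads(SAMPLE)` call would be replaced by `json.load` on the downloaded file; the exact file names depend on the release you download.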

More details and download links can be found on the dataset and challenge pages.

If you use this dataset:

Please make sure to read the Terms of Use.

Please make sure to cite the paper:

T. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays, P. Perona, D. Ramanan, C. Zitnick, P. Dollar. Microsoft COCO: Common Objects in Context. European Conference on Computer Vision (ECCV), 2014.

keywords: Vision, Image, Object Detection, Segmentation