Posts Tagged ‘Vision’

COCO

COCO: This image dataset contains image data suitable for object detection and segmentation. It provides 5 annotation types, for Object Detection, Keypoint Detection, Stuff Segmentation, Panoptic Segmentation and Image Captioning, all explained in detail in the data format section of the dataset page (http://cocodataset.org/#format-data). Here is some information regarding the latest version of this dataset:…


SALICON

SALICON: This image dataset, which is also a mouse-tracking dataset, was created from a subset of images from a parent dataset called MS COCO 2014 (available at http://cocodataset.org/#home), with an additional annotation type, “fixation”. The visual attention data for this dataset was collected using mouse-tracking methods. The research work related to…


SUN

SUN: This dataset, provided by Princeton University, contains thousands of color images for scene recognition. The images include environmental scenes, places and objects. To create the dataset, the WordNet English dictionary was used to find nouns that complete the sentence “I am in -a place-” or “Let’s go to -the place-” and data samples are manually…


LSUN

LSUN: This dataset contains millions of color images of scenes and objects, making it far larger than the ImageNet dataset. The labels were produced by human labeling in conjunction with several different image classification models. The images come from the parent databases PASCAL VOC 2012 and 10 Million Images for 10…


DeepFashion

DeepFashion: This dataset contains images of clothing items in different poses. Each image is labeled across 50 categories and annotated with 1,000 attributes, a bounding box and clothing landmarks. Four benchmark datasets were developed from the DeepFashion dataset: Attribute Prediction, Consumer-to-shop Clothes Retrieval, In-shop Clothes Retrieval and Landmark Detection, of which only Attribute Prediction…


Fashion MNIST

Fashion MNIST: This dataset contains grayscale images of clothing created by Zalando (https://jobs.zalando.com/tech/). It is intended as a substitute for the original MNIST dataset when benchmarking machine learning algorithms. The substitution seems necessary because classical machine learning algorithms easily achieve very high classification accuracy on MNIST. Also, MNIST might have been overused. As a…


MNIST

MNIST: This dataset contains grayscale images of handwritten digits. Half of the training set and half of the test set were collected from Census Bureau employees, and the other half of each set was collected from high school students. The dataset is a subset of images from two parent datasets…


CIFAR-10 & CIFAR-100

CIFAR-10 & CIFAR-100: These two datasets consist of labeled images from a parent dataset called Tiny Images Dataset (available at http://horatio.cs.nyu.edu/mit/tiny/data/index.html). CIFAR-10: Number of images in the dataset: 60,000 (50,000 training images divided into 5 batches and 10,000 test images in one batch). Image size: 32×32. Number of classes: 10 (airplane, automobile,…


ImageNet

ImageNet: This dataset contains images organized according to the WordNet hierarchy (WordNet 3.0), in which every node refers to up to thousands of images. Each concept in WordNet is described by a synonym set (synset) of words and phrases. ImageNet aims to have 1000 images per synset on average. Because the images are…


UCF50 & UCF101

UCF50 & UCF101: These two datasets contain realistic action recognition videos collected from YouTube, with large variations in motion, pose, scale and conditions. The video files are grouped by shared features, for example the same person appearing in the videos, similar viewpoints, similar backgrounds, etc. UCF50: Here is some information regarding this dataset: Number of Categories:…
