数据集:

wider_face

子任务:

face-detection

语言:

en

计算机处理:

monolingual

大小:

10K<n<100K

语言创建人:

found

批注创建人:

expert-generated

预印本库:

arxiv:1511.06523
中文

Dataset Card for WIDER FACE

Dataset Summary

WIDER FACE dataset is a face detection benchmark dataset, of which images are selected from the publicly available WIDER dataset. We choose 32,203 images and label 393,703 faces with a high degree of variability in scale, pose and occlusion as depicted in the sample images. WIDER FACE dataset is organized based on 61 event classes. For each event class, we randomly select 40%/10%/50% data as training, validation and testing sets. We adopt the same evaluation metric employed in the PASCAL VOC dataset. Similar to MALF and Caltech datasets, we do not release bounding box ground truth for the test images. Users are required to submit final prediction files, which we shall proceed to evaluate.

Supported Tasks and Leaderboards

  • face-detection : The dataset can be used to train a model for Face Detection. More information on evaluating the model's performance can be found here .

Languages

English

Dataset Structure

Data Instances

A data point comprises an image and its face annotations.

{
  'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1024x755 at 0x19FA12186D8>, 'faces': {
    'bbox': [
      [178.0, 238.0, 55.0, 73.0],
      [248.0, 235.0, 59.0, 73.0],
      [363.0, 157.0, 59.0, 73.0],
      [468.0, 153.0, 53.0, 72.0],
      [629.0, 110.0, 56.0, 81.0],
      [745.0, 138.0, 55.0, 77.0]
    ], 
    'blur': [2, 2, 2, 2, 2, 2],
    'expression': [0, 0, 0, 0, 0, 0],
    'illumination': [0, 0, 0, 0, 0, 0],
    'occlusion': [1, 2, 1, 2, 1, 2],
    'pose': [0, 0, 0, 0, 0, 0],
    'invalid': [False, False, False, False, False, False]
  }
}

Data Fields

  • image : A PIL.Image.Image object containing the image. Note that when accessing the image column: dataset[0]["image"] the image file is automatically decoded. Decoding of a large number of image files might take a significant amount of time. Thus it is important to first query the sample index before the "image" column, i.e. dataset[0]["image"] should always be preferred over dataset["image"][0]
  • faces : a dictionary of face attributes for the faces present on the image
    • bbox : the bounding box of each face (in the coco format)
    • blur : the blur level of each face, with possible values including clear (0), normal (1) and heavy
    • expression : the facial expression of each face, with possible values including typical (0) and exaggerate (1)
    • illumination : the lightning condition of each face, with possible values including normal (0) and exaggerate (1)
    • occlusion : the level of occlusion of each face, with possible values including no (0), partial (1) and heavy (2)
    • pose : the pose of each face, with possible values including typical (0) and atypical (1)
    • invalid : whether the image is valid or invalid.

Data Splits

The data is split into training, validation and testing set. WIDER FACE dataset is organized based on 61 event classes. For each event class, 40%/10%/50% data is randomly selected as training, validation and testing sets. The training set contains 12880 images, the validation set 3226 images and test set 16097 images.

Dataset Creation

Curation Rationale

The curators state that the current face detection datasets typically contain a few thousand faces, with limited variations in pose, scale, facial expression, occlusion, and background clutters, making it difficult to assess for real world performance. They argue that the limitations of datasets have partially contributed to the failure of some algorithms in coping with heavy occlusion, small scale, and atypical pose.

Source Data

Initial Data Collection and Normalization

WIDER FACE dataset is a subset of the WIDER dataset. The images in WIDER were collected in the following three steps: 1) Event categories were defined and chosen following the Large Scale Ontology for Multimedia (LSCOM) [22], which provides around 1000 concepts relevant to video event analysis. 2) Images are retrieved using search engines like Google and Bing. For each category, 1000-3000 images were collected. 3) The data were cleaned by manually examining all the images and filtering out images without human face. Then, similar images in each event category were removed to ensure large diversity in face appearance. A total of 32203 images are eventually included in the WIDER FACE dataset.

Who are the source language producers?

The images are selected from publicly available WIDER dataset.

Annotations

Annotation process

The curators label the bounding boxes for all the recognizable faces in the WIDER FACE dataset. The bounding box is required to tightly contain the forehead, chin, and cheek.. If a face is occluded, they still label it with a bounding box but with an estimation on the scale of occlusion. Similar to the PASCAL VOC dataset [6], they assign an ’Ignore’ flag to the face which is very difficult to be recognized due to low resolution and small scale (10 pixels or less). After annotating the face bounding boxes, they further annotate the following attributes: pose (typical, atypical) and occlusion level (partial, heavy). Each annotation is labeled by one annotator and cross-checked by two different people.

Who are the annotators?

Shuo Yang, Ping Luo, Chen Change Loy and Xiaoou Tang.

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

Shuo Yang, Ping Luo, Chen Change Loy and Xiaoou Tang

Licensing Information

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) .

Citation Information

@inproceedings{yang2016wider,
    Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou},
    Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    Title = {WIDER FACE: A Face Detection Benchmark},
    Year = {2016}}

Contributions

Thanks to @mariosasko for adding this dataset.