Dataset:

nyanko7/coco-hosted

License:

openrail
English

Usage:

from datasets import load_dataset

# Download (or reuse the local cache of) the dataset from the Hugging Face Hub.
coco_dataset = load_dataset("nyanko7/coco-hosted")

Each instance has the following structure:

{
    'image': <PIL.JpegImagePlugin.JpegImageFile>,
    'filepath': 'COCO_val2014_000000522418.jpg',
    'sentids': [681330, 686718, 688839, 693159, 693204],
    'filename': 'COCO_val2014_000000522418.jpg',
    'imgid': 1,
    'split': 'restval',
    'sentences': {
        'tokens': ['a', 'woman', 'wearing', 'a', 'net', 'on', 'her', 'head', 'cutting', 'a', 'cake'],
        'raw': 'A woman wearing a net on her head cutting a cake. ',
        'imgid': 1,
        'sentid': 681330
    },
    'cocoid': 522418
}
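
To illustrate how these fields might be read, here is a minimal sketch that pulls the COCO id, split label, and caption out of one record. The helper name `summarize_instance` is hypothetical and not part of the dataset or the `datasets` library. The sample record is copied from the structure above with the `image` field left out, so the snippet runs without downloading anything.

```python
def summarize_instance(example):
    """Return (cocoid, split, caption) for a record shaped like the structure above.

    Hypothetical helper for illustration; assumes 'sentences' is a dict
    holding a single caption under the 'raw' key, as in the example record.
    """
    caption = example["sentences"]["raw"].strip()  # drop the trailing space
    return example["cocoid"], example["split"], caption


# Record copied from the structure above, with the 'image' field omitted.
sample = {
    "filepath": "COCO_val2014_000000522418.jpg",
    "sentids": [681330, 686718, 688839, 693159, 693204],
    "filename": "COCO_val2014_000000522418.jpg",
    "imgid": 1,
    "split": "restval",
    "sentences": {
        "tokens": ["a", "woman", "wearing", "a", "net", "on", "her",
                   "head", "cutting", "a", "cake"],
        "raw": "A woman wearing a net on her head cutting a cake. ",
        "imgid": 1,
        "sentid": 681330,
    },
    "cocoid": 522418,
}

cocoid, split, caption = summarize_instance(sample)
print(cocoid, split, caption)
```

The same helper should work on a record taken from the loaded `coco_dataset`, provided the records really follow the structure shown. The available split names are not listed here, so inspect `coco_dataset` before indexing into it.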