Dataset:

quickdraw

Tasks:

image-classification

Sub-tasks:

multi-class-image-classification

Languages:

en

Multilinguality:

monolingual

Size:

10M<n<100M

Language creators:

crowdsourced

Annotation creators:

machine-generated

Source datasets:

original

Preprint:

arxiv:1704.03477

License:

cc-by-4.0

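Dataset Card for Quick, Draw!

Dataset Summary

The Quick, Draw! dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors and tagged with metadata, including what the player was asked to draw and the country in which the player was located.

Supported Tasks and Leaderboards

Image classification: the goal of this task is to classify a given sketch into one of 345 classes. A (closed) leaderboard is available for this task.

Languages

English.

Dataset Structure

Data Instances

raw

A data point comprises a drawing and its metadata.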

{
  'key_id': '5475678961008640',
  'word': 0,
  'recognized': True,
  'timestamp': datetime.datetime(2017, 3, 28, 13, 28, 0, 851730),
  'countrycode': 'MY',
  'drawing': {
    'x': [[379.0, 380.0, 381.0, 381.0, 381.0, 381.0, 382.0], [362.0, 368.0, 375.0, 380.0, 388.0, 393.0, 399.0, 404.0, 409.0, 410.0, 410.0, 405.0, 397.0, 392.0, 384.0, 377.0, 370.0, 363.0, 356.0, 348.0, 342.0, 336.0, 333.0], ..., [477.0, 473.0, 471.0, 469.0, 468.0, 466.0, 464.0, 462.0, 461.0, 469.0, 475.0, 483.0, 491.0, 499.0, 510.0, 521.0, 531.0, 540.0, 548.0, 558.0, 566.0, 576.0, 583.0, 590.0, 595.0, 598.0, 597.0, 596.0, 594.0, 592.0, 590.0, 589.0, 588.0, 586.0]],
    'y': [[1.0, 7.0, 15.0, 21.0, 27.0, 32.0, 32.0], [17.0, 17.0, 17.0, 17.0, 16.0, 16.0, 16.0, 16.0, 18.0, 23.0, 29.0, 32.0, 32.0, 32.0, 29.0, 27.0, 25.0, 23.0, 21.0, 19.0, 17.0, 16.0, 14.0], ..., [151.0, 146.0, 139.0, 131.0, 125.0, 119.0, 113.0, 107.0, 102.0, 99.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 100.0, 102.0, 104.0, 105.0, 110.0, 115.0, 121.0, 126.0, 131.0, 137.0, 142.0, 148.0, 150.0]],
    't': [[0, 84, 100, 116, 132, 148, 260], [573, 636, 652, 660, 676, 684, 701, 724, 796, 838, 860, 956, 973, 979, 989, 995, 1005, 1012, 1020, 1028, 1036, 1053, 1118], ..., [8349, 8446, 8468, 8484, 8500, 8516, 8541, 8557, 8573, 8685, 8693, 8702, 8710, 8718, 8724, 8732, 8741, 8748, 8757, 8764, 8773, 8780, 8788, 8797, 8804, 8965, 8996, 9029, 9045, 9061, 9076, 9092, 9109, 9167]]
  }
}

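preprocessed_simplified_drawings

A simplified version of the dataset derived from the raw data: the vectors are simplified, the timing information is removed, and the data is positioned and scaled into a 256x256 region. The simplification process is:

1. Align the drawing to the top-left corner so that its minimum values are 0.
2. Uniformly scale the drawing so that its maximum value is 255.
3. Resample all strokes with a 1-pixel spacing.
4. Simplify all strokes using the Ramer-Douglas-Peucker algorithm with an epsilon value of 2.0.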

{
  'key_id': '5475678961008640',
  'word': 0,
  'recognized': True,
  'timestamp': datetime.datetime(2017, 3, 28, 15, 28),
  'countrycode': 'MY',
  'drawing': {
    'x': [[31, 32], [27, 37, 38, 35, 21], [25, 28, 38, 39], [33, 34, 32], [5, 188, 254, 251, 241, 185, 45, 9, 0], [35, 35, 43, 125, 126], [35, 76, 80, 77], [53, 50, 54, 80, 78]],
    'y': [[0, 7], [4, 4, 6, 7, 3], [5, 10, 10, 7], [4, 33, 44], [50, 50, 54, 83, 86, 90, 86, 77, 52], [85, 91, 92, 96, 90], [35, 37, 41, 47], [34, 23, 22, 23, 34]]
  }
}
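
The four simplification steps listed for this configuration can be sketched as follows. This is a minimal illustration, not the code that produced the dataset: the helper names (rdp, resample, simplify_drawing), the lists-of-lists input layout, and the final rounding to integers are all assumptions made for the example.

```python
import numpy as np


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification of an (N, 2) polyline."""
    if len(points) < 3:
        return points
    start, end = points[0], points[-1]
    seg = end - start
    seg_len = np.hypot(seg[0], seg[1])
    rel = points - start
    if seg_len == 0:
        # Degenerate chord: use the distance from the start point instead.
        dists = np.hypot(rel[:, 0], rel[:, 1])
    else:
        # Perpendicular distance of every point to the chord start-end.
        dists = np.abs(seg[0] * rel[:, 1] - seg[1] * rel[:, 0]) / seg_len
    idx = int(np.argmax(dists))
    if dists[idx] <= epsilon:
        return np.array([start, end])
    left = rdp(points[: idx + 1], epsilon)
    right = rdp(points[idx:], epsilon)
    return np.vstack([left[:-1], right])


def resample(points, spacing=1.0):
    """Resample an (N, 2) polyline at a fixed arc-length spacing."""
    deltas = np.diff(points, axis=0)
    seg_lens = np.hypot(deltas[:, 0], deltas[:, 1])
    cum = np.concatenate([[0.0], np.cumsum(seg_lens)])
    total = cum[-1]
    if total == 0:
        return points[:1]
    targets = np.append(np.arange(0.0, total, spacing), total)  # keep the last point
    x = np.interp(targets, cum, points[:, 0])
    y = np.interp(targets, cum, points[:, 1])
    return np.column_stack([x, y])


def simplify_drawing(xs, ys, epsilon=2.0):
    """xs, ys: lists of per-stroke coordinate lists, as in the raw 'drawing' field."""
    polylines = [np.column_stack([sx, sy]).astype(float) for sx, sy in zip(xs, ys)]
    all_pts = np.vstack(polylines)
    origin = all_pts.min(axis=0)                       # step 1: minimum becomes 0
    extent = (all_pts - origin).max()
    scale = 255.0 / extent if extent > 0 else 1.0      # step 2: maximum becomes 255
    out = {"x": [], "y": []}
    for line in polylines:
        line = (line - origin) * scale
        line = resample(line, spacing=1.0)             # step 3: 1-pixel spacing
        line = rdp(line, epsilon)                      # step 4: RDP, epsilon 2.0
        line = np.rint(line).astype(int)               # assumed integer output
        out["x"].append(line[:, 0].tolist())
        out["y"].append(line[:, 1].tolist())
    return out


simplified = simplify_drawing(
    [[100, 150, 200, 300], [100, 100]],
    [[100, 100, 100, 100], [100, 300]],
)
print(simplified)
```

Collinear points collapse to the two stroke endpoints, which is why simplified drawings carry far fewer points than the raw ones.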

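preprocessed_bitmaps (default configuration)

This configuration contains 28x28 grayscale bitmap images generated from the simplified data, aligned to the center of the drawing's bounding box rather than to the top-left corner. The code used to generate them is available online.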

{
  'image': <PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at 0x10B5B102828>,
  'label': 0
}
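
To make the bounding-box centering concrete, here is a rough sketch that rasterizes simplified strokes with Pillow and downsamples them to 28x28. It is not the generation code referred to above: the function name render_bitmap, the white-on-black polarity, the line width of 8, and the bilinear downsampling are assumptions chosen for illustration.

```python
from PIL import Image, ImageDraw


def render_bitmap(xs, ys, size=28, canvas=256):
    """Render simplified strokes (coordinates in 0..255) to a size x size image."""
    all_x = [v for stroke in xs for v in stroke]
    all_y = [v for stroke in ys for v in stroke]
    # Offset that moves the centre of the bounding box to the canvas centre.
    dx = (canvas - 1) / 2 - (min(all_x) + max(all_x)) / 2
    dy = (canvas - 1) / 2 - (min(all_y) + max(all_y)) / 2
    img = Image.new("L", (canvas, canvas), 0)
    draw = ImageDraw.Draw(img)
    for sx, sy in zip(xs, ys):
        pts = [(x + dx, y + dy) for x, y in zip(sx, sy)]
        if len(pts) > 1:
            draw.line(pts, fill=255, width=8)
        else:
            draw.point(pts, fill=255)
    return img.resize((size, size), resample=Image.BILINEAR)


bitmap = render_bitmap([[0, 255]], [[0, 100]])
print(bitmap.size, bitmap.mode)
```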

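sketch_rnn and sketch_rnn_full

The sketch_rnn_full configuration stores the data in a format suitable for input to a recurrent neural network and was used to train the Sketch-RNN model. Unlike sketch_rnn, which holds a random sample of each category, sketch_rnn_full contains the full data for each category.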

{
  'word': 0,
  'drawing': [[132, 0, 0], [23, 4, 0], [61, 1, 0], [76, 0, 0], [22, -4, 0], [152, 0, 0], [50, -5, 0], [36, -10, 0], [8, 26, 0], [0, 69, 0], [-2, 11, 0], [-8, 10, 0], [-56, 24, 0], [-23, 14, 0], [-99, 40, 0], [-45, 6, 0], [-21, 6, 0], [-170, 2, 0], [-81, 0, 0], [-29, -9, 0], [-94, -19, 0], [-48, -24, 0], [-6, -16, 0], [2, -36, 0], [7, -29, 0], [23, -45, 0], [13, -6, 0], [41, -8, 0], [42, -2, 1], [392, 38, 0], [2, 19, 0], [11, 33, 0], [13, 0, 0], [24, -9, 0], [26, -27, 0], [0, -14, 0], [-8, -10, 0], [-18, -5, 0], [-14, 1, 0], [-23, 4, 0], [-21, 12, 1], [-152, 18, 0], [10, 46, 0], [26, 6, 0], [38, 0, 0], [31, -2, 0], [7, -2, 0], [4, -6, 0], [-10, -21, 0], [-2, -33, 0], [-6, -11, 0], [-46, 1, 0], [-39, 18, 0], [-19, 4, 1], [-122, 0, 0], [-2, 38, 0], [4, 16, 0], [6, 4, 0], [78, 0, 0], [4, -8, 0], [-8, -36, 0], [0, -22, 0], [-6, -2, 0], [-32, 14, 0], [-58, 13, 1], [-96, -12, 0], [-10, 27, 0], [2, 32, 0], [102, 0, 0], [1, -7, 0], [-27, -17, 0], [-4, -6, 0], [-1, -34, 0], [-64, 8, 1], [129, -138, 0], [-108, 0, 0], [-8, 12, 0], [-1, 15, 0], [12, 15, 0], [20, 5, 0], [61, -3, 0], [24, 6, 0], [19, 0, 0], [5, -4, 0], [2, 14, 1]]
}

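Data Fields

raw

key_id: A unique identifier of the drawing.
word: The category the player was prompted to draw.
recognized: Whether the drawing was recognized by the game.
timestamp: When the drawing was created.
countrycode: The two-letter country code (ISO 3166-1 alpha-2) of the player's location.
drawing: A dictionary in which x and y are pixel coordinates and t is the time in milliseconds since the first point. x, y, and t have the same length and are each a list of lists, with one sub-list per stroke. Because different devices are used for display and input, raw drawings can vary widely in bounding box and number of points.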
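
Because the raw drawing field nests one list per stroke, a small helper that flattens it is often the first step when exploring the data. The sketch below is illustrative only: stroke_summary is a hypothetical helper, and the sample values are a shortened excerpt of the raw instance shown earlier.

```python
def stroke_summary(drawing):
    """Summarise a raw drawing dict holding 'x', 'y', 't' lists of per-stroke lists."""
    xs = [v for stroke in drawing["x"] for v in stroke]
    ys = [v for stroke in drawing["y"] for v in stroke]
    ts = [v for stroke in drawing["t"] for v in stroke]
    return {
        "num_strokes": len(drawing["x"]),
        "num_points": len(xs),
        "bbox": (min(xs), min(ys), max(xs), max(ys)),  # (left, top, right, bottom)
        "duration_ms": max(ts) - min(ts),
    }


# Shortened excerpt of the raw example instance (two strokes).
drawing = {
    "x": [[379.0, 380.0, 381.0], [362.0, 368.0]],
    "y": [[1.0, 7.0, 15.0], [17.0, 17.0]],
    "t": [[0, 84, 100], [573, 636]],
}
summary = stroke_summary(drawing)
print(summary)
```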

preprocessed_simplified_drawings

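key_id: A unique identifier of the drawing.
word: The category the player was prompted to draw.
recognized: Whether the drawing was recognized by the game.
timestamp: When the drawing was created.
countrycode: The two-letter country code (ISO 3166-1 alpha-2) of the player's location.
drawing: The simplified drawing, represented as a dictionary in which x and y are pixel coordinates. The simplification process is described in the Data Instances section.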

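preprocessed_bitmaps (default configuration)

image: A PIL.Image.Image object containing the 28x28 grayscale bitmap. Note that accessing the image column (dataset[0]["image"]) automatically decodes the image file. Decoding a large number of image files can take a significant amount of time, so always query the sample index before the "image" column: dataset[0]["image"] should always be preferred over dataset["image"][0].
label: The category the player was prompted to draw.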
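
Once a single image has been decoded, it usually needs to become an array before it reaches a model. The sketch below uses a blank Pillow image as a stand-in for dataset[0]["image"], so it runs without downloading anything; the scaling to [0, 1] is one common convention, not something the dataset prescribes.

```python
import numpy as np
from PIL import Image

# Stand-in for dataset[0]["image"]: a 28x28 grayscale ("L" mode) image.
image = Image.new("L", (28, 28), 0)
image.putpixel((14, 14), 255)

# Convert to a float array and scale pixel values from 0..255 to 0..1.
pixels = np.asarray(image, dtype=np.float32) / 255.0
flat = pixels.reshape(-1)  # flattened vector for a dense classifier

print(pixels.shape, flat.shape)
```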

Full class label mapping:

id	class
0	aircraft carrier
1	airplane
2	alarm clock
3	ambulance
4	angel
5	animal migration
6	ant
7	anvil
8	apple
9	arm
10	asparagus
11	axe
12	backpack
13	banana
14	bandage
15	barn
16	baseball bat
17	baseball
18	basket
19	basketball
20	bat
21	bathtub
22	beach
23	bear
24	beard
25	bed
26	bee
27	belt
28	bench
29	bicycle
30	binoculars
31	bird
32	birthday cake
33	blackberry
34	blueberry
35	book
36	boomerang
37	bottlecap
38	bowtie
39	bracelet
40	brain
41	bread
42	bridge
43	broccoli
44	broom
45	bucket
46	bulldozer
47	bus
48	bush
49	butterfly
50	cactus
51	cake
52	calculator
53	calendar
54	camel
55	camera
56	camouflage
57	campfire
58	candle
59	cannon
60	canoe
61	car
62	carrot
63	castle
64	cat
65	ceiling fan
66	cell phone
67	cello
68	chair
69	chandelier
70	church
71	circle
72	clarinet
73	clock
74	cloud
75	coffee cup
76	compass
77	computer
78	cookie
79	cooler
80	couch
81	cow
82	crab
83	crayon
84	crocodile
85	crown
86	cruise ship
87	cup
88	diamond
89	dishwasher
90	diving board
91	dog
92	dolphin
93	donut
94	door
95	dragon
96	dresser
97	drill
98	drums
99	duck
100	dumbbell
101	ear
102	elbow
103	elephant
104	envelope
105	eraser
106	eye
107	eyeglasses
108	face
109	fan
110	feather
111	fence
112	finger
113	fire hydrant
114	fireplace
115	firetruck
116	fish
117	flamingo
118	flashlight
119	flip flops
120	floor lamp
121	flower
122	flying saucer
123	foot
124	fork
125	frog
126	frying pan
127	garden hose
128	garden
129	giraffe
130	goatee
131	golf club
132	grapes
133	grass
134	guitar
135	hamburger
136	hammer
137	hand
138	harp
139	hat
140	headphones
141	hedgehog
142	helicopter
143	helmet
144	hexagon
145	hockey puck
146	hockey stick
147	horse
148	hospital
149	hot air balloon
150	hot dog
151	hot tub
152	hourglass
153	house plant
154	house
155	hurricane
156	ice cream
157	jacket
158	jail
159	kangaroo
160	key
161	keyboard
162	knee
163	knife
164	ladder
165	lantern
166	laptop
167	leaf
168	leg
169	light bulb
170	lighter
171	lighthouse
172	lightning
173	line
174	lion
175	lipstick
176	lobster
177	lollipop
178	mailbox
179	map
180	marker
181	matches
182	megaphone
183	mermaid
184	microphone
185	microwave
186	monkey
187	moon
188	mosquito
189	motorbike
190	mountain
191	mouse
192	moustache
193	mouth
194	mug
195	mushroom
196	nail
197	necklace
198	nose
199	ocean
200	octagon
201	octopus
202	onion
203	oven
204	owl
205	paint can
206	paintbrush
207	palm tree
208	panda
209	pants
210	paper clip
211	parachute
212	parrot
213	passport
214	peanut
215	pear
216	peas
217	pencil
218	penguin
219	piano
220	pickup truck
221	picture frame
222	pig
223	pillow
224	pineapple
225	pizza
226	pliers
227	police car
228	pond
229	pool
230	popsicle
231	postcard
232	potato
233	power outlet
234	purse
235	rabbit
236	raccoon
237	radio
238	rain
239	rainbow
240	rake
241	remote control
242	rhinoceros
243	rifle
244	river
245	roller coaster
246	rollerskates
247	sailboat
248	sandwich
249	saw
250	saxophone
251	school bus
252	scissors
253	scorpion
254	screwdriver
255	sea turtle
256	see saw
257	shark
258	sheep
259	shoe
260	shorts
261	shovel
262	sink
263	skateboard
264	skull
265	skyscraper
266	sleeping bag
267	smiley face
268	snail
269	snake
270	snorkel
271	snowflake
272	snowman
273	soccer ball
274	sock
275	speedboat
276	spider
277	spoon
278	spreadsheet
279	square
280	squiggle
281	squirrel
282	stairs
283	star
284	steak
285	stereo
286	stethoscope
287	stitches
288	stop sign
289	stove
290	strawberry
291	streetlight
292	string bean
293	submarine
294	suitcase
295	sun
296	swan
297	sweater
298	swing set
299	sword
300	syringe
301	t-shirt
302	table
303	teapot
304	teddy-bear
305	telephone
306	television
307	tennis racquet
308	tent
309	The Eiffel Tower
310	The Great Wall of China
311	The Mona Lisa
312	tiger
313	toaster
314	toe
315	toilet
316	tooth
317	toothbrush
318	toothpaste
319	tornado
320	tractor
321	traffic light
322	train
323	tree
324	triangle
325	trombone
326	truck
327	trumpet
328	umbrella
329	underwear
330	van
331	vase
332	violin
333	washing machine
334	watermelon
335	waterslide
336	whale
337	wheel
338	windmill
339	wine bottle
340	wine glass
341	wristwatch
342	yoga
343	zebra
344	zigzag

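sketch_rnn and sketch_rnn_full

word: The category the player was prompted to draw.
drawing: An array of stroke points. Each point is a 3-tuple of an x-offset, a y-offset, and a binary variable that is 1 if the pen is lifted between this position and the next, and 0 otherwise.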
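
Since the 3-tuples store offsets rather than positions, recovering the actual strokes means accumulating the offsets and cutting wherever the pen-lift flag is 1. A minimal sketch, assuming the first offset is taken relative to the origin; strokes_to_polylines is a hypothetical helper, not part of any library:

```python
import numpy as np


def strokes_to_polylines(drawing):
    """Convert (dx, dy, pen_lift) triples into a list of absolute-coordinate strokes."""
    data = np.asarray(drawing, dtype=float)
    # Running sum of the offsets gives absolute coordinates.
    absolute = np.cumsum(data[:, :2], axis=0)
    # A pen-lift flag of 1 means the stroke ends after that point.
    ends = np.flatnonzero(data[:, 2] == 1) + 1
    pieces = np.split(absolute, ends)
    return [piece for piece in pieces if len(piece) > 0]


polylines = strokes_to_polylines([[10, 0, 0], [0, 10, 1], [5, 5, 0], [-5, 0, 1]])
for line in polylines:
    print(line.tolist())
```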

Code to visualize the drawings in a Jupyter Notebook or Google Colab:

import numpy as np
import svgwrite  # pip install svgwrite
from IPython.display import SVG, display

def draw_strokes(drawing, factor=0.045):
  """Displays vector drawing as SVG.

  Args:
    drawing: a list of strokes represented as 3-tuples
    factor: scaling factor. The smaller the scaling factor, the bigger the SVG picture and vice versa.

  """
  def get_bounds(data, factor):
    """Return bounds of data."""
    min_x = 0
    max_x = 0
    min_y = 0
    max_y = 0

    abs_x = 0
    abs_y = 0
    for i in range(len(data)):
      x = float(data[i, 0]) / factor
      y = float(data[i, 1]) / factor
      abs_x += x
      abs_y += y
      min_x = min(min_x, abs_x)
      min_y = min(min_y, abs_y)
      max_x = max(max_x, abs_x)
      max_y = max(max_y, abs_y)

    return (min_x, max_x, min_y, max_y)

  data = np.array(drawing)
  min_x, max_x, min_y, max_y = get_bounds(data, factor)
  dims = (50 + max_x - min_x, 50 + max_y - min_y)
  dwg = svgwrite.Drawing(size=dims)
  dwg.add(dwg.rect(insert=(0, 0), size=dims,fill='white'))
  lift_pen = 1
  abs_x = 25 - min_x
  abs_y = 25 - min_y
  p = "M%s,%s " % (abs_x, abs_y)
  command = "m"
  for i in range(len(data)):
    if (lift_pen == 1):
      command = "m"
    elif (command != "l"):
      command = "l"
    else:
      command = ""
    x = float(data[i,0])/factor
    y = float(data[i,1])/factor
    lift_pen = data[i, 2]
    p += command+str(x)+","+str(y)+" "
  the_color = "black"
  stroke_width = 1
  dwg.add(dwg.path(p).stroke(the_color,stroke_width).fill("none"))
  display(SVG(dwg.tostring()))

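Note: Sketch-RNN takes as input strokes represented as 5-tuples, with drawings padded to a common maximum length and prefixed by the special start token [0, 0, 1, 0, 0]. The 5-tuple consists of an x-offset, a y-offset, and p_1, p_2, p_3, a binary one-hot vector of 3 possible pen states: pen down, pen up, end of sketch. More precisely, the first two elements are the offset distances of the pen from the previous point in the x and y directions. The last 3 elements form the one-hot vector. The first pen state, p_1, indicates that the pen is currently touching the paper and that a line will be drawn connecting the next point with the current point. The second pen state, p_2, indicates that the pen will be lifted from the paper after the current point and that no line will be drawn next. The final pen state, p_3, indicates that the drawing has ended and that subsequent points, including the current point, will not be rendered.

Code to convert drawings to the Sketch-RNN input format: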

import numpy as np

def to_sketch_rnn_format(drawing, max_len):
  """Converts a drawing to Sketch-RNN input format.

  Args:
    drawing: a list of strokes represented as 3-tuples
    max_len: maximum common length of all drawings

  Returns:
    NumPy array
  """
  drawing = np.array(drawing)
  result = np.zeros((max_len, 5), dtype=float)
  l = len(drawing)
  assert l <= max_len
  result[0:l, 0:2] = drawing[:, 0:2]
  result[0:l, 3] = drawing[:, 2]
  result[0:l, 2] = 1 - result[0:l, 3]
  result[l:, 4] = 1
  # Prepend special start token
  result = np.vstack([[0, 0, 1, 0, 0], result])
  return result
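
Going the other way can be useful when inspecting model inputs or outputs. The following sketch undoes the conversion under the layout described in the note: it assumes the first row is the start token and that padding rows are marked by p_3 = 1. The function from_sketch_rnn_format is a hypothetical helper written for this example.

```python
import numpy as np


def from_sketch_rnn_format(strokes5):
    """Convert 5-tuple rows back to (dx, dy, pen_lift) triples."""
    arr = np.asarray(strokes5, dtype=float)
    arr = arr[1:]  # drop the start token (assumed to be the first row)
    finished = np.flatnonzero(arr[:, 4] == 1)
    if finished.size:
        arr = arr[: finished[0]]  # discard everything from the first end-of-sketch row
    # The pen-lift flag of the 3-tuple format is p_2 (column 3).
    return np.column_stack([arr[:, 0], arr[:, 1], arr[:, 3]])


padded = [
    [0, 0, 1, 0, 0],   # start token
    [10, 0, 1, 0, 0],  # pen down
    [0, 10, 0, 1, 0],  # pen lifted after this point
    [0, 0, 0, 0, 1],   # padding
    [0, 0, 0, 0, 1],   # padding
]
restored = from_sketch_rnn_format(padded)
print(restored.tolist())
```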

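Data Splits

In the raw, preprocessed_simplified_drawings, and preprocessed_bitmaps (default) configurations, all the data is contained in the training set, which has 50426266 examples.

The sketch_rnn and sketch_rnn_full configurations split the data into training, validation, and test sets. In the sketch_rnn configuration, 75K samples (70K training, 2.5K validation, 2.5K test) were randomly selected from each category. Its training set therefore contains 24150000 examples, and its validation and test sets contain 862500 examples each. The sketch_rnn_full configuration has the full (training) data for each category, giving a training set of 43988874 examples, with validation and test sets of 862500 examples each.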
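
The sketch_rnn split sizes follow directly from the per-category sample counts and the 345 categories, as this short check shows (the variable names are illustrative):

```python
NUM_CLASSES = 345
PER_CLASS = {"train": 70_000, "validation": 2_500, "test": 2_500}

# Each split holds the per-category count multiplied by the number of categories.
sizes = {split: NUM_CLASSES * count for split, count in PER_CLASS.items()}
print(sizes)
```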

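Dataset Creation

Curation Rationale

From the GitHub repository:

The Quick, Draw! Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located. You can browse the recognized drawings on quickdraw.withgoogle.com/data.

We're sharing them here for developers, researchers, and artists to explore, study, and learn from.

Source Data

Initial Data Collection and Normalization

The dataset contains vector drawings obtained from Quick, Draw!, a game in which players are asked to draw objects belonging to a particular object class in less than 20 seconds.

Who are the source language producers?

The participants in the Quick, Draw! game.

Annotations

Annotation process

The annotations are machine-generated and match the category the player was prompted to draw.

Who are the annotators?

The annotations are machine-generated.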

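Personal and Sensitive Information

Some drawings are known to be problematic (see https://github.com/googlecreativelab/quickdraw-dataset/issues/74 and https://github.com/googlecreativelab/quickdraw-dataset/issues/18).

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

Additional Information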

Dataset Curators

Jonas Jongejan, Henry Rowley, Takashi Kawashima, Jongmin Kim, and Nick Fox-Gieg.

Licensing Information

The data is made available by Google, Inc. under the Creative Commons Attribution 4.0 International license.

Citation Information

@article{DBLP:journals/corr/HaE17,
  author    = {David Ha and
               Douglas Eck},
  title     = {A Neural Representation of Sketch Drawings},
  journal   = {CoRR},
  volume    = {abs/1704.03477},
  year      = {2017},
  url       = {http://arxiv.org/abs/1704.03477},
  archivePrefix = {arXiv},
  eprint    = {1704.03477},
  timestamp = {Mon, 13 Aug 2018 16:48:30 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/HaE17},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Contributions

Thanks to @mariosasko for adding this dataset.

Author:

Anonymous

Dataset size:

417.49 KB