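Dataset: turkish_ner
Tasks: token-classification
Languages: tr
Multilinguality: monolingual
Size: 100K<n<1M
Language creators: expert-generated
Annotations creators: machine-generated
Source datasets: original
ArXiv: arxiv:1702.02363
License: cc-by-4.0

An automatically annotated Turkish corpus for named entity recognition and text categorization, built using large-scale gazetteers. The constructed gazetteers contain approximately 300K entities, covering thousands of fine-grained entity types across 25 different domains.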
[More Information Needed]
Turkish
[More Information Needed]
[More Information Needed]
The dataset provides only a training split.
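Because only a training split is provided, anyone who needs validation data has to hold some out themselves. The following is a minimal sketch of a seeded hold-out split in plain Python. The function name `holdout_split`, the 10% fraction, the seed, and the toy records are all illustrative assumptions, not part of the dataset or its loader.

```python
import random


def holdout_split(examples, validation_fraction=0.1, seed=42):
    """Shuffle a copy of `examples` and split off a validation portion.

    Hypothetical helper for illustration; not part of the dataset itself.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    shuffled = list(examples)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


# Stand-in records: the real field names are not documented in this card.
toy_examples = [{"id": i} for i in range(20)]
train_part, validation_part = holdout_split(toy_examples)
```

If the corpus is loaded with the Hugging Face `datasets` library, the `Dataset.train_test_split(test_size=..., seed=...)` method offers the same kind of seeded split without a custom helper.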
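[More Information Needed]
[More Information Needed]
Who are the source language producers? [More Information Needed]
[More Information Needed]
Who are the annotators? [More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]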
H. Bahadir Sahin, Caglar Tirkaz, Eray Yildiz, Mustafa Tolga Eren, and Omer Ozan Sonmez
Creative Commons Attribution 4.0 International
@article{DBLP:journals/corr/SahinTYES17,
  author        = {H. Bahadir Sahin and Caglar Tirkaz and Eray Yildiz and Mustafa Tolga Eren and Omer Ozan Sonmez},
  title         = {Automatically Annotated Turkish Corpus for Named Entity Recognition and Text Categorization using Large-Scale Gazetteers},
  journal       = {CoRR},
  volume        = {abs/1702.02363},
  year          = {2017},
  url           = {http://arxiv.org/abs/1702.02363},
  archivePrefix = {arXiv},
  eprint        = {1702.02363},
  timestamp     = {Mon, 13 Aug 2018 16:46:36 +0200},
  biburl        = {https://dblp.org/rec/journals/corr/SahinTYES17.bib},
  bibsource     = {dblp computer science bibliography, https://dblp.org}
}
Thanks to @merveenoyan for adding this dataset.