Dataset: allenai/scicite
Tasks: text-classification
Language: en
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: found
Source datasets: original
Preprint: arxiv:1904.01608
License: unknown

This is a dataset for classifying citation intents in academic papers. In each JSON object, the main citation intent label is specified under the label key, and the citation context is specified under the context key.

Example: {'string': 'In chacma baboons, male-infant relationships can be linked to both formation of friendships and paternity success [30,31].', 'sectionName': 'Introduction', 'label': 'background', 'citingPaperId': '7a6b2d4b405439', 'citedPaperId': '9d1abadc55b5e0', ...}

You can obtain full information about each paper through the Semantic Scholar API ( https://api.semanticscholar.org/ ).

The labels are: Method, Background, Result.
An example from the 'validation' split looks as follows.
{ "citeEnd": 68, "citeStart": 64, "citedPaperId": "5e413c7872f5df231bf4a4f694504384560e98ca", "citingPaperId": "8f1fbe460a901d994e9b81d69f77bfbe32719f4c", "excerpt_index": 0, "id": "8f1fbe460a901d994e9b81d69f77bfbe32719f4c>5e413c7872f5df231bf4a4f694504384560e98ca", "isKeyCitation": false, "label": 2, "label2": 0, "label2_confidence": 0.0, "label_confidence": 0.0, "sectionName": "Discussion", "source": 4, "string": "These results are in contrast with the findings of Santos et al.(16), who reported a significant association between low sedentary time and healthy CVF among Portuguese" }
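The fields in the record above can be read with a few lines of Python. The following is a minimal sketch, not the dataset's official loader: it copies a subset of the fields from the validation example, uses citeStart and citeEnd as character offsets into string, and assumes that the integer label ids follow the order in which this card lists the classes (Method, Background, Result). The names LABEL_NAMES and describe are hypothetical helpers introduced for illustration.

```python
import json

# Assumption: integer label ids follow the order in which the card lists the classes.
LABEL_NAMES = ["method", "background", "result"]

# A subset of the fields from the validation example shown above.
RECORD_JSON = """
{
  "citeStart": 64,
  "citeEnd": 68,
  "isKeyCitation": false,
  "label": 2,
  "sectionName": "Discussion",
  "string": "These results are in contrast with the findings of Santos et al.(16), who reported a significant association between low sedentary time and healthy CVF among Portuguese"
}
"""


def describe(record):
    """Summarize one citation record: intent name, section, and citation marker."""
    text = record["string"]
    # citeStart/citeEnd are treated as character offsets into the context string.
    marker = text[record["citeStart"]:record["citeEnd"]]
    return {
        "intent": LABEL_NAMES[record["label"]],
        "section": record["sectionName"],
        "marker": marker,
    }


record = json.loads(RECORD_JSON)
summary = describe(record)
print(summary)
```

For this record the offsets select the in-text citation marker, and the decoded intent matches the wording of the sentence, which contrasts the citing paper's results with earlier findings.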
The data fields are the same among all splits.
name | train | validation | test |
---|---|---|---|
default | 8194 | 916 | 1859 |
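As a quick sanity check on the split sizes in the table, the sketch below sums the three splits and computes each split's share of the total. The variable names are illustrative only; the counts are taken directly from the table.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"train": 8194, "validation": 916, "test": 1859}

total = sum(SPLIT_SIZES.values())

# Share of each split as a percentage of all examples, rounded to one decimal.
shares = {name: round(100 * count / total, 1) for name, count in SPLIT_SIZES.items()}

for name, count in SPLIT_SIZES.items():
    print(f"{name}: {count} examples ({shares[name]}%)")
print(f"total: {total}")
```

The total falls inside the 10K<n<100K size category stated in the card metadata.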
@inproceedings{cohan-etal-2019-structural,
    title = "Structural Scaffolds for Citation Intent Classification in Scientific Publications",
    author = "Cohan, Arman and Ammar, Waleed and van Zuylen, Madeleine and Cady, Field",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1361",
    doi = "10.18653/v1/N19-1361",
    pages = "3586--3596",
}
Thanks to @lewtun, @patrickvonplaten, @mariamabarham, @thomwolf for adding this dataset.