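Dataset:
cakiki/args_me
Tasks:
Text retrieval
Sub-tasks:
document-retrieval

The args.me corpus (version 1.0, cleaned) comprises 382,545 arguments crawled from four debate portals in mid-2019. The debate portals are Debatewise, IDebate.org, Debatepedia, and Debate.org. The arguments were extracted using heuristics designed for each debate portal.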
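    import datasets

    # Stream the 'corpus' configuration instead of downloading it in full.
    # split='train' is an assumption about the split name; without a split,
    # load_dataset returns a dict of splits and iterating it yields split names.
    args = datasets.load_dataset('cakiki/args_me', 'corpus', split='train', streaming=True)

    # Print the fields of the first argument, then stop.
    for arg in args:
        print(arg['conclusion'])
        print(arg['id'])
        print(arg['argument'])
        print(arg['stance'])
        break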
Document retrieval, argument retrieval for controversial questions.
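To make the retrieval task concrete, here is a minimal sketch of ranking arguments against a controversial question by simple term overlap. It is a toy baseline for illustration only, not the method used to build or evaluate the corpus. The function names `tokenize` and `rank_arguments` and the three records in `toy_arguments` are invented for this example; only the field names follow the corpus records.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_arguments(query, arguments, top_k=3):
    """Rank argument records by how often the query terms occur in them.

    `arguments` is a list of dicts with 'id', 'conclusion', 'argument'
    and 'stance' keys, mirroring the fields of a corpus record.
    Returns up to `top_k` (score, id) pairs, best match first.
    """
    query_terms = set(tokenize(query))
    scored = []
    for arg in arguments:
        counts = Counter(tokenize(arg["conclusion"] + " " + arg["argument"]))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, arg["id"]))
    # Sort by descending score; break ties by id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_k]


# Invented toy records (hypothetical, not taken from the corpus).
toy_arguments = [
    {"id": "a1", "conclusion": "School uniforms should be mandatory",
     "argument": "Uniforms reduce peer pressure in school.", "stance": "PRO"},
    {"id": "a2", "conclusion": "Science is the best!",
     "argument": "Physical education is better than science.", "stance": "CON"},
    {"id": "a3", "conclusion": "Ban homework",
     "argument": "Homework takes time away from sport.", "stance": "PRO"},
]

results = rank_arguments("Is science better than sport?", toy_arguments)
for score, arg_id in results:
    print(score, arg_id)
```

A practical system would replace the overlap count with a proper ranking function such as BM25 or a dense retriever; the sketch only shows the shape of the input and output.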
The args.me corpus is monolingual; it only includes English (mostly en-US) documents.
{'conclusion': 'Science is the best!', 'id': 'd6517702-2019-04-18T12:36:24Z-00000-000', 'argument': 'Science is aright I guess, but Physical Education (P.E) is better. Think about it, you could sit in a classroom for and hour learning about molecular reconfiguration, or you could play football with your mates. Why would you want to learn about molecular reconfiguration anyway? I think the argument here would be based on, healthy mind or healthy body. With science being the healthy mind and P.E being the healthy body. To work this one out all you got to do is ask Steven Hawkins. Only 500 words', 'stance': 'CON'}
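The instance above has four fields: `conclusion`, `id`, `argument`, and `stance`. A minimal sketch of a sanity check on such a record follows. The helper name `check_record` is hypothetical, and the set of stance labels (`PRO`, `CON`) is an assumption based on the sample rather than a documented schema. The `argument` text is shortened here for brevity.

```python
REQUIRED_FIELDS = ("conclusion", "id", "argument", "stance")
KNOWN_STANCES = {"PRO", "CON"}  # assumption: only these two labels occur


def check_record(record):
    """Return a list of problems found in one argument record (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if name not in record]
    if "stance" in record and record["stance"] not in KNOWN_STANCES:
        problems.append(f"unexpected stance: {record['stance']}")
    return problems


# The sample instance, with the argument text shortened for this example.
sample = {
    "conclusion": "Science is the best!",
    "id": "d6517702-2019-04-18T12:36:24Z-00000-000",
    "argument": "Science is aright I guess, but Physical Education (P.E) is better.",
    "stance": "CON",
}

print(check_record(sample))
print(check_record({"id": "x", "stance": "MAYBE"}))
```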
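[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
Who are the source language producers? [More Information Needed]
[More Information Needed]
Who are the annotators? [More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]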
Creative Commons Attribution 4.0 International (CC BY 4.0)
    @dataset{yamen_ajjour_2020_4139439,
      author    = {Yamen Ajjour and Henning Wachsmuth and Johannes Kiesel and
                   Martin Potthast and Matthias Hagen and Benno Stein},
      title     = {args.me corpus},
      month     = oct,
      year      = 2020,
      publisher = {Zenodo},
      version   = {1.0-cleaned},
      doi       = {10.5281/zenodo.4139439},
      url       = {https://doi.org/10.5281/zenodo.4139439}
    }