Dataset:
bigbio/biology_how_why_corpus
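This dataset contains 185 "how" and 193 "why" biology questions written by a domain expert, with one or more gold answer passages identified in an undergraduate textbook. The expert was not constrained in any way during annotation, so a gold answer may be smaller than a paragraph or may span multiple paragraphs. The dataset was used for the question-answering system described in the paper "Discourse Complements Lexical Semantics for Non-factoid Answer Reranking" (ACL 2014).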
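Because a gold answer can be shorter than a paragraph or run across several, one way to picture the annotation is as a list of character spans over textbook paragraphs. The sketch below is only an illustration under that assumption: `GoldSpan`, `gold_answer_text`, `covers_whole_paragraphs`, and the toy paragraphs are hypothetical and do not reflect the corpus's actual schema or contents.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GoldSpan:
    """Hypothetical record: part of a gold answer, as offsets into one paragraph."""
    paragraph_index: int
    start: int  # inclusive character offset within the paragraph
    end: int    # exclusive character offset within the paragraph


def gold_answer_text(paragraphs: List[str], spans: List[GoldSpan]) -> str:
    """Join the text covered by each span, in order, separated by a space."""
    return " ".join(paragraphs[s.paragraph_index][s.start:s.end] for s in spans)


def covers_whole_paragraphs(paragraphs: List[str], spans: List[GoldSpan]) -> bool:
    """Return True only if every span covers its paragraph exactly."""
    return all(
        s.start == 0 and s.end == len(paragraphs[s.paragraph_index])
        for s in spans
    )


# Toy paragraphs invented for illustration (not taken from the corpus).
first_sentence = "Enzymes lower activation energy."
paragraphs = [
    first_sentence + " They are not consumed by the reaction.",
    "Temperature affects enzyme activity.",
]

# A gold answer smaller than a paragraph: only the first sentence of paragraph 0.
partial = [GoldSpan(0, 0, len(first_sentence))]

# A gold answer spanning multiple paragraphs: both paragraphs in full.
multi = [GoldSpan(i, 0, len(p)) for i, p in enumerate(paragraphs)]

partial_text = gold_answer_text(paragraphs, partial)
multi_text = gold_answer_text(paragraphs, multi)
```

Storing offsets rather than whole-paragraph identifiers is a design choice of this sketch: it lets a single representation handle both the sub-paragraph and the multi-paragraph case described above.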
@inproceedings{jansen-etal-2014-discourse,
  title = "Discourse Complements Lexical Semantics for Non-factoid Answer Reranking",
  author = "Jansen, Peter and Surdeanu, Mihai and Clark, Peter",
  booktitle = "Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
  month = jun,
  year = "2014",
  address = "Baltimore, Maryland",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/P14-1092",
  doi = "10.3115/v1/P14-1092",
  pages = "977--986",
}