Note: this is the data release for SARA v1. For SARA v2, see https://nlp.jhu.edu/law/ (coming soon to Hugging Face!)
If you use this dataset, please cite our work:
@inproceedings{Holzenberger2020ADF,
  title={A Dataset for Statutory Reasoning in Tax Law Entailment and Question Answering},
  author={Nils Holzenberger and Andrew Blair-Stanek and Benjamin Van Durme},
  booktitle={NLLP@KDD},
  year={2020}
}
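There are two tasks, question answering and natural language inference, each with a training set and a test set. There is no official leaderboard.
English
Here is an example instance: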
{
  "id": "s151_a_neg",
  "text": "Alice's income in 2015 is $100000. She gets one exemption of $2000 for the year 2015 under section 151(c). Alice is not married.",
  "question": "Alice's total exemption for 2015 under section 151(a) is equal to $6000",
  "answer": "Contradiction",
  "facts": ":- discontiguous s151_c\/4.\n:- [statutes\/prolog\/init].\nincome_(alice_makes_money).\nagent_(alice_makes_money,alice).\nstart_(alice_makes_money,\"2015-01-01\").\nend_(alice_makes_money,\"2015-12-31\").\namount_(alice_makes_money,100000).\ns151_c(alice,_,2000,2015).",
  "test": ":- \\+ s151_a(alice,6000,2015)."
}
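The fields of an instance can be unpacked with a few lines of Python. The sketch below is illustrative only: it decodes the example instance shown above with the standard `json` module, reads the case description, statement, and label, and splits the `facts` field into one clause per line. The variable names (`raw`, `premise`, `hypothesis`, `fact_lines`) are hypothetical, and treating `facts` and `test` as Prolog source is an assumption based on their appearance in this example.

```python
import json

# The example instance from above, kept as a raw string so that the JSON
# escapes (\/ , \n , \" , \\) reach the JSON parser unchanged.
raw = r'''{
  "id": "s151_a_neg",
  "text": "Alice's income in 2015 is $100000. She gets one exemption of $2000 for the year 2015 under section 151(c). Alice is not married.",
  "question": "Alice's total exemption for 2015 under section 151(a) is equal to $6000",
  "answer": "Contradiction",
  "facts": ":- discontiguous s151_c\/4.\n:- [statutes\/prolog\/init].\nincome_(alice_makes_money).\nagent_(alice_makes_money,alice).\nstart_(alice_makes_money,\"2015-01-01\").\nend_(alice_makes_money,\"2015-12-31\").\namount_(alice_makes_money,100000).\ns151_c(alice,_,2000,2015).",
  "test": ":- \\+ s151_a(alice,6000,2015)."
}'''

instance = json.loads(raw)

# Natural-language view of the instance: case description, statement, label.
premise = instance["text"]
hypothesis = instance["question"]
label = instance["answer"]

# Structured view (assumed to be Prolog): one clause or directive per line.
fact_lines = instance["facts"].splitlines()

print(instance["id"], "->", label)
for line in fact_lines:
    print("  ", line)
print("query:", instance["test"])
```

After decoding, the `\/` escapes become plain slashes and `\n` becomes a real newline, so `fact_lines` can be written to a file or inspected clause by clause.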
The data splits can be accessed as follows:
from datasets import load_dataset

qa_test = load_dataset("jhu-clsp/SARA", "qa", split="test")
qa_train = load_dataset("jhu-clsp/SARA", "qa", split="train")
nli_test = load_dataset("jhu-clsp/SARA", "nli", split="test")
nli_train = load_dataset("jhu-clsp/SARA", "nli", split="train")
For details, see the paper: https://ceur-ws.org/Vol-2645/paper5.pdf