Dataset:

PNLPhub/FarsTail

License:

apache-2.0

ArXiv:

arxiv:2009.08820

Size:

1K<n<10K

Languages:

fa

Dataset Summary

Persian (Farsi) is a pluricentric language spoken by around 110 million people in countries such as Iran, Afghanistan, and Tajikistan. Here, we present FarsTail, the first relatively large-scale Persian dataset for the natural language inference (NLI) task. A total of 10,367 samples are generated from a collection of 3,539 multiple-choice questions. The train, validation, and test portions include 7,266, 1,537, and 1,564 instances, respectively.
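As a quick sanity check on the figures above, the following minimal sketch confirms that the three split sizes add up to the reported total and computes each split's share. The `load_farstail` helper is hypothetical: it assumes the dataset can be fetched with the Hugging Face `datasets` library using the repository ID shown on this card, which the card itself does not confirm, and it is not called here.

```python
# Split sizes exactly as reported in the Dataset Summary.
SPLIT_SIZES = {"train": 7266, "validation": 1537, "test": 1564}
REPORTED_TOTAL = 10367


def split_fractions(sizes):
    """Return each split's share of the total number of instances."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_farstail():
    """Hypothetical loader (assumption, not confirmed by this card).

    Assumes the `datasets` library is installed and that the repository ID
    from the card metadata loads without extra arguments.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset("PNLPhub/FarsTail")


if __name__ == "__main__":
    # The three splits should account for every generated sample.
    assert sum(SPLIT_SIZES.values()) == REPORTED_TOTAL
    for name, fraction in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {fraction:.1%}")
```

The split works out to roughly 70% train, 15% validation, and 15% test.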

Licensing Information

apache-2.0 (as listed in the license field of this card's metadata)

Citation Information

@article{amirkhani2020farstail,
  title={FarsTail: A Persian Natural Language Inference Dataset},
  author={Hossein Amirkhani and Mohammad Azari Jafari and Azadeh Amirak and Zohreh Pourjafari and Soroush Faridan Jahromi and Zeinab Kouhkan},
  journal={arXiv preprint arXiv:2009.08820},
  year={2020}
}