Dataset:

juletxara/xwinograd

English

XWinograd

A multilingual Winograd schema challenge.

Languages and samples

  • "en": 2325
  • "fr": 83
  • "jp": 959
  • "pt": 263
  • "ru": 315
  • "zh": 504
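The counts above can be checked against the loaded data. The following is a minimal sketch, not part of the original card: the helper names `load_test_split` and `size_mismatches` are hypothetical, and it assumes every language config exposes a "test" split (the card only shows this for "zh").

```python
# Sample counts as listed above (config name -> number of examples).
EXPECTED_SIZES = {"en": 2325, "fr": 83, "jp": 959, "pt": 263, "ru": 315, "zh": 504}


def load_test_split(lang):
    """Load one language config from the Hub (needs network access).

    Assumption: each config has a "test" split, as shown for "zh" in this card.
    """
    # Imported lazily so the rest of this sketch works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("juletxara/xwinograd", lang)["test"]


def size_mismatches(loader=load_test_split, expected=EXPECTED_SIZES):
    """Return {lang: (expected, actual)} for configs whose size differs from the list."""
    mismatches = {}
    for lang, n_expected in expected.items():
        n_actual = len(loader(lang))
        if n_actual != n_expected:
            mismatches[lang] = (n_expected, n_actual)
    return mismatches


# Total number of examples across all six languages, computed from the list alone.
TOTAL_EXAMPLES = sum(EXPECTED_SIZES.values())
```

Calling `size_mismatches()` with no arguments downloads each config in turn; an empty dictionary would mean the loaded sizes agree with the list.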

Dataset creation

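The Winograd schemas come from the XWinograd dataset introduced by Tikhonov et al. Because that dataset contains only 16 Chinese schemas, we added 488 more Chinese schemas from clue/cluewsc2020, for a total of 504.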

If you only want the original xWinograd Chinese schemas, run:

from datasets import load_dataset
load_dataset("juletxara/xwinograd", "zh")["test"].select(range(16))
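To keep the two Chinese sources apart, the split can be divided by row position. This is a rough sketch under one assumption taken from the snippet above: the 16 original xWinograd schemas are the first 16 rows of the "zh" test split, and the remaining rows come from clue/cluewsc2020. The helper name `split_zh` is hypothetical.

```python
N_ORIGINAL_ZH = 16  # number of original xWinograd Chinese schemas, per the text above


def split_zh(test_split, n_original=N_ORIGINAL_ZH):
    """Split the "zh" test split into (original, added) parts by row position.

    Assumption: the original schemas occupy the first `n_original` rows.
    Works on a `datasets.Dataset` (via `.select`) or on any plain sequence.
    """
    n_total = len(test_split)
    cut = min(n_original, n_total)
    original_idx = range(cut)
    added_idx = range(cut, n_total)
    if hasattr(test_split, "select"):
        # datasets.Dataset: select returns a new Dataset with the given rows.
        return test_split.select(original_idx), test_split.select(added_idx)
    # Fallback for lists and other indexable sequences.
    return [test_split[i] for i in original_idx], [test_split[i] for i in added_idx]


# Usage with the real data (needs network access):
# from datasets import load_dataset
# zh = load_dataset("juletxara/xwinograd", "zh")["test"]
# original, added = split_zh(zh)
```

If the row order of the hosted dataset ever changes, this positional split would no longer separate the two sources, so it is worth spot-checking a few rows.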

Additional information

Citation information

@misc{muennighoff2022crosslingual,
      title={Crosslingual Generalization through Multitask Finetuning}, 
      author={Niklas Muennighoff and Thomas Wang and Lintang Sutawika and Adam Roberts and Stella Biderman and Teven Le Scao and M Saiful Bari and Sheng Shen and Zheng-Xin Yong and Hailey Schoelkopf and Xiangru Tang and Dragomir Radev and Alham Fikri Aji and Khalid Almubarak and Samuel Albanie and Zaid Alyafeai and Albert Webson and Edward Raff and Colin Raffel},
      year={2022},
      eprint={2211.01786},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
@misc{tikhonov2021heads,
    title={It's All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning},
    author={Alexey Tikhonov and Max Ryabinin},
    year={2021},
    eprint={2106.12066},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Contributions

Thanks to Jordan Clive, @yongzx and @khalidalt for their support in adding Chinese.