Dataset:
llm-book/jawiki-20220404-c400
This dataset contains passages from Japanese Wikipedia as of 2022-04-04. Each passage consists of consecutive sentences and is no longer than 400 characters. The dataset is used in baseline systems for the AI王 question answering competition, such as cl-tohoku/AIO3_BPR_baseline.
Please refer to the original repository for further details.
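
To make the passage format concrete, here is a minimal sketch of how consecutive sentences could be packed into passages of at most 400 characters. It is an illustration only, not the procedure used to build this dataset: the function name `build_passages`, the greedy packing strategy, and the handling of over-long sentences are all assumptions.

```python
from typing import Iterable, List

MAX_CHARS = 400  # passage length limit stated in the dataset description


def build_passages(sentences: Iterable[str], max_chars: int = MAX_CHARS) -> List[str]:
    """Greedily pack consecutive sentences into passages of at most max_chars characters.

    Hypothetical helper: the dataset's actual splitting rules may differ.
    A single sentence longer than max_chars is kept as its own passage here,
    so that one passage can exceed the limit (an assumption of this sketch).
    """
    passages: List[str] = []
    current = ""
    for sentence in sentences:
        # Close the current passage if adding this sentence would exceed the limit.
        if current and len(current) + len(sentence) > max_chars:
            passages.append(current)
            current = ""
        current += sentence
    # Flush the last, possibly shorter, passage.
    if current:
        passages.append(current)
    return passages
```

Because sentences are never split or reordered, concatenating the resulting passages reproduces the original text, which keeps each passage readable on its own for retrieval.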