Dataset:

potsawee/xsum_eng2thai

Dataset Card for "xsum_eng2thai"

  • This dataset is based on XSum.
  • The summaries were translated from English (as in the original XSum) to Thai using Meta's NLLB-200-3.3B.
  • The dataset is intended for Cross-Lingual Summarization (English Document -> Thai Summary).

Data Fields

  • id: BBC ID of the article.
  • document: a string containing the body of the news article.
  • summary: a string containing a translated summary of the article.

Data Structure

{
    "id": "29750031",
    "document": "news article in English",
    "summary": "summary in Thai"
}
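
As a minimal sketch of how a record with this structure might be checked before use, the snippet below mirrors the three fields in a typed dictionary. The names XSumEng2ThaiRecord and validate_record are hypothetical and introduced only for illustration; they are not part of the dataset or of any library.

```python
from typing import TypedDict


class XSumEng2ThaiRecord(TypedDict):
    """One example, mirroring the three fields listed in this card (hypothetical type)."""
    id: str        # BBC ID of the article
    document: str  # body of the news article, in English
    summary: str   # translated summary, in Thai


REQUIRED_FIELDS = ("id", "document", "summary")


def validate_record(record: dict) -> XSumEng2ThaiRecord:
    """Hypothetical helper: check that all three fields exist and are strings."""
    for field in REQUIRED_FIELDS:
        if field not in record:
            raise KeyError(f"missing field: {field}")
        if not isinstance(record[field], str):
            raise TypeError(f"field {field!r} must be a string")
    return XSumEng2ThaiRecord(
        id=record["id"],
        document=record["document"],
        summary=record["summary"],
    )


# The placeholder record from the Data Structure example.
example = validate_record({
    "id": "29750031",
    "document": "news article in English",
    "summary": "summary in Thai",
})
print(example["id"])
```

A check like this is optional; it is mainly useful when records are read from exported JSON files rather than through a dataset loader.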

Data Splits

train/validation/test = 204045/11332/11334
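
For a quick sense of the proportions, the split sizes above can be totalled and converted to shares. This is only arithmetic on the numbers listed in this card; the variable names are illustrative.

```python
# Split sizes as listed in this card.
SPLIT_SIZES = {"train": 204045, "validation": 11332, "test": 11334}

total = sum(SPLIT_SIZES.values())

# Fraction of all examples that falls into each split.
shares = {name: size / total for name, size in SPLIT_SIZES.items()}

for name, size in SPLIT_SIZES.items():
    print(f"{name:<10} {size:>7} {shares[name]:6.2%}")
print(f"{'total':<10} {total:>7}")
```

The total comes to 226711 examples, with roughly 90% in the training split and about 5% each in validation and test.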