Dataset:

scaredmeow/shopee-reviews-tl-stars

Language:

tl

Size:

1K<n<10K

DOI:

10.57967/hf/0656

License:

mpl-2.0

Dataset Card for shopee-reviews-tl-stars

Dataset Summary

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Supported Tasks and Leaderboards

[More Information Needed]

Languages

Tagalog (TL)

Dataset Structure

Data Instances

A typical data point comprises a text and the corresponding label.

An example from the test set looks as follows:

{
    'label': 2,
    'text': 'Madaling masira yung sa may sinisintasan nya. Wala rin syang box. Sana mas ginawa pa na matibay para sana sulit yung pagkakabili'
}

Data Fields

  • 'text': The review text is enclosed in double quotes ("), and any internal double quote is escaped by doubling it ("").
  • 'label': Corresponds to the score associated with the review (between 1 and 5).
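The quoting convention described for the 'text' field matches the default behavior of Python's standard csv module, so a round trip can be sketched as follows. This is only an illustration: the helper names encode_row and decode_row and the sample review text are hypothetical, and the card does not state which tool actually produced the files.

```python
import csv
import io


def encode_row(label: int, text: str) -> str:
    """Write one (label, text) record as a CSV line.

    QUOTE_NONNUMERIC wraps the text in double quotes, and the csv module
    doubles any internal double quote (doublequote=True is the default).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow([label, text])
    return buffer.getvalue()


def decode_row(line: str) -> tuple[int, str]:
    """Parse one CSV line back into (label, text), undoing the quote doubling."""
    fields = next(csv.reader(io.StringIO(line)))
    return int(fields[0]), fields[1]


# Hypothetical review text with internal double quotes (not taken from the dataset).
sample_text = 'Sabi nila "matibay" daw'
line = encode_row(2, sample_text)
label, text = decode_row(line)
```

Reading a file with csv.reader (or any CSV parser that honors doubled quotes) returns the text with single internal quotes restored, so no manual unescaping should be needed.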

Data Splits

The Shopee reviews tl 15 dataset is constructed by randomly sampling, for each review star from 1 to 5, 2100 training samples and 450 samples each for validation and testing. In total there are 10500 training samples and 2250 samples each in the validation and testing sets.
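The per-star sampling and the resulting totals can be sketched in Python. This is a minimal illustration under stated assumptions, not the curators' actual procedure: the function names, the fixed seed, and the choice to keep the three splits disjoint are all hypothetical, since the card does not describe them.

```python
import random
from collections import defaultdict

STARS = range(1, 6)      # review stars 1 to 5
TRAIN_PER_STAR = 2100    # per-star training samples, from the card
EVAL_PER_STAR = 450      # per-star samples for validation and for test, from the card


def split_sizes() -> dict[str, int]:
    """Total samples per split: per-star count times the number of star classes."""
    n_classes = len(STARS)
    return {
        "train": TRAIN_PER_STAR * n_classes,
        "validation": EVAL_PER_STAR * n_classes,
        "test": EVAL_PER_STAR * n_classes,
    }


def stratified_split(reviews, train_n=TRAIN_PER_STAR, eval_n=EVAL_PER_STAR, seed=0):
    """Draw train_n + 2 * eval_n reviews per star and cut them into three splits.

    `reviews` is an iterable of (text, star) pairs. Disjoint splits and the
    seed value are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    by_star = defaultdict(list)
    for text, star in reviews:
        by_star[star].append((text, star))

    splits = {"train": [], "validation": [], "test": []}
    needed = train_n + 2 * eval_n
    for star in STARS:
        pool = by_star[star]
        if len(pool) < needed:
            raise ValueError(f"star {star}: need {needed} reviews, have {len(pool)}")
        picked = rng.sample(pool, needed)  # sampling without replacement
        splits["train"].extend(picked[:train_n])
        splits["validation"].extend(picked[train_n:train_n + eval_n])
        splits["test"].extend(picked[train_n + eval_n:])
    return splits
```

Calling split_sizes() reproduces the totals stated above (10500, 2250, 2250). Because every star contributes the same number of samples, each split is balanced across the five classes.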

Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

Initial Data Collection and Normalization

[More Information Needed]

Who are the source language producers?

[More Information Needed]

Annotations

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

The dataset is tagged with the mpl-2.0 license (Mozilla Public License 2.0).

Citation Information

[More Information Needed]

Contributions

[More Information Needed]