Dataset: roman_urdu
Tasks: text classification
Languages: ur
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: found
Annotations creators: crowdsourced
Source datasets: original
License: unknown
[More Information Needed]
[More Information Needed]
Urdu
[More Information Needed]
Wah je wah,Positive,
Each row consists of a short Urdu text written in Roman script, followed by a sentiment label. Each label is one of Positive, Negative, or Neutral. Note that the original source file is a comma-separated values (CSV) file.
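As a rough illustration of the row format described above, the following Python sketch parses comma-separated rows into (text, label) pairs using only the standard library. The names `parse_rows` and `VALID_LABELS` are hypothetical and not part of the dataset or any library. The sketch assumes the text comes first and the label second, and that the trailing comma in the sample row yields an empty third field that can be ignored.

```python
import csv
import io
from collections import Counter

# The three sentiment labels named in the dataset description.
VALID_LABELS = {"Positive", "Negative", "Neutral"}


def parse_rows(csv_text):
    """Yield (text, label) pairs from comma-separated rows.

    Assumption: column 0 is the text and column 1 is the label;
    any further columns (such as the empty field produced by a
    trailing comma) are ignored.
    """
    reader = csv.reader(io.StringIO(csv_text))
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        text, label = row[0].strip(), row[1].strip()
        if label in VALID_LABELS:
            yield text, label


# The sample row shown in the card.
sample = "Wah je wah,Positive,\n"
pairs = list(parse_rows(sample))
label_counts = Counter(label for _, label in pairs)
print(pairs)
print(label_counts)
```

Rows whose second field is not one of the three labels are dropped silently here; a stricter loader might prefer to raise an error instead.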
[More Information Needed]
[More Information Needed]
Initial Data Collection and Normalization
[More Information Needed]
Who are the source language producers?
[More Information Needed]
[More Information Needed]
Who are the annotators?
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
@InProceedings{Sharf:2018,
  title = "Performing Natural Language Processing on Roman Urdu Datasets",
  author = "Zareen Sharf and Saif Ur Rahman",
  booktitle = "International Journal of Computer Science and Network Security",
  volume = "18",
  number = "1",
  pages = "141-148",
  year = "2018"
}

@misc{Dua:2019,
  author = "Dua, Dheeru and Graff, Casey",
  year = "2017",
  title = "{UCI} Machine Learning Repository",
  url = "http://archive.ics.uci.edu/ml",
  institution = "University of California, Irvine, School of Information and Computer Sciences"
}
Thanks to @jaketae for adding this dataset.