Datasets:

finer

Languages:

fi

Multilinguality:

monolingual

Size categories:

10K<n<100K

Language creators:

other

Annotations creators:

expert-generated

Source datasets:

original

ArXiv:

arxiv:1908.04212

License:

mit

Dataset Card for finer

Dataset Summary

[More Information Needed]

Supported Tasks and Leaderboards

[More Information Needed]

Languages

The text is in Finnish (fi); the dataset is monolingual.

Dataset Structure

Data Instances

[More Information Needed]

Data Fields

Each row consists of the following fields:

  • id: The sentence id
  • tokens: An ordered list of tokens from the full text
  • ner_tags: Named entity recognition tags for each token
  • nested_ner_tags: Nested named entity recognition tags for each token

Note that, by design, tokens, ner_tags, and nested_ner_tags always have the same length.
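As a rough illustration of the row layout described above, the following sketch models one row as a Python dictionary and checks the equal-length property. The example sentence, the id value, the tag values, and the helper name check_row are all made up for illustration and are not taken from the dataset; the tags are shown as strings here, although the stored representation may differ.

```python
from typing import List, TypedDict


class FinerRow(TypedDict):
    """Hypothetical typed view of one row; field types are assumptions."""
    id: str
    tokens: List[str]
    ner_tags: List[str]
    nested_ner_tags: List[str]


def check_row(row: FinerRow) -> bool:
    """Return True if all three per-token lists have the same length."""
    n = len(row["tokens"])
    return len(row["ner_tags"]) == n and len(row["nested_ner_tags"]) == n


# Invented example row (not real data from the dataset).
example_row: FinerRow = {
    "id": "0",
    "tokens": ["Nokia", "Oyj", "sijaitsee", "Espoossa", "."],
    "ner_tags": ["B-ORG", "I-ORG", "O", "B-LOC", "O"],
    "nested_ner_tags": ["O", "O", "O", "O", "O"],
}

print(check_row(example_row))
```

A check like this can be run over every row before training to catch truncated or misaligned examples early.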

ner_tags and nested_ner_tags correspond to the list below:

[ "O", "B-DATE", "B-EVENT", "B-LOC", "B-ORG", "B-PER", "B-PRO", "I-DATE", "I-EVENT", "I-LOC", "I-ORG", "I-PER", "I-PRO" ]

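The IOB2 labeling scheme is used: B- marks the first token of an entity, I- marks each following token of the same entity, and O marks tokens outside any entity.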
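To make the tagging scheme concrete, here is a minimal sketch of how IOB2 tags could be grouped into entity spans. It assumes that a tag is either a label string or an integer index into the list above; the function name decode_spans and the sample sentence are hypothetical, and an I- tag that does not continue an open entity of the same type is simply skipped.

```python
from typing import List, Tuple, Union

# Label list copied from the card; the index order is assumed to match it.
LABELS = [
    "O", "B-DATE", "B-EVENT", "B-LOC", "B-ORG", "B-PER", "B-PRO",
    "I-DATE", "I-EVENT", "I-LOC", "I-ORG", "I-PER", "I-PRO",
]


def decode_spans(
    tokens: List[str], tags: List[Union[int, str]]
) -> List[Tuple[str, str]]:
    """Group IOB2-tagged tokens into (entity_type, text) pairs."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the entity that is currently open, if any.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        label = LABELS[tag] if isinstance(tag, int) else tag
        if label.startswith("B-"):
            flush()
            current_type = label[2:]
            current_tokens = [token]
        elif label.startswith("I-") and current_type == label[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open entity.
            flush()
            current_type = None
            current_tokens = []
    flush()
    return spans


# Invented sample sentence (not real data from the dataset).
sample_tokens = ["Nokia", "Oyj", "sijaitsee", "Espoossa", "."]
sample_tags = ["B-ORG", "I-ORG", "O", "B-LOC", "O"]
print(decode_spans(sample_tokens, sample_tags))
```

The same function can be applied separately to ner_tags and nested_ner_tags, since both use the same label list.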

Data Splits

[More Information Needed]

Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

Initial Data Collection and Normalization

[More Information Needed]

Who are the source language producers?

[More Information Needed]

Annotations

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

The metadata for this dataset lists the license as MIT.

Citation Information

[More Information Needed]

Contributions

Thanks to @stefan-it for adding this dataset.