Dataset:

bigbio/tmvar_v3

Language:

en

Multilinguality:

monolingual

arXiv:

arxiv:2204.03637

Dataset Card for tmVar v3

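This dataset contains 500 manually annotated PubMed articles covering mutation mentions of various types, with a dbSNP normalization for each mention. It also includes variant normalization options such as allele-specific identifiers from the ClinGen Allele Registry. The dataset can be used for named entity recognition (NER) and named entity disambiguation (NED) tasks. It does not come with predefined splits.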
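
Because the dataset ships without predefined splits, anyone training or evaluating on it has to create their own. The following is a minimal sketch of one way to do that reproducibly, partitioning at the document level so that no article appears in more than one split. The function name `split_documents`, the 80/10/10 ratio, and the placeholder IDs are assumptions made for illustration; they are not part of the dataset or of the tmVar 3.0 paper.

```python
import random


def split_documents(doc_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Deterministically partition document IDs into train/validation/test.

    Hypothetical helper: the dataset itself defines no splits, so the
    ratio and the seed here are illustrative choices only.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")

    # Deduplicate and sort first so the result does not depend on input order.
    ids = sorted(set(doc_ids))
    # A private Random instance keeps the shuffle reproducible and
    # leaves the global random state untouched.
    random.Random(seed).shuffle(ids)

    n_train = round(len(ids) * fractions[0])
    n_val = round(len(ids) * fractions[1])
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # whatever remains after train and validation
    }


# Placeholder IDs (assumption); real code would pass the PubMed IDs
# of the 500 annotated articles instead.
example_ids = [f"doc{i}" for i in range(500)]
splits = split_documents(example_ids)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting by document rather than by mention avoids leaking mentions from the same article across train and test, which matters for both the NER and the NED use cases. Recording the seed alongside any reported results makes the split reproducible for others.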

Citation Information

@misc{https://doi.org/10.48550/arxiv.2204.03637,
  title        = {tmVar 3.0: an improved variant concept recognition and normalization tool},
  author       = {
    Wei, Chih-Hsuan and Allot, Alexis and Riehle, Kevin and Milosavljevic,
    Aleksandar and Lu, Zhiyong
  },
  year         = 2022,
  publisher    = {arXiv},
  doi          = {10.48550/ARXIV.2204.03637},
  url          = {https://arxiv.org/abs/2204.03637},
  copyright    = {Creative Commons Attribution 4.0 International},
  keywords     = {
    Computation and Language (cs.CL), FOS: Computer and information sciences
  }
}