Dataset:
bigbio/tmvar_v3
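This dataset contains 500 manually annotated PubMed articles covering various types of mutation mentions, with each mention normalized to dbSNP. It also provides variant normalization options such as allele-specific identifiers from the ClinGen Allele Registry. The dataset can be used for named entity recognition (NER) and named entity disambiguation (NED) tasks. It does not come with predefined splits.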
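Because the dataset ships without splits, anyone evaluating a model on it has to define their own. The following is a minimal sketch of one way to do that reproducibly, by hashing each document ID into a train/dev/test bucket. The function names, the 80/10/10 proportions, and the `doc-N` IDs are illustrative assumptions and not part of the dataset or its loader. In practice the IDs would be the document identifiers of the loaded articles.

```python
import hashlib


def assign_split(doc_id: str, dev_fraction: float = 0.1, test_fraction: float = 0.1) -> str:
    """Deterministically map a document ID to 'train', 'dev', or 'test'.

    Hypothetical helper: the fractions are assumptions, not a published split.
    """
    digest = hashlib.sha256(doc_id.encode("utf-8")).hexdigest()
    # Interpret the first 8 hex digits as a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    if bucket < test_fraction:
        return "test"
    if bucket < test_fraction + dev_fraction:
        return "dev"
    return "train"


def split_documents(doc_ids):
    """Group document IDs by their assigned split."""
    splits = {"train": [], "dev": [], "test": []}
    for doc_id in doc_ids:
        splits[assign_split(doc_id)].append(doc_id)
    return splits


# Placeholder IDs standing in for the 500 article identifiers.
example_ids = [f"doc-{i}" for i in range(500)]
splits = split_documents(example_ids)
print({name: len(ids) for name, ids in splits.items()})
```

Hashing the ID rather than shuffling means a document always lands in the same split regardless of load order, and splitting at the document level keeps all mentions from one article together.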
@misc{https://doi.org/10.48550/arxiv.2204.03637,
  title     = {tmVar 3.0: an improved variant concept recognition and normalization tool},
  author    = {Wei, Chih-Hsuan and Allot, Alexis and Riehle, Kevin and Milosavljevic, Aleksandar and Lu, Zhiyong},
  year      = 2022,
  publisher = {arXiv},
  doi       = {10.48550/ARXIV.2204.03637},
  url       = {https://arxiv.org/abs/2204.03637},
  copyright = {Creative Commons Attribution 4.0 International},
  keywords  = {Computation and Language (cs.CL), FOS: Computer and information sciences}
}