CORe Model - BioBERT + Clinical Outcome Pre-Training

Model description

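The CORe (Clinical Outcome Representations) model is introduced in the paper Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration. It is based on BioBERT and was further pre-trained on clinical notes, disease descriptions and medical articles using a specialised Clinical Outcome Pre-Training objective.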

How to use CORe

You can load the model via the transformers library:

from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bvanaken/CORe-clinical-outcome-biobert-v1")
model = AutoModel.from_pretrained("bvanaken/CORe-clinical-outcome-biobert-v1")

From there, you can fine-tune it on clinical tasks that benefit from patient outcome knowledge.
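As a minimal sketch of such a fine-tuning setup, the snippet below loads the CORe checkpoint with a newly initialized classification head through `AutoModelForSequenceClassification`. The multi-label framing, the label names, and the helper names `multi_hot` and `load_core_classifier` are illustrative assumptions and do not come from the model card.

```python
from typing import Iterable, List, Sequence

MODEL_NAME = "bvanaken/CORe-clinical-outcome-biobert-v1"


def multi_hot(labels: Iterable[str], label_space: Sequence[str]) -> List[float]:
    """Encode a set of outcome labels as a multi-hot float vector (hypothetical helper)."""
    present = set(labels)
    unknown = present - set(label_space)
    if unknown:
        raise ValueError(f"labels not in label space: {sorted(unknown)}")
    return [1.0 if name in present else 0.0 for name in label_space]


def load_core_classifier(num_labels: int, multi_label: bool = True):
    """Load CORe with a new, randomly initialized classification head (hypothetical helper)."""
    # Imported lazily so multi_hot() can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME,
        num_labels=num_labels,
        # Assumption: admission notes can carry several outcome labels at once.
        problem_type=(
            "multi_label_classification" if multi_label else "single_label_classification"
        ),
    )
    return tokenizer, model
```

Calling `load_core_classifier(num_labels=len(label_space))` downloads the checkpoint and returns a tokenizer and model that can be passed to a standard training loop. The classification head starts from random weights, so it has to be trained on task data before its predictions are meaningful.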

Pre-training data

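The model is based on BioBERT, which was pre-trained on PubMed data. The Clinical Outcome Pre-Training included discharge summaries from the MIMIC III training set (specified here), medical transcriptions from MTSamples, and clinical notes from the i2b2 challenges 2006-2012. It further includes about 10,000 case reports from PubMed Central (PMC), disease articles from Wikipedia, and article sections from the MedQuAd dataset extracted from NIH websites.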

Citation

@inproceedings{vanaken21,
  author    = {Betty van Aken and
               Jens-Michalis Papaioannou and
               Manuel Mayrdorfer and
               Klemens Budde and
               Felix A. Gers and
               Alexander Löser},
  title     = {Clinical Outcome Prediction from Admission Notes using Self-Supervised
               Knowledge Integration},
  booktitle = {Proceedings of the 16th Conference of the European Chapter of the
               Association for Computational Linguistics: Main Volume, {EACL} 2021,
               Online, April 19 - 23, 2021},
  publisher = {Association for Computational Linguistics},
  year      = {2021},
}

Author:

Betty van Aken

Dataset size:

826.6 MB
