Model:

mrm8488/spanbert-finetuned-squadv2

English

SpanBERT (spanbert-base-cased) fine-tuned on SQuAD v2

Created by Facebook Research and fine-tuned on SQuAD 2.0 for the Q&A task.

Details of SpanBERT

SpanBERT: Improving Pre-training by Representing and Predicting Spans

Details of the downstream task (Q&A) - Dataset

SQuAD 2.0 combines the 100,000 questions from SQuAD 1.1 with over 50,000 unanswerable questions written by crowdworkers to look similar to answerable ones. To do well on SQuAD 2.0, a system must not only answer questions when possible, but also determine when the paragraph supports no answer and abstain from answering.

Dataset     Split   # samples
SQuAD 2.0   train   130k
SQuAD 2.0   eval    12.3k
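In the SQuAD 2.0 data format, unanswerable questions are flagged with `is_impossible: true` and carry an empty `answers` list. A minimal sketch of that distinction (the two records below are made up for illustration, not taken from the dataset):

```python
# Illustrative SQuAD 2.0-style records: one answerable, one unanswerable.
answerable = {
    "question": "Where is the Eiffel Tower?",
    "is_impossible": False,
    "answers": [{"text": "Paris", "answer_start": 25}],
}
unanswerable = {
    "question": "Who moved the Eiffel Tower to Rome?",
    "is_impossible": True,
    "answers": [],  # no supported answer: the system should abstain
}

def has_answer(example):
    """An example is answerable iff it is not flagged impossible."""
    return not example["is_impossible"]

print(has_answer(answerable))    # True
print(has_answer(unanswerable))  # False
```

This is what "abstain from answering" means operationally: for roughly half of the eval set the correct prediction is the empty string.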

Model training

The model was trained on a Tesla P100 GPU with 25GB of RAM. The fine-tuning script can be found here.

Results:

Metric   Value
EM       78.80
F1       82.22

Raw metrics:

{
  "exact": 78.80064010780762,
  "f1": 82.22801347271162,
  "total": 11873,
  "HasAns_exact": 78.74493927125506,
  "HasAns_f1": 85.60951483831069,
  "HasAns_total": 5928,
  "NoAns_exact": 78.85618166526493,
  "NoAns_f1": 78.85618166526493,
  "NoAns_total": 5945,
  "best_exact": 78.80064010780762,
  "best_exact_thresh": 0.0,
  "best_f1": 82.2280134727116,
  "best_f1_thresh": 0.0
}
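The overall figures in the raw metrics are simply the sample-weighted average of the HasAns and NoAns splits. A quick sanity check on the exact-match score:

```python
# Verify that the overall exact-match score equals the sample-weighted
# average of the answerable (HasAns) and unanswerable (NoAns) splits,
# using the values from the raw metrics above.
has_ans_exact, has_ans_total = 78.74493927125506, 5928
no_ans_exact, no_ans_total = 78.85618166526493, 5945
total = has_ans_total + no_ans_total  # 11873

overall_exact = (has_ans_exact * has_ans_total +
                 no_ans_exact * no_ans_total) / total
print(round(overall_exact, 2))  # 78.8, matching the reported "exact"
```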

Comparison:

Model                                EM      F1 score
SpanBERT (reported in the paper)     -       83.6*
mrm8488/spanbert-finetuned-squadv2   78.80   82.22

Model in action

Fast usage with pipelines:

from transformers import pipeline

qa_pipeline = pipeline(
    "question-answering",
    model="mrm8488/spanbert-finetuned-squadv2",
    tokenizer="mrm8488/spanbert-finetuned-squadv2"
)

qa_pipeline({
    'context': "Manuel Romero has been working hard in the repository hugginface/transformers lately",
    'question': "Who has been working hard for hugginface/transformers lately?"
})

# Output: {'answer': 'Manuel Romero', 'end': 13, 'score': 6.836378586818937e-09, 'start': 0}
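Under the hood, a span-extraction QA head emits one start logit and one end logit per context token, and the pipeline returns the span (start, end) with start ≤ end that maximizes the combined probability. A toy sketch with made-up logits over five tokens (these are not the model's real outputs):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def best_span(start_logits, end_logits, max_len=15):
    """Pick the span (start, end) with start <= end that maximizes
    P(start) * P(end), as extractive QA decoding typically does."""
    p_start = softmax(start_logits)
    p_end = softmax(end_logits)
    best = (0, 0, 0.0)
    for s, ps in enumerate(p_start):
        for e in range(s, min(s + max_len, len(p_end))):
            score = ps * p_end[e]
            if score > best[2]:
                best = (s, e, score)
    return best

# Toy logits: the head is most confident that the answer starts at
# token 1 and ends at token 2.
start_logits = [0.1, 4.0, 0.2, 0.1, 0.0]
end_logits = [0.0, 0.5, 3.5, 0.2, 0.1]
s, e, score = best_span(start_logits, end_logits)
print(s, e)  # 1 2
```

The `score` field in the pipeline output above is this combined span probability; the `start`/`end` fields are character offsets of the selected span in the context.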

Created by Manuel Romero/@mrm8488

Made with ♥ in Spain