Model:
biodatlab/score-claim-identification
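This is a model card for detecting claims in the abstracts of social science publications. The model takes an abstract, splits it into sentences, and predicts a claim probability for each sentence. The model was trained on the SCORE dataset. It achieves the following results on the test set:
You can access the model using HuggingFace's transformers library as follows: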
```python
import spacy
from transformers import AutoTokenizer
from transformers import AutoModelForSequenceClassification

nlp = spacy.load("en_core_web_lg")
model_name = "biodatlab/score-claim-identification"
tokenizer_name = "allenai/scibert_scivocab_uncased"
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def inference(abstract: str):
    """
    Split an abstract into sentences and perform claim identification.
    """
    if abstract.strip() == "":
        return "Please provide an abstract as an input."
    claims = []
    sents = [sent.text for sent in nlp(abstract).sents]  # a list of sentences
    inputs = tokenizer(
        sents,
        return_tensors="pt",
        truncation=True,
        padding="longest"
    )
    logits = model(**inputs).logits
    preds = logits.argmax(dim=1)  # convert logits to predictions
    claims = [sent for sent, pred in zip(sents, preds) if pred == 1]
    if len(claims) > 0:
        return ".\n".join(claims)
    else:
        return "No claims found from a given abstract."

# `abstract` is the abstract text (a str) that you want to analyze
claims = inference(abstract)  # string of claim joining with \n
```
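The snippet above returns hard 0/1 predictions via `argmax`. If you want the per-sentence claim probability instead, one option is to apply a softmax over the two logits and read off the value for class 1 (Claim). The following is a minimal sketch rather than part of the released code: the helper names `claim_probabilities` and `select_claims` and the `threshold` parameter are hypothetical, and it assumes the logits have been converted to plain Python lists (for example with `logits.tolist()`).

```python
import math
from typing import List


def claim_probabilities(logits: List[List[float]]) -> List[float]:
    """Return the softmax probability of class 1 (Claim) for each logit row.

    Hypothetical helper: expects one [null_logit, claim_logit] pair per sentence.
    """
    probs = []
    for row in logits:
        shift = max(row)  # subtract the max for numerical stability
        exps = [math.exp(x - shift) for x in row]
        probs.append(exps[1] / sum(exps))
    return probs


def select_claims(sents: List[str], logits: List[List[float]],
                  threshold: float = 0.5) -> List[str]:
    """Keep the sentences whose claim probability exceeds `threshold` (an assumed default)."""
    probs = claim_probabilities(logits)
    return [sent for sent, p in zip(sents, probs) if p > threshold]
```

With two classes, a threshold of 0.5 selects the same sentences as `argmax` apart from exact ties; raising the threshold trades recall for precision.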
The model takes a statement as input and classifies it as Claim (1) or Null (0). Here are some examples:
Statement | Label |
---|---|
We consistently found that participants selectively chose to learn that bad (good) things happened to bad (good) people (Studies 1 to 7) that is, they selectively exposed themselves to deserved outcomes. | 1 (Claim) |
Members of higher status groups generalize characteristics of their ingroup to superordinate categories that serve as a frame of reference for comparisons with outgroups (ingroup projection). | 0 (Null) |
Motivational Interviewing helped the goal progress of those participants who, at pre-screening, reported engaging in many individual pro-environmental behaviors, but the more directive approach worked better for those participants who were less ready to change. | 1 (Claim) |
The following hyperparameters were used during training:
Training Loss | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
---|---|---|---|---|---|---|
0.038000 | 3996 | 0.007086 | 0.997964 | 0.993499 | 0.995656 | 0.991350 |
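As a quick sanity check on the table, the F1 column can be recomputed from the precision and recall columns. This sketch assumes the reported F1 is the standard harmonic mean of precision and recall; the function name `f1_score` is illustrative and not taken from the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of the F1 column)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the table row for step 3996
reported_f1 = 0.993499
recomputed_f1 = f1_score(0.995656, 0.991350)
print(f"reported={reported_f1:.6f} recomputed={recomputed_f1:.6f}")
```

Under that assumption the recomputed value matches the reported F1 up to rounding of the inputs.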
Learn more in the Gradio app on the biodatlab space.