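This model is a fine-tuned version of bert-base-uncased that classifies news articles into one of four categories: World (label 0), Sports (label 1), Business (label 2), and Sci/Tech (label 3).
You can run the model with the following code.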
    from transformers import BertForSequenceClassification, BertTokenizer, TextClassificationPipeline

    # Load the tokenizer and the fine-tuned four-class classifier
    model_path = "JiaqiLee/bert-agnews"
    tokenizer = BertTokenizer.from_pretrained(model_path)
    model = BertForSequenceClassification.from_pretrained(model_path, num_labels=4)

    # Wrap the model and tokenizer in a text classification pipeline
    pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)

    # Classify a sample news article
    print(pipeline("Google scores first-day bump of 18 (USATODAY.com): USATODAY.com - Even a big first-day jump in shares of Google (GOOG) couldn't quiet debate over whether the Internet search engine's contentious auction was a hit or a flop."))
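The pipeline returns a list of dictionaries with a `label` string and a `score`. The sketch below shows one way to turn such a label into a readable category name. It assumes the labels follow the default `LABEL_<index>` naming, which may not match this checkpoint's configuration; `AG_NEWS_CATEGORIES` and `label_to_category` are hypothetical helpers, and the prediction used here is a made-up placeholder, not real model output.

```python
# Category names as listed in the model description (label index -> name).
AG_NEWS_CATEGORIES = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}


def label_to_category(label: str) -> str:
    """Convert a label string such as "LABEL_2" into its category name.

    Assumption: labels look like "LABEL_<index>" (or a bare index).
    """
    index = int(label.rsplit("_", 1)[-1])
    return AG_NEWS_CATEGORIES[index]


# Placeholder prediction in the shape a text classification pipeline returns;
# the values are illustrative only.
example_prediction = [{"label": "LABEL_2", "score": 0.98}]
for item in example_prediction:
    print(label_to_category(item["label"]), item["score"])
```

If the checkpoint's configuration already defines readable label names, this mapping is unnecessary.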
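The training data comes from the HuggingFace AGNews dataset. We trained the model on 90% of the train.csv data and held out the remaining 10% for evaluation.
The model achieves a classification accuracy of 0.9447 on the AGNews test set.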
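A 90/10 split like the one described above could be sketched as follows. The source does not say whether the split was random or which seed was used, so the shuffling, the seed, and the `split_train_eval` helper are all assumptions made for illustration.

```python
import random


def split_train_eval(rows, eval_fraction=0.1, seed=0):
    """Shuffle rows and split them into (train, eval) lists.

    Assumption: a seeded random split; the original procedure is not documented.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    eval_size = int(round(len(shuffled) * eval_fraction))
    # The first eval_size rows form the evaluation set, the rest the training set.
    return shuffled[eval_size:], shuffled[:eval_size]


# Toy stand-in for the rows of train.csv
rows = list(range(1000))
train_rows, eval_rows = split_train_eval(rows)
print(len(train_rows), len(eval_rows))  # 900 100
```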
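Classification accuracy is the fraction of examples whose predicted label equals the true label. A minimal sketch, using a hypothetical `accuracy` helper and toy labels rather than real AGNews predictions:

```python
def accuracy(predicted_labels, true_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predicted_labels and true_labels must have the same length")
    if not true_labels:
        raise ValueError("at least one example is required")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Toy example: 4 of 5 predictions are correct
predicted = [0, 1, 2, 3, 2]
actual = [0, 1, 2, 3, 1]
print(accuracy(predicted, actual))  # 0.8
```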