
T5-Encoder (T5-large) fine-tuned on a very small dataset for token classification

A simple experimental model, trained for 3 epochs on a very small dataset.

Usage

from transformers import AutoTokenizer, AutoModelForTokenClassification, NerPipeline

# trust_remote_code=True is required: the repository ships a custom T5-encoder token-classification head
model = AutoModelForTokenClassification.from_pretrained("imvladikon/t5-english-ner", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("imvladikon/t5-english-ner", trust_remote_code=True)

pipe = NerPipeline(model=model, tokenizer=tokenizer, aggregation_strategy="max")
print(pipe("London is the capital city of England and the United Kingdom"))
"""
[{'entity_group': 'LOCATION',
  'score': 0.84536326,
  'word': 'London',
  'start': 0,
  'end': 6},
 {'entity_group': 'LOCATION',
  'score': 0.8957489,
  'word': 'England',
  'start': 30,
  'end': 37},
 {'entity_group': 'LOCATION',
  'score': 0.73186326,
  'word': 'UnitedKingdom',
  'start': 46,
  'end': 60}]
"""

Usage in spaCy

# Install dependencies first (shell command):
#   pip install spacy transformers git+https://github.com/explosion/spacy-huggingface-pipelines -q
import spacy
from spacy import displacy

text = "My name is Sarah and I live in London"

nlp = spacy.blank("en")
nlp.add_pipe("hf_token_pipe", config={"model": "imvladikon/t5-english-ner", "kwargs": {"trust_remote_code":True}})
doc = nlp(text)
print(doc.ents)
# (Sarah, London)
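
The displacy import above can be used to visualize the predicted entity spans, e.g. in a notebook (a minimal sketch):

displacy.render(doc, style="ent")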

This model is a fine-tuned version of t5-large on a private English dataset. It achieves the following results on the evaluation set (a sketch of how such per-entity metrics are typically computed follows the list):

  • Loss: 0.1956
  • Commercial Item Precision: 0.0
  • Commercial Item Recall: 0.0
  • Commercial Item F1: 0.0
  • Commercial Item Number: 1
  • Date Precision: 0.8125
  • Date Recall: 0.9286
  • Date F1: 0.8667
  • Date Number: 14
  • Location Precision: 0.7143
  • Location Recall: 0.75
  • Location F1: 0.7317
  • Location Number: 20
  • Organization Precision: 0.8588
  • Organization Recall: 0.9125
  • Organization F1: 0.8848
  • Organization Number: 80
  • Other Precision: 0.3684
  • Other Recall: 0.3333
  • Other F1: 0.35
  • Other Number: 21
  • Person Precision: 0.8182
  • Person Recall: 0.9310
  • Person F1: 0.8710
  • Person Number: 29
  • Quantity Precision: 0.8
  • Quantity Recall: 0.8571
  • Quantity F1: 0.8276
  • Quantity Number: 14
  • Title Precision: 0.0
  • Title Recall: 0.0
  • Title F1: 0.0
  • Title Number: 7
  • Overall Precision: 0.75
  • Overall Recall: 0.7903
  • Overall F1: 0.7696
  • Overall Accuracy: 0.9534
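
The per-entity precision, recall and F1 and the overall accuracy above follow standard sequence-labelling conventions. A minimal sketch of how such numbers are typically produced, assuming the seqeval library and BIO-tagged label sequences (the actual evaluation script for this model is not published here):

from seqeval.metrics import classification_report, accuracy_score

# Toy BIO-tagged sequences, only to illustrate the metric calls; not the model's evaluation data
y_true = [["B-PERSON", "I-PERSON", "O", "B-LOCATION", "O"]]
y_pred = [["B-PERSON", "I-PERSON", "O", "B-LOCATION", "B-DATE"]]

print(classification_report(y_true, y_pred))  # per-entity precision / recall / F1 and support
print(accuracy_score(y_true, y_pred))         # token-level accuracy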

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
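
As a rough illustration, these settings map onto transformers TrainingArguments as follows (a minimal sketch; output_dir is a placeholder, and the Adam betas/epsilon shown are simply the optimizer defaults):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="t5-english-ner",        # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
)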

Training results

Per-entity cells show precision / recall / F1; n is the number of gold entities of that type in the evaluation set.

| Metric | Epoch 1 (step 708) | Epoch 2 (step 1416) | Epoch 3 (step 2124) |
|:--|:--|:--|:--|
| Training Loss | 0.8868 | 0.2984 | 0.1729 |
| Validation Loss | 0.2725 | 0.2121 | 0.1956 |
| Commercial Item P / R / F1 (n=1) | 0.0 / 0.0 / 0.0 | 0.0 / 0.0 / 0.0 | 0.0 / 0.0 / 0.0 |
| Date P / R / F1 (n=14) | 0.8125 / 0.9286 / 0.8667 | 0.8667 / 0.9286 / 0.8966 | 0.8125 / 0.9286 / 0.8667 |
| Location P / R / F1 (n=20) | 0.4167 / 0.75 / 0.5357 | 0.5 / 0.8 / 0.6154 | 0.7143 / 0.75 / 0.7317 |
| Organization P / R / F1 (n=80) | 0.8272 / 0.8375 / 0.8323 | 0.8375 / 0.8375 / 0.8375 | 0.8588 / 0.9125 / 0.8848 |
| Other P / R / F1 (n=21) | 1.0 / 0.0476 / 0.0909 | 0.3077 / 0.1905 / 0.2353 | 0.3684 / 0.3333 / 0.35 |
| Person P / R / F1 (n=29) | 0.8438 / 0.9310 / 0.8852 | 0.8182 / 0.9310 / 0.8710 | 0.8182 / 0.9310 / 0.8710 |
| Quantity P / R / F1 (n=14) | 0.6667 / 0.7143 / 0.6897 | 0.7333 / 0.7857 / 0.7586 | 0.8 / 0.8571 / 0.8276 |
| Title P / R / F1 (n=7) | 0.0 / 0.0 / 0.0 | 0.0 / 0.0 / 0.0 | 0.0 / 0.0 / 0.0 |
| Overall Precision | 0.7348 | 0.7077 | 0.75 |
| Overall Recall | 0.7151 | 0.7419 | 0.7903 |
| Overall F1 | 0.7248 | 0.7244 | 0.7696 |
| Overall Accuracy | 0.9446 | 0.9481 | 0.9534 |

Framework versions

  • Transformers 4.21.1
  • Pytorch 1.12.0+cu113
  • Datasets 2.4.0
  • Tokenizers 0.12.1

WANDB

Training logs and reports are available on Weights & Biases.