Model:
joon09/kor-naver-ner-name
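Korean personal name recognition model
Fine-tuned from kor-bert
The training data consists of 160,000 Korean names produced by a generator that was built from relatively uncommon Korean names.
Example: 안녕하세요. 임준영입니다. -> 안녕하세요. ***입니다.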
from transformers import BertTokenizerFast, BertForTokenClassification
from transformers import pipeline

# Load the fine-tuned tokenizer and token-classification model from the Hub
model_name = 'joon09/kor-naver-ner-name'
tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = BertForTokenClassification.from_pretrained(model_name)

# Build the NER pipeline and run it on a sample sentence
ner = pipeline("ner", model=model, tokenizer=tokenizer)
ner('안녕하세요. 임준영입니다.', grouped_entities=True, aggregation_strategy='average')

Output:
[{'entity_group': 'PER', 'score': 0.99999785, 'word': '임', 'start': 7, 'end': 8},
 {'entity_group': 'PER', 'score': 0.82035744, 'word': '##준영', 'start': 8, 'end': 10}]
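The masked form shown in the example (안녕하세요. ***입니다.) can be derived from the `start` and `end` offsets in the pipeline output. The following is a minimal sketch and not part of the model itself: `mask_entities` is a hypothetical helper name, and it assumes that adjacent spans (such as the surname and given-name pieces above) belong to the same name and should be masked once.

```python
def mask_entities(text, entities, mask="***"):
    """Replace each run of touching or overlapping entity spans with `mask`.

    `mask_entities` is a hypothetical helper, not an API of the model
    or of transformers. `entities` is a list of dicts with integer
    "start" and "end" character offsets, as in the output above.
    """
    # Sort spans by start offset and merge spans that touch or overlap,
    # so a name split into several pieces is masked only once.
    spans = sorted((e["start"], e["end"]) for e in entities)
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    # Replace from right to left so earlier offsets stay valid.
    for start, end in reversed(merged):
        text = text[:start] + mask + text[end:]
    return text


# Entities copied from the output shown above.
entities = [
    {"entity_group": "PER", "score": 0.99999785, "word": "임", "start": 7, "end": 8},
    {"entity_group": "PER", "score": 0.82035744, "word": "##준영", "start": 8, "end": 10},
]
print(mask_entities("안녕하세요. 임준영입니다.", entities))
# prints: 안녕하세요. ***입니다.
```

Merging touching spans is a design choice for this sketch; if two distinct names can appear back to back without a separator, mask each span separately instead.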