Model:

l3cube-pune/hindi-bert-scratch

HindBERT-Scratch

HindBERT is a Hindi BERT model. It is a base BERT model trained from scratch on publicly available Hindi monolingual datasets. [Project link](https://github.com/l3cube-pune/MarathiNLP)

More details on the dataset, models, and baseline results can be found in our [paper] ( link ).

The best version of this model is shared here.
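
The model identifier above can be loaded with the Hugging Face `transformers` library. The following is a minimal sketch, not the authors' own usage code: it assumes `transformers` and `torch` are installed and that the checkpoint is reachable on the Hugging Face Hub. The helper names `load_hindbert` and `embed` are hypothetical and introduced only for illustration.

```python
MODEL_ID = "l3cube-pune/hindi-bert-scratch"


def load_hindbert(model_id: str = MODEL_ID):
    """Download (or load from cache) the tokenizer and encoder for model_id."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def embed(text: str, tokenizer, model):
    """Return the final-layer vector of the first ([CLS]) token for text."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():  # no gradients are needed for feature extraction
        outputs = model(**inputs)
    # last_hidden_state has shape (batch, sequence_length, hidden_size);
    # index 0 along the sequence axis selects the first token.
    return outputs.last_hidden_state[:, 0, :]
```

Calling `tokenizer, model = load_hindbert()` followed by `embed("यह एक उदाहरण वाक्य है।", tokenizer, model)` should return one vector per input sentence. Using the first-token vector as a sentence representation is an assumption of this sketch; for downstream tasks the model is usually fine-tuned instead.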

Citing:

@article{joshi2022l3cubehind,
author = {Joshi, Raviraj},
year = {2022},
month = {09},
pages = {},
title = {L3Cube-HindBERT and DevBERT: Pre-Trained BERT Transformer models for Devanagari based Hindi and Marathi Languages},
doi = {10.13140/RG.2.2.14606.84809}
}

Other models trained from scratch are listed below: Marathi-Scratch, Marathi-Tweets-Scratch, Hindi-Scratch, Dev-Scratch, Kannada-Scratch, Telugu-Scratch, Malayalam-Scratch, Gujarati-Scratch

Better versions of the monolingual Indic BERT models are listed below:

Marathi BERT, Marathi RoBERTa, Marathi AlBERT

Hindi BERT, Hindi RoBERTa, Hindi AlBERT

Dev BERT, Dev RoBERTa, Dev AlBERT

Kannada BERT, Telugu BERT, Malayalam BERT, Tamil BERT, Gujarati BERT, Oriya BERT, Bengali BERT, Punjabi BERT