Model:

l3cube-pune/hindi-bert-scratch

HindBERT-Scratch

HindBERT is a Hindi BERT model. It is a base-BERT model trained from scratch on publicly available Hindi monolingual datasets. [project link](https://github.com/l3cube-pune/MarathiNLP)

More details on the dataset, models, and baseline results can be found in our [paper] (link).

The best version of the model is shared here.
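A minimal sketch of how a checkpoint like this might be loaded and used to produce token embeddings. It assumes the model is hosted on the Hugging Face Hub under the identifier shown at the top of this card and that it works with the generic `AutoTokenizer` and `AutoModel` classes from the `transformers` library; the card does not state either point. The helper names `load_hindbert` and `embed` are hypothetical and used only for illustration.

```python
# Identifier as given at the top of this card.
MODEL_ID = "l3cube-pune/hindi-bert-scratch"


def load_hindbert(model_id: str = MODEL_ID):
    """Return (tokenizer, model) for the given checkpoint id.

    Assumption: the checkpoint is available on the Hugging Face Hub and is
    compatible with the generic Auto* classes.
    """
    # Imported lazily so that defining this helper needs no heavy dependencies.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model


def embed(sentence: str, tokenizer, model):
    """Return the final-layer hidden states for one sentence."""
    import torch

    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    # Shape: (1, number_of_tokens, hidden_size)
    return outputs.last_hidden_state
```

Calling `tokenizer, model = load_hindbert()` downloads the weights on first use; `embed("यह एक उदाहरण वाक्य है।", tokenizer, model)` would then return one vector per token of the Hindi sentence.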

Citing:

@article{joshi2022l3cubehind,
author = {Joshi, Raviraj},
year = {2022},
month = {09},
pages = {},
title = {L3Cube-HindBERT and DevBERT: Pre-Trained BERT Transformer models for Devanagari based Hindi and Marathi Languages},
doi = {10.13140/RG.2.2.14606.84809}
}

Other models trained from scratch are listed below: Marathi-Scratch, Marathi-Tweets-Scratch, Hindi-Scratch, Dev-Scratch, Kannada-Scratch, Telugu-Scratch, Malayalam-Scratch, Gujarati-Scratch

Better versions of monolingual Indic BERT models are listed below:

Marathi BERT, Marathi RoBERTa, Marathi AlBERT

Hindi BERT, Hindi RoBERTa, Hindi AlBERT

Dev BERT, Dev RoBERTa, Dev AlBERT

Kannada BERT, Telugu BERT, Malayalam BERT, Tamil BERT, Gujarati BERT, Oriya BERT, Bengali BERT, Punjabi BERT