
MahaBERT

MahaBERT is a Marathi BERT model. It is a multilingual BERT model (google/muril-base-cased) fine-tuned on L3Cube-MahaCorpus and other publicly available Marathi monolingual datasets. [dataset link](https://github.com/l3cube-pune/MarathiNLP)
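A minimal sketch of how the model might be queried for masked-word prediction with the Hugging Face `transformers` library. The Hub ID `l3cube-pune/marathi-bert-v2` is an assumption, since this README does not state it, and the helper names `build_fill_mask`, `make_masked_sentence`, and `demo` are hypothetical names used only for illustration.

```python
# Assumption: this Hub ID is not given in the README; replace it with the
# actual checkpoint name if it differs.
MODEL_ID = "l3cube-pune/marathi-bert-v2"


def build_fill_mask(model_id: str = MODEL_ID):
    """Return a fill-mask pipeline for the checkpoint (downloads weights on first use)."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def make_masked_sentence(mask_token: str) -> str:
    """Build a Marathi sentence with one masked word: 'Pune is a ___ in Maharashtra.'"""
    return f"पुणे हे महाराष्ट्रातील एक {mask_token} आहे."


def demo() -> None:
    """Print the top five candidate tokens for the masked position."""
    fill_mask = build_fill_mask()
    # Use the tokenizer's own mask token rather than hard-coding "[MASK]".
    sentence = make_masked_sentence(fill_mask.tokenizer.mask_token)
    for prediction in fill_mask(sentence, top_k=5):
        print(prediction["token_str"], round(prediction["score"], 4))
```

Calling `demo()` requires `transformers`, a backend such as PyTorch, and network access for the first download. For downstream tasks such as classification, the same checkpoint would typically be loaded through `AutoModelForSequenceClassification` and fine-tuned on labeled data.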

More details on the dataset, models, and baseline results can be found in our [paper](https://arxiv.org/abs/2202.01159).

@inproceedings{joshi-2022-l3cube,
    title = "{L}3{C}ube-{M}aha{C}orpus and {M}aha{BERT}: {M}arathi Monolingual Corpus, {M}arathi {BERT} Language Models, and Resources",
    author = "Joshi, Raviraj",
    booktitle = "Proceedings of the WILDRE-6 Workshop within the 13th Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.wildre-1.17",
    pages = "97--101",
}

Other monolingual Indic BERT models are listed below:

Marathi BERT, Marathi RoBERTa, Marathi AlBERT

Hindi BERT, Hindi RoBERTa, Hindi AlBERT

Dev BERT, Dev RoBERTa, Dev AlBERT

Kannada BERT, Telugu BERT, Malayalam BERT, Tamil BERT, Gujarati BERT, Oriya BERT, Bengali BERT, Punjabi BERT, Assamese BERT