Model:
danjohnvelasco/filipino-sentence-roberta-v1
We fine-tuned RoBERTa Tagalog Base (finetuned on COHFIE) on NewsPH-NLI to learn sentence embeddings that encode Filipino/Tagalog sentences. The model was fine-tuned using sentence-transformers. For all model, training-setup, and corpus details, see the paper: Automatic WordNet Construction using Word Sense Induction through Sentence Embeddings.
The intended use of this model is to extract sentence embeddings for clustering. We have not checked it for bias, so it may be unsafe to use in production. Use with caution.
Using this model is easier once sentence-transformers is installed:
pip install -U sentence-transformers
To encode sentences into sentence embeddings with SentenceTransformer:
from sentence_transformers import SentenceTransformer

# Load the model and encode a list of sentences into embeddings
model = SentenceTransformer("danjohnvelasco/filipino-sentence-roberta-v1")
sentence_list = ["sentence 1", "sentence 2", "sentence 3"]
sentence_embeddings = model.encode(sentence_list)
print(sentence_embeddings)
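Since the intended use is clustering, here is a minimal sketch of clustering the embeddings with scikit-learn's KMeans (our choice for illustration; any clustering algorithm works). Synthetic 768-dimensional vectors stand in for the output of model.encode so the snippet runs without downloading the model:

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for model.encode(sentence_list): six "sentences" as 768-dim
# vectors (the RoBERTa Base hidden size), drawn around two distinct centers.
# In practice, pass the real embeddings from model.encode here instead.
rng = np.random.default_rng(0)
embeddings = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(3, 768)),  # first group
    rng.normal(loc=5.0, scale=0.1, size=(3, 768)),  # second group
])

# Group the embeddings into two clusters
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)
labels = kmeans.labels_
print(labels)
```

With real embeddings from this model, sentences with similar meanings should land in the same cluster; the number of clusters is a parameter you choose for your data.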
If you use this model, please cite our work:
@misc{https://doi.org/10.48550/arxiv.2204.03251,
  doi = {10.48550/ARXIV.2204.03251},
  url = {https://arxiv.org/abs/2204.03251},
  author = {Velasco, Dan John and Alba, Axel and Pelagio, Trisha Gail and Ramirez, Bryce Anthony and Cruz, Jan Christian Blaise and Cheng, Charibeth},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Automatic WordNet Construction using Word Sense Induction through Sentence Embeddings},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}