Model:

google/reformer-crime-and-punishment

English

Reformer model trained on "Crime and Punishment"

"Crime and Punishment" is a novel by Fyodor Dostoevsky; the model uses its English translation.

The "Crime and Punishment" training data was taken from gs://trax-ml/reformer/crime-and-punishment-2554.txt and contains roughly 0.5M tokens.

The ReformerLM model was trained in flax following the method in the authors' colab notebook, https://colab.research.google.com/github/google/trax/blob/master/trax/models/reformer/text_generation.ipynb , and the weights were then converted to Hugging Face's PyTorch ReformerLM model, ReformerModelWithLMHead.

The model is a language model that operates on small sub-word units. Text can be generated as follows:

from transformers import ReformerModelWithLMHead, ReformerTokenizer

model = ReformerModelWithLMHead.from_pretrained("google/reformer-crime-and-punishment")
tok = ReformerTokenizer.from_pretrained("google/reformer-crime-and-punishment")
tok.decode(model.generate(tok.encode("A few months later", return_tensors="pt"), do_sample=True, temperature=0.7, max_length=100)[0])

# gives:'A few months later on was more than anything in the flat. 
# “I have already.” “That’s not my notion that he had forgotten him. 
# What does that matter? And why do you mean? It’s only another fellow,” he said as he went out, as though he want'
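
The generation call above samples with temperature=0.7. As a rough illustration of what that setting does, here is a minimal sketch of temperature sampling in plain Python. It is not the transformers implementation; the helper names (softmax_with_temperature, sample_token) and the toy scores are hypothetical and chosen only for this example. The idea it shows is the standard one: scores are divided by the temperature before the softmax, so values below 1.0 concentrate probability on the most likely tokens.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for a 4-token vocabulary (illustrative values only).
toy_logits = [2.0, 1.0, 0.5, -1.0]
p_default = softmax_with_temperature(toy_logits, 1.0)
p_sharp = softmax_with_temperature(toy_logits, 0.7)
```

With these toy scores, p_sharp puts more probability on the first token than p_default does, which is why a lower temperature tends to give more conservative text and a higher one more varied text.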