bart-base-instructiongen + LongForm

Instead of generating questions from text, generate instructions for LLMs!

Check out a basic demo on Spaces.
An example of how to use instructiongen models in a CLI script can be found here.
You can find other models fine-tuned for instruction generation by searching for the instructiongen tag.

About

This model is a fine-tuned version of pszemraj/bart-base-instructiongen on the akoksal/LongForm dataset.

This model was trained on a dataset of instruction+output pairs only; any examples that also included an input were filtered out. As a result, it will not reconstruct instructions that operate on a separate input. For example, the text "1) cookies and cream 2) chocolate chip 3) mint chip 4) oreo" will not get you "Rank the following ice cream flavors: oreo, mint chip, chocolate chip, cookies and cream".
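As a rough illustration of how a model of this kind can be called, here is a minimal sketch using the generic seq2seq classes from the transformers library. The checkpoint ID below is the base model named above (pszemraj/bart-base-instructiongen); substitute the ID of the LongForm fine-tune. The function name `generate_instruction`, the beam count, and the token limit are illustrative assumptions, not settings taken from this card.

```python
# Assumption: this is the base checkpoint named in this card; swap in the
# ID of the LongForm fine-tuned checkpoint to use that model instead.
DEFAULT_MODEL_ID = "pszemraj/bart-base-instructiongen"


def generate_instruction(
    text: str,
    model_id: str = DEFAULT_MODEL_ID,
    max_new_tokens: int = 64,  # illustrative value, not from the card
) -> str:
    """Generate an instruction that could have produced `text`.

    Hypothetical helper: loads the tokenizer and model on every call,
    which is simple but slow; cache them for repeated use.
    """
    # Imported here so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long inputs to the tokenizer's configured maximum length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)

    # Beam search with 4 beams is an assumed decoding choice.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=4
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_instruction("some paragraph of text")` downloads the checkpoint on first use and returns a single generated instruction as a string.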

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 16
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.02
num_epochs: 3.0
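
The batch-size and schedule values above fit together as follows. This is a sketch, not the training code: it assumes the usual relationship total batch = per-device batch × accumulation steps × number of processes, and a linear warmup followed by a half-cosine decay to zero. The card does not state the number of processes or the total number of optimizer steps, so `num_processes` and `total_steps` are hypothetical parameters.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 16
PEAK_LR = 8e-5
WARMUP_RATIO = 0.02


def effective_batch_size(num_processes: int = 1) -> int:
    """Samples consumed per optimizer step.

    With num_processes=1 this gives 4 * 16 = 64, which matches the
    reported total_train_batch_size. The process count is an assumption.
    """
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * num_processes


def lr_at_step(step: int, total_steps: int) -> float:
    """Learning rate under linear warmup plus cosine decay (assumed shape).

    `total_steps` is hypothetical; the card does not report it.
    """
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from the peak learning rate down to 0.
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For a hypothetical run of 1000 optimizer steps, the warmup would cover the first 20 steps, after which the rate decays from 8e-05 toward zero.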

Author:

Peter Szemraj

Dataset size:

1.04 GB