Model:

OpenAssistant/pythia-12b-pre-v8-12.5k-steps

English

Note: internal model, not yet ready for use

This is an intermediate model used as the base model for further pythia 12b SFT-8 experiments. It was trained on a wider instruction-tuning dataset mix for more than 12.5k steps, with a batch size of 128 and a context size of 2048 tokens. The gpt4all dataset contained "as a language model" contamination (>1.8k entries). Filtering was added later, but this model (pre-v8) was trained on the raw, unfiltered gpt4all dataset.
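The filtering added later can be approximated by a simple phrase check over each assistant response. The sketch below only illustrates that idea; the phrase list and the "response" field name are assumptions, not the exact OpenAssistant filtering code.

# Hedged sketch: drop entries whose response contains "as a language model"
# style phrases (field name and phrase list are illustrative assumptions).
BLOCKED_PHRASES = (
    "as a language model",
    "as an ai language model",
)

def is_contaminated(example: dict) -> bool:
    """Return True if the assistant response contains a blocked phrase."""
    text = example.get("response", "").lower()
    return any(phrase in text for phrase in BLOCKED_PHRASES)

def filter_dataset(examples: list[dict]) -> list[dict]:
    return [ex for ex in examples if not is_contaminated(ex)]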

Dataset:

pretrain:
  num_train_epochs: 1
  weight_decay: 0.0
  use_custom_sampler: true
  sort_by_length: false
  datasets:
    - gpteacher_roleplay:
        val_split: 0.05
    - red_pajama:
        fraction: 0.25
        max_val_set: 1000
    - wizardlm_70k:
        val_split: 0.05
        max_val_set: 500
    - joke:
        val_split: 0.05
    - poem_instructions:
        val_split: 0.025
    - oa_stackexchange:
        val_split: 0.05
        fraction: 0.1
        max_val_set: 1000
    - tell_a_joke:
        val_split: 0.05
        max_val_set: 250
    - webgpt:
        val_split: 0.05
        max_val_set: 250
    - gpt4all:
        val_split: 0.01
        max_val_set: 1000
    - alpaca_gpt4:
        val_split: 0.025
        max_val_set: 250
    - code_alpaca:
        val_split: 0.05
        max_val_set: 250
    - vicuna:
        max_val_set: 250
    - oig_file:
        source_url: https://huggingface.co/datasets/laion/OIG/resolve/main/unified_chip2.jsonl
        max_count: 10000
        min_length: 250
        val_split: 0.05
        max_val_set: 250
    - minimath:
        val_split: 0.05
    - humaneval_mbpp_codegen_qa:
        val_split: 0.05
    - humaneval_mbpp_testgen_qa:
        val_split: 0.05
    - grade_school_math_instructions:
        val_split: 0.05
    - recipes:
        val_split: 0.05
    - cmu_wiki_qa:
        val_split: 0.05
    - oa_wiki_qa_bart_10000row:
        val_split: 0.05
        max_val_set: 250
    - prosocial_dialogue:
        fraction: 0.1
        max_val_set: 250
    - explain_prosocial:
        fraction: 0.075
        max_val_set: 250
    - soda:
        fraction: 0.25
        max_val_set: 1000
    - oa_leet10k:
        val_split: 0.05
        max_val_set: 250
    - dolly15k:
        val_split: 0.05
        max_val_set: 300

Pythia:

pythia-12b-pretrain:
  dtype: fp16
  log_dir: "pythia_log_12b"
  learning_rate: 6e-6
  model_name: EleutherAI/pythia-12b-deduped
  output_dir: pythia_model_12b
  weight_decay: 0.0
  max_length: 2048
  warmup_steps: 100
  gradient_checkpointing: true
  gradient_accumulation_steps: 4
  per_device_train_batch_size: 4
  per_device_eval_batch_size: 4
  eval_steps: 251
  save_steps: 500
  num_train_epochs: 1
  save_total_limit: 2
  deepspeed_config: configs/zero_config_pretrain.json
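
The batch size of 128 mentioned above is the effective (global) batch size: per_device_train_batch_size x gradient_accumulation_steps x number of GPUs. With the values in this config that implies 8 GPUs; the GPU count is an assumption consistent with the numbers, not something stated in the config.

per_device_train_batch_size = 4
gradient_accumulation_steps = 4
world_size = 8  # assumed GPU count; not stated in the config

effective_batch_size = (per_device_train_batch_size
                        * gradient_accumulation_steps
                        * world_size)
print(effective_batch_size)  # 4 * 4 * 8 = 128, matching the stated batch size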

Command used: deepspeed trainer_sft.py --show_dataset_stats --configs defaults pythia-12b-pretrain pretrain --cache_dir .cache/ --output_dir .saved/pythia-12b-super-pretrain2 --deepspeed
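
The --configs flag stacks several named YAML sections (defaults, then pythia-12b-pretrain, then pretrain), with later sections overriding earlier keys. Below is a rough sketch of that layering, assuming a single YAML file keyed by section name; the file path and structure are illustrative assumptions, not the trainer's actual config loader.

import yaml

def merge_configs(path: str, names: list[str]) -> dict:
    """Merge named config sections; later sections override earlier ones."""
    with open(path) as f:
        sections = yaml.safe_load(f)
    merged: dict = {}
    for name in names:
        merged.update(sections[name])
    return merged

# cfg = merge_configs("configs/config.yaml",
#                     ["defaults", "pythia-12b-pretrain", "pretrain"])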