Model:

sileod/deberta-v3-large-tasksource-nli

Datasets:

glue super_glue anli metaeval/babi_nli sick snli scitail hans alisawuffles/WANLI metaeval/recast sileod/probability_words_nli joey234/nan-nli pietrolesci/nli_fever pietrolesci/breaking_nli pietrolesci/conj_nli pietrolesci/fracas pietrolesci/dialogue_nli pietrolesci/mpe pietrolesci/dnc pietrolesci/gpt3_nli pietrolesci/recast_white pietrolesci/joci martn-nguyen/contrast_nli pietrolesci/robust_nli pietrolesci/robust_nli_is_sd pietrolesci/robust_nli_li_ts pietrolesci/gen_debiased_nli pietrolesci/add_one_rte metaeval/imppres pietrolesci/glue_diagnostics hlgd paws quora medical_questions_pairs conll2003 Anthropic/hh-rlhf Anthropic/model-written-evals truthful_qa nightingal3/fig-qa tasksource/bigbench bigbench blimp cos_e cosmos_qa dream openbookqa qasc quartz quail head_qa sciq social_i_qa wiki_hop wiqa piqa hellaswag pkavumba/balanced-copa 12ml/e-CARE art tasksource/mmlu winogrande codah ai2_arc definite_pronoun_resolution swag math_qa metaeval/utilitarianism mteb/amazon_counterfactual SetFit/insincere-questions SetFit/toxic_conversations turingbench/TuringBench trec tals/vitaminc hope_edi strombergnlp/rumoureval_2019 ethos tweet_eval discovery pragmeval silicone lex_glue papluca/language-identification imdb rotten_tomatoes ag_news yelp_review_full financial_phrasebank poem_sentiment dbpedia_14 amazon_polarity app_reviews hate_speech18 sms_spam humicroedit snips_built_in_intents banking77 hate_speech_offensive yahoo_answers_topics pacovaldez/stackoverflow-questions zapsdcn/hyperpartisan_news zapsdcn/sciie zapsdcn/citation_intent go_emotions scicite liar relbert/lexical_relation_classification metaeval/linguisticprobing metaeval/crowdflower metaeval/ethics emo google_wellformed_query tweets_hate_speech_detection has_part wnut_17 ncbi_disease acronym_identification jnlpba species_800 SpeedOfMagic/ontonotes_english blog_authorship_corpus launch/open_question_type health_fact commonsense_qa mc_taco ade_corpus_v2 prajjwal1/discosense circa 
YaHi/EffectiveFeedbackStudentWriting Ericwang/promptSentiment Ericwang/promptNLI Ericwang/promptSpoke Ericwang/promptProficiency Ericwang/promptGrammar Ericwang/promptCoherence PiC/phrase_similarity copenlu/scientific-exaggeration-detection quarel mwong/fever-evidence-related numer_sense dynabench/dynasent raquiba/Sarcasm_News_Headline sem_eval_2010_task_8 demo-org/auditor_review medmcqa aqua_rat RuyuanWan/Dynasent_Disagreement RuyuanWan/Politeness_Disagreement RuyuanWan/SBIC_Disagreement RuyuanWan/SChem_Disagreement RuyuanWan/Dilemmas_Disagreement lucasmccabe/logiqa wiki_qa metaeval/cycic_classification metaeval/cycic_multiplechoice metaeval/sts-companion metaeval/commonsense_qa_2.0 metaeval/lingnli metaeval/monotonicity-entailment metaeval/arct metaeval/scinli metaeval/naturallogic onestop_qa demelin/moral_stories corypaik/prost aps/dynahate metaeval/syntactic-augmentation-nli metaeval/autotnli lasha-nlp/CONDAQA openai/webgpt_comparisons Dahoas/synthetic-instruct-gptj-pairwise metaeval/scruples metaeval/wouldyourather sileod/attempto-nli metaeval/defeasible-nli metaeval/help-nli metaeval/nli-veridicality-transitivity metaeval/natural-language-satisfiability metaeval/lonli metaeval/dadc-limit-nli ColumbiaNLP/FLUTE metaeval/strategy-qa openai/summarize_from_feedback metaeval/folio metaeval/tomi-nli metaeval/avicenna stanfordnlp/SHP GBaker/MedQA-USMLE-4-options-hf sileod/wikimedqa declare-lab/cicero amydeng2000/CREAK metaeval/mutual inverse-scaling/NeQA inverse-scaling/quote-repetition inverse-scaling/redefine-math metaeval/puzzte metaeval/implicatures race metaeval/spartqa-yn metaeval/spartqa-mchoice metaeval/temporal-nli

Language:

en

Preprint:

arxiv:2301.05948

License:

apache-2.0

Model Card for DeBERTa-v3-large-tasksource-nli

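DeBERTa-v3-large fine-tuned with multi-task learning on tasks from the tasksource collection. You can further fine-tune this model for any classification or multiple-choice task. This checkpoint has strong zero-shot validation performance on many tasks (e.g. 77% on WNLI). Because of the multi-task training, the CLS embedding of the untuned model also has strong linear probing performance (90% on MNLI).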
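The two uses described above, zero-shot classification through the MNLI head and further fine-tuning, can be sketched as follows. This is a minimal sketch rather than the author's own recipe: it assumes the `transformers` package with a PyTorch backend is installed (the DeBERTa-v3 tokenizer may also need `sentencepiece`), and the helper names `build_hypotheses`, `zero_shot` and `load_for_finetuning` are hypothetical, introduced only for illustration. The hypothesis template is the default of the `zero-shot-classification` pipeline.

```python
MODEL_NAME = "sileod/deberta-v3-large-tasksource-nli"


def build_hypotheses(labels, template="This example is {}."):
    """Turn candidate labels into NLI hypotheses, one per label.

    This mirrors what an NLI-based zero-shot classifier does internally:
    each label becomes a hypothesis that is scored against the input text.
    """
    return [template.format(label) for label in labels]


def zero_shot(text, labels, template="This example is {}."):
    """Score `labels` for `text` with the checkpoint's MNLI classifier.

    Assumption: `transformers` is installed; the checkpoint is downloaded
    from the Hugging Face Hub on first use.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    classifier = pipeline("zero-shot-classification", model=MODEL_NAME)
    return classifier(text, candidate_labels=labels, hypothesis_template=template)


def load_for_finetuning(num_labels):
    """Load the encoder with a classification head sized for a new task."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME,
        num_labels=num_labels,
        # The stored head is the 3-class MNLI classifier, so a head of a
        # different size has to be re-initialised instead of loaded.
        ignore_mismatched_sizes=True,
    )
    return tokenizer, model
```

Calling `zero_shot("The match went to extra time.", ["sports", "politics"])` would return the pipeline's dictionary of labels and scores; `load_for_finetuning(5)` returns a tokenizer and model that can be passed to any standard training loop.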

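This is the shared model with the MNLI classifier on top. Its encoder was trained on many datasets, including bigbench, Anthropic rlhf, anli and others; the NLI and classification tasks were trained with a single shared encoder. Each task had a specific CLS embedding, which is dropped 10% of the time so that the model can also be used without it. All multiple-choice models used the same classification layer. For classification tasks, models shared weights if their labels matched. The number of examples per task was capped at 64k. The model was trained for 30k steps with a batch size of 384 and a peak learning rate of 2e-5.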
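The data-side parts of this recipe can be illustrated with a small standard-library sketch. It is not the tasksource or tasknet implementation: the function names are hypothetical, uniform subsampling is an assumption (the card only says the count was capped), and "dropping" the task-specific CLS embedding is modelled here as falling back to a generic slot represented by `None`.

```python
import random

MAX_EXAMPLES_PER_TASK = 64_000  # per-task cap stated in the model card
TASK_CLS_DROP_PROB = 0.10       # task-specific CLS embedding dropped 10% of the time
TRAIN_STEPS = 30_000
BATCH_SIZE = 384


def cap_examples(examples, cap=MAX_EXAMPLES_PER_TASK, seed=0):
    """Return at most `cap` examples.

    Assumption: tasks above the cap are subsampled uniformly without
    replacement; smaller tasks are returned unchanged.
    """
    examples = list(examples)
    if len(examples) <= cap:
        return examples
    return random.Random(seed).sample(examples, cap)


def choose_cls_slot(task_id, rng, drop_prob=TASK_CLS_DROP_PROB):
    """Pick which CLS embedding a training example uses.

    Returns `task_id` (the task-specific embedding) most of the time and
    None (no task-specific embedding) with probability `drop_prob`.
    """
    return None if rng.random() < drop_prob else task_id


# Training examples processed in total: steps x batch size.
EXAMPLES_SEEN = TRAIN_STEPS * BATCH_SIZE
```

With the stated settings `EXAMPLES_SEEN` is 11,520,000, so a task capped at 64k examples can be revisited many times over the run, depending on how tasks are sampled.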

tasksource training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing

Software

https://github.com/sileod/tasksource/
https://github.com/sileod/tasknet/
Training took 6 days on an Nvidia A100 40GB GPU.

Citation

More details are given in this article:

@article{sileo2023tasksource,
  title={tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation},
  author={Sileo, Damien},
  url= {https://arxiv.org/abs/2301.05948},
  journal={arXiv preprint arXiv:2301.05948},
  year={2023}
}

Loading a specific classifier

Classifiers for all tasks are available. See https://huggingface.co/sileod/deberta-v3-large-tasksource-adapters

Model Card Contact

damien.sileo@inria.fr