Model:
echarlaix/distilbert-sst2-inc-dynamic-quantization-magnitude-pruning-0.1
Model description: This model is a DistilBERT model fine-tuned on the SST-2 dataset and obtained with Intel® Neural Compressor by applying dynamic quantization together with magnitude pruning at a 10% sparsity ratio.
Optimum with the Neural Compressor extras must be installed: pip install optimum[neural-compressor]
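For reference, the snippet below is a minimal sketch of how a model like this could be produced with optimum-intel, showing only the post-training dynamic quantization step (the magnitude pruning described above would be applied during fine-tuning, for example with INCTrainer). The starting checkpoint distilbert-base-uncased-finetuned-sst-2-english and the save directory are assumptions for illustration, not necessarily what was used for this model:

from transformers import AutoModelForSequenceClassification
from optimum.intel import INCQuantizer
from neural_compressor.config import PostTrainingQuantConfig

# Assumed starting checkpoint; the actual fine-tuned model may differ
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)

# Dynamic quantization requires no calibration dataset
quantization_config = PostTrainingQuantConfig(approach="dynamic")
quantizer = INCQuantizer.from_pretrained(model)

# Apply dynamic quantization and save the resulting model
quantizer.quantize(
    quantization_config=quantization_config,
    save_directory="quantized_model",
)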
To load the quantized model and run inference with a Transformers pipeline:
from transformers import AutoTokenizer, pipeline
from optimum.intel import INCModelForSequenceClassification

model_id = "echarlaix/distilbert-sst2-inc-dynamic-quantization-magnitude-pruning-0.1"
# Load the quantized model and its tokenizer from the Hub
model = INCModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Build a text-classification pipeline and run inference
cls_pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
text = "He's a dreadful magician."
outputs = cls_pipe(text)
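The pipeline returns a list of dictionaries, one per input text, each containing the predicted label and its score.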