Model:
echarlaix/distilbert-sst2-inc-dynamic-quantization-magnitude-pruning-0.1
Model description: This model was fine-tuned on the SST-2 dataset and then optimized with Intel® Neural Compressor, combining dynamic quantization with a magnitude pruning strategy that reaches 10% sparsity.
Optimum with the Neural Compressor extra must be installed: pip install optimum[neural-compressor]
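For reference, below is a minimal sketch of how a dynamically quantized model of this kind can be produced with Intel® Neural Compressor through Optimum. It is not the exact recipe used for this checkpoint: the INCQuantizer workflow, the distilbert-base-uncased-finetuned-sst-2-english starting checkpoint, and the save directory are assumptions for illustration, and the additional magnitude pruning step (which would require a separate pruning configuration) is not shown.

from transformers import AutoModelForSequenceClassification
from neural_compressor.config import PostTrainingQuantConfig
from optimum.intel import INCQuantizer

# Assumed starting checkpoint; the base model actually used for this card may differ
base_model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(base_model_id)

# Configure post-training dynamic quantization: weights are quantized ahead of time,
# activations are quantized on the fly at inference
quantization_config = PostTrainingQuantConfig(approach="dynamic")

quantizer = INCQuantizer.from_pretrained(model)
# Apply quantization and save the resulting model to a local directory (name assumed)
quantizer.quantize(quantization_config=quantization_config, save_directory="inc_dynamic_quantization")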
To load the quantized model and run inference with Transformers pipelines, proceed as follows:
from transformers import AutoTokenizer, pipeline
from optimum.intel import INCModelForSequenceClassification

model_id = "echarlaix/distilbert-sst2-inc-dynamic-quantization-magnitude-pruning-0.1"
# Load the quantized model and its tokenizer from the Hugging Face Hub
model = INCModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build a text-classification pipeline and run sentiment inference
cls_pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
text = "He's a dreadful magician."
outputs = cls_pipe(text)
print(outputs)
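Calling print(outputs) shows the pipeline result: a list with one dictionary per input text, each containing the predicted label and its score (for an SST-2 sentiment model, a positive/negative label).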