# Model Details: INT8 DistilBERT base uncased finetuned SST-2
This model is a DistilBERT model fine-tuned for sentiment classification, with INT8 post-training static quantization applied to the original FP32 model. The same model is provided in two formats: PyTorch and ONNX.
| Model Detail | Description |
| --- | --- |
| Model Authors - Company | Intel |
| Date | March 29, 2022 for PyTorch model & February 3, 2023 for ONNX model |
| Version | 1 |
| Type | NLP DistilBERT (INT8) - Sentiment Classification (+/-) |
| Paper or Other Resources | - |
| License | Apache 2.0 |
| Questions or Comments | Community Tab and Intel Developers Discord |
| Intended Use | Description |
| --- | --- |
| Primary intended uses | Inference for sentiment classification (classifying whether a statement is positive or negative) |
| Primary intended users | Anyone |
| Out-of-scope uses | This model is already fine-tuned and quantized to INT8. It is not suitable for further fine-tuning in this form. To fine-tune your own model, you can start with the FP32 model, distilbert-base-uncased-finetuned-sst-2-english. The model should not be used to intentionally create hostile or alienating environments for people. |
Load the PyTorch model with Optimum Intel:

```python
from optimum.intel.neural_compressor import INCModelForSequenceClassification

model_id = "Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static"
int8_model = INCModelForSequenceClassification.from_pretrained(model_id)
```
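Once loaded, the object should behave like a regular transformers sequence-classification model. A minimal inference sketch (assuming the repository ships its tokenizer files and that the model exposes the usual transformers call signature; the example sentence is ours):

```python
import torch
from transformers import AutoTokenizer

# The tokenizer is the standard one for the base DistilBERT SST-2 model.
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("A gorgeous, witty, seductive movie.", return_tensors="pt")
with torch.no_grad():
    logits = int8_model(**inputs).logits

# Map the winning logit index to its label (e.g. POSITIVE / NEGATIVE).
predicted = int8_model.config.id2label[int(logits.argmax(dim=-1))]
print(predicted)
```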
Load the ONNX model with Optimum:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static"
int8_model = ORTModelForSequenceClassification.from_pretrained(model_id)
```
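ONNX Runtime models from Optimum can be dropped into the standard transformers pipeline API. A minimal sketch (again assuming the tokenizer is available in the same repository):

```python
from transformers import AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Wrap the ONNX Runtime model in a regular text-classification pipeline.
classifier = pipeline("text-classification", model=int8_model, tokenizer=tokenizer)
print(classifier("A gorgeous, witty, seductive movie."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```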
| Factors | Description |
| --- | --- |
| Groups | Movie reviewers from the internet |
| Instrumentation | Text movie single-sentence reviews taken from 4 authors. More information can be found in the original paper by Pang and Lee (2005). |
| Environment | - |
| Card Prompts | Model deployment on alternate hardware and software can change model performance |
| Metrics | Description |
| --- | --- |
| Model performance measures | Accuracy |
| Decision thresholds | - |
| Approaches to uncertainty and variability | - |
|  | PyTorch INT8 | ONNX INT8 | FP32 |
| --- | --- | --- | --- |
| Accuracy (eval-accuracy) | 0.9037 | 0.9071 | 0.9106 |
| Model Size (MB) | 65 | 89 | 255 |
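The accuracy figures above are evaluation accuracies on SST-2. As a rough, hedged sketch of how one might reproduce such a number for the ONNX INT8 model (assuming the datasets and evaluate libraries, that the dataset id sst2 resolves to the SST-2 validation split, and that the model's labels are NEGATIVE/POSITIVE):

```python
from datasets import load_dataset
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline
import evaluate

model_id = "Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static"
classifier = pipeline(
    "text-classification",
    model=ORTModelForSequenceClassification.from_pretrained(model_id),
    tokenizer=AutoTokenizer.from_pretrained(model_id),
)

dataset = load_dataset("sst2", split="validation")
metric = evaluate.load("accuracy")

# Map the pipeline's string labels back to SST-2 integer labels.
label2id = {"NEGATIVE": 0, "POSITIVE": 1}
predictions = [label2id[out["label"]] for out in classifier(dataset["sentence"])]
print(metric.compute(predictions=predictions, references=dataset["label"]))
```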
| Training and Evaluation Data | Description |
| --- | --- |
| Datasets | The dataset can be found here: https://huggingface.co/datasets/sst2. The dataset has a total of 215,154 unique phrases, annotated by 3 human judges. |
| Motivation | The dataset was chosen to showcase the benefits of quantization on an NLP classification task with the Intel® Neural Compressor and Optimum Intel. |
| Preprocessing | The calibration dataloader is the train dataloader. The default calibration sampling size of 100 is not exactly divisible by the batch size of 8, so the real sampling size is 104. |
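On the calibration arithmetic: with a batch size of 8, a requested sampling size of 100 is rounded up to the next whole batch, ceil(100 / 8) × 8 = 13 × 8 = 104 samples. Below is a minimal sketch of post-training static quantization with Intel® Neural Compressor, assuming the 2.x quantization.fit API; fp32_model and calib_dataloader are hypothetical stand-ins for the original FP32 model and a train-set dataloader with batch_size=8:

```python
from neural_compressor import PostTrainingQuantConfig, quantization

# fp32_model: the original FP32 transformers model (hypothetical variable).
# calib_dataloader: a PyTorch DataLoader over the train set, batch_size=8 (hypothetical).
conf = PostTrainingQuantConfig(
    approach="static",                # post-training static quantization
    calibration_sampling_size=[100],  # rounded up to 104 = 13 full batches of 8
)
q_model = quantization.fit(model=fp32_model, conf=conf, calib_dataloader=calib_dataloader)
q_model.save("./int8-model")
```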
| Quantitative Analyses | Description |
| --- | --- |
| Unitary results | The model was only evaluated on accuracy. There is no available comparison between evaluation factors. |
| Intersectional results | There is no available comparison between the intersection of evaluated factors. |
| Ethical Considerations | Description |
| --- | --- |
| Data | The data that make up the model are movie reviews from authors on the internet. |
| Human life | The model is not intended to inform decisions central to human life or flourishing. It is an aggregated set of movie reviews from the internet. |
| Mitigations | No additional risk mitigation strategies were considered during model development. |
| Risks and harms | The data are biased toward the opinions of the particular reviewers and of the judges (labelers) of the data. Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups. Beyond this, the extent of the risks involved in using the model remains unknown. |
| Use cases | - |
| Caveats and Recommendations |
| --- |
| Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. There are no additional caveats or recommendations for this model. |
# BibTeX entry and citation info
```bibtex
@misc{distilbert-base-uncased-finetuned-sst-2-english-int8-static,
  author = {Xin He and Yu Wenz},
  title = {distilbert-base-uncased-finetuned-sst-2-english-int8-static},
  year = {2022},
  url = {https://huggingface.co/Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static},
}
```