
PP-OCR Models Quantization

Generally, a more complex model achieves better performance on a task, but it also introduces redundancy. Quantization reduces this redundancy by mapping full-precision (FP32) data to low-bit fixed-point representations, which lowers the computational cost of the model and speeds up inference.

This example uses the quantization APIs provided by PaddleSlim to compress the OCR model.

It is recommended that you read the following pages before working through this example:

Quick Start

Quantization is best suited to deploying lightweight models on mobile devices. After training, if you want to further compress the model size and speed up inference, you can quantize the model by following these steps:

  1. Install PaddleSlim
  2. Prepare trained model
  3. Quantization-Aware Training
  4. Export inference model
  5. Deploy quantization inference model

1. Install PaddleSlim

pip3 install paddleslim==2.3.2
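
To verify that the installation succeeded, you can check the package version from Python (a minimal sketch; the expected output matches the pinned version above):

import paddleslim
print(paddleslim.__version__)  # should print 2.3.2 with the pinned install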

2. Download Pre-trained Model

PaddleOCR provides a series of pre-trained models. If the model you want to quantize is not in the list, you need to follow the regular training method to obtain a trained model first.

3. Quant-Aware Training

Quantization training includes offline (post-training) quantization and online quantization-aware training; online quantization-aware training generally gives better accuracy. It loads a pre-trained model, and once the quantization strategy is defined, the model can be quantized.
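
Under the hood, online quantization wraps the network with PaddleSlim's dygraph quantization-aware training (QAT) API: fake-quant ops are inserted and the model is then fine-tuned as usual. The following is a minimal illustrative sketch, not PaddleOCR's exact code; the config values and the placeholder backbone are assumptions:

import paddle
from paddleslim.dygraph.quant import QAT

# Illustrative quantization strategy; keys follow PaddleSlim's QAT config,
# the concrete values here are assumptions for the sketch.
quant_config = {
    'weight_quantize_type': 'channel_wise_abs_max',
    'activation_quantize_type': 'moving_average_abs_max',
    'weight_bits': 8,
    'activation_bits': 8,
    'dtype': 'int8',
    'quantizable_layer_type': ['Conv2D', 'Linear'],
}

model = paddle.vision.models.mobilenet_v3_small()  # placeholder for the OCR network

quanter = QAT(config=quant_config)
quanter.quantize(model)   # insert fake-quant ops into Conv2D/Linear layers
# ...continue with the normal training loop to fine-tune the quantized model...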

The code for quantization training is located in deploy/slim/quantization/quant.py. For example, the command to train a quantized PP-OCRv3 detection model is as follows:

# download provided model
wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar
tar xf ch_PP-OCRv3_det_distill_train.tar

python deploy/slim/quantization/quant.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml \
    -o Global.pretrained_model='./ch_PP-OCRv3_det_distill_train/best_accuracy' \
       Global.save_model_dir=./output/quant_model_distill/

If you want to quantize the text recognition model, modify the configuration file and the loaded model parameters accordingly.

4. Export inference model

Once we have the model after quantization-aware training and fine-tuning, we can export it as an inference model for deployment:

python deploy/slim/quantization/export_model.py -c configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml -o Global.checkpoints=output/quant_model_distill/best_accuracy Global.save_inference_dir=./output/quant_inference_model
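
To sanity-check the exported model, you can load it with the Paddle Inference Python API and run a dummy input through it (a minimal sketch; the file names assume the default inference.pdmodel / inference.pdiparams produced by the export step, and the input shape is only a placeholder for a detection input):

import numpy as np
from paddle import inference

config = inference.Config('./output/quant_inference_model/inference.pdmodel',
                          './output/quant_inference_model/inference.pdiparams')
predictor = inference.create_predictor(config)

# Feed a dummy image just to confirm the graph runs end to end.
input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
input_handle.copy_from_cpu(np.random.rand(1, 3, 640, 640).astype('float32'))
predictor.run()
output = predictor.get_output_handle(predictor.get_output_names()[0]).copy_to_cpu()
print(output.shape)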

5. Deploy

The parameters of the quantized model exported by the steps above are still stored as FP32, but their values are constrained to the int8 range. The exported model can be converted for mobile deployment with the opt tool of PaddleLite.
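
As an illustration, the conversion can also be driven from Python through the Opt interface shipped with the paddlelite wheel (a sketch under the assumption that paddlelite is installed; method names follow Paddle Lite's documented Opt API, and the paths reuse the export directory above):

from paddlelite.lite import Opt

opt = Opt()
opt.set_model_file('./output/quant_inference_model/inference.pdmodel')
opt.set_param_file('./output/quant_inference_model/inference.pdiparams')
opt.set_valid_places('arm')                         # target mobile ARM CPUs
opt.set_optimize_out('./output/quant_lite_model')   # output prefix for the .nb model
opt.run()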

For quantized model deployment, please refer to Mobile terminal model deployment.
