MaskRCNN OPs stream optimization · Pull Request !26 · PaddlePaddle/Paddle - Gitee.com

Merged

PaddlePaddle-Gardener:stream_flow_opt PaddlePaddle:develop

Created on 2021-03-23 11:10

PR types

Performance optimization

PR changes

OPs

Describe

GetLengthLoD and GPUDistFpnProposalsHelper should run on the context stream.
Remove two unnecessary context waits (no data is transferred between host and device at those points).
sub_lod_data can be copied with one batched memcpy, which avoids repeated synchronization.
There is a ~1% end-to-end performance gain on trt-fp16/maskrcnn inference.
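
The batched-memcpy point can be illustrated with a minimal host-only sketch. It is not the code from this PR: `FakeDevice`, `CopyPerSegment`, and `CopyBatched` are hypothetical names, the LoD data is assumed to be a list of `size_t` offset vectors, and `FakeDevice::Copy` merely stands in for a host-to-device transfer (something like `cudaMemcpyAsync` on the context stream in real GPU code). The idea is to pack all segments into one contiguous staging buffer, remember each segment's offset, and issue a single transfer instead of one per segment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for device memory; counts how many transfers were issued.
struct FakeDevice {
  std::vector<std::size_t> memory;
  int transfer_count = 0;

  // Stand-in for one host-to-device copy (assumption: in GPU code this would
  // be an asynchronous copy enqueued on the context stream).
  void Copy(std::size_t dst_offset, const std::size_t* src, std::size_t n) {
    assert(dst_offset + n <= memory.size());
    if (n > 0) {
      std::memcpy(memory.data() + dst_offset, src, n * sizeof(std::size_t));
    }
    ++transfer_count;
  }
};

// Naive version: one transfer (and potentially one synchronization) per segment.
int CopyPerSegment(const std::vector<std::vector<std::size_t>>& segments,
                   FakeDevice* dev) {
  std::size_t total = 0;
  for (const auto& s : segments) total += s.size();
  dev->memory.assign(total, 0);
  dev->transfer_count = 0;

  std::size_t offset = 0;
  for (const auto& s : segments) {
    dev->Copy(offset, s.data(), s.size());
    offset += s.size();
  }
  return dev->transfer_count;
}

// Batched version: pack every segment into one staging buffer, then copy once.
// Returns the start offset of each segment inside the packed buffer.
std::vector<std::size_t> CopyBatched(
    const std::vector<std::vector<std::size_t>>& segments, FakeDevice* dev) {
  std::vector<std::size_t> offsets;
  std::vector<std::size_t> staging;
  for (const auto& s : segments) {
    offsets.push_back(staging.size());
    staging.insert(staging.end(), s.begin(), s.end());
  }
  dev->memory.assign(staging.size(), 0);
  dev->transfer_count = 0;
  dev->Copy(0, staging.data(), staging.size());  // single transfer
  return offsets;
}
```

With three segments, the per-segment version issues three transfers while the batched version issues one, and both leave the same bytes in the destination. The returned offsets let later code address each segment inside the packed buffer.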
