DeepSORT (Deep Cosine Metric Learning SORT) extends the original SORT (Simple Online and Realtime Tracking) algorithm by adding a CNN model that extracts appearance features from the person image regions bounded by a detector. It integrates this appearance information through a deep appearance descriptor and associates detected targets with existing trajectories, much like a ReID task. The detection bounding boxes required by DeepSORT can be generated by any detection model, and the saved detection result file can then be loaded for tracking. Here we select the PCB + Pyramid ResNet-101 model provided by PaddleClas as the ReID model.
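To illustrate the appearance-association step described above, here is a minimal sketch (not PaddleDetection's actual implementation) of the cosine distance between L2-normalized ReID embeddings that DeepSORT uses to match detections to trajectories:

```python
import numpy as np

def cosine_distance(tracks, detections):
    """Pairwise cosine distance between track and detection embeddings.

    Each row is one embedding vector; a smaller distance means a more
    similar appearance, so matched pairs sit near zero.
    """
    a = tracks / np.linalg.norm(tracks, axis=1, keepdims=True)
    b = detections / np.linalg.norm(detections, axis=1, keepdims=True)
    return 1.0 - a @ b.T

emb = np.array([[1.0, 0.0], [0.0, 1.0]])
print(cosine_distance(emb, emb))  # near 0 on the diagonal, 1 off-diagonal
```

In DeepSORT this appearance cost is combined with a Mahalanobis motion gate before the final assignment step.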
backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | det result/model | ReID model | config |
---|---|---|---|---|---|---|---|---|---|---|
ResNet-101 | 1088x608 | 72.2 | 60.5 | 998 | 8054 | 21644 | - | det result | ReID model | config |
ResNet-101 | 1088x608 | 68.3 | 56.5 | 1722 | 17337 | 15890 | - | det model | ReID model | config |
backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | det result/model | ReID model | config |
---|---|---|---|---|---|---|---|---|---|---|
ResNet-101 | 1088x608 | 64.1 | 53.0 | 1024 | 12457 | 51919 | - | det result | ReID model | config |
ResNet-101 | 1088x608 | 61.2 | 48.5 | 1799 | 25796 | 43232 | - | det model | ReID model | config |
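The MOTA values in the tables combine the listed error counts. As a reference, a minimal sketch of the standard MOTA formula (`num_gt`, the total number of ground-truth boxes, is not shown in the tables):

```python
def mota(fn, fp, ids, num_gt):
    """Multiple Object Tracking Accuracy:
    1 - (misses + false positives + ID switches) / total ground-truth boxes."""
    return 1.0 - (fn + fp + ids) / num_gt

print(mota(fn=10, fp=10, ids=0, num_gt=100))  # 0.8
```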
Notes: DeepSORT does not need training on the MOT dataset; it is used for evaluation only. Two evaluation methods are currently supported.
1. Load the saved detection result file and the ReID model. The detection result files should be organized as follows:

```
det_results_dir
|——————MOT16-02.txt
|——————MOT16-04.txt
|——————MOT16-05.txt
|——————MOT16-09.txt
|——————MOT16-10.txt
|——————MOT16-11.txt
|——————MOT16-13.txt
```
For the MOT16 dataset, you can download `det_results_dir.zip`, a matched detection result provided by PaddleDetection:

```bash
wget https://dataset.bj.bcebos.com/mot/det_results_dir.zip
```
If you use a stronger detection model, you can get better results. Each txt file contains the detection results for all frames extracted from one video, and each line describes a bounding box in the following format:

```
[frame_id],[bb_left],[bb_top],[width],[height],[conf]
```
- `frame_id` is the frame number of the image.
- `bb_left` is the x coordinate of the left edge of the object box.
- `bb_top` is the y coordinate of the top edge of the object box.
- `width,height` are the pixel width and height of the box.
- `conf` is the object confidence score, with a default value of 1 (the results have already been filtered by the detection score threshold).
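Since the format stores the top-left corner plus width and height, converting a line to corner coordinates (x1, y1, x2, y2) is a common first step. A small sketch, with field names following the format above:

```python
def det_line_to_xyxy(line):
    """Parse one detection result line and return (frame_id, [x1, y1, x2, y2], conf)."""
    frame_id, bb_left, bb_top, width, height, conf = map(float, line.split(","))
    x1, y1 = bb_left, bb_top                      # top-left corner
    x2, y2 = bb_left + width, bb_top + height     # bottom-right corner
    return int(frame_id), [x1, y1, x2, y2], conf

print(det_line_to_xyxy("1,100,50,60,120,1"))  # (1, [100.0, 50.0, 160.0, 170.0], 1.0)
```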
2. Load the detection model and the ReID model at the same time. Here the JDE version of YOLOv3 is selected. For configuration details, see `configs/mot/deepsort/_base_/deepsort_yolov3_darknet53_pcb_pyramid_r101.yml`.
```bash
# Load the result file and the ReID model to get the tracking results
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_pcb_pyramid_r101.yml --det_results_dir {your detection results}

# Load the detection model and the ReID model to get the tracking results
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml
```
Inference on a video can be run on a single GPU with the following command:
```bash
# inference on a video and save the result video
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml --video_file={your video name}.mp4 --save_videos
```
Notes:
Please make sure that ffmpeg is installed first. On Linux (Ubuntu) you can install it directly with the following command: `apt-get update && apt-get install -y ffmpeg`.
1. Export the detection model:

```bash
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/jde_yolov3_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_yolov3_darknet53_30e_1088x608.pdparams
```
2. Export the ReID model:

```bash
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort_pcb_pyramid_r101.pdparams
```

or

```bash
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/deepsort_pcb_pyramid_r101.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort_pcb_pyramid_r101.pdparams
```
Then run inference with the exported models:

```bash
python deploy/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608/ --reid_model_dir=output_inference/deepsort_yolov3_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --save_mot_txts
```
Notes:
The tracking model only supports video prediction and cannot predict a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result files, or `--save_images` to save the visualization images.
```
@inproceedings{Wojke2017simple,
  title={Simple Online and Realtime Tracking with a Deep Association Metric},
  author={Wojke, Nicolai and Bewley, Alex and Paulus, Dietrich},
  booktitle={2017 IEEE International Conference on Image Processing (ICIP)},
  year={2017},
  pages={3645--3649},
  organization={IEEE},
  doi={10.1109/ICIP.2017.8296962}
}

@inproceedings{Wojke2018deep,
  title={Deep Cosine Metric Learning for Person Re-identification},
  author={Wojke, Nicolai and Bewley, Alex},
  booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2018},
  pages={748--756},
  organization={IEEE},
  doi={10.1109/WACV.2018.00087}
}
```