The original program is in SLIC_BK, with a Makefile added.
# All commands below assume the project root as the starting directory
cd SLIC_BK
make clean && make
./SLIC.out
The optimized program is in the SLIC directory. The macro #define THE_THREAD_NUMS in utils.h
sets the number of threads; adjust it to match your machine.
cd SLIC
make clean && make
./SLIC.out
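On the supercomputing cluster, run it as follows. Only a single node is used, and memory access has been optimized to some extent for the NUMA architecture.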
export OMP_PLACES=cores
srun -p amd_256 -N 1 -t 10 ./SLIC.out
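Notes:
The runs were performed on an AMD EPYC 7452 32-Core Processor (2 sockets),
i.e. a dual-socket machine with 32 cores per socket, 64 cores and 64 threads in total.
The original program takes about 5700 ms. With 32 threads the run time is about 78 ms, an overall speedup of about 73; with 62 threads it is about 57 ms, an overall speedup of about 100. (Note: this speedup does not come from parallelism alone. Once several single-thread optimizations are applied, the speedup attributable to parallelism itself is not impressive.)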
With 62 threads and the environment variable OMP_PLACES=cores
set, the output is as follows:
$ export OMP_PLACES=cores
$ chmod +x ./SLIC.out
$ srun -p amd_256 -N 1 -t 10 ./SLIC.out
srun: job 555293 queued and waiting for resources
srun: job 555293 has been allocated resources
width = 2599, height = 3898
sz = 10130902
Initial time = 0 ms
Conversion time = 20 ms
DeleteEdges and Get_Seeds time = 0 ms
numk = 196
Dist iter time=4(4) ms
Dist iter time=6(2) ms
Dist iter time=8(2) ms
Dist iter time=10(2) ms
Dist iter time=12(2) ms
Dist iter time=14(2) ms
Dist iter time=16(2) ms
Dist iter time=18(2) ms
Dist iter time=20(2) ms
Dist iter time=22(2) ms
Computing time=28 ms
STEP = 227
Segmentation time = 29 ms
EC1 time=0 ms
EC2 time=3 ms
EC3 time=0 ms
EC4 time=1 ms
EnforceLabelConnectivity time = 6 ms
Computing time=57 ms
There are 0 points' labels are different from original file.
Output of the original program:
$ srun -p amd_256 -N 1 -t 10 ./SLIC.out
srun: job 438538 queued and waiting for resources
srun: job 438538 has been allocated resources
Computing time=5780 ms
There are 0 points' labels are different from original file.
Output at an intermediate stage of the optimization:
$ srun -p amd_256 -N 1 -t 10 ./SLIC.out
srun: job 431514 queued and waiting for resources
srun: job 431514 has been allocated resources
width = 2599, height = 3898
sz = 10130902
Initial time = 3 ms
Conversion time = 80 ms
DeleteEdges and Get_Seeds time = 17 ms
numk = 196
Dist iter time=18(18) ms Dist iter time=0(0) ms
Dist iter time=28(10) ms Dist iter time=0(0) ms
Dist iter time=38(10) ms Dist iter time=0(0) ms
Dist iter time=48(10) ms Dist iter time=0(0) ms
Dist iter time=56(8) ms Dist iter time=0(0) ms
Dist iter time=66(10) ms Dist iter time=0(0) ms
Dist iter time=75(9) ms Dist iter time=0(0) ms
Dist iter time=77(2) ms Dist iter time=0(0) ms
Dist iter time=79(2) ms Dist iter time=0(0) ms
Dist iter time=81(2) ms Dist iter time=0(0) ms
STEP = 227
Segmentation time = 194 ms
EnforceLabelConnectivity time = 125 ms
Computing time=424 ms
There are 0 points' labels are different from original file.