[MS][Lite] memory leak in parameter allocation

Status: DONE
Type: Bug-Report
Created: 2021-04-20 13:42
name: Bug Report
about: Use this template for reporting a bug
labels: kind/bug comp/infer

Environment

  • Hardware Environment(Ascend/GPU/CPU):

Uncomment only one /device <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

device cpu

  • Software Environment:
    -- MindSpore version (source or binary):
    -- Python version (e.g., Python 3.7.5):
    -- OS platform and distribution (e.g., Linux Ubuntu 16.04):
    -- GCC/Compiler version (if compiled from source):

Related testcase

Steps to reproduce the issue

Run the training benchmark for a single epoch:
benchmark_train --epochs=1 --modelFile=mobilenetv3_train.ms --inDataFile=train_io/mobilenetv3_input1.bin,train_io/mobilenetv3_input2.bin --expectedDataFile=train_io/mobilenetv3_output

This memory leak is unrelated to ToD. It is a bug in the scheduler, which allocates the parameter for the same node twice (see the backtraces below):
first via InferSubGraphShape, called from scheduler.cc:82,
then via ScheduleSubGraphToKernels, called from scheduler.cc:87.

It can also be reproduced by running the Lite benchmark application.

Describe the current behavior

Parameter allocation is called twice for the same node, which creates a memory leak.

Describe the expected behavior

Parameter allocation should be called only once per node.
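
The double allocation and one possible way to avoid it can be illustrated with a minimal, self-contained sketch. Every name here (the `OpParameter` and `Node` stand-ins, `PopulateParameter`, `ParameterCache`, the `g_live_params` counter) is hypothetical and written only for this illustration; none of it is taken from the MindSpore Lite sources, and the actual fix may look quite different.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <unordered_map>

// Hypothetical stand-ins for illustration only, not the MindSpore Lite types.
struct OpParameter { int type; };
struct Node { int id; };

static int g_live_params = 0;  // number of parameter blocks currently allocated

// Mimics a populate function: every call returns a fresh heap block.
OpParameter *PopulateParameter(const Node &) {
  auto *param = static_cast<OpParameter *>(std::malloc(sizeof(OpParameter)));
  if (param == nullptr) return nullptr;
  std::memset(param, 0, sizeof(OpParameter));
  ++g_live_params;
  return param;
}

void ReleaseParameter(OpParameter *param) {
  if (param == nullptr) return;
  assert(g_live_params > 0);
  std::free(param);
  --g_live_params;
}

// Leaky pattern: two passes populate the same node, but only the second
// result is owned (and later freed) by the kernel.
int LeakyDoublePopulate() {
  Node node;
  node.id = 1;
  OpParameter *first = PopulateParameter(node);   // e.g. during shape inference
  OpParameter *second = PopulateParameter(node);  // e.g. during kernel selection
  ReleaseParameter(second);                       // the kernel frees its own copy
  int leaked = g_live_params;                     // the first block is still live
  ReleaseParameter(first);                        // cleanup so the demo itself is leak-free
  return leaked;
}

struct ParamDeleter {
  void operator()(OpParameter *param) const { ReleaseParameter(param); }
};

// One possible guard: populate at most once per node and reuse the result.
class ParameterCache {
 public:
  OpParameter *GetOrCreate(const Node &node) {
    auto it = params_.find(node.id);
    if (it != params_.end()) return it->second.get();
    OpParameter *raw = PopulateParameter(node);
    if (raw == nullptr) return nullptr;
    params_[node.id].reset(raw);
    return raw;
  }

 private:
  std::unordered_map<int, std::unique_ptr<OpParameter, ParamDeleter>> params_;
};

// Two lookups for the same node share one allocation; returns the number of
// live blocks while the cache is still alive.
int CachedDoubleLookup() {
  ParameterCache cache;
  Node node;
  node.id = 1;
  OpParameter *a = cache.GetOrCreate(node);
  OpParameter *b = cache.GetOrCreate(node);
  assert(a == b);
  return g_live_params;
}
```

In this sketch, `LeakyDoublePopulate()` reports one block that nothing owns after the second pass, while `CachedDoubleLookup()` keeps a single block alive for both lookups and frees it when the cache goes out of scope. Caching by node is only one option; passing the already populated parameter from the first pass to the second would serve the same purpose.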

Related log / screenshot

First allocation:
#0 mindspore::kernel::PopulateSoftmaxCrossEntropyParameter (prim=0x7ffff5ac0994)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/train/train_populate_parameter.cc:144
#1 0x00007ffff7c1bde8 in mindspore::lite::Scheduler::InferNodeShape (this=0x7fffffffd540, node=0x555555684c30,
infer_shape_interrupt=0x7fffffffd32b) at /home/yoni/git-proj/mindspore/mindspore/lite/src/scheduler.cc:153
#2 0x00007ffff7c1c60e in mindspore::lite::Scheduler::InferSubGraphShape (this=0x7fffffffd540, subgraph_index=0,
infer_shape_interrupt=0x7fffffffd32b) at /home/yoni/git-proj/mindspore/mindspore/lite/src/scheduler.cc:204
#3 0x00007ffff7c1b3ab in mindspore::lite::Scheduler::Schedule (this=0x7fffffffd540, dst_kernels=0x5555556ca2c0)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/scheduler.cc:82
#4 0x00007ffff7c3008b in mindspore::lite::LiteSession::CompileGraph (this=0x5555556ca2b0, model=0x555555673c10)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/lite_session.cc:402
#5 0x00007ffff7c46935 in mindspore::lite::TrainSession::CompileTrainGraph (this=0x5555556ca0a0, model=0x555555673c10)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/train/train_session.cc:95
#6 0x00007ffff7c4a985 in mindspore::session::TrainSession::CreateSession (model=0x555555673c10, context=0x5555556728e0, train_mode=false)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/train/train_session.cc:488
#7 0x000055555558a0d6 in mindspore::lite::NetTrain::RunNetTrain (this=0x7fffffffda00)
at /home/yoni/git-proj/mindspore/mindspore/lite/tools/benchmark_train/net_train.cc:582
#8 0x000055555558da31 in mindspore::lite::RunNetTrain (argc=9, argv=0x7fffffffdfa8)
at /home/yoni/git-proj/mindspore/mindspore/lite/tools/benchmark_train/net_train.cc:920
#9 0x000055555558254e in main (argc=9, argv=0x7fffffffdfa8)
at /home/yoni/git-proj/mindspore/mindspore/lite/tools/benchmark_train/main.cc:23

Second allocation for the same node:
#0 mindspore::kernel::PopulateSoftmaxCrossEntropyParameter (prim=0x7ffff5ab9994)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/train/train_populate_parameter.cc:144
#1 0x00007ffff7c18a88 in mindspore::lite::Scheduler::InferNodeShape (this=0x7fffffffd530, node=0x555555684bb0,
infer_shape_interrupt=0x7fffffffd04b) at /home/yoni/git-proj/mindspore/mindspore/lite/src/scheduler.cc:153
#2 0x00007ffff7c1a791 in mindspore::lite::Scheduler::FindBackendKernel (this=0x7fffffffd530,
in_tensors=std::vector of length 2, capacity 2 = {...}, out_tensors=std::vector of length 2, capacity 2 = {...}, node=0x555555684bb0,
prefer_data_type=mindspore::kTypeUnknown) at /home/yoni/git-proj/mindspore/mindspore/lite/src/scheduler.cc:517
#3 0x00007ffff7c1afcb in mindspore::lite::Scheduler::ScheduleNodeToKernel (this=0x7fffffffd530, src_node=0x555555684bb0,
prefer_data_type=mindspore::kTypeUnknown) at /home/yoni/git-proj/mindspore/mindspore/lite/src/scheduler.cc:579
#4 0x00007ffff7c1b4e1 in mindspore::lite::Scheduler::ScheduleSubGraphToKernels (this=0x7fffffffd530, subgraph_index=0,
dst_kernels=0x5555556ca240, in_tensors=0x0, out_tensors=0x0, prefer_data_type=mindspore::kTypeUnknown)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/scheduler.cc:609
#5 0x00007ffff7c1813a in mindspore::lite::Scheduler::Schedule (this=0x7fffffffd530, dst_kernels=0x5555556ca240)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/scheduler.cc:87
#6 0x00007ffff7c2cd2b in mindspore::lite::LiteSession::CompileGraph (this=0x5555556ca230, model=0x555555673a30)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/lite_session.cc:402
#7 0x00007ffff7c435d5 in mindspore::lite::TrainSession::CompileTrainGraph (this=0x5555556ca020, model=0x555555673a30)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/train/train_session.cc:95
#8 0x00007ffff7c47625 in mindspore::session::TrainSession::CreateSession (model=0x555555673a30, context=0x5555556729d0, train_mode=false)
at /home/yoni/git-proj/mindspore/mindspore/lite/src/train/train_session.cc:488
#9 0x000055555558a0d6 in mindspore::lite::NetTrain::RunNetTrain (this=0x7fffffffd9f0)
at /home/yoni/git-proj/mindspore/mindspore/lite/tools/benchmark_train/net_train.cc:582
#10 0x000055555558da31 in mindspore::lite::RunNetTrain (argc=9, argv=0x7fffffffdf98)
at /home/yoni/git-proj/mindspore/mindspore/lite/tools/benchmark_train/net_train.cc:920
#11 0x000055555558254e in main (argc=9, argv=0x7fffffffdf98)
at /home/yoni/git-proj/mindspore/mindspore/lite/tools/benchmark_train/main.cc:23

Special notes for this issue

Comments (6)

yonibaehr created the Bug-Report
yonibaehr set the associated repository to MindSpore/mindspore

Hey yonibaehr_admin, Welcome to MindSpore Community.
All of the projects in MindSpore Community are maintained by @mindspore-ci-bot.
That means the developers can comment below every pull request or issue to trigger Bot Commands.
Please follow instructions at https://gitee.com/mindspore/community/blob/master/command.md to find the details.

Please add labels (comp or sig), for example, if you found an issue in data component, you can type "//comp/data" in comment, also you can visit "https://gitee.com/mindspore/community/blob/master/sigs/dx/docs/labels.md" to find more.

mindspore-dx-bot added the kind/bug label
yonibaehr edited the description
yonibaehr edited the description
zhunaipan set the assignee to zhanghaibo
zhunaipan added the comp/mslite (deleted) label
zhunaipan added the sig/mslite label
zhunaipan set the priority to Major

@zhanghaibo please help to check the bug report?

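yonibaehr edited the description
yonibaehr edited the description
ehaleva edited the description
ehaleva edited the title
zhanghaibo added collaborator zhanghaibo
zhanghaibo changed the assignee from zhanghaibo to lz
zhanghaibo added collaborator xutianchun
zhanghaibo added collaborator yonibaehr
lz added collaborator lz
lz changed the assignee from lz to hangq
lz removed collaborator lz
hangq added collaborator hangq
hangq changed the assignee from hangq to 张学同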

Hello @yonibaehr, has this problem been resolved? If yes, please close this issue. Thanks!

Hello @yonibaehr, has this problem been resolved? If yes, please close this issue. Thanks!

Hello @yonibaehr, has this problem been resolved? If it is in progress, please change the status to WIP. If the issue has been solved, please close this issue. Thanks!

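张学同 added collaborator 张学同
张学同 changed the assignee from 张学同 to lz
lz changed the task status from TODO to ACCEPTED
lz changed the task status from ACCEPTED to DONE
lz removed the comp/mslite (deleted) label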
