name | about | labels |
---|---|---|
Bug Report | Use this template for reporting a bug | kind/bug |
Dynamic shape scenario, graph mode, dataset sink mode: when set_inputs is not set and sink_size > 1, an exception occurs, and the exception message is not clear enough.
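To illustrate the kind of message that would make this failure easier to diagnose, here is a minimal sketch of a pre-flight check in plain Python. The helper `check_sink_config` and its parameters are hypothetical and are not part of MindSpore. It assumes the per-batch shapes of the dataset are known up front, and it uses the `sink_size > 1` condition exactly as stated in this report.

```python
from typing import Sequence, Tuple


def check_sink_config(batch_shapes: Sequence[Tuple[int, ...]],
                      sink_size: int,
                      inputs_declared_dynamic: bool) -> None:
    """Raise a descriptive error for the unsupported combination in this report.

    Hypothetical helper for illustration only, not a MindSpore API.
    """
    # The dataset is treated as dynamic when batches do not all share one shape.
    is_dynamic = len({tuple(shape) for shape in batch_shapes}) > 1
    if is_dynamic and sink_size > 1 and not inputs_declared_dynamic:
        raise ValueError(
            "Dataset shapes vary across batches, but the network inputs were "
            f"not declared dynamic and sink_size is {sink_size}. "
            "Call set_inputs with dynamic shapes, or use sink_size=1."
        )
```

A check of this kind would fail before any graph is launched, so the user sees the cause directly instead of a generic GetNext warning followed by a RuntimeError.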
Hardware Environment (Ascend/GPU/CPU):
Please delete the backend not involved:
/device ascend
Software Environment (Mandatory):
-- MindSpore version (e.g., 1.7.0.Bxxx) :
-- Python version (e.g., Python 3.7.5) :
-- OS platform and distribution (e.g., Linux Ubuntu 16.04):
-- GCC/Compiler version (if compiled from source):
Execute Mode (Mandatory) (PyNative/Graph):
Please delete the mode not involved:
/mode graph
test_ms_dynamic_shape_h_rank_dy_not_set_inputs_0001
Expected result: the test case runs successfully.
[WARNING] DEVICE(130913,fffd758250f0,python):2023-05-06-14:51:28.952.297 [mindspore/ccsrc/plugin/device/ascend/hal/device/ascend_kernel_runtime.cc:776] DumpTaskExceptionInfo] GetNext error may be caused by slow data processing (bigger than 20s / batch) or transfer data to device error.
[WARNING] DEVICE(130913,fffd758250f0,python):2023-05-06-14:51:28.952.315 [mindspore/ccsrc/plugin/device/ascend/hal/device/ascend_kernel_runtime.cc:778] DumpTaskExceptionInfo] Suggestion:
[WARNING] DEVICE(130913,fffd758250f0,python):2023-05-06-14:51:28.952.331 [mindspore/ccsrc/plugin/device/ascend/hal/device/ascend_kernel_runtime.cc:779] DumpTaskExceptionInfo] 1) Set the parameter dataset_sink_mode=False of model.train(...) or model.eval(...) and try again.
[WARNING] DEVICE(130913,fffd758250f0,python):2023-05-06-14:51:28.952.347 [mindspore/ccsrc/plugin/device/ascend/hal/device/ascend_kernel_runtime.cc:781] DumpTaskExceptionInfo] 2) Reduce the batch_size in data processing and try again.
[WARNING] DEVICE(130913,fffd758250f0,python):2023-05-06-14:51:28.952.363 [mindspore/ccsrc/plugin/device/ascend/hal/device/ascend_kernel_runtime.cc:782] DumpTaskExceptionInfo] 3) You can create iterator by interface create_dict_iterator() of dataset class to independently verify the performance of data processing without training. Refer to the link for data processing optimization suggestions: https://mindspore.cn/tutorials/experts/zh-CN/master/dataset/optimize.html
[WARNING] DEVICE(130913,fffd758250f0,python):2023-05-06-14:51:28.952.379 [mindspore/ccsrc/plugin/device/ascend/hal/device/ascend_kernel_runtime.cc:786] DumpTaskExceptionInfo] 4) If it is a dynamic dataset, please set the input to dynamic through `set_inputs`, or set sink_size to 1. It is recommended to use the former, because the latter has poor performance.
[CRITICAL] DEVICE(130913,fffd758250f0,python):2023-05-06-14:51:28.977.556 [mindspore/ccsrc/plugin/device/ascend/hal/hardware/ascend_graph_executor.cc:256] RunGraph] Run task for graph:kernel_graph_1 error! The details refer to 'Ascend Error Message'.
[WARNING] MD(130913,fffca6ffd0f0,python):2023-05-06-14:51:28.978.377 [mindspore/ccsrc/minddata/dataset/engine/datasetops/data_queue_op.cc:280] SendDataToAscend] Thread has already been terminated.
[ERROR] DEBUG(130913,ffff8b50a440,python):2023-05-06-14:51:29.269.453 [mindspore/ccsrc/debug/rdr/graph_recorder.cc:41] DumpIRProto] Open file '/home/jenkins0/workspace/TDT_deployment/solution_test/cases/03subject_test/02usability/model_develop/dynamic_shape/test_ms_dynamic_shape_hw_dy_not_set_inputs_0001_GRAPH_MODE/rank_0/rdr/SESSION.graph_build.0.20230506145102.pb' failed!
[ERROR] DEBUG(130913,ffff8b50a440,python):2023-05-06-14:51:29.569.766 [mindspore/ccsrc/debug/rdr/graph_recorder.cc:41] DumpIRProto] Open file '/home/jenkins0/workspace/TDT_deployment/solution_test/cases/03subject_test/02usability/model_develop/dynamic_shape/test_ms_dynamic_shape_hw_dy_not_set_inputs_0001_GRAPH_MODE/rank_0/rdr/SESSION.graph_build.1.20230506145127.pb' failed!
Traceback (most recent call last):
  File "../test_ms_dynamic_shape_hw_dy_not_set_inputs_0001_GRAPH_MODE/train_custom_single_input_net.py", line 76, in <module>
    train_net_with_model()
  File "../test_ms_dynamic_shape_hw_dy_not_set_inputs_0001_GRAPH_MODE/train_custom_single_input_net.py", line 63, in train_net_with_model
    sink_size=config.sink_size)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/train/model.py", line 1066, in train
    initial_epoch=initial_epoch)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/train/model.py", line 100, in wrapper
    func(self, *args, **kwargs)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/train/model.py", line 617, in _train
    cb_params, sink_size, initial_epoch, valid_infos)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/train/model.py", line 700, in _train_dataset_sink_process
    outputs = train_network(*inputs)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/nn/cell.py", line 620, in __call__
    out = self.compile_and_run(*args, **kwargs)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/nn/cell.py", line 942, in compile_and_run
    return _cell_graph_executor(self, *new_args, phase=self.phase)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/common/api.py", line 1439, in __call__
    return self.run(obj, *args, phase=phase)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/common/api.py", line 1478, in run
    return self._exec_pip(obj, *args, phase=phase_real)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/common/api.py", line 102, in wrapper
    results = fn(*arg, **kwargs)
  File "/home/miniconda3/envs/ci/lib/python3.7/site-packages/mindspore/common/api.py", line 1458, in _exec_pip
    return self._graph_executor(args, phase)
RuntimeError: Run task for graph:kernel_graph_1 error! The details refer to 'Ascend Error Message'.
----------------------------------------------------
- C++ Call Stack: (For framework developers)
----------------------------------------------------
mindspore/ccsrc/plugin/device/ascend/hal/hardware/ascend_graph_executor.cc:256 RunGraph
test_dynamic_shape_train_dataset_change
set_inputs is not set and the output shape of the GetNext operator changes between steps. This scenario needs to be adapted.
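One way to prepare for the `set_inputs` route is to work out which dimensions actually vary across batches. The sketch below is plain Python: `dynamic_signature` is a hypothetical helper, the NCHW layout and the example shapes are assumptions, and the commented MindSpore usage at the end is an assumption about how the result would be passed on, not something verified against this test case.

```python
from typing import List, Optional, Sequence, Tuple


def dynamic_signature(batch_shapes: Sequence[Tuple[int, ...]]) -> List[Optional[int]]:
    """Return one shape in which every dimension that varies is replaced by None."""
    if not batch_shapes:
        raise ValueError("at least one batch shape is required")
    rank = len(batch_shapes[0])
    if any(len(shape) != rank for shape in batch_shapes):
        raise ValueError("all batches must have the same rank")
    signature: List[Optional[int]] = []
    # zip(*batch_shapes) groups the sizes of each dimension across all batches.
    for dims in zip(*batch_shapes):
        signature.append(dims[0] if len(set(dims)) == 1 else None)
    return signature


# Example: N, H and W vary while C stays fixed (NCHW layout assumed).
shapes = [(2, 3, 32, 32), (4, 3, 48, 64)]
signature = dynamic_signature(shapes)
print(signature)

# Assumed MindSpore usage (not executed here):
#   import mindspore as ms
#   net.set_inputs(ms.Tensor(shape=signature, dtype=ms.float32))
```

Declaring only the varying dimensions as dynamic keeps the fixed ones visible to the compiler, which is generally preferable to marking every dimension as unknown.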
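Graph-mode dynamic shape issue; moved to 2.2.
Decision from the TDT regular meeting: in version 2.1, the supported dynamic shape scenarios stay the same as in 2.0, mainly dynamic shape in dynamic graph mode. Issues for static graph dynamic shape and similar scenarios are assigned to version 2.2.
Currently, on GPU, the graph-mode dynamic shape scenario with "set_inputs not set" + "sink_size greater than 1" does not support dynamic shape. The following issue is related:
[MS][ST][DYN] On GPU, in graph mode, in the dynamic shape scenario with dynamic N, H, and W dimensions, an illegal memory exception occurs when set_Inputs is not set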
https://e.gitee.com/mind_spore/dashboard?issue=I7BEWD