| name | about | labels |
| --- | --- | --- |
| Bug Report | Use this template for reporting a bug | kind/bug |
Describe the current behavior / 问题描述 (Mandatory / 必填)
On the GPU (A100) x86 + Ubuntu platform, the cyclegan model intermittently produces a loss of NaN during training.
Environment / 环境信息 (Mandatory / 必填)
- Hardware Environment (Ascend/GPU/CPU) / 硬件环境:
Please delete the backend not involved / 请删除不涉及的后端:
GPU
- Software Environment / 软件环境 (Mandatory / 必填):
-- MindSpore version (e.g., 1.7.0.Bxxx): http://mindspore-repo.csi.rnd.huawei.com/productrepo/HiAI/Milan_C17/20240308/
-- Python version (e.g., Python 3.7.5): Python 3.7.5
-- OS platform and distribution (e.g., Linux Ubuntu 16.04): version/202403/20240311/r2.3_20240311195546_226fd7468e1d63d6b71309580d058d2f5f836625
-- GCC/Compiler version (if compiled from source):
2.3 B070CI (run package Milan_C17/20240308)
- Execute Mode / 执行模式 (Mandatory / 必填) (PyNative/Graph):
Please delete the mode not involved / 请删除不涉及的模式:
/mode pynative/graph
Related testcase / 关联用例 (Mandatory / 必填)
test_ms_usability_benchmark_pynative_gpu_cyclegan_time_perf_loss_1p_0001
test_ms_usability_benchmark_graph_gpu_cyclegan_time_perf_loss_1p_0001
Steps to reproduce the issue / 重现步骤 (Mandatory / 必填)
1. Get the code from solution_test.
2. cd solution_test/cases/02network/00cv/cyclegan/pynative/
3. Taking one of the test cases as an example, run: pytest -s test_ms_usability_benchmark_pynative_gpu_cyclegan_time_perf_loss_1p_0001.py
4. Verify whether network training still intermittently produces a loss of NaN.
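Because the failure is intermittent, a single run may pass. The following is a minimal sketch of how the reproduction steps could be repeated until a NaN loss shows up. It assumes the test prints its loss values to stdout or stderr on lines that contain the word "loss"; the actual cyclegan log format is not shown in this report, so the pattern and the helper names `has_nan_loss` and `repeat_until_nan` are hypothetical and may need adjusting.

```python
import re
import subprocess
import sys

# Assumption: a NaN loss appears as a line mentioning "loss" followed by a
# standalone "nan" token (any case). Adjust to the real log format.
NAN_LOSS = re.compile(r"loss.*?\bnan\b", re.IGNORECASE)


def has_nan_loss(output: str) -> bool:
    """Return True if any line of the captured output reports a NaN loss."""
    return any(NAN_LOSS.search(line) for line in output.splitlines())


def repeat_until_nan(test_file: str, max_runs: int = 10) -> int:
    """Run the test up to max_runs times.

    Returns the 1-based index of the first run whose output contains a
    NaN loss, or 0 if no run did.
    """
    for run in range(1, max_runs + 1):
        proc = subprocess.run(
            [sys.executable, "-m", "pytest", "-s", test_file],
            capture_output=True,
            text=True,
        )
        if has_nan_loss(proc.stdout + proc.stderr):
            return run
    return 0
```

From the pynative test directory, this could be called as `repeat_until_nan("test_ms_usability_benchmark_pynative_gpu_cyclegan_time_perf_loss_1p_0001.py", max_runs=20)`. A return value of 0 only means the issue did not reproduce within the chosen number of runs.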
Describe the expected behavior / 预期结果 (Mandatory / 必填)
On the GPU (A100) x86 + Ubuntu platform, the cyclegan model should not intermittently produce a loss of NaN during training.
Related log / screenshot / 日志 / 截图 (Mandatory / 必填)
Special notes for this issue/备注 (Optional / 选填)
https://testreporter.szv.dragon.tools.huawei.com/TestDataBot/analysis/taskdetailes?productLine=2012%20Laboratories&taskId=f432077aa31cd02c4cb674ec7089039c1dca555f0dc6f087109a1b732dd3204d&tmssPath=%2F03200tqk2t5d0%2F03210v300ep51%2F031j0vd3316oi%2F&title=DT_MindSpore_Net_smoke_Test_r2.3_20240313_B070_2024-03-13%2009:17:47&productId=mindspore&cidaProjectId=6473a8ad2e914293b9f537b00979fbc7&isMergedTask=true&testcaseid=65f08b5e4076c56e2c6f341f&workspaceId=65f08b5e0cca61569c218d09
Execution history of the cyclegan model in PyNative mode on the GPU (A100) x86 + Ubuntu platform:
https://testreporter.szv.dragon.tools.huawei.com/TestDataBot/analysis/taskdetailes?productLine=2012%20Laboratories&taskId=f432077aa31cd02c4cb674ec7089039c1dca555f0dc6f087109a1b732dd3204d&tmssPath=%2F03200tqk2t5d0%2F03210v300ep51%2F031j0vd3316oi%2F&title=DT_MindSpore_Net_smoke_Test_r2.3_20240313_B070_2024-03-13%2009:17:47&productId=mindspore&cidaProjectId=6473a8ad2e914293b9f537b00979fbc7&isMergedTask=true&testcaseid=65f08cee4076c56e2c6f39b2&workspaceId=65f08cee2f0dd35eae53bca1
Execution history of the cyclegan model in Graph mode on the GPU (A100) x86 + Ubuntu platform: