
The result of fibonacci is wrong.

Status: DONE
Label: Bug-Report
Created on 2021-04-23 11:14

name: Bug Report
about: Use this template for reporting a bug
labels: kind/bug

Environment

  • Hardware Environment(Ascend/GPU/CPU):
    /device ascend

  • Software Environment:
    -- MindSpore version (source or binary):
    -- Python version (e.g., Python 3.7.5):
    -- OS platform and distribution (e.g., Linux Ubuntu 16.04):
    -- GCC/Compiler version (if compiled from source):

Related testcase

```python
import mindspore.ops.composite as C
from mindspore import context
from mindspore import Tensor
import mindspore as ms
from mindspore.common.api import ms_function

context.set_context(mode=context.GRAPH_MODE, save_graphs=True, save_graphs_path='./tir')
grad_by_all = C.GradOperation(get_all=True)
ONE = Tensor(1, ms.int32)
ZERO = Tensor(0, ms.int32)

@ms_function
def fibonacci(n):
    if n < 1:
        return ZERO
    elif n == 1:
        return ONE
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)


x = Tensor(5, ms.int32)
print(x)
y = fibonacci(x)
print(y)
```
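For reference, the same recursion evaluated in plain Python (no graph compilation) gives the expected value, so any other result below is introduced by compilation or execution. This reference script is an addition for illustration, not part of the original report:

```python
# Plain-Python reference for the recursion in the testcase above,
# used only to establish the expected result (not MindSpore code).
def fibonacci_ref(n):
    if n < 1:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci_ref(n - 1) + fibonacci_ref(n - 2)

print(fibonacci_ref(5))  # prints: 5
```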

## Steps to reproduce the issue
1. python fibonacci.py
2. [ERROR] GE(108109,python):2021-05-08-15:57:50.620.928 [mindspore/ccsrc/runtime/device/ascend/ge_runtime/runtime_model.cc:231] Run] Call rt api rtStreamSynchronize failed, ret: 7bc83


## Describe the current behavior

## Describe the expected behavior

## Related log / screenshot

## Special notes for this issue

Comments (16)

lanzhineng created the Bug-Report
lanzhineng set the linked repository to MindSpore/mindspore

Hey lanzhineng, Welcome to MindSpore Community.
All of the projects in MindSpore Community are maintained by @mindspore-ci-bot.
That means the developers can comment below every pull request or issue to trigger Bot Commands.
Please follow instructions at https://gitee.com/mindspore/community/blob/master/command.md to find the details.

Please add labels (comp or sig), for example, if you found an issue in data component, you can type "//comp/data" in comment, also you can visit "https://gitee.com/mindspore/community/blob/master/sigs/dx/docs/labels.md" to find more.

mindspore-dx-bot set the assignee to lanzhineng
mindspore-dx-bot added the kind/bug label
lanzhineng edited the title
zhangqinghua edited the description
zhangqinghua set the priority to Critical
lanzhineng edited the description

```python
import mindspore.ops.composite as C
from mindspore import context
from mindspore import Tensor
import mindspore as ms
from mindspore.common.api import ms_function

context.set_context(mode=context.GRAPH_MODE, save_graphs=True, save_graphs_path='./tir')
grad_by_all = C.GradOperation(get_all=True)
ONE = Tensor(1, ms.int32)
ZERO = Tensor(0, ms.int32)

@ms_function
def fibonacci(n):
    if n < 1:
        return ZERO
    elif n == 1:
        return ONE
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)


x = Tensor(5, ms.int32)
print(x)
y = fibonacci(x)
print(y)
```

After changing the constants to Tensors, the front end generates the correct graph, but it hangs during graph execution.

```
(gdb) bt
#0  0x0000ffffbf697c38 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
#1  0x0000ffffa3e74048 in __gthread_cond_wait (__mutex=, __cond=)
    at /home/isuru/miniforge3/conda-bld/ctng-compilers_1589429670044/work/.build/aarch64-conda_cos7-linux-gnu/build/build-cc-gcc-final/aarch64-conda_cos7-linux-gnu/libstdc++-v3/include/aarch64-conda_cos7-linux-gnu/bits/gthr-default.h:877
#2  std::condition_variable::wait (this=, __lock=...)
    at /home/isuru/miniforge3/conda-bld/ctng-compilers_1589429670044/work/.build/aarch64-conda_cos7-linux-gnu/src/gcc/libstdc++-v3/src/c++11/condition_variable.cc:53
#3  0x0000ffffaaaa8dd8 in std::condition_variable::wait<mindspore::tensor::WaitEvent::Wait() const::{lambda()#1}>(std::unique_lock<std::mutex>&, mindspore::tensor::WaitEvent::Wait() const::{lambda()#1}) (
    this=0xaaaaad5898f0, __lock=..., __p=...) at /usr/include/c++/7.3.0/condition_variable:99
#4  0x0000ffffaaaa5770 in mindspore::tensor::WaitEvent::Wait (this=0xaaaaad5898b0)
    at /ssd1/lzn/mindspore/mindspore/core/ir/tensor.h:92
#5  0x0000ffffaaaa5b88 in mindspore::tensor::Tensor::Wait (this=0xaaaaad5fc780)
    at /ssd1/lzn/mindspore/mindspore/core/ir/tensor.h:327
#6  0x0000ffffac308478 in mindspore::TensorToPyData (tensor=...)
    at /ssd1/lzn/mindspore/mindspore/ccsrc/utils/convert_utils_py.cc:46
#7  0x0000ffffac309924 in mindspore::ValuePtrToPyData (value=...)
    at /ssd1/lzn/mindspore/mindspore/ccsrc/utils/convert_utils_py.cc:122
#8  0x0000ffffac30b4bc in mindspore::BaseRefToPyData (value=...)
    at /ssd1/lzn/mindspore/mindspore/ccsrc/utils/convert_utils_py.cc:227
#9  0x0000ffffabd29e60 in mindspore::pipeline::ExecutorPy::Run (this=0xaaaaaaef13b0, args=..., phase=...)
    at /ssd1/lzn/mindspore/mindspore/ccsrc/pipeline/jit/pipeline.cc:934
```

```
subgraph @19_5_✗✗fibonacci.67(%para3_n) {
  %0([CNode]37) = Sub(%para3_n, Tensor(shape=[], dtype=Int32, value= 1)) primitive_attrs: {output_names: [output], input_names: [x, y]}
      : (<Tensor[Int32]x[const vector][]>, <Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file /ssd1/lzn/mindspore/mindspore/ops/composite/multitype_ops/sub_impl.py(50)/ return F.tensor_sub(x, y)/
  %1([CNode]11) = call @15_1_fibonacci.65(%0)
      : (<Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file fib.py(18)/ return fibonacci(n-1) + fibonacci(n-2)/
  %2([CNode]37) = Sub(%para3_n, Tensor(shape=[], dtype=Int32, value= 2)) primitive_attrs: {output_names: [output], input_names: [x, y]}
      : (<Tensor[Int32]x[const vector][]>, <Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file /ssd1/lzn/mindspore/mindspore/ops/composite/multitype_ops/sub_impl.py(50)/ return F.tensor_sub(x, y)/
  %3([CNode]8) = call @15_1_fibonacci.65(%2)
      : (<Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file fib.py(18)/ return fibonacci(n-1) + fibonacci(n-2)/
  %4([CNode]39) = Add(%1, %3) primitive_attrs: {output_names: [output], input_names: [x, y]}
      : (<Tensor[Int32]x[const vector][]>, <Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file /ssd1/lzn/mindspore/mindspore/ops/composite/multitype_ops/add_impl.py(129)/ return F.add(x, y)/
  Return(%4)
      : (<Tensor[Int32]x[const vector][]>)
      # In file fib.py(18)/ return fibonacci(n-1) + fibonacci(n-2)/
}
```
The graph in 11_validate_0073.ir is correct.

```python
import mindspore.ops.composite as C
from mindspore import context
from mindspore import Tensor
import mindspore as ms
from mindspore.common.api import ms_function

context.set_context(mode=context.GRAPH_MODE, save_graphs=True, save_graphs_path='./tir')
grad_by_all = C.GradOperation(get_all=True)
ONE = Tensor(1, ms.int32)
ZERO = Tensor(0, ms.int32)

@ms_function
def fibonacci(n):
    if n < 1:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)


x = Tensor(5, ms.int32)
print(x)
y = fibonacci(x)
print(y)
```

```
(base) lzn@dggphispre18279:~/tests$ python fib.py
5
[WARNING] DEBUG(340,python):2021-05-08-14:44:47.702.218 [mindspore/ccsrc/debug/debugger/debugger.cc:80] Debugger] Not enabling debugger. Debugger does not support CPU.
[WARNING] CORE(340,python):2021-05-08-14:44:47.884.654 [mindspore/core/ir/anf_extends.cc:62] fullname_with_scope] Input 0 of cnode is not a value node, its type is CNode.
2
```

The result is wrong.
The problem is that when the computed result is a scalar, it is not generalized correctly:

```
subgraph @16_5_✗✗fibonacci.58() {
  Return(2)
      : ()
      # In file /home/lzn/tests/fib.py(18)/ return fibonacci(n-1) + fibonacci(n-2)/
}
```

This should return a call to the subgraph, not a constant.
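A toy sketch of that diagnosis (an illustration, not MindSpore internals): if the compiler folds a scalar-returning recursive call into a constant instead of emitting a subgraph call, the recursion collapses and the else-branch freezes to 1 + 1 = 2, which matches the printed result:

```python
# Toy illustration of the scalar-generalization bug described above.
# NOT MindSpore code; fib_miscompiled models the effect of folding
# scalar-returning subgraph calls into the base-case constant.

def fib_correct(n):
    # What the graph should compute: recursive calls stay live.
    if n < 1:
        return 0
    elif n == 1:
        return 1
    return fib_correct(n - 1) + fib_correct(n - 2)

def fib_miscompiled(n):
    # Each recursive call replaced by the folded constant 1,
    # so every n > 1 yields 2, just like the Return(2) subgraph.
    if n < 1:
        return 0
    elif n == 1:
        return 1
    return 1 + 1

print(fib_correct(5), fib_miscompiled(5))  # prints: 5 2
```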

lanzhineng added collaborator lanzhineng
lanzhineng changed the assignee from lanzhineng to zhangbuxue

[screenshot]
Changed like this, it still fails, so this is not really about scalar generalization.

[screenshot]
Changed like this, it passes.

[screenshot]

The error comes from the backend. It looks like the backend may not support control flow whose output is a scalar; backend colleagues need to take a look together.

zhangbuxue added collaborator zhangbuxue
zhangbuxue changed the assignee from zhangbuxue to lanzhineng
zhangbuxue removed collaborator lanzhineng

```
(ci3.7) [root@bms-aiserver-pod12-170-21 test]# python fib.py
5
[ERROR] GE(114356,python):2021-05-08-16:01:32.880.257 [mindspore/ccsrc/runtime/device/ascend/ge_runtime/runtime_model.cc:231] Run] Call rt api rtStreamSynchronize failed, ret: 7bc83
[ERROR] DEVICE(114356,python):2021-05-08-16:01:32.880.602 [mindspore/ccsrc/runtime/device/ascend/ascend_kernel_runtime.cc:623] DumpTaskExceptionInfo] Task fail infos task_id: 4, stream_id: 3, tid: 114468, device_id: 4, retcode: 507011
[ERROR] DEVICE(114356,python):2021-05-08-16:01:32.880.633 [mindspore/ccsrc/runtime/device/ascend/ascend_kernel_runtime.cc:632] DumpTaskExceptionInfo] Dump node (Default/StackPush-op41) task error input/output data to: ./task_error_dump/4 trace:
[ERROR] SESSION(114356,python):2021-05-08-16:01:32.894.691 [mindspore/ccsrc/backend/session/ascend_session.cc:1199] Execute] run task error!
```

lanzhineng set the planned start date to 2021-05-15
lanzhineng set the planned due date to 2021-06-20
lanzhineng changed the planned start date from 2021-05-15 to 2021-05-08
lanzhineng changed the planned due date from 2021-06-20 to 2021-05-23
lanzhineng added collaborator lanzhineng
lanzhineng changed the assignee from lanzhineng to jjfeing
lanzhineng added collaborator liangzelang

The scalar issue will be fixed on the front end; please have the backend resolve the Tensor issue.

lanzhineng edited the description
jjfeing added collaborator jjfeing
jjfeing changed the assignee from jjfeing to liangzelang
jjfeing removed collaborator liangzelang

With `export ASCEND_SLOG_PRINT_TO_STDOUT=1` enabled, the Run task error turns out to be caused by the StackPush operator failing.
[screenshot]

Asking AICPU operator colleague @yanzhenxiang2020 to help further analyze the cause of the operator error.

liangzelang added the v1.0.0 label

Progress as of 05.11:

The StackPush operator reports an error.

  • The device-side logs show that the StackPush operator cannot find the stack with index 0; the execution order reveals this is because the StackInit/StackDestroy operators are in the wrong position.
  • The operators' execution order is wrong because the code previously assumed the root graph is never called by other subgraphs, so it would have no start_label_/end_goto_ and the inserted StackInit/StackDestroy order would not be modified when the execution order is generated; but if the root graph does have start_label_/end_goto_, the order of StackInit and related operators gets adjusted at that point.
  • The corresponding fix will be submitted in a follow-up PR.

After the fix above, the error becomes:
[screenshot]

Analyzing the execution order and the backend graph shows this testcase is rather special:
[screenshot]

  • The root graph, being a subgraph that is called multiple times, has a LabelSwitch appended at the end so it can return to the actual call site.
  • As a result, the root graph can never exit.
  • The current multi-subgraph-call scheme cannot support the root graph being a subgraph that is called multiple times; let's convert this into a feature request.
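The deadlock reasoning in the bullets above can be sketched with a tiny interpreter (entirely a toy model, not the Ascend runtime): each subgraph call records a return site, and a graph ending in a LabelSwitch-style return consumes one; if the root graph also ends that way, the top-level invocation finds no pending return site and cannot exit normally:

```python
# Toy model of label-based subgraph calls (NOT the Ascend runtime).
# 'call' jumps into another graph and records the return site;
# 'ret' models the trailing LabelSwitch; 'halt' is a normal root exit.
def execute(graphs, root, max_steps=100):
    ret_sites = []            # pending return sites (cf. StackPush/StackPop)
    graph, pc = root, 0
    for _ in range(max_steps):
        op = graphs[graph][pc]
        if op[0] == "call":
            ret_sites.append((graph, pc + 1))
            graph, pc = op[1], 0
        elif op[0] == "ret":
            if not ret_sites:
                return "stuck"  # root ends with LabelSwitch but has no caller
            graph, pc = ret_sites.pop()
        elif op[0] == "halt":
            return "exited"
    return "looped"

# Root graph that is also a callee gets a trailing 'ret' appended,
# so the top-level run gets stuck instead of exiting.
recursive_root = {"root": [("call", "sub"), ("ret",)], "sub": [("ret",)]}
plain_root     = {"root": [("call", "sub"), ("halt",)], "sub": [("ret",)]}
print(execute(recursive_root, "root"), execute(plain_root, "root"))  # prints: stuck exited
```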
liangzelang added the comp/akg (deleted) label

Why add a comp/akg label?

liangzelang removed the comp/akg (deleted) label
liangzelang removed the v1.0.0 label

@anyrenwei Clicked it by accident.

@liangzelang Ok.. I was looking at akg-related issues and thought this case was related to akg as well.

liangzelang added the device/ascend (deleted) label
chenfei_mindspore changed the priority from Critical to Minor
liangzelang changed the planned due date from 2021-05-23 to 2021-06-10
liangzelang changed the planned start date from 2021-05-08 to 2021-05-31
liangzelang changed the planned due date from 2021-06-10 to 2021-06-18
liangzelang changed the planned start date from 2021-05-31 to 2021-06-11
liangzelang changed the planned due date from 2021-06-18 to 2021-12-31
liangzelang changed the planned start date from 2021-06-11 to 2021-06-21

```python
import mindspore.ops.composite as C
from mindspore import context
from mindspore import Tensor
import mindspore as ms
from mindspore.common.api import ms_function

context.set_context(mode=context.GRAPH_MODE, save_graphs=True, save_graphs_path='./tir')
grad_by_all = C.GradOperation(get_all=True)
ONE = Tensor(1, ms.int32)
ZERO = Tensor(0, ms.int32)

@ms_function
def f(x):
    def fibonacci(n):
        if n < 1:
            return 0
        elif n == 1:
            return 1
        else:
            return fibonacci(n - 1) + fibonacci(n - 2)
    # wrap the recursion so the root graph is not itself called recursively
    return fibonacci(x)


x = Tensor(5, ms.int32)
print(x)
y = f(x)
print(y)
```

Verified OK.

lanzhineng changed the task status from TODO to DONE
lanzhineng changed the assignee from liangzelang to lanzhineng
lanzhineng removed collaborator lanzhineng
lanzhineng added collaborator liangzelang
lanzhineng removed the device/ascend (deleted) label
