| name | about | labels |
| --- | --- | --- |
| Bug Report | Use this template for reporting a bug | kind/bug |
Hardware Environment (Ascend/GPU/CPU):

/device ascend
Software Environment:
- MindSpore version (source or binary):
- Python version (e.g., Python 3.7.5):
- OS platform and distribution (e.g., Linux Ubuntu 16.04):
- GCC/Compiler version (if compiled from source):
```python
import mindspore.ops.composite as C
from mindspore import context
from mindspore import Tensor
import mindspore as ms
from mindspore.common.api import ms_function

context.set_context(mode=context.GRAPH_MODE, save_graphs=True, save_graphs_path='./tir')
grad_by_all = C.GradOperation(get_all=True)
ONE = Tensor(1, ms.int32)
ZERO = Tensor(0, ms.int32)

@ms_function
def fibonacci(n):
    if n < 1:
        return ZERO
    elif n == 1:
        return ONE
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

x = Tensor(5, ms.int32)
print(x)
y = fibonacci(x)
print(y)
```
## Steps to reproduce the issue
1. Run `python fibonacci.py`.
2. Observe the runtime error:
   `[ERROR] GE(108109,python):2021-05-08-15:57:50.620.928 [mindspore/ccsrc/runtime/device/ascend/ge_runtime/runtime_model.cc:231] Run] Call rt api rtStreamSynchronize failed, ret: 7bc83`
## Describe the current behavior
## Describe the expected behavior
## Related log / screenshot
## Special notes for this issue
Hey lanzhineng, welcome to the MindSpore Community.
All of the projects in the MindSpore Community are maintained by @mindspore-ci-bot.
That means developers can comment below every pull request or issue to trigger bot commands.
Please follow the instructions at https://gitee.com/mindspore/community/blob/master/command.md for details.
Please add labels (comp or sig) so this issue gets a faster response. For example, if you found an issue in the data component, you can type "//comp/data" in a comment and the issue will be labeled "comp/data" and assigned to the responsible owner. See https://gitee.com/mindspore/community/blob/master/sigs/dx/docs/labels.md for more labels.
```python
import mindspore.ops.composite as C
from mindspore import context
from mindspore import Tensor
import mindspore as ms
from mindspore.common.api import ms_function

context.set_context(mode=context.GRAPH_MODE, save_graphs=True, save_graphs_path='./tir')
grad_by_all = C.GradOperation(get_all=True)
ONE = Tensor(1, ms.int32)
ZERO = Tensor(0, ms.int32)

@ms_function
def fibonacci(n):
    if n < 1:
        return ZERO
    elif n == 1:
        return ONE
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

x = Tensor(5, ms.int32)
print(x)
y = fibonacci(x)
print(y)
```
After changing the constants to Tensor, the frontend generates the graph correctly, but execution hangs at graph run time.
```
(gdb) bt
#0  0x0000ffffbf697c38 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
#1  0x0000ffffa3e74048 in __gthread_cond_wait (__mutex=, __cond=)
    at /home/isuru/miniforge3/conda-bld/ctng-compilers_1589429670044/work/.build/aarch64-conda_cos7-linux-gnu/build/build-cc-gcc-final/aarch64-conda_cos7-linux-gnu/libstdc++-v3/include/aarch64-conda_cos7-linux-gnu/bits/gthr-default.h:877
#2  std::condition_variable::wait (this=, __lock=...)
    at /home/isuru/miniforge3/conda-bld/ctng-compilers_1589429670044/work/.build/aarch64-conda_cos7-linux-gnu/src/gcc/libstdc++-v3/src/c++11/condition_variable.cc:53
#3  0x0000ffffaaaa8dd8 in std::condition_variable::wait<mindspore::tensor::WaitEvent::Wait() const::{lambda()#1}>(std::unique_lock<std::mutex>&, mindspore::tensor::WaitEvent::Wait() const::{lambda()#1}) (
    this=0xaaaaad5898f0, __lock=..., __p=...) at /usr/include/c++/7.3.0/condition_variable:99
#4  0x0000ffffaaaa5770 in mindspore::tensor::WaitEvent::Wait (this=0xaaaaad5898b0)
    at /ssd1/lzn/mindspore/mindspore/core/ir/tensor.h:92
#5  0x0000ffffaaaa5b88 in mindspore::tensor::Tensor::Wait (this=0xaaaaad5fc780)
    at /ssd1/lzn/mindspore/mindspore/core/ir/tensor.h:327
#6  0x0000ffffac308478 in mindspore::TensorToPyData (tensor=...)
    at /ssd1/lzn/mindspore/mindspore/ccsrc/utils/convert_utils_py.cc:46
#7  0x0000ffffac309924 in mindspore::ValuePtrToPyData (value=...)
    at /ssd1/lzn/mindspore/mindspore/ccsrc/utils/convert_utils_py.cc:122
#8  0x0000ffffac30b4bc in mindspore::BaseRefToPyData (value=...)
    at /ssd1/lzn/mindspore/mindspore/ccsrc/utils/convert_utils_py.cc:227
#9  0x0000ffffabd29e60 in mindspore::pipeline::ExecutorPy::Run (this=0xaaaaaaef13b0, args=...,
    phase=...) at /ssd1/lzn/mindspore/mindspore/ccsrc/pipeline/jit/pipeline.cc:934
```
```
subgraph @19_5_✗✗fibonacci.67(%para3_n) {
  %0([CNode]37) = Sub(%para3_n, Tensor(shape=[], dtype=Int32, value= 1)) primitive_attrs: {output_names: [output], input_names: [x, y]}
      : (<Tensor[Int32]x[const vector][]>, <Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file /ssd1/lzn/mindspore/mindspore/ops/composite/multitype_ops/sub_impl.py(50)/ return F.tensor_sub(x, y)/
  %1([CNode]11) = call @15_1_fibonacci.65(%0)
      : (<Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file fib.py(18)/ return fibonacci(n-1) + fibonacci(n-2)/
  %2([CNode]37) = Sub(%para3_n, Tensor(shape=[], dtype=Int32, value= 2)) primitive_attrs: {output_names: [output], input_names: [x, y]}
      : (<Tensor[Int32]x[const vector][]>, <Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file /ssd1/lzn/mindspore/mindspore/ops/composite/multitype_ops/sub_impl.py(50)/ return F.tensor_sub(x, y)/
  %3([CNode]8) = call @15_1_fibonacci.65(%2)
      : (<Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file fib.py(18)/ return fibonacci(n-1) + fibonacci(n-2)/
  %4([CNode]39) = Add(%1, %3) primitive_attrs: {output_names: [output], input_names: [x, y]}
      : (<Tensor[Int32]x[const vector][]>, <Tensor[Int32]x[const vector][]>) -> (<Tensor[Int32]x[const vector][]>)
      # In file /ssd1/lzn/mindspore/mindspore/ops/composite/multitype_ops/add_impl.py(129)/ return F.add(x, y)/
  Return(%4)
      : (<Tensor[Int32]x[const vector][]>)
      # In file fib.py(18)/ return fibonacci(n-1) + fibonacci(n-2)/
}
```
The graph in 11_validate_0073.ir is correct.
```python
import mindspore.ops.composite as C
from mindspore import context
from mindspore import Tensor
import mindspore as ms
from mindspore.common.api import ms_function

context.set_context(mode=context.GRAPH_MODE, save_graphs=True, save_graphs_path='./tir')
grad_by_all = C.GradOperation(get_all=True)
ONE = Tensor(1, ms.int32)
ZERO = Tensor(0, ms.int32)

@ms_function
def fibonacci(n):
    if n < 1:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

x = Tensor(5, ms.int32)
print(x)
y = fibonacci(x)
print(y)
```
```
(base) lzn@dggphispre18279:~/tests$ python fib.py
5
[WARNING] DEBUG(340,python):2021-05-08-14:44:47.702.218 [mindspore/ccsrc/debug/debugger/debugger.cc:80] Debugger] Not enabling debugger. Debugger does not support CPU.
[WARNING] CORE(340,python):2021-05-08-14:44:47.884.654 [mindspore/core/ir/anf_extends.cc:62] fullname_with_scope] Input 0 of cnode is not a value node, its type is CNode.
2
```
The result is wrong. The problem is that when the computed result is a scalar, it is not generalized correctly during specialization.
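For reference, the same recursion in plain Python (outside graph mode) gives 5 for input 5, so the graph-mode output of 2 above is clearly wrong. This is just a sanity-check sketch, not MindSpore code:

```python
# Plain-Python version of the recursion from the report, as a reference result.
def fibonacci(n):
    if n < 1:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(5))  # → 5
```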
```
subgraph @16_5_✗✗fibonacci.58() {
  Return(2)
      : ()
      # In file /home/lzn/tests/fib.py(18)/ return fibonacci(n-1) + fibonacci(n-2)/
}
```
Here it should return a call to the subgraph, not the constant.
Changed this way, it still does not run, so this is not closely related to scalar generalization.
Changed this way, it passes.
The error is reported from the backend. It looks like the backend may not support a scalar output from control flow; backend colleagues need to take a look together.
```
(ci3.7) [root@bms-aiserver-pod12-170-21 test]# python fib.py
5
[ERROR] GE(114356,python):2021-05-08-16:01:32.880.257 [mindspore/ccsrc/runtime/device/ascend/ge_runtime/runtime_model.cc:231] Run] Call rt api rtStreamSynchronize failed, ret: 7bc83
[ERROR] DEVICE(114356,python):2021-05-08-16:01:32.880.602 [mindspore/ccsrc/runtime/device/ascend/ascend_kernel_runtime.cc:623] DumpTaskExceptionInfo] Task fail infos task_id: 4, stream_id: 3, tid: 114468, device_id: 4, retcode: 507011
[ERROR] DEVICE(114356,python):2021-05-08-16:01:32.880.633 [mindspore/ccsrc/runtime/device/ascend/ascend_kernel_runtime.cc:632] DumpTaskExceptionInfo] Dump node (Default/StackPush-op41) task error input/output data to: ./task_error_dump/4 trace:
[ERROR] SESSION(114356,python):2021-05-08-16:01:32.894.691 [mindspore/ccsrc/backend/session/ascend_session.cc:1199] Execute] run task error!
```
The scalar problem will be fixed on the frontend. Please have the backend team fix the Tensor problem.
After enabling `export ASCEND_SLOG_PRINT_TO_STDOUT=1`, we found that the `Run task error` is caused by the `StackPush` operator failing.
Asking the AICPU operator colleague @yanzhenxiang2020 to help analyze the cause of the operator failure further.
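The environment variable mentioned above can be set before re-running the reproducer; a minimal sketch (the reproducer invocation is commented out because it needs an Ascend machine):

```shell
# Print Ascend device-side logs (slog) to stdout so operator errors are visible.
export ASCEND_SLOG_PRINT_TO_STDOUT=1
echo "ASCEND_SLOG_PRINT_TO_STDOUT=$ASCEND_SLOG_PRINT_TO_STDOUT"
# python fib.py   # re-run the reproducer on the Ascend machine
```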
Progress as of 05.11:
The `StackPush` error says the `StackPush` operator cannot find the stack with index 0. From the execution order we found this is because the `StackInit` and `StackDestroy` operators are in the wrong positions. Without `start_label_` and `end_goto_`, the order of the inserted `StackInit` and `StackDestroy` is not modified when the execution order is generated; but if the root graph has `start_label_` and `end_goto_`, the order of `StackInit` and related operators is adjusted when the execution order is generated. After making that fix, the error is as follows:
After analyzing the execution order and the backend graph construction, this test case turns out to be rather special: `LabelSwitch`
Why add a comp/akg label?
@anyrenwei Clicked it by accident.
@liangzelang OK.. I was looking at akg-related issues and thought this case was also related to akg.
```python
import mindspore.ops.composite as C
from mindspore import context
from mindspore import Tensor
import mindspore as ms
from mindspore.common.api import ms_function

context.set_context(mode=context.GRAPH_MODE, save_graphs=True, save_graphs_path='./tir')
grad_by_all = C.GradOperation(get_all=True)
ONE = Tensor(1, ms.int32)
ZERO = Tensor(0, ms.int32)

@ms_function
def f(x):
    def fibonacci(n):
        if n < 1:
            return 0
        elif n == 1:
            return 1
        else:
            return fibonacci(n - 1) + fibonacci(n - 2)
    return fibonacci(x)  # call the inner function (this line is missing from the pasted snippet)

x = Tensor(5, ms.int32)
print(x)
y = f(x)
print(y)
```
Verified OK.