[paddle.nn.functional.ctc_loss] FatalError: Segmentation fault is detected by the operating system.

avwztpqn  posted 4 months ago in Other

Describe the Bug

I ran into a crash when calling paddle.nn.functional.ctc_loss under Python 3.12. The following code terminates with a segmentation fault:

import paddle


def func():
    # log_probs with shape [time steps, batch size, classes (incl. blank)]
    tensor1 = paddle.rand([5, 3, 15], dtype='float64')
    # Note: every label below is negative, i.e. not a valid class index
    tensor2 = paddle.to_tensor(
        [[-879, -11, -714, -202],
         [-16, -312, -93, -494],
         [-919, -1281, -1495, -15]], dtype='int32')

    res = paddle.nn.functional.ctc_loss(
        log_probs=tensor1,
        labels=tensor2,
        input_lengths=paddle.to_tensor([5, 5, 5]),
        label_lengths=paddle.to_tensor([4, 2, 2]),
        blank=14,
        reduction="mean",
    )
    return res


result = func()
print(result)

The error message is as follows:

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::pybind::eager_api_warpctc(_object*, _object*, _object*)
1   warpctc_ad_func(paddle::Tensor const&, paddle::Tensor const&, paddle::optional<paddle::Tensor> const&, paddle::optional<paddle::Tensor> const&, int, bool)
2   paddle::experimental::warpctc_intermediate(paddle::Tensor const&, paddle::Tensor const&, paddle::optional<paddle::Tensor> const&, paddle::optional<paddle::Tensor> const&, int, bool)
3   void phi::WarpctcKernel<double, phi::CPUContext>(phi::CPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, paddle::optional<phi::DenseTensor> const&, paddle::optional<phi::DenseTensor> const&, int, bool, phi::DenseTensor*, phi::DenseTensor*)
4   phi::WarpCTCFunctor<phi::CPUContext, double>::operator()(phi::CPUContext const&, double const*, double*, int const*, int const*, int const*, unsigned long, unsigned long, unsigned long, double*)
5   compute_ctc_loss_double
6   CpuCTC<double>::cost_and_grad(double const*, double*, double*, int const*, int const*, int const*)
7   CpuCTC<double>::compute_betas_and_grad(double*, double const*, double, int, int, int, int const*, int const*, int const*, double*, double*, double*)

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1721874943 (unix time) try "date -d @1721874943" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0xfffffffe93671f30) received by PID 4006959 (TID 0x7f6a67274280) from PID 18446744071887593264 ***]

Segmentation fault (core dumped)

The code was tested with Paddle 3.0.0b1 (CPU build). Interestingly, the same code works when I switch to Python 3.10.14.
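The reproducer above passes negative values as labels, and the traceback ends inside the warp-ctc CPU routine, so one thing worth ruling out is that the inputs themselves are out of range. The following is a minimal sketch of a pre-flight check in plain Python. The function name check_ctc_inputs and the rules it enforces are assumptions about what CTC generally requires (labels and blank are valid class indices, label lengths fit within input lengths); it is not Paddle's own validation, and it does not explain why the behaviour differs between Python versions.

```python
def check_ctc_inputs(labels, input_lengths, label_lengths,
                     num_classes, blank, max_time):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper: the rules below are general CTC assumptions,
    not a copy of any check performed inside Paddle.
    """
    problems = []
    if not 0 <= blank < num_classes:
        problems.append(f"blank={blank} is outside [0, {num_classes})")
    for i, (row, t_len, l_len) in enumerate(
            zip(labels, input_lengths, label_lengths)):
        if not 0 < t_len <= max_time:
            problems.append(f"sample {i}: input length {t_len} is outside (0, {max_time}]")
        if not 0 <= l_len <= len(row):
            problems.append(f"sample {i}: label length {l_len} exceeds row size {len(row)}")
        if l_len > t_len:
            problems.append(f"sample {i}: label length {l_len} > input length {t_len}")
        # Only the first l_len entries of each row are used as labels.
        bad = [v for v in row[:max(l_len, 0)] if not 0 <= v < num_classes]
        if bad:
            problems.append(f"sample {i}: labels {bad} are outside [0, {num_classes})")
    return problems


# Values taken from the reproducer: log_probs has shape [5, 3, 15].
problems = check_ctc_inputs(
    labels=[[-879, -11, -714, -202],
            [-16, -312, -93, -494],
            [-919, -1281, -1495, -15]],
    input_lengths=[5, 5, 5],
    label_lengths=[4, 2, 2],
    num_classes=15,
    blank=14,
    max_time=5,
)
for p in problems:
    print(p)
```

For the values in this report the check flags the labels of all three samples. If the crash persists with labels inside the valid range, the input data can be excluded as the trigger.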

Additional Supplementary Information

No response

8iwquhpp  #1

https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/hardware_support/npu/install_cn.html
I followed the installation procedure above, and when I reached the step below I got the same error as the original poster.

PaddlePaddle basic health check

python -c "import paddle; paddle.utils.run_check()"

python==3.10.0

Error log:
paddle.utils.run_check()
Running verify PaddlePaddle program ...
I0829 16:44:22.169772 804539 program_interpreter.cc:243] New Executor is Running.

C++ Traceback (most recent call last):

0 paddle::framework::StandaloneExecutor::Run(std::vector<std::string, std::allocator<std::string > > const&, bool)
1 paddle::framework::InterpreterCore::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
2 paddle::framework::ProgramInterpreter::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
3 paddle::framework::ProgramInterpreter::Build(std::vector<std::string, std::allocator<std::string > > const&, std::vector<paddle::framework::OpFuncNode, std::allocator<paddle::framework::OpFuncNode> >*, bool)
4 paddle::framework::interpreter::BuildOpFuncList(phi::Place const&, paddle::framework::BlockDesc const&, std::set<std::string, std::less<std::string >, std::allocator<std::string > > const&, std::vector<paddle::framework::OpFuncNode, std::allocator<paddle::framework::OpFuncNode> >*, paddle::framework::VariableScope*, paddle::framework::interpreter::ExecutionConfig const&, std::vector<std::function<void (paddle::framework::OperatorBase*, paddle::framework::Scope*)>, std::allocator<std::function<void (paddle::framework::OperatorBase*, paddle::framework::Scope*)> > > const&, std::vector<std::function<void (paddle::framework::OperatorBase*, paddle::framework::Scope*)>, std::allocator<std::function<void (paddle::framework::OperatorBase*, paddle::framework::Scope*)> > > const&, bool, bool)
5 void custom_kernel::MatmulKernel<float, phi::CustomContext>(phi::CustomContext const&, phi::DenseTensor const&, phi::DenseTensor const&, bool, bool, phi::DenseTensor*)
6 aclnnMatmul
7 InitL2Phase2Context(char const*, aclOpExecutor*)
8 GetOpExecCacheFromExecutor(aclOpExecutor*)

Error Message Summary:

FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: *** Aborted at 1724921062 (unix time) try "date -d @1724921062" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0xc46bb) received by PID 804539 (TID 0xffff433e59c0) from PID 804539 ***]

Segmentation fault (core dumped)
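Because a segmentation fault kills the interpreter, it can help to run the health check in a child process so that the calling script survives and can report which signal ended it. The sketch below does this with the standard library only and passes -X faulthandler to the child. The helper names run_isolated and describe_exit are hypothetical, and whether a Python-level traceback actually appears is an assumption that depends on how Paddle's own signal handling interacts with faulthandler.

```python
import signal
import subprocess
import sys


def run_isolated(code, timeout=600):
    """Run `code` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX a negative returncode means
    the child was terminated by the signal with that number.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def describe_exit(returncode):
    """Turn a subprocess return code into a short description."""
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # The same command as the health check above, run in a child process.
    rc, err = run_isolated("import paddle; paddle.utils.run_check()")
    print(describe_exit(rc))
    print(err[-2000:])  # tail of stderr, where the crash report ends up
```

The same wrapper can be pointed at the ctc_loss reproducer to compare the outcome under different Python versions without losing the parent process.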
