Paddle Tensor holds the wrong type, it holds float, but desires to be int

xoshrz7s  posted on 2022-10-20 in Other
  • Version and environment information:

PaddlePaddle 1.7.2.post97, compiled with
with_avx: ON
with_gpu: ON
with_mkl: ON
with_mkldnn: ON
with_python: ON

Core MKL-DNN inference code:

import os

from paddle.fluid.core import AnalysisConfig, create_paddle_predictor


def load_inference_model(args):
    use_gpu = False
    # Set up the AnalysisConfig from the exported "model" and "params" files
    config = AnalysisConfig(os.path.join(args.weights, "model"), os.path.join(args.weights, "params"))
    if use_gpu:
        print("use gpu infer")
        config.enable_use_gpu(memory_pool_init_size_mb=3000)
    else:
        print("use cpu infer")
        config.disable_gpu()
        thread_num = 5
        config.set_cpu_math_library_num_threads(thread_num)
        config.enable_mkldnn()
    # Create the PaddlePredictor
    predictor = create_paddle_predictor(config)
    return predictor
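  • Reproduction: a cascadeFrcnn model trained with Paddle 1.7.1 (PaddleDetection, release/0.2) fails when MKL-DNN CPU inference is enabled in the official Paddle 1.7.1 image, while yolov3 and other models run inference normally.

After switching to the Paddle 1.7.2 image, cascadeFrcnn runs normally with MKL-DNN CPU inference, but all yolov3-family models fail with the following error: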

use cpu infer
I0428 08:20:08.298830  1195 analysis_predictor.cc:84] Profiler is deactivated, and no profiling report will be generated.
I0428 08:20:08.321368  1195 analysis_predictor.cc:833] MODEL VERSION: 1.7.1
I0428 08:20:08.321425  1195 analysis_predictor.cc:835] PREDICTOR VERSION: 1.7.2
I0428 08:20:08.321964  1195 analysis_predictor.cc:423] MKLDNN is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [mkldnn_placement_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [attention_lstm_fuse_pass]
--- Running IR pass [seqconv_eltadd_relu_fuse_pass]
--- Running IR pass [seqpool_cvm_concat_fuse_pass]
--- Running IR pass [fc_lstm_fuse_pass]
--- Running IR pass [mul_lstm_fuse_pass]
--- Running IR pass [fc_gru_fuse_pass]
--- Running IR pass [mul_gru_fuse_pass]
--- Running IR pass [seq_concat_fc_fuse_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [repeated_fc_relu_fuse_pass]
--- Running IR pass [squared_mat_sub_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0428 08:20:08.924206  1195 graph_pattern_detector.cc:101] ---  detected 72 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [conv_transpose_bn_fuse_pass]
--- Running IR pass [conv_transpose_eltwiseadd_bn_fuse_pass]
--- Running IR pass [is_test_pass]
--- Running IR pass [runtime_context_cache_pass]
--- Running IR pass [depthwise_conv_mkldnn_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
--- Running IR pass [conv_transpose_bn_fuse_pass]
--- Running IR pass [conv_transpose_eltwiseadd_bn_fuse_pass]
--- Running IR pass [conv_bias_mkldnn_fuse_pass]
I0428 08:20:09.013048  1195 graph_pattern_detector.cc:101] ---  detected 6 subgraphs
--- Running IR pass [conv_transpose_bias_mkldnn_fuse_pass]
--- Running IR pass [conv3d_bias_mkldnn_fuse_pass]
--- Running IR pass [conv_elementwise_add_mkldnn_fuse_pass]
I0428 08:20:09.071956  1195 graph_pattern_detector.cc:101] ---  detected 4 subgraphs
I0428 08:20:09.088235  1195 graph_pattern_detector.cc:101] ---  detected 3 subgraphs
I0428 08:20:09.106846  1195 graph_pattern_detector.cc:101] ---  detected 15 subgraphs
I0428 08:20:09.114917  1195 conv_elementwise_add_mkldnn_fuse_pass.cc:335] Fused graph 13
--- Running IR pass [conv_concat_relu_mkldnn_fuse_pass]
--- Running IR pass [conv_relu_mkldnn_fuse_pass]
I0428 08:20:09.140857  1195 graph_pattern_detector.cc:101] ---  detected 45 subgraphs
--- Running IR pass [conv_leaky_relu_mkldnn_fuse_pass]
I0428 08:20:09.156468  1195 graph_pattern_detector.cc:101] ---  detected 20 subgraphs
--- Running IR pass [conv_relu6_mkldnn_fuse_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [ir_graph_to_program_pass]
T0428 08:20:09.209525  1195 analysis_predictor.cc:462] ======= optimize end =======
W0428 08:20:09.372357  1195 naive_executor.cc:45] The NaiveExecutor can not work properly if the cmake flag ON_INFER is not set.
W0428 08:20:09.372431  1195 naive_executor.cc:47] Unlike the training phase, all the scopes and variables will be reused to save the allocation overhead.
W0428 08:20:09.372442  1195 naive_executor.cc:50] Please re-compile the inference library by setting the cmake flag ON_INFER=ON if you are running Paddle Inference
Traceback (most recent call last):
  File "preprocess/infer_demo_mkldnn.py", line 162, in <module>
    infer(args)
  File "preprocess/infer_demo_mkldnn.py", line 117, in infer
    outputs = predictor.run(data_list)
paddle.fluid.core_avx.EnforceNotMet:

--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0   std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1   paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2   int const* paddle::framework::Tensor::data<int>() const
3   paddle::operators::YoloBoxKernel<float>::Compute(paddle::framework::ExecutionContext const&) const
4   std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CPUPlace, false, 0ul, paddle::operators::YoloBoxKernel<float>, paddle::operators::YoloBoxKernel<double> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
5   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
6   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
7   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
8   paddle::framework::NaiveExecutor::Run()
9   paddle::AnalysisPredictor::Run(std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> > const&, std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> >*, int)

------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
  File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2525, in append_op
    attrs=kwargs.get("attrs", None))
  File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
    return self.main_program.current_block().append_op(*args,**kwargs)
  File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/layers/detection.py", line 1100, in yolo_box
    attrs=attrs)
  File "/home/lj/DLcode/PaddleDetection_v2/ppdet/modeling/anchor_heads/yolo_head.py", line 318, in get_prediction
    name=self.prefix_name + "yolo_box" + str(i))
  File "/home/lj/DLcode/PaddleDetection_v2/ppdet/modeling/architectures/yolov3.py", line 91, in build
    return self.yolo_head.get_prediction(body_feats, im_size)
  File "/home/lj/DLcode/PaddleDetection_v2/ppdet/modeling/architectures/yolov3.py", line 162, in test
    return self.build(feed_vars, mode='test')
  File "tools/export_model.py", line 191, in main
    test_fetches = model.test(feed_vars)
  File "tools/export_model.py", line 210, in <module>
    main()

----------------------
Error Message Summary:
----------------------
InvalidArgumentError: Tensor holds the wrong type, it holds float, but desires to be int.
  [Hint: Expected valid == true, but received valid:0 != true:1.] at (/paddle/paddle/fluid/framework/tensor_impl.h:33)
  [operator < yolo_box > error]

oo7oh9g9  1#

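  1. My first suspicion is that an input to the yolo_box OP has the wrong data type.
    See "ZeroCopyRun() crashes" #24146
  2. Please provide a standalone test script that is decoupled from your service, so the problem is easier to locate.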

bweufnob  2#

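It was indeed the input data type that was wrong; after correcting it, inference runs normally. For yolov3, the input "image" is float32 and "image_shape" should be int32. Thank you very much!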
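
For anyone hitting the same error, here is a minimal sketch of the fix described above: cast each input to the dtype the exported model expects before handing it to the predictor. Only the two dtypes (float32 for "image", int32 for "image_shape") come from this thread. The helper names, the batch shape and the 608x608 size are assumptions made for illustration, and the sketch uses NumPy only, so it does not show the Paddle calls themselves.

```python
import numpy as np

# Expected dtypes, as reported in the answer above.
EXPECTED_DTYPES = {"image": np.float32, "image_shape": np.int32}


def prepare_yolov3_inputs(image, image_shape):
    """Cast both inputs to the dtypes the exported model expects (hypothetical helper)."""
    return {
        "image": np.ascontiguousarray(image, dtype=np.float32),
        "image_shape": np.ascontiguousarray(image_shape, dtype=np.int32),
    }


def check_dtypes(inputs):
    """Fail early in Python instead of inside the yolo_box kernel (hypothetical helper)."""
    for name, expected in EXPECTED_DTYPES.items():
        actual = inputs[name].dtype
        if actual != expected:
            raise TypeError(
                f"{name}: expected {np.dtype(expected).name}, got {actual.name}"
            )


# Illustrative batch: one 3x608x608 image plus its (height, width).
raw_image = np.random.rand(1, 3, 608, 608)   # NumPy defaults to float64
raw_shape = np.array([[608.0, 608.0]])       # float here, which is the kind of mismatch reported above
inputs = prepare_yolov3_inputs(raw_image, raw_shape)
check_dtypes(inputs)
# The arrays in `inputs` would then be wrapped as input tensors and passed to predictor.run().
```

Running check_dtypes on the raw arrays instead of the prepared ones raises a TypeError that names the offending input, which is easier to act on than the C++ enforce failure in the log.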
