Paddle OCR End-to-end Inference

bprjcwpo  posted on 2021-11-29 in Java
  • Title: OCR End-to-end model core-dumps during inference
  • Version and environment info:

   1) PaddlePaddle version: 1.2.0.post85
   2) CPU: if inference runs on the CPU, provide the CPU model and which math library (MKL/OpenBLAS/MKLDNN, etc.) is used
   3) System environment: CentOS 6.3, Python 2.7.14
  • Inference info:

   1) Python inference

  • Reproduction info:

   The inference code is essentially the same as attention_infer in https://github.com/PaddlePaddle/models/blob/develop/fluid/PaddleCV/ocr_recognition/attention_model.py
   Modified parts:

    init_ids = fluid.layers.fill_constant_batch_size_like(
        input=init_state, shape=[-1, 1], value=0, dtype='int64')
    init_ids = fluid.layers.lod_reset(x=init_ids, y=images)
    init_scores = fluid.layers.fill_constant_batch_size_like(
        input=init_state, shape=[-1, 1], value=1, dtype='float32')
    init_scores = fluid.layers.lod_reset(x=init_scores, y=images)
  • Problem description:

   At the second time step, the shape and LoD of pre_ids and pre_score are inconsistent. This makes the shape of decoder_state_proj wrong, and the program finally core-dumps at decoder_state_expand = fluid.layers.sequence_expand.
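For intuition, sequence_expand roughly repeats row i of its first input as many times as sequence i is long in the reference input's LoD. A plain-Python sketch (an illustration only, not Paddle's actual kernel) shows how a decoder state that has shrunk to a single row, combined with a reference LoD that still describes several sequences, reads past the end of the data — which in the C++ kernel becomes an illegal memory access:

```python
def sequence_expand(x, ref_lod):
    # Repeat row i of x by the length of sequence i in the reference
    # LoD (offsets ref_lod[i] .. ref_lod[i+1]).
    out = []
    for i in range(len(ref_lod) - 1):
        out.extend([x[i]] * (ref_lod[i + 1] - ref_lod[i]))
    return out

# Consistent case: 3 decoder rows, 3 encoder sequences of length 2 each.
print(sequence_expand([[1], [2], [3]], [0, 2, 4, 6]))
# [[1], [1], [2], [2], [3], [3]]

# Inconsistent case (as in the log): the encoder-side LoD still describes
# 3 sequences, but the decoder state has only 1 row, so x[1] is an
# out-of-range read -- an IndexError here, a SIGSEGV in the C++ kernel.
try:
    sequence_expand([[1]], [0, 2, 4, 6])
except IndexError:
    print('out-of-range read')
```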
   Relevant log:

1547634835      pre_ids Tensor[array_read_0.tmp_0]
        shape: [12,1,]
        dtype: l
        LoD: [[ 0,12, ]]
        data: 0,0,0,0,0,0,0,0,0,0,
1547634835      pre_state       Tensor[array_read_1.tmp_0]
        shape: [12,128,]
        dtype: f
        data: 0.00607281,0.0318905,0.0380818,0,0.0172632,0,0,0.0151736,0,0.0335344,
1547634835      pre_score       Tensor[array_read_2.tmp_0]
        shape: [12,1,]
        dtype: f
        LoD: [[ 0,12, ]]
        data: 1,1,1,1,1,1,1,1,1,1,
1547634835      decoder_state_proj      Tensor[fc_36.tmp_0]
        shape: [12,128,]
        dtype: f
        data: -0.000996387,-0.0245469,0.0171594,0.00709899,0.00126243,-0.0303947,0.00325922,0.00518241,0.0150218,0.0053458,
1547634835      encoder_proj    Tensor[fc_34.tmp_0]
        shape: [768,128,]
        dtype: f
        LoD: [[ 0,64,128,192,256,320,384,448,512,576,640,704,768, ]]
        data: 0.00266408,-0.048337,-0.0366806,0.00265633,0.0432429,0.0254773,-0.00408232,-0.0223291,-0.0024811,-0.0433478,
1547634835      context Tensor[sequence_pool_1.tmp_0]
        shape: [12,256,]
        dtype: f
        data: 0.00655919,0.000162716,0,0.00365045,0.025687,0,0.0030535,0.0152447,0,0.00135731,
1547634835      pre_ids Tensor[array_read_0.tmp_0]
        shape: [1,1,]
        dtype: l
        LoD: [[ 0,12, ][ 0,0,0,0,0,0,0,0,0,0,1,1,1, ]]
        data: 92,
1547634835      pre_state       Tensor[array_read_1.tmp_0]
        shape: [12,128,]
        dtype: f
        data: 0.00450501,-0.0319699,-0.045378,0.00643824,-0.0349801,-0.0209818,-0.0276175,0.0185534,0.00094414,-0.0382715,
1547634835      pre_score       Tensor[array_read_2.tmp_0]
        shape: [1,1,]
        dtype: f
        LoD: [[ 0,12, ][ 0,0,0,0,0,0,0,0,0,0,1,1,1, ]]
        data: -3.49071,
1547634835      decoder_state_proj      Tensor[fc_36.tmp_0]
        shape: [12,128,]
        dtype: f
        data: -0.00517603,0.000552245,0.0362263,-0.049663,0.0338876,-0.0053585,0.0601997,-0.0220834,2.80392e-05,0.0360245,
1547634835      encoder_proj    Tensor[fc_34.tmp_0]
        shape: [768,128,]
        dtype: f
        LoD: [[ 0,64,128,192,256,320,384,448,512,576,640,704,768, ]]
        data: 0.00266408,-0.048337,-0.0366806,0.00265633,0.0432429,0.0254773,-0.00408232,-0.0223291,-0.0024811,-0.0433478,
1547634835      context Tensor[sequence_pool_1.tmp_0]
        shape: [12,256,]
        dtype: f
        data: 0.00655934,0.000162699,0,0.00364994,0.0256868,0,0.00305304,0.0152452,0,0.00135741,
1547634835      pre_ids Tensor[array_read_0.tmp_0]
        shape: [1,1,]
        dtype: l
        LoD: [[ 0,1, ][ 0,1, ]]
        data: 87,
1547634835      pre_state       Tensor[array_read_1.tmp_0]
        shape: [1,128,]
        dtype: f
        data: -0.0141358,0.015022,-0.0510009,0.0237546,-0.0339731,0.00935627,-0.0871612,0.0309292,0.0546338,-0.0213654,
1547634835      pre_score       Tensor[array_read_2.tmp_0]
        shape: [1,1,]
        dtype: f
        LoD: [[ 0,1, ][ 0,1, ]]
        data: -7.96483,
1547634835      decoder_state_proj      Tensor[fc_36.tmp_0]
        shape: [1,128,]
        dtype: f
        data: -0.0427243,0.0284236,0.00746911,-0.0142829,0.0307404,-0.0298453,0.0334686,-0.0290551,0.0832574,0.031448,
1547634835      encoder_proj    Tensor[fc_34.tmp_0]
        shape: [768,128,]
        dtype: f
        LoD: [[ 0,64,128,192,256,320,384,448,512,576,640,704,768, ]]
        data: 0.00266408,-0.048337,-0.0366806,0.00265633,0.0432429,0.0254773,-0.00408232,-0.0223291,-0.0024811,-0.0433478,

***Aborted at 1547634845 (unix time) try "date -d @1547634845" if you are using GNU date***

PC: @                0x0 (unknown)

***SIGSEGV (@0x7f164fd73000) received by PID 20438 (TID 0x7f16a4964700) from PID 1339502592; stack trace:***

    @     0x7f16a411b160 (unknown)
    @     0x7f165b4b1637 paddle::operators::SequenceExpandFunctor<>::operator()()
    @     0x7f165b4b6068 paddle::operators::SequenceExpandKernel<>::Compute()
    @     0x7f165b4b6433 _ZNSt17_Function_handlerIFvRKN6paddle9framework16ExecutionContextEEZNKS1_24OpKernelRegistrarFunctorINS0_8platform8CPUPlaceELb0ELm0EINS0_9operators20SequenceExpandKernelINS7_16CPUDeviceContextEfEENSA_ISB_dEENSA_ISB_iEENSA_ISB_lEEEEclEPKcSI_EUlS4_E_E9_M_invokeERKSt9_Any_dataS4_
    @     0x7f165c1a7c8c paddle::framework::OperatorWithKernel::RunImpl()
    @     0x7f165c1a3dcf paddle::framework::OperatorBase::Run()
    @     0x7f165a9924f3 paddle::framework::Executor::RunPreparedContext()
    @     0x7f165be54b93 paddle::operators::WhileOp::RunImpl()
    @     0x7f165c1a3dcf paddle::framework::OperatorBase::Run()
    @     0x7f165a9924f3 paddle::framework::Executor::RunPreparedContext()
    @     0x7f165a992f20 paddle::framework::Executor::Run()
    @     0x7f165a8a76db _ZZN8pybind1112cpp_function10initializeIZN6paddle6pybindL18pybind11_init_coreERNS_6moduleEEUlRNS2_9framework8ExecutorERKNS6_11ProgramDescEPNS6_5ScopeEibbE64_vIS8_SB_SD_ibbEINS_4nameENS_9is_methodENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESV_
    @     0x7f165a8e378e pybind11::cpp_function::dispatcher()
    @     0x7f16a443655f PyEval_EvalFrameEx
    @     0x7f16a443886d PyEval_EvalCodeEx
    @     0x7f16a44359fc PyEval_EvalFrameEx
    @     0x7f16a443886d PyEval_EvalCodeEx
    @     0x7f16a44359fc PyEval_EvalFrameEx
    @     0x7f16a443886d PyEval_EvalCodeEx
    @     0x7f16a44389a2 PyEval_EvalCode
    @     0x7f16a4461782 PyRun_FileExFlags
    @     0x7f16a4462af9 PyRun_SimpleFileExFlags
    @     0x7f16a447882d Py_Main
    @     0x7f16a3675bd5 __libc_start_main
    @           0x4007a1 (unknown)
    @                0x0 (unknown)

dgsult0t1#

According to the beam search documentation, the LoD level of pre_ids at the first time step (i.e. init_ids) should be 2.
What is the right way to set this up?


8dtrkrch2#

Because beam search is used during decoding, init_ids needs a 2-level LoD set according to the dynamic batch size. If the batch size is k, the LoD must be set to [[i for i in range(k+1)], [i for i in range(k+1)]]. There are two problems here:

  1. Paddle lacks a range op
  2. The lod_reset op only supports setting a 1-level LoD: code
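In other words, at both LoD levels every input in the batch is its own length-1 sequence, so the offsets are simply 0..k. A plain-Python sketch of the target LoD (init_lod is an illustrative helper, not a Paddle API):

```python
def init_lod(k):
    # 2-level LoD for beam-search init_ids with batch size k:
    # at both levels, each of the k inputs is its own length-1 sequence.
    offsets = [i for i in range(k + 1)]
    return [offsets, list(offsets)]

print(init_lod(3))   # [[0, 1, 2, 3], [0, 1, 2, 3]]
```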

Fixes:

  1. Add a range op: PR
  2. Enhance the lod_reset op: PR
  3. Use the shape and slice ops to get batch_size = images.shape[0], then use the range and lod_reset ops to set the correct LoD for init_ids.

z4iuyo4d3#

[Latest progress]
Experiments show that setting the LoD of init_ids and init_scores as follows lets the inference pipeline run to completion:

    # ids, scores as memory
    ids_array = fluid.layers.create_array('int64')
    scores_array = fluid.layers.create_array('float32')

    init_ids = fluid.layers.fill_constant_batch_size_like(
        input=init_state, shape=[-1, 1], value=0, dtype='int64')
    init_scores = fluid.layers.fill_constant_batch_size_like(
        input=init_state, shape=[-1, 1], value=1, dtype='float32')

    rois_shape = fluid.layers.shape(init_state)
    # end=batch_size+1 so that range yields the offsets 0 .. batch_size
    batch_size = fluid.layers.slice(
        rois_shape, axes=[0], starts=[0], ends=[1]) + 1
    lod_level = fluid.layers.range(
        start=0, end=batch_size, step=1, dtype=batch_size.dtype)
    init_ids = fluid.layers.lod_reset(init_ids, lod_level)
    init_ids = fluid.layers.lod_reset(init_ids, lod_level, append=True)
    init_scores = fluid.layers.lod_reset(init_scores, init_ids)

    fluid.layers.Print(init_scores, message='init_scores', summarize=10)
    fluid.layers.Print(init_ids, message='init_ids', summarize=10)

    fluid.layers.array_write(init_ids, array=ids_array, i=counter)
    fluid.layers.array_write(init_scores, array=scores_array, i=counter)

In addition, the following two lines have to be commented out, because the Print op does not support the TensorArray type:

    # fluid.layers.Print(ids_array, message='ids_array')
    # fluid.layers.Print(scores_array, message='scores_array')

z3yyvxxp4#

When rois has shape [0,] and LoD [[ 0,0, ]] (i.e. the ROI prediction is empty), roi_perspective_transform reports an error:

ROIs should be a 2-D LoDTensor of shape (num_rois, 8) given as [[x0, y0, x1, y1, x2, y2, x3, y3], ...]
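The op's requirement can be stated as a simple shape check (a plain-numpy illustration; validate_rois is a hypothetical helper, not part of Paddle):

```python
import numpy as np

def validate_rois(rois):
    # roi_perspective_transform expects a 2-D LoDTensor of shape
    # (num_rois, 8): [[x0, y0, x1, y1, x2, y2, x3, y3], ...]
    rois = np.asarray(rois)
    return rois.ndim == 2 and rois.shape[1] == 8

print(validate_rois(np.zeros((5, 8))))  # True
print(validate_rois(np.zeros((0,))))    # False: empty prediction, shape [0,]
```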

soat7uwm5#

Bug when the ROI prediction is empty: handle it with an IF-ELSE branch for now; debugging to follow.


368yc8dk6#

Fix for the inference error when the predicted ROIs are empty:

  1. Add a conditional branch in the graph-construction code
    rois_shape = fluid.layers.shape(roi_boxes)
    shape = fluid.layers.reduce_prod(rois_shape)
    ones = fluid.layers.fill_constant([1], value=1, dtype='int32')
    cond = fluid.layers.equal(x=shape, y=ones)
    out = fluid.layers.create_global_var(
        shape=[1], value=0.0, dtype='float32', persistable=True)
    with fluid.layers.control_flow.Switch() as switch:
        with switch.case(cond):
            fluid.layers.assign(input=roi_boxes, output=out)
        with switch.default():
            # f_common, roi_boxes (RoI proposals) => f_regions
            roi_dim = roi_boxes.shape
            roi_proposals = fluid.layers.slice(
                roi_boxes, axes=[1], starts=[2], ends=[roi_dim[1]])
            roi_proposals = fluid.layers.lod_reset(
                x=roi_proposals, y=roi_boxes)
            f_regions = fluid.layers.roi_perspective_transform(
                f_common,
                roi_proposals,
                transformed_height=8,
                transformed_width=64,
                spatial_scale=0.25)
            transcripts = seq2seq_recog_head_predict(args, f_regions)
            fluid.layers.assign(input=transcripts, output=out)
    return roi_boxes, out
  2. Add a shape check in the inference code
boxes, scripts = exe.run(
    test_program,
    feed={'image': img_data},
    fetch_list=[roi_boxes, transcripts],
    return_numpy=False)
boxes_batch = np.array(boxes)
if boxes_batch.shape[-1] == 1:
    continue
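The skip condition relies on real boxes having more than one value in the last dimension, while the empty-ROI case comes back as a width-1 placeholder. A plain-numpy sketch of the guard (has_rois is a hypothetical helper summarizing the check above):

```python
import numpy as np

def has_rois(boxes):
    # Real detections come back as (num_rois, 8); an empty prediction
    # comes back as a placeholder whose last dimension is 1.
    return np.array(boxes).shape[-1] != 1

print(has_rois(np.zeros((4, 8))))  # True: real boxes, run recognition
print(has_rois(np.zeros((1,))))    # False: no ROIs, skip this image
```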

Remaining issues:

  1. Issue 1: the crash depends on test order

     Testing an image with no ROIs first and then an image with ROIs raises:
Invoke operator beam_search error.
Python Callstacks: 
  File "/root/env/ocr_end/local/lib/python2.7/site-packages/paddle/fluid/framework.py", line 1300, in append_op
    attrs=kwargs.get("attrs", None))
  File "/root/env/ocr_end/local/lib/python2.7/site-packages/paddle/fluid/layer_helper.py", line 56, in append_op
    return self.main_program.current_block().append_op(*args,**kwargs)
  File "/root/env/ocr_end/local/lib/python2.7/site-packages/paddle/fluid/layers/nn.py", line 3977, in beam_search
    'end_id': end_id,
  File "/root/workspace/ocr-end2end/paddle-fluid-end2end/paddle_textnet/model/seq2seq_attention.py", line 363, in seq2seq_recog_head_predict
    1,  # end_id
  File "/root/workspace/ocr-end2end/paddle-fluid-end2end/paddle_textnet/model/network.py", line 264, in se_resnext_end2end_predict
    transcripts = seq2seq_recog_head_predict(args, f_regions)
  File "test_end2end.py", line 154, in main
    test_args, input_image)
  File "test_end2end.py", line 250, in <module>
    main()
C++ Callstacks: 
an illegal memory access was encountered at [/root/workspace/autoML/slim_branch/Paddle/paddle/fluid/platform/device_context.cc:327]
     Testing an image with ROIs first and then an image with no ROIs does not raise.
  2. Issue 2: the saved model does not contain the ops inside the switch

Opening the saved __model__ file shows that it does not contain the range op.


xlpyo6sf7#

Root cause of issue 1:
Using fluid.layers.beam_search triggers the error; removing it makes the error disappear.
Removed for now; still to be debugged.

Root cause of issue 2:
A Paddle version mismatch.
