- Title: Paddle-TRT inference fails to load a Faster R-CNN model
- Version and environment information:
1) PaddlePaddle version: v1.8.5
2) GPU: Tesla P4, CUDA 10, cuDNN 7.6, TensorRT 7.0.0.11
3) OS: Ubuntu 16.04
- Inference setup
1) C++ inference: self-compiled Paddle-TRT inference library, built with:
cmake .. \
-DWITH_MKL=OFF \
-DWITH_MKLDNN=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DWITH_PYTHON=OFF \
-DWITH_XBYAK=ON \
-DTENSORRT_ROOT=/opt/TensorRT-7.0.0.11 \
-DON_INFER=ON \
-DFLUID_INFERENCE_INSTALL_DIR=/ljay/workspace/proj/ljay-cuda10/Paddle/install
- Problem description: Using the compiled inference library with AnalysisConfig, loading the Faster R-CNN model fails with the following dims error:
I1009 12:20:44.080718 3812 graph_pattern_detector.cc:101] --- detected 55 subgraphs
--- Running IR pass [conv_eltwiseadd_affine_channel_fuse_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
--- Running IR pass [fc_fuse_pass]
I1009 12:20:44.197888 3812 graph_pattern_detector.cc:101] --- detected 2 subgraphs
I1009 12:20:44.199591 3812 graph_pattern_detector.cc:101] --- detected 2 subgraphs
--- Running IR pass [tensorrt_subgraph_pass]
I1009 12:20:44.252492 3812 tensorrt_subgraph_pass.cc:115] --- detect a sub-graph with 13 nodes
W1009 12:20:44.253955 3812 tensorrt_subgraph_pass.cc:285] The Paddle lib links the 7011 version TensorRT, make sure the runtime TensorRT you are using is no less than this version, otherwise, there might be Segfault!
I1009 12:20:44.254004 3812 tensorrt_subgraph_pass.cc:321] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what():
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<std::string >(std::string&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::inference::tensorrt::OpConverter::ConvertBlockToTRTEngine(paddle::framework::BlockDesc*, paddle::framework::Scope const&, std::vector<std::string, std::allocator<std::string > > const&, std::unordered_set<std::string, std::hash<std::string >, std::equal_to<std::string >, std::allocator<std::string > > const&, std::vector<std::string, std::allocator<std::string > > const&, paddle::inference::tensorrt::TensorRTEngine*)
3 paddle::inference::analysis::TensorRtSubgraphPass::CreateTensorRTOp(paddle::framework::ir::Node*, paddle::framework::ir::Graph*, std::vector<std::string, std::allocator<std::string > > const&, std::vector<std::string, std::allocator<std::string > >*) const
4 paddle::inference::analysis::TensorRtSubgraphPass::ApplyImpl(paddle::framework::ir::Graph*) const
5 paddle::framework::ir::Pass::Apply(paddle::framework::ir::Graph*) const
6 paddle::inference::analysis::IRPassManager::Apply(std::unique_ptr<paddle::framework::ir::Graph, std::default_delete<paddle::framework::ir::Graph> >)
7 paddle::inference::analysis::IrAnalysisPass::RunImpl(paddle::inference::analysis::Argument*)
8 paddle::inference::analysis::Analyzer::RunAnalysis(paddle::inference::analysis::Argument*)
9 paddle::AnalysisPredictor::OptimizeInferenceProgram()
10 paddle::AnalysisPredictor::PrepareProgram(std::shared_ptr<paddle::framework::ProgramDesc> const&)
11 paddle::AnalysisPredictor::Init(std::shared_ptr<paddle::framework::Scope> const&, std::shared_ptr<paddle::framework::ProgramDesc> const&)
12 std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2>(paddle::AnalysisConfig const&)
13 std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig>(paddle::AnalysisConfig const&)
----------------------
Error Message Summary:
----------------------
InvalidArgumentError: TensorRT's tensor input requires at least 2 dimensions, but input fpn_topdown_res4_sum.tmp_0 has 1 dims.
[Hint: Expected shape.size() > 1UL, but received shape.size():1 <= 1UL:1.] at (/ljay/workspace/proj/ljay-cuda10-paddletrt/Paddle/paddle/fluid/inference/tensorrt/engine.h:67)
Aborted (core dumped)
The config is set up as follows:
paddle::AnalysisConfig config;
std::map<std::string, std::vector<int>> min_input_shape = {{"image", {1, 3, 1, 1}}};
std::map<std::string, std::vector<int>> max_input_shape = {{"image", {1, 3, 1312, 1312}}};
std::map<std::string, std::vector<int>> opt_input_shape = {{"image", {1, 3, 960, 960}}};
config.SetModelBuffer(prog_file_cont.data(), prog_file_cont.length(),
                      params_file_cont.data(), params_file_cont.length());
config.EnableUseGpu(1000, _config.device);
config.EnableTensorRtEngine(20 << 20,  // workspace_size
                            1,         // max_batch_size
                            3,         // min_subgraph_size
                            paddle::AnalysisConfig::Precision::kFloat32,
                            false,     // use_static
                            false);    // use_calib_mode
config.SetTRTDynamicShapeInfo(min_input_shape, max_input_shape, opt_input_shape);
Questions:
1. Is Faster R-CNN already supported by Paddle-TRT?
2. What causes the error above, and how can it be fixed?
Thanks~
4 answers
ajsxfq5m1#
Full log:
[Sorry, the log I pasted earlier was for RetinaNet. After changing the minimum subgraph node count to 32 as suggested, that model ran successfully.]
However, after changing the minimum subgraph node count to 32 for Faster R-CNN, a different error occurred; the log is below:
c3frrgcw2#
Has this been solved? How did you implement "change the minimum subgraph node count to 32"?
w9apscun3#
void EnableTensorRtEngine(int workspace_size = 1 << 20,
                          int max_batch_size = 1,
                          int min_subgraph_size = 3,
                          Precision precision = Precision::kFloat32,
                          bool use_static = false,
                          bool use_calib_mode = true);
Set min_subgraph_size = 32 (the third argument).
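For reference, a minimal sketch of the call with the larger subgraph threshold, based on the signature above and the FP32 / static-batch settings from the original question (the device id, GPU memory budget, and workspace size here are illustrative placeholders, not values from the post):

```cpp
#include "paddle_inference_api.h"  // Paddle 1.8 C++ inference header

// Sketch: raise min_subgraph_size so that small sub-graphs (such as the
// 13-node one in the log) stay on the native Paddle executor instead of
// being handed to TensorRT, which rejects the 1-D tensor input.
paddle::AnalysisConfig config;
config.EnableUseGpu(1000 /* initial GPU memory (MB), placeholder */,
                    0 /* device id, placeholder */);
config.EnableTensorRtEngine(1 << 30,   // workspace_size
                            1,         // max_batch_size
                            32,        // min_subgraph_size: raised from 3
                            paddle::AnalysisConfig::Precision::kFloat32,
                            false,     // use_static
                            false);    // use_calib_mode
```

With a threshold of 32, only sub-graphs of at least 32 nodes are converted to TensorRT engines; everything smaller falls back to the regular Paddle operators.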
o7jaxewo4#
> Has this been solved? How did you implement "change the minimum subgraph node count to 32"?

@shikeno I am using Paddle 1.8.5; reportedly you need 2.0-beta to fix the
Instruction: CHECK_BROADCAST 52 51
problem. I have not verified that yet.