Paddle reports a dimension mismatch error when using IfElse control flow

u4dcyp6a · posted 2021-11-30 in Java

Version and environment info:
1) PaddlePaddle version: 1.5
2) CPU:
3) GPU: None
4) System environment: CentOS 6.3, Python 2.7.13

Training info:
1) Single machine, 3 CPUs

Reproduction info:
When stepping through the program, I found that this step fails: optimizer.minimize(train_loss).

Problem description:
I want to branch on the value of the input (cnt_in): when it is < 0, multiply every component of the embedding vector by -1.
The model can be built and compiled, but once training starts it fails with an element-wise multiplication dimension mismatch.
(If the multiplier fct is defined directly with fill_constant instead of being produced by IfElse, the error does not occur.)

Model network:
self.lr_dat["usr_in"] = fluid.layers.data(name='usr_in', shape=[1], dtype='int64')
self.lr_dat["cnt_in"] = fluid.layers.data(name='cnt_in', shape=[1], dtype='int64')
self.lr_dat["label_in"] = fluid.layers.data(name='label_in', shape=[1], dtype='float32')

self.lr["usr_emb"] = fluid.layers.embedding(input=self.lr_dat["usr_in"], size=[7, 32], is_sparse=False)
self.lr["cnt_emb"] = fluid.layers.embedding(input=self.lr_dat["cnt_in"], size=[14, 32], is_sparse=False)

# cnt_fct = fluid.layers.fill_constant(shape=[32], value=-1.0, dtype='float32')

cond = self.lr_dat["cnt_in"] > fluid.layers.zeros_like(self.lr_dat["cnt_in"])
ifel = fluid.layers.IfElse(cond)
with ifel.true_block():
    fct = fluid.layers.fill_constant(shape=[32], value=1.0, dtype='float32')
    ifel.output(fct)
with ifel.false_block():
    fct = fluid.layers.fill_constant(shape=[32], value=-1.0, dtype='float32')
    ifel.output(fct)

cnt_fct = ifel()
self.lr["cnt_emb"] = fluid.layers.elementwise_mul(self.lr["cnt_emb"], cnt_fct)

self.lr["inner_prod"] = fluid.layers.cos_sim(self.lr["usr_emb"], self.lr["cnt_emb"])
self.lr["pred"] = fluid.layers.sigmoid(self.lr["inner_prod"])
self.lr["raw_loss"] = fluid.layers.square_error_cost(input=self.lr["pred"], label=self.lr_dat["label_in"])
self.lr["train_loss"] = fluid.layers.mean(self.lr["raw_loss"])

Error message:
WARNING:root:2019-09-11 11:04:40.428257: start training
I0911 11:04:40.473984 65127 parallel_executor.cc:329] The number of CPUPlace, which is used in ParallelExecutor, is 3. And the Program will be copied 3 copies
I0911 11:04:40.478063 65127 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
Traceback (most recent call last):
File "test_ifelse.py", line 115, in <module>
train_resu = exe.run(train_prog_comp, feed=data, fetch_list=[network.lr["train_loss"]])
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/executor.py", line 666, in run
return_numpy=return_numpy)
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/executor.py", line 528, in _run_parallel
exe.run(fetch_var_names, fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet: Invoke operator elementwise_mul error.
Python Callstacks:
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/framework.py", line 1771, in append_op
attrs=kwargs.get("attrs", None))
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args,**kwargs)
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/layers/nn.py", line 9904, in _elementwise_op
'use_mkldnn': use_mkldnn})
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/layers/nn.py", line 9965, in elementwise_mul
return _elementwise_op(LayerHelper('elementwise_mul',**locals()))
File "test_ifelse.py", line 64, in construct_network
self.lr["cnt_emb"] = fluid.layers.elementwise_mul(self.lr["cnt_emb"], cnt_fct)
File "test_ifelse.py", line 80, in <module>
network.construct_network()
C++ Callstacks:
Enforce failed. Expected x_dims[i + axis] == y_dims[i], but received x_dims[i + axis]:32 != y_dims[i]:64.
Broadcast dimension mismatch. at [/paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:63]
PaddlePaddle Call Stacks:
0 0x7f0786f9f808p void paddle::platform::EnforceNotMet::Init<std::string>(std::string, char const*, int) + 360
1 0x7f0786f9fb57p paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int) + 87
2 0x7f07876b6b5cp paddle::operators::get_mid_dims(paddle::framework::DDim const&, paddle::framework::DDim const&, int, int*, int*, int*) + 364
3 0x7f07876baba0p void paddle::operators::ElementwiseComputeEx<paddle::operators::MulFunctor<float>, paddle::platform::CPUDeviceContext, float, float>(paddle::framework::ExecutionContext const&, paddle::framework::Tensor const*, paddle::framework::Tensor const*, int, paddle::operators::MulFunctor<float>, paddle::framework::Tensor*) + 816
4 0x7f0787c26aa3p void paddle::operators::default_elementwise_mul<paddle::platform::CPUDeviceContext, float>(paddle::framework::ExecutionContext const&, paddle::framework::Tensor const*, paddle::framework::Tensor const*, paddle::framework::Tensor*) + 115


ktca8awb 1#

Is the purpose of your code to produce 1 where the condition is true and -1 where it is false?


hsvhsicv 2#

No. The goal is to multiply the embeddings of different types of idx by different weights.
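For that stated goal, a possible alternative is to look the per-idx weight up from a second embedding table and multiply it in, which avoids IfElse altogether. This is only a minimal sketch against the Paddle 1.5 fluid API; the weight table cnt_w is a hypothetical addition, not something from the original model:

import paddle.fluid as fluid

cnt_in = fluid.layers.data(name='cnt_in', shape=[1], dtype='int64')
cnt_emb = fluid.layers.embedding(input=cnt_in, size=[14, 32], is_sparse=False)

# Hypothetical per-idx weight table: one learnable scalar weight per idx value.
cnt_w = fluid.layers.embedding(input=cnt_in, size=[14, 1], is_sparse=False)

# Tile the [batch_size, 1] weight across the 32 embedding dimensions and multiply.
cnt_w = fluid.layers.expand(cnt_w, expand_times=[1, 32])
cnt_emb = fluid.layers.elementwise_mul(cnt_emb, cnt_w)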


bf1o4zei 3#

I have basically confirmed the cause of the problem above: when fct is created separately inside the two branches of Paddle's IfElse module, the branch outputs are merged together afterwards, so the two newly created 32-dimensional vectors end up as a single 64-dimensional one.

The code has now been modified as follows:

self.lr_dat["cnt_in"] = fluid.layers.data(name='cnt_in', shape=[1], dtype='int64')
self.lr["cnt_emb"] = fluid.layers.embedding(input=self.lr_dat["cnt_in"], size=[14, 32], is_sparse=False)
raw_fct = fluid.layers.fill_constant(shape=[4, 32], value=1.0, dtype='float32')
# raw_fct = fluid.layers.fill_constant_batch_size_like(input=self.lr["cnt_emb"]
#                                                      , dtype='float32', shape=[32], value=1.0)
limit = fluid.layers.fill_constant_batch_size_like(input=self.lr_dat["cnt_in"],
                                                   dtype='int64', shape=[4, 1], value=0)
# limit = fluid.layers.fill_constant(shape=[1], value=0, dtype='int64')
cond = fluid.layers.greater_than(x=self.lr_dat["cnt_in"], y=limit)
ifel = fluid.layers.IfElse(cond)
with ifel.true_block():
    fct = ifel.input(raw_fct)
    # fct_true = fluid.layers.fill_constant(shape=[32], value=1.0, dtype='float32')
    ifel.output(fct)
with ifel.false_block():
    fct = ifel.input(raw_fct)
    tmp = fluid.layers.fill_constant_batch_size_like(input=raw_fct, dtype="int64", shape=[-1, 32], value=-1)
    fct = fluid.layers.elementwise_mul(fct, tmp)
    # fct_false = fluid.layers.fill_constant(shape=[32], value=-1.0, dtype='float32')
    ifel.output(fct)

It still reports an error:

WARNING:root:2019-09-12 14:45:48.854809: start
Traceback (most recent call last):
File "test_ifelse.py", line 93, in <module>
optimizer.minimize(network.lr["train_loss"])
File "</home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/decorator.pyc:decorator-gen-20>", line 2, in minimize
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
return wrapped_func(*args, **kwargs)
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/dygraph/base.py", line 87, in __impl__
return func(*args, **kwargs)
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/optimizer.py", line 594, in minimize
no_grad_set=no_grad_set)
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/optimizer.py", line 493, in backward
no_grad_set, callbacks)
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/backward.py", line 578, in append_backward
_append_backward_vars_(root_block, fwd_op_num, grad_to_var, grad_info_map)
File "/home/paddle/tools/paddle_release_home/python/lib/python2.7/site-packages/paddle/fluid/backward.py", line 392, in _append_backward_vars_
op_desc.infer_shape(block.desc)
paddle.fluid.core_avx.EnforceNotMet: at [/paddle/paddle/fluid/operators/controlflow/conditional_block_op.cc:147]
PaddlePaddle Call Stacks:
0 0x7f9e358802e0p void paddle::platform::EnforceNotMet::Init<char const*>(char const*, char const*, int) + 352
1 0x7f9e35880659p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 137
2 0x7f9e3666a441p paddle::operators::ConditionalBlockGradInferShape::operator()(paddle::framework::InferShapeContext*) const + 1089

(Here the raw_fct that is passed in is given shape [4, 32], where 4 corresponds to the batch_size.)
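Given the diagnosis above (IfElse merges the outputs of its two branches), one way to sidestep IfElse entirely is to build the ±1 multiplier arithmetically from the condition. The following is only a rough sketch against the Paddle 1.5 fluid layers API, not tested against this model:

import paddle.fluid as fluid

cnt_in = fluid.layers.data(name='cnt_in', shape=[1], dtype='int64')
cnt_emb = fluid.layers.embedding(input=cnt_in, size=[14, 32], is_sparse=False)

zero = fluid.layers.zeros_like(cnt_in)
# 1.0 where cnt_in > 0, 0.0 elsewhere; shape [batch_size, 1]
pos = fluid.layers.cast(fluid.layers.greater_than(cnt_in, zero), 'float32')
# map {0, 1} to {-1, +1}
fct = fluid.layers.scale(pos, scale=2.0, bias=-1.0)
# tile to [batch_size, 32] and flip the sign of each embedding row as needed
fct = fluid.layers.expand(fct, expand_times=[1, 32])
cnt_emb = fluid.layers.elementwise_mul(cnt_emb, fct)

Because the multiplier here has the same [batch_size, 32] shape as cnt_emb, the element-wise multiply no longer depends on how IfElse merges its branch outputs.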
