Paddle: Porting models from other frameworks to PaddlePaddle

dw1jzc5e · posted 2021-11-30
Follow (0) | Answers (9) | Views (334)

I am looking to port an object detection model trained in Keras to PaddlePaddle.
What I have ->

  1. Knowledge of the exact architecture of the model. I can implement it in both Keras and PaddlePaddle
  2. A trained model in the Keras framework

What I want ->
The same trained model in the PaddlePaddle framework

Naive approach ->
Store the weights and biases from the Keras model as NumPy arrays, then define the same model in the PaddlePaddle framework and load these weights into it.
Issue ->
I am familiar with Keras, so I can easily extract the weights from the model and save them in NumPy format. But I am not familiar with the PaddlePaddle framework, so I don't know how to load weights into the model, one layer at a time, as NumPy matrices.

P.S.: If there is any other/better approach to doing what I am trying to do, let me know.

Thank You.
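The first half of the naive approach (dump the weights as NumPy, ship them to the Paddle side) can be sketched with plain NumPy. The layer names below are made up for illustration; in practice the dict keys must end up matching Paddle's parameter names (`param.name`), not the Keras layer names:

```python
import io
import numpy as np

# Hypothetical weights extracted from the Keras model; in Keras,
# layer.get_weights() already returns numpy arrays. The names here
# are illustrative only.
keras_weights = {
    "conv1_weights": np.random.rand(3, 3, 3, 16).astype("float32"),
    "conv1_bias": np.zeros(16, dtype="float32"),
}

# Serialize everything into one .npz archive (an in-memory buffer here;
# a filename works the same way)...
buf = io.BytesIO()
np.savez_compressed(buf, **keras_weights)
buf.seek(0)

# ...and reload on the Paddle side as a plain name -> array dict.
npz = np.load(buf)
pd_weights = {name: npz[name] for name in npz.files}

assert set(pd_weights) == set(keras_weights)
```

This `pd_weights` dict is exactly the shape of input the loading loop in the first answer below expects.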


8nuwlpux1#


# pd_weights is a dict mapping each parameter's name to a numpy array.
# place is the fluid.CPUPlace() or fluid.CUDAPlace(...) the program runs on.
for block in fluid.default_main_program().blocks:
    for param in block.all_parameters():
        pd_var = fluid.global_scope().find_var(param.name)
        pd_param = pd_var.get_tensor()
        print("load: {}".format(param.name))
        pd_param.set(pd_weights[param.name], place)
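One thing worth double-checking before calling `pd_param.set` on convolution layers: Keras and Paddle lay out conv kernels differently, so the arrays usually need a transpose first. A NumPy-only sketch (the shapes are illustrative; verify against your own model):

```python
import numpy as np

# Keras stores Conv2D kernels as (kh, kw, in_channels, out_channels),
# while Paddle's NCHW conv weights are (out_channels, in_channels, kh, kw).
keras_kernel = np.random.rand(3, 3, 16, 32).astype("float32")

# Reorder the axes before putting the array into pd_weights.
paddle_kernel = np.transpose(keras_kernel, (3, 2, 0, 1))
assert paddle_kernel.shape == (32, 16, 3, 3)
```

Fully-connected weights are typically `(in_features, out_features)` in both frameworks and need no transpose, but it is safer to check shapes layer by layer while loading.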

fcipmucu2#

I tried this

print("Parameters")
for block in fluid.default_main_program().blocks:
    print("Blocks", block)
    for param in block.all_parameters():
        print("Params", param)
        pd_var = fluid.global_scope().find_var(param.name)
        pd_param = pd_var.get_tensor()
        print("load: {}".format(param.name))
        # pd_param.set(pd_weights[param.name], place)

Output ->

Parameters
Blocks idx: 0
parent_idx: -1

I inserted the code snippet above at line 208 in the following file ->
https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/ssd/train.py

What went wrong? I expected the output to list the names of all the parameters in my model.


mpbci0fu3#

https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/ssd/train.py
The network is defined in train_prog, not in fluid.default_main_program(). So you should write:

for block in train_prog.blocks:
...
...

jljoyd4f4#

@wanghaoshuang Seems to work!!

Is there a pd_param.get() too? Because I want to see whether the weights were set properly or not

Thank You


2izufjch5#

You can read a parameter back from the scope as a numpy array:

pd_param = numpy.array(fluid.global_scope().find_var(param.name).get_tensor())
print(pd_param)
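Building on that, a small helper can confirm that every parameter round-tripped correctly. This is a plain-NumPy sketch; `read_back` is assumed to be a name -> array dict built with the `numpy.array(... .get_tensor())` call above:

```python
import numpy as np

def weights_match(pd_weights, read_back, atol=1e-6):
    """Check that every source array matches the array read back from the scope."""
    for name, src in pd_weights.items():
        if name not in read_back:
            return False
        if not np.allclose(src, read_back[name], atol=atol):
            return False
    return True

# Illustrative check with made-up arrays:
src = {"w": np.ones((2, 2), dtype="float32")}
assert weights_match(src, {"w": np.ones((2, 2), dtype="float32")})
assert not weights_match(src, {"w": np.zeros((2, 2), dtype="float32")})
```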

c2e8gylq6#

Ohh great!! Will try that.

Btw, one final thing: the weights that I have ported are still trainable, right? Loading them from numpy doesn't set them to non-trainable by default or something, right?

Thank You


wa7juj8i7#

Yes. They are still trainable.


tv6aics18#

@wanghaoshuang what if I want them to not be trainable? How can I control that?


ycggw6v29#

I tried the following ->

for block in train_prog.blocks:
    for param in block.all_parameters():
        pd_var = fluid.global_scope().find_var(param.name)
        pd_param = pd_var.get_tensor()
        print("load: {}".format(param.name))
        pd_param.set(pd_weights[param.name], place)
        param.stop_gradient = True

But this didn't work and the weights were still being updated, probably because the optimizer is defined and optimizer.minimize is called before this point. So I thought I needed to call optimizer.minimize later.

I tried


# In function build_program

...
                    loss = fluid.layers.reduce_sum(loss)
                    for block in main_prog.blocks:
                        for param in block.all_parameters():
                            param.stop_gradient = True
                    optimizer = optimizer_setting(train_params)
                    optimizer.minimize(loss)
...

But this gave me a segmentation fault. Worse, after running this code my GPU memory stayed occupied even though no python process was running, so I had to reboot the server to get the memory back.

Then I thought maybe I should isolate the optimizer.minimize part, so I did this ->

train_py_reader, loss, optimizer = build_program(
    main_prog=train_prog,
    startup_prog=startup_prog,
    train_params=train_params,
    is_train=True,
    name_dict=pd_weights)

# Right after build_program runs, I set the weights here
.....

# Now I try to run the optimizer again. I also tried commenting out the
# optimizer inside build_program. Both ways gave the same error.
with fluid.program_guard(train_prog, startup_prog):
    with fluid.unique_name.guard():
        with fluid.unique_name.guard("train"):
            optimizer.minimize(loss)

I got the error

ValueError: Variable train_generated_var_0 has been created before. The previous type is VarType.STEP_SCOPES; the new type is VarType.LOD_TENSOR. They are not matched

I hope this helps... Any idea how I can freeze the weights that I have just loaded into the model? @wanghaoshuang
