How do I split a Keras model with a non-sequential architecture (such as ResNet) into sub-models?

Asked by s71maibg on 2023-02-08 · 3 answers

My model is a ResNet-152 and I want to split it into two sub-models. The problem is the second sub-model: I don't know how to build a model that runs from an intermediate layer to the output.
I tried the code from this answer, but it does not work for me. Here is my code:

from keras.models import Model
from keras.layers import Input

def getLayerIndexByName(model, layername):
    for idx, layer in enumerate(model.layers):
        if layer.name == layername:
            return idx

idx = getLayerIndexByName(resnet, 'res3a_branch2a')

input_shape = resnet.layers[idx].get_input_shape_at(0) # in my case this is (None, 55, 55, 256)

layer_input = Input(shape=input_shape[1:]) # as keras will add the batch shape

# create the new nodes for each layer in the path
x = layer_input
for layer in resnet.layers[idx:]:
    x = layer(x)

# create the model
new_model = Model(layer_input, x)

I get this error:

ValueError: Input 0 is incompatible with layer res3a_branch1: expected axis -1 of input shape to have value 256 but got shape (None, 28, 28, 512).

I also tried this function:

import random
from keras.models import Model

def split(model, start, end):
    confs = model.get_config()
    kept_layers = set()
    for i, l in enumerate(confs['layers']):
        if i == 0:
            confs['layers'][0]['config']['batch_input_shape'] = model.layers[start].input_shape
            if i != start:
                confs['layers'][0]['name'] += str(random.randint(0, 100000000)) # rename the input layer to avoid conflicts on merge
                confs['layers'][0]['config']['name'] = confs['layers'][0]['name']
        elif i < start or i > end:
            continue
        kept_layers.add(l['name'])
    # filter layers
    layers = [l for l in confs['layers'] if l['name'] in kept_layers]
    layers[1]['inbound_nodes'][0][0][0] = layers[0]['name']
    # set conf
    confs['layers'] = layers
    confs['input_layers'][0][0] = layers[0]['name']
    confs['output_layers'][0][0] = layers[-1]['name']
    # create new model
    submodel = Model.from_config(confs)
    for l in submodel.layers:
        orig_l = model.get_layer(l.name)
        if orig_l is not None:
            l.set_weights(orig_l.get_weights())
    return submodel

and I get this error:

ValueError: Unknown layer: Scale

because my ResNet-152 contains a Scale layer.
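The Unknown layer error itself can usually be worked around by registering the custom layer class when rebuilding from the config. A minimal sketch, assuming the Scale class can be imported from wherever the ResNet-152 implementation defines it (the exact import path below is a guess):

from resnet import Scale  # hypothetical import path; adjust to where Scale is defined

# inside split(), pass the custom layer class so from_config can deserialize it
submodel = Model.from_config(confs, custom_objects={'Scale': Scale})

This would not fix the splitting logic itself, though; the shape-mismatch problem from the first attempt would remain.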
In any case, here is a complete, runnable version of my attempt:

import resnet   # pip install resnet
from keras.models import Model
from keras.layers import Input

def getLayerIndexByName(model, layername):
    for idx, layer in enumerate(model.layers):
        if layer.name == layername:
            return idx

resnet = resnet.ResNet152(weights='imagenet')

idx = getLayerIndexByName(resnet, 'res3a_branch2a')

model1 = Model(inputs=resnet.input, outputs=resnet.get_layer('res3a_branch2a').output)

input_shape = resnet.layers[idx].get_input_shape_at(0) # get the input shape of desired layer
print(input_shape[1:])
layer_input = Input(shape=input_shape[1:]) # a new input tensor to be able to feed the desired layer

# create the new nodes for each layer in the path
x = layer_input
for layer in resnet.layers[idx:]:
    x = layer(x)

# create the model
model2 = Model(layer_input, x)

model2.summary()

And here is the error:

ValueError: Input 0 is incompatible with layer res3a_branch1: expected axis -1 of input shape to have value 256 but got shape (None, 28, 28, 512)

snvhrwxg1#

As I mentioned in the comments section, since the ResNet model does not have a linear architecture (i.e. it has skip connections, and a layer may be connected to multiple other layers), you cannot simply walk through the model's layers one after another in a loop and apply each layer to the output of the previous one (unlike models with a linear architecture, for which this method works).
So you need to find the connectivity of the layers and traverse that connectivity graph to be able to construct a sub-model of the original model. Currently, the following solution comes to my mind:

1. Specify the last layer of your sub-model.
2. Starting from that layer, find all the layers connected to it.
3. Get the outputs of those connected layers.
4. Apply the last layer to the collected outputs.

Obviously, step 3 implies a recursion: to get the outputs of the connected layers (i.e. X), we first need to find their connected layers (i.e. Y), get their outputs, and then apply X to the outputs of Y. Further, to find the connected layers you need to know a bit about the internals of Keras, which has already been covered in this answer. So we come up with the following solution:

from keras.applications.resnet50 import ResNet50
from keras import models
from keras import layers

resnet = ResNet50()

# this is the split point, i.e. the starting layer in our sub-model
starting_layer_name = 'activation_46'

# create a new input layer for our sub-model we want to construct
new_input = layers.Input(batch_shape=resnet.get_layer(starting_layer_name).get_input_shape_at(0))

layer_outputs = {}
def get_output_of_layer(layer):
    # if we have already applied this layer on its input(s) tensors,
    # just return its already computed output
    if layer.name in layer_outputs:
        return layer_outputs[layer.name]

    # if this is the starting layer, then apply it on the input tensor
    if layer.name == starting_layer_name:
        out = layer(new_input)
        layer_outputs[layer.name] = out
        return out

    # find all the layers whose outputs this layer consumes
    prev_layers = []
    for node in layer._inbound_nodes:
        prev_layers.extend(node.inbound_layers)

    # get the outputs of the connected layers (recursing as needed)
    pl_outs = []
    for pl in prev_layers:
        pl_outs.append(get_output_of_layer(pl))

    # apply this layer on the collected outputs
    out = layer(pl_outs[0] if len(pl_outs) == 1 else pl_outs)
    layer_outputs[layer.name] = out
    return out

# note that we start from the last layer of our desired sub-model.
# this layer could be any layer of the original model as long as it is
# reachable from the starting layer
new_output = get_output_of_layer(resnet.layers[-1])

# create the sub-model
model = models.Model(new_input, new_output)

Important notes:

1. This solution assumes that each layer in the original model is used only once; i.e. it does not work for Siamese networks, where a layer may be shared and therefore applied more than once to different input tensors.
2. If you want to split a model properly into multiple sub-models, it only makes sense to use as split points (e.g. the one indicated by starting_layer_name in the code above) layers that are NOT inside a branch (e.g. in ResNet, the activation layer after a merge layer is a good option, but the res3a_branch2a you selected is not, because it lies inside a branch). To get a better view of the model's original architecture, you can plot its diagram with the plot_model() utility function:

from keras.applications.resnet50 import ResNet50
from keras.utils import plot_model

resnet = ResNet50()
plot_model(resnet, to_file='resnet_model.png')

3. Since new nodes are created whenever a layer is applied to a new input tensor, do not try to construct another sub-model that overlaps with the previous one in the same run of the code above (it is fine if they do not overlap!); otherwise, you may run into errors.
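For completeness, here is a short sketch (not part of the original answer) that builds the complementary first half and verifies that chaining the two halves reproduces the full model; it assumes the resnet, starting_layer_name, and model objects defined above:

import numpy as np

# first half: from the original input up to the tensor feeding the split layer;
# get_input_at(0) picks the original inbound node, since applying the layer to
# new_input above gave it a second one
model1 = models.Model(inputs=resnet.input,
                      outputs=resnet.get_layer(starting_layer_name).get_input_at(0))

# chaining the two halves should match the full model
x = np.random.rand(1, 224, 224, 3).astype('float32')
np.testing.assert_allclose(resnet.predict(x), model.predict(model1.predict(x)), atol=1e-4)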


enyaitl32#

This is for the case where there is a layer with index middle that is connected only to the previous layer (middle - 1), and no layer after it connects directly to any layer before it. We can then exploit the fact that every model stores its layers as a list, and build the two partial models as follows:

from tensorflow import keras

model1 = keras.models.Model(inputs=model.input, outputs=model.layers[middle - 1].output)

input = keras.Input(shape=model.layers[middle - 1].output_shape[1:])
# layers is a dict in the form {name : output}
layers = {}
layers[model.layers[middle-1].name] = input
for layer in model.layers[middle:]:
    if type(layer.input) == list:
        x = []
        for layer_input in layer.input:
            x.append(layers[layer_input.name.split('/')[0]])
    else:
        x = layers[layer.input.name.split('/')[0]]
    y = layer(x)
    layers[layer.name] = y
model2 = keras.Model(inputs = [input], outputs = [y])

It is then easy to check that model2.predict(model1.predict(x)) gives the same result as model.predict(x).
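A minimal sketch of that check, assuming a model with 224×224 RGB input (adjust the shape to your model):

import numpy as np

x = np.random.rand(1, 224, 224, 3).astype('float32')
np.testing.assert_allclose(model.predict(x), model2.predict(model1.predict(x)), atol=1e-4)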


ecfsfe2w3#

I ran into a similar problem when splitting an Inception CNN for transfer learning, where I only wanted to make the layers after a certain point trainable.

def get_layers_above(cutoff_layer, model):

  def get_next_level(layer, model):
    def wrap_list(val):
      if type(val) is list:
        return val
      return [val]
    # collect every layer that consumes any of this layer's output tensors
    r = []
    for output_t in wrap_list(layer.output):
      r += [x for x in model.layers if output_t.name in [y.name for y in wrap_list(x.input)]]
    return r

  # breadth-first traversal of the connectivity graph, starting at the cutoff layer
  visited = set()
  to_visit = set([cutoff_layer])

  while to_visit:
    layer = to_visit.pop()
    to_visit.update(get_next_level(layer, model))
    visited.add(layer)
  return list(visited)

I went with an iterative rather than recursive solution, because for networks with many converging branches a breadth-first traversal backed by a set seemed the safer option.
It should be used like this (e.g. with InceptionV3):

model = tf.keras.applications.InceptionV3(include_top=False,weights='imagenet',input_shape=(299,299,3))
layers=get_layers_above(model.get_layer('mixed9'),model)
print([l.name for l in layers])

Output:

['batch_normalization_89',
 'conv2d_93',
 'activation_86',
 'activation_91',
 'mixed10',
 'activation_88',
 'batch_normalization_85',
 'activation_93',
 'batch_normalization_90',
 'conv2d_87',
 'conv2d_86',
 'batch_normalization_86',
 'activation_85',
 'conv2d_91',
 'batch_normalization_91',
 'batch_normalization_87',
 'activation_90',
 'mixed9',
 'batch_normalization_92',
 'batch_normalization_88',
 'activation_87',
 'concatenate_1',
 'activation_89',
 'conv2d_88',
 'conv2d_92',
 'average_pooling2d_8',
 'activation_92',
 'mixed9_1',
 'conv2d_89',
 'conv2d_85',
 'conv2d_90',
 'batch_normalization_93']
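To close the loop on the transfer-learning use case, applying the returned list could then look like this (my sketch, not part of the original answer):

# freeze the whole model, then unfreeze only the layers above the cutoff
for l in model.layers:
  l.trainable = False
for l in layers:
  l.trainable = True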
