I am implementing a "perceptual loss" function, but PyTorch and TensorFlow give different results even though I feed them the same images. Could someone tell me why?
tensorflow
import tensorflow as tf

class FeatureExtractor(tf.keras.Model):
    def __init__(self, n_layers):
        super(FeatureExtractor, self).__init__()
        extractor = tf.keras.applications.VGG16(weights="imagenet",
                                                include_top=False,
                                                input_shape=(256, 256, 3))
        extractor.trainable = True
        # features = [extractor.layers[i].output for i in n_layers]
        features = [extractor.get_layer(i).output for i in n_layers]
        self.extractor = tf.keras.models.Model(extractor.inputs, features)

    def call(self, x):
        return self.extractor(x)
def loss_function(generated_image, target_image, feature_extractor):
    MSE = tf.keras.losses.MeanSquaredError()
    mse_loss = MSE(generated_image, target_image)
    real_features = feature_extractor(target_image)
    generated_features = feature_extractor(generated_image)
    perceptual_loss = 0
    for i in range(len(real_features)):
        loss = MSE(real_features[i], generated_features[i])
        print(loss)
        perceptual_loss += loss
    return mse_loss, perceptual_loss
Run:
feature_extractor = FeatureExtractor(n_layers=["block1_conv1","block1_conv2",
"block3_conv2","block4_conv2"])
mse_loss, perceptual_loss = loss_function(image1, image2,
feature_extractor)
print(f"{mse_loss} {perceptual_loss} {mse_loss+perceptual_loss}")
It gives:
output:
tf.Tensor(0.0014001362, shape=(), dtype=float32)
tf.Tensor(0.030578917, shape=(), dtype=float32)
tf.Tensor(2.6163354, shape=(), dtype=float32)
tf.Tensor(0.842701, shape=(), dtype=float32)
0.002584027126431465 3.4910154342651367 3.4935994148254395
pytorch
import torch
import torch.nn as nn
from torchvision import models

class FeatureExtractor(torch.nn.Module):
    def __init__(self, n_layers):
        super(FeatureExtractor, self).__init__()
        extractor = models.vgg16(pretrained=True).features
        index = 0
        self.layers = nn.ModuleList([])
        for i in range(len(n_layers)):
            self.layers.append(torch.nn.Sequential())
            for j in range(index, n_layers[i] + 1):
                self.layers[i].add_module(str(j), extractor[j])
            index = n_layers[i] + 1
        for param in self.parameters():
            param.requires_grad = False

    def forward(self, x):
        result = []
        for i in range(len(self.layers)):
            x = self.layers[i](x)
            result.append(x)
        return result
def loss_function(generated_image, target_image, feature_extractor):
    MSE = nn.MSELoss(reduction='mean')
    mse_loss = MSE(generated_image, target_image)
    real_features = feature_extractor(target_image)
    generated_features = feature_extractor(generated_image)
    perceptual_loss = 0
    for i in range(len(real_features)):
        loss = MSE(real_features[i], generated_features[i])
        perceptual_loss += loss
        print(loss)
    return mse_loss, perceptual_loss
Run:
feature_extractor = FeatureExtractor(n_layers=[1, 3, 13, 20]).to(device)
mse_loss, perceptual_loss = loss_function(image1, image2, feature_extractor)
print(f"{mse_loss} {perceptual_loss} {mse_loss+perceptual_loss}")
It gives:
output:
tensor(0.0003)
tensor(0.0029)
tensor(0.2467)
tensor(0.2311)
0.002584027359262109 0.4810013473033905 0.483585387468338
1 Answer
Although the architecture is the same, the pretrained parameters can differ between frameworks, because each ships its own independently trained (or converted) "imagenet" checkpoint. On top of that, frameworks such as Keras and PyTorch expect different input preprocessing, so even the identical image turns into different tensor values once preprocessed. The code below is an example that may help illustrate this.
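Here is a minimal sketch of the two frameworks' documented default VGG16 preprocessing (the file name image.jpg is only a placeholder for your input image):

import numpy as np
import tensorflow as tf
import torchvision.transforms as T
from PIL import Image

img = Image.open("image.jpg").convert("RGB").resize((256, 256))  # placeholder file
arr = np.asarray(img).astype("float32")  # HWC, RGB, values in [0, 255]

# Keras VGG16 ("caffe" mode): converts RGB to BGR and subtracts the ImageNet
# channel means, leaving values roughly in [-124, 152].
x_tf = tf.keras.applications.vgg16.preprocess_input(arr.copy())

# torchvision convention: scale to [0, 1], then normalize with the ImageNet
# mean/std, leaving values roughly in [-2.1, 2.6].
normalize = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
x_torch = normalize(img)

# Same pixels, very different tensors reach the two networks.
print(x_tf.mean(), x_torch.mean().item())

Because the MSE between feature maps is scale-dependent, this difference in input scale alone changes the magnitude of every perceptual-loss term.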
Also, these models were trained for classification accuracy, not for matching intermediate activations, so it is expected that the intermediate feature maps, and hence the perceptual-loss terms, differ between the two implementations.
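As a rough sanity check of the first point (not part of the original post), you can load both pretrained models and compare their first convolution kernels directly after transposing them into a common layout, assuming both weight downloads succeed:

import numpy as np
import tensorflow as tf
from torchvision import models

tf_vgg = tf.keras.applications.VGG16(weights="imagenet", include_top=False)
k_tf = tf_vgg.get_layer("block1_conv1").get_weights()[0]  # HWIO: (3, 3, 3, 64)

torch_vgg = models.vgg16(pretrained=True)
k_pt = torch_vgg.features[0].weight.detach().numpy()      # OIHW: (64, 3, 3, 3)
k_pt = np.transpose(k_pt, (2, 3, 1, 0))                   # rearrange to HWIO

# A large nonzero max difference means the checkpoints were produced
# independently (note also that the Keras kernels expect BGR-ordered input).
print(np.abs(k_tf - k_pt).max())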