Convolutional Neural Network Models: GoogLeNet Network Structure and Code Implementation

x33g5p2x, reposted on 2022-07-26 in Other

Introduction to GoogLeNet

Original GoogLeNet paper: Going Deeper with Convolutions: https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

GoogLeNet was proposed in 2014 by Christian Szegedy and colleagues. It introduced a new deep network architecture.

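The main innovations of GoogLeNet are:

  1. It proposes the Inception module, which convolves the input at several kernel sizes in parallel and then concatenates the results;
  2. It uses 1x1 convolutions for dimensionality reduction and projection;
  3. It adds two auxiliary classifiers to help training;
    An auxiliary classifier classifies from the output of an intermediate layer. During training its loss is added to the total loss with a small weight, and it is discarded at inference time.
  4. It uses an average pooling layer in place of the large fully connected layers before the final classifier, which greatly reduces the number of parameters.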
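To make the second point concrete, here is a rough weight count for a single 5x5 branch with and without a 1x1 reduction. The channel sizes (192 in, 16 after reduction, 32 out) are borrowed from the Inception3a configuration used later in this article; the helper name `conv_weights` is made up for this illustration, and bias terms are ignored.

```python
def conv_weights(in_channels, out_channels, kernel_size):
    """Number of weights in a square 2D convolution (bias terms ignored)."""
    return in_channels * out_channels * kernel_size * kernel_size

# Direct 5x5 convolution from 192 to 32 channels.
direct = conv_weights(192, 32, 5)

# 1x1 reduction to 16 channels first, then the 5x5 convolution.
reduced = conv_weights(192, 16, 1) + conv_weights(16, 32, 5)

print("direct 5x5:      ", direct)
print("1x1 then 5x5:    ", reduced)
print("reduction factor:", round(direct / reduced, 2))
```

Under these assumptions the reduced branch needs roughly a tenth of the weights of the direct one.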

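GoogLeNet Network Structure

The complete GoogLeNet network structure is shown below:

Next, we break it down layer by layer and walk through it alongside the code.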

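Layers before the Inception modules

Before the first Inception module, GoogLeNet stacks two convolutions (three in practice, counting a 1x1 convolution) and two max-pooling layers.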

    # input(3,224,224)
    self.front = nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),    # output(64,112,112)
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),   # output(64,56,56)
        nn.Conv2d(64, 64, kernel_size=1),                        # output(64,56,56)
        nn.ReLU(inplace=True),  # the paper applies ReLU after every convolution
        nn.Conv2d(64, 192, kernel_size=3, stride=1, padding=1),  # output(192,56,56)
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),   # output(192,28,28)
    )
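The output sizes in the comments above can be checked with a little arithmetic. The following is a minimal sketch in plain Python, with no PyTorch dependency; the helper `out_size` is a hypothetical name, and it assumes the usual output-size formula for convolution and pooling layers, including the `ceil_mode` rule that a last pooling window starting beyond the input is dropped.

```python
import math

def out_size(size, kernel, stride=1, padding=0, ceil_mode=False):
    """Spatial output size of a conv or pooling layer along one dimension."""
    span = (size + 2 * padding - kernel) / stride
    out = (math.ceil(span) if ceil_mode else math.floor(span)) + 1
    # With ceil_mode, a last window that would start past the input
    # (plus left padding) is dropped.
    if ceil_mode and (out - 1) * stride >= size + padding:
        out -= 1
    return out

# Layers of the stem, in order, with the settings used in the listing above.
stem = [
    ("conv 7x7",    dict(kernel=7, stride=2, padding=3)),
    ("maxpool 3x3", dict(kernel=3, stride=2, ceil_mode=True)),
    ("conv 1x1",    dict(kernel=1)),
    ("conv 3x3",    dict(kernel=3, padding=1)),
    ("maxpool 3x3", dict(kernel=3, stride=2, ceil_mode=True)),
]

size = 224
sizes = []
for name, cfg in stem:
    size = out_size(size, **cfg)
    sizes.append(size)
    print(f"{name:12s} -> {size}x{size}")
```

Without `ceil_mode=True`, the first pooling layer would produce 55x55 instead of 56x56, which is why the listing sets it.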

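The Inception module

An Inception module changes only the number of channels of the feature map, not its spatial size.

Because the Inception module is relatively complex, we define a separate class for it and control the channel counts of each layer through the constructor arguments.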

    class Inception(nn.Module):
        '''
        in_channels: number of input channels
        out1x1: output channels of branch 1 (1x1 convolution)
        in3x3: input channels of the 3x3 convolution in branch 2 (output of its 1x1 reduction)
        out3x3: output channels of the 3x3 convolution in branch 2
        in5x5: input channels of the 5x5 convolution in branch 3 (output of its 1x1 reduction)
        out5x5: output channels of the 5x5 convolution in branch 3
        pool_proj: output channels of the 1x1 convolution that follows the max pooling in branch 4
        '''
        def __init__(self, in_channels, out1x1, in3x3, out3x3, in5x5, out5x5, pool_proj):
            super(Inception, self).__init__()
            # Branch 1: 1x1 convolution
            self.branch1 = nn.Sequential(
                nn.Conv2d(in_channels, out1x1, kernel_size=1),
                nn.ReLU(inplace=True)
            )
            # Branch 2: 1x1 reduction followed by a 3x3 convolution
            self.branch2 = nn.Sequential(
                nn.Conv2d(in_channels, in3x3, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(in3x3, out3x3, kernel_size=3, padding=1),
                nn.ReLU(inplace=True)
            )
            # Branch 3: 1x1 reduction followed by a 5x5 convolution
            self.branch3 = nn.Sequential(
                nn.Conv2d(in_channels, in5x5, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(in5x5, out5x5, kernel_size=5, padding=2),
                nn.ReLU(inplace=True)
            )
            # Branch 4: 3x3 max pooling followed by a 1x1 projection
            self.branch4 = nn.Sequential(
                nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                nn.Conv2d(in_channels, pool_proj, kernel_size=1),
                nn.ReLU(inplace=True)
            )

        def forward(self, x):
            branch1 = self.branch1(x)
            branch2 = self.branch2(x)
            branch3 = self.branch3(x)
            branch4 = self.branch4(x)
            outputs = [branch1, branch2, branch3, branch4]
            return torch.cat(outputs, 1)  # concatenate along the channel dimension
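Concatenating the four branches along the channel dimension only works if every branch keeps the same height and width. The kernel size and padding pairs chosen in the class above guarantee that, as this small plain-Python check suggests. The helper `out_size` is a hypothetical name and assumes the standard floor-rounding output-size formula.

```python
def out_size(size, kernel, padding, stride=1):
    """Output size along one spatial dimension (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1

# (kernel_size, padding) of each size-relevant layer in the Inception class
layers = {
    "1x1 conv (branch1 and all reductions)": (1, 0),
    "3x3 conv (branch2)": (3, 1),
    "5x5 conv (branch3)": (5, 2),
    "3x3 max pool, stride 1 (branch4)": (3, 1),
}

results = {}
for size in (28, 14, 7):  # feature-map sizes at which the Inception modules run
    results[size] = [out_size(size, k, p) for k, p in layers.values()]
    print(size, "->", results[size])
```

With stride 1, the general rule is padding = (kernel_size - 1) / 2 for odd kernel sizes.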

Inception3a module

    # input(192,28,28)
    self.inception3a = Inception(192, 64, 96, 128, 16, 32, 32)  # output(64+128+32+32=256,28,28)

Inception3b + MaxPool

    # input(256,28,28)
    self.inception3b = Inception(256, 128, 128, 192, 32, 96, 64)  # output(480,28,28)
    self.maxpool3 = nn.MaxPool2d(3, stride=2, ceil_mode=True)     # output(480,14,14)

Inception4a

    # input(480,14,14)
    self.inception4a = Inception(480, 192, 96, 208, 16, 48, 64)  # output(512,14,14)

Inception4b

    # input(512,14,14)
    self.inception4b = Inception(512, 160, 112, 224, 24, 64, 64)  # output(512,14,14)

Inception4c

    # input(512,14,14)
    self.inception4c = Inception(512, 128, 128, 256, 24, 64, 64)  # output(512,14,14)

Inception4d

    # input(512,14,14)
    self.inception4d = Inception(512, 112, 144, 288, 32, 64, 64)  # output(528,14,14)

Inception4e + MaxPool

    # input(528,14,14)
    self.inception4e = Inception(528, 256, 160, 320, 32, 128, 128)  # output(832,14,14)
    self.maxpool4 = nn.MaxPool2d(3, stride=2, ceil_mode=True)       # output(832,7,7)

Inception5a

    # input(832,7,7)
    self.inception5a = Inception(832, 256, 160, 320, 32, 128, 128)  # output(832,7,7)

Inception5b

    # input(832,7,7)
    self.inception5b = Inception(832, 384, 192, 384, 48, 128, 128)  # output(1024,7,7)
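Since each module's output channel count is just the sum of its four branch outputs, the whole chain of Inception modules can be sanity-checked without building the network. The sketch below uses the argument tuples from the complete implementation in this article; the variable names are chosen for this illustration only.

```python
# (name, in_channels, out1x1, in3x3, out3x3, in5x5, out5x5, pool_proj)
configs = [
    ("inception3a", 192, 64, 96, 128, 16, 32, 32),
    ("inception3b", 256, 128, 128, 192, 32, 96, 64),
    ("inception4a", 480, 192, 96, 208, 16, 48, 64),
    ("inception4b", 512, 160, 112, 224, 24, 64, 64),
    ("inception4c", 512, 128, 128, 256, 24, 64, 64),
    ("inception4d", 512, 112, 144, 288, 32, 64, 64),
    ("inception4e", 528, 256, 160, 320, 32, 128, 128),
    ("inception5a", 832, 256, 160, 320, 32, 128, 128),
    ("inception5b", 832, 384, 192, 384, 48, 128, 128),
]

expected_in = 192  # channels produced by the layers before Inception3a
outputs = {}
for name, in_ch, out1x1, in3x3, out3x3, in5x5, out5x5, pool_proj in configs:
    # Each module must accept what the previous one produced; the max-pooling
    # layers between stages change the spatial size but not the channel count.
    assert in_ch == expected_in, f"{name}: expected {expected_in} input channels"
    out_ch = out1x1 + out3x3 + out5x5 + pool_proj
    outputs[name] = out_ch
    expected_in = out_ch
    print(f"{name}: {in_ch} -> {out_ch}")
```

The reduction widths `in3x3` and `in5x5` do not appear in the sum because they are internal to their branches.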

Layers after the Inception modules
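In the complete implementation in this article, the layers after Inception5b are an adaptive average pooling down to 1x1, a dropout layer, and a single linear layer. As a rough sketch of why the pooling matters for the parameter count, the comparison below assumes a hypothetical alternative in which the (1024,7,7) feature map is flattened straight into the classifier. That alternative is for illustration only and is not part of GoogLeNet; `linear_params` is a made-up helper name.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

channels, height, width, num_classes = 1024, 7, 7, 1000

# Global average pooling first: the classifier sees one value per channel.
pooled = linear_params(channels, num_classes)

# Hypothetical alternative: flatten the whole 1024x7x7 map into the classifier.
flattened = linear_params(channels * height * width, num_classes)

print("after global average pooling:", pooled)
print("after flattening directly:   ", flattened)
```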

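Auxiliary classifier module

Besides the backbone described above, GoogLeNet includes two auxiliary classifiers. Each one classifies from the output of an intermediate layer. During training its loss is added to the total loss with a small weight (0.3), and at inference time the auxiliary classifiers are discarded.

As with the Inception module, we define a separate class for the auxiliary classifier.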

    class AccClassify(nn.Module):
        # in_channels: number of input channels
        # num_classes: number of classes
        def __init__(self, in_channels, num_classes):
            super(AccClassify, self).__init__()  # required before registering submodules
            self.avgpool = nn.AvgPool2d(kernel_size=5, stride=3)
            self.conv = nn.Conv2d(in_channels, 128, kernel_size=1)  # output[batch, 128, 4, 4]
            self.relu = nn.ReLU(inplace=True)
            self.fc1 = nn.Linear(2048, 1024)  # 2048 = 128 * 4 * 4
            self.fc2 = nn.Linear(1024, num_classes)

        def forward(self, x):
            x = self.avgpool(x)
            x = self.conv(x)
            x = self.relu(x)
            x = torch.flatten(x, 1)
            x = F.dropout(x, 0.5, training=self.training)
            x = F.relu(self.fc1(x), inplace=True)
            x = F.dropout(x, 0.5, training=self.training)
            x = self.fc2(x)
            return x
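The input size of `fc1` (2048) follows from the shapes involved. Both auxiliary classifiers receive 14x14 feature maps, and the 1x1 convolution always outputs 128 channels, so the flattened size does not depend on `in_channels`. A minimal check in plain Python follows; `avg_pool_out` is a hypothetical helper that assumes the default floor rounding of `nn.AvgPool2d`.

```python
def avg_pool_out(size, kernel, stride):
    """Output size of an unpadded average pooling with floor rounding."""
    return (size - kernel) // stride + 1

conv_out_channels = 128  # output channels of the 1x1 convolution

flattened = {}
for in_channels in (512, 528):  # outputs of Inception4a and Inception4d
    spatial = avg_pool_out(14, kernel=5, stride=3)
    flattened[in_channels] = conv_out_channels * spatial * spatial
    print(f"in_channels={in_channels}: {conv_out_channels}x{spatial}x{spatial} "
          f"-> {flattened[in_channels]} features")
```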

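Auxiliary classifier 1

The first auxiliary output is taken after Inception4a: the output of Inception4a (512 channels) passes through average pooling, a 1x1 convolution, and the fully connected layers to produce a classification result.

    self.acc_classify1 = AccClassify(512, num_classes)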

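Auxiliary classifier 2

The second auxiliary output is taken after Inception4d, whose output has 528 channels.

    self.acc_classify2 = AccClassify(528, num_classes)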
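The 0.3 weighting of the auxiliary classifiers applies to their losses during training. The sketch below shows only that arithmetic, with plain floats standing in for the per-batch losses that a loss function such as cross-entropy would compute on the three outputs returned in training mode. The function name `combined_loss` and the example loss values are made up for illustration.

```python
def combined_loss(main_loss, aux_losses, aux_weight=0.3):
    """Total training loss: main loss plus down-weighted auxiliary losses."""
    return main_loss + aux_weight * sum(aux_losses)

# Placeholder values, not measured results.
main_loss = 1.20
aux_loss1 = 1.50  # from the classifier after Inception4a
aux_loss2 = 1.40  # from the classifier after Inception4d

total = combined_loss(main_loss, [aux_loss1, aux_loss2])
print(f"total training loss: {total:.2f}")
```

At inference time only the main output is used, so no weighting is involved.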

Overall network structure

Complete PyTorch implementation

    """
    #-*-coding:utf-8-*-
    # @author: wangyu a beginner programmer, striving to be the strongest.
    # @date: 2022/7/5 18:37
    """
    import torch.nn as nn
    import torch
    import torch.nn.functional as F


    class GoogLeNet(nn.Module):
        def __init__(self, num_classes=1000, aux_logits=True):
            super(GoogLeNet, self).__init__()
            self.aux_logits = aux_logits
            # input(3,224,224)
            self.front = nn.Sequential(
                nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),    # output(64,112,112)
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),   # output(64,56,56)
                nn.Conv2d(64, 64, kernel_size=1),                        # output(64,56,56)
                nn.ReLU(inplace=True),  # the paper applies ReLU after every convolution
                nn.Conv2d(64, 192, kernel_size=3, stride=1, padding=1),  # output(192,56,56)
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),   # output(192,28,28)
            )
            # input(192,28,28)
            self.inception3a = Inception(192, 64, 96, 128, 16, 32, 32)      # output(64+128+32+32=256,28,28)
            self.inception3b = Inception(256, 128, 128, 192, 32, 96, 64)    # output(480,28,28)
            self.maxpool3 = nn.MaxPool2d(3, stride=2, ceil_mode=True)       # output(480,14,14)
            self.inception4a = Inception(480, 192, 96, 208, 16, 48, 64)     # output(512,14,14)
            self.inception4b = Inception(512, 160, 112, 224, 24, 64, 64)    # output(512,14,14)
            self.inception4c = Inception(512, 128, 128, 256, 24, 64, 64)    # output(512,14,14)
            self.inception4d = Inception(512, 112, 144, 288, 32, 64, 64)    # output(528,14,14)
            self.inception4e = Inception(528, 256, 160, 320, 32, 128, 128)  # output(832,14,14)
            self.maxpool4 = nn.MaxPool2d(3, stride=2, ceil_mode=True)       # output(832,7,7)
            self.inception5a = Inception(832, 256, 160, 320, 32, 128, 128)  # output(832,7,7)
            self.inception5b = Inception(832, 384, 192, 384, 48, 128, 128)  # output(1024,7,7)
            # Register the auxiliary classifiers whenever they are requested;
            # forward() decides whether to use them based on self.training.
            if self.aux_logits:
                self.acc_classify1 = AccClassify(512, num_classes)
                self.acc_classify2 = AccClassify(528, num_classes)
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))  # output(1024,1,1)
            self.dropout = nn.Dropout(0.4)
            self.fc = nn.Linear(1024, num_classes)

        def forward(self, x):
            # input(3,224,224)
            x = self.front(x)        # output(192,28,28)
            x = self.inception3a(x)  # output(256,28,28)
            x = self.inception3b(x)
            x = self.maxpool3(x)
            x = self.inception4a(x)
            if self.training and self.aux_logits:
                classify1 = self.acc_classify1(x)
            x = self.inception4b(x)
            x = self.inception4c(x)
            x = self.inception4d(x)
            if self.training and self.aux_logits:
                classify2 = self.acc_classify2(x)
            x = self.inception4e(x)
            x = self.maxpool4(x)
            x = self.inception5a(x)
            x = self.inception5b(x)
            x = self.avgpool(x)
            x = torch.flatten(x, 1)  # torch.flatten takes start_dim, not dims
            x = self.dropout(x)
            x = self.fc(x)
            if self.training and self.aux_logits:
                return x, classify1, classify2
            return x

    class Inception(nn.Module):
        '''
        in_channels: number of input channels
        out1x1: output channels of branch 1 (1x1 convolution)
        in3x3: input channels of the 3x3 convolution in branch 2 (output of its 1x1 reduction)
        out3x3: output channels of the 3x3 convolution in branch 2
        in5x5: input channels of the 5x5 convolution in branch 3 (output of its 1x1 reduction)
        out5x5: output channels of the 5x5 convolution in branch 3
        pool_proj: output channels of the 1x1 convolution that follows the max pooling in branch 4
        '''
        def __init__(self, in_channels, out1x1, in3x3, out3x3, in5x5, out5x5, pool_proj):
            super(Inception, self).__init__()
            # e.g. input(192,28,28) for inception3a
            self.branch1 = nn.Sequential(
                nn.Conv2d(in_channels, out1x1, kernel_size=1),
                nn.ReLU(inplace=True)
            )
            self.branch2 = nn.Sequential(
                nn.Conv2d(in_channels, in3x3, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(in3x3, out3x3, kernel_size=3, padding=1),
                nn.ReLU(inplace=True)
            )
            self.branch3 = nn.Sequential(
                nn.Conv2d(in_channels, in5x5, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(in5x5, out5x5, kernel_size=5, padding=2),
                nn.ReLU(inplace=True)
            )
            self.branch4 = nn.Sequential(
                nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                nn.Conv2d(in_channels, pool_proj, kernel_size=1),
                nn.ReLU(inplace=True)
            )

        def forward(self, x):
            branch1 = self.branch1(x)
            branch2 = self.branch2(x)
            branch3 = self.branch3(x)
            branch4 = self.branch4(x)
            outputs = [branch1, branch2, branch3, branch4]
            return torch.cat(outputs, 1)  # concatenate along the channel dimension

    class AccClassify(nn.Module):
        def __init__(self, in_channels, num_classes):
            super(AccClassify, self).__init__()  # required before registering submodules
            self.avgpool = nn.AvgPool2d(kernel_size=5, stride=3)
            self.conv = nn.Conv2d(in_channels, 128, kernel_size=1)  # output[batch, 128, 4, 4]
            self.relu = nn.ReLU(inplace=True)
            self.fc1 = nn.Linear(2048, 1024)  # 2048 = 128 * 4 * 4
            self.fc2 = nn.Linear(1024, num_classes)

        def forward(self, x):
            x = self.avgpool(x)
            x = self.conv(x)
            x = self.relu(x)
            x = torch.flatten(x, 1)
            x = F.dropout(x, 0.5, training=self.training)
            x = F.relu(self.fc1(x), inplace=True)
            x = F.dropout(x, 0.5, training=self.training)
            x = self.fc2(x)
            return x

    # print(GoogLeNet())

Structure diagram
