PyTorch: what does converting the linear regression layers to an image tell me?

hwazgwia  posted on 2022-11-09 in Other

I am following this tutorial.
However, I decided to take the linear layers and reshape their output into a 1 * 19 * 19 image. When I do that, I just get a bunch of pixels in random places.
Below is my modified code. To describe what I did: I basically slice the first 10 outputs as the class labels, and everything from index 10 onward as the picture array. That way I separate the labels from the scrambled picture.

import torch
import torchvision
import torchvision.transforms as transforms
import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
import matplotlib.pyplot as plt

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

batch_size = 4

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5,8 * 7 * 7) # 8 * 7 * 7
        self.fc2 = nn.Linear(8 * 7 * 7, 6 * 8 * 8) # 6 * 8 * 8
        self.fc3 = nn.Linear(6 * 8 * 8,19 * 19 + 10) # 19 * 19 + 10

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        #### IMPORTANT: this here is where I extract the picture of this layer, comparing against the
        # regression layers and this here!
        self.picx = self.pool(F.relu(self.conv2(x)))
        ####
        x2 = torch.flatten(self.picx, 1) # flatten all dimensions except batch
        x2 = F.relu(self.fc1(x2))
        x2 = F.relu(self.fc2(x2))
        x2 = self.fc3(x2)
        return x2

net = Net().to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(5):  # loop over the dataset multiple times
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss = loss.item()

with torch.no_grad():

    dataiter = iter(testloader)
    images, labels = next(dataiter)
    images, labels = images.to(device), labels.to(device)

    outputs = net(images)  # compare the extracted convolutional feature map (net.picx) against the regression output

    _, predicted = torch.max(outputs[..., 0:10], 1)  # the first 10 outputs are the class scores

    print(predicted, labels)

    preds = torch.reshape(outputs[..., 10:], (4, 19, 19))  # the remaining 19 * 19 outputs are the "picture"

    plt.imshow(preds[0].cpu().numpy())
    plt.show()

correct = 0
total = 0

# since we're not training, we don't need to calculate the gradients for our outputs

with torch.no_grad():
    for data in testloader:
        images, labels = data
        images, labels = images.to(device), labels.to(device)
        # calculate outputs by running images through the network
        outputs = net(images)
        # the class with the highest energy is what we choose as prediction
        _, predicted = torch.max(outputs[...,0:10], 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy of the network on the 10000 test images: {100 * correct // total} %')

What does this image mean? Is there any way to get a mask showing which pixels the label was detected at? In other words, I would like an image that highlights the pixels where the network is seeing a subject such as a dog or a cat.


c3frrgcw 1#

I hate to break it to you, but that image doesn't mean much. When you flatten the output of the Conv2d layers and pass that output through two Linear layers, you lose any spatial meaning of the neurons. A "linear" (or "dense") layer connects every node of the previous layer to every node of the next layer, which effectively discards any relationship between a neuron/node and a location in the original input image.
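To make that concrete, here is a small self-contained sketch (mine, not part of the original answer; the layer size just mirrors fc1 from the question). A Linear layer treats its flattened input as an unordered feature vector: permuting the input positions together with the matching weight columns gives exactly the same output, so "where" a value sat in the 16 * 5 * 5 feature map carries no meaning to the layer by itself.

import torch
import torch.nn as nn

torch.manual_seed(0)
fc = nn.Linear(16 * 5 * 5, 8 * 7 * 7)         # same shape as fc1 in the question
x = torch.randn(4, 16 * 5 * 5)                # a fake flattened feature batch

perm = torch.randperm(16 * 5 * 5)             # scramble the "spatial" order
fc_perm = nn.Linear(16 * 5 * 5, 8 * 7 * 7)
with torch.no_grad():
    fc_perm.weight.copy_(fc.weight[:, perm])  # permute the weight columns the same way
    fc_perm.bias.copy_(fc.bias)

# identical outputs despite the scrambled input order
print(torch.allclose(fc(x), fc_perm(x[:, perm]), atol=1e-6))  # True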
If you want to see which parts of the image your network looks at in order to make its decision, you want to look inside the convolutional layers. That is a big question with many valid approaches; one popular method is Grad-CAM. If you want something simpler, you could try plotting each channel of the output of one of the convolutional layers separately, but even that is hard to interpret.
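For the simpler option, a rough sketch along these lines (assuming the trained net, testloader and device from the question are in scope; net.picx is the pooled conv2 output the question already stores) plots each of the 16 channels separately:

import torch
import matplotlib.pyplot as plt

with torch.no_grad():
    images, labels = next(iter(testloader))
    net(images.to(device))                 # forward pass fills net.picx
    feature_maps = net.picx[0]             # (16, 5, 5) feature maps for the first image

fig, axes = plt.subplots(4, 4, figsize=(8, 8))
for channel, ax in enumerate(axes.flat):
    ax.imshow(feature_maps[channel].cpu().numpy())
    ax.set_title(f'ch {channel}')
    ax.axis('off')
plt.show()

For the Grad-CAM route, a minimal hook-based sketch might look like the following (again my own rough version under the same assumptions, not a polished implementation; dedicated libraries such as pytorch-grad-cam handle the details more robustly). It weights the conv2 feature maps by the spatially averaged gradient of the top class score and upsamples the result to the 32 * 32 input size:

import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

activations, gradients = {}, {}
h1 = net.conv2.register_forward_hook(lambda m, i, o: activations.update(value=o))
h2 = net.conv2.register_full_backward_hook(lambda m, gi, go: gradients.update(value=go[0]))

images, labels = next(iter(testloader))
images = images.to(device)
scores = net(images)[..., 0:10]               # class scores only
scores[0, scores[0].argmax()].backward()      # gradient of the top class for image 0

weights = gradients['value'][0].mean(dim=(1, 2))                     # per-channel weights
cam = F.relu((weights[:, None, None] * activations['value'][0]).sum(0))
cam = cam / (cam.max() + 1e-8)                                       # normalize to [0, 1]
cam = F.interpolate(cam[None, None], size=(32, 32), mode='bilinear',
                    align_corners=False)[0, 0]                       # upsample to input size

plt.imshow(cam.detach().cpu().numpy())
plt.show()
h1.remove(); h2.remove()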
