Image classification with AlexNet in TorchVision

Reposted by x33g5p2x on 2021-11-27 under "Other"

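    TorchVision provides a pretrained AlexNet model, hosted at https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth . Calling the models.alexnet function with pretrained=True downloads it; the function is implemented in torchvision/models/alexnet.py. On Ubuntu the downloaded file is stored under ~/.cache/torch/hub/checkpoints, and on Windows under C:\Users\spring\.cache\torch\hub\checkpoints, where spring is the user name.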
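    To make the storage location concrete, the sketch below reproduces the default cache-path lookup as I understand it from torch.hub: the TORCH_HOME environment variable is checked first, then XDG_CACHE_HOME, then ~/.cache. The helper name expected_checkpoint_path is hypothetical and not part of PyTorch; it only needs the standard library.

```python
import os


def expected_checkpoint_path(filename="alexnet-owt-4df8aa71.pth", env=None):
    """Guess where torch.hub caches a downloaded checkpoint (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumed precedence: $TORCH_HOME, then $XDG_CACHE_HOME/torch, then ~/.cache/torch.
    cache_root = env.get("XDG_CACHE_HOME", os.path.join("~", ".cache"))
    torch_home = env.get("TORCH_HOME", os.path.join(cache_root, "torch"))
    torch_home = os.path.expanduser(torch_home)
    return os.path.join(torch_home, "hub", "checkpoints", filename)


if __name__ == "__main__":
    print(expected_checkpoint_path())
```

    When PyTorch is installed, torch.hub.get_dir() reports the hub directory actually in use, which is the more reliable check.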

    For an introduction to AlexNet, see: https://blog.csdn.net/fengbingchun/article/details/112709281

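    During inference the model takes a single tensor as input, and its shape must be [1,c,h,w]. The original image is preprocessed in the following steps:

    (1). Resize the image so that its shorter side is 256, scaling the longer side proportionally.

    (2). Center-crop the image to 224*224.

    (3). Convert the data from a PIL Image or numpy.ndarray to a tensor. The source data has shape [h,w,c] and the resulting tensor has shape [c,h,w]; the source values lie in [0,255] and the converted values lie in [0.0,1.0].

    (4). Normalize the tensor image with the per-channel mean and standard deviation.

    (5). Add a batch dimension, changing the tensor shape from [c,h,w] to [1,c,h,w].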
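    The shape and value changes in these steps can be traced with a small NumPy sketch. It is only an approximation of what torchvision.transforms does: the actual resampling in step (1) is skipped (only the target size is computed), the crop offsets use floor division, and the function names are hypothetical. The mean and standard deviation are the ImageNet values used in the test code.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def resized_size(height, width, short_side=256):
    """Step 1: target (height, width) so that the shorter edge equals short_side."""
    if height <= width:
        return short_side, int(short_side * width / height)
    return int(short_side * height / width), short_side


def center_crop(image_hwc, size=224):
    """Step 2: cut a size x size window from the center of an [h,w,c] array."""
    h, w = image_hwc.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return image_hwc[top:top + size, left:left + size, :]


def to_model_input(image_hwc_uint8):
    """Steps 3-5: [h,w,c] uint8 -> normalized float32 array of shape [1,c,h,w]."""
    chw = image_hwc_uint8.transpose(2, 0, 1).astype(np.float32) / 255.0  # [c,h,w], values in [0,1]
    chw = (chw - IMAGENET_MEAN[:, None, None]) / IMAGENET_STD[:, None, None]
    return chw[None, ...]  # prepend the batch dimension


if __name__ == "__main__":
    new_h, new_w = resized_size(480, 640)  # a 480x640 photo, for illustration
    resized = np.zeros((new_h, new_w, 3), dtype=np.uint8)  # stand-in for the resized image
    batch = to_model_input(center_crop(resized))
    print(new_h, new_w, batch.shape)
```

    Because of the normalization in step (4), the final values are no longer confined to [0.0,1.0]; a pure white pixel, for example, maps to roughly 2.25 in the red channel.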

    The model was trained on the ImageNet dataset and classifies images into 1000 classes. For an introduction to ImageNet, see: https://blog.csdn.net/fengbingchun/article/details/88606621

    The test code is as follows:

    import math

    import cv2
    import numpy as np
    import torch
    from PIL import Image
    from torchvision import models
    from torchvision import transforms

    # print(dir(models))

    images_path = "../../data/image/"
    images_name = ["5.jpg", "6.jpg", "7.jpg", "8.jpg", "9.jpg", "10.jpg"]
    images_data = []  # OpenCV images (BGR, shape [h,w,c])
    tensor_data = []  # PyTorch tensors (shape [1,c,h,w])


    def images_stitch(images, cols=3, name="result.jpg"):  # simple image stitching
        '''images: list, opencv image data; cols: number of images per line; name: save image result name'''
        width_total = 660
        width, height = width_total // cols, width_total // cols  # size of one grid cell
        number = len(images)
        height_total = height * math.ceil(number / cols)
        mat1 = np.zeros((height_total, width_total, 3), dtype="uint8")  # in Python images are represented as NumPy arrays

        for idx in range(number):
            # Fit each image into its cell while keeping the aspect ratio.
            height_, width_, _ = images[idx].shape
            if height_ != width_:
                if height_ > width_:
                    width_ = math.floor(width_ / height_ * width)
                    height_ = height
                else:
                    height_ = math.floor(height_ / width_ * height)
                    width_ = width
            else:
                height_, width_ = height, width

            mat2 = cv2.resize(images[idx], (width_, height_))
            # Center the resized image inside its cell.
            offset_y, offset_x = (height - height_) // 2, (width - width_) // 2
            start_y, start_x = idx // cols * height, idx % cols * width
            y0, x0 = start_y + offset_y, start_x + offset_x
            mat1[y0:y0 + height_, x0:x0 + width_, :] = mat2

        cv2.imwrite(images_path + name, mat1)


    # The preprocessing pipeline is the same for every image, so build it once.
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

    for name in images_name:
        img = cv2.imread(images_path + name)
        if img is None:  # cv2.imread returns None instead of raising when the file cannot be read
            raise FileNotFoundError(images_path + name)
        print(f"name: {images_path+name}, opencv image shape: {img.shape}")  # (h,w,c)
        images_data.append(img)

        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; the model expects RGB
        img_pil = Image.fromarray(img)
        tensor = transform(img_pil)
        print(f"tensor shape: {tensor.shape}, max: {torch.max(tensor)}, min: {torch.min(tensor)}")  # (c,h,w)
        tensor = torch.unsqueeze(tensor, 0)  # returns a new tensor with a dimension of size 1 inserted at the given position
        print(f"tensor shape: {tensor.shape}, max: {torch.max(tensor)}, min: {torch.min(tensor)}")  # (1,c,h,w)
        tensor_data.append(tensor)

    images_stitch(images_data)

    # AlexNet network. Newer torchvision releases (0.13 and later) prefer
    # weights=models.AlexNet_Weights.IMAGENET1K_V1 over pretrained=True.
    model = models.alexnet(pretrained=True)
    # print(model)  # shows the model structure, which matches torchvision/models/alexnet.py
    model.eval()  # AlexNet is required to be put in evaluation mode in order to do prediction/evaluation

    with open("imagenet_classes.txt") as f:
        classes = [line.strip() for line in f.readlines()]  # the line number specifies the class number

    with torch.no_grad():  # gradients are not needed for inference
        for tensor in tensor_data:
            prediction = model(tensor)
            # print(prediction.shape)  # [1,1000]
            _, index = torch.max(prediction, 1)
            percentage = torch.nn.functional.softmax(prediction, dim=1)[0] * 100
            print(f"result: {classes[index[0]]}, {percentage[index[0]].item()}")

    print("test finish")

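    The results are as follows. The original test images come from the internet, and only the class with the highest confidence is printed for each image. From top to bottom and left to right, the classification results are: goldfish, hen, ostrich, African crocodile, goose, hartebeest.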
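    Reporting only the top class hides how close the runners-up are. As a minimal sketch, assuming nothing beyond the standard library, the following turns raw scores into percentages with softmax and lists the k best classes. The function names, labels and logits here are made up for illustration; a real AlexNet output has 1000 values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=3):
    """Return the k most probable (label, percentage) pairs, best first."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(labels[i], probs[i] * 100.0) for i in order]


if __name__ == "__main__":
    # Hypothetical logits for four classes, not actual model output.
    labels = ["goldfish", "hen", "ostrich", "goose"]
    logits = [4.0, 1.0, 0.5, 2.0]
    for label, pct in top_k(logits, labels):
        print(f"{label}: {pct:.2f}%")
```

    With PyTorch tensors the same idea can be expressed with torch.topk applied to the softmax output.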

    GitHub: https://github.com/fengbingchun/PyTorch_Test
