OpenCV: cropping multiple bounding boxes from an image using a list of bounding boxes

7kqas0il  posted on 2023-11-22  in Other

Using Amazon Rekognition, I extracted the bounding boxes of interest from the JSON response with the following code:

    import json

    import cv2


    class Tmp:
        """Converts Rekognition's ratio-based coordinates to pixel coordinates."""

        def __init__(self, image):
            self.shape = image.shape  # (height, width, channels)

        def bounding_box_convert(self, bounding_box):
            xmin = int(bounding_box['Left'] * self.shape[1])
            xmax = xmin + int(bounding_box['Width'] * self.shape[1])
            ymin = int(bounding_box['Top'] * self.shape[0])
            ymax = ymin + int(bounding_box['Height'] * self.shape[0])
            return (xmin, ymin, xmax, ymax)

        def polygon_convert(self, polygon):
            pts = []
            for p in polygon:
                x = int(p['X'] * self.shape[1])
                y = int(p['Y'] * self.shape[0])
                pts.append([x, y])
            return pts


    def get_bounding_boxes(jsondata):
        objectnames = ('Helmet', 'Hardhat')
        bboxes = []
        a = jsondata
        if 'Labels' in a:
            for label in a['Labels']:
                # skip over anything that isn't a hardhat or helmet
                if label['Name'] in objectnames:
                    print('extracting {}'.format(label['Name']))
                    lbl = "{}: {:0.1f}%".format(label['Name'], label['Confidence'])
                    print(lbl)
                    for instance in label['Instances']:
                        # relies on the module-level `tmp` created in the main block
                        coords = tmp.bounding_box_convert(instance['BoundingBox'])
                        bboxes.append(coords)
        return bboxes


    if __name__ == '__main__':
        imagefile = 'image011.jpg'
        bgr_image = cv2.imread(imagefile)
        tmp = Tmp(bgr_image)
        jsonname = 'json_000'
        with open(jsonname, 'r') as fin:
            jsondata = json.load(fin)
        bb = get_bounding_boxes(jsondata)
        print(bb)

The output is a list of bounding boxes:

    [(865, 731, 1077, 906), (1874, 646, 2117, 824)]
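
To make the arithmetic in `bounding_box_convert` concrete: Rekognition reports `Left`, `Top`, `Width` and `Height` as ratios of the image size, so each value is scaled by the image width or height. Below is a small standalone check. The function name `to_pixel_box`, the 1000x2000 image size and the box ratios are all made up for illustration and are not data from the question.

```python
def to_pixel_box(bounding_box, image_shape):
    """Convert a ratio-based box to (xmin, ymin, xmax, ymax) in pixels."""
    height, width = image_shape[:2]
    xmin = int(bounding_box['Left'] * width)
    ymin = int(bounding_box['Top'] * height)
    xmax = xmin + int(bounding_box['Width'] * width)
    ymax = ymin + int(bounding_box['Height'] * height)
    return (xmin, ymin, xmax, ymax)


# Hypothetical values: an image 1000 px high and 2000 px wide
box = {'Left': 0.25, 'Top': 0.5, 'Width': 0.125, 'Height': 0.25}
pixel_box = to_pixel_box(box, (1000, 2000, 3))
print(pixel_box)  # (500, 500, 750, 750)
```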


I can easily take a single location from the list and save it as a new image using:

    from PIL import Image

    img = Image.open("image011.jpg")
    area = (865, 731, 1077, 906)
    cropped_img = img.crop(area)
    cropped_img.save("cropped.jpg")


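However, I have not found a good way to crop and save multiple bounding boxes from the image using the `bb` list output.
I did find a solution that reads the information from a CSV: "Most efficient/quickest way to crop multiple bounding boxes in 1 image, over thousands of images?"
However, I believe there is a more efficient way than saving the bounding box data to a CSV and reading it back in.
I'm not very good at writing my own functions, so any suggestions are much appreciated!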

Answer #1 (np8igboo)

Assuming your bounding box coordinates are in x,y,w,h form, you can crop with ROI = image[y:y+h, x:x+w]. For this input image:


[image: input image]
The script from "how to get ROI Bounding Box Coordinates without Guess & Check" was used to obtain the x,y,w,h bounding box coordinates for cropping out these ROIs:



We simply iterate over the list of bounding boxes and crop each one with NumPy slicing. The extracted ROIs:



Here is a minimal example:

    import cv2

    image = cv2.imread('1.png')
    # each box is (x, y, w, h)
    bounding_boxes = [(17, 24, 47, 47),
                      (74, 28, 47, 50),
                      (125, 15, 51, 61),
                      (184, 18, 53, 53),
                      (247, 25, 44, 46),
                      (296, 6, 65, 66)
                      ]

    num = 0
    for box in bounding_boxes:
        x, y, w, h = box
        # NumPy images are indexed [row, column], i.e. [y, x]
        ROI = image[y:y+h, x:x+w]
        cv2.imwrite('ROI_{}.png'.format(num), ROI)
        num += 1
        cv2.imshow('ROI', ROI)
        cv2.waitKey()  # press a key to advance to the next ROI
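
The example above assumes x,y,w,h boxes. The `bb` list in the question is in (xmin, ymin, xmax, ymax) form instead, so the slice becomes `image[ymin:ymax, xmin:xmax]`. Here is a minimal sketch of that variant. The helper name `crop_boxes` is hypothetical, and a blank array of assumed size stands in for `cv2.imread('image011.jpg')` so the snippet runs without the image file.

```python
import numpy as np


def crop_boxes(image, boxes):
    """Return one crop per (xmin, ymin, xmax, ymax) box, clipped to the image."""
    height, width = image.shape[:2]
    crops = []
    for xmin, ymin, xmax, ymax in boxes:
        # Clamp to the image bounds so out-of-range boxes slice predictably
        xmin, xmax = max(0, xmin), min(width, xmax)
        ymin, ymax = max(0, ymin), min(height, ymax)
        if xmin < xmax and ymin < ymax:
            # NumPy images are indexed [row, column], i.e. [y, x]
            crops.append(image[ymin:ymax, xmin:xmax])
    return crops


# Stand-in for the real image; the 1500x2500 size is an assumption
image = np.zeros((1500, 2500, 3), dtype=np.uint8)
bb = [(865, 731, 1077, 906), (1874, 646, 2117, 824)]

crops = crop_boxes(image, bb)
for i, crop in enumerate(crops):
    print(i, crop.shape)
    # With OpenCV available, each crop could be saved with:
    # cv2.imwrite('crop_{}.jpg'.format(i), crop)
```

The crops are views into the original array, so call `.copy()` on a crop before modifying it if the source image must stay unchanged.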

Answer #2 (cedebl8k)

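The suggested solution is slow, because this operation can be vectorized. Indeed, it seems that some popular frameworks (TensorFlow, Torch) leave this preprocessing to the user, while others provide it built in (see MATLAB's bboxcrop). Below is the vectorized code I use in my own research: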

    import tensorflow as tf


    def crop_bounding_boxes(boxes, window):
        """Crop bounding boxes to the specified window.

        Args:
            boxes: A tensor of shape `[n_boxes, 4]` describing bounding boxes.
                Each box is in pixel units and in the format `x_min,y_min,x_max,y_max`.
            window: A tensor of shape `[4]` describing the window.
                The window is in pixel units and in the format `x_min,y_min,x_max,y_max`.

        Returns:
            A tensor of shape `[n_kept, 4]` holding the boxes clipped to the window,
            in coordinates relative to the window's top-left corner. Boxes that
            do not intersect the window are dropped.
        """
        # assume boxes and window are given as (x1,y1,x2,y2)
        # compute intersections of rectangles
        tf_ops = [tf.maximum, tf.maximum, tf.minimum, tf.minimum]
        cropped_boxes = [op(window[pos], boxes[:, pos]) for (pos, op) in enumerate(tf_ops)]
        cropped_boxes = tf.stack(cropped_boxes, axis=-1)
        # keep only boxes whose intersection with the window is non-empty
        mask = tf.logical_and(
            tf.less(cropped_boxes[:, 0], cropped_boxes[:, 2]),
            tf.less(cropped_boxes[:, 1], cropped_boxes[:, 3]),
        )
        cropped_boxes = tf.boolean_mask(cropped_boxes, mask)
        # move the coordinates origin to (x1,y1)
        corner = tf.concat([window[:2], window[:2]], axis=0)
        corner = tf.broadcast_to(corner, cropped_boxes.shape)
        cropped_boxes = cropped_boxes - corner
        return cropped_boxes
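
For readers without TensorFlow, the same clip, filter and shift steps can be sketched in plain NumPy. This is an illustrative equivalent and not part of the original answer: the name `crop_bounding_boxes_np` and the sample boxes are assumptions.

```python
import numpy as np


def crop_bounding_boxes_np(boxes, window):
    """Clip (x_min, y_min, x_max, y_max) boxes to a window of the same format."""
    boxes = np.asarray(boxes)
    window = np.asarray(window)
    # Intersect every box with the window in one vectorized step
    clipped = np.concatenate(
        [np.maximum(boxes[:, :2], window[:2]),
         np.minimum(boxes[:, 2:], window[2:])],
        axis=1,
    )
    # Drop boxes whose intersection with the window is empty
    keep = (clipped[:, 0] < clipped[:, 2]) & (clipped[:, 1] < clipped[:, 3])
    clipped = clipped[keep]
    # Express the result relative to the window's top-left corner
    return clipped - np.tile(window[:2], 2)


boxes = [[64, 64, 192, 192], [200, 200, 250, 250]]
top_left = crop_bounding_boxes_np(boxes, (0, 0, 128, 128))
bottom_right = crop_bounding_boxes_np(boxes, (128, 128, 256, 256))
print(top_left)      # [[ 64  64 128 128]]
print(bottom_right)  # [[  0   0  64  64]
                     #  [ 72  72 122 122]]
```

In the top-left window the second box has no overlap and is dropped. In the bottom-right window both boxes survive and are shifted by (128, 128).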

Here is a small demo. Consider a simple image with a box in the middle:

    import tensorflow as tf
    import matplotlib.pyplot as plt

    img = tf.zeros(shape=(256, 256, 1), dtype=tf.float32)
    boxes = tf.constant([[0.25, 0.25, 0.75, 0.75]])
    img_with_box = tf.image.draw_bounding_boxes([img], [boxes], colors=[[1.0, 1.0, 1.0]])[0]
    plt.imshow(img_with_box.numpy(), cmap="gray")


[image: the image with the box drawn in the middle]
Using the utility above, we crop the image into 4 patches along with its bounding boxes:

    import itertools

    xy = list(itertools.product(range(2), repeat=2))
    # four 128x128 patches, each as (x_min, y_min, x_max, y_max)
    IMG_PATCHES = [(x*128, y*128, (x+1)*128, (y+1)*128) for y, x in xy]
    fig, axs = plt.subplots(2, 2, figsize=(12, 12))
    # convert the normalized boxes to pixel units
    boxes = tf.cast(boxes*256, dtype=tf.int32)
    for ax, img_patch in zip(axs.ravel(), IMG_PATCHES):
        # crop bounding boxes to the image patch
        cropped_boxes = crop_bounding_boxes(boxes, img_patch)
        # for display, convert boxes to the format expected by TF API: [y_min, x_min, y_max, x_max] + 0-1 scale
        x1, y1, x2, y2 = img_patch
        scale = tf.constant([x2-x1, y2-y1, x2-x1, y2-y1], dtype=tf.float32)
        cropped_boxes = tf.cast(cropped_boxes, tf.float32)/tf.broadcast_to(scale, cropped_boxes.shape)
        cropped_boxes = tf.gather(cropped_boxes, [1, 0, 3, 2], axis=-1)
        img_with_boxes = tf.image.draw_bounding_boxes([img[y1:y2, x1:x2, :]], tf.expand_dims(cropped_boxes, 0), colors=[[1.0, 1.0, 1.0]])
        ax.imshow(img_with_boxes.numpy().squeeze(), origin='upper', extent=(x1, x2, y2, y1), cmap="gray")
        ax.set_title(str(img_patch))
    plt.show()


Output:

Finally, a visually appealing example on real-world data: annotated trees.
After:



There is also a full notebook.
