OpenCV: crop multiple bounding boxes from an image with a list of bounding boxes

Posted by 7kqas0il on 2023-11-22 in Other


Using Amazon's Rekognition, I extracted the bounding boxes of interest from the JSON response with the following code:

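import json

import cv2


class Tmp:
    """Converts Rekognition's relative coordinates to pixel coordinates."""
    def __init__(self, image):
        self.shape = image.shape  # (height, width, channels)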
    def bounding_box_convert(self, bounding_box):
        xmin = int(bounding_box['Left'] * self.shape[1])
        xmax = xmin + int(bounding_box['Width'] * self.shape[1])
        ymin = int(bounding_box['Top'] * self.shape[0])
        ymax = ymin + int(bounding_box['Height'] * self.shape[0])
        return (xmin,ymin,xmax,ymax)
    def polygon_convert(self, polygon):
        pts = []
        for p in polygon:
            x = int(p['X'] * self.shape[1])
            y = int(p['Y'] * self.shape[0])
            pts.append( [x,y] )
        return pts
def get_bounding_boxes(jsondata):
    objectnames = ('Helmet','Hardhat')
    bboxes = []
    a = jsondata
    if('Labels' in a):
        for label in a['Labels']:
            #-- skip over anything that isn't hardhat,helmet
            if(label['Name'] in objectnames):
                print('extracting {}'.format(label['Name']))
                lbl = "{}: {:0.1f}%".format(label['Name'], label['Confidence'])
                print(lbl)
                for instance in label['Instances']:
                    coords = tmp.bounding_box_convert(instance['BoundingBox'])
                    bboxes.append(coords)
    return bboxes
if __name__=='__main__':
    imagefile = 'image011.jpg'
    bgr_image = cv2.imread(imagefile)
    tmp = Tmp(bgr_image)
    jsonname = 'json_000'
    with open(jsonname, 'r') as fin:
        jsondata = json.load(fin)
    bb = get_bounding_boxes(jsondata)
    print(bb)

The output is a list of bounding boxes:

[(865, 731, 1077, 906), (1874, 646, 2117, 824)]

I can easily take one position from the list and save it as a new image using:

from PIL import Image
img = Image.open("image011.jpg")
area = (865, 731, 1077, 906)
cropped_img = img.crop(area)
cropped_img.save("cropped.jpg")

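However, I have not found a good solution for cropping and saving multiple bounding boxes from the image using the "bb" list output.
I did find a solution that reads the information from a CSV: Most efficient/quickest way to crop multiple bounding boxes in 1 image, over thousands of images?.
However, I believe there is a more efficient way than saving the bounding box data to a CSV and reading it back in.
I am not very good at writing my own functions, so all suggestions are much appreciated!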

opencv

Source: https://stackoverflow.com/questions/59722712/crop-multiple-bounding-boxes-from-image-with-list-of-bounding-boxes

2 Answers


np8igboo1#

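Assuming your bounding box coordinates are in the form x,y,w,h, you can crop with ROI = image[y:y+h, x:x+w]. For this input image:

[input image]

Use the script from how to get ROI Bounding Box Coordinates without Guess & Check to obtain the x,y,w,h bounding box coordinates needed to crop out these ROIs:

[image with the bounding boxes marked]

We simply iterate through the list of bounding boxes and crop each one with NumPy slicing. The extracted ROIs:

[extracted ROIs]

Here is a minimal example: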

import cv2
import numpy as np 
image = cv2.imread('1.png')
bounding_boxes = [(17, 24, 47, 47),
                  (74, 28, 47, 50),
                  (125, 15, 51, 61),
                  (184, 18, 53, 53),
                  (247, 25, 44, 46),
                  (296, 6, 65, 66)
]
for num, (x, y, w, h) in enumerate(bounding_boxes):
    # NumPy images are indexed [row, column], i.e. [y, x]
    ROI = image[y:y+h, x:x+w]
    cv2.imwrite('ROI_{}.png'.format(num), ROI)
    cv2.imshow('ROI', ROI)
    cv2.waitKey()
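
Note that the `bb` list in the question is in `(xmin, ymin, xmax, ymax)` form rather than `x,y,w,h`, so the slice bounds can be used directly. The following is a minimal sketch under that assumption; `crop_boxes` is a hypothetical helper name, the clamping to the image bounds is an addition not present in the answer above, and a blank NumPy array stands in for the image that `cv2.imread` would return.

```python
import numpy as np

def crop_boxes(image, boxes):
    """Return one crop per (xmin, ymin, xmax, ymax) box, skipping empty ones."""
    height, width = image.shape[:2]
    crops = []
    for xmin, ymin, xmax, ymax in boxes:
        # Clamp to the image so out-of-range coordinates cannot wrap around
        x0, y0 = max(0, xmin), max(0, ymin)
        x1, y1 = min(width, xmax), min(height, ymax)
        if x1 > x0 and y1 > y0:
            crops.append(image[y0:y1, x0:x1])
    return crops

# Stand-in for cv2.imread(...): a blank image 1000 pixels high and 2200 wide
image = np.zeros((1000, 2200, 3), dtype=np.uint8)
bb = [(865, 731, 1077, 906), (1874, 646, 2117, 824)]
crops = crop_boxes(image, bb)
shapes = [crop.shape for crop in crops]
print(shapes)
```

Each returned crop could then be written out in a loop, for example with `cv2.imwrite('crop_{}.jpg'.format(i), crop)`.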


cedebl8k2#

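The suggested solution is slow, because this operation can be vectorized. It seems that some popular frameworks (TensorFlow, Torch) do leave this preprocessing to the user, while others provide it (see MATLAB's bboxcrop). Below is the vectorized code I use in my own research. Note that it crops the box coordinates to a window rather than cutting pixels out of the image: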

import tensorflow as tf

def crop_bounding_boxes(boxes,window):
    """Crop bounding boxes to the specified window.

    Args:
        boxes: A tensor of shape `[n_boxes,4]` describing bounding boxes. Each box is in pixel units and in the format `x_min,y_min,x_max,y_max`.
        window: A tensor of shape `[4]` describing the window. The window is in pixel units and in the format `x_min,y_min,x_max,y_max`.
    Returns:
        A tensor of shape `[n_kept,4]` holding the boxes clipped to the window, with coordinates relative to the window's top-left corner. Boxes that do not overlap the window are dropped.
    """
    # assume boxes and patch are given as (x1,y1,x2,y2)
    # compute intersections of rectangles
    tf_ops = [tf.maximum,tf.maximum,tf.minimum,tf.minimum]
    cropped_boxes = [op(window[pos],boxes[:,pos]) for (pos,op) in enumerate(tf_ops)]
    cropped_boxes = tf.stack(cropped_boxes,axis=-1)
    mask = tf.logical_and( tf.less(cropped_boxes[:,0],cropped_boxes[:,2]), tf.less(cropped_boxes[:,1],cropped_boxes[:,3]) )
    cropped_boxes = tf.boolean_mask(cropped_boxes,mask)
    # move the coordinates origin to (x1,y1)
    corner = tf.concat([window[:2],window[:2]],axis=0)
    corner = tf.broadcast_to(corner, cropped_boxes.shape)
    cropped_boxes = cropped_boxes - corner
    return cropped_boxes
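
For readers without TensorFlow, the same intersect, filter, and shift steps can be sketched in NumPy. This is an illustrative equivalent under the same `x_min,y_min,x_max,y_max` pixel convention, not the code from the answer; `crop_bounding_boxes_np` is a hypothetical name and the sample boxes are made up.

```python
import numpy as np

def crop_bounding_boxes_np(boxes, window):
    """Clip (x_min, y_min, x_max, y_max) boxes to a window of the same format."""
    boxes = np.asarray(boxes)
    window = np.asarray(window)
    # Intersect each box with the window: max for the min corner, min for the max corner
    clipped = np.concatenate(
        [np.maximum(boxes[:, :2], window[:2]), np.minimum(boxes[:, 2:], window[2:])],
        axis=-1,
    )
    # Keep only boxes that still have positive width and height after clipping
    keep = (clipped[:, 0] < clipped[:, 2]) & (clipped[:, 1] < clipped[:, 3])
    # Shift the origin to the window's top-left corner
    return clipped[keep] - np.tile(window[:2], 2)

boxes = np.array([[64, 64, 192, 192], [200, 200, 250, 250]])
top_left = crop_bounding_boxes_np(boxes, (0, 0, 128, 128))
bottom_right = crop_bounding_boxes_np(boxes, (128, 128, 256, 256))
print(top_left.tolist())      # [[64, 64, 128, 128]]
print(bottom_right.tolist())  # [[0, 0, 64, 64], [72, 72, 122, 122]]
```

The second box lies entirely outside the top-left window, so it is dropped there and survives only in the bottom-right one.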

Here is a small demo. Consider a simple image with a box in the middle:

import tensorflow as tf
import matplotlib.pyplot as plt
img = tf.zeros(shape=(256,256,1),dtype=tf.float32)
boxes = tf.constant([[0.25,0.25,0.75,0.75]])
img_with_box = tf.image.draw_bounding_boxes([img],[boxes],colors=[[1.0,1.0,1.0]])[0]
plt.imshow(img_with_box.numpy(), cmap="gray")

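[image: black square with a white box outline in the middle]

Using the utility above, we split the image into 4 crops and crop the bounding boxes along with it: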

import itertools
xy = list(itertools.product(range(2),repeat=2))
IMG_PATCHES = list((x*128,y*128,(x+1)*128,(y+1)*128) for y,x in xy)
fig,axs = plt.subplots(2,2,figsize=(12,12))
boxes = tf.cast(boxes*256, dtype=tf.int32)
for ax,img_patch in zip(axs.ravel(),IMG_PATCHES):
    # crop bounding boxes to the image patch
    cropped_boxes = crop_bounding_boxes(boxes,img_patch)
    # for display, convert boxes to the format expected by TF API: [y_min, x_min, y_max, x_max] + 0-1 scale
    x1,y1,x2,y2 = img_patch
    scale = tf.constant([x2-x1,y2-y1,x2-x1,y2-y1],dtype=tf.float32)
    cropped_boxes = tf.cast(cropped_boxes,tf.float32)/tf.broadcast_to(scale,cropped_boxes.shape)
    cropped_boxes = tf.gather(cropped_boxes, [1,0,3,2], axis=-1)
    img_with_boxes = tf.image.draw_bounding_boxes([img[y1:y2,x1:x2,:]],tf.expand_dims(cropped_boxes,0),colors=[[1.0,1.0,1.0]])
    ax.imshow(img_with_boxes.numpy().squeeze(), origin='upper', extent=(x1,x2,y2,y1), cmap="gray")
    ax.set_title(str(img_patch))
plt.show()
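
The display step in the loop converts pixel boxes in `x_min,y_min,x_max,y_max` order into the normalized `[y_min, x_min, y_max, x_max]` layout that `tf.image.draw_bounding_boxes` expects. A small NumPy sketch of just that conversion may make the scaling and reordering easier to follow; `to_normalized_yxyx` is a hypothetical helper and the sample box is made up for illustration.

```python
import numpy as np

def to_normalized_yxyx(boxes, width, height):
    """Convert pixel (x_min, y_min, x_max, y_max) boxes to 0-1 (y_min, x_min, y_max, x_max)."""
    boxes = np.asarray(boxes, dtype=np.float32)
    # x coordinates are divided by the width, y coordinates by the height
    scale = np.array([width, height, width, height], dtype=np.float32)
    normalized = boxes / scale
    # Swap each (x, y) pair into (y, x) order
    return normalized[:, [1, 0, 3, 2]]

converted = to_normalized_yxyx([[10, 40, 30, 100]], width=100, height=200)
print(converted)
```

For the sample box this yields approximately `[0.2, 0.1, 0.5, 0.3]`, that is, y_min, x_min, y_max, x_max as fractions of the patch size.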