How can I rotate the bounding box obtained from findContours in Python OpenCV?

r8xiu3jd  posted on 2022-12-04 in Python

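I have the following image:

I use OpenCV to find the contours in this image so that I can split the "122" into "1", "2", and "2". I then use OCR to classify the digits. The code I am using is as follows: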

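import cv2
import imutils
import matplotlib.pyplot as plt
import numpy as np
from imutils.contours import sort_contours

# `image` is assumed to be the BGR input image, already loaded
invert = cv2.bitwise_not(image)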
gray = cv2.cvtColor(invert, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
# perform edge detection, find contours in the edge map, and sort the
# resulting contours from left-to-right
edged = cv2.Canny(blurred, 30, 150)
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
cnts = sort_contours(cnts, method="left-to-right")[0]

# initialize the list of contour bounding boxes and associated
# characters that we'll be OCR'ing
chars = []
preds = []
for c in cnts:
    # compute the bounding box of the contour
    (x, y, w, h) = cv2.boundingRect(c)

    # filter out bounding boxes, ensuring they are neither too small
    # nor too large
    if (w >= 5 and w <= 150) and (h >= 15 and h <= 120):
        # extract the character and threshold it to make the character
        # appear as *white* (foreground) on a *black* background, then
        # grab the width and height of the thresholded image
        roi = gray[y:y + h, x:x + w]
        thresh = cv2.threshold(roi, 0, 255,
            cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
        (tH, tW) = thresh.shape

        # if the width is greater than the height, resize along the
        # width dimension
        if tW > tH:
            thresh = imutils.resize(thresh, width=32)
        # otherwise, resize along the height
        else:
            thresh = imutils.resize(thresh, height=32)

        # re-grab the image dimensions (now that it's been resized)
        # and then determine how much we need to pad the width and
        # height such that our image will be 32x32
        (tH, tW) = thresh.shape
        dX = int(max(0, 32 - tW) / 2.0)
        dY = int(max(0, 32 - tH) / 2.0)

        # pad the image to about 32x32, then force 28x28 dimensions
        padded = cv2.copyMakeBorder(thresh, top=dY, bottom=dY,
            left=dX, right=dX, borderType=cv2.BORDER_CONSTANT,
            value=(0, 0, 0))
        padded = cv2.resize(padded, (28, 28))

        # prepare the padded image for classification via our
        # handwriting OCR model
        padded = padded.astype("float32") / 255.0
        padded = np.expand_dims(padded, axis=-1)

        # update our list of characters that will be OCR'd
        chars.append((padded, (x, y, w, h)))
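        # show the crop from the original image for visual inspection
        # (x, y, w, h are unchanged since the boundingRect call above)
        roi = image[y:y + h, x:x + w]
        plt.imshow(roi)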

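This code works well for digits that are not slanted and are widely spaced, but in this image the "1" is slightly slanted. As a result, the bounding box around the "1" also includes part of the adjacent "2".

Does anyone have a suggestion for how I could rotate the bounding box slightly so that it excludes that part of the "2"?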

ni65a41a  #1

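Without knowing how the bounding box will be used downstream, it is hard to give specific advice.
The simplest approach is probably to use the boxPoints function. Given the rotated rectangle computed by cv2.minAreaRect, it returns the corner coordinates of the minimum-area bounding box around the contour. Alternatively, you could fit a line to the contour and use that line's angle to rotate the bounding box.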
