OpenCV: removing a faint watermark from a text document

Posted by 3pmvbmvn on 2023-08-06 in Other

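Dear OpenCV enthusiasts,
I am automating the reading of photographed documents, and the documents I am trying to read have a watermark underneath the text, which makes OCR difficult. I have managed to adjust the histogram and improve the overall image quality using Contrast Limited Adaptive Histogram Equalization (CLAHE) and a few other matrix operations (code below). I need a way to remove more of this kind of image noise and improve OCR accuracy.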

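import math

import cv2
import numpy as np


# Note: the post shows only the function body; this signature is an assumed wrapper.
def enhance(gray_image, hsv_image, sharpen=False):
    # Equalize local contrast with CLAHE
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    gray_image = clahe.apply(gray_image)
    # Mean of the V channel, scaled to [0, 1]
    height, width, channels = hsv_image.shape
    histogram = cv2.calcHist([hsv_image], [2], None, [256], [0, 256])
    brightness_value = sum([idx * value[0] for idx, value in enumerate(histogram)]) / (255 * height * width)
    # Brighter images get a smaller adjustment
    brightness = contrast = int(math.log(2 - brightness_value) * 100) + 30
    # Work in a wider type to avoid uint8 overflow, then clip back to 0..255
    gray_image = np.int16(gray_image)
    gray_image = gray_image * (contrast / 127 + 1) - contrast + brightness
    gray_image = np.clip(gray_image, 0, 255)
    if sharpen:
        gray = np.uint8(gray_image)
        sharpening_kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
        gray = cv2.filter2D(gray, -1, sharpening_kernel)
        return gray
    return np.uint8(gray_image)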
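
One thing worth noting about the snippet above: `brightness` and `contrast` are assigned the same value, so the `- contrast + brightness` terms cancel and the adjustment reduces to a pure gain of `contrast / 127 + 1`. The sketch below only reproduces that arithmetic. The helper name `gain_from_brightness` is made up for illustration and is not part of the original code.

```python
import math


def gain_from_brightness(brightness_value):
    """Return (gain, offset) for a mean V-channel brightness in [0, 1].

    Hypothetical helper: it mirrors the formulas in the question's code.
    """
    contrast = int(math.log(2 - brightness_value) * 100) + 30
    brightness = contrast  # same value, as in the original snippet
    gain = contrast / 127 + 1
    offset = -contrast + brightness  # the two terms cancel
    return gain, offset


# Dark, mid-brightness and fully bright images
for bv in (0.0, 0.5, 1.0):
    gain, offset = gain_from_brightness(bv)
    print(f"brightness={bv:.1f} gain={gain:.3f} offset={offset}")
```

Under these formulas the gain is roughly 1.78 for a dark image, 1.55 at mid brightness and 1.24 for a fully bright one, and the offset is always 0. If a real brightness shift was intended, the two values would need to be computed separately.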

Original Image

Improved image

Any suggestions or code to improve this further would be highly appreciated!

a11xaf1n #1

Here is one way to do it in Python/OpenCV: apply division normalization, then threshold the result.
Input:

import cv2
import numpy as np

# read the image as grayscale
img = cv2.imread('certificate.jpg', cv2.IMREAD_GRAYSCALE)

# blur img
blur = cv2.GaussianBlur(img, (0,0), sigmaX=99, sigmaY=99)

# divide img by blur
divide = cv2.divide(img, blur, scale=255)

# threshold
thresh = cv2.threshold(divide, 200, 255, cv2.THRESH_BINARY)[1]

# save results
cv2.imwrite('certificate_divide.jpg',divide)
cv2.imwrite('certificate_thresh.jpg',thresh)

# show results
cv2.imshow("divide", divide)
cv2.imshow("thresh", thresh)
cv2.waitKey(0)
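
To see why the division step suppresses a faint watermark, here is a small NumPy-only sketch of the arithmetic. It assumes that `cv2.divide(img, blur, scale=255)` on 8-bit images computes roughly `255 * img / blur`, rounded and saturated to 0..255, with 0 where the divisor is 0. The helper name `divide_normalize` and the sample pixel values are made up for illustration.

```python
import numpy as np


def divide_normalize(img, background, scale=255.0):
    """Approximate cv2.divide(img, background, scale=255) for uint8 inputs.

    Hypothetical helper for illustration; not an OpenCV function.
    """
    img = img.astype(np.float64)
    background = background.astype(np.float64)
    out = np.zeros_like(img)
    # Leave 0 where the background is 0 instead of dividing by zero
    np.divide(img * scale, background, out=out, where=background != 0)
    # Round and saturate to the 8-bit range
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Assumed sample values: paper, faint watermark, dark text
pixels = np.array([[230, 200, 40]], dtype=np.uint8)
# Assumed local background estimate (what a very wide blur would return)
background = np.full_like(pixels, 225)

normalized = divide_normalize(pixels, background)
# Same rule as cv2.threshold(..., 200, 255, cv2.THRESH_BINARY)
binary = np.where(normalized > 200, 255, 0).astype(np.uint8)

print(normalized)
print(binary)
```

With a local background of about 225, the paper (230) and the faint watermark (200) both land above the 200 threshold and turn white, while the dark text (40) stays black. A watermark that is much darker relative to its surroundings would fall below the threshold, so the value 200 may need tuning per document.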

Division normalization:



Threshold:

