How can I convert image coordinates to PDF coordinates when using pdf2image and table-transformers?

Posted by dauxcl2d on 2023-05-23 in Python


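I am converting a PDF to images with pdf2image and detecting tables with Table Transformer. I need help with the positioning.
The table bounding boxes come out perfectly, but they are in image pixels, which differ from PDF coordinates. How can I convert image coordinates to PDF coordinates? Here is my code for reference: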

from pdf2image import convert_from_path
images = convert_from_path('/content/Sample Statement Format Bancslink.pdf')
for i in range(len(images)):
  images[i].save('/content/pages_sbi/page'+str(i)+'.jpeg')

python-3.x

Source: https://stackoverflow.com/questions/76304011/how-can-i-convert-image-coordinates-to-pdf-coordinates-when-using-pdf2image-and

2 Answers


icnyk63a1#

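Here is how to convert image coordinates back to PDF page coordinates with PyMuPDF.
This works page by page, of course. So the following assumes that the image file was made from the corresponding page.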

import fitz  # PyMuPDF import
doc = fitz.open("input.pdf")
page = doc[pno]  # page number pno is 0-based
image = f"image{pno}.jpg"  # filename of the matching image of the page
# rectangle, e.g. one that wraps a table in the image
# x0, y0 are coordinates of its top-left point
# x1, y1 is the bottom-right point
rect = fitz.Rect(x0, y0, x1, y1)
# make a PyMuPDF image from the JPEG
pix = fitz.Pixmap(image)
# make a matrix that converts any image coordinates to page coordinates
mat = pix.irect.torect(page.rect)
# now every image coordinate can be converted to page coordinates
# e.g. this is the table rect in page coordinates:
pdfrect = rect * mat
# if you don't want PyMuPDF objects as rectangle, just use
# tuple(pdfrect) to retrieve the 4 coordinates
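
For intuition: when both the image and the page have their top-left corner at (0, 0), the matrix built above should amount to scaling x by page width / image width and y by page height / image height. The following is a minimal pure-Python sketch of that arithmetic, not PyMuPDF's implementation. The function name and the sizes are assumptions chosen for illustration: a 612 x 792 pt page rendered at 200 dpi, which is pdf2image's default resolution.

```python
def scale_factors(image_size, page_size):
    """Return (sx, sy) that map pixel coordinates to page points.

    image_size is (width, height) in pixels, page_size is (width, height)
    in PDF points. Both are assumed to start at (0, 0) in the top-left.
    """
    img_w, img_h = image_size
    page_w, page_h = page_size
    return page_w / img_w, page_h / img_h


# Illustrative values: a 612 x 792 pt page rendered at 200 dpi
# becomes a 1700 x 2200 px image.
sx, sy = scale_factors((1700, 2200), (612.0, 792.0))

# A pixel in the middle of the image lands in the middle of the page.
center_on_page = (850 * sx, 1100 * sy)
```

If the rendering resolution is known, each factor is simply 72 / dpi, because a PDF point is 1/72 inch.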

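By the way, PyMuPDF can also render pages as images. So if your table detection can be called page by page, you could run a loop like this:
1. Read the page with PyMuPDF.
2. Convert the page to an image, possibly in memory.
3. Pass the page image to the table recognizer, which returns the table coordinates.
4. Take the table coordinates and convert them to page coordinates as shown above.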
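
The loop described in these steps could look roughly like the sketch below. It is an illustration under assumptions, not a tested pipeline: `iter_table_rects` and `detect_tables` are hypothetical names, and `detect_tables` stands for whatever wrapper you write around your table detector, taking PNG bytes and returning pixel boxes. The `dpi` argument of `get_pixmap` needs a reasonably recent PyMuPDF version.

```python
def iter_table_rects(pdf_path, detect_tables, dpi=150):
    """Yield (page_number, rect) with rect in PDF page coordinates.

    detect_tables is a hypothetical callable supplied by the caller: it
    receives the PNG bytes of one rendered page and returns an iterable
    of (x0, y0, x1, y1) boxes in image pixel coordinates.
    """
    # PyMuPDF is imported here so that merely defining the function
    # does not require the package to be installed.
    import fitz

    with fitz.open(pdf_path) as doc:
        for page in doc:
            # Step 2: render the page to an in-memory image.
            pix = page.get_pixmap(dpi=dpi)
            # Matrix that maps image coordinates to page coordinates.
            mat = pix.irect.torect(page.rect)
            # Step 3: hand the image to the detector.
            for box in detect_tables(pix.tobytes("png")):
                # Step 4: convert the pixel box to page coordinates.
                yield page.number, fitz.Rect(box) * mat
```

A call such as `for pno, rect in iter_table_rects("input.pdf", my_detector): ...` would then yield one rectangle per detected table, where `my_detector` is your own function.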


Answered on 2023-05-23

zxlwwiss2#

OK, I found a solution that works for almost every case.
Take the following as the PDF-to-image code:

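import os
from pdf2image import convert_from_path
images = convert_from_path('PATH')
# create the output folder (plain-Python replacement for the notebook-only `!mkdir pages`)
os.makedirs('/content/pages', exist_ok=True)
for i, image in enumerate(images):
  image.save('/content/pages/page' + str(i) + '.jpeg')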

Next, get the page size from the PDF:

from pypdf import PdfReader
reader = PdfReader('PATH')
box = reader.pages[0].mediabox
pdf_width = box.width
pdf_height = box.height

Then read the image and get its size:

import cv2
im = cv2.imread('/content/pages/page0.jpeg')
height, width, channels = im.shape

Now take x_1, x_2, y_1 and y_2 as coordinates in the image. To get the same position in the PDF, use the following code:

# x values scale with the width, y values with the height
x_1  = x_1/width*pdf_width
y_1  = y_1/height*pdf_height
x_2  = x_2/width*pdf_width
y_2  = y_2/height*pdf_height

Use these coordinates for your work.
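
One thing to watch: image coordinates have their origin in the top-left corner, while the PDF's native coordinate system (the one the mediabox is expressed in) has its origin in the bottom-left. Whether you need to flip the y axis depends on the library that will consume the box. Below is a small sketch that wraps the scaling in a function with an optional flip. The function name, the `flip_y` parameter and the sample numbers are assumptions for illustration, and the sketch assumes the mediabox starts at (0, 0).

```python
def image_box_to_pdf(box, image_size, page_size, flip_y=False):
    """Scale an (x1, y1, x2, y2) pixel box to PDF points.

    image_size is (width, height) in pixels, page_size is
    (pdf_width, pdf_height) in points. With flip_y=True the result is
    expressed with the origin at the bottom-left of the page.
    """
    x1, y1, x2, y2 = box
    img_w, img_h = image_size
    page_w, page_h = page_size
    # x scales with the width, y scales with the height.
    px1, px2 = x1 * page_w / img_w, x2 * page_w / img_w
    py1, py2 = y1 * page_h / img_h, y2 * page_h / img_h
    if flip_y:
        # Mirror vertically; the old bottom edge becomes the new lower y.
        py1, py2 = page_h - py2, page_h - py1
    return px1, py1, px2, py2


# Illustrative values: a 1700 x 2200 px image of a 612 x 792 pt page.
top_left_box = image_box_to_pdf((170, 220, 850, 1100), (1700, 2200), (612.0, 792.0))
bottom_left_box = image_box_to_pdf(
    (170, 220, 850, 1100), (1700, 2200), (612.0, 792.0), flip_y=True
)
```

With the variables from the snippets above, the call would be `image_box_to_pdf((x_1, y_1, x_2, y_2), (width, height), (pdf_width, pdf_height))`.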


Answered on 2023-05-23
