Get bounding box of an image FAST #908
-
Is there any approach to get an image's bbox, using the xref or any other method, that is faster than
Replies: 2 comments 14 replies
-
You don't seem to need the xref at all, do you? Or any detail on how the page appearance references the image? If I get you right, all you need are the bbox coordinates of raster images actually shown on the page.
If this is true, I recommend using text extraction, although this may not be obvious. There is a performance-oriented variant that delivers text blocks, in which every image is represented by a line of text with image metadata:
pprint([b for b in page.get_text("blocks") if b[-1] == 1])  # take only image blocks
[(344.25,
  88.93597412109375,
  540.0,
  175.18597412109375,
  '<image: DeviceRGB, width 261, height 115, bpc 8>',
  0,
  1)]
An image block is identified by a 1 as its last item. The first 4 items of each block are the block's bbox, in our case the bbox of the image.
In [8]: %timeit imgs=[b for b in page.get_text("blocks") if b[-1] == 1]
22.4 ms ± 245 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [9]: images = doc.get_page_images(1,full=True)
In [10]: %timeit  imgs=[page.get_image_bbox(i) for i in images]
2.46 s ± 10.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
So you have 22.4 milliseconds versus 2.46 seconds, roughly a 110-fold difference.
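For reuse outside an interactive session, the block filter can be wrapped in a small helper. This is only a minimal sketch: the names `image_bboxes` and `page_image_bboxes` are hypothetical and not part of PyMuPDF, and the code assumes each block is a tuple of the form (x0, y0, x1, y1, text, block_no, block_type), as in the output shown in this thread.

```python
from typing import Iterable, List, Sequence, Tuple

BBox = Tuple[float, float, float, float]


def image_bboxes(blocks: Iterable[Sequence]) -> List[BBox]:
    """Return the bbox of every image block in a page.get_text("blocks") result."""
    # Assumed block layout: (x0, y0, x1, y1, text, block_no, block_type).
    # A block_type of 1 (the last item) marks an image block.
    return [tuple(b[:4]) for b in blocks if b[-1] == 1]


def page_image_bboxes(path: str, page_number: int = 0) -> List[BBox]:
    """Hypothetical wrapper: open a document and collect image bboxes of one page."""
    import fitz  # PyMuPDF; imported lazily so image_bboxes() works without it

    with fitz.open(path) as doc:
        page = doc[page_number]
        return image_bboxes(page.get_text("blocks"))
```

Keeping the filter separate from the file handling means it can be tried on any list of block tuples, for example ones copied from a `pprint` output like the one above.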
-
The reason we have such an apparent functional overlap here is that text extraction works for all document types, not just PDFs. The