pdfminer3k has no method named create_pages in PDFPage

2024/9/19 18:45:17

Since I want to move from Python 2 to 3, I tried to work with pdfminer3k in Python 3.4. It seems like they have edited everything. Their change logs do not reflect the changes they have made, and I have had no success parsing PDFs with pdfminer3k. For example:

They have moved PDFDocument into pdfparser (sorry if I spell it incorrectly). PDFPage used to have a create_pages method, which is gone now. All I can see inside PDFPage are internal methods. Does anybody have a working example of pdfminer3k? It seems there is no new documentation to reflect any of the changes.

Answer

If you are interested in reading text from a PDF file, the following code works with pdfminer3k on Python 3.4.

from pdfminer.pdfparser import PDFParser, PDFDocument
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTTextBox, LTTextLine

fp = open('file.pdf', 'rb')
parser = PDFParser(fp)
doc = PDFDocument()
parser.set_document(doc)
doc.set_parser(parser)
doc.initialize('')
rsrcmgr = PDFResourceManager()
laparams = LAParams()
device = PDFPageAggregator(rsrcmgr, laparams=laparams)
interpreter = PDFPageInterpreter(rsrcmgr, device)
# Process each page contained in the document.
for page in doc.get_pages():
    interpreter.process_page(page)
    layout = device.get_result()
    for lt_obj in layout:
        if isinstance(lt_obj, LTTextBox) or isinstance(lt_obj, LTTextLine):
            print(lt_obj.get_text())
fp.close()
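
If you would rather get the text back as a string than print it, the same calls can be wrapped in a function. This is only a sketch: the names extract_text and collect_text and the password parameter are made up for illustration and are not part of pdfminer3k. It assumes pdfminer3k is installed and uses a with block so the file is closed even if parsing fails.

```python
def collect_text(layouts, text_types):
    """Join the text of every layout object that is an instance of text_types."""
    chunks = []
    for layout in layouts:
        for lt_obj in layout:
            if isinstance(lt_obj, text_types):
                chunks.append(lt_obj.get_text())
    return "".join(chunks)


def extract_text(path, password=""):
    """Return the text of a PDF using the pdfminer3k calls from the answer above."""
    # Imported here so collect_text stays usable without pdfminer3k installed.
    from pdfminer.pdfparser import PDFParser, PDFDocument
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
    from pdfminer.converter import PDFPageAggregator
    from pdfminer.layout import LAParams, LTTextBox, LTTextLine

    with open(path, "rb") as fp:
        parser = PDFParser(fp)
        doc = PDFDocument()
        parser.set_document(doc)
        doc.set_parser(parser)
        doc.initialize(password)

        rsrcmgr = PDFResourceManager()
        device = PDFPageAggregator(rsrcmgr, laparams=LAParams())
        interpreter = PDFPageInterpreter(rsrcmgr, device)

        def layouts():
            # Yield one layout per page, in document order.
            for page in doc.get_pages():
                interpreter.process_page(page)
                yield device.get_result()

        # The generator is fully consumed here, while the file is still open.
        return collect_text(layouts(), (LTTextBox, LTTextLine))
```

Usage would then be something like text = extract_text('file.pdf').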