pyPDF2 "Stream has ended unexpectedly"



This is my first Python program. The writer is throwing an error, and it seems to happen at random while looping through the PDFs.

try: except: pass won't work, because it just skips the problem file and produces no output for it.

strict=False doesn't seem to work for the writer.

Error:

PdfReadWarning: Multiple definitions in dictionary at byte 0x6eb54 for key /PageMode [generic.py:587]
PdfReadWarning: Multiple definitions in dictionary at byte 0x75740 for key /PageMode [generic.py:587]
PdfReadWarning: Multiple definitions in dictionary at byte 0xabc13 for key /PageMode [generic.py:587]
Traceback (most recent call last):
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39librunpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39librunpy.py", line 87, in _run_code
exec(code, run_globals)
File "c:Userskmincey.BCSBLOCAL.vscodeextensionsms-python.python-2022.4.0pythonFileslibpythondebugpy__main__.py", line 45, in <module>
cli.main()
File "c:Userskmincey.BCSBLOCAL.vscodeextensionsms-python.python-2022.4.0pythonFileslibpythondebugpy/..debugpyservercli.py", line 444, in main
run()
File "c:Userskmincey.BCSBLOCAL.vscodeextensionsms-python.python-2022.4.0pythonFileslibpythondebugpy/..debugpyservercli.py", line 285, in run_file
runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39librunpy.py", line 268, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39librunpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39librunpy.py", line 87, in _run_code
exec(code, run_globals)
File "c:Userskmincey.BCSBLOCALDesktopPython_scriptsPDFsealer_V2.py", line 56, in <module>
output_pdf.write(f)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 482, in write
self._sweepIndirectReferences(externalReferenceMap, self._root)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 571, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 571, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 556, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, data[i])
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 571, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 556, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, data[i])
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 577, in _sweepIndirectReferences
newobj = data.pdf.getObject(data)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2pdf.py", line 1611, in getObject
retval = readObject(self.stream, self)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2generic.py", line 66, in readObject
return DictionaryObject.readFromStream(stream, pdf)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2generic.py", line 579, in readFromStream
value = readObject(stream, pdf)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2generic.py", line 68, in readObject
return readHexStringFromStream(stream)
File "C:Userskmincey.BCSBLOCALAppDataLocalProgramsPythonPython39libsite-packagesPyPDF2generic.py", line 311, in readHexStringFromStream
raise PdfStreamError("Stream has ended unexpectedly")
PyPDF2.utils.PdfStreamError: Stream has ended unexpectedly

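I've read several posts about needing to pass strict=False to the reader so that it emits warnings instead of errors (https://stackoverflow.com/questions/42570432/pypdf2-stream-has-ended-unexpectedly and https://github.com/mstamy2/PyPDF2/issues/99). That works in most cases; however, the writer now seems to be the problem.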

Thanks in advance for any advice.

Loop snippet for reference:

for file in input_pdf:
    output_pdf = PdfFileWriter()
    sg.OneLineProgressMeter('My Meter', i, page_count, 'And now we Wait.....')
    PageObj = PyPDF2.PdfFileReader(open(file, "rb"), strict=False).getPage(0)
    PageObj.scaleTo(11*72, 17*72)
    PageObj.mergePage(Seal_pdf.getPage(0))
    output_pdf.addPage(PageObj)
    output_filename = f"{file}"
    f = open(output_filename, "wb+")
    output_pdf.write(f)
    i = i + 1
    f.close()

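Thanks to the help from @cards and @KJ, I found that the problem was that I was trying to overwrite a file that was still in use. In effect, the original was still tied up in memory, and it got corrupted once the writer reached it. The solution I went with was simply to save the file under a different name and write a bit more code to clean up the directory. Thanks for the assistance.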
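For anyone who hits the same thing, here is a minimal sketch of the "save under a different name, then clean up" approach. It is not the exact code I ended up with: the helper name rewrite_in_place and its process callback are made up for illustration, and it uses only the standard library. The idea is to write the result to a temporary file next to the original and swap it in only after both file handles are closed.

```python
import os
import tempfile


def rewrite_in_place(path, process):
    """Hypothetical helper: run process(src, dst), then replace `path` with the result.

    `src` is the original file opened for reading and `dst` is a temporary
    file opened for writing, so the original is never truncated while it is
    still being read.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            process(src, dst)
        # Both handles are closed at this point, so the swap is safe.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, drop the partial temp file and leave the original untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

In the loop above, the process callback would be where the reader is built from src with strict=False, the page is scaled and merged, and output_pdf.write(dst) is called. Because the output goes to the temporary file, the reader can keep pulling objects from the original until the write has finished.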
