Unicode error PyPdf



I am trying to download several PDFs with the requests library and merge them with pyPdf. In general this works well, but for some PDFs I get an error.

mwe.py

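import requests
from pyPdf import PdfFileWriter, PdfFileReader
from StringIO import StringIO

# The request setup was not part of the posted snippet; the URL is a placeholder.
session = requests.Session()
response = session.get("https://example.com/document.pdf")

# Wrap the downloaded bytes in a file-like object for pyPdf (Python 2).
input = PdfFileReader(StringIO(response.content))
input.decrypt("")  # raises UnicodeEncodeError for some PDFs
output = PdfFileWriter()
output.addPage(input.getPage(0))
with open("document-output.pdf", "wb") as outputStream:
    output.write(outputStream)
session.close()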

Error

Traceback (most recent call last):
  File "mwe.py", line 21, in <module>
    input.decrypt("")
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 894, in decrypt
    return self._decrypt(password)
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 904, in _decrypt
    user_password, key = self._authenticateUserPassword(password)
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 945, in _authenticateUserPassword
    encrypt.get("/EncryptMetadata", BooleanObject(False)).getObject())
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 1818, in _alg35
    key = _alg32(password, rev, keylen, owner_entry, p_entry, id1_entry)
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 1729, in _alg32
    m.update(id1_entry)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

For the traceback I read the input from a file, but I don't think that matters in this case.

I found some related questions about this error, but none of them helped me solve my specific problem.

OK, I found out that this seems to be a bug inside pyPdf (1.13): https://github.com/mstamy2/pypdf2/issues/51
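The last frame of the traceback, `m.update(id1_entry)`, fails with an implicit ASCII encode, which suggests the document ID reached the hash function as text rather than as bytes. Here is a minimal sketch of that failure mode in isolation, using only the standard library. The values `id_text` and `raw_id` are made up for illustration; this is neither pyPdf's code nor the fix that went into PyPDF2.

```python
import hashlib

# Made-up stand-in for a PDF /ID entry that was decoded to text;
# its first two characters fall outside the ASCII range.
id_text = u"\u00fe\u00ffrest-of-id"

message = None
try:
    # Python 2 performs this encode implicitly when a unicode object
    # is passed to a hashlib digest's update() method.
    id_text.encode("ascii")
except UnicodeEncodeError as exc:
    message = str(exc)

print(message)

# Hashing raw bytes involves no text encoding step, so it cannot fail this way.
raw_id = b"\xfe\xffrest-of-id"
digest = hashlib.md5(raw_id).hexdigest()
print(digest)
```

The message printed first has the same wording as the last line of the traceback above, including the reported position 0-1.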

Using PyPDF2 (1.26.0) instead works as expected.
