UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 65534-65535: unexpected end of data



I want to encrypt a file with simple AES encryption. Here is my Python 3 source code.

import os, random, struct
from Crypto.Cipher import AES

def encrypt_file(key, in_filename, out_filename=None, chunksize=64*1024):
    if not out_filename:
        out_filename = in_filename + '.enc'
    iv = os.urandom(16)
    encryptor = AES.new(key, AES.MODE_CBC, iv)
    filesize = os.path.getsize(in_filename)
    with open(in_filename, 'rb') as infile:
        with open(out_filename, 'wb') as outfile:
            outfile.write(struct.pack('<Q', filesize))
            outfile.write(iv)
            while True:
                chunk = infile.read(chunksize)
                if len(chunk) == 0:
                    break
                elif len(chunk) % 16 != 0:
                    chunk += ' ' * (16 - len(chunk) % 16)
                outfile.write(encryptor.encrypt(chunk.decode('UTF-8','strict')))

It works fine for some files but fails with an error for others. For example:

encrypt_file("qwertyqwertyqwer", '/tmp/test1', out_filename=None, chunksize=64*1024)

No error message; it works fine.

encrypt_file("qwertyqwertyqwer", '/tmp/test2', out_filename=None, chunksize=64*1024)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 17, in encrypt_file
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 65534-65535: unexpected end of data

How can I fix my encrypt_file function?

Following t.m.adam's suggestion, I changed

outfile.write(encryptor.encrypt(chunk.decode('UTF-8','strict')))

to

outfile.write(encryptor.encrypt(chunk))

and tried it on some files.

encrypt_file("qwertyqwertyqwer", '/tmp/test', out_filename=None, chunksize=64*1024)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 16, in encrypt_file
TypeError: can't concat bytes to str

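The main problem with your code is that you are using strings. AES works with binary data, and if you were using PyCryptodome this code would raise a TypeError: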

Object type <class 'str'> cannot be passed to C code

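PyCrypto accepts strings but encodes them to bytes internally, so it makes no sense to decode your bytes to a string: it would only be encoded back to bytes. Moreover, it encodes with ASCII (tested with PyCrypto v2.6.1, Python v2.7), so, for example, this code: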

encryptor.encrypt(u'ψ' * 16)

would raise a UnicodeEncodeError:

  File "C:\Python27\lib\site-packages\Crypto\Cipher\blockalgo.py", line 244, in encrypt
    return self._cipher.encrypt(plaintext)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-15

You should always use bytes when encrypting or decrypting data. If the plaintext is text, you can decode it to a string after decryption.
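The error from the question can be reproduced without any encryption library. The following is a minimal sketch, assuming the input contains multi-byte UTF-8 characters: a fixed-size read can cut a character in half, and decoding that chunk then fails. The sample data and the cut position are illustrative only.

```python
# Illustrative data: each 'ψ' occupies two bytes in UTF-8.
data = u'ψ'.encode('utf-8') * 4   # 8 bytes in total
chunk = data[:5]                  # a chunk boundary that splits the third character

try:
    chunk.decode('utf-8', 'strict')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# The reason matches the one in the question's traceback.
print(decode_error.reason)

# Bytes need no decoding at all: the cipher can take `chunk` as it is.
print(isinstance(chunk, bytes))
```

Whether a given file triggers the error depends only on where the chunk boundary happens to fall, which would explain why some files worked and others did not.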

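The next problem is your padding method. It produces a string, so you get a TypeError when you try to append it to the plaintext, which should be bytes. You can fix this by padding with bytes:

chunk += b' ' * (16 - len(chunk) % 16)

but it would be better to use PKCS7 padding (currently you are using a form of zero padding, but with spaces instead of zero bytes).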
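To see why space padding is fragile, here is a small sketch. The helper name `space_pad` is hypothetical; it only mirrors the padding logic from the question, applied to bytes.

```python
BLOCK = 16  # AES block size in bytes

def space_pad(data):
    """Pad with spaces up to a multiple of BLOCK, as the question's code does."""
    if len(data) % BLOCK != 0:
        data += b' ' * (BLOCK - len(data) % BLOCK)
    return data

# Two different plaintexts end up as the same padded block, so the
# original length cannot be recovered from the padding alone.
a = space_pad(b'hello')
b = space_pad(b'hello   ')
print(a == b)

# Stripping the spaces after decryption also removes real trailing spaces.
print(a.rstrip(b' '), b.rstrip(b' '))
```

The question's code works around this by storing the file size in the header. PKCS7 instead makes the padding self-describing, so no separate length field is needed.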

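PyCryptodome provides padding functions, but it seems you are using PyCrypto. In that case you could implement PKCS7 padding yourself or, better yet, copy PyCryptodome's padding functions.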

try:
    from Crypto.Util.Padding import pad, unpad
except ImportError:
    from Crypto.Util.py3compat import bchr, bord

    def pad(data_to_pad, block_size):
        padding_len = block_size - len(data_to_pad) % block_size
        padding = bchr(padding_len) * padding_len
        return data_to_pad + padding

    def unpad(padded_data, block_size):
        pdata_len = len(padded_data)
        if pdata_len % block_size:
            raise ValueError("Input data is not padded")
        padding_len = bord(padded_data[-1])
        if padding_len < 1 or padding_len > min(block_size, pdata_len):
            raise ValueError("Padding is incorrect.")
        if padded_data[-padding_len:] != bchr(padding_len) * padding_len:
            raise ValueError("PKCS#7 padding is incorrect.")
        return padded_data[:-padding_len]

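The pad and unpad functions were copied from Crypto.Util.Padding and modified to use only PKCS7 padding. Note that with PKCS7 padding it is important to pad the last chunk even if its size is a multiple of the block size; otherwise you won't be able to unpad correctly.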
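The point about always padding the last chunk can be checked with a dependency-free sketch. The names `pkcs7_pad` and `pkcs7_unpad` are hypothetical, simplified stand-ins for the functions above, written for Python 3 only.

```python
BLOCK = 16  # AES block size in bytes

def pkcs7_pad(data, block_size=BLOCK):
    n = block_size - len(data) % block_size   # always between 1 and block_size
    return data + bytes([n]) * n

def pkcs7_unpad(padded, block_size=BLOCK):
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("PKCS#7 padding is incorrect.")
    return padded[:-n]

# Exactly one block of data whose last byte happens to be 0x01.
aligned = b'A' * 15 + b'\x01'

padded = pkcs7_pad(aligned)
print(len(padded))                     # a full extra block of padding is added
print(pkcs7_unpad(padded) == aligned)  # the round trip restores the data

# Had the aligned block been written without padding, unpadding would
# mistake the trailing 0x01 for padding and drop a real data byte.
print(pkcs7_unpad(aligned) == aligned)
```

The last comparison fails because the unpadded block loses its final byte, which is the kind of silent corruption the extra block prevents.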

Applying these changes to the encrypt_file function:

def encrypt_file(key, in_filename, out_filename=None, chunksize=64*1024):
    if not out_filename:
        out_filename = in_filename + '.enc'
    iv = os.urandom(16)
    encryptor = AES.new(key, AES.MODE_CBC, iv)
    filesize = os.path.getsize(in_filename)
    with open(in_filename, 'rb') as infile:
        with open(out_filename, 'wb') as outfile:
            outfile.write(struct.pack('<Q', filesize))
            outfile.write(iv)
            pos = 0
            while pos < filesize:
                chunk = infile.read(chunksize)
                pos += len(chunk)
                if pos == filesize:
                    chunk = pad(chunk, AES.block_size)
                outfile.write(encryptor.encrypt(chunk))
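The function writes an 8-byte little-endian file size and a 16-byte IV before any ciphertext, and a decryption routine has to read them back in the same order. A small sketch of that header layout, using only the standard library (the file size value is illustrative):

```python
import os
import struct

filesize = 100000                 # illustrative value
iv = os.urandom(16)

# What encrypt_file writes ahead of the ciphertext.
header = struct.pack('<Q', filesize) + iv

# Reading the same fields back, in the same order.
size_field = struct.calcsize('<Q')            # 8 bytes
parsed_size = struct.unpack('<Q', header[:size_field])[0]
parsed_iv = header[size_field:size_field + 16]

print(len(header), parsed_size, parsed_iv == iv)
```

The header is therefore 24 bytes long, which is the offset at which the ciphertext starts.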

and the matching decrypt_file function:

def decrypt_file(key, in_filename, out_filename=None, chunksize=64*1024):
    if not out_filename:
        out_filename = in_filename + '.dec'
    with open(in_filename, 'rb') as infile:
        filesize = struct.unpack('<Q', infile.read(8))[0]
        iv = infile.read(16)
        encryptor = AES.new(key, AES.MODE_CBC, iv)
        with open(out_filename, 'wb') as outfile:
            encrypted_filesize = os.path.getsize(in_filename)
            pos = 8 + 16  # the filesize and IV.
            while pos < encrypted_filesize:
                chunk = infile.read(chunksize)
                pos += len(chunk)
                chunk = encryptor.decrypt(chunk)
                if pos == encrypted_filesize:
                    chunk = unpad(chunk, AES.block_size)
                outfile.write(chunk)

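This code is compatible with Python 2 and Python 3, and it should work with either PyCryptodome or PyCrypto.

However, if you are using PyCrypto, I recommend updating to PyCryptodome. PyCryptodome is a fork of PyCrypto that exposes the same API (so you won't have to change your code much), plus some extra features: padding functions, authenticated encryption algorithms, KDFs and so on. PyCrypto, on the other hand, is no longer maintained, and some versions suffer from a heap-based buffer overflow vulnerability: CVE-2013-7459.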
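As an illustration of what a KDF is for, here is a hedged sketch that derives an AES key from a password. It uses `hashlib.pbkdf2_hmac` from Python's standard library rather than PyCryptodome's own KDF functions, purely to stay dependency-free. The salt size and iteration count are assumptions, not recommendations from this answer.

```python
import hashlib
import os

password = b'qwertyqwertyqwer'    # the password from the question, as bytes
salt = os.urandom(16)             # assumed size; must be stored with the ciphertext
iterations = 200000               # assumed value; tune for the target hardware

# Derive a 32-byte key (suitable for AES-256) from the password.
key = hashlib.pbkdf2_hmac('sha256', password, salt, iterations, dklen=32)

# The same password and salt always give the same key, which is
# what allows the file to be decrypted later.
same = hashlib.pbkdf2_hmac('sha256', password, salt, iterations, dklen=32)
print(len(key), key == same)
```

The derived `key` could then be passed to `AES.new` in place of the raw password string.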

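In addition to the accepted answer, I believe that showing multiple implementations of simple AES encryption can be useful to readers and new learners: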

import os
import sys
import pickle
import base64
import hashlib
import errno
from Crypto import Random
from Crypto.Cipher import AES

DEFAULT_STORAGE_DIR = os.path.join(os.path.dirname(__file__), '.ncrypt')

def create_dir(dir_name):
    """ Safely create a new directory. """
    try:
        os.makedirs(dir_name)
        return dir_name
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise OSError('Unable to create directory.')

class AESCipher(object):
    DEFAULT_CIPHER_PICKLE_FNAME = "cipher.pkl"

    def __init__(self, key):
        self.bs = 32  # padding block size (a multiple of the 16-byte AES block)
        self.key = hashlib.sha256(key.encode()).digest()

    def encrypt(self, raw):
        # Encode to bytes before padding: the cipher works on bytes, not str.
        raw = self._pad(raw.encode('utf-8'))
        iv = Random.new().read(AES.block_size)
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return base64.b64encode(iv + cipher.encrypt(raw))

    def decrypt(self, enc):
        enc = base64.b64decode(enc)
        iv = enc[:AES.block_size]
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return self._unpad(cipher.decrypt(enc[AES.block_size:])).decode('utf-8')

    def _pad(self, s):
        # PKCS7-style padding applied to bytes (Python 3).
        pad_len = self.bs - len(s) % self.bs
        return s + bytes([pad_len]) * pad_len

    @staticmethod
    def _unpad(s):
        # The last byte holds the padding length.
        return s[:-ord(s[len(s)-1:])]

And an example of using the above:

# Note: `cipher` (an AESCipher instance) and `pickle_fname` (the path of the
# pickled cipher) are not defined in this snippet and must be set up beforehand.
while True:
    option = input('\n'.join(["="*80,
                              "| Select an operation:",
                              "| 1) E : Encrypt",
                              "| 2) D : Decrypt",
                              "| 3) H : Help",
                              "| 4) G : Generate new cipher",
                              "| 5) Q : Quit",
                              "="*80,
                              "> "])).lower()
    print()
    # input() returns a string, so the numeric choices are compared as strings.
    if option in ('e', '1'):
        plaintext = input('Enter plaintext to encrypt: ')
        print("Encrypted: {}".format(cipher.encrypt(plaintext).decode("utf-8")))
    elif option in ('d', '2'):
        ciphertext = input('Enter ciphertext to decrypt: ')
        print("Decrypted: {}".format(cipher.decrypt(ciphertext.encode("utf-8"))))
    elif option in ('h', '3'):
        print("Help:\n\tE: Encrypt plaintext\n\tD: Decrypt ciphertext.")
    elif option in ('g', '4'):
        if input("Are you sure? [yes/no]: ").lower() in ["yes", "y"]:
            cipher = AESCipher(key=input('Enter cipher password: '))
            with open(pickle_fname, 'wb') as f:
                pickle.dump(cipher, f)
            print("Generated new cipher.")
    elif option in ('q', '5'):
        raise EOFError
    else:
        print("Unknown operation.")
