
UnicodeDecodeError while identifying file type (gz or zip)

Updated: 2023-09-18

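I have the code below to identify whether a given file is a gz or a zip file. However, it fails with the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

Can you help me figure out what is going wrong here? Thanks in advance.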

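    # Code as posted in the question; the def line is assumed, since the
    # snippet uses return. Opening in text mode is what raises the error.
    def identify_file_type(filename):
        header_dict = {
            "\x1f\x8b\x08": "gz",
            "\x50\x4b\x03\x04": "zip"
        }
        len_max = max(len(x) for x in header_dict)
        with open(filename) as f:  # text mode: bytes are decoded
            file_start = f.read(len_max)
        for header, file_type in header_dict.items():
            if file_start.startswith(header):
                return file_type
        return "no match"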

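Open the file in binary mode so that Python does not try to decode the bytes. In text mode the contents are decoded as text (UTF-8 here), and the second byte of a gzip header, 0x8b, is not a valid UTF-8 start byte.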

    with open(filename, 'rb') as f:

Then make your test strings bytes as well:

    header_dict = {
        b"\x1f\x8b\x08": "gz",
        b"\x50\x4b\x03\x04": "zip"
    }
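Putting the two changes together, the check might look like the sketch below. The function name detect_file_type and the module-level HEADER_DICT are hypothetical names chosen for illustration; the header bytes and the "no match" fallback come from the question.

```python
# Magic-number prefixes as bytes, mapped to a short type label.
# HEADER_DICT and detect_file_type are illustrative names, not from the post.
HEADER_DICT = {
    b"\x1f\x8b\x08": "gz",       # gzip magic bytes plus the deflate method byte
    b"\x50\x4b\x03\x04": "zip",  # "PK\x03\x04", the zip local file header
}


def detect_file_type(filename):
    """Return "gz", "zip", or "no match" based on the file's leading bytes."""
    len_max = max(len(header) for header in HEADER_DICT)
    # Binary mode returns raw bytes, so no decoding is attempted.
    with open(filename, "rb") as f:
        file_start = f.read(len_max)
    for header, file_type in HEADER_DICT.items():
        if file_start.startswith(header):
            return file_type
    return "no match"
```

Note that this only inspects the first few bytes. An empty zip archive, for example, starts with a different signature and would come back as "no match".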
