
Is there any difference between writing the decoded HTML str to a file and writing the HTML binary data directly to a file?

Updated: 2023-09-14

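Code:

from urllib import request

response = request.urlopen('http://www.amazon.com/')
body = response.read()  # raw bytes of the HTTP response body

# Approach 1: write the bytes unchanged
with open('test.html', 'wb') as f:
    f.write(body)

# Approach 2: decode to str, then write in text mode
with open('test2.html', 'w') as f:
    f.write(body.decode('utf-8'))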

Is there any difference between the two, or anything I should watch out for?

The first approach

with open('test.html', 'wb') as f:
    f.write(body)

simply saves the binary data you downloaded, byte for byte.

The second approach

with open('test2.html', 'w') as f:
    f.write(body.decode('utf-8'))

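assumes that the data is UTF-8, decodes those bytes to Unicode text, and then re-encodes the text to the default file encoding, as reported by locale.getpreferredencoding(False). So if the data is already UTF-8, it wastefully decodes and re-encodes it, and if the default file encoding is not UTF-8, the bytes on disk will differ from the bytes you downloaded. If the data is not UTF-8, then decoding it as UTF-8 uses the wrong encoding. That still works if the data contains only pure 7-bit ASCII; otherwise it gives incorrect results or raises UnicodeDecodeError. Note also that text mode may translate newlines on write (on Windows, '\n' becomes '\r\n' by default), so the output can differ from the original even when the encodings match.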
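The difference can be checked without touching the network. The following is a minimal sketch, not part of the original answer: the helper name write_both_ways, the file names raw.html and text.html, and the sample bytes are all hypothetical and chosen only for illustration. It passes encoding and newline explicitly to open() so the result does not depend on the machine's default encoding.

```python
import os
import tempfile


def write_both_ways(body: bytes, directory: str, text_encoding: str):
    """Write `body` once in binary mode and once via decode + text mode.

    Hypothetical helper for illustration. Returns the bytes that ended up
    on disk for each approach as (binary_mode_bytes, text_mode_bytes).
    """
    raw_path = os.path.join(directory, "raw.html")
    text_path = os.path.join(directory, "text.html")

    # Approach 1: the bytes go to disk unchanged.
    with open(raw_path, "wb") as f:
        f.write(body)

    # Approach 2: decode as UTF-8, then let text mode re-encode.
    # newline="" turns off newline translation so only the encoding matters.
    with open(text_path, "w", encoding=text_encoding, newline="") as f:
        f.write(body.decode("utf-8"))

    with open(raw_path, "rb") as f:
        raw_bytes = f.read()
    with open(text_path, "rb") as f:
        text_bytes = f.read()
    return raw_bytes, text_bytes


# Made-up sample page containing one non-ASCII character.
sample = "<p>café</p>\n".encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    raw, same = write_both_ways(sample, tmp, "utf-8")
    _, latin = write_both_ways(sample, tmp, "latin-1")

print(raw == sample)    # binary mode preserves the download
print(same == sample)   # UTF-8 in, UTF-8 out: identical, but a wasted round trip
print(latin == sample)  # a different file encoding changes the bytes on disk

# Bytes that are not valid UTF-8 make the decode step fail outright.
try:
    b"caf\xe9".decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)
```

If the goal is only to save the page, binary mode avoids all of these cases. Decoding is worth doing only when you need to work with the text, and in that case it is safer to pass an explicit encoding to open() rather than rely on the platform default.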
