Memory-conscious way to prepend bytes to a file

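I am trying to write a byte array at the beginning of a file, and at a (much) later point I want to split them apart again to retrieve the original file. The byte array is just a small JPEG.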

# write a byte array at the beginning of a file
def write_byte_array_to_beginning_of_file( byte_array, file_path, out_file_path ):
    with open( file_path, "rb" ) as f:
        with open( out_file_path, "wb" ) as f2:
            f2.write( byte_array )
            f2.write( f.read( ) )

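The function works, but it uses a lot of memory. It seems to read the whole file into memory before doing anything else. I need to process some files larger than 40 GB, and I can only do this on a small NAS with 8 GB of RAM.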

What would be a memory-conscious way to achieve this?

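You can read the original file in chunks instead of reading the whole file at once. That way only one chunk is held in memory at a time, regardless of how large the file is.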

def write_byte_array_to_beginning_of_file( byte_array, file_path, out_file_path, chunksize = 10 * 1024 * 1024 ):
    with open( file_path, "rb" ) as f, open( out_file_path, "wb" ) as f2:
        f2.write( byte_array )
        while True:
            block = f.read(chunksize)
            if not block:
                break
            f2.write(block)

By default this reads the file in 10 MB chunks; you can override that with the chunksize argument.
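
The question also mentions splitting the two parts apart again later. The same chunked approach should work in reverse. The following is only a minimal sketch: the function name split_prefix_from_file and its prefix_len parameter are hypothetical, and it assumes you have recorded the length of the byte array somewhere, since the combined file as written above does not store it. It uses shutil.copyfileobj from the standard library, which copies in fixed-size blocks much like the explicit loop above.

```python
import shutil


def split_prefix_from_file(combined_path, prefix_len, out_file_path,
                           chunksize=10 * 1024 * 1024):
    """Return the first prefix_len bytes and stream the rest to out_file_path.

    prefix_len is an assumption: the caller must know how many bytes were
    prepended, because that length is not stored in the combined file.
    """
    with open(combined_path, "rb") as f, open(out_file_path, "wb") as f2:
        # The prefix is small (a JPEG), so reading it in one call is fine.
        prefix = f.read(prefix_len)
        if len(prefix) != prefix_len:
            raise ValueError("file is shorter than the expected prefix length")
        # Copy the remainder block by block, so at most about chunksize
        # bytes of the original file are held in memory at a time.
        shutil.copyfileobj(f, f2, chunksize)
    return prefix
```

If the prefix length will not be known later, one option would be to write it as a fixed-size header in front of the byte array when creating the combined file, but that changes the file layout used in the answer above.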
