How to iteratively write text data to new files based on the first character of each line



I receive many text files formatted like this:

100000054896524Textext
30000680235498065464065     texttext
50005065321465406546406     16227322
7000056432586846403546854065354096
50046540632146540665406     16268431
7000066543241564786413468464163156
30065406346840654065486     TEXTETXT

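I need to write the contents of these files to new files based on the first character of each line, so that I end up with one file per distinct first character. For the data above, that would be four new files: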

file1.txt:

100000054896524Textext

file3.txt:

30000680235498065464065     texttext
30065406346840654065486     TEXTETXT

file5.txt:

50005065321465406546406     16227322
50046540632146540665406     16268431

file7.txt:

7000056432586846403546854065354096
7000066543241564786413468464163156

I can't seem to figure out how to do this. I have tried the following:

with open('test_file.txt','r') as file_handle:
    file_content = file_handle.read()
with open('file1.txt', 'w') as file_handle:
    for line in file_content:
        if line[0] == '1':
            file_handle.write(line+'\n')
with open('file3.txt', 'w') as file_handle:
    for line in file_content:
        if line[0] == '3':
            file_handle.write(line+'\n')

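and so on for 5 and 7, but this only gives me files containing a bunch of 1s and 3s and none of the data...

What am I not understanding? Many thanks.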

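Use readlines() instead of read() (line 2).

file_handle.read() returns the whole file as a single string, so looping over the result iterates character by character.

file_handle.readlines() returns a list of lines, so looping over the result iterates line by line.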
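
The difference is easy to see on a small in-memory sample. This is a minimal sketch that uses io.StringIO as a stand-in for the real file; the two-line sample text is made up for illustration:

```python
import io

sample = "1000aaa\n3000bbb\n"  # hypothetical stand-in for test_file.txt

# read() returns one string, so a for loop walks it character by character.
chars = [item for item in io.StringIO(sample).read()]

# readlines() returns a list of lines, each still ending in its newline.
lines = [item for item in io.StringIO(sample).readlines()]

print(chars[:3])
print(lines)
```

Because each line returned by readlines() already ends with its newline, write(line) is enough; write(line + '\n') would add an extra blank line after every record.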

Rather than calling open separately for each file, you can use a dictionary. Here is a working example:

output = {}  # first character -> list of lines starting with it
with open('test_file.txt') as f:
    for line in f:
        start_char = line[0]
        if start_char not in output:
            output[start_char] = []
        output[start_char].append(line)
# write each group to its own file, e.g. file1.txt, file3.txt
for start_char in output.keys():
    with open('file{}.txt'.format(start_char), 'w') as f:
        f.writelines(output[start_char])
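
The same grouping can be written a little more compactly with collections.defaultdict. This is only a sketch, not the answer's original code: the function names group_by_first_char and write_groups are made up for illustration, and skipping blank lines is an assumption about the desired behavior.

```python
from collections import defaultdict


def group_by_first_char(lines):
    """Group lines by their first character, keeping the input order."""
    groups = defaultdict(list)
    for line in lines:
        if line.strip():  # assumption: blank lines are skipped
            groups[line[0]].append(line)
    return dict(groups)


def write_groups(groups, pattern="file{}.txt"):
    """Write each group to its own file, e.g. file1.txt, file3.txt."""
    for first_char, group in groups.items():
        with open(pattern.format(first_char), "w") as out:
            out.writelines(group)
```

A call such as write_groups(group_by_first_char(open('test_file.txt'))) would produce the files. Like the dictionary example above, this keeps every line in memory until the end, which is fine for small inputs.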

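read() reads the file as a single string. When you iterate over it, you iterate character by character, not line by line. You can use file_content = file_handle.readlines() to iterate over lines instead of characters.

Rather than duplicating the code for each file, set up a cache and let the script create the files dynamically.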

# will hold open file objects for "file0.txt", ..., "file9.txt"
# as needed
file_cache = [None] * 10
try:
    with open('test_file.txt') as file_handle:
        for line in file_handle:
            num = int(line[0])
            if file_cache[num] is None:
                file_cache[num] = open(f"file{num}.txt", "w")
            file_cache[num].write(line)
# todo: May want to catch exceptions and delete all files on fail
# except:...
finally:
    for fp in file_cache:
        if fp:
            fp.close()

You can open the right output file on demand while reading the input file:

open_files = {}
with open('test_file.txt','r') as file_handle:
    for line in file_handle:
        digit = line[0]
        fname = f'file{digit}.txt'
        if fname in open_files:
            write_file = open_files[fname]
        else:
            open_files[fname] = open(fname, 'w')
            write_file = open_files[fname]
        write_file.write(line)
for write_file in open_files.values():
    write_file.close()
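
One gap in the loop above is that the output files are left open if an exception is raised partway through. A possible variant uses contextlib.ExitStack from the standard library, which closes every file registered with it when the with block exits. This is a sketch under stated assumptions: the function name split_by_first_char and the out_dir parameter are made up for illustration, and blank lines are ignored.

```python
import os
from contextlib import ExitStack


def split_by_first_char(src_path, out_dir="."):
    """Write each line of src_path to file<first char>.txt in out_dir."""
    with ExitStack() as stack:
        source = stack.enter_context(open(src_path))
        outputs = {}  # first character -> open output file
        for line in source:
            if not line.strip():
                continue  # assumption: blank lines are ignored
            key = line[0]
            if key not in outputs:
                path = os.path.join(out_dir, f"file{key}.txt")
                # registering the file makes ExitStack close it on exit
                outputs[key] = stack.enter_context(open(path, "w"))
            outputs[key].write(line)
    # all files are closed here, even if the loop raised
    return sorted(outputs)
```

Calling split_by_first_char('test_file.txt') returns the list of first characters it saw, which makes it easy to check which files were written.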
