I have a data file, and I want to remove the first 3 characters of every word on each line.
Here is an example of my file:
input
"13X5106,18C2295,17C1462,17X4893,14X4215,16C3729,14C1026,END"
"17C2308,14C1030,15C904,20C1602,17C1017,18C1030,END"
"13C2369,20C1505,18X4245,15C1224,14C1031,12C885,17C936,END"
"11C3080,13C4123,16C1180,14C1141,15C932,18C1467,END"
output
"5106,2295,1462,4893,4215,3729,1026,END"
"2308,1030,904,1602,1017,1030,END"
"2369,1505,4245,1224,1031,885,936,END"
"3080,4123,1180,1141,932,1467,END"
I tried to code it, but the output is not what I want.
file1 = open(r'D:\pythonProject\block1.txt', 'r')
data = file1.read()
remove_char = [sub[3:] for sub in data]
print(remove_char)
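If you use file1.readlines(), you then need to split each line on commas. The only catch is that the last item of each line ends up holding the line terminator, because the END string sits at the end of every line. That is easy to get rid of, as shown below:
Code: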
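file1 = open(r'D:\pythonProject\block1.txt', 'r')
# Drop the opening quote, split on commas, then cut the first 3 characters of each item
remove_char = [[s[3:] for s in sub.lstrip('"').split(',')] for sub in file1.readlines()]
for the_list in remove_char:
    # The last item is only what is left of END plus the line terminator, so skip it
    print(the_list[0:-1])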
Output:
['5106', '2295', '1462', '4893', '4215', '3729', '1026']
['2308', '1030', '904', '1602', '1017', '1030']
['2369', '1505', '4245', '1224', '1031', '885', '936']
['3080', '4123', '1180', '1141', '932', '1467']
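To see where that trailing item comes from, here is a minimal sketch that runs without the file. It assumes the lines are quoted as in the question; the io.StringIO stand-in and the two shortened sample lines are mine, not from the original file.

```python
import io

# Hypothetical stand-in for block1.txt, shortened from the question's input
sample = io.StringIO('"13X5106,18C2295,END"\n"17C2308,14C1030,END"\n')

rows = []
for raw in sample.readlines():
    # Drop the opening quote, then split the line on commas
    items = raw.lstrip('"').split(',')
    # The last item is 'END"\n'; after [3:] only '"\n' is left, so slice it off
    rows.append([item[3:] for item in items][:-1])

print(rows)  # [['5106', '2295'], ['2308', '1030']]
```

Slicing with [:-1] works here only because every line ends with END; a line without that marker would lose its last real value.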
I read the file with f.readlines and strip the " characters from each line. Each line is then split on , and every word is processed as word[3:].
with open("...", "r") as f:
    lines = f.readlines()
# Remove the quotes and the trailing newline, then split each line on commas
lines = map(lambda x: x.replace('"', "").strip("\n").split(","), lines)
res = []
for line in lines:
    new_line = []
    for word in line:
        # Leave the END marker untouched; trim every other word
        if word != "END":
            word = word[3:]
        new_line.append(word)
    res.append(",".join(new_line))
res = "\n".join(res)
print(res)
# Output
"""
5106,2295,1462,4893,4215,3729,1026,END
2308,1030,904,1602,1017,1030,END
2369,1505,4245,1224,1031,885,936,END
3080,4123,1180,1141,932,1467,END
"""
You can try printing each line inside a for loop like this:
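file1 = open(r'D:\pythonProject\block1.txt')
data = file1.readlines()
for sub in data:
    # eval() turns the quoted line into a plain string; only use it on input you trust
    line = [j[3:] for i in [eval(sub)] for j in i.split(',')[:-1]] + [eval(sub)[-3:]]
    # chr(44) is a comma; wrap the joined result in double quotes again
    remove_char = f'"{chr(44).join(line)}"'
    print(remove_char)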
Or as a generator expression:
remove_char = '\n'.join('"' + chr(44).join(j[3:] for i in [eval(s)]
                                           for j in i.split(chr(44))[:-1])
                        + ',' + chr(44).join([eval(s)[-3:]]) + '"'
                        for s in open(r'D:\pythonProject\block1.txt').readlines())
print(remove_char)
Output:
"5106,2295,1462,4893,4215,3729,1026,END"
"2308,1030,904,1602,1017,1030,END"
"2369,1505,4245,1224,1031,885,936,END"
"3080,4123,1180,1141,932,1467,END"
Here is a quick solution using a list comprehension:
data = ["13X5106", "18C2295"] # this is a sample list of strings
print([code[3:] for code in data if code != "END"])
This prints the same list of strings with the first three characters dropped from each one, skipping any "END" string:
['5106', '2295']
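If the END marker should stay in the result, as in the question's expected output, a conditional expression can take the place of the filter. A small sketch on the same sample list, with an "END" item appended by me for illustration:

```python
data = ["13X5106", "18C2295", "END"]  # sample list; the "END" item is added for illustration

# Keep "END" unchanged and drop the first three characters of everything else
trimmed = [code if code == "END" else code[3:] for code in data]
print(trimmed)  # ['5106', '2295', 'END']
```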