Converting multi-line txt data to CSV



I was wondering if anyone could point me in the right direction.

Here is a sample of my data.

TRANS,"GUS000017787609","","","INSTL","","","","","","",,"","",20211025,
MTPNT,"",45654,"","","","","",,,
ASSET,"","INSTL","METER","","CR","G4SZV-2","FLN",2020,"XXXTYU422000","32","","LI"

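I need to use Python to somehow convert this kind of information into CSV. I have thousands of lines of data, and each group of TRANS, MTPNT and ASSET lines is considered one "row".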

Does anyone know what the best technique is for performing ETL on this kind of data?

You can use the grouper recipe from the itertools documentation to read 3 CSV rows at a time and combine them into one. For example:

import csv
from itertools import zip_longest, chain


def grouper(iterable, n, fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


with open('input.csv') as f_input, open('output.csv', 'w', newline='') as f_output:
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)

    for triple_row in grouper(csv_input, 3, ''):
        row = list(chain.from_iterable(triple_row))
        #row[2] = 'test'      # modify 3rd value before writing
        csv_output.writerow(row)

This gives you:

TRANS,GUS000017787609,,,INSTL,,,,,,,,,,20211025,,MTPNT,,45654,,,,,,,,,ASSET,,INSTL,METER,,CR,G4SZV-2,FLN,2020,XXXTYU422000,32,,LI
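The grouper approach assumes every record is exactly three lines long. If some records might be missing a line or carry extra ones, one possible alternative is to start a new output row whenever a line begins with TRANS. The following is only a sketch under that assumption: `merge_records` and its `start_marker` parameter are hypothetical names, and the second record in the sample is made up for illustration.

```python
import csv
import io


def merge_records(rows, start_marker="TRANS"):
    """Yield one combined row per record; a record starts at start_marker."""
    current = []
    for row in rows:
        if not row:
            continue  # csv.reader yields [] for blank lines; skip them
        if row[0] == start_marker and current:
            yield current  # previous record is complete
            current = []
        current.extend(row)
    if current:
        yield current  # emit the final record


# First record is the sample from the question; the second is invented
# and deliberately has no ASSET line.
sample = '''TRANS,"GUS000017787609","","","INSTL","","","","","","",,"","",20211025,
MTPNT,"",45654,"","","","","",,,
ASSET,"","INSTL","METER","","CR","G4SZV-2","FLN",2020,"XXXTYU422000","32","","LI"
TRANS,"GUS000017787610","","","INSTL","","","","","","",,"","",20211026,
MTPNT,"",45655,"","","","","",,,
'''

merged = list(merge_records(csv.reader(io.StringIO(sample))))

out = io.StringIO()
csv.writer(out).writerows(merged)
print(out.getvalue())
```

With real files you would pass `csv.reader(f_input)` instead of the `io.StringIO` sample and write to `f_output` as in the answer above. Rows built this way can differ in length, so downstream tools that expect a fixed number of columns may need padding.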
