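Hi, I have a CSV file with 622 columns. I need to split it into files of 100 columns each. I have tried some code, but I am not getting the expected output. Please tell me where I went wrong.
For example: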
file - cleanBaby.csv
In the output I want:
cleanBaby1.csv ---- 1 to 100 columns
cleanBaby2.csv ---- 101 to 200
cleanBaby3.csv ---- 201 to 300
cleanBaby4.csv ---- 301 to 400
cleanBaby5.csv ---- 401 to 500
cleanBaby6.csv ---- 501 to 600
cleanBaby7.csv ---- 601 to 622
I tried the code below:
df = pd.read_csv(r"D:UsersSPate233DownloadsiMedicalraw_layercleanBaby.csv", delimiter=',')
lst = df.columns
print(len(lst))
csvfile = open(r'D:UsersSPate233DownloadsiMedicalraw_layercleanBaby.csv', 'r', encoding='utf-8').readlines()

def chunks(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

for n, headers_chunk in enumerate(chunks(lst, 100)):
    with open(r"D:UsersSPate233DownloadsiMedicalraw_layercleanBaby{}.csv".format(n), "w") as f:
        for header in headers_chunk:
            f.write(header + ",")
        f.writelines(csvfile[n+100:n])
One way, using pandas.DataFrame.groupby and enumerate:
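import numpy as np
import pandas as pd

# Sample data
df = pd.DataFrame(np.random.random((500, 622)))
# Make grouper: column i belongs to group i // 100
indices = np.arange(df.shape[1]) // 100
# Note: groupby(..., axis=1) is deprecated since pandas 2.1
for n, (k, d) in enumerate(df.groupby(indices, axis=1), 1):
    name = "test%s.csv" % n
    print(name, d.shape)
    d.to_csv(name, index=False)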
Output:
test1.csv (500, 100)
test2.csv (500, 100)
test3.csv (500, 100)
test4.csv (500, 100)
test5.csv (500, 100)
test6.csv (500, 100)
test7.csv (500, 22)
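As for what went wrong in the question's code: csvfile is a list of rows (lines), so csvfile[n+100:n] slices rows rather than columns, and because the start index is larger than the stop index the slice is always empty. Only the header names end up in each file.

If your pandas version warns about or rejects axis=1 in groupby, a plain positional slice with DataFrame.iloc gives the same split. This is only a sketch: split_columns and its parameters are names made up for illustration, and the random frame stands in for the real cleanBaby.csv.

```python
import numpy as np
import pandas as pd


def split_columns(df, chunk_size, prefix):
    """Write consecutive blocks of `chunk_size` columns to numbered CSV files.

    Hypothetical helper, not part of pandas. Returns a list of
    (file name, shape) pairs for the files it wrote.
    """
    written = []
    # start takes the values 0, 100, 200, ...; numbering of files begins at 1
    for n, start in enumerate(range(0, df.shape[1], chunk_size), 1):
        # iloc slices by position; the last block is simply shorter
        part = df.iloc[:, start:start + chunk_size]
        name = "%s%d.csv" % (prefix, n)
        part.to_csv(name, index=False)
        written.append((name, part.shape))
    return written


# Sample data standing in for cleanBaby.csv (assumption: 622 columns)
df = pd.DataFrame(np.random.random((5, 622)))
results = split_columns(df, 100, "cleanBaby")
for name, shape in results:
    print(name, shape)
```

With the real file you would replace the sample frame with pd.read_csv on your path and pass whatever output prefix (including a folder) you need.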