I have a CSV file, test.csv, with 4 columns, like this:
A | B | C | D
======================
aed | etge | 3r4 | pu9
frt | eide | 9h4 | sd2
jey | edlr | 8d2 | bu6
Using Python, I want to append column B below column A and column D below column C, so that I end up with the following:
A | C
===========
aed | 3r4
frt | 9h4
jey | 8d2
etge | pu9
eide | sd2
edlr | bu6
I suggest using pandas.
Try something like this:
import pandas as pd
dataFrame = pd.DataFrame({"A":["aed","etge","3r4"],
"B":["aed","etawge","3r4"],
"C":["aed","etgase","3r4"],
"D":["aed","etgqee","3r4"],})
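# stack B under A and D under C; ignore_index renumbers the rows 0, 1, 2, ...
AB = pd.concat([dataFrame["A"],dataFrame["B"]], ignore_index=True)
CD = pd.concat([dataFrame["C"],dataFrame["D"]], ignore_index=True)
# put the two stacked columns side by side and name them "A" and "C"
final_dataFrame = pd.concat([AB,CD], axis=1)
final_dataFrame.columns=["A","C"]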
I did not use exactly the same data as you, but this shows how to do it. You can read the CSV file with pandas.read_csv.
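For reference, here is a minimal sketch of the same concat approach applied to the rows from the question. The data is typed in by hand rather than read from test.csv, and the names df, ab, cd and result are only illustrative. ignore_index=True is used so that the stacked rows are numbered 0 to 5.

```python
import pandas as pd

# The rows from test.csv in the question, typed in by hand (not read from disk).
df = pd.DataFrame({
    "A": ["aed", "frt", "jey"],
    "B": ["etge", "eide", "edlr"],
    "C": ["3r4", "9h4", "8d2"],
    "D": ["pu9", "sd2", "bu6"],
})

# Stack B under A and D under C; ignore_index renumbers the rows 0..5.
ab = pd.concat([df["A"], df["B"]], ignore_index=True)
cd = pd.concat([df["C"], df["D"]], ignore_index=True)

# Build the two-column result from the stacked Series.
result = pd.DataFrame({"A": ab, "C": cd})
print(result)
```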
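Edit: if you want to read directly from the file, you first have to change the file so that it no longer contains the "=====" line, so it should look like this: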
A | B | C | D
aed | etge | 3r4 | pu9
frt | eide | 9h4 | sd2
jey | edlr | 8d2 | bu6
Once that is done, do this:
import pandas as pd
# read the file. If test.csv is not in the same folder, give the complete file path.
dataFrame = pd.read_csv("test.csv", sep="|")
# remove unnecessary white spaces.
dataFrame = dataFrame.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
# create a new column by combining column 0 and 1.
AB = pd.melt(dataFrame.iloc[:, [0, 1]])["value"]
# create a new column by combining column 2 and 3.
CD = pd.melt(dataFrame.iloc[:, [2, 3]])["value"]
# combine the previous two columns
final_dataFrame = pd.concat([AB, CD], axis=1)
# give them names "A" and "C"
final_dataFrame.columns = ["A", "C"]
print(final_dataFrame)
If you are not worried about readability, you can combine the steps like this:
dataFrame = pd.read_csv("test.csv", sep="|").apply(lambda x: x.str.strip() if x.dtype == "object" else x)
final_dataFrame = pd.concat([pd.melt(dataFrame.iloc[:, [0, 1]])["value"], pd.melt(dataFrame.iloc[:, [2, 3]])["value"]], axis=1)
final_dataFrame.columns = ["A", "C"]
print(final_dataFrame)
This gives the result:
      A    C
0   aed  3r4
1   frt  9h4
2   jey  8d2
3  etge  pu9
4  eide  sd2
5  edlr  bu6
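If editing the file by hand is not convenient, one possible alternative is to let read_csv skip the separator line itself. The sketch below assumes the "=====" line is always the second line of the file, and it uses an in-memory string in place of test.csv so that it runs on its own. The names raw, df and stacked are illustrative. A regular-expression separator needs engine="python".

```python
import io

import pandas as pd

# Stand-in for test.csv, including the "=====" line
# (assumed to always be the second line of the file).
raw = """A | B | C | D
======================
aed | etge | 3r4 | pu9
frt | eide | 9h4 | sd2
jey | edlr | 8d2 | bu6
"""

# skiprows=[1] drops the separator line (rows are counted from 0).
# The regex separator also consumes the spaces around each "|",
# so the header and the values come back without padding.
df = pd.read_csv(io.StringIO(raw), sep=r"\s*\|\s*", engine="python", skiprows=[1])

# Stack B under A and D under C, renumbering the rows 0..5.
stacked = pd.DataFrame({
    "A": pd.concat([df["A"], df["B"]], ignore_index=True),
    "C": pd.concat([df["C"], df["D"]], ignore_index=True),
})
print(stacked)
```

To read the real file, pass "test.csv" in place of io.StringIO(raw).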