I have 2 dataframes in Python Pandas, as follows:
DF1
COL1 | ... | COLn
-----|------|-------
A | ... | ...
B | ... | ...
A | ... | ...
.... | ... | ...
DF2
G1 | G2
----|-----
A | 1
B | 2
C | 3
D | 4
I need to replace the values in DF1 COL1 with the corresponding values from DF2 G2. So I need DF1 to look like this:
COL1 | ... | COLn
-----|------|-------
1 | ... | ...
2 | ... | ...
1 | ... | ...
.... | ... | ...
Of course, my tables are large, so it would be great to do this automatically rather than adjusting the values by hand :)
How can I do this in Python Pandas?
import pandas as pd
df1 = pd.DataFrame({"COL1": ["A", "B", "A"]}) # Add more columns as required
df2 = pd.DataFrame({"G1": ["A", "B", "C", "D"], "G2": [1, 2, 3, 4]})
df1["COL1"] = df1["COL1"].map(df2.set_index("G1")["G2"])
Output of df1:
COL1
0 1
1 2
2 1
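One caveat with the map approach above: any value in COL1 that has no matching G1 entry becomes NaN. The following is a minimal sketch of how to spot such values and, if desired, keep the original value instead. The "Z" row and the names lookup, mapped and unmatched are made up for illustration and are not part of the question.

```python
import pandas as pd

# "Z" is a hypothetical value with no match in df2["G1"]
df1 = pd.DataFrame({"COL1": ["A", "B", "Z"]})
df2 = pd.DataFrame({"G1": ["A", "B", "C", "D"], "G2": [1, 2, 3, 4]})

# Build a Series indexed by G1 so it can act as a lookup table
lookup = df2.set_index("G1")["G2"]

# Values without a match in the lookup come back as NaN
mapped = df1["COL1"].map(lookup)

# List the values that could not be translated
unmatched = df1.loc[mapped.isna(), "COL1"].unique().tolist()

# Optional: fall back to the original value where no match was found
df1["COL1"] = mapped.fillna(df1["COL1"])
```

Note that with the fallback the column ends up holding mixed types (numbers and strings), so whether to keep the originals or leave NaN depends on what happens to the column afterwards.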
You can try the DataFrame assign or update methods:
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df2 = pd.DataFrame({'B': [7, 8, 9]})
Try
df1 = df1.assign(B=df2['B'])  # assign returns a new DataFrame
or
df1.update(df2)  # update modifies df1 in place
Here are the links to the documentation:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.assign.html
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.update.html
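To connect assign back to the original question, here is a rough sketch that combines it with a G1-to-G2 lookup. It assumes the same DF1/DF2 layout as in the question; the names lookup and result are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"COL1": ["A", "B", "A"]})
df2 = pd.DataFrame({"G1": ["A", "B", "C", "D"], "G2": [1, 2, 3, 4]})

# Series indexed by G1, used as a lookup table for G2
lookup = df2.set_index("G1")["G2"]

# assign returns a new DataFrame; df1 itself is left unchanged
result = df1.assign(COL1=df1["COL1"].map(lookup))
```

This may be handy when the original df1 should stay untouched, for example to compare the columns before and after the replacement.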