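I have two pandas DataFrames: one holds the data I want to update, and the other provides a lookup, keyed by a MultiIndex, for the values to set.
For example, I have two CSV files:
fruit.csv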
Fruit,Color,State,more,data
Apple,Red,Good,etc.,etc.
Apple,Green,Mouldy,etc.,etc.
Apple,Green,Excellent,etc.,etc.
Pear,Red,Excellent,etc.,etc.
Pear,Green,Good,etc.,etc.
Lime,Green,Bad,etc.,etc.
rating.csv
Fruit,State,Rating
Apple,Excellent,11
Apple,Good,8
Apple,Bad,4
Apple,Mouldy,0
Pear,Excellent,9
Pear,Good,5
Pear,Bad,2
Pear,Mouldy,1
Lime,Excellent,10
Lime,Good,7
Lime,Bad,5
Lime,Mouldy,2
I have read them into DataFrames:
from pathlib import Path
import pandas as pd
static_data_dir = Path(__file__).resolve().parent
fruit = pd.read_csv(static_data_dir.joinpath("fruit.csv"), index_col=["Fruit","Color"])
rating = pd.read_csv(static_data_dir.joinpath("rating.csv"), index_col=["Fruit","State"])
                 State  more  data
Fruit Color
Apple Red         Good  etc.  etc.
      Green     Mouldy  etc.  etc.
      Green  Excellent  etc.  etc.
Pear  Red    Excellent  etc.  etc.
      Green       Good  etc.  etc.
Lime  Green        Bad  etc.  etc.
                 Rating
Fruit State
Apple Excellent      11
      Good            8
      Bad             4
      Mouldy          0
Pear  Excellent       9
      Good            5
      Bad             2
      Mouldy          1
Lime  Excellent      10
      Good            7
      Bad             5
      Mouldy          2
Now I want to replace the State values in the fruit DataFrame with the Rating values from the rating DataFrame, giving the following result.
             State  more  data
Fruit Color
Apple Red        8  etc.  etc.
      Green      0  etc.  etc.
      Green     11  etc.  etc.
Pear  Red        9  etc.  etc.
      Green      5  etc.  etc.
Lime  Green      5  etc.  etc.
Effectively, I want to use pandas.Series.replace, but pass in a dict with tuple keys, which does not seem to be supported.
{'Rating': {('Apple', 'Bad'): 4,
('Apple', 'Excellent'): 11,
('Apple', 'Good'): 8,
('Apple', 'Mouldy'): 0,
('Lime', 'Bad'): 5,
('Lime', 'Excellent'): 10,
('Lime', 'Good'): 7,
('Lime', 'Mouldy'): 2,
('Pear', 'Bad'): 2,
('Pear', 'Excellent'): 9,
('Pear', 'Good'): 5,
('Pear', 'Mouldy'): 1}}
How can I best achieve this?
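Read the two CSV files as ordinary DataFrames, then merge them on the Fruit and State columns, setting how="left" so that the keys come from the fruit DataFrame. This relies on each (Fruit, State) pair appearing only once in rating, so that the merged result has exactly one row per row of fruit, in the same order. Finally, set the Fruit and Color columns as the index.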
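import pandas as pd
# Read both files with the default RangeIndex (no index_col yet)
fruit = pd.read_csv("fruit.csv")
rating = pd.read_csv("rating.csv")
# Left merge keeps every fruit row in its original order; Rating replaces State
fruit['State'] = fruit.merge(rating, on=["Fruit", "State"], how="left")["Rating"]
# Restore the (Fruit, Color) MultiIndex from the question
fruit.set_index(["Fruit","Color"], inplace=True)
print(fruit)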
             State  more  data
Fruit Color
Apple Red        8  etc.  etc.
      Green      0  etc.  etc.
      Green     11  etc.  etc.
Pear  Red        9  etc.  etc.
      Green      5  etc.  etc.
Lime  Green      5  etc.  etc.
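If you would rather keep both frames indexed the way the question loads them, one possible alternative is to build a (Fruit, State) key for every row of fruit and look it up in rating with reindex. This is only a sketch, not the method from the answer above: the CSV contents are inlined through io.StringIO so the snippet runs on its own, and the names fruit_csv, rating_csv and keys are illustrative assumptions.

```python
import io

import pandas as pd

# Inline copies of the two CSV files from the question (assumption: same contents).
fruit_csv = """Fruit,Color,State,more,data
Apple,Red,Good,etc.,etc.
Apple,Green,Mouldy,etc.,etc.
Apple,Green,Excellent,etc.,etc.
Pear,Red,Excellent,etc.,etc.
Pear,Green,Good,etc.,etc.
Lime,Green,Bad,etc.,etc.
"""
rating_csv = """Fruit,State,Rating
Apple,Excellent,11
Apple,Good,8
Apple,Bad,4
Apple,Mouldy,0
Pear,Excellent,9
Pear,Good,5
Pear,Bad,2
Pear,Mouldy,1
Lime,Excellent,10
Lime,Good,7
Lime,Bad,5
Lime,Mouldy,2
"""

# Same index layout as in the question.
fruit = pd.read_csv(io.StringIO(fruit_csv), index_col=["Fruit", "Color"])
rating = pd.read_csv(io.StringIO(rating_csv), index_col=["Fruit", "State"])

# One (Fruit, State) key per row of `fruit`, in row order.
keys = pd.MultiIndex.from_arrays(
    [fruit.index.get_level_values("Fruit"), fruit["State"]],
    names=["Fruit", "State"],
)

# reindex looks each key up in rating's MultiIndex; a key with no match would
# give NaN. to_numpy() makes the assignment positional, which matters because
# fruit's own index contains duplicates such as (Apple, Green).
fruit["State"] = rating["Rating"].reindex(keys).to_numpy()

print(fruit)
```

With the sample data every key is found, so the State column ends up holding the ratings 8, 0, 11, 9, 5, 5, the same values the merge produces. Like the merge, this assumes each (Fruit, State) pair is unique in rating.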