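One of my dataframes is:
    name value
0  Harry     a
1  Kenny     b
2   Zoey     h
The other one is:
                list   topic
0  Jame, Harry, Noah  topic1
1           lee, zee  topic2
If any name in dataframe1 appears in the list column of dataframe2, I want to add a column named 'present' to dataframe1 whose value is the corresponding topic: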
    name value present
0  Harry     a  topic1
1  Kenny     b    none
2   Zoey     h    none
Updated question
df1
        name value
0  Harry Lee     a
1      Kenny     b
2       Zoey     h
df2 is the same, and the expected result is:
        name value        present
0  Harry Lee     a  topic1 topic2
1      Kenny     b           none
2       Zoey     h           none
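We need to split the list column of df2 and flatten it with explode; then we can use map:
df2['list'] = df2['list'].str.split(r',\s*')  # split on commas and drop the space after each one
s = df2.explode('list')  # one row per individual name
df1['present'] = df1['name'].map(dict(zip(s['list'], s['topic'])))
df1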
Out[550]:
    name value present
0  Harry     a  topic1
1  Kenny     b     NaN
2   Zoey     h     NaN
import pandas as pd
import numpy as np

df1 = pd.DataFrame({"name": ['Harry', 'Kenny', 'Zoey'], "value": ["a", "b", "h"]})
df2 = pd.DataFrame({"list": ["Jame, Harry, Noah", "lee, zee"], "topic": ["topic1", "topic2"]})

def add_column(x):
    # Return the topic of the first df2 row whose list contains x (substring match)
    try:
        present = df2[df2['list'].str.contains(x)].iloc[0, 1]
    except IndexError:
        # No row matched, so iloc[0, 1] had nothing to index
        present = np.nan
    return present

df1['present'] = df1['name'].apply(add_column)
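One caveat with the approach above: str.contains does a substring (regex) match, so a partial name such as "Ha" would also match "Harry". A minimal sketch of a whole-word variant follows, assuming names contain no punctuation that should count as part of a word. The helper name find_topic and the test value "Ha" are hypothetical, chosen only for illustration.

```python
import re

import numpy as np
import pandas as pd

df1 = pd.DataFrame({"name": ["Harry", "Kenny", "Zoey"], "value": ["a", "b", "h"]})
df2 = pd.DataFrame({"list": ["Jame, Harry, Noah", "lee, zee"], "topic": ["topic1", "topic2"]})


def find_topic(name):
    # \b anchors restrict the match to whole words;
    # re.escape keeps regex metacharacters in the name literal
    pattern = rf"\b{re.escape(name)}\b"
    hits = df2.loc[df2["list"].str.contains(pattern, regex=True), "topic"]
    # First matching topic, or NaN when nothing matched
    return hits.iloc[0] if not hits.empty else np.nan


df1["present"] = df1["name"].apply(find_topic)
print(df1)
print(find_topic("Ha"))  # hypothetical partial name: no whole-word match
```

Like the original function, this still scans df2 once per name, so for large frames the explode-based approaches are likely to scale better.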
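You can explode the list column of the second dataframe, merge it back into the first dataframe with a left join on the name and list columns, then drop the list column and fill the missing values with 'None' using fillna. Note that explode only expands list-like values; if the list column holds comma-separated strings, split it first (for example with str.split):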
df1.merge(df2.explode('list'), left_on='name', right_on='list', how='left').drop(columns='list').fillna('None')
    name value   topic
0  Harry     a  topic1
1  Kenny     b    None
2   Zoey     h    None
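For the updated question, where a name such as Harry Lee should pick up both topic1 and topic2, here is a minimal sketch. It assumes the list column holds comma-separated strings, that matching is done per word and case-insensitively (df1 has "Lee", df2 has "lee"), and that multiple topics are joined with a space. The names lookup and topics_for are hypothetical and do not come from the answers above.

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["Harry Lee", "Kenny", "Zoey"], "value": ["a", "b", "h"]})
df2 = pd.DataFrame({"list": ["Jame, Harry, Noah", "lee, zee"], "topic": ["topic1", "topic2"]})

# One row per individual name in df2, lower-cased for case-insensitive lookup
s = df2.assign(list=df2["list"].str.lower().str.split(r",\s*")).explode("list")
lookup = dict(zip(s["list"], s["topic"]))


def topics_for(full_name):
    # Look up every word of the name and collect the topics that were found
    found = [lookup[w] for w in full_name.lower().split() if w in lookup]
    # dict.fromkeys removes duplicate topics while keeping their order
    return " ".join(dict.fromkeys(found)) or "none"


df1["present"] = df1["name"].apply(topics_for)
print(df1)
```

If a name appears under more than one topic in df2, the dict keeps only the last one; grouping the exploded frame by name would be one way to retain all of them.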