从另一个数据帧中获取描述，该数据帧的代码列为panda

输入：

df1 = pd.DataFrame([[101, 'DC1', 'QA,DEMO'],
[101, 'EM5', 'QA,DEMO'],
[102, 'RA6', 'PA,QA,GF'],
[103, 'DC1', 'AB,LK'],
[103, 'RA6', 'OO'],
[103, 'PA4', 'AB,OO']
]
, columns=['Call_id', 'Agent_id', 'Task_code'])
df2 = pd.DataFrame([['QA', 'Enter phone number'],
['DEMO', 'ASK EMAIL ID'],
['PA', 'make notes'],
['GF', 'take call back'],
['AB', 'Apply bell mark'],
['LK', 'call Low markets'],
['OO','Out of order']
]
, columns=['Task_code', 'Task_Desc'])

输出：

df = pd.DataFrame([[101, 'DC1', 'QA,DEMO','Enter phone number and ASK EMAIL ID'],
[101, 'EM5', 'QA,DEMO','Enter phone number and ASK EMAIL ID'],
[102, 'RA6', 'PA,QA,GF','make notes and Enter phone number and take call back'],
[103, 'DC1', 'AB,LK','Apply bell mark and call Low markets'],
[103, 'RA6', 'OO','Out of order'],
[103, 'PA4', 'AB,OO','Apply bell mark and Out of order']
]
, columns=['Call_id', 'Agent_id', 'Task_code','Task_desc'])

我想在Python 3.6 中为每个代理id和call_id在df1上将TASK_desc与"and"合并

task_mapper = dict(zip(df2["Task_code"],df2["Task_Desc"]))
def get_description(values):
values = values.strip().split(",")
values = list(filter(None,values))
descriptions = [task_mapper.get(value,"") for value in values]
return " and ".join(descriptions)
df1["Task_desc"] = df1["Task_code"].apply(get_description)
print(df1) # your output

您可以创建字典，用于按,映射拆分的值，然后按and分隔符连接值：

d = df2.set_index('Task_code')['Task_Desc']
f = lambda x:  ' and '.join(d.get(y) for y in x.split(',') if y in d)
df1['Task_Desc'] = df1['Task_code'].apply(f)
print (df1)
Call_id Agent_id Task_code  
0      101      DC1   QA,DEMO   
1      101      EM5   QA,DEMO   
2      102      RA6  PA,QA,GF   
3      103      DC1     AB,LK   
4      103      RA6        OO   
5      103      PA4     AB,OO   
Task_Desc  
0                Enter phone number and ASK EMAIL ID  
1                Enter phone number and ASK EMAIL ID  
2  make notes and Enter phone number and take cal...  
3               Apply bell mark and call Low markets  
4                                       Out of order  
5                   Apply bell mark and Out of order

或者us panda函数用于分解列、合并和聚合具有不同分隔符的join：

f1 = lambda x: ' and '.join(x.unique())
f2 = lambda x: ','.join(x.unique())
df = (df1.assign(Task_code = df1['Task_code'].str.split(','))
.explode('Task_code')
.merge(df2, on='Task_code')
.groupby(['Call_id','Agent_id'])
.agg({'Task_code':f2, 'Task_Desc':f1})
.reset_index())
print (df)
Call_id Agent_id Task_code  
0      101      DC1   QA,DEMO   
1      101      EM5   QA,DEMO   
2      102      RA6  QA,PA,GF   
3      103      DC1     AB,LK   
4      103      PA4     AB,OO   
5      103      RA6        OO   
Task_Desc  
0                Enter phone number and ASK EMAIL ID  
1                Enter phone number and ASK EMAIL ID  
2  Enter phone number and make notes and take cal...  
3               Apply bell mark and call Low markets  
4                   Apply bell mark and Out of order  
5                                       Out of order

这是我的方法：拆分为列表

df1['desc'] = df1.Task_code.str.split(',')

制作字典以查找

dictionary = dict(zip(df2.Task_code,df2.Task_Desc))
def descdic(l):
return ' and '.join([dictionary[i] for i in l])

应用于数据

df1.desc = df1.desc.apply(lambda x: descdic(x))

相关内容

最新更新

热门标签：