I have the following DataFrame:
MID C_PG ACT
GOAL PERFORMANCE_GOAL_V2 ['view_goal,view_card']
GOAL PERFORMANCE_GOAL_V2 ['expand,view_goal,select,add_activity']
GOAL PERFORMANCE_GOAL_V2 ['view_goal_list']
I want to convert it into the following DataFrame by splitting the strings in the 'ACT' column and renaming the resulting columns:
MID C_PG step1 step2 step3 step4
GOAL PERFORMANCE_GOAL_V2 view_goal view_card na na
GOAL PERFORMANCE_GOAL_V2 expand view_goal select add_activity
GOAL PERFORMANCE_GOAL_V2 view_goal_list na na na
Here is what I have tried:
df = df.set_index(['MID', 'C_PG']).apply(lambda x: str(x).split(',', expand=True))
But I got an error:
'expand' is an invalid keyword argument for split()
Can anyone provide a solution?
OK, make sure to import re first:
import re
# Strip the list brackets and quotes from each string before splitting
df['ACT'] = df['ACT'].apply(lambda x: re.sub(r"\[|\]|'", '', x))
df = df.join(df['ACT'].str.split(',', expand=True).add_prefix('step')).fillna('na')
df = df.drop(columns=['ACT'])
Output:
MID C_PG step0 step1 step2 step3
0 GOAL PERFORMANCE_GOAL_V2 view_goal view_card na na
1 GOAL PERFORMANCE_GOAL_V2 expand view_goal select add_activity
2 GOAL PERFORMANCE_GOAL_V2 view_goal_list na na na
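
A note on the original error: `str(x).split(...)` calls Python's built-in `str.split`, which has no `expand` parameter. That keyword belongs to the pandas accessor `Series.str.split`. The output above also numbers the columns from `step0`, while the question asked for `step1` through `step4`. Below is a minimal, self-contained sketch of one way to get that numbering. It assumes 'ACT' holds plain strings that look like lists (as in the answer above). The sample data is rebuilt from the question, and the names `steps` and `result` are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question. Assumption: ACT holds strings
# that merely look like Python lists, e.g. "['view_goal,view_card']".
df = pd.DataFrame({
    "MID": ["GOAL"] * 3,
    "C_PG": ["PERFORMANCE_GOAL_V2"] * 3,
    "ACT": [
        "['view_goal,view_card']",
        "['expand,view_goal,select,add_activity']",
        "['view_goal_list']",
    ],
})

# Remove brackets and quotes, then split on commas into separate columns.
steps = (
    df["ACT"]
    .str.replace(r"[\[\]']", "", regex=True)
    .str.split(",", expand=True)
)

# Number the new columns from 1 instead of 0: step1, step2, ...
steps.columns = [f"step{i + 1}" for i in range(steps.shape[1])]

# Drop the original column, attach the step columns, fill the gaps with 'na'.
result = df.drop(columns=["ACT"]).join(steps).fillna("na")
print(result)
```

If 'ACT' actually contains real Python lists rather than strings, the regex step would not apply as written and the column would need to be unpacked differently.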