Return the users who are in one step but not in the next step



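I have the sample dataset below. It is the result of a groupby: I grouped by Steps and CampaignSource and returned the UserIds of each group as a set.

df2 = df[['CampaignSource', 'UserId', 'Steps']].groupby(['Steps', 'CampaignSource'], as_index=False).agg(lambda x: set(x))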

Steps    CampaignSource  Set_UserId
Step-1   Apple           "Jeff", "John", "Antonio", "Jon"
Step-1   Banana          "Jeff", "John", "Antonio", "Jon"
Step-1   Potato          "Jeff", "John", "Antonio", "Jon"
Step-2   Apple           "Jeff", "John"
Step-2   Banana          "Jeff", "John", "Antonio"
Step-2   Potato          "Jeff", "John"
Step-3   Apple           "Jeff"
Step-3   Banana          "Jeff", "John"
Step-3   Potato          "Jeff"
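import pandas as pd

# df is the grouped frame shown above (columns Steps, CampaignSource, Set_UserId),
# with the rows of each source already ordered by step.
# Collect the steps and user sets of each source into lists (one row per source).
per_source = df.groupby("CampaignSource").agg(list)

# For each source, pair every step except the last with the users missing from the next step.
rows = per_source.apply(
    lambda x: list(zip(
        [x.name] * len(x["Steps"]),
        x["Steps"][:-1],
        [list(set(s) - set(x["Set_UserId"][i + 1])) for i, s in enumerate(x["Set_UserId"][:-1])],
    )),
    axis=1,
)

# Flatten the per-source lists into one frame: (source, step, users who dropped out).
pd.DataFrame([item for sub in rows for item in sub])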

It is a little convoluted :) if you want that exact shape. If a different output shape works for you, it can be simpler.
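As a rough illustration of what a simpler version might look like, here is a minimal sketch that loops over each source and subtracts the next step's set from the current one. The helper name `dropped_users` and the output column `Dropped` are made up for this example, the sample rows are transcribed from the table in the question, and it assumes that sorting the `Steps` labels as strings gives the right order (true for Step-1 to Step-3, not for Step-10 and beyond).

```python
import pandas as pd

# Sample data transcribed from the table in the question.
data = [
    ("Step-1", "Apple", {"Jeff", "John", "Antonio", "Jon"}),
    ("Step-1", "Banana", {"Jeff", "John", "Antonio", "Jon"}),
    ("Step-1", "Potato", {"Jeff", "John", "Antonio", "Jon"}),
    ("Step-2", "Apple", {"Jeff", "John"}),
    ("Step-2", "Banana", {"Jeff", "John", "Antonio"}),
    ("Step-2", "Potato", {"Jeff", "John"}),
    ("Step-3", "Apple", {"Jeff"}),
    ("Step-3", "Banana", {"Jeff", "John"}),
    ("Step-3", "Potato", {"Jeff"}),
]
df = pd.DataFrame(data, columns=["Steps", "CampaignSource", "Set_UserId"])


def dropped_users(grouped):
    """Hypothetical helper: users present in a step but absent from the next one."""
    rows = []
    for source, g in grouped.groupby("CampaignSource", sort=True):
        # Assumption: string sort order of the step labels matches the step order.
        g = g.sort_values("Steps")
        steps = g["Steps"].tolist()
        users = g["Set_UserId"].tolist()
        # Pair each step with its successor; the last step has none and is skipped.
        for step, current, following in zip(steps, users, users[1:]):
            rows.append((source, step, current - following))
    return pd.DataFrame(rows, columns=["CampaignSource", "Steps", "Dropped"])


result = dropped_users(df)
print(result)
```

Each output row names a source, a step, and the set of users who were in that step but did not reach the following one, so the final step of every source produces no row.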
