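I have the base table below, and I want to split it into two tables: one for users who have guava and one for users who do not. I am thinking of using a flag to build the intermediate table shown further down, but I am not sure where to go from there.
Base table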
user_id fruit
user1 passionfruit
user1 guava
user1 banana
user2 orange
user2 coconut
user3 guava
user4 melon
With guava
user_id fruit
user1 passionfruit
user1 guava
user1 banana
user3 guava
Without guava
user_id fruit
user2 orange
user2 coconut
user4 melon
Intermediate table
user_id fruit has_guava
user1 passionfruit 0
user1 guava 1
user1 banana 0
user2 orange 0
user2 coconut 0
user3 guava 1
user4 melon 0
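Without groupby, check membership with isin: first collect the user_id values that have a guava row, then keep every row belonging to those users.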
out = df[df.user_id.isin(df.loc[df.fruit.isin(['guava']),'user_id'])]
Out[322]:
user_id fruit
0 user1 passionfruit
1 user1 guava
2 user1 banana
5 user3 guava
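The line above only produces the "with guava" table. A minimal sketch of getting both tables from the same isin check is to store the boolean mask and negate it with ~. The sample df is rebuilt from the base table in the question, and the names guava_users, mask, with_guava and without_guava are only illustrative.

```python
import pandas as pd

# Rebuild the base table from the question
df = pd.DataFrame({
    "user_id": ["user1", "user1", "user1", "user2", "user2", "user3", "user4"],
    "fruit": ["passionfruit", "guava", "banana", "orange", "coconut", "guava", "melon"],
})

# user_id values that have at least one guava row
guava_users = df.loc[df["fruit"].eq("guava"), "user_id"]

# True for every row whose user appears in guava_users
mask = df["user_id"].isin(guava_users)

with_guava = df[mask]       # all rows of users who have guava
without_guava = df[~mask]   # all rows of the remaining users
```

Keeping the mask in a variable means the guava lookup runs once and the two tables are guaranteed to be exact complements of each other.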
Try groupby first, then filter: keep each user_id group in which at least one fruit equals 'guava'.
df_ = (
    df
    .groupby('user_id')
    .filter(lambda group: group['fruit'].eq('guava').any())
)
print(df_)
user_id fruit
0 user1 passionfruit
1 user1 guava
2 user1 banana
5 user3 guava
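If you would rather continue from the has_guava flag in your intermediate table, one possible route is to spread the flag to every row of the same user with groupby(...).transform("max") and then split on the result. This is only a sketch under the assumption that the flag is a 0/1 integer column as in the question; user_has_guava, with_guava and without_guava are illustrative names.

```python
import pandas as pd

# Rebuild the base table from the question
df = pd.DataFrame({
    "user_id": ["user1", "user1", "user1", "user2", "user2", "user3", "user4"],
    "fruit": ["passionfruit", "guava", "banana", "orange", "coconut", "guava", "melon"],
})

# Row-level flag, matching the intermediate table in the question
df["has_guava"] = df["fruit"].eq("guava").astype(int)

# Per-user flag: the group maximum is 1 if any row of that user is guava,
# and transform broadcasts it back onto every row of the group
user_has_guava = df.groupby("user_id")["has_guava"].transform("max").astype(bool)

# Split on the per-user flag and drop the helper column from the results
with_guava = df.loc[user_has_guava, ["user_id", "fruit"]]
without_guava = df.loc[~user_has_guava, ["user_id", "fruit"]]
```

Unlike filter, transform returns a result aligned with the original index, so the same boolean Series can be negated to get the "without guava" table.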