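Is it possible to use information from another group inside apply?

For each element in a group, determine whether it is present in the next group (taking the groups in order of occurrence, not necessarily numeric). For the last group, everything is False.

Example: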

import pandas as pd

df = pd.DataFrame({'group': [ 0,   1,   1,   0,   2 ],
                   'val':   ['a', 'b', 'a', 'c', 'c']})
grouped = df.groupby('group')

print(result)
0     True
1    False
2    False
3    False
4    False
Name: val, dtype: bool

What is the best way to do this? I can get it done like this, but it seems too hacky:

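keys = list(grouped.groups.keys())
iterator_keys = iter(keys[1:])

def f(ser):
    # ser.name is the key of the group currently being processed
    if ser.name == keys[-1]:
        # last group: there is no next group, so everything is False
        return ser.isin([])
    next_key = next(iterator_keys)
    return ser.isin(grouped.get_group(next_key)['val'])

result = grouped['val'].apply(f)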

Try:

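g = df.groupby("group")
# one set of values per group, shifted up so each key maps to the next group's set
m = g["val"].agg(set).shift(-1, fill_value=set())
# x.name is the group key, so m[x.name] is the next group's set
x = g["val"].transform(lambda x: x.isin(m[x.name]))
print(x)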

Prints:

0     True
1    False
2    False
3    False
4    False
Name: val, dtype: bool
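
To make the role of the lookup `m` easier to see, here is a minimal sketch that builds the same "group key to next group's values" mapping by hand and then tests each row against it. The name `next_sets` is only illustrative and is not part of the answer above. Note that `groupby` sorts the keys by default; if the groups should be taken in order of first appearance instead, passing `sort=False` to `groupby` should give that order.

```python
import pandas as pd

df = pd.DataFrame({"group": [0, 1, 1, 0, 2],
                   "val": ["a", "b", "a", "c", "c"]})

# Collect (key, set of values) for every group, in groupby order (sorted keys).
groups = [(key, set(ser)) for key, ser in df.groupby("group")["val"]]
keys = [key for key, _ in groups]
sets = [values for _, values in groups]

# Map each key to the set of the following group; the last group gets an empty set.
next_sets = dict(zip(keys, sets[1:] + [set()]))

# Row-wise membership test against the next group's set.
result = pd.Series(
    [val in next_sets[grp] for grp, val in zip(df["group"], df["val"])],
    index=df.index,
    name="val",
)
print(result.tolist())  # [True, False, False, False, False]
```

This is slower than the `transform` version on large frames because it loops in Python, but it shows exactly which set each row is compared against.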

Note:

If you want to fill the last group with an arbitrary value stored in values (not necessarily False), you can do the following:

m = g["val"].agg(set).shift(-1)
x = g["val"].transform(lambda x: x.isin(m[x.name])
                       if not pd.isnull(m[x.name])
                       else values)

For example, if you set values = True, then x will be:

0     True
1    False
2    False
3    False
4     True
Name: val, dtype: bool
