Drop columns in a df that have no matching counterpart column

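Hi, I want to compare data between John and Kelly. Sometimes a John column (here "world_john") has no corresponding "world_kelly" column for Kelly, or vice versa. In that case world_john should be dropped, because there is nothing to compare it against. Is this possible with generic code?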

df
   world_john  fruit_list_john  fruit_list_kelly  output_john  output_kelly
0   The start           hungry            banana         high          high
1       world             pear             apple         high          high
2   yesterday            fruit              pear         high          high
...

Expected output:

   fruit_list_john  fruit_list_kelly  output_john  output_kelly
0           hungry            banana         high          high
1             pear             apple         high          high
2            fruit              pear         high          high

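If I understand correctly, you want to drop any 'john' column that has no matching 'kelly' column, and vice versa. One way is to iterate over the columns and check whether each one has a counterpart. If it does not, drop that column.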

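import pandas as pd

df = pd.DataFrame({'world_john':[1], 'fruit_list_john':[1], 'fruit_list_kelly':[1], 'output_john':[1], 'output_kelly':[1]})

# Iterate over a snapshot of the column names, since df is reassigned inside the loop
for col in list(df.columns):
    # Drop a '_john' column that has no '_kelly' counterpart
    if col.endswith('_john') and col.replace('_john', '_kelly') not in df.columns:
        df = df.drop(columns=[col])
    # Drop a '_kelly' column that has no '_john' counterpart
    if col.endswith('_kelly') and col.replace('_kelly', '_john') not in df.columns:
        df = df.drop(columns=[col])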
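
Since the question asks for generic code, the loop above could also be wrapped in a small helper that takes the two suffixes as parameters. This is only a sketch: the function name `keep_paired_columns` and its parameters are hypothetical, and it assumes that columns carrying neither suffix should be kept unchanged. It strips the suffix by slicing rather than `str.replace`, so a suffix that also appears in the middle of a name is not touched.

```python
import pandas as pd


def keep_paired_columns(df, suffix_a="_john", suffix_b="_kelly"):
    """Return df restricted to columns whose stem exists under both suffixes."""
    existing = set(df.columns)
    keep = []
    for col in df.columns:
        if col.endswith(suffix_a):
            partner = col[: -len(suffix_a)] + suffix_b
        elif col.endswith(suffix_b):
            partner = col[: -len(suffix_b)] + suffix_a
        else:
            # Assumption: columns with neither suffix are left untouched
            keep.append(col)
            continue
        # Keep the column only if its counterpart is present
        if partner in existing:
            keep.append(col)
    return df[keep]


# Sample data taken from the question
df = pd.DataFrame({
    "world_john": ["The start", "world", "yesterday"],
    "fruit_list_john": ["hungry", "pear", "fruit"],
    "fruit_list_kelly": ["banana", "apple", "pear"],
    "output_john": ["high", "high", "high"],
    "output_kelly": ["high", "high", "high"],
})

result = keep_paired_columns(df)
print(list(result.columns))
# ['fruit_list_john', 'fruit_list_kelly', 'output_john', 'output_kelly']
```

Building a list of columns to keep and selecting once avoids reassigning the DataFrame on every iteration, and the original column order is preserved.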
