How do I find which column contains a given value?



I have a DataFrame like this:

import numpy as np
import pandas as pd

test = pd.DataFrame({"id": [1, 2, 3, 4],
                     "name_1": ["peter", "bobby", "alex", "chris"],
                     "name_1_flag": ["real", "fake", "fake", "real"],
                     "name_2": ["hector", "abi", "henrik", "miko"],
                     "name_2_flag": ["fake", "real", "fake", "fake"],
                     "name_3": ["hans", "khan", "will", "than"],
                     "name_3_flag": ["fake", "fake", "real", "fake"]})
   id name_1 name_1_flag  name_2 name_2_flag name_3 name_3_flag
0   1  peter        real  hector        fake   hans        fake
1   2  bobby        fake     abi        real   khan        fake
2   3   alex        fake  henrik        fake   will        real
3   4  chris        real    miko        fake   than        fake

How can I find the row/column tuples that contain the word "real"?

Ideally, the output would be an array or Series like this:

col_index
0           3
1           5
2           7
3           3

Using np.where (this assumes each row contains exactly one "real"):

test["col_index"] = np.where(test.eq("real"))[1] + 1
print(test)

   id name_1 name_1_flag  name_2 name_2_flag name_3 name_3_flag  col_index
0   1  peter        real  hector        fake   hans        fake          3
1   2  bobby        fake     abi        real   khan        fake          5
2   3   alex        fake  henrik        fake   will        real          7
3   4  chris        real    miko        fake   than        fake          3
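Since the question asks for row/column tuples, the two arrays returned by np.where can also be paired up directly. The following is a minimal sketch under that reading of the question; the names mask, rows, cols and pairs are illustrative only.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({"id": [1, 2, 3, 4],
                     "name_1": ["peter", "bobby", "alex", "chris"],
                     "name_1_flag": ["real", "fake", "fake", "real"],
                     "name_2": ["hector", "abi", "henrik", "miko"],
                     "name_2_flag": ["fake", "real", "fake", "fake"],
                     "name_3": ["hans", "khan", "will", "than"],
                     "name_3_flag": ["fake", "fake", "real", "fake"]})

# Boolean mask of the cells that equal "real"
mask = test.eq("real").to_numpy()

# For a 2-D mask, np.where returns (row positions, column positions)
rows, cols = np.where(mask)

# Pair each row position with the name of the matching column
pairs = [(int(r), test.columns[c]) for r, c in zip(rows, cols)]
print(pairs)
# [(0, 'name_1_flag'), (1, 'name_2_flag'), (2, 'name_3_flag'), (3, 'name_1_flag')]
```

Unlike assigning the result to a new column, this form still works when a row has several matches or none, because the list simply gets more or fewer entries.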

Solution:

Try np.argmax:

>>> np.argmax(test.eq('real').to_numpy(), axis=1) + 1
array([3, 5, 7, 3], dtype=int64)
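One caveat with np.argmax: for a row with no match it returns 0, so adding 1 would report column 1 even though nothing matched. A possible guard is sketched below. The frame demo and the -1 sentinel are assumptions made for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the second row has no "real" flag at all
demo = pd.DataFrame({"id": [1, 2],
                     "name_1": ["peter", "bobby"],
                     "name_1_flag": ["real", "fake"]})

mask = demo.eq("real").to_numpy()

# argmax gives the first True per row, but 0 for an all-False row
pos = np.argmax(mask, axis=1) + 1

# Replace the misleading result with -1 (arbitrary sentinel) where no cell matched
col_index = np.where(mask.any(axis=1), pos, -1)
print(col_index.tolist())  # [3, -1]
```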

Or get_indexer:

test.columns.get_indexer(test.eq('real').idxmax(axis=1)) + 1

Or .T.reset_index(drop=True):

test.T.reset_index(drop=True).eq('real').idxmax() + 1

To make it a column:

np.argmax:

test["col_index"] = np.argmax(test.eq('real').to_numpy(), axis=1) + 1

get_indexer:

test["col_index"] = test.columns.get_indexer(test.eq('real').idxmax(axis=1)) + 1

.T:

test["col_index"] = test.T.reset_index(drop=True).eq('real').idxmax() + 1



All of them output:

   id name_1 name_1_flag  name_2 name_2_flag name_3 name_3_flag  col_index
0   1  peter        real  hector        fake   hans        fake          3
1   2  bobby        fake     abi        real   khan        fake          5
2   3   alex        fake  henrik        fake   will        real          7
3   4  chris        real    miko        fake   than        fake          3
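To confirm that the three variants agree on this data, they can be computed side by side and compared. This is only a sanity-check sketch; the variable names a, b and c are illustrative.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({"id": [1, 2, 3, 4],
                     "name_1": ["peter", "bobby", "alex", "chris"],
                     "name_1_flag": ["real", "fake", "fake", "real"],
                     "name_2": ["hector", "abi", "henrik", "miko"],
                     "name_2_flag": ["fake", "real", "fake", "fake"],
                     "name_3": ["hans", "khan", "will", "than"],
                     "name_3_flag": ["fake", "fake", "real", "fake"]})

# Variant 1: position of the first True in each row of the boolean mask
a = np.argmax(test.eq("real").to_numpy(), axis=1) + 1

# Variant 2: label of the first True column, mapped back to its position
b = test.columns.get_indexer(test.eq("real").idxmax(axis=1)) + 1

# Variant 3: transpose, renumber the former columns 0..n-1, then take idxmax
c = (test.T.reset_index(drop=True).eq("real").idxmax() + 1).to_numpy()

print(np.array_equal(a, b) and np.array_equal(a, c))  # True
```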

Let's try:

s = test.where(lambda x : x=='real').stack()
test['new'] = test.columns.get_indexer(s.index.get_level_values(1))+1
test
Out[11]: 
   id name_1 name_1_flag  name_2 name_2_flag name_3 name_3_flag  new
0   1  peter        real  hector        fake   hans        fake    3
1   2  bobby        fake     abi        real   khan        fake    5
2   3   alex        fake  henrik        fake   will        real    7
3   4  chris        real    miko        fake   than        fake    3
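The where/stack snippet depends on stack() discarding the NaN cells that where() leaves behind. A variant that filters a boolean mask explicitly, and so does not rely on that behavior, might look like the sketch below. The names flags and hits are illustrative, and it still assumes exactly one "real" per row when assigning the column.

```python
import pandas as pd

test = pd.DataFrame({"id": [1, 2, 3, 4],
                     "name_1": ["peter", "bobby", "alex", "chris"],
                     "name_1_flag": ["real", "fake", "fake", "real"],
                     "name_2": ["hector", "abi", "henrik", "miko"],
                     "name_2_flag": ["fake", "real", "fake", "fake"],
                     "name_3": ["hans", "khan", "will", "than"],
                     "name_3_flag": ["fake", "fake", "real", "fake"]})

# Boolean Series indexed by (row label, column name)
flags = test.eq("real").stack()

# Keep only the True entries, i.e. the cells that equal "real"
hits = flags[flags]

# Map the column names of the hits back to 1-based column positions
new = test.columns.get_indexer(hits.index.get_level_values(1)) + 1
test["new"] = new  # valid only because every row has exactly one hit
print(test["new"].tolist())  # [3, 5, 7, 3]
```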

You can also use dot:

print (test.eq("real").dot(range(test.columns.size))+1)
0    3
1    5
2    7
3    3
dtype: int32
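The dot approach works because each row of the boolean mask is multiplied element-wise by the column positions 0..n-1 and summed, so a row with a single True yields that column's position. A small sketch spelling this out follows; the names mask, positions and result are illustrative.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({"id": [1, 2, 3, 4],
                     "name_1": ["peter", "bobby", "alex", "chris"],
                     "name_1_flag": ["real", "fake", "fake", "real"],
                     "name_2": ["hector", "abi", "henrik", "miko"],
                     "name_2_flag": ["fake", "real", "fake", "fake"],
                     "name_3": ["hans", "khan", "will", "than"],
                     "name_3_flag": ["fake", "fake", "real", "fake"]})

mask = test.eq("real")
positions = np.arange(test.columns.size)  # 0, 1, ..., 6

# Each row's dot product sums the positions of its True cells;
# with exactly one True per row, that sum is the position itself.
result = mask.dot(positions) + 1
print(result.tolist())  # [3, 5, 7, 3]
```

Note that if a row contained two matches, the positions would be added together, so this variant is only meaningful when each row has exactly one match.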
