I have a DataFrame like this:
import numpy as np
import pandas as pd

test = pd.DataFrame({"id": [1, 2, 3, 4],
                     "name_1": ["peter", "bobby", "alex", "chris"],
                     "name_1_flag": ["real", "fake", "fake", "real"],
                     "name_2": ["hector", "abi", "henrik", "miko"],
                     "name_2_flag": ["fake", "real", "fake", "fake"],
                     "name_3": ["hans", "khan", "will", "than"],
                     "name_3_flag": ["fake", "fake", "real", "fake"]})
   id name_1 name_1_flag  name_2 name_2_flag name_3 name_3_flag
0   1  peter        real  hector        fake   hans        fake
1   2  bobby        fake     abi        real   khan        fake
2   3   alex        fake  henrik        fake   will        real
3   4  chris        real    miko        fake   than        fake
How can I find the row/column tuples that contain the word "real"?
Ideally the output would be an array or Series like this:
col_index
0 3
1 5
2 7
3 3
Use np.where:
test["col_index"] = np.where(test.eq("real"))[1] + 1
print(test)
   id name_1 name_1_flag  name_2 name_2_flag name_3 name_3_flag  col_index
0   1  peter        real  hector        fake   hans        fake          3
1   2  bobby        fake     abi        real   khan        fake          5
2   3   alex        fake  henrik        fake   will        real          7
3   4  chris        real    miko        fake   than        fake          3
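Since the question asks for row/column tuples, here is a minimal sketch of how both arrays returned by np.where could be paired up. It assumes the same test frame as in the question; the names rows, cols and pairs are only illustrative. Unlike assigning the result to a column, this form does not require exactly one "real" per row.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({"id": [1, 2, 3, 4],
                     "name_1": ["peter", "bobby", "alex", "chris"],
                     "name_1_flag": ["real", "fake", "fake", "real"],
                     "name_2": ["hector", "abi", "henrik", "miko"],
                     "name_2_flag": ["fake", "real", "fake", "fake"],
                     "name_3": ["hans", "khan", "will", "than"],
                     "name_3_flag": ["fake", "fake", "real", "fake"]})

# np.where on a boolean mask returns one array of row positions
# and one array of column positions, in row-major order.
rows, cols = np.where(test.eq("real"))

# Pair them up; +1 turns the 0-based column position into the
# 1-based numbering used in the question.
pairs = list(zip(rows.tolist(), (cols + 1).tolist()))
print(pairs)  # [(0, 3), (1, 5), (2, 7), (3, 3)]
```

If a row held several "real" values it would simply appear several times in pairs, and a row with none would not appear at all.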
Solution:
Try np.argmax:
>>> np.argmax(test.eq('real').to_numpy(), axis=1) + 1
array([3, 5, 7, 3], dtype=int64)
Or get_indexer:
test.columns.get_indexer(test.eq('real').idxmax(axis=1)) + 1
Or .T.reset_index(drop=True):
test.T.reset_index(drop=True).eq('real').idxmax() + 1
To make it a column:
np.argmax:
test["col_index"] = np.argmax(test.eq('real').to_numpy(), axis=1) + 1
get_indexer:
test["col_index"] = test.columns.get_indexer(test.eq('real').idxmax(axis=1)) + 1
.T:
test["col_index"] = test.T.reset_index(drop=True).eq('real').idxmax() + 1
All of them output:
   id name_1 name_1_flag  name_2 name_2_flag name_3 name_3_flag  col_index
0   1  peter        real  hector        fake   hans        fake          3
1   2  bobby        fake     abi        real   khan        fake          5
2   3   alex        fake  henrik        fake   will        real          7
3   4  chris        real    miko        fake   than        fake          3
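One caveat with the argmax and idxmax variants: on a row that contains no "real" at all, the maximum of an all-False row is found at position 0, so the result would be 1 rather than a missing value. Below is a sketch of one possible guard. The frame df is a hypothetical variant of the data (its second row has no "real"), and -1 is an arbitrary sentinel chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second row contains no "real".
df = pd.DataFrame({"id": [1, 2],
                   "name_1": ["peter", "bobby"],
                   "name_1_flag": ["real", "fake"],
                   "name_2": ["hector", "abi"],
                   "name_2_flag": ["fake", "fake"]})

mask = df.eq("real").to_numpy()

# 1-based position of the first True in each row; an all-False row gives 1.
first = np.argmax(mask, axis=1) + 1

# Keep the position only where the row has at least one match,
# otherwise fall back to the sentinel -1.
col_index = np.where(mask.any(axis=1), first, -1)
print(col_index)  # [ 3 -1]
```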
Let's try:
s = test.where(lambda x : x=='real').stack()
test['new'] = test.columns.get_indexer(s.index.get_level_values(1))+1
test
Out[11]:
   id name_1 name_1_flag  name_2 name_2_flag name_3 name_3_flag  new
0   1  peter        real  hector        fake   hans        fake    3
1   2  bobby        fake     abi        real   khan        fake    5
2   3   alex        fake  henrik        fake   will        real    7
3   4  chris        real    miko        fake   than        fake    3
You can also use dot:
print (test.eq("real").dot(range(test.columns.size))+1)
0 3
1 5
2 7
3 3
dtype: int32
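The dot approach works because the boolean mask is multiplied by the column positions and summed per row, so with exactly one "real" per row the sum equals that column's position. With more than one match the positions add up, which can be misleading. The sketch below illustrates this on a hypothetical frame df whose second row has two "real" flags; the names exactly_one and checked are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second row has "real" in two columns.
df = pd.DataFrame({"id": [1, 2],
                   "name_1": ["peter", "bobby"],
                   "name_1_flag": ["real", "real"],
                   "name_2": ["hector", "abi"],
                   "name_2_flag": ["fake", "real"]})

mask = df.eq("real")

# Weighted sum of 0-based column positions per row, shifted to 1-based.
positions = mask.dot(np.arange(df.columns.size)) + 1
print(positions.tolist())  # [3, 7]: row 1 is 2 + 4 + 1, not a real column index

# Only trust the result where the row has exactly one match.
exactly_one = mask.sum(axis=1).eq(1)
checked = positions.where(exactly_one)  # rows with 0 or 2+ matches become NaN
print(checked)
```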