I want to convert a pandas DataFrame into a Markov chain transition matrix.
import pandas as pd
dict1={'state_num_x': {0: 0, 1: 1, 2: 1,3: 1,4: 2,5: 2,6: 2,7: 3,8: 3,9: 4,10: 5,11: 5,
12: 5,13: 5,14: 5,15: 5,16: 6,17: 6,18: 6,19: 7,20: 7,21: 7},
'state_num_y': {0: 1,1: 1,2: 2,3: 5,4: 1,5: 4,6: 6,7: 1,8: 6,9: 1,10: 1,11: 2,
12: 3,13: 5,14: 6,15: 7,16: 1,17: 2,18: 5,19: 1,20: 4,21: 6},
'Sum_Prob': {0: 0.9999999999999999,1: 0.0369363131137667,2: 0.7408182206817178,
3: 0.22224546620451535,4: 0.0369363131137667,5: 0.7408182206817178,
6: 0.22224546620451535,7: 0.17028359283647593,8: 0.8297164071635239,
9: 0.9999999999999999,10: 0.003599493183089517,11: 0.08889818648180613,
12: 0.13334727972270924,13: 0.021335564755633474,14: 0.012001255175043838,
15: 0.7408182206817178,16: 0.015600748358133354,17: 0.8297164071635239,
18: 0.1546828444783427,19: 0.015600748358133354,20: 0.8297164071635239,21: 0.1546828444783427}}
df=pd.DataFrame.from_dict(dict1)
It looks like this:
state_num_x state_num_y Sum_Prob
0 1 1.000000
1 1 0.036936
1 2 0.740818
. . .
. . .
7 1 0.015601
7 4 0.829716
7 6 0.154683
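Let's call the resulting array arr_tx.
arr_tx[0][1] should equal 1
arr_tx[1][1] should equal 0.036936
arr_tx[1][2] should equal 0.740818
It should be an 8x8 matrix, and missing values should be zero.
So the final result should look like this: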
0,1,0,0,0,0,0,0,
0,0.036936,0.740818,0,0,0.222245,0,0
.,.,.,.,.,.,.,.
It looks like you want a pivot_table:
df.pivot_table(index='state_num_x', columns='state_num_y',
values='Sum_Prob', fill_value=0)
Output:
state_num_y 1 2 3 4 5 6 7
state_num_x
0 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
1 0.036936 0.740818 0.000000 0.000000 0.222245 0.000000 0.000000
2 0.036936 0.000000 0.000000 0.740818 0.000000 0.222245 0.000000
3 0.170284 0.000000 0.000000 0.000000 0.000000 0.829716 0.000000
4 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
5 0.003599 0.088898 0.133347 0.000000 0.021336 0.012001 0.740818
6 0.015601 0.829716 0.000000 0.000000 0.154683 0.000000 0.000000
7 0.015601 0.000000 0.000000 0.829716 0.000000 0.154683 0.000000
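Note that the pivot table above is 8x7 rather than 8x8: no row has state_num_y equal to 0, so there is no column for state 0. A minimal sketch of one way to get the full square array, assuming the states are labelled 0 through 7 (n_states and states are names introduced here for illustration, and the small df below is only a stand-in for the first rows of the question's data so the snippet runs on its own):

```python
import pandas as pd

# Stand-in for the question's df: only the first four rows, values rounded.
df = pd.DataFrame({
    'state_num_x': [0, 1, 1, 1],
    'state_num_y': [1, 1, 2, 5],
    'Sum_Prob': [1.0, 0.036936, 0.740818, 0.222245],
})

n_states = 8              # assumption: states are labelled 0..7
states = range(n_states)

arr_tx = (
    df.pivot_table(index='state_num_x', columns='state_num_y',
                   values='Sum_Prob', fill_value=0)
      # Add any state that never appears as a source or a target, filled with zeros.
      .reindex(index=states, columns=states, fill_value=0)
      .to_numpy(dtype=float)
)

print(arr_tx.shape)
print(arr_tx[0][1], arr_tx[1][1], arr_tx[1][2])
```

With the full df from the question, the same reindex step should add the missing column for state 0, and arr_tx[i][j] then holds the probability of moving from state i to state j.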