I have the following Series:
groups['combined']
0 (28, 1) 1
1 (32, 1) 1
2 (36, 1) 1
3 (37, 1) 1
4 (84, 1) 1
....
Name: combined, Length: 14476, dtype: object
How can I convert this data into a sparse matrix in COO format (.tocoo()) and in LIL format (.tolil())?
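For reference, the combined column was formed from this original pandas DataFrame:
import pandas as pd
pd.DataFrame({0: [28, 32, 36, 37, 84], 1: [1, 1, 1, 1, 1], 2: [1, 1, 1, 1, 1]})
Column 0 has more than 10K unique features, column 1 has 39 groups, and column 2 is only ever 1.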
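If only the combined Series of tuples is at hand, one possible route is to unpack the tuples into two arrays and map each to dense integer codes. The sketch below assumes every entry is a (feature, group) pair and that the stored value is always 1; the five sample tuples and the variable names are illustrative, not taken from the real data.

```python
import numpy as np
import pandas as pd
import scipy.sparse as sps

# Illustrative stand-in for groups['combined'] (assumed to hold (feature, group) tuples)
combined = pd.Series([(28, 1), (32, 1), (36, 1), (37, 1), (84, 1)], name="combined")

# Split the tuples into one array of features and one array of groups
features, grps = map(np.asarray, zip(*combined))

# factorize maps arbitrary labels to dense codes 0..n-1 and returns the sorted unique labels
row_codes, row_labels = pd.factorize(features, sort=True)
col_codes, col_labels = pd.factorize(grps, sort=True)

# Assumption: every (feature, group) pair carries the value 1
values = np.ones(len(combined), dtype=np.int64)

mat = sps.coo_matrix(
    (values, (row_codes, col_codes)),
    shape=(len(row_labels), len(col_labels)),
)
```

Here row_labels[i] gives the original feature id for matrix row i, so the mapping back to the data is not lost.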
Formation of COOrdinate format from original pandas DataFrame
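import scipy.sparse as sps
# Index by (feature, group); the MultiIndex codes are zero-based row/column positions
groups.set_index([0, 1], inplace=True)
# MultiIndex.labels was renamed to MultiIndex.codes in pandas 0.24
sps.coo_matrix((groups[2], (groups.index.codes[0], groups.index.codes[1])))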
------------- results in ---------
<10312x39 sparse matrix of type '<class 'numpy.int64'>'
with 14476 stored elements in COOrdinate format>
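A self-contained version of the same idea, run on the five sample rows from the question, might look like the sketch below. It uses index.codes (the current name for the older index.labels) and passes the shape explicitly; the variable names indexed, row_labels and col_labels are introduced here for illustration only.

```python
import pandas as pd
import scipy.sparse as sps

# Sample rows from the question: column 0 = feature, column 1 = group, column 2 = value
groups = pd.DataFrame({0: [28, 32, 36, 37, 84],
                       1: [1, 1, 1, 1, 1],
                       2: [1, 1, 1, 1, 1]})

# Non-inplace set_index keeps the original frame untouched
indexed = groups.set_index([0, 1])

# levels hold the sorted unique labels, codes hold each row's position within them
row_labels = indexed.index.levels[0]
col_labels = indexed.index.levels[1]
row_codes = indexed.index.codes[0]
col_codes = indexed.index.codes[1]

coo = sps.coo_matrix(
    (indexed[2].to_numpy(), (row_codes, col_codes)),
    shape=(len(row_labels), len(col_labels)),
)

# Translate matrix row positions back to the original feature ids
original_features = row_labels[coo.row]
```

Passing shape explicitly avoids relying on the largest code to infer the dimensions.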
> Regarding the LIL matrix:
print(len(networks[0]), len(networks[1]), networks[0].nunique(), networks[1].nunique())
667966 667966 10312 10312
networks[:5]
0 1
0 176 1
1 233 1
2 283 1
3 371 1
4 394 1
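# make row and col labels
rows = networks[0]
cols = networks[1]
# the crucial third array: column 2 of networks must hold the values to store
networks.set_index([0, 1], inplace=True)
# MultiIndex.codes replaces the older MultiIndex.labels (renamed in pandas 0.24)
Ntw = sps.coo_matrix((networks[2], (networks.index.codes[0],
                                    networks.index.codes[1])))
d = Ntw.tolil()
d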
which produces
<10312x10312 sparse matrix of type '<class 'numpy.int64'>'
with 667966 stored elements in LInked List format>
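To illustrate why one might want the LIL form at all, here is a minimal sketch on a made-up 4x4 edge list (the arrays below are invented for the example, not drawn from networks). COO is convenient for construction, while LIL allows changing individual entries afterwards.

```python
import numpy as np
import scipy.sparse as sps

# Hypothetical edge list: (row, col) pairs, each with value 1
rows = np.array([0, 1, 2, 3])
cols = np.array([1, 2, 3, 0])
vals = np.ones(len(rows), dtype=np.int64)

# Build in COO format, then convert to LIL
coo = sps.coo_matrix((vals, (rows, cols)), shape=(4, 4))
lil = coo.tolil()

# LIL supports assigning single entries, which COO does not
lil[0, 3] = 5

# Each row stores its column indices (lil.rows) and matching values (lil.data)
first_row_cols = list(lil.rows[0])
first_row_vals = list(lil.data[0])

# Convert to CSR once editing is done, for fast arithmetic and slicing
csr = lil.tocsr()
```

After the edits are finished, converting to CSR or CSC is the usual choice for arithmetic, since LIL is geared toward incremental changes.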