pandas DataFrame to COO matrix and to LIL matrix



I have the following Series:

groups['combined'] 
0            (28, 1)  1
1           (32, 1)  1
2           (36, 1)  1
3           (37, 1)  1
4           (84, 1)  1
....
Name: combined, Length: 14476, dtype: object

How can I convert this DataFrame to a COO matrix (.tocoo()) and a LIL matrix (.tolil())?

For reference, the combined column was formed from the original pandas DataFrame:

import pandas as pd
pd.DataFrame({0: [28, 32, 36, 37, 84], 1: [1, 1, 1, 1, 1], 2: [1, 1, 1, 1, 1]})

Column 0 has more than 10K unique features, column 1 has 39 groups, and column 2 is always 1.

Formation of COOrdinate format from original pandas DataFrame

import scipy.sparse as sps
groups.set_index([0, 1], inplace=True)
# MultiIndex.codes replaces MultiIndex.labels (renamed in pandas 0.24, removed in 1.0)
sps.coo_matrix((groups[2], (groups.index.codes[0], groups.index.codes[1])))

------------- results in ---------

<10312x39 sparse matrix of type '<class 'numpy.int64'>'
with 14476 stored elements in COOrdinate format>
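
As a minimal, self-contained sketch of the same approach, the snippet below rebuilds the five-row sample frame from the question and turns it into a COO matrix. The variable names indexed, row_codes and col_codes are illustrative and not from the original post, and it assumes pandas 0.24 or later, where the MultiIndex attribute is called codes.

```python
import pandas as pd
import scipy.sparse as sps

# Toy frame mirroring the question: col 0 = feature id, col 1 = group id, col 2 = value.
groups = pd.DataFrame({0: [28, 32, 36, 37, 84],
                       1: [1, 1, 1, 1, 1],
                       2: [1, 1, 1, 1, 1]})

# Index by (feature, group); the MultiIndex assigns each distinct label
# a zero-based integer code per level, usable as matrix coordinates.
indexed = groups.set_index([0, 1])
row_codes = indexed.index.codes[0]  # one code per distinct value in col 0
col_codes = indexed.index.codes[1]  # one code per distinct value in col 1

# coo_matrix((data, (row, col))) infers the shape from the largest codes.
coo = sps.coo_matrix((indexed[2].to_numpy(), (row_codes, col_codes)))
print(repr(coo))
```

With this toy data there are five distinct features and one group, so the matrix has five rows and one column. The original labels for each row and column remain available through indexed.index.levels if they need to be mapped back.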

Regarding the LIL matrix:

print(len(networks[0]), len(networks[1]), networks[0].nunique(), networks[1].nunique())
667966 667966 10312 10312
networks[:5]
0   1
0   176 1
1   233 1
2   283 1
3   371 1
4   394 1

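# Row and column labels (kept for reference; the MultiIndex codes below are what get used)
rows = networks[0]
cols = networks[1]
# The crucial third array: networks[2] supplies the values to store
networks.set_index([0, 1], inplace=True)
# MultiIndex.codes replaces MultiIndex.labels (renamed in pandas 0.24, removed in 1.0)
Ntw = sps.coo_matrix((networks[2], (networks.index.codes[0], networks.index.codes[1])))

# Convert the COO matrix to LIL format
d = Ntw.tolil()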
d

which produces

<10312x10312 sparse matrix of type '<class 'numpy.int64'>'
with 667966 stored elements in LInked List format>
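
To illustrate why the conversion can be useful, here is a small sketch on a hypothetical 3x3 matrix (the data is made up for illustration, not taken from the networks frame). COO is convenient for construction but does not support item assignment, whereas LIL does, so one possible workflow is to build in COO, edit in LIL, and convert back.

```python
import numpy as np
import scipy.sparse as sps

# Hypothetical toy coordinates and values, for illustration only.
rows = np.array([0, 1, 2])
cols = np.array([1, 2, 0])
vals = np.array([1, 1, 1])

coo = sps.coo_matrix((vals, (rows, cols)), shape=(3, 3))

# Convert to LIL, which supports element-wise assignment.
lil = coo.tolil()
lil[0, 0] = 5

# Convert back to COO once the edits are done.
back = lil.tocoo()
print(back.toarray())
```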
