I have the following data:
import numpy as np
import pandas as pd
arr = np.array([0, 1, 2, 3, 4, 6, 7, 5])
x = pd.Series([0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,5,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7])
print(arr)
print(type(arr))
[0 1 2 3 4 6 7 5]
<class 'numpy.ndarray'>
The code below handles the data above without problems:
m = [[0] * len(arr) for _ in enumerate(arr)]
for (i, j) in zip(x, x[1:]):
    m[i][j] += 1
However, with the following data, the same code raises the error shown below:
arr = np.array([0, 1, 2, 3, 4, 6, 7])
x = pd.Series([0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7])
Error:
m[i][j] += 1
IndexError: list index out of range
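@Ananda is right, but what you wrote can still work. The real problem is that you are passing two different types of input.
The top one is
arr = np.array([0, 1, 2, 3, 4, 6, 7, 5])
and the bottom one is
arr = np.array([[0, 1, 2, 3, 4, 6, 7]])
The bottom one needs to be
arr = np.array([0, 1, 2, 3, 4, 6, 7])
Note the missing second set of brackets...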
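To see what the extra brackets change, here is a quick check. It is only a sketch using the two arrays quoted above; the names `flat` and `nested` are mine. The nested version is a 2-D array with a single row, so `len()` returns 1 rather than 7.

```python
import numpy as np

flat = np.array([0, 1, 2, 3, 4, 6, 7])      # one set of brackets: 1-D
nested = np.array([[0, 1, 2, 3, 4, 6, 7]])  # two sets of brackets: 2-D, one row

print(flat.shape, len(flat))      # (7,) 7
print(nested.shape, len(nested))  # (1, 7) 1
```

So any code that sizes a list with `len(arr)` would build a 1x1 matrix from the nested version.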
I think this is what you are trying to do.
import numpy as np
import pandas as pd
arr = np.array([0, 1, 2, 3, 4, 6, 7])
x = pd.Series([0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7,0,1,2,3,4,6,7])
n = int(np.max(arr)) + 1  # size the matrix by the largest value, not by len(arr)
m = [[0] * n for _ in range(n)]
for (i, j) in zip(x, x[1:]):
    m[i][j] += 1  # count each transition from i to j
When creating the variable m, you need to size it by the maximum value in arr (plus one), not by its length.
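For what it's worth, the same idea can be wrapped in a small helper. This is only a sketch under my own assumptions: `transition_counts` is a made-up name, the input is assumed to be a non-empty sequence of non-negative integers, and the matrix is sized from the largest value in the sequence itself rather than from arr. It uses `np.add.at` so that repeated (i, j) pairs are all counted.

```python
import numpy as np
import pandas as pd


def transition_counts(seq):
    """Count how often value i is immediately followed by value j.

    Hypothetical helper; assumes seq is non-empty and holds
    non-negative integers.
    """
    values = np.asarray(seq, dtype=int)
    n = int(values.max()) + 1  # largest value + 1, not len(values)
    m = np.zeros((n, n), dtype=int)
    # np.add.at accumulates correctly when the same (i, j) pair repeats
    np.add.at(m, (values[:-1], values[1:]), 1)
    return m


x = pd.Series([0, 1, 2, 3, 4, 6, 7] * 3)
m = transition_counts(x)
print(m.shape)   # (8, 8)
print(m[4][6])   # 3
print(m[7][0])   # 2
```

Because the value 5 never appears in this sequence, row 5 and column 5 simply stay at zero, which is the same reason the list version has to be 8 wide even though there are only 7 distinct values.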