Numpy/Pandas: correlating multiple arrays of different lengths



I can correlate two arrays of different lengths using this method:

import pandas as pd
import numpy as np
from scipy.stats import pearsonr

a = [0, 0.4, 0.2, 0.4, 0.2, 0.4, 0.2, 0.5]
b = [25, 40, 62, 58, 53, 54]
df = pd.DataFrame(dict(x=a))
CORR_VALS = np.array(b)

def get_correlation(vals):
    # Pearson r between the current window of x and the shorter array
    return pearsonr(vals, CORR_VALS)[0]

# Slide a window of len(b) over x; rows without a full window stay NaN
df['correlation'] = df['x'].rolling(window=len(CORR_VALS)).apply(get_correlation)

This is the result I get:

In [1]: df
Out[1]: 
     x  correlation
0  0.0          NaN
1  0.4          NaN
2  0.2          NaN
3  0.4          NaN
4  0.2          NaN
5  0.4     0.527932
6  0.2    -0.159167
7  0.5     0.189482

First, the Pearson coefficient should be the maximum value in this dataset, i.e. the largest of the rolling correlations above…
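
A minimal sketch of picking out that maximum, assuming the same `a` and `b` as above. The names `window_corr`, `rolling_corr`, `best` and `best_end` are made up for this illustration, and `np.corrcoef` is swapped in for `pearsonr` only so the block does not depend on SciPy:

```python
import numpy as np
import pandas as pd

a = [0, 0.4, 0.2, 0.4, 0.2, 0.4, 0.2, 0.5]
b = [25, 40, 62, 58, 53, 54]

def window_corr(vals):
    # Pearson r between one window of `a` and the whole of `b`
    return np.corrcoef(vals, b)[0, 1]

# One correlation per full window; earlier rows are NaN
rolling_corr = pd.Series(a).rolling(window=len(b)).apply(window_corr, raw=True)

best = rolling_corr.max()         # NaN rows are skipped
best_end = rolling_corr.idxmax()  # label of the last row in the best window

print(best, best_end)
```

`idxmax` returns the label of the last row in the best-matching window, so that window covers rows `best_end - len(b) + 1` through `best_end`.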

Second, how do I do this for multiple datasets? I'd like output like what I get from df.corr(), with proper index and column labels.

For example, suppose I have the following datasets:
a = [0, 0.4, 0.2, 0.4, 0.2, 0.4, 0.2, 0.5]
b = [25, 40, 62, 58, 53, 54]
c = [ 0, 0.4, 0.2, 0.4, 0.2, 0.45, 0.2, 0.52, 0.52, 0.4, 0.21, 0.2, 0.4, 0.51]
d = [ 0.4, 0.2, 0.5]

I'd like a correlation matrix of 16 Pearson coefficients…

import pandas as pd
import numpy as np
from scipy.stats import pearsonr

a = [0, 0.4, 0.2, 0.4, 0.2, 0.4, 0.2, 0.5]
b = [25, 40, 62, 58, 53, 54]
c = [0, 0.4, 0.2, 0.4, 0.2, 0.45, 0.2, 0.52, 0.52, 0.4, 0.21, 0.2, 0.4, 0.51]
d = [0.4, 0.2, 0.5]

# To store the data
dict_series = {'a': a, 'b': b, 'c': c, 'd': d}
list_series_names = list(dict_series.keys())

def get_max_correlation_from_lists(a, b):
    # Make sure the longest list is the one that goes in the dataframe
    if len(b) >= len(a):
        a, b = b, a
    # Body taken from the original code
    df = pd.DataFrame(dict(x=a))
    CORR_VALS = np.array(b)

    def get_correlation(vals):
        return pearsonr(vals, CORR_VALS)[0]

    # Collect the max over all window positions
    return df.rolling(window=len(CORR_VALS)).apply(get_correlation).max().values[0]

# Create the "correlations" matrix
correlations_matrix = pd.DataFrame(index=list_series_names, columns=list_series_names)
for i in list_series_names:
    for j in list_series_names:
        correlations_matrix.loc[i, j] = get_max_correlation_from_lists(dict_series[i], dict_series[j])
print(correlations_matrix)
          a         b         c         d
a       1.0  0.527932  0.995791       1.0
b  0.527932       1.0   0.52229  0.427992
c  0.995791   0.52229       1.0  0.992336
d       1.0  0.427992  0.992336       1.0
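
The double loop above computes every pair twice and, judging by the ragged number formatting in the printout, stores the values as generic objects rather than floats. One possible variant is sketched below. It is not a drop-in replacement: it assumes NumPy 1.20 or later for `sliding_window_view`, uses `np.corrcoef` in place of `pearsonr`, and the names `max_sliding_corr`, `series` and `matrix` are hypothetical.

```python
import itertools

import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

def max_sliding_corr(x, y):
    """Largest Pearson r between the shorter array and any
    equally long contiguous window of the longer one."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    long_arr, short_arr = (x, y) if len(x) >= len(y) else (y, x)
    windows = sliding_window_view(long_arr, len(short_arr))
    corrs = [np.corrcoef(w, short_arr)[0, 1] for w in windows]
    # Constant windows would give NaN; nanmax ignores them
    return np.nanmax(corrs)

series = {
    'a': [0, 0.4, 0.2, 0.4, 0.2, 0.4, 0.2, 0.5],
    'b': [25, 40, 62, 58, 53, 54],
    'c': [0, 0.4, 0.2, 0.4, 0.2, 0.45, 0.2, 0.52, 0.52, 0.4, 0.21, 0.2, 0.4, 0.51],
    'd': [0.4, 0.2, 0.5],
}
names = list(series)

# Float matrix, filled one unordered pair at a time and mirrored
matrix = pd.DataFrame(np.nan, index=names, columns=names)
for i, j in itertools.combinations_with_replacement(names, 2):
    value = max_sliding_corr(series[i], series[j])
    matrix.loc[i, j] = value
    matrix.loc[j, i] = value

print(matrix.round(6))
```

Because the result is symmetric by construction, only 10 of the 16 entries are actually computed; the rest are mirrored.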
