I am learning numpy and have this csv file:
,August,September,October,November,December,January
Johney,84,81.3,82.8,80.1,77.4,75.2
Miki,79.6,75.2,75,74.3,72.8,71.4
Ali,67.5,66.5,65.3,65.9,65.6,64
Bob,110.7,108.2,104.1,101,98.3,95.5
It needs to be turned into a table showing the change in weight relative to the previous month, like this:
[[ 0., -0.03214286, 0.01845018, -0.0326087 , -0.03370787, -0.02842377],
[ 0., -0.05527638, -0.00265957, -0.00933333, -0.02018843, -0.01923077],
[ 0., -0.01481481, -0.01804511, 0.00918836, -0.00455235, -0.02439024],
[ 0., -0.02258356, -0.03789279, -0.02977906, -0.02673267, -0.02848423]]
I had some other questions about this file, so my code looks like this:
import numpy as np

def load_training_data(filename):
    data = np.genfromtxt(filename, delimiter=',', skip_header=1)
    # drop the first column (the names), which genfromtxt reads as nan
    data = data[:, 1:]
    with open(filename, 'r') as file:
        header = ((file.readline()).rstrip()).split(',')[1:]
        row = [row.split(',')[0] for row in file]
    column_names = np.array(header)
    row_names = np.array(row)
    return data, column_names, row_names

def get_diff_data(data, column_names, row_names):
    # find the difference between the month columns
    d = np.diff(data)
    # create a column of zeros
    z = np.zeros((len(row_names), 1))
    # add the zero column to the matrix
    t = np.hstack((z, d))
    return t
I managed to compute the first value:
def get_relative_diff_table(data, column_names, row_names):
    dif = get_diff_data(data, column_names, row_names)
    calc = (data[0:1,1] - data[0:1,0])/data[0:1,0]
But I am struggling to apply this calculation to all the other values without writing them out one by one.
With the data:
In [115]: data = np.genfromtxt(txt, delimiter=',', skip_header=1,)
In [116]: data
Out[116]:
array([[ nan, 84. , 81.3, 82.8, 80.1, 77.4, 75.2],
[ nan, 79.6, 75.2, 75. , 74.3, 72.8, 71.4],
[ nan, 67.5, 66.5, 65.3, 65.9, 65.6, 64. ],
[ nan, 110.7, 108.2, 104.1, 101. , 98.3, 95.5]])
The nan column is the string column; we don't need it here (usecols could also be used to load only the numeric columns).
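The usecols alternative mentioned above could look roughly like this. It is only a sketch: csv_text and the io.StringIO wrapper are assumptions standing in for however txt was defined in the session, and the column indices 1 to 6 assume the six month columns shown in the question.

```python
import io
import numpy as np

# Assumed stand-in for the csv file from the question
csv_text = """,August,September,October,November,December,January
Johney,84,81.3,82.8,80.1,77.4,75.2
Miki,79.6,75.2,75,74.3,72.8,71.4
Ali,67.5,66.5,65.3,65.9,65.6,64
Bob,110.7,108.2,104.1,101,98.3,95.5
"""

# Load only columns 1..6 (the months), skipping the name column entirely
numeric = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=',',
    skip_header=1,
    usecols=[1, 2, 3, 4, 5, 6],
)
print(numeric.shape)
```

With this, the data[:, 1:] slicing step is no longer needed because the nan column is never loaded.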
In [117]: data = data[:,1:]
In [118]: np.diff(data, 1)/data[:,:-1]
Out[118]:
array([[-0.03214286, 0.01845018, -0.0326087 , -0.03370787, -0.02842377],
[-0.05527638, -0.00265957, -0.00933333, -0.02018843, -0.01923077],
[-0.01481481, -0.01804511, 0.00918836, -0.00455235, -0.02439024],
[-0.02258356, -0.03789279, -0.02977906, -0.02673267, -0.02848423]])
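The table asked for in the question also has a leading column of zeros for the first month. One way to get that, sketched here as a simplified get_relative_diff_table that takes only the data array (dropping the unused column_names and row_names arguments is my assumption, not part of the original code), is to pad the result with np.hstack:

```python
import numpy as np

def get_relative_diff_table(data):
    """Relative change versus the previous month, with a zero first column."""
    data = np.asarray(data, dtype=float)
    # (current - previous) / previous for each pair of adjacent columns
    rel = np.diff(data, axis=1) / data[:, :-1]
    # the first month has no previous month, so pad it with zeros
    zeros = np.zeros((data.shape[0], 1))
    return np.hstack((zeros, rel))

# The numeric part of the csv from the question
data = np.array([
    [84, 81.3, 82.8, 80.1, 77.4, 75.2],
    [79.6, 75.2, 75, 74.3, 72.8, 71.4],
    [67.5, 66.5, 65.3, 65.9, 65.6, 64],
    [110.7, 108.2, 104.1, 101, 98.3, 95.5],
])
table = get_relative_diff_table(data)
print(table)
```

Dividing by data[:, :-1] works for every row at once because np.diff along axis 1 returns one column fewer than the input, so both arrays have the same shape.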
You can get the first column as strings with:
In [121]: names = np.genfromtxt(txt, delimiter=',', usecols=[0], dtype=str, skip_header=1)
In [122]: names
Out[122]: array(['Johney', 'Miki', 'Ali', 'Bob'], dtype='<U6')