I am trying to normalize my CSV data with decimal scaling, using the following code:
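import numpy as np

def decimal_scaling(data):
    data = np.array(data, dtype=np.float32)
    # Maximum of each column (axis=0)
    max_row = data.max(axis=0)
    # Number of digits in the integer part of each column maximum
    c = np.array([len(str(int(number))) for number in np.abs(max_row)])
    return data/(10**c)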
X = decimal_scaling(
    glcm_df[['dissimilarity_0', 'dissimilarity_45', 'dissimilarity_90', 'dissimilarity_135',
             'correlation_0', 'correlation_45', 'correlation_90', 'correlation_135',
             'homogeneity_0', 'homogeneity_45', 'homogeneity_90', 'homogeneity_135',
             'contrast_0', 'contrast_45', 'contrast_90', 'contrast_135',
             'ASM_0', 'ASM_45', 'ASM_90', 'ASM_135',
             'energy_0', 'energy_45', 'energy_90', 'energy_135']].values)
However, every time I run it I get this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-21-5b1233475b8c> in <module>
22 'contrast_0', 'contrast_45', 'contrast_90', 'contrast_135',
23 'ASM_0', 'ASM_45', 'ASM_90', 'ASM_135',
---> 24 'energy_0', 'energy_45', 'energy_90', 'energy_135']].values)
<ipython-input-21-5b1233475b8c> in decimal_scaling(data)
13 data = np.array(data, dtype=np.float32)
14 max_row = data.max(axis=0)
---> 15 c = np.array([len(str(int(number))) for number in np.abs(max_row)])
16 return data/(10**c)
17
<ipython-input-21-5b1233475b8c> in <listcomp>(.0)
13 data = np.array(data, dtype=np.float32)
14 max_row = data.max(axis=0)
---> 15 c = np.array([len(str(int(number))) for number in np.abs(max_row)])
16 return data/(10**c)
17
ValueError: cannot convert float NaN to integer
I am not sure what is going wrong.
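NumPy floats allow NaN values, but int does not. So a NaN propagates through the floating-point calculations until it reaches the int conversion.
That is, the data you are reading contains some NaN values, so max returns NaN for every column that contains one (with axis=0 the maximum is taken down each column), and abs returns NaN as well. Then int() complains.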
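The propagation described above can be reproduced on a tiny array. This is only a sketch: `sample` and `can_convert_to_int` are made-up names for illustration, not part of your code.

```python
import numpy as np

# Hypothetical 3x2 sample: the second column contains one NaN.
sample = np.array([[1.0, 20.0],
                   [3.0, np.nan],
                   [2.0, 5.0]], dtype=np.float32)

# max propagates NaN: any column containing a NaN gets NaN as its maximum.
col_max = sample.max(axis=0)

# nanmax skips NaN entries instead of propagating them.
col_nanmax = np.nanmax(sample, axis=0)

def can_convert_to_int(value):
    """Return True if int(value) succeeds, False if it raises ValueError."""
    try:
        int(value)
        return True
    except ValueError:
        return False

print(col_max, col_nanmax)
print(can_convert_to_int(col_max[0]), can_convert_to_int(col_max[1]))
```

The first column converts without trouble, while the NaN maximum of the second column is what triggers the ValueError in your list comprehension.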
Try:
data = np.array(data, dtype=np.float32) # from your code
print(np.argwhere(np.isnan(data)))
to find where your NaN values are.
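Once you know where the NaN values are, one possible way forward (an assumption on my part, since dropping or filling those rows may suit your data better) is to make the scaling itself ignore them. The sketch below is not your original function: `decimal_scaling_nan_safe` is a hypothetical name, it uses float64 rather than float32, and it takes the maximum of the absolute values rather than the absolute value of the maximum.

```python
import numpy as np

def decimal_scaling_nan_safe(data):
    """Decimal scaling that ignores NaN when finding each column's maximum.

    Hypothetical variant for illustration; NaN entries stay NaN in the output.
    """
    data = np.asarray(data, dtype=np.float64)
    abs_data = np.abs(data)

    # A column that is entirely NaN has no maximum; treat it as 0 so that
    # nanmax has something to return (the column itself remains NaN).
    all_nan = np.isnan(abs_data).all(axis=0)
    max_abs = np.nanmax(np.where(all_nan, 0.0, abs_data), axis=0)

    # Digits in the integer part of each column maximum, as in the question.
    digits = np.array([len(str(int(m))) for m in max_abs])

    # Float base avoids integer overflow for very large digit counts.
    return data / (10.0 ** digits)

# Made-up sample with NaN in both columns.
sample = np.array([[1.0, 250.0],
                   [np.nan, 30.0],
                   [9.0, np.nan]])
scaled = decimal_scaling_nan_safe(sample)
print(scaled)
```

Here the column maxima are 9 and 250, so the columns are divided by 10 and 1000 respectively, and the NaN cells are left in place for you to handle separately.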