What can I do about the code below? I am building a stock prediction model with an LSTM, and every time I try to run the following code to normalize my newly filtered dataset, I get the error shown further down:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from keras.models import Sequential
from keras.layers import LSTM,Dropout,Dense
from matplotlib.pylab import rcParams
rcParams['figure.figsize']=20,10
from sklearn.preprocessing import MinMaxScaler
scaler=MinMaxScaler(feature_range=(0,1))
scaler=MinMaxScaler(feature_range=(0,1))
final_dataset=new_dataset.values
train_data=final_dataset[0:987,:]
valid_data=final_dataset[987:,:]
new_dataset.index=new_dataset.Date
new_dataset.drop("Date",axis=1,inplace=True)
scaler=MinMaxScaler(feature_range=(0,1))
scaled_data=scaler.fit_transform(final_dataset)
x_train_data,y_train_data=[],[]
for i in range(60,len(train_data)):
    x_train_data.append(scaled_data[i-60:i,0])
    y_train_data.append(scaled_data[i,0])
x_train_data,y_train_data=np.array(x_train_data),np.array(y_train_data)
x_train_data=np.reshape(x_train_data,(x_train_data.shape[0],x_train_data.shape[1],1))
Every time I run it, I get the error below. I have tried to correct it several times, but it keeps popping up. Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-37-15343d926050> in <module>
8 new_dataset.drop("Date",axis=1,inplace=True)
9 scaler=MinMaxScaler(feature_range=(0,1))
---> 10 scaled_data=scaler.fit_transform(final_dataset)
11
12 x_train_data,y_train_data=[],[]
~\anaconda3\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
697 if y is None:
698 # fit method of arity 1 (unsupervised transformation)
--> 699 return self.fit(X, **fit_params).transform(X)
700 else:
701 # fit method of arity 2 (supervised transformation)
~\anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in fit(self, X, y)
361 # Reset internal state before fitting
362 self._reset()
--> 363 return self.partial_fit(X, y)
364
365 def partial_fit(self, X, y=None):
~\anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in partial_fit(self, X, y)
394
395 first_pass = not hasattr(self, 'n_samples_seen_')
--> 396 X = self._validate_data(X, reset=first_pass,
397 estimator=self, dtype=FLOAT_DTYPES,
398 force_all_finite="allow-nan")
~\anaconda3\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
419 out = X
420 elif isinstance(y, str) and y == 'no_validation':
--> 421 X = check_array(X, **check_params)
422 out = X
423 else:
~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
~\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
614 array = array.astype(dtype, casting="unsafe", copy=False)
615 else:
--> 616 array = np.asarray(array, order=order, dtype=dtype)
617 except ComplexWarning as complex_warning:
618 raise ValueError("Complex data not supported\n"
~\AppData\Roaming\Python\Python38\site-packages\numpy\core\_asarray.py in asarray(a, dtype, order)
81
82 """
---> 83 return array(a, dtype, copy=False, order=order)
84
85
TypeError: float() argument must be a string or a number, not 'Timestamp'
sklearn expects float values, i.e. numbers, and you are giving it Timestamp objects. The error TypeError: float() argument must be a string or a number, not 'Timestamp' means that Python's built-in float() does not know how to convert a Timestamp into a float.
To avoid the problem, convert the timestamps into numbers yourself before passing them to your function:
- If all your dates are ≥ 1970 and you want resolution finer than one day, use timestamp: new_dataset['Date'] = new_dataset['Date'].apply(pd.Timestamp.timestamp)
- If daily resolution is enough and you do not want the date restriction, use toordinal: new_dataset['Date'] = new_dataset['Date'].apply(pd.Timestamp.toordinal)
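A minimal sketch of both conversions, using a small hypothetical frame in place of new_dataset (the Date and Close columns here are illustrative):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for new_dataset
df = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "Close": [100.0, 101.5, 99.8],
})

# Option 1: seconds since the epoch (sub-day resolution, dates >= 1970)
df["Date_ts"] = df["Date"].apply(pd.Timestamp.timestamp)

# Option 2: ordinal day number (daily resolution, no epoch restriction)
df["Date_ord"] = df["Date"].apply(pd.Timestamp.toordinal)

# Either numeric column can now be scaled without the TypeError
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(df[["Date_ord", "Close"]])
```

Once every column handed to fit_transform is numeric, np.asarray inside check_array no longer has to call float() on a Timestamp, which is exactly where the traceback above blows up.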
I solved this by moving the dates into the index. So I created the index using the following code
df.set_index(['Date'])
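For completeness, a sketch of that index-based fix on a hypothetical frame. Note that set_index returns a new DataFrame unless inplace=True is passed, so the result needs to be assigned back:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for the dataset
df = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "Close": [100.0, 101.5, 99.8],
})

# Moving Date into the index removes it from .values,
# so only numeric columns reach the scaler
df = df.set_index(["Date"])  # set_index is NOT in-place by default

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(df.values)
```

This avoids the Timestamp error entirely because .values now contains only the float Close column, at the cost of the dates no longer being available as a model feature.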