Trying to get some experience with AI on time-series data. So I wrote something that builds a simple sequence and has the model say whether, over 1440 time steps, it ends higher, lower, or within 1.0 of the last number in the sequence.
I'm trying to do this with a GRU or LSTM, but neither seems to train properly. Classification accuracy sits around 30-40% (it seems to pick a single answer and stick with it). I've tried GRU and LSTM, different sizes, more dense layers, fewer dense layers, more items in my sequence, and so on.
Forgive me if any of this isn't particularly pretty Python; I come from C/C++.
import tensorflow as tf
import numpy as np
from tqdm import tqdm
import random
import math
class ShortPredSequence(tf.keras.utils.Sequence):
    def __GenerateSequences(self):
        self.x = []
        self.y = []
        higher = 0
        lower = 0
        similar = 0
        for _ in tqdm(range(0, self.num_seq)):
            sequence = []
            up = round(random.uniform(0, .005), 4)
            down = round(random.uniform(0, .005), 4)
            val = round(random.uniform(1, 1000), 4)
            for x in range(0, 480*3):
                if x % 2 == 0:
                    val += up
                else:
                    val -= down
                sequence.append([val])
            self.x.append(sequence)
            diff = ((up - down) * 480*3)
            y = None
            if diff > 1.0:
                higher += 1
                y = [[1.0], [0.0], [0.0]]
            elif diff < -1.0:
                lower += 1
                y = [[0.0], [0.0], [1.0]]
            else:
                similar += 1
                y = [[0.0], [1.0], [0.0]]
            self.y.append(y)
        print(higher, lower, similar)

    def __init__(self, numSeq, batchSize=32):
        self.batch_size = batchSize
        self.num_seq = numSeq
        self.__GenerateSequences()

    def __len__(self):
        return int(math.floor(self.num_seq / self.batch_size))

    def __getitem__(self, index):
        lower = index * self.batch_size
        upper = lower + self.batch_size
        return np.array(self.x[lower:upper]), np.array(self.y[lower:upper])

sequence = ShortPredSequence(1000, batchSize=64)

model = tf.keras.Sequential([
    tf.keras.layers.GRU(32),
    tf.keras.layers.Dense(256, activation="sigmoid"),
    tf.keras.layers.Dense(256, activation="sigmoid"),
    tf.keras.layers.Dense(3, activation="sigmoid")
])

model.compile(optimizer="adadelta",
              loss="categorical_crossentropy",
              metrics="categorical_accuracy")

model.fit(sequence, epochs=5)
The shapes of (X, y) in your code are (64, 1440, 1) and (64, 3, 1) respectively. The input side is fine: batch_size=64, timesteps=1440, num_feature=1. But the output says each batch entry has 3 samples of one feature each. The model has to classify 3 different classes, which is categorical data. In that case you must pass y_data either as a (64,) array where each element is an integer class id, or one-hot encoded like [[0,0,1],[0,1,0],[1,0,0]] with shape (64, 3). As written, I think the model is trying to learn multiple outputs per input sample. (For genuine multi-input/multi-output use cases you would need model subclassing or the functional API.)
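To make the shape difference concrete, here is an illustrative numpy-only sketch (the variable names are made up for the demo) comparing the two valid label layouts against the layout the original generator produced:

```python
import numpy as np

# Sparse labels: one integer class id per sample -> shape (4,)
y_sparse = np.array([2, 0, 1, 2])

# One-hot labels: one row of length 3 per sample -> shape (4, 3)
y_onehot = np.eye(3)[y_sparse]

# The original generator built column vectors like [[1.0],[0.0],[0.0]],
# which stack to shape (4, 3, 1) -- the extra trailing axis is what makes
# Keras treat this as three one-feature outputs per sample.
y_wrong = np.array([[[1.0], [0.0], [0.0]]] * 4)

print(y_sparse.shape, y_onehot.shape, y_wrong.shape)
# (4,) (4, 3) (4, 3, 1)
```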
I modified your code below:
class ShortPredSequence(tf.keras.utils.Sequence):
    def __init__(self, numSeq, batchSize=32):
        self.batch_size = batchSize
        self.num_seq = numSeq
        self.__GenerateSequences()

    def __GenerateSequences(self):
        self.x = []
        self.y = []
        higher = 0
        lower = 0
        similar = 0
        for i in range(0, self.num_seq):
            sequence = []
            up = round(random.uniform(0, .005), 4)
            down = round(random.uniform(0, .005), 4)
            val = round(random.uniform(1, 1000), 4)
            for i in range(0, 480*3):
                if i % 2 == 0:
                    val += up
                else:
                    val -= down
                sequence.append([val])
            self.x.append(sequence)
            diff = ((up - down) * 480*3)
            y = None
            if diff > 1.0:
                higher += 1
                y = 2  # integer class ids instead of one-hot column vectors
            elif diff < -1.0:
                lower += 1
                y = 0
            else:
                similar += 1
                y = 1
            self.y.append(y)
        # print(higher, lower, similar)
        # print(self.x)

    def __len__(self):
        return int(math.floor(self.num_seq / self.batch_size))

    def __getitem__(self, index):
        lower = index * self.batch_size
        upper = lower + self.batch_size
        # labels now come out as (batch_size, 1) sparse class ids
        return np.array(self.x[lower:upper]), np.expand_dims(np.array(self.y[lower:upper]), axis=-1)

model = tf.keras.Sequential([
    tf.keras.layers.GRU(32),
    tf.keras.layers.Dense(3, kernel_initializer=tf.initializers.zeros),
    tf.keras.layers.Lambda(lambda x: x * 200)
])

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])

model.fit(sequence, epochs=10)
The accuracy fluctuates, but the values are changing.
Output:
Epoch 1/5
15/15 [==============================] - 6s 183ms/step - loss: 1.8426 - accuracy: 0.4531
Epoch 2/5
15/15 [==============================] - 3s 183ms/step - loss: 1.3448 - accuracy: 0.3760
Epoch 3/5
15/15 [==============================] - 3s 184ms/step - loss: 1.1640 - accuracy: 0.3938
Epoch 4/5
15/15 [==============================] - 3s 183ms/step - loss: 1.1139 - accuracy: 0.4667
Epoch 5/5
15/15 [==============================] - 3s 183ms/step - loss: 1.1297 - accuracy: 0.3802
I'm not an expert, but I just gave this a try and this is what I think.
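Since the original complaint was that the network "picks a single answer and sticks with it", one quick way to check for that collapse is to tally the predicted classes over a batch. A hypothetical numpy-only sketch, where `preds` stands in for the output of `model.predict(sequence)`:

```python
import numpy as np

# Stand-in for model.predict(sequence): one row of 3 scores per sample.
preds = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.2, 0.2],
])
classes = np.argmax(preds, axis=-1)         # winning class per sample
counts = np.bincount(classes, minlength=3)  # how often each class wins
print(counts)  # [2 1 1]
```

If one entry of `counts` dominates while the generated labels are balanced, the model really has collapsed onto a single class rather than learned the distinction.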