I have a problem with my LSTM. Here is what I am trying to do:
I have a dataset in the following format:
0.04,-9.77,0.71,1,0,0,0
...
...
The first three columns are data collected by an accelerometer: X acceleration, Y acceleration, Z acceleration.
The last four columns are the labels:
[1,0,0,0] [0,1,0,0] [0,0,1,0] [0,0,0,1] [0,0,0,0]
where each one represents a different class.
My network is declared as follows:
class Config:
    def __init__(self):
        """network parameters"""
        self.batch_size = 16
        self.input_size = 3
        self.seq_max_len = 20
        self.rnn_size = 50
        self.keep_prob = 1
        self.mlp_hidden_size = 100
        self.mlp_projection_activation = tf.nn.tanh
        self.num_classes = 4
        self.learning_rate = 0.001
        self.epochs = 10
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(config.seq_max_len, config.input_size)),
    tf.keras.layers.LSTM(units=config.rnn_size, return_sequences=True, return_state=False),
    tf.keras.layers.Dense(units=config.mlp_hidden_size, activation=config.mlp_projection_activation),
    tf.keras.layers.Dense(units=config.num_classes, activation='softmax'),
])
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=config.batch_size, epochs=config.epochs)
Now, the problem is that this does not work the way I hoped. When I try to predict, say, with an array:
arr = np.array([(-0.12, -9.85, 0.82), (-1.33, -10, 1.61), (-1.57, -10.04, 0.9),
                (0.08, -9.14, 0.51), (3.77, -8.36, -0.55), (6.71, -8.43, -1.69),
                (9.22, -8.28, -2.63), (10.75, -7.65, -2.98), (9.26, -7.61, -2.35),
                (6.16, -7.85, -1.77), (2.35, -8.51, -0.78), (-1.10, -8.87, 0.71),
                (-3.61, -9.14, 2.31), (-5.49, -9.65, 3.69), (-5.33, -9.49, 3.14),
                (-4.24, -9.26, 3.30), (-2.43, -9.06, 2.24), (-0.39, -8.87, 1.29),
                (3.61, -8.55, -1.22), (7.10, -8.28, -1.57)])
composed of 20 three-dimensional vectors (acceleration triples), what I get is
predictions = model.predict_classes(arr)
[[0 2 2 0 0 0 0 0 0 0 0 2 2 2 2 2 2 2 0 0]]
which is a vector containing one prediction for each triple in arr. What I want instead is a single prediction after the 20 triples. This is because my data represent a time series, and I am interested in knowing whether the network can classify the data after a given number of acceleration vectors (20 in this case).
Can you help me?
EDIT
Full code:
import tensorflow as tf
import numpy as np
import pandas as pd
import random
import sys
np.set_printoptions(threshold=sys.maxsize)
def get_dataset(filename, config):
    df = pd.read_csv(filename, header=None, skiprows=1)
    x = df[[0, 1, 2]].values
    y = df[[3, 4, 5, 6]].values
    dataset_x, dataset_y = [], []
    for i in range(x.shape[0] // config.seq_max_len):
        sequence_x, sequence_y = [], []
        for j in range(config.seq_max_len):
            sequence_x.append(x[i*config.seq_max_len + j])
            sequence_y.append(y[i*config.seq_max_len + j])
        dataset_x.append(sequence_x)
        dataset_y.append(sequence_y)
    return np.array(dataset_x), np.array(dataset_y)
class Config:
    def __init__(self):
        """network parameters"""
        self.batch_size = 16
        self.input_size = 3
        self.seq_max_len = 20
        self.rnn_size = 50
        self.keep_prob = 1
        self.mlp_hidden_size = 100
        self.mlp_projection_activation = tf.nn.tanh
        self.num_classes = 4
        self.learning_rate = 0.001
        self.epochs = 10
config = Config()
x_train, y_train = get_dataset('data_new.csv', config)
arr = np.array([(-0.12, -9.85, 0.82), (-1.33, -10, 1.61), (-1.57, -10.04, 0.9),
                (0.08, -9.14, 0.51), (3.77, -8.36, -0.55), (6.71, -8.43, -1.69),
                (9.22, -8.28, -2.63), (10.75, -7.65, -2.98), (9.26, -7.61, -2.35),
                (6.16, -7.85, -1.77), (2.35, -8.51, -0.78), (-1.10, -8.87, 0.71),
                (-3.61, -9.14, 2.31), (-5.49, -9.65, 3.69), (-5.33, -9.49, 3.14),
                (-4.24, -9.26, 3.30), (-2.43, -9.06, 2.24), (-0.39, -8.87, 1.29),
                (3.61, -8.55, -1.22), (7.10, -8.28, -1.57)])
arr2 = np.reshape(arr,(1,20,3))
print(arr2.shape)
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(config.seq_max_len, config.input_size)),
    tf.keras.layers.LSTM(units=config.rnn_size, return_sequences=True, return_state=False),
    tf.keras.layers.Dense(units=config.mlp_hidden_size, activation=config.mlp_projection_activation),
    tf.keras.layers.Dense(units=config.num_classes, activation='softmax'),
])
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=config.batch_size, epochs=config.epochs)
predictions = model.predict(arr2)
predictions = np.argmax(predictions, axis=-1)
print("PREDICTIONS---------")
print(predictions.shape)
print(predictions)
There are probably two problems here. First, if you set
tf.keras.layers.LSTM(units=.., return_sequences=True, return_state=False)
as the last recurrent layer of your model and print model.summary(),
you get the following for the last layer. This is probably not the last layer you want:
dense_5 (Dense) (None, 20, 4) 404
=================================================================
So you should use return_sequences=False instead,
which gives a final layer output shape like this:
dense_7 (Dense) (None, 4) 404
=================================================================
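To see the effect of return_sequences concretely, here is a minimal sketch (with a made-up unit count, not your exact config) that applies an LSTM to a dummy batch in both modes and prints the output shapes:

```python
import numpy as np
import tensorflow as tf

# One dummy sequence: batch of 1, 20 timesteps, 3 features (like your accelerometer triples)
x = np.zeros((1, 20, 3), dtype=np.float32)

# return_sequences=True: one output vector per timestep
seq_lstm = tf.keras.layers.LSTM(units=4, return_sequences=True)
print(seq_lstm(x).shape)   # (1, 20, 4)

# return_sequences=False: a single output vector for the whole sequence
last_lstm = tf.keras.layers.LSTM(units=4, return_sequences=False)
print(last_lstm(x).shape)  # (1, 4)
```

With return_sequences=False, the Dense layers that follow see one vector per sequence, so the model emits one class prediction per sequence, which is what you asked for.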
Second, in your loss function you set
....CategoricalCrossentropy(from_logits=True)
but in the last layer you set activation='softmax',
which produces probabilities, not logits:
....Dense(units=config.num_classes, activation='softmax')
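The two settings just have to agree: either keep activation='softmax' and use from_logits=False, or drop the softmax (linear activation on the last Dense) and keep from_logits=True. A small sketch showing that the two consistent combinations compute the same loss:

```python
import numpy as np
import tensorflow as tf

logits = np.array([[2.0, 1.0, 0.1, -1.0]], dtype=np.float32)  # raw last-layer outputs
probs = tf.nn.softmax(logits)                                 # what a softmax layer would emit
y_true = np.array([[1.0, 0.0, 0.0, 0.0]], dtype=np.float32)

loss_from_logits = tf.keras.losses.CategoricalCrossentropy(from_logits=True)(y_true, logits)
loss_from_probs = tf.keras.losses.CategoricalCrossentropy(from_logits=False)(y_true, probs)

print(float(loss_from_logits), float(loss_from_probs))  # the two values match
```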
So, based on this, set the parameters as follows:
....LSTM(units=.., return_sequences=False, return_state=False)
...
....CategoricalCrossentropy(from_logits=False) # compute probabilities
...
y_pred = model.predict(arr)
y_pred = np.argmax(y_pred, axis=-1)
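One more thing to watch: with return_sequences=False your training labels also need one row per sequence rather than one per timestep, so get_dataset's y (shape (N, 20, 4)) would have to be reduced to (N, 4), e.g. by taking the last timestep's label, assuming each 20-step window belongs to a single class. A minimal end-to-end sketch on random stand-in data (not your CSV):

```python
import numpy as np
import tensorflow as tf

seq_max_len, input_size, num_classes = 20, 3, 4

# Dummy data standing in for the CSV: 64 sequences, one one-hot label per sequence.
x_train = np.random.randn(64, seq_max_len, input_size).astype(np.float32)
y_train = tf.keras.utils.to_categorical(
    np.random.randint(num_classes, size=64), num_classes)

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(seq_max_len, input_size)),
    tf.keras.layers.LSTM(units=50, return_sequences=False),  # one vector per sequence
    tf.keras.layers.Dense(units=100, activation='tanh'),
    tf.keras.layers.Dense(units=num_classes, activation='softmax'),
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=16, epochs=1, verbose=0)

arr = np.random.randn(1, seq_max_len, input_size).astype(np.float32)
y_pred = np.argmax(model.predict(arr), axis=-1)
print(y_pred.shape)  # (1,) -- a single class prediction for the whole sequence
```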