Is there a simple way to translate a model like this from Keras to PyTorch?
I have the following code in Keras:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l2
state_dim = 10
architecture = (256, 256) # units per layer
learning_rate = 0.0001 # learning rate
l2_reg = 0.00000001 # L2 regularization
trainable = True
num_actions = 3
layers = []
n = len(architecture) # n = 2
for i, units in enumerate(architecture, 1):
    layers.append(Dense(units=units,
                        input_dim=state_dim if i == 1 else None,
                        activation='relu',
                        kernel_regularizer=l2(l2_reg),
                        name=f'Dense_{i}',
                        trainable=trainable))
layers.append(Dropout(.1))
layers.append(Dense(units=num_actions,
                    trainable=trainable,
                    name='Output'))
model = Sequential(layers)
model.compile(loss='mean_squared_error',
              optimizer=Adam(lr=learning_rate))
The output looks like this:
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
Dense_1 (Dense) (None, 256) 2816
_________________________________________________________________
Dense_2 (Dense) (None, 256) 65792
_________________________________________________________________
dropout_3 (Dropout) (None, 256) 0
_________________________________________________________________
Output (Dense) (None, 3) 771
=================================================================
Total params: 69,379
Trainable params: 69,379
Non-trainable params: 0
_________________________________________________________________
None
I must admit I'm a bit out of my depth here, so any suggestions are appreciated. I'm working my way through the PyTorch docs, and if I can figure it out I'll update my question with a possible answer.
Here is my best attempt:
state_dim = 10
architecture = (256, 256) # units per layer
learning_rate = 0.0001 # learning rate
l2_reg = 0.00000001 # L2 regularization
trainable = True
num_actions = 3
import torch
from torch import nn
class CustomModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(state_dim, architecture[0]),
            nn.ReLU(),
            nn.Linear(architecture[0], architecture[1]),
            nn.ReLU(),
            nn.Dropout(0.25),
            nn.Linear(architecture[1], num_actions),
        )

    def forward(self, x):
        return self.layers(x)
model = CustomModel()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
It prints output that looks promising:
CustomModel(
  (layers): Sequential(
    (0): Linear(in_features=10, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Dropout(p=0.25, inplace=False)
    (5): Linear(in_features=256, out_features=3, bias=True)
  )
)
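As a quick sanity check (my own addition, using the usual numel() counting idiom), I also compared the parameter count against the Keras summary above:

# Compare the PyTorch parameter count against the Keras summary (69,379)
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(n_params)  # 10*256+256 + 256*256+256 + 256*3+3 = 69,379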
However, a few items remain unanswered:
- Are the activations in the right place?
- How do I add kernel_regularizer=l2(l2_reg) to the first two Linear/Dense layers?
- How do we make those layers trainable?
Any input is welcome; my rough guess at the last two points is sketched below.
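For the regularization and trainability questions, here is a minimal, unverified sketch of what I think the PyTorch equivalents might be: weight_decay on the optimizer (which, as far as I understand, applies an L2 penalty to all parameters, so it only approximates Keras' kernel_regularizer, which touches only the kernels of the first two layers), and requires_grad_() for the Keras trainable flag:

# Rough sketch, not verified:
# weight_decay applies an L2 penalty to *all* parameters, so it only
# approximates kernel_regularizer=l2(l2_reg) on the first two Dense layers.
optimizer = torch.optim.Adam(model.parameters(),
                             lr=learning_rate,
                             weight_decay=l2_reg)

# Per-layer trainability, which I believe corresponds to trainable=True/False:
for layer in (model.layers[0], model.layers[2]):  # the first two Linear layers
    for p in layer.parameters():
        p.requires_grad_(trainable)

I'm not sure whether this really matches what Keras does (in particular, whether weight_decay inside Adam behaves identically to an L2 term added to the loss), so corrections are welcome.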