Deeplearning and deepwater models give very different log loss (0.4 vs 0.6)

On AWS, I launched a g2.2xlarge EC2 instance following the instructions here, using the community AMI ami-97591381 (h2o version: 3.13.0.356).

Here is my code, which you can run once I make the S3 links public:

library(h2o)
library(jsonlite)
library(curl)
localH2O <- h2o.init()
# Load the data and make the response a factor for binary classification
df.truth <- h2o.importFile("https://s3.amazonaws.com/nw.data.test.us.east/df.truth.zeroed", header = TRUE, sep = ",")
df.truth$isFemale <- h2o.asfactor(df.truth$isFemale)
hotnames.truth <- fromJSON("https://s3.amazonaws.com/nw.data.test.us.east/hotnames.json", simplifyVector = TRUE)
# Training and validation sets (90/10 split)
splits <- h2o.splitFrame(df.truth, c(0.9), seed = 1234)
train.truth <- h2o.assign(splits[[1]], "train.truth.hex")
valid.truth <- h2o.assign(splits[[2]], "valid.truth.hex")
# Train a model using non-GPU deeplearning
dl.2 <- h2o.deeplearning(
  training_frame = train.truth, model_id = "dl.2",
  validation_frame = valid.truth,
  x = setdiff(hotnames.truth[1:(length(hotnames.truth)/2)], c("isFemale", "nwtcs")),
  y = "isFemale", stopping_metric = "AUTO", seed = 1,
  sparse = FALSE, mini_batch_size = 20)
# Train a model using GPU-enabled deepwater
dw.2 <- h2o.deepwater(
  training_frame = train.truth, model_id = "dw.2",
  validation_frame = valid.truth,
  x = setdiff(hotnames.truth[1:(length(hotnames.truth)/2)], c("isFemale", "nwtcs")),
  y = "isFemale", stopping_metric = "AUTO", seed = 1,
  sparse = FALSE, mini_batch_size = 20)

When I inspect the two models, I am surprised to find a large difference in log loss:

Non-GPU

print(dl.2)
Model Details:
==============
H2OBinomialModel: deeplearning
Model ID:  dl.2
Status of Neuron Layers: predicting isFemale, 2-class classification, bernoulli distribution, CrossEntropy loss, 160,802 weights/biases, 2.0 MB, 1,041,465 training samples, mini-batch size 1
  layer units      type dropout       l1       l2 mean_rate rate_rms momentum
1     1   600     Input  0.00 %
2     2   200 Rectifier  0.00 % 0.000000 0.000000  0.104435 0.102760 0.000000
3     3   200 Rectifier  0.00 % 0.000000 0.000000  0.031395 0.055490 0.000000
4     4     2   Softmax         0.000000 0.000000  0.001541 0.001438 0.000000
  mean_weight weight_rms mean_bias bias_rms
1
2    0.018904   0.144034  0.150630 0.415525
3   -0.023333   0.081914  0.545394 0.251275
4    0.029091   0.295439 -0.004396 0.357609
H2OBinomialMetrics: deeplearning
** Reported on training data. **
** Metrics reported on temporary training frame with 9877 samples **
MSE:  0.1213733
RMSE:  0.3483868
LogLoss:  0.388214
Mean Per-Class Error:  0.2563669
AUC:  0.8433182
Gini:  0.6866365
Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
          0    1    Error        Rate
0      6546 1079 0.141508  =1079/7625
1       836 1416 0.371226   =836/2252
Totals 7382 2495 0.193885  =1915/9877
H2OBinomialMetrics: deeplearning
** Reported on validation data. **
** Metrics reported on full validation frame **
MSE:  0.126671
RMSE:  0.3559087
LogLoss:  0.4005941
Mean Per-Class Error:  0.2585051
AUC:  0.8309913
Gini:  0.6619825
Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
           0    1    Error         Rate
0      11746 3134 0.210618  =3134/14880
1       1323 2995 0.306392   =1323/4318
Totals 13069 6129 0.232160  =4457/19198

GPU-enabled

print(dw.2)
Model Details:
==============
H2OBinomialModel: deepwater
Model ID:  dw.2b
Status of Deep Learning Model: MLP: [200, 200], 630.8 KB, predicting isFemale, 2-class classification, 1,708,160 training samples, mini-batch size 20
  input_neurons     rate momentum
1           600 0.000369 0.900000

H2OBinomialMetrics: deepwater
** Reported on training data. **
** Metrics reported on temporary training frame with 9877 samples **
MSE:  0.1615781
RMSE:  0.4019677
LogLoss:  0.629549
Mean Per-Class Error:  0.3467246
AUC:  0.7289561
Gini:  0.4579122
Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
          0    1    Error        Rate
0      4843 2782 0.364852  =2782/7625
1       740 1512 0.328597   =740/2252
Totals 5583 4294 0.356586  =3522/9877
H2OBinomialMetrics: deepwater
** Reported on validation data. **
** Metrics reported on full validation frame **
MSE:  0.1651776
RMSE:  0.4064205
LogLoss:  0.6901861
Mean Per-Class Error:  0.3476629
AUC:  0.7187362
Gini:  0.4374724
Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
          0    1    Error         Rate
0      8624 6256 0.420430  =6256/14880
1      1187 3131 0.274896   =1187/4318
Totals 9811 9387 0.387697  =7443/19198

As shown above, the log loss differs greatly between the non-GPU and GPU models:

Logloss
+-----------------+---------+------+
|                 | non-GPU | GPU  |
+-----------------+---------+------+
| training data   | 0.39    | 0.63 |
+-----------------+---------+------+
| validation data | 0.40    | 0.69 |
+-----------------+---------+------+

I understand that, due to the stochastic nature of training, I will get different results from run to run, but I would not expect such a huge difference between non-GPU and GPU.
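For reference, here is a minimal sketch of how I could rule out run-to-run randomness on the non-GPU side (reproducible is a documented h2o.deeplearning flag that forces deterministic, single-threaded training; predictors below is just a shorthand for the x I used above):

# Shorthand for the predictor columns used above
predictors <- setdiff(hotnames.truth[1:(length(hotnames.truth)/2)],
                      c("isFemale", "nwtcs"))
# Deterministic training: the same seed now yields identical models,
# so any remaining gap vs. deepwater is not run-to-run noise
dl.repro <- h2o.deeplearning(
  training_frame = train.truth, validation_frame = valid.truth,
  x = predictors, y = "isFemale",
  seed = 1, reproducible = TRUE)  # single-threaded, hence slow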

h2o.deeplearning is H2O's built-in deep learning algorithm. It parallelizes very well and handles big data nicely, but it does not use GPUs.

h2o.deepwater is (probably) a wrapper around Tensorflow, and (probably) uses the GPU (though it can use the CPU, and it can use different backends).
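For example, here is a minimal sketch of pinning the backend and device explicitly rather than relying on defaults (backend and gpu are documented h2o.deepwater arguments; as far as I recall the defaults are "mxnet" and TRUE, but verify for your h2o version; predictors is the shorthand defined in the question):

# Sketch: make the deepwater implementation and device explicit
dw.cpu <- h2o.deepwater(
  training_frame = train.truth, validation_frame = valid.truth,
  x = predictors, y = "isFemale",
  backend = "tensorflow",   # or "mxnet" / "caffe"
  gpu = FALSE,              # run the same backend on the CPU
  seed = 1, sparse = FALSE, mini_batch_size = 20)

Comparing gpu = TRUE against gpu = FALSE on the same backend separates "different implementation" effects from "different device" effects.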

In other words, this is not really a CPU vs. GPU difference: you are using two different deep learning implementations.

As an aside, I'd suggest increasing the number of epochs (from the default of 10 to 200; bear in mind that this means a 20x longer runtime) and seeing whether the difference persists. Or compare the scoring history charts to see whether Tensorflow gets there too, but simply needs, say, 50% more epochs to reach the same log loss.
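A minimal sketch of that comparison (plot() and h2o.scoreHistory() are standard h2o helpers for fitted models; 200 epochs is just an illustration, and predictors is the shorthand from the question):

# Sketch: train both implementations longer and compare their per-epoch
# scoring histories on the same logloss scale
dl.long <- h2o.deeplearning(
  training_frame = train.truth, validation_frame = valid.truth,
  x = predictors, y = "isFemale", seed = 1, epochs = 200)
dw.long <- h2o.deepwater(
  training_frame = train.truth, validation_frame = valid.truth,
  x = predictors, y = "isFemale", seed = 1, epochs = 200,
  sparse = FALSE, mini_batch_size = 20)
plot(dl.long, timestep = "epochs", metric = "logloss")  # scoring history curve
plot(dw.long, timestep = "epochs", metric = "logloss")
h2o.scoreHistory(dw.long)  # the same history as a table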
