R - Heap usage error with H2O random forest



I get this error when using h2o.randomForest. See the function call and the associated error below.

base_line_rf <- h2o.randomForest(x=2:ncol(train),
                                y=1,
                                ntrees = 10000,
                                mtries = ncol(train)-1,
                                training_frame = train,
                                model_id = model_id,
                                stopping_rounds = 5,
                                stopping_tolerance = 0,
                                stopping_metric = "AUC",
                                binomial_double_trees = TRUE
)

Error:

java.lang.AssertionError: I am really confused about the heap usage; MEM_MAX=7624720384 heapUsedGC=7626295912
    at water.MemoryManager.set_goals(MemoryManager.java:97)
    at water.MemoryManager.malloc(MemoryManager.java:265)
    at water.MemoryManager.malloc(MemoryManager.java:222)
    at water.MemoryManager.malloc8d(MemoryManager.java:281)
    at hex.tree.DHistogram.init(DHistogram.java:281)
    at hex.tree.DHistogram.init(DHistogram.java:240)
    at hex.tree.ScoreBuildHistogram2$ComputeHistoThread.computeChunk(ScoreBuildHistogram2.java:326)
    at hex.tree.ScoreBuildHistogram2$ComputeHistoThread.map(ScoreBuildHistogram2.java:306)
    at water.LocalMR.compute2(LocalMR.java:84)
    at water.LocalMR.compute2(LocalMR.java:76)
    at water.LocalMR.compute2(LocalMR.java:76)
    at water.LocalMR.compute2(LocalMR.java:76)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1255)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.popAndExecAll(ForkJoinPool.java:904)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:977)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

What is the cause of this error?

Thanks

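Based on your question, the H2O cluster needs to run with more memory to accommodate your 10,000-tree random forest. It looks like the H2O cluster (the Java process) was started with about 8 GB of memory, but with 10,000 trees it needs more than that 8 GB.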

MEM_MAX (max_mem_size) = 7624.720384 MB (configured maximum)
heapUsedGC             = 7626.295912 MB (in use, which is more than the configured maximum)
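
The two figures above come straight from the byte counts in the AssertionError message. As a small sketch in base R (the helper names bytes_to_mb and bytes_to_gib are made up for illustration), the conversion and the size of the overshoot can be checked like this:

```r
# Values copied from the AssertionError message, in bytes.
mem_max      <- 7624720384   # MEM_MAX
heap_used_gc <- 7626295912   # heapUsedGC

# Hypothetical helpers, for illustration only.
bytes_to_mb  <- function(b) b / 1e6      # decimal megabytes, as listed above
bytes_to_gib <- function(b) b / 1024^3   # binary gibibytes

# How far the used heap is above the configured maximum, in bytes.
overshoot <- heap_used_gc - mem_max

cat(sprintf("MEM_MAX:    %.6f MB (%.2f GiB)\n",
            bytes_to_mb(mem_max), bytes_to_gib(mem_max)))
cat(sprintf("heapUsedGC: %.6f MB (%.2f GiB)\n",
            bytes_to_mb(heap_used_gc), bytes_to_gib(heap_used_gc)))
cat(sprintf("Overshoot:  %.0f bytes\n", overshoot))
```

The used heap is only slightly above the limit, which suggests the cluster was running right at its ceiling when the histogram allocation failed.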

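Since it looks like you are using H2O from R, you can pass max_mem_size = "12G" to the h2o.init() function (meaning the H2O cluster starts with 12 GB of memory), as shown below. That should fit your random forest requirements: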

h2o.init(max_mem_size="12G")
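
One thing to watch for: if an H2O cluster is already running, h2o.init() attaches to it rather than starting a new one, so the new max_mem_size has no effect until the old cluster is shut down. A minimal sketch of a restart, assuming a local single-node cluster started from R (restart_h2o is a hypothetical helper, not part of the h2o package, and the 5-second pause is an arbitrary choice):

```r
# Hypothetical helper (not part of the h2o package): restart the local
# H2O cluster so that a new max_mem_size actually takes effect.
restart_h2o <- function(max_mem_size = "12G") {
  # h2o.clusterIsUp() throws an error when there is no connection yet,
  # so treat any error as "no cluster running".
  running <- tryCatch(h2o::h2o.clusterIsUp(), error = function(e) FALSE)
  if (isTRUE(running)) {
    # Shut the existing cluster down without the interactive prompt.
    h2o::h2o.shutdown(prompt = FALSE)
    Sys.sleep(5)  # arbitrary pause to let the Java process exit
  }
  # Start a fresh cluster with the requested heap size.
  h2o::h2o.init(max_mem_size = max_mem_size)
}

# Usage (requires the h2o package, Java, and enough free RAM):
# restart_h2o("12G")
```

Shutting the cluster down discards every frame and model held in it, so the training frame has to be loaded again before calling h2o.randomForest.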

You can also check your H2O cluster details with the following command:

> h2o.clusterInfo()
R is connected to the H2O cluster: 
    H2O cluster uptime:         19 seconds 80 milliseconds 
    H2O cluster version:        3.14.0.3 
    H2O cluster version age:    27 days  
    H2O cluster name:           H2O_started_from_R_avkashchauhan_hwc594 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   10.65 GB <=== This is the max memory size
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         XGBoost, Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.1 (2017-06-30) 
