I get the following error when using h2o.randomForest. The function call and the resulting error are shown below.
base_line_rf <- h2o.randomForest(x = 2:ncol(train),
                                 y = 1,
                                 ntrees = 10000,
                                 mtries = ncol(train) - 1,
                                 training_frame = train,
                                 model_id = model_id,  # named argument: use `=`, not `<-`
                                 stopping_rounds = 5,
                                 stopping_tolerance = 0,
                                 stopping_metric = "AUC",
                                 binomial_double_trees = TRUE
)
Error:
java.lang.AssertionError: I am really confused about the heap usage; MEM_MAX=7624720384 heapUsedGC=7626295912
at water.MemoryManager.set_goals(MemoryManager.java:97)
at water.MemoryManager.malloc(MemoryManager.java:265)
at water.MemoryManager.malloc(MemoryManager.java:222)
at water.MemoryManager.malloc8d(MemoryManager.java:281)
at hex.tree.DHistogram.init(DHistogram.java:281)
at hex.tree.DHistogram.init(DHistogram.java:240)
at hex.tree.ScoreBuildHistogram2$ComputeHistoThread.computeChunk(ScoreBuildHistogram2.java:326)
at hex.tree.ScoreBuildHistogram2$ComputeHistoThread.map(ScoreBuildHistogram2.java:306)
at water.LocalMR.compute2(LocalMR.java:84)
at water.LocalMR.compute2(LocalMR.java:76)
at water.LocalMR.compute2(LocalMR.java:76)
at water.LocalMR.compute2(LocalMR.java:76)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1255)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.popAndExecAll(ForkJoinPool.java:904)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:977)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
What is the cause of this error?
Thanks
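Based on your question, you need to start the H2O cluster with more memory to accommodate your 10,000-tree random forest. It looks like the H2O cluster (a Java process) was created with about 8 GB of memory, but with ntrees = 10000 the model needs more than the 8 GB it was given, as the two values from the error message show: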
max_mem_size (MEM_MAX): 7624.720384 MB (configured)
heapUsedGC: 7626.295912 MB (required)
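To see how narrow the margin was, the two byte counts from the assertion message can be converted by hand. The following is only a small base R sketch (no H2O needed); the variable names are made up for illustration, and reading heapUsedGC as "heap in use after garbage collection" is an assumption based on its name.

```r
# Byte counts copied from the AssertionError message in the question
mem_max      <- 7624720384  # MEM_MAX: heap limit of the H2O JVM
heap_used_gc <- 7626295912  # heapUsedGC: assumed to be heap in use after GC

# Convert to binary units (1 GiB = 1024^3 bytes, 1 MiB = 1024^2 bytes)
mem_max_gib   <- mem_max / 1024^3
overshoot_mib <- (heap_used_gc - mem_max) / 1024^2

cat(sprintf("Heap limit: %.2f GiB\n", mem_max_gib))    # Heap limit: 7.10 GiB
cat(sprintf("Overshoot:  %.2f MiB\n", overshoot_mib))  # Overshoot:  1.50 MiB
```

The heap was essentially full when the histogram allocation failed, which is consistent with the advice to give the JVM more headroom.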
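Since you appear to be using H2O from R, you can pass max_mem_size="12G" to h2o.init() (meaning the H2O cluster will start with 12 GB of memory), as shown below; this should meet the requirements of your random forest: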
h2o.init(max_mem_size="12G")
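One practical detail: if an H2O cluster is already running, h2o.init() connects to it and the new max_mem_size does not apply, so the old cluster has to be shut down first. A minimal sketch of that sequence follows; restart_h2o_with_memory is a hypothetical helper (not part of the h2o package), and the wait time is an assumption.

```r
# Hypothetical helper: restart the local H2O cluster so that a new
# max_mem_size actually takes effect.
restart_h2o_with_memory <- function(max_mem_size = "12G", wait_seconds = 5) {
  # Shut down any running cluster; the error raised when no cluster
  # is connected is deliberately ignored.
  try(h2o::h2o.shutdown(prompt = FALSE), silent = TRUE)
  Sys.sleep(wait_seconds)  # assumed pause so the old JVM can exit

  # Start a fresh cluster with the requested heap size.
  h2o::h2o.init(max_mem_size = max_mem_size)

  # Print the cluster summary so the new memory size can be verified.
  h2o::h2o.clusterInfo()
}

# Usage (starts a JVM, so it is left commented out here):
# restart_h2o_with_memory("12G")
```

Shutting the cluster down discards everything stored in it, so the train frame has to be uploaded again before refitting the model.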
You can also check your H2O cluster details with the following command:
> h2o.clusterInfo()
R is connected to the H2O cluster:
H2O cluster uptime: 19 seconds 80 milliseconds
H2O cluster version: 3.14.0.3
H2O cluster version age: 27 days
H2O cluster name: H2O_started_from_R_avkashchauhan_hwc594
H2O cluster total nodes: 1
H2O cluster total memory: 10.65 GB <=== This is the max memory size
H2O cluster total cores: 8
H2O cluster allowed cores: 8
H2O cluster healthy: TRUE
H2O Connection ip: localhost
H2O Connection port: 54321
H2O Connection proxy: NA
H2O Internal Security: FALSE
H2O API Extensions: XGBoost, Algos, AutoML, Core V3, Core V4
R Version: R version 3.4.1 (2017-06-30)
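If the machine cannot spare 12 GB, a smaller model configuration might also reduce memory pressure. The sketch below is an assumption-laden variant of the call from the question, not a tested fix: fit_smaller_rf is a hypothetical wrapper, and the specific values for ntrees, max_depth and score_tree_interval are illustrative guesses that would need tuning on the real data.

```r
# Hypothetical wrapper around the original call with a smaller footprint.
# `train` is assumed to be an H2OFrame with the response in column 1.
fit_smaller_rf <- function(train, model_id = "base_line_rf_small") {
  h2o::h2o.randomForest(
    x = 2:ncol(train),
    y = 1,
    training_frame = train,
    model_id = model_id,
    ntrees = 500,               # assumed cap, far below the original 10000
    max_depth = 12,             # assumed; shallower than the default of 20
    mtries = ncol(train) - 1,   # unchanged from the question
    stopping_rounds = 5,
    stopping_tolerance = 1e-3,  # the H2O default; the question used 0
    stopping_metric = "AUC",
    score_tree_interval = 10    # assumed; score every 10 trees
    # binomial_double_trees is left at its default (FALSE) in this sketch
  )
}

# Usage (requires a running H2O cluster and an uploaded frame):
# small_rf <- fit_smaller_rf(train)
```

With a nonzero stopping_tolerance, early stopping can end training once AUC stops improving noticeably, so the full tree budget may never be built.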