Hive on Tez error: java.lang.OutOfMemoryError



I get this error when partitioning by date on a Hive table with more than 70 columns:

    ERROR : Status: Failed
    ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1612203694878_0265_4_00, diagnostics=[Task failed, taskId=task_1612203694878_0265_4_00_000058, diagnostics=[TaskAttempt 0 failed, info=[Container container_e16_1612203694878_0265_01_000167 finished with diagnostics set to [Container failed, exitCode=-104. [2021-02-02 11:00:58.498]Container [pid=1577,containerID=container_e16_1612203694878_0265_01_000167] is running 3022848B beyond the 'PHYSICAL' memory limit. Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container.
    Dump of the process-tree for container_e16_1612203694878_0265_01_000167 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 1577 1567 1577 1577 [...] -XX:+ResizeTLAB -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/usr/hadoop/yarn/log/application_1612203694878_0265/container_e16_1612203694878_0265_01_000167 -Dtez.root.logger=INFO,CLA -Djava.io.tmpdir=/usr/hadoop/yarn/local/usercache/hive/appcache/application_1612203694878_0265/container_e16_1612203694878_0265_01_000167/tmp org.apache.tez.runtime.task.TezChild slave-06.n.faryhq.corp 43250 container_e16_1612203694878_0265_01_000167 application_1612203694878_0265 1>/usr/hadoop/yarn/log/application_1612203694878_0265/container_e16_1612203694878_0265_01_000167/stdout 2>/usr/hadoop/yarn/log/application_1612203694878_0265/container_e16_1612203694878_0265_01_000167/stderr
    |- 1658 1577 1577 1577 (java) 1414 128 2788896768 262581 /usr/jdk64/jdk1.8.0_112/bin/java -Xmx819m -server -Djava.net.preferIPv4Stack=true -Dhdp.version=3.1.4.0-315 -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/usr/hadoop/yarn/log/application_1612203694878_0265/container_e16_1612203694878_0265_01_000167 -Dtez.root.logger=INFO,CLA -Djava.io.tmpdir=/usr/hadoop/yarn/local/usercache/hive/appcache/application_1612203694878_0265/container_e16_1612203694878_0265_01_000167/tmp org.apache.tez.runtime.task.TezChild slave-06.n.faryhq.corp 43250 container_e16_1
    [2021-02-02 11:00:58.512]Container killed on request. Exit code is 143
    [2021-02-02 11:00:58.521]Container exited with a non-zero exit code 143.
    ]], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
        at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.allocateSpace(PipelinedSorter.java:256)
        at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.<init>(PipelinedSorter.java:205)
        at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.start(OrderedPartitionedKVOutput.java:146)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:193)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
        at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
        at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
        at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    , errorMessage=Cannot recover from this error:java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
        at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.allocateSpace(PipelinedSorter.java:256)
        at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.<init>(PipelinedSorter.java:205)
        at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.start(OrderedPartitionedKVOutput.java:146)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:193)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
        at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
        at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
        at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:17, Vertex vertex_1612203694878_0265_4_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]
    ERROR : Vertex killed, vertexName=Reducer 2, vertexId=vertex_1612203694878_0265_4_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex vertex_1612203694878_0265_4_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
    ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1

Try the following, in this order:

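  1. Increase mapper parallelism. The goal is to get more, smaller mappers. Check how many mappers the query starts and tune the figures accordingly. If your files are too big and in a non-splittable format (such as gzip), this will not help; move on to the two steps below.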

    --This is an example; check your current settings and adjust to get 2x or more mappers
    set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
    set tez.grouping.max-size=32000000; --bigger files will be split
    set tez.grouping.min-size=32000;    --smaller files will be combined on a single mapper
    
  2. Disable map-side aggregation (map-side aggregation often causes OOM).

    set hive.map.aggr=false;
    
  3. If the two steps above did not help, increase mapper memory (try to find the minimum container size that works).

    set hive.tez.container.size=9216; --Adjust the figures and choose the minimum working size
    set hive.tez.java.opts=-Xmx6144m;
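
The arithmetic behind steps 1 and 3 can be sketched in a few lines of Python. This is a rough illustration, not how Tez actually plans splits: `min_mappers` and `heap_fraction` are hypothetical helper names, the estimate assumes splittable input and ignores file boundaries and cluster capacity, and the 1 GB container with `-Xmx819m` is taken from the error log in the question.

```python
def min_mappers(total_input_bytes: int, grouping_max_size: int) -> int:
    """Rough lower bound on the mapper count for splittable input.

    Assumption: no grouped split exceeds tez.grouping.max-size, so at least
    ceil(total / max_size) mappers are needed to cover the input.
    """
    return -(-total_input_bytes // grouping_max_size)  # ceiling division


def heap_fraction(container_mb: int, xmx_mb: int) -> float:
    """Share of the YARN container that is handed to the JVM heap via -Xmx."""
    return xmx_mb / container_mb


# About 1 GB of input with tez.grouping.max-size=32000000 (step 1)
print(min_mappers(1_000_000_000, 32_000_000))   # 32

# Step 3 settings: container 9216 MB, -Xmx6144m
print(round(heap_fraction(9216, 6144), 2))      # 0.67

# Failing container from the log: 1024 MB limit, -Xmx819m
print(round(heap_fraction(1024, 819), 2))       # 0.8
```

The remaining share of the container is headroom for non-heap JVM memory. The failing container leaves less of it than the settings in step 3 do, which may be part of why YARN killed it for exceeding the physical memory limit.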
    

When you use Hive on Tez, you must define at least all of the following parameters, for example:

    set hive.tez.container.size=8192;
    set tez.am.resource.memory=8192;
    set tez.runtime.io.sort.mb=2048;
    set hive.tez.java.opts=-Xmx6144m;
    set tez.am.launch.cmd-opts=-Xmx4096m;
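
These settings depend on each other, so it can help to check them together before running the query. The sketch below is only an illustration: `xmx_mb` and `check_memory_settings` are hypothetical helpers, not part of Hive or Tez. It encodes three assumptions: each JVM heap must fit inside its container, and the sort buffer must fit inside the task heap. The last one is based on the stack trace in the question, where `PipelinedSorter` runs out of heap space in `ByteBuffer.allocate`.

```python
import re


def xmx_mb(java_opts: str) -> int:
    """Extract the -Xmx value in MB from a JVM options string (m or g suffix)."""
    match = re.search(r"-Xmx(\d+)([mMgG])", java_opts)
    if match is None:
        raise ValueError("no -Xmx value with an m or g suffix found")
    value, unit = int(match.group(1)), match.group(2).lower()
    return value * 1024 if unit == "g" else value


def check_memory_settings(settings: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    container = int(settings["hive.tez.container.size"])
    heap = xmx_mb(settings["hive.tez.java.opts"])
    sort_mb = int(settings["tez.runtime.io.sort.mb"])
    am_container = int(settings["tez.am.resource.memory"])
    am_heap = xmx_mb(settings["tez.am.launch.cmd-opts"])

    if heap >= container:
        problems.append("task heap (-Xmx) does not fit in hive.tez.container.size")
    if sort_mb >= heap:
        problems.append("tez.runtime.io.sort.mb does not fit in the task heap")
    if am_heap >= am_container:
        problems.append("AM heap (-Xmx) does not fit in tez.am.resource.memory")
    return problems


# The values from the example settings above
example = {
    "hive.tez.container.size": "8192",
    "tez.am.resource.memory": "8192",
    "tez.runtime.io.sort.mb": "2048",
    "hive.tez.java.opts": "-Xmx6144m",
    "tez.am.launch.cmd-opts": "-Xmx4096m",
}
print(check_memory_settings(example))  # []
```

The checks are deliberately loose. They only catch settings that cannot work at all, and they do not say which ratios are best for a given workload.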
