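When my operators are killed, I occasionally see the following log in the web UI. Is there any way to control the memory settings that are used when negotiating containers with YARN?
How do the typical YARN settings for container heap and maximum memory relate to the Apex memory allocation model?
The info message I see in the web UI is as follows: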
Container [pid=14699,containerID=container_1462863487071_0015_01_000012] is running beyond physical memory limits. Current usage: 1.5 GB of 1.5 GB physical memory used; 6.1 GB of 3.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1462863487071_0015_01_000012 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 14817 14699 14699 14699 (java) 1584 1654 6426968064 393896 /usr/java/default/bin/java -Xmx4429185024 -Ddt.attr.APPLICATION_PATH=hdfs://dwh109.qaperf2.sac.int.threatmetrix.com:8020/user/dtadmin/datatorrent/apps/application_1462863487071_0015 -Djava.io.tmpdir=/data3/yarn/nm/usercache/root/appcache/application_1462863487071_0015/container_1462863487071_0015_01_000012/tmp -Ddt.cid=container_1462863487071_0015_01_000012 -Dhadoop.root.logger=INFO,RFA -Dhadoop.log.dir=/data3/yarn/container-logs/application_1462863487071_0015/container_1462863487071_0015_01_000012 -Ddt.loggers.level=com.datatorrent.*:INFO,org.apache.*:INFO com.datatorrent.stram.engine.StreamingContainer
|- 14699 14697 14699 14699 (bash) 1 2 108646400 303 /bin/bash -c /usr/java/default/bin/java -Xmx4429185024 -Ddt.attr.APPLICATION_PATH=hdfs://dwh109.qaperf2.sac.int.threatmetrix.com:8020/user/dtadmin/datatorrent/apps/application_1462863487071_0015 -Djava.io.tmpdir=/data3/yarn/nm/usercache/root/appcache/application_1462863487071_0015/container_1462863487071_0015_01_000012/tmp -Ddt.cid=container_1462863487071_0015_01_000012 -Dhadoop.root.logger=INFO,RFA -Dhadoop.log.dir=/data3/yarn/container-logs/application_1462863487071_0015/container_1462863487071_0015_01_000012 -Ddt.loggers.level=com.datatorrent.*:INFO,org.apache.*:INFO com.datatorrent.stram.engine.StreamingContainer 1>/data3/yarn/container-logs/application_1462863487071_0015/container_1462863487071_0015_01_000012/stdout 2>/data3/yarn/container-logs/application_1462863487071_0015/container_1462863487071_0015_01_000012/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
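It looks like the operator needs more memory. You can add the property below to allocate more memory to its container. In properties.xml, for an operator O in your application, specify: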
<property>
<name>dt.operator.O.attr.MEMORY_MB</name>
<value>2048</value>
</property>
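To see why YARN killed the container, the figures in the process-tree dump from the question can be checked by hand. The following is only a rough sketch in Python: the page counts and the -Xmx value are copied from the log, while the 4 KiB page size and the reading of "1.5 GB" as 1536 MiB are assumptions, not values the log states.

```python
import re

MIB = 1024 * 1024

# Assumption: 4 KiB memory pages, which is typical on Linux x86-64.
PAGE_SIZE = 4096
# Assumption: the "1.5 GB" limit printed in the log corresponds to 1536 MiB.
CONTAINER_LIMIT_MB = 1536

# RSSMEM_USAGE(PAGES) column from the process-tree dump in the question.
rss_pages = {"java": 393896, "bash": 303}
# Start of the java command line from the same dump.
java_cmd = "/usr/java/default/bin/java -Xmx4429185024"

# Resident memory of the whole process tree, in bytes and MiB.
rss_bytes = sum(rss_pages.values()) * PAGE_SIZE
rss_mb = rss_bytes / MIB

# Maximum heap the JVM was started with, taken from the -Xmx flag.
xmx_bytes = int(re.search(r"-Xmx(\d+)", java_cmd).group(1))
xmx_mb = xmx_bytes / MIB

print(f"resident memory: {rss_mb:.1f} MiB (limit {CONTAINER_LIMIT_MB} MiB)")
print(f"max JVM heap:    {xmx_mb:.0f} MiB")
print("over limit:", rss_mb > CONTAINER_LIMIT_MB)
```

Under these assumptions the process tree sits a few MiB above the physical limit, and the JVM heap cap is well above the container size, so the heap can grow past what YARN allows. That is consistent with the suggestion to raise MEMORY_MB for the operator.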
For more advanced options, take a look at the physical plan preparation code:
https://github.com/apache/incubator-apex-core/blob/ddb7471edd37ef228432c7d80e1e118368e68450/engine/src/main/java/com/datatorrent/stram/plan/physical/PhysicalPlan.java
For more troubleshooting guidance, see
http://docs.datatorrent.com/troubleshooting/#configuring-memory