Pig Hadoop streaming help



I'm having problems running Pig streaming. When I launch an interactive Pig instance on a single machine (for reference, I'm doing this over SSH/Putty on the master node of an interactive Pig AWS EMR instance), my Pig streaming works perfectly (it also works on my Windows Cloudera VM image). However, as soon as I switch to using multiple machines, it simply stops working and produces a variety of errors.

Notes:

  • I am able to run Pig scripts that don't use any streaming commands on the multi-machine instance without problems.
  • All of my Pig work is done in Pig MapReduce mode, not in -x local mode.
  • My Python script (stream1.py) has #!/usr/bin/env python at the top (a minimal sketch of what such a streaming script looks like is shown after these notes).

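For reference, a streaming script of this general shape looks roughly like the following (a minimal sketch only, not the actual contents of stream1.py): Pig STREAM pipes each input tuple to the script's stdin as a tab-delimited line and treats each tab-delimited line written to stdout as an output tuple:

#!/usr/bin/env python
# Hypothetical sketch of a Pig streaming script, not the real stream1.py.
# Pig STREAM sends each tuple as one tab-delimited line on stdin and reads
# each tab-delimited line written to stdout back as an output tuple.
import sys

for line in sys.stdin:
    fields = line.rstrip('\n').split('\t')
    # Example transformation: append the number of fields to each record.
    fields.append(str(len(fields)))
    sys.stdout.write('\t'.join(fields) + '\n')
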
Below is a small sample of the options I've tried so far (all of the commands below are run from the grunt shell on the master node, which I access via SSH/Putty):

This is how I get the Python file onto the master node so that it can be used:

cp s3n://darin.emr-logs/stream1.py stream1.py
copyToLocal stream1.py /home/hadoop/stream1.py
chmod 755 stream1.py

These are my various streaming attempts:

cooc = stream ct_pag_ph through `stream1.py`;
dump cooc;
ERROR 2090: Received Error while processing the reduce plan: 'stream1.py ' failed with exit status: 127

cooc = stream ct_pag_ph through `python stream1.py`;
dump cooc;
ERROR 2090: Received Error while processing the reduce plan: 'python stream1.py ' failed with exit status: 2

DEFINE X `stream1.py`;
cooc = stream ct_bag_ph through X;
dump cooc;
ERROR 2090: Received Error while processing the reduce plan: 'stream1.py ' failed with exit status: 127

DEFINE X `stream1.py`;
cooc = stream ct_bag_ph through `python X`;
dump cooc;
ERROR 2090: Received Error while processing the reduce plan: 'python X ' failed with exit status: 2

DEFINE X `stream1.py` SHIP('stream1.py');
cooc = STREAM ct_bag_ph THROUGH X;
dump cooc;
ERROR 2017: Internal error creating job configuration.

DEFINE X `stream1.py` SHIP('/stream1.p');
cooc = STREAM ct_bag_ph THROUGH X;
dump cooc;

DEFINE X `stream1.py` SHIP('stream1.py') CACHE('stream1.py');
cooc = STREAM ct_bag_ph THROUGH X;
ERROR 2017: Internal error creating job configuration.

define X 'python /home/hadoop/stream1.py' SHIP('/home/hadoop/stream1.py');
cooc = STREAM ct_bag_ph THROUGH X;

Given your prerequisites, and with stream1.py in the current local directory, the following works for me:

DEFINE X `stream1.py` SHIP('stream1.py');

One way to make this explicit:

DEFINE X `python stream1.py` SHIP('/local/path/stream1.py');

The purpose of SHIP is to copy the command into the working directory of all the tasks.
