Apache Pig setup error on Windows



I am trying to install and run Apache Pig 0.15.0 on a Windows system, without success. I intend to use it with Apache Hadoop 2.7.1.

Context
I followed the basic Getting Started tutorial, section "Download Pig". I downloaded "pig-0.15.0" and set Pig's path.

I can enter the "grunt" shell, but when I try to run a simple script, for example:

logs = LOAD 'PigInput/logs' USING PigStorage(';');
STORE logs INTO 'logs-output.txt';
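Before digging into the cluster-side stack trace, it can help to rule out the script itself by running it in Pig's local mode, which skips YARN job submission entirely (the script filename here is an assumption):

pig -x local logs.pig

If the same script succeeds locally, the failure lies in the Hadoop submission path rather than in the Pig Latin.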

It gives me the following error:

Error

2015-07-15 12:54:27,157 [main] WARN  org.apache.pig.backend.hadoop20.PigJobControl - falling back to default JobControl (not using hadoop 0.20 ?)
java.lang.NoSuchFieldException: runnerState
        at java.lang.Class.getDeclaredField(Class.java:1953)
        at org.apache.pig.backend.hadoop20.PigJobControl.<clinit>(PigJobControl.java:51)
        at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.newJobControl(HadoopShims.java:109)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:314)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:196)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:304)
        at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
        at org.apache.pig.PigServer.execute(PigServer.java:1364)
        at org.apache.pig.PigServer.access$500(PigServer.java:113)
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1689)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1082)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
        at org.apache.pig.Main.run(Main.java:565)
        at org.apache.pig.Main.main(Main.java:177)
2015-07-15 12:54:27,165 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-07-15 12:54:27,186 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2015-07-15 12:54:27,187 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-07-15 12:54:27,188 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2015-07-15 12:54:27,190 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2015-07-15 12:54:27,585 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/pig-0.15.0/pig-0.15.0-core-h1.jar to DistributedCache through /tmp/temp27293389/tmp1227477167/pig-0.15.0-core-h1.jar
2015-07-15 12:54:27,627 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/pig-0.15.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp27293389/tmp-1342585295/automaton-1.11-8.jar
2015-07-15 12:54:27,664 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/pig-0.15.0/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp27293389/tmp-510663803/antlr-runtime-3.4.jar
2015-07-15 12:54:27,769 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/hadoop-2.7.1/share/hadoop/common/lib/guava-11.0.2.jar to DistributedCache through /tmp/temp27293389/tmp-1466437686/guava-11.0.2.jar
2015-07-15 12:54:27,817 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/pig-0.15.0/lib/joda-time-2.5.jar to DistributedCache through /tmp/temp27293389/tmp672491704/joda-time-2.5.jar
2015-07-15 12:54:27,905 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-07-15 12:54:27,959 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-07-15 12:54:27,969 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2015-07-15 12:54:27,979 [JobControl] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2015-07-15 12:54:27,989 [JobControl] ERROR org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl - Error while trying to run jobs.
java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:235)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:183)
        at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
        at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
        at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:240)
        at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:121)
        at java.lang.Thread.run(Thread.java:745)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
2015-07-15 12:54:28,005 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-07-15 12:54:28,014 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-07-15 12:54:28,016 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
2015-07-15 12:54:28,017 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-07-15 12:54:28,025 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-07-15 12:54:28,027 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
2.7.1   0.15.0  Administrator   2015-07-15 12:54:27     2015-07-15 12:54:28     UNKNOWN
Failed!
Failed Jobs:
JobId   Alias   Feature Message Outputs
N/A     logs    MAP_ONLY        Message: Unexpected System Error Occured: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:235)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:183)
        at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
        at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
        at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:240)
        at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:121)
        at java.lang.Thread.run(Thread.java:745)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
        hdfs://localhost:9000/user/Administrator/logs-output.txt,

What I have tried
1. I tried to download "pig-0.15.0-src" and build it using:

ant -Dhadoopversion=23

I get the following error (in between, I also added proxy settings to my "build.xml"):

C:\pig-0.15.0-src>ant -Dhadoopversion=23
Buildfile: C:\pig-0.15.0-src\build.xml
ivy-download:
      [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
      [get] To: C:\pig-0.15.0-src\ivy\ivy-2.2.0.jar
      [get] Not modified - so not downloaded
ivy-init-dirs:
ivy-probe-antlib:
ivy-init-antlib:
ivy-init:
[ivy:configure] :: Ivy 2.2.0 - 20100923230623 :: http://ant.apache.org/ivy/ ::
[ivy:configure] :: loading settings :: file = C:\pig-0.15.0-src\ivy\ivysettings.xml
ivy-resolve:
[ivy:resolve]
[ivy:resolve] :: problems summary ::
[ivy:resolve] :::: WARNINGS
[ivy:resolve]           ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve]           ::          UNRESOLVED DEPENDENCIES         ::
[ivy:resolve]           ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve]           :: org.antlr#antlr;3.4: configuration not found in org.antlr#antlr;3.4: 'master'. It was required from org.apache.pig#pig;0.15.0-SNAPSHOT compile
[ivy:resolve]           ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve]
[ivy:resolve]
[ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
BUILD FAILED
C:\pig-0.15.0-src\build.xml:1662: impossible to resolve dependencies:
        resolve failed - see output for details
I have downloaded the jar from Maven: org.apache.pig pig 0.15.0 h2.jar. I don't know whether it helps, and I also don't know where to put it.

One more detail: at first, when I ran "pig", it told me that the path to "hadoop-config.cmd" was not found. To make it work, I changed the following line in "pig/bin/pig.cmd":

set hadoop-config-script=C:\hadoop-2.7.1\libexec\hadoop-config.cmd
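For comparison, a less invasive workaround might be to point the environment at the Hadoop install instead of editing the script, assuming pig.cmd derives the path from the HADOOP_HOME variable (that behavior is an assumption here, not verified against the 0.15.0 script):

set HADOOP_HOME=C:\hadoop-2.7.1
set PATH=%PATH%;%HADOOP_HOME%\bin

Editing pig.cmd directly, as above, achieves the same effect but ties the script to this machine's install location.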

Other people, similar problems
I have seen similar questions here and elsewhere. Most of them suggest running something like:

ant clean jar-withouthadoop -Dhadoopversion=23

... which at best just gets me different errors.


I need help getting my Apache Pig to run commands and MapReduce jobs. What should I do?

UPDATE 1: As @Fred recommended in the comments, I have tried and managed to get Pig 0.12.0 to run jobs without crashing (I only set the path, no building etc.), after finding this build on the internet: cloudera pig-0.12.0.

Nevertheless, I would still like to find a solution that works with the latest version of Apache Pig.

There are two Pig jars from the same source. If you are using Hadoop > 2.3, include

<dependency>
    <groupId>org.apache.pig</groupId>
    <artifactId>pig</artifactId>
    <classifier>h2</classifier>
    <version>0.15.0</version>
</dependency>

instead of

<dependency>
    <groupId>org.apache.pig</groupId>
    <artifactId>pig</artifactId>
    <version>0.15.0</version>
</dependency>

The 0.15.0-h2.jar is built using ant -Dhadoopversion=23 jar
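Following that answer, making the grunt shell itself use a Hadoop-2-compatible build would roughly mean rebuilding the core jar and swapping it into the Pig install. A hedged sketch of the commands (the h2 jar name and build output path are assumptions inferred from the pig-0.15.0-core-h1.jar seen in the log, not verified against an actual build):

cd C:\pig-0.15.0-src
ant clean jar -Dhadoopversion=23
rem swap out the Hadoop-1 core jar that pig.cmd puts on the classpath
copy build\pig-0.15.0-core-h2.jar C:\pig-0.15.0\
del C:\pig-0.15.0\pig-0.15.0-core-h1.jar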
