I am trying to install and run Apache Pig 0.15.0 on Windows, without success. I intend to use it with Apache Hadoop 2.7.1.
Context
I followed the basic "Getting Started" tutorial, section "Download Pig". I downloaded "pig-0.15.0" and set Pig's path.
I can enter the grunt shell, but when I try to run a simple script such as:
logs = LOAD 'PigInput/logs' USING PigStorage(';');
STORE logs INTO 'logs-output.txt';
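For reference, the script only re-delimits records: PigStorage(';') splits each input line on semicolons, and the default PigStorage used by STORE writes the fields back tab-separated. A rough local sketch of the intended result, using awk on a made-up sample (file names and data here are hypothetical, not from my actual input):

```shell
# Hypothetical semicolon-delimited log records
printf '2015-07-15;GET;/index.html\n2015-07-15;POST;/login\n' > logs

# LOAD ... USING PigStorage(';') splits each line on ';';
# STORE with the default PigStorage writes the fields back tab-separated
awk -F';' 'BEGIN { OFS="\t" } { $1 = $1; print }' logs > logs-output.txt

cat logs-output.txt
```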
When I run the script, it gives me the following error:
2015-07-15 12:54:27,157 [main] WARN org.apache.pig.backend.hadoop20.PigJobControl - falling back to default JobControl (not using hadoop 0.20 ?)
java.lang.NoSuchFieldException: runnerState
at java.lang.Class.getDeclaredField(Class.java:1953)
at org.apache.pig.backend.hadoop20.PigJobControl.<clinit>(PigJobControl.java:51)
at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.newJobControl(HadoopShims.java:109)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:314)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:196)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:304)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
at org.apache.pig.PigServer.execute(PigServer.java:1364)
at org.apache.pig.PigServer.access$500(PigServer.java:113)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1689)
at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1082)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:565)
at org.apache.pig.Main.main(Main.java:177)
2015-07-15 12:54:27,165 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2015-07-15 12:54:27,186 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2015-07-15 12:54:27,187 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2015-07-15 12:54:27,188 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2015-07-15 12:54:27,190 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2015-07-15 12:54:27,585 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/pig-0.15.0/pig-0.15.0-core-h1.jar to DistributedCache through /tmp/temp27293389/tmp1227477167/pig-0.15.0-core-h1.jar
2015-07-15 12:54:27,627 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/pig-0.15.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp27293389/tmp-1342585295/automaton-1.11-8.jar
2015-07-15 12:54:27,664 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/pig-0.15.0/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp27293389/tmp-510663803/antlr-runtime-3.4.jar
2015-07-15 12:54:27,769 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/hadoop-2.7.1/share/hadoop/common/lib/guava-11.0.2.jar to DistributedCache through /tmp/temp27293389/tmp-1466437686/guava-11.0.2.jar
2015-07-15 12:54:27,817 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/C:/pig-0.15.0/lib/joda-time-2.5.jar to DistributedCache through /tmp/temp27293389/tmp672491704/joda-time-2.5.jar
2015-07-15 12:54:27,905 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2015-07-15 12:54:27,959 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2015-07-15 12:54:27,969 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2015-07-15 12:54:27,979 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2015-07-15 12:54:27,989 [JobControl] ERROR org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl - Error while trying to run jobs.
java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:235)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:183)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:240)
at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:121)
at java.lang.Thread.run(Thread.java:745)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
2015-07-15 12:54:28,005 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2015-07-15 12:54:28,014 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2015-07-15 12:54:28,016 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
2015-07-15 12:54:28,017 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-07-15 12:54:28,025 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2015-07-15 12:54:28,027 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.1 0.15.0 Administrator 2015-07-15 12:54:27 2015-07-15 12:54:28 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
N/A logs MAP_ONLY Message: Unexpected System Error Occured: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:235)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:183)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:240)
at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:121)
at java.lang.Thread.run(Thread.java:745)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
hdfs://localhost:9000/user/Administrator/logs-output.txt,
What I have tried
1. I tried downloading "pig-0.15.0-src" and building it with:
ant -Dhadoopversion=23
I got the following error (along the way, I also added proxy settings to my "build.xml"):
C:\pig-0.15.0-src>ant -Dhadoopversion=23
Buildfile: C:\pig-0.15.0-src\build.xml
ivy-download:
[get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0/ivy-2.2.0.jar
[get] To: C:\pig-0.15.0-src\ivy\ivy-2.2.0.jar
[get] Not modified - so not downloaded
ivy-init-dirs:
ivy-probe-antlib:
ivy-init-antlib:
ivy-init:
[ivy:configure] :: Ivy 2.2.0 - 20100923230623 :: http://ant.apache.org/ivy/ ::
[ivy:configure] :: loading settings :: file = C:\pig-0.15.0-src\ivy\ivysettings.xml
ivy-resolve:
[ivy:resolve]
[ivy:resolve] :: problems summary ::
[ivy:resolve] :::: WARNINGS
[ivy:resolve] ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve] :: UNRESOLVED DEPENDENCIES ::
[ivy:resolve] ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve] :: org.antlr#antlr;3.4: configuration not found in org.antlr#antlr;3.4: 'master'. It was required from org.apache.pig#pig;0.15.0-SNAPSHOT compile
[ivy:resolve] ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:resolve]
[ivy:resolve]
[ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
BUILD FAILED
C:\pig-0.15.0-src\build.xml:1662: impossible to resolve dependencies:
resolve failed - see output for details
2. I have downloaded the jar from Maven: org.apache.pig pig 0.15.0 h2.jar. I don't know whether it helps, and I also don't know where to put it.
One more detail: at first, when I ran "pig", it told me that "hadoop-config.cmd" was not found at the path it was using. To make it work, I changed the following line in "pig/bin/pig.cmd":
set hadoop-config-script=C:\hadoop-2.7.1\libexec\hadoop-config.cmd
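As an aside, Pig's launch scripts normally derive the location of the Hadoop config script from HADOOP_HOME, so setting that environment variable may avoid patching pig.cmd at all. A sketch of the idea in POSIX shell (the real setup is Windows cmd, and the paths below are assumptions matching the logs above):

```shell
# Assumed install location taken from the logs above; in Windows cmd this
# would be: set HADOOP_HOME=C:\hadoop-2.7.1
HADOOP_HOME=/c/hadoop-2.7.1

# The launcher then resolves the config script under $HADOOP_HOME/libexec
HADOOP_CONFIG_SCRIPT="$HADOOP_HOME/libexec/hadoop-config.cmd"
echo "$HADOOP_CONFIG_SCRIPT"
```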
Other people, similar problems
I have seen similar questions here and elsewhere. Most of them suggest running something like:
ant clean jar-withouthadoop -Dhadoopversion=23
…which at best only gets me different errors.
I need help getting my Apache Pig to run commands and MapReduce jobs. What should I do?
Update 1: As @Fred recommended in the comments, I have tried Pig 0.12.0 and managed to get it running jobs without trouble (I only set the path, no building etc.), except that I had to find this version on the internet: Cloudera's pig-0.12.0.
Nevertheless, I would still like to find a solution that uses the latest version of Apache Pig.
There are two Pig jars built from the same source. If you are using Hadoop > 2.3, include
<dependency>
<groupId>org.apache.pig</groupId>
<artifactId>pig</artifactId>
<classifier>h2</classifier>
<version>0.15.0</version>
</dependency>
instead of
<dependency>
<groupId>org.apache.pig</groupId>
<artifactId>pig</artifactId>
<version>0.15.0</version>
</dependency>
The 0.15.0 h2.jar is the one built with ant -Dhadoopversion=23 jar.
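For a plain binary install (no Maven), the same idea can be applied by hand: sideline the Hadoop-1 core jar that ships in the release and drop in the h2 jar instead. This is only a sketch under stated assumptions (the directory layout follows the logs above, and you should verify against bin/pig.cmd that the launcher actually picks up a jar named this way); it is simulated here with empty placeholder files:

```shell
# Simulated pig install layout (placeholder files, not real jars)
mkdir -p pig-0.15.0
touch pig-0.15.0/pig-0.15.0-core-h1.jar   # Hadoop-1 core jar shipped in the binary release
touch pig-0.15.0-h2.jar                   # jar fetched from Maven with classifier h2

# Sideline the h1 jar so the classpath no longer sees it,
# and put the Hadoop-2 build in its place
mv pig-0.15.0/pig-0.15.0-core-h1.jar pig-0.15.0/pig-0.15.0-core-h1.jar.bak
cp pig-0.15.0-h2.jar pig-0.15.0/pig-0.15.0-core-h2.jar

ls pig-0.15.0
```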