我正在尝试Pig的在线手稿演示代码。
首先,我创建了一个名为myfile.txt
的测试文件。它包含两行中的六个整数:
4 5 3
1 2 3
使用hadoop fs -copyFromLocal myfile.txt /user/myfile.txt
将文件放入hdfs
然后我运行
A = LOAD '/user/myfile.text';
DUMP A;
但是获得以下错误消息:
2014-10-08 14:15:54,259 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-10-08 14:15:54,594 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-10-08 14:15:54,692 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-10-08 14:15:54,693 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-10-08 14:15:54,909 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-10-08 14:15:54,998 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-10-08 14:15:55,006 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2014-10-08 14:15:55,013 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=12
2014-10-08 14:15:55,015 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2014-10-08 14:15:55,016 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job7804857093829884774.jar
2014-10-08 14:15:58,229 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job7804857093829884774.jar created
2014-10-08 14:15:58,266 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-10-08 14:15:58,304 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2014-10-08 14:15:58,353 [JobControl] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2014-10-08 14:15:58,806 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2014-10-08 14:15:58,964 [JobControl] WARN org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-08 14:15:58,968 [JobControl] WARN org.apache.hadoop.conf.Configuration - dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
2014-10-08 14:15:58,969 [JobControl] WARN org.apache.hadoop.conf.Configuration - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-10-08 14:15:59,024 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-10-08 14:15:59,025 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2014-10-08 14:15:59,051 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2014-10-08 14:16:00,533 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201410081312_0015
2014-10-08 14:16:00,534 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A
2014-10-08 14:16:00,534 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[2,4] C: R:
2014-10-08 14:16:05,098 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2014-10-08 14:16:05,098 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201410081312_0015 has failed! Stop running all dependent jobs
2014-10-08 14:16:05,099 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2014-10-08 14:16:05,109 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-10-08 14:16:05,111 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.0.0-cdh4.7.0 0.11.0-cdh4.7.0 hdfs 2014-10-08 14:15:54 2014-10-08 14:16:05 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_201410081312_0015 A MAP_ONLY Message: Job failed!
**Input(s):
Failed to read data from "/user/myfile.txt"**
Pig似乎没有连接到hdfs,因此无法访问该文件。有人能帮我解决这个问题吗?
更改文件上的设置。也许您无法读取该文件。
在Linux环境中,使用更改文件的权限
chmod 755 myfile.txt
然后执行CopyFromLocal命令。