I'm trying to run some MapReduce jobs against files stored in Amazon S3. I found http://wiki.apache.org/hadoop/AmazonS3 and followed it to set up the integration. This is the code where I set the input directory for the MapReduce job:
FileInputFormat.setInputPaths(job, "s3n://myAccessKey:mySecretKey@myS3Bucket/dir1/dir2/*.txt");
When I run the MapReduce job, I get this exception:
Exception in thread "main" java.lang.IllegalArgumentException:
Wrong FS: s3n://myAccessKey:mySecretKey@myS3Bucket/dir1/dir2/*.txt,
expected: s3n://myAccessKey:mySecretKey@myS3Bucket
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:381)
at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:294)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:352)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:321)
at com.appdynamics.blitz.hadoop.migration.DataMigrationManager.convertAndLoadData(DataMigrationManager.java:340)
at com.appdynamics.blitz.hadoop.migration.DataMigrationManager.migrateData(DataMigrationManager.java:300)
at com.appdynamics.blitz.hadoop.migration.DataMigrationManager.migrate(DataMigrationManager.java:166)
at com.appdynamics.blitz.command.DataMigrationCommand.run(DataMigrationCommand.java:53)
at com.appdynamics.blitz.command.DataMigrationCommand.run(DataMigrationCommand.java:21)
at com.yammer.dropwizard.cli.ConfiguredCommand.run(ConfiguredCommand.java:58)
at com.yammer.dropwizard.cli.Cli.run(Cli.java:53)
at com.yammer.dropwizard.Service.run(Service.java:61)
at com.appdynamics.blitz.service.BlitzService.main(BlitzService.java:84)
I couldn't find any resources to help me with this. Any pointers would be greatly appreciated.
The clue is right there in the error message:

Wrong FS: s3n://myAccessKey:mySecretKey@myS3Bucket/dir1/dir2/*.txt

The path you are giving Hadoop is not well formed; the job will only work once Hadoop can resolve the correct files.
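One way to sidestep malformed s3n:// URIs entirely is to keep the credentials out of the path. The old s3n connector reads them from the Hadoop configuration, so you can put them in core-site.xml and use a plain bucket path like s3n://myS3Bucket/dir1/dir2/*.txt. A sketch (the placeholder values are yours to fill in):

```
<!-- core-site.xml: credentials for the s3n filesystem,
     so they never need to appear (or be escaped) in the URI -->
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```

The same two properties can also be set programmatically on the job's Configuration object before calling FileInputFormat.setInputPaths.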
So I found the problem. It was caused by this bug: https://issues.apache.org/jira/browse/HADOOP-3733

Even after I replaced the "/" in the secret key with "%2F", it still failed with the same error. I regenerated the keys until I got one with no "/" in it, and that fixed the issue.
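You can see why a "/" in the secret key breaks things with plain java.net.URI, no Hadoop required: the first unescaped "/" after the scheme terminates the authority section, so the user-info/host split that Hadoop relies on comes apart. A small sketch (the keys and bucket name below are made up):

```java
import java.net.URI;

public class S3nUriCheck {
    public static void main(String[] args) {
        // A secret key with no "/" parses the way Hadoop expects:
        // user-info holds the credentials, host holds the bucket.
        URI ok = URI.create("s3n://AKIAEXAMPLE:goodSecret@my-bucket/dir1/file.txt");
        System.out.println(ok.getUserInfo()); // AKIAEXAMPLE:goodSecret
        System.out.println(ok.getHost());     // my-bucket

        // A "/" inside the secret key ends the authority early: the
        // credentials and bucket never parse out, and everything after
        // the "/" is treated as the path.
        URI bad = URI.create("s3n://AKIAEXAMPLE:bad/Secret@my-bucket/dir1/file.txt");
        System.out.println(bad.getAuthority()); // AKIAEXAMPLE:bad
        System.out.println(bad.getHost());      // null
        System.out.println(bad.getPath());      // /Secret@my-bucket/dir1/file.txt
    }
}
```

This is exactly the situation HADOOP-3733 describes, which is why regenerating a key without a "/" (or keeping credentials out of the URI altogether) makes the error go away.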