I have a problem when I run this Java code to import a table from MySQL into Hive:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import com.cloudera.sqoop.Sqoop;
import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.SqoopOptions.FileLayout;
import com.cloudera.sqoop.tool.ImportTool;
import com.mysql.jdbc.*;

public class SqoopExample {
    public static void main(String[] args) throws Exception {
        // Load the MySQL JDBC driver
        String driver = "com.mysql.jdbc.Driver";
        Class.forName(driver).newInstance();

        // Load the cluster settings from the Hadoop config files
        Configuration config = new Configuration();
        config.addResource(new Path("/home/socio/hadoop/etc/hadoop/core-site.xml"));
        config.addResource(new Path("/home/socio/hadoop/etc/hadoop/hdfs-site.xml"));
        FileSystem dfs = FileSystem.get(config);

        // Configure the Sqoop import
        SqoopOptions options = new SqoopOptions();
        options.setDriverClassName(driver);
        options.setConf(config);
        options.setHiveTableName("tlinesuccess");
        options.setConnManagerClassName("org.apache.sqoop.manager.GenericJdbcManager");
        options.setConnectString("jdbc:mysql://dba-virtual-machine/test");
        options.setHadoopMapRedHome("/home/socio/hadoop");
        options.setHiveHome("/home/socio/hive");
        options.setTableName("textlines");
        options.setColumns(new String[] {"line"});
        options.setUsername("socio");
        options.setNumMappers(1);
        options.setJobName("Test Import");
        options.setOverwriteHiveTable(true);
        options.setHiveImport(true);
        options.setFileLayout(FileLayout.TextFile);

        // Run the import
        int ret = new ImportTool().run(options);
    }
}
result:
Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
at SqoopExample.main(SqoopExample.java:22)
Note that the equivalent shell command works:
sqoop import --connect jdbc:mysql://dba-virtual-machine/test --username socio --table textlines --columns line --hive-import
So I can import from MySQL via the shell; the problem is only with the Java code.
Any help/ideas would be greatly appreciated.
Thanks
Add this plugin when building your Maven jar; it merges all the FileSystem service declarations into one. Also add the hadoop-hdfs and hadoop-client dependencies:
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>1.5</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <filters>
                    <filter>
                        <artifact>*:*</artifact>
                        <excludes>
                            <exclude>META-INF/*.SF</exclude>
                            <exclude>META-INF/*.DSA</exclude>
                            <exclude>META-INF/*.RSA</exclude>
                        </excludes>
                    </filter>
                </filters>
                <shadedArtifactAttached>true</shadedArtifactAttached>
                <shadedClassifierName>allinone</shadedClassifierName>
                <artifactSet>
                    <includes>
                        <include>*:*</include>
                    </includes>
                </artifactSet>
                <transformers>
                    <transformer
                        implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                        <resource>reference.conf</resource>
                    </transformer>
                    <transformer
                        implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                    </transformer>
                    <transformer
                        implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer">
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
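If shading is not an option, a commonly used workaround (a sketch, not the answer's method; it assumes hadoop-hdfs is already on the classpath) is to register the FileSystem implementations explicitly on the Configuration, which bypasses the META-INF/services lookup that throws "No FileSystem for scheme: hdfs" when the service files have been overwritten:

// Sketch: set the implementation classes explicitly so the service-file
// lookup is not needed. Assumes hadoop-hdfs is on the classpath.
Configuration config = new Configuration();
config.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
config.set("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem");
config.addResource(new Path("/home/socio/hadoop/etc/hadoop/core-site.xml"));
config.addResource(new Path("/home/socio/hadoop/etc/hadoop/hdfs-site.xml"));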
The HDFS file system implementation is defined in the library hadoop-hdfs-2.0.0-cdhX.X.X.jar. If you execute this as a plain Java program, you need to add that library to the classpath. Alternatively, the library is already available on the Hadoop classpath: build a jar file and execute it with the hadoop command.
If you are using Maven, this is also a good solution: https://stackoverflow.com/a/28135140/3451801. Basically, you need to add hadoop-hdfs to your pom dependencies.
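For completeness, the dependency block would look roughly like this (a sketch; the ${hadoop.version} property is a placeholder, use the version that matches your cluster):

<!-- Sketch: versions are placeholders; match them to your cluster -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>${hadoop.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>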