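I am trying to import a CloudSQL table into a GCS bucket using Sqoop. I used the following jars:
kite-data-core-1.1.0.jar, kite-data-hive-1.1.0.jar, kite-data-mapreduce-1.1.0.jar, kite-hadoop-compatibility-1.1.0.jar.
Below is my code snippet: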
```
sqoop import \
-libjars=gs://BUCKET_NAME/kite-data-core-1.1.0.jar,gs://BUCKET_NAME/kite-data-mapreduce-1.1.0.jar,gs://BUCKET_NAME/kite-data-hive-1.1.0.jar,gs://BUCKET_NAME/kite-hadoop-compatibility-1.1.0.jar,gs://BUCKET_NAME/hadoop-mapreduce-client-core-3.2.0.jar \
--connect=jdbc:mysql://IP/DB_NAME \
--username=sqoop_user \
--password=sqoop_user \
--target-dir=gs://BUCKET_NAME/mysql_output \
--table persons \
--split-by personid -m 2 \
--as-parquetfile
```
I get the following error:

    20/01/03 04:42:29 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
    Exception in thread "main" java.lang.NoClassDefFoundError: org/kitesdk/data/mapreduce/DatasetKeyOutputFormat
        at org.apache.sqoop.mapreduce.DataDrivenImportJob.getOutputFormatClass(DataDrivenImportJob.java:190)
        at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:94)
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259)
        at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
        at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
    Caused by: java.lang.ClassNotFoundException: org.kitesdk.data.mapreduce.DatasetKeyOutputFormat
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
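The first line of the error says "mapred.jar is deprecated. Instead, use mapreduce.job.jar".
I imported mapreduce.job.jar and passed it as a libjars argument, but the problem remains the same.
Any help with this issue would be much appreciated.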
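The deprecation line in the log is only an INFO message. The actual failure is the `NoClassDefFoundError`, and the stack trace shows it is raised in the `main` thread of the Sqoop client, so the class has to be visible to that JVM. One way to narrow this down is to check which jars actually contain the missing class. The following is a minimal sketch, assuming the jars have been copied to a local directory and that `unzip` is installed; the function names `class_to_entry` and `find_class_in_jars` are hypothetical helpers, not part of Sqoop or Kite.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of any tool.

# Convert a fully qualified class name to its entry path inside a jar.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print every jar in DIR that contains CLASS; return 1 if none does.
# Usage: find_class_in_jars CLASS DIR
find_class_in_jars() {
  local entry jar found=1
  entry="$(class_to_entry "$1")"
  for jar in "$2"/*.jar; do
    [ -f "$jar" ] || continue   # skip the unexpanded glob when DIR has no jars
    if unzip -l "$jar" 2>/dev/null | grep -qF "$entry"; then
      printf '%s\n' "$jar"
      found=0
    fi
  done
  return "$found"
}
```

For example, `find_class_in_jars org.kitesdk.data.mapreduce.DatasetKeyOutputFormat ./jars` would list the jars in `./jars` (an assumed local directory) that ship the class named in the error.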
These are the specific jar versions that worked for me (mostly from Cloudera):
- https://repo1.maven.org/maven2/org/apache/parquet/parquet-format/2.9.0/parquet-format-2.9.0.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/sqoop/sqoop/1.4.7.7.2.10.0-148/sqoop-1.4.7.7.2.10.0-148.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/kitesdk/kite-data-core/1.0.0-cdh6.3.4/kite-data-core-1.0.0-cdh6.3.4.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/kitesdk/kite-data-mapreduce/1.0.0-cdh6.3.4/kite-data-mapreduce-1.0.0-cdh6.3.4.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/kitesdk/kite-hadoop-compatibility/1.0.0-cdh6.3.4/kite-hadoop-compatibility-1.0.0-cdh6.3.4.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/avro/avro/1.8.2.7.2.10.0-148/avro-1.8.2.7.2.10.0-148.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/avro/avro-mapred/1.8.2.7.2.10.0-148/avro-mapred-1.8.2.7.2.10.0-148.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/parquet/parquet-avro/1.10.99.7.2.10.0-148/parquet-avro-1.10.99.7.2.10.0-148.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/parquet/parquet-common/1.10.99.7.2.10.0-148/parquet-common-1.10.99.7.2.10.0-148.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/parquet/parquet-column/1.10.99.7.2.10.0-148/parquet-column-1.10.99.7.2.10.0-148.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/parquet/parquet-hadoop/1.10.99.7.2.10.0-148/parquet-hadoop-1.10.99.7.2.10.0-148.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/parquet/parquet-jackson/1.10.99.7.2.10.0-148/parquet-jackson-1.10.99.7.2.10.0-148.jar
- https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/parquet/parquet-encoding/1.10.99.7.2.10.0-148/parquet-encoding-1.10.99.7.2.10.0-148.jar
The full script for the Sqoop job is shared in this answer.
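The linked script is not reproduced here. As a minimal sketch of one step it would need, the comma-separated value expected by `-libjars` can be assembled from the jar file names in the list above. This assumes the jars have already been copied to a staging location; the prefix `gs://BUCKET_NAME/jars` and the function name `build_libjars` are placeholders of mine, not part of the original answer.

```shell
#!/usr/bin/env bash
# Sketch only: build a comma-separated jar list for -libjars.
# Usage: build_libjars PREFIX URL_OR_FILENAME...
build_libjars() {
  local prefix="${1%/}"   # strip one trailing slash from the staging prefix
  shift
  local out="" item
  for item in "$@"; do
    # Keep only the file name, so full download URLs can be passed as-is.
    out="${out:+$out,}${prefix}/${item##*/}"
  done
  printf '%s\n' "$out"
}
```

The result could then be passed as `-libjars "$(build_libjars gs://BUCKET_NAME/jars ...)"`. Generic Hadoop options such as `-libjars` go directly after the tool name (`import`), before the tool-specific arguments.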