I need help figuring out an issue when connecting from Dataproc to Cloud SQL. I need to read a table from a Cloud SQL instance with spark.read.jdbc. I am following this: https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory.
I added the Maven dependency:
<dependency>
  <groupId>com.google.cloud.sql</groupId>
  <artifactId>mysql-socket-factory-connector-j-8</artifactId>
  <version>1.0.15</version>
</dependency>
JDBC URL => s"jdbc:mysql://${OJDBCConfig.getString("database_name")}?cloudSqlInstance=${OJDBCConfig.getString("instance_name")}&socketFactory=com.google.cloud.sql.mysql.SocketFactory&useSSL=false&user=${OJDBCConfig.getString("user")}&password=${OJDBCConfig.getString("password")}"
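For reference, the read itself is a plain spark.read.jdbc call, roughly like the sketch below (placeholder values in angle brackets stand in for the config lookups above; the table name is only illustrative):

import java.util.Properties
import org.apache.spark.sql.SparkSession

object TestMysqlConnection {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("cloud-sql-jdbc-test").getOrCreate()

    // Same URL shape as above, selecting the Cloud SQL socket factory.
    val jdbcUrl =
      "jdbc:mysql://<database_name>" +
        "?cloudSqlInstance=<project:region:instance>" +
        "&socketFactory=com.google.cloud.sql.mysql.SocketFactory" +
        "&useSSL=false&user=<user>&password=<password>"

    // Read one table over JDBC; "<table_name>" is a placeholder.
    val df = spark.read.jdbc(jdbcUrl, "<table_name>", new Properties())
    df.show()
  }
}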
I also built a fat jar for this and verified that the jar contains the socket factory class.
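(If the fat jar were built with sbt-assembly rather than Maven, a minimal build sketch that keeps the socket factory inside the assembly would look roughly like the lines below; the versions and merge rules are only assumptions, and the same idea applies to the Maven shade plugin: the connector must be a compile-scope dependency that actually lands in the jar, with META-INF/services entries concatenated rather than discarded.)

// build.sbt -- analogous sbt-assembly sketch, not the actual Maven build used here
libraryDependencies ++= Seq(
  "org.apache.spark"     %% "spark-sql"                          % "2.4.8" % "provided",
  "com.google.cloud.sql"  % "mysql-socket-factory-connector-j-8" % "1.0.15"
)

assembly / assemblyMergeStrategy := {
  case PathList("META-INF", "services", _*) => MergeStrategy.concat  // keep JDBC driver registrations
  case PathList("META-INF", _*)             => MergeStrategy.discard
  case _                                    => MergeStrategy.first
}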
I am still getting the following exception:
Exception in thread "main" com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Could not create socket factory 'com.google.cloud.sql.mysql.SocketFactory' due to underlying exception:
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.Util.getInstance(Util.java:408)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:918)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:897)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:886)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
at com.mysql.jdbc.MysqlIO.createSocketFactory(MysqlIO.java:3330)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:296)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2193)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2225)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2024)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:779)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:389)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:330)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:63)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:54)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:56)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:210)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:238)
at com.express.processing.enrichment.TestMysqlConnection$.main(TestMysqlConnection.scala:16)
at com.express.processing.enrichment.TestMysqlConnection.main(TestMysqlConnection.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.google.cloud.sql.mysql.SocketFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at com.mysql.jdbc.MysqlIO.createSocketFactory(MysqlIO.java:3328)
... 37 more
It sounds like you are not building the uber jar correctly.
As of v1.0.15, an uber-jar bundling the driver and its dependencies is available on the releases page. Give that a try and see if it works.
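One quick, purely illustrative sanity check is to confirm the connector is actually visible to the driver's classloader before the JDBC read:

// Throws ClassNotFoundException if the Cloud SQL socket factory is still missing
// from the classpath; succeeds silently once the uber-jar is picked up.
Class.forName("com.google.cloud.sql.mysql.SocketFactory")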