I have the following piece of code:
import scala.util.{Failure, Success, Try}

Try {
  Context.sc.sqlContext.read.jdbc(url, tableName, prop)
} match {
  case Success(df) =>
    log.info(s"$tableName successfully read from $schemaName with this connection: $url")
    df
  case Failure(exception) =>
    log.error(s"$tableName >> $schemaName connection url >> $url")
    log.error(s"Error reading $tableName from $schemaName with this connection: $url :: Error Message is ${exception.getMessage}")
    throw exception
}
When I run it, I get the following exception:
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-analytics-d.t_dim_sap_manufacturers WHERE 1=0' at line 1
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.Util.getInstance(Util.java:408)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:943)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2487)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1966)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:62)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:114)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:45)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:166)
at com.eon.adp.substations.factory.TableFactory$$anonfun$1.apply(TableFactory.scala:42)
at com.eon.adp.substations.factory.TableFactory$$anonfun$1.apply(TableFactory.scala:42)
at scala.util.Try$.apply(Try.scala:192)
....
....
....
....
As can be seen in the error message:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-analytics-d.t_dim_sap_manufacturers WHERE 1=0' at line 1
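the schema name haw-analytics-d is truncated. I suppose this has to do with the way MySQL generates its log messages, or is something else going on?
When I print the schema and table names that I pass in, my application resolves them correctly: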
TableName = haw-analytics-d.t_dim_sap_manufacturers :: SchemaName haw-analytics-d :: connection url >> jdbc:mysql://my.datbase.server:3306/haw-analytics-d?useSSL=true&requireSSL=true
How can I see the SQL that Spark generates? Is there anything in the API that lets me print the generated SQL statement to the console? Any ideas?
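"The schema name haw-analytics-d is truncated. I suppose this has to do with the way MySQL generates its log messages, or is something else going on?"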
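I couldn't reproduce the problem with Spark 2.2.1 and MySQL Connector/J 5.1.45, so it looks like a bug in the particular combination you are using. But yes, you are shooting yourself in the foot by using hyphenated names.
If for some reason you cannot update the components, you can try removing the database name from the URL: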
jdbc:mysql://my.datbase.server:3306/?useSSL=true&requireSSL=true
and replacing tableName with a query, escaping the schema name with backticks:
val tableName = "(SELECT * FROM `haw-analytics-d`.t_dim_sap_manufacturers) AS t"
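The workaround above can be sketched as a small helper, so the quoting is not done by hand each time. This is only a sketch under stated assumptions: JdbcTableName, quoteIdent, asSubquery and schemaProbe are hypothetical names, not part of Spark or Connector/J. Unlike the line above, it also quotes the table name, which should be harmless. schemaProbe merely mimics the statement shape suggested by the "WHERE 1=0" fragment in the error message; it is not Spark's actual code.

```scala
object JdbcTableName {
  // Quote a MySQL identifier with backticks, doubling any embedded backtick.
  def quoteIdent(name: String): String =
    "`" + name.replace("`", "``") + "`"

  // Build a parenthesised, aliased subquery to pass where the table name is expected.
  def asSubquery(schema: String, table: String, alias: String = "t"): String =
    s"(SELECT * FROM ${quoteIdent(schema)}.${quoteIdent(table)}) AS $alias"

  // Assumption: the schema-resolution query has this shape, inferred from the
  // "... WHERE 1=0" text in the exception. Useful for eyeballing what MySQL receives.
  def schemaProbe(tableArg: String): String =
    s"SELECT * FROM $tableArg WHERE 1=0"
}

// Hypothetical usage with the names from the question (url without the database name):
// val tableName = JdbcTableName.asSubquery("haw-analytics-d", "t_dim_sap_manufacturers")
// val df = Context.sc.sqlContext.read.jdbc(url, tableName, prop)
```

Printing JdbcTableName.schemaProbe(tableName) for the unquoted and the quoted variants shows why the first one fails: without backticks, MySQL reads the hyphens in haw-analytics-d as minus signs, and its "near" message starts at the point where parsing stopped, which is why the name looks truncated.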