我正在尝试使用 spark.read.format 读取 Oracle 表,它适用于所有表,除了少数表,这些表的任何列的数据类型为 ROWID。
下面是我的代码
var df = spark.read.format("jdbc").
option("url", url).
option("driver", driver).
option("dbtable", dbTable).load()
println(df.first)
我收到以下错误
18/09/08 11:38:17 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 5.0 (TID 23, gbrdsr000002985.intranet.barcapint.com, executor 21): java.sql.SQLException: Invalid column type: getLong not implemented for class oracle.jdbc.driver.T4CRowidAccessor
at oracle.jdbc.driver.GeneratedAccessor.getLong(GeneratedAccessor.java:440)
at oracle.jdbc.driver.GeneratedStatement.getLong(GeneratedStatement.java:228)
at oracle.jdbc.driver.GeneratedScrollableResultSet.getLong(GeneratedScrollableResultSet.java:620)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$8.apply(JdbcUtils.scala:365)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$8.apply(JdbcUtils.scala:364)
分析:我检查了JdbcUtils对象,发现java.sql.Types.ROWID的情况映射到Spark的LongType。但是 getLong 没有在 Oracle JDBC 驱动程序中为 ROWID 类型实现。 我相信,如果我设法将其映射到 StringType,那么它将起作用,但我找不到任何解决方法。请在此处提出解决方案。
截至目前,我正在使用 ResultSet 读取数据,然后使用自定义架构将其转换为数据帧,但它会扼杀并行性。还想检查是否有任何选项来修改 JdbcUtils 的代码,因为我无法扩展 DataFrameReader 以创建自定义模板,因为类定义中提到了"private[sql]"。
val df=spark.read
.format("jdbc")
.option("driver","oracle.jdbc.OracleDriver")
.option("url","jdbc:oracle:thin:@localhost:1521:orcl")
.option("user","oracle1")
.option("password","oracle1")
.option("dbtable","(select Cast(RID as VARCHAR2(18)) from sample.ALL_RESULTS_DATA) my_table")
.load()
RID 是 Oracle 中数据类型为 ROWID 的列名。
上面的代码会将 ROWID 数据类型转换为字符串,以便我们可以使用 Spark 读取它
结果:
-------------------------+
|CAST(RIDASVARCHAR2(18))AS|
+-------------------------+
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|
| AAAIVuAABAAAMhCAAA|