我正在尝试为Databricks设置一个外部hive元存储,使用以下初始化脚本:
dbutils.fs.put(
"/databricks/scripts/external-metastore.sh",
"""#!/bin/sh
|# Loads environment variables to determine the correct JDBC driver to use.
|source /etc/environment
|# Quoting the label (i.e. EOF) with single quotes to disable variable interpolation.
|cat << 'EOF' > /databricks/driver/conf/00-custom-spark.conf
|[driver] {
| # Hive specific configuration options.
| # spark.hadoop prefix is added to make sure these Hive specific options will propagate to the metastore client.
| # JDBC connect string for a JDBC metastore
| "spark.hadoop.javax.jdo.option.ConnectionURL" = "jdbc:sqlserver://tst.database.windows.net:1433;database=db;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net"
|
| # Username to use against metastore database
| "spark.hadoop.javax.jdo.option.ConnectionUserName" = "admin"
|
| # Password to use against metastore database
| "spark.hadoop.javax.jdo.option.ConnectionPassword" = "password"
|
| # Driver class name for a JDBC metastore
| "spark.hadoop.javax.jdo.option.ConnectionDriverName" = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
|
| # Spark specific configuration options
| "spark.sql.hive.metastore.version" = "2.3.7"
| # Skip this one if <hive-version> is 0.13.x.
| "spark.sql.hive.metastore.jars" = "builtin"
|}
|EOF
|""".stripMargin,
overwrite = true)
我运行这个Scala代码
spark.conf.get("spark.sql.hive.metastore.version")
输出:
res0: String = 2.3.7
然而,当我试图运行:
spark.table("diamonds").withColumnRenamed("table", "table_number")
.write
.jdbc(jdbcUrl, "diamonds", connectionProperties)
我得到这个错误:
AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
Caused by: HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: InvocationTargetException:
Caused by: MetaException: Version information not found in metastore.
摘自文档External Apache Hive metastore:
- SQL Server不能作为底层亚转移数据库Hive 2.0及以上版本
请尝试更改版本。这是另一个你可以参考的问题:Azure Databricks外部Hive Metastore
HTH .