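I am on a Kerberized Hadoop cluster. Spark Thrift Server works fine when I do not impersonate the user, but when I enable impersonation I get a metastore authentication error.
I followed these documents: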
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/config-sts-user-imp.html
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_data-access/content/ref-5422cb60-d1d5-425a-b719-ec7bd03ee5d3.1.html
Step 1:
- In Advanced spark-hive-site-override, set hive.server2.enable.doAs=true
- Add spark.jars=/usr/hdp/current/spark-thriftserver/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-thriftserver/lib/datanucleus-core-3.2.10.jar,/usr/hdp/current/spark-thriftserver/lib/datanucleus-rdbms-3.2.9.jar
Step 2: In Advanced hiveserver2-site
- Set hive.security.authorization.enabled=true
- Set hive.server2.enable.doAs=true
- Set hive.metastore.pre.event.listeners=org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener
- Set hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
Step 3:
- I created a user keytab and principal, and ran kinit
- Ran the CLI: beeline -u 'jdbc:hive2://<host>:<port>/default;principal=spark3/<HOST>@<REALM>;auth=KERBEROS;transportMode=binary'
Result:
Connecting to jdbc:hive2://<host>:<port>/default;principal=spark3/<HOST>@<REALM>;auth=KERBEROS;transportMode=binary
Connected to: Spark SQL (version 3.2.2)
Driver: Hive JDBC (version 3.1.0.3.1.4.0-315)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.0.3.1.4.0-315 by Apache Hive
- Ran the query: show databases
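
For reference, the connection part of Step 3 could be scripted roughly as follows. This is only a sketch: `build_jdbc_url` is a hypothetical helper, and the keytab path, user principal, host, port, and realm in the usage comments are placeholders, not values from my cluster.

```shell
#!/bin/sh
# Hypothetical helper: assembles the Kerberos JDBC URL used in the beeline step.
# Arguments: 1) server host  2) server port
#            3) host part of the spark3 service principal  4) Kerberos realm
build_jdbc_url() {
  printf 'jdbc:hive2://%s:%s/default;principal=spark3/%s@%s;auth=KERBEROS;transportMode=binary\n' \
    "$1" "$2" "$3" "$4"
}

# Example usage (every value below is a placeholder):
#   kinit -kt /path/to/user.keytab user@EXAMPLE.COM   # obtain a TGT for the end user
#   klist                                             # confirm the ticket cache holds a TGT
#   beeline -u "$(build_jdbc_url sts.example.com 10016 sts.example.com EXAMPLE.COM)"
```

Note that the principal in the URL is the service principal of the Thrift Server (spark3/...), while the kinit principal is the end user who is connecting.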
Running show databases fails with this error:
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
....
Caused by: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
....
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
....
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
....
Caused by: java.lang.reflect.InvocationTargetException
....
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed
I checked the Spark Thrift Server log, and it looks like this:
22/10/07 15:07:31 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V10
22/10/07 15:07:31 INFO HiveSessionImpl: Operation log session directory is created: /tmp/spark3/operation_logs/64eb19a6-1bdc-4ed8-81c9-8881c4251e75
22/10/07 15:07:31 INFO metastore: Trying to connect to metastore with URI thrift://<host>:<port>
22/10/07 15:07:32 INFO metastore: Opened a connection to metastore, current connections: 1
22/10/07 15:07:32 INFO metastore: Connected to metastore.
22/10/07 15:07:39 INFO SparkExecuteStatementOperation: Submitting query 'show databases' with fdcf90cb-74bb-4574-99b7-bfd981ce8010
22/10/07 15:07:39 INFO SparkExecuteStatementOperation: Running query with fdcf90cb-74bb-4574-99b7-bfd981ce8010
22/10/07 15:07:39 INFO metastore: Closed a connection to metastore, current connections: 0
22/10/07 15:07:39 INFO metastore: Trying to connect to metastore with URI thrift://<host>:<port>
22/10/07 15:07:39 ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
22/10/07 15:07:39 WARN metastore: Failed to connect to the MetaStore Server...
22/10/07 15:07:39 INFO metastore: Waiting 5 seconds before next connection attempt.
22/10/07 15:07:44 INFO metastore: Trying to connect to metastore with URI thrift://<host>:<port>
22/10/07 15:07:44 ERROR TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
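
In the log above, the first metastore connection succeeds, and the "Failed to find any Kerberos tgt" error only appears when the query triggers a reconnect. A minimal sketch for pulling those lines out of a longer log, assuming the log file path is passed as the first argument (both function names are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: prints metastore connection events and Kerberos/SASL
# failures from a Spark Thrift Server log, in their original order, so the
# point where the connection starts failing is easy to spot.
metastore_timeline() {
  grep -E 'metastore: |SASL negotiation failure|Failed to find any Kerberos tgt' "$1"
}

# Hypothetical helper: counts the lines that report a missing Kerberos TGT.
count_missing_tgt() {
  grep -c 'Failed to find any Kerberos tgt' "$1"
}

# Example usage (the log path is a placeholder):
#   metastore_timeline /path/to/spark-thriftserver.log
#   count_missing_tgt /path/to/spark-thriftserver.log
```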
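Connecting to the Spark Thrift Server succeeds, but as soon as I run a query I get the error above. What am I doing wrong?
Spark Thrift Server is built on a single Spark application and, unfortunately, it does not support impersonation yet.
Maybe you can try Apache Kyuubi: https://github.com/apache/incubator-kyuubi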