I am trying to connect to a Kerberized HDFS cluster with the code below. With the same code I can access HBase via HBaseConfiguration without any problem.
import java.security.PrivilegedExceptionAction;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

Configuration config = new Configuration();
config.set("hadoop.security.authentication", "Kerberos");
UserGroupInformation.setConfiguration(config);
UserGroupInformation ugi =
        UserGroupInformation.loginUserFromKeytabAndReturnUGI("me@EXAMPLE.COM", "me.keytab");
// model, hcb and testHadoop are fields/helpers elsewhere in my class
model = ugi.doAs((PrivilegedExceptionAction<Map<String, Object>>) () -> {
    testHadoop(hcb.gethDFSConfigBean());
    return null;
});
I have been able to access Solr and Impala successfully with the same keytab and principal, yet for HDFS I get this strange failure to find the service name.
Please see the stack trace below:
java.io.IOException: Failed on local exception: java.io.IOException: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name; Host Details : local host is: "Securonix-int3.local/10.0.4.36"; destination host is: "sobd189.securonix.com":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1988)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
at com.securonix.application.ui.uiUtil.SnyperUIUtil.lambda$main$4(SnyperUIUtil.java:1226)
at com.securonix.application.ui.uiUtil.SnyperUIUtil$$Lambda$6/1620890840.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at com.securonix.application.ui.uiUtil.SnyperUIUtil.main(SnyperUIUtil.java:1216)
Caused by: java.io.IOException: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 23 more
Caused by: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:322)
at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:231)
at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:159)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:717)
After I enabled Kerberos debugging, I got the debug log below when I called FileSystem.get(). Kerberos debug log:
Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
16/02/22 15:53:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
>>> KdcAccessibility: reset
>>> KdcAccessibility: reset
>>> KeyTabInputStream, readName(): EXAMPLE.COM
>>> KeyTabInputStream, readName(): securonix
>>> KeyTab: load() entry length: 55; type: 23
>>> KeyTabInputStream, readName(): EXAMPLE.COM
>>> KeyTabInputStream, readName(): securonix
>>> KeyTab: load() entry length: 71; type: 18
Looking for keys for: securonix@EXAMPLE.COM
Added key: 18version: 1
Added key: 23version: 1
Looking for keys for: securonix@EXAMPLE.COM
Added key: 18version: 1
Added key: 23version: 1
default etypes for default_tkt_enctypes: 18 18 16.
>>> KrbAsReq creating message
>>> KrbKdcReq send: kdc=sobd189.securonix.com TCP:88, timeout=30000, number of retries=3, #bytes=139
>>> KDCCommunication: kdc=sobd189.securonix.com TCP:88, timeout=30000, Attempt=1, #bytes=139
>>> DEBUG: TCPClient reading 639 bytes
>>> KrbKdcReq send: #bytes read=639
>>> KdcAccessibility: remove sobd189.securonix.com
Looking for keys for: securonix@EXAMPLE.COM
Added key: 18version: 1
Added key: 23version: 1
>>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
>>> KrbAsRep cons in KrbAsReq.getReply securonix
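For reference, the output above is the JDK's built-in Kerberos tracing; a minimal way to switch it on is a standard JDK system property, set before the first login:

// Enable the JDK's Kerberos debug output shown above.
// Equivalent on the command line: -Dsun.security.krb5.debug=true
System.setProperty("sun.security.krb5.debug", "true");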
Interestingly, when I use a FileSystem API such as hdfs.exists(), I get:
>>> KinitOptions cache name is /tmp/krb5cc_501
>> Acquire default native Credentials
default etypes for default_tkt_enctypes: 18 18 16.
>>> Found no TGT's in LSA
I think the problem is that HDFS expects the Configuration to carry a value for dfs.datanode.kerberos.principal, the principal of the datanodes, and it is missing here.
I ran into the same issue when I created a Configuration instance from core-site.xml only and forgot to add hdfs-site.xml, which contains:
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>....</value>
</property>
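For context, a minimal sketch of a client that loads both files before logging in; the /etc/hadoop/conf paths are an assumption about the cluster layout, and the principal/keytab are the ones from the question:

import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberizedHdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        // hdfs-site.xml is what carries dfs.datanode.kerberos.principal;
        // loading core-site.xml alone reproduces the error above.
        config.addResource(new Path("/etc/hadoop/conf/core-site.xml")); // assumed location
        config.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml")); // assumed location
        config.set("hadoop.security.authentication", "Kerberos");
        UserGroupInformation.setConfiguration(config);
        UserGroupInformation ugi =
                UserGroupInformation.loginUserFromKeytabAndReturnUGI("me@EXAMPLE.COM", "me.keytab");
        ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
            FileSystem fs = FileSystem.get(config);
            System.out.println("/ exists: " + fs.exists(new Path("/")));
            return null;
        });
    }
}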
Hope this helps.
I ran into the same problem with Spark2 and HDP 3.1, using Isilon/OneFS as storage instead of HDFS.
The OneFS service management pack does not provide configuration for some of the HDFS parameters that Spark2 expects (they are not available in Ambari at all), such as dfs.datanode.kerberos.principal. Without these parameters, the Spark2 HistoryServer may fail to start and report errors such as "Failed to specify server's principal name".
I added the following properties to OneFS under Custom hdfs-site:
dfs.datanode.kerberos.principal=hdfs/_HOST@<MY REALM>
dfs.datanode.keytab.file=/etc/security/keytabs/hdfs.service.keytab
dfs.namenode.kerberos.principal=hdfs/_HOST@<MY REALM>
dfs.namenode.keytab.file=/etc/security/keytabs/hdfs.service.keytab
This resolved the initial error. After that, I got errors of the following form:
Server has invalid Kerberos principal: hdfs/<isilon>.my.realm.com@my.realm.com, expecting: hdfs/somewhere.else.entirely@my.realm.com
This was related to cross-realm authentication. It was resolved by adding the following setting to Custom hdfs-site:
dfs.namenode.kerberos.principal.pattern=*
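In hdfs-site.xml form that is the snippet below. Note that * matches any server principal; the pattern is consulted client-side, I believe in SaslRpcClient.getServerPrincipal (the frame at the top of the inner trace above), so a tighter glob such as hdfs/*@MY.REALM.COM is safer where it works:

<property>
  <name>dfs.namenode.kerberos.principal.pattern</name>
  <value>*</value>
</property>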