From Data Science Experience I am able to establish a connection to the Hive database on BigInsights and read the table schema, but Data Science Experience does not seem to be able to read the table contents: the count comes back as zero. Here is part of my setup:
from pyspark import SparkConf
from pyspark.sql import SparkSession

# Disable the platform's metadata discovery and build a Hive-enabled session.
conf = SparkConf().set("com.ibm.analytics.metadata.enabled", "false")
spark = SparkSession.builder.config(conf=conf).enableHiveSupport().getOrCreate()

dash = {
    'jdbcurl': 'jdbc:hive2://nnnnnnnnnnn:10000/;ssl=true;',
    'user': 'xxxxxxxxxx',
    'password': 'xxxxxxxxx',
}

# Read the Hive table over JDBC.
offers = spark.read.jdbc(dash['jdbcurl'],
                         table='offers',
                         properties={"user": dash["user"],
                                     "password": dash["password"]})
offers.count() returns: 0
offers.show() returns:
+-----------+----------+
|offers.name|offers.age|
+-----------+----------+
+-----------+----------+
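In case it helps narrow things down, here is a minimal sketch of the same read with the Hive JDBC driver class named explicitly and the table wrapped in a subquery. The driver class "org.apache.hive.jdbc.HiveDriver" and the alias "t" are my own additions, not anything Data Science Experience requires, and I am not claiming this changes the result:

# Sketch only: same JDBC read, but with the driver class passed explicitly
# and the table expressed as a subquery with an alias.
offers_sub = spark.read.jdbc(
    dash['jdbcurl'],
    table='(SELECT name, age FROM offers) t',
    properties={"user": dash["user"],
                "password": dash["password"],
                "driver": "org.apache.hive.jdbc.HiveDriver"})
offers_sub.count()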
Thanks.
Yes, I am able to see the same behavior with the Hive JDBC connector. I tried the Python connector documented here, and it returned the correct count:
https://datascience.ibm.com/docs/content/analyze-data/python_load.html
from ingest.Connectors import Connectors

HiveloadOptions = { Connectors.Hive.HOST : 'bi-hadoop-prod-4222.bi.services.us-south.bluemix.net',
                    Connectors.Hive.PORT : '10000',
                    Connectors.Hive.SSL : True,
                    Connectors.Hive.DATABASE : 'default',
                    Connectors.Hive.USERNAME : 'charles',
                    Connectors.Hive.PASSWORD : 'march14march',
                    Connectors.Hive.SOURCE_TABLE_NAME : 'student'}

HiveDF = sqlContext.read.format("com.ibm.spark.discover").options(**HiveloadOptions).load()

HiveDF.printSchema()
HiveDF.show()
HiveDF.count()
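If sqlContext is not predefined in your notebook, the same read should be expressible through the SparkSession created earlier. This is only a sketch, assuming the com.ibm.spark.discover data source accepts the identical options:

# Sketch only: the ingest read issued through the SparkSession instead of
# sqlContext, reusing HiveloadOptions unchanged.
HiveDF = (spark.read
               .format("com.ibm.spark.discover")
               .options(**HiveloadOptions)
               .load())
HiveDF.count()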
Thanks, Charles.