Spark SQL loads data without Hive being configured



I am reading JSON in Spark, but I am getting some warnings about Hive. I do not have Hive set up on my laptop. The code I used is:

scala> val dfs = spark.sql("SELECT * FROM json.`/Users/name/Desktop/constituents.json`")
21/12/18 23:48:08 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
21/12/18 23:48:08 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
21/12/18 23:48:13 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
21/12/18 23:48:13 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore shashanksathish@127.0.0.1
21/12/18 23:48:13 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
21/12/18 23:48:14 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
21/12/18 23:48:14 WARN ObjectStore: Failed to get database json, returning NoSuchObjectException
dfs: org.apache.spark.sql.DataFrame = [Name: string, Sector: string ... 1 more field]

I do not understand how the data is getting loaded into my variable.

Many sites use Spark with Spark-created tables or plain HDFS-only directories. Spark does not need Hive; those messages are just warnings. This is not the same as not needing Hadoop.
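
As a side note on how the data got into your variable: the json.`...` prefix in the FROM clause is Spark SQL's "run SQL on files directly" feature, so the file itself is treated as the table and nothing is looked up in (or written to) any metastore. The same read can be expressed through the DataFrame API; a minimal sketch, assuming the same local file path from your session:

import org.apache.spark.sql.SparkSession

// No .enableHiveSupport() here: Spark falls back to its built-in catalog,
// so no Hive installation is required for this to work.
val spark = SparkSession.builder()
  .appName("json-no-hive")
  .master("local[*]")
  .getOrCreate()

// Equivalent to: SELECT * FROM json.`/Users/name/Desktop/constituents.json`
val dfs = spark.read.json("/Users/name/Desktop/constituents.json")
dfs.printSchema()
dfs.show(5)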

For Parquet and Delta, the Hive metadata side of things, i.e. the Hive Metastore, is not needed.
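
For example, a Parquet round trip done purely by path never touches a metastore; a minimal sketch, reusing the dfs DataFrame from above with a hypothetical output directory:

// The schema travels inside the Parquet files themselves,
// so no Hive Metastore entry is needed on either side.
dfs.write.mode("overwrite").parquet("/tmp/constituents_parquet")

// Read it back by path, again without consulting any catalog.
val fromDisk = spark.read.parquet("/tmp/constituents_parquet")
fromDisk.show(5)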

If you need Ranger security at row level, then you need Hive external tables.
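
For illustration, registering such an external table from Spark looks roughly like this; a hedged sketch that only works against a session built with .enableHiveSupport() and a reachable Metastore, and where the table name and HDFS location are hypothetical:

// Requires Hive support; this fails on a plain Spark session.
// Table name and LOCATION are made up for illustration.
spark.sql("""
  CREATE EXTERNAL TABLE constituents (Name STRING, Sector STRING)
  STORED AS PARQUET
  LOCATION 'hdfs:///data/constituents_parquet'
""")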
