Spark SQL does not return records for Hive transactional tables on HDP

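>I ran into this problem on an HDP setup, where a transactional table has to be compacted once before Spark SQL can fetch its records. An Apache setup, on the other hand, does not need even one compaction.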

Possibly the compaction triggers something on the metastore, after which Spark SQL starts recognizing the delta files.

Let me know if any other details are needed to find the root cause.

Try this.

Here is the complete scenario:

hive> create table default.foo(id int) clustered by (id) into 2 buckets STORED AS ORC TBLPROPERTIES ('transactional'='true');
hive> insert into default.foo values(10);
scala> sqlContext.table("default.foo").count // Gives 0, which is wrong because data is still in delta files
#Now run major compaction:
hive> ALTER TABLE default.foo COMPACT 'MAJOR';
scala> sqlContext.table("default.foo").count // Gives 1
hive> insert into foo values(20);
scala> sqlContext.table("default.foo").count // Gives 2 , no compaction required.
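The comment on the first count says the data is still in delta files. A quick way to check that is to look at the sub-directories of the table directory: Hive writes new transactional data into delta_* directories and compacted data into base_* directories. Below is a minimal sketch of such a check, not a fix. The AcidLayout helper is hypothetical, and the warehouse path in the commented usage is an assumption about an HDP default, so adjust it to your cluster.

```scala
// Hypothetical helper (not part of Spark or Hive): classify the
// sub-directories of a Hive transactional table by naming convention.
object AcidLayout {
  final case class Layout(base: Seq[String], delta: Seq[String], other: Seq[String]) {
    // True when rows exist only in delta directories,
    // i.e. no major compaction has produced a base_* directory yet.
    def onlyDeltas: Boolean = base.isEmpty && delta.nonEmpty
  }

  def classify(dirNames: Seq[String]): Layout = {
    val base = dirNames.filter(_.startsWith("base_"))
    // delete_delta_* directories appear with newer Hive ACID versions.
    val delta = dirNames.filter(n => n.startsWith("delta_") || n.startsWith("delete_delta_"))
    val other = dirNames.filterNot(n => base.contains(n) || delta.contains(n))
    Layout(base, delta, other)
  }
}

// Usage in spark-shell (the path is an assumption, check your warehouse location):
//   import org.apache.hadoop.fs.{FileSystem, Path}
//   val fs = FileSystem.get(sc.hadoopConfiguration)
//   val names = fs.listStatus(new Path("/apps/hive/warehouse/foo")).map(_.getPath.getName).toSeq
//   println(AcidLayout.classify(names))
```

If onlyDeltas is true right after the first insert and false after the major compaction, that matches the counts seen in the scenario above.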

Spark does not support any of the functionality of Hive transactional tables.

Please check: https://issues.apache.org/jira/browse/SPARK-15348
