Loading data into a Hive static partition table using the load command

Please bear with me if this is a very basic question.

The input file, test.txt:

1 ravi 100 hyd
2 krishna 200 hyd
3 fff 300 sec

I have created a table in Hive partitioned on city and loaded the data as follows:

create external table temp(id int, name string, sal int) 
partitioned by(city string) 
location '/testing';

load data inpath '/test.txt' into table temp partition(city='hyd');

The HDFS structure is /testing/temp/city=hyd/test.txt

When I query the table with "select * from temp";

Output:

temp.id temp.name temp.sal temp.city  
    1   ravi    100 hyd  
    2   krishna 200 hyd  
    3   fff     300 hyd

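My question is: why did the city name in the third row change from "sec" to "hyd" in the output?

Is something wrong on my side?

Thanks in advance!!

Your problem is here: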

load data inpath '/test.txt' into table temp partition(city='hyd');

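Every row you load into this partition gets city = 'hyd', whatever the file itself says. With static partitioning, it is your responsibility to put the right rows into the right partition.

Remove the last line from your txt file, put it into test2.txt, and execute: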
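A quick way to see this effect is to filter on the partition column. The queries below are only a sketch, assuming the table temp from the question and that nothing but the single load statement above has been run:

```sql
-- city is never read from the file; it is the constant named in the partition clause.
-- Assuming only the load above ran, no partition city='sec' exists, so this returns no rows.
select id, name, sal, city from temp where city = 'sec';

-- Every row of test.txt, including the one that says "sec" in the file, is counted here.
select count(*) from temp where city = 'hyd';
```

The fourth field in each line of test.txt has no matching data column in temp (only id, name, and sal are declared), so the text reader ignores it.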

load data inpath '/test.txt' into table temp partition(city='hyd');
load data inpath '/test2.txt' into table temp partition(city='sec');

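Yes, it is not very convenient, but this is how static partitioning works.

I believe partitioning cannot work as intended when a single file holding several cities is loaded with one load statement.
Instead, we need to load the data into a temporary staging table in Hive (stat_parti), and from there write it into a separate partitioned table (stat_test):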

create external table stat_test(id int, name string, sal int)
partitioned by(city string) 
row format delimited fields 
terminated by ' ' 
location '/user/test/stat_test';
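The answer does not show how the staging table stat_parti is defined. A minimal sketch, assuming it is a plain non-partitioned table with city as an ordinary fourth column, the same space delimiter as stat_test, and the file path from the question:

```sql
-- Hypothetical staging table: city is a regular data column here, not a partition.
create table stat_parti(id int, name string, sal int, city string)
row format delimited fields terminated by ' ';

-- Moves the file into the staging table's directory; each row keeps its own city value.
load data inpath '/test.txt' into table stat_parti;
```

Because city is read from the file in this table, it can later be used in a where clause or passed through to a partitioned insert.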

From the staging table, we can populate stat_test using either static or dynamic partitioning.

1) Static partitioning

insert into table stat_test partition(city='hyd') select id,name,sal from stat_parti where city='hyd';  
insert into table stat_test partition(city='sec') select id,name,sal from stat_parti where city='sec';

2) Dynamic partitioning

Here we first need to enable dynamic partitioning:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
-- the dynamic partition column (city) must come last in the select list
insert overwrite table stat_test partition(city) select id,name,sal,city from stat_parti;
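To check that the insert created one partition per distinct city, something like the following could be run afterwards. This is only a sketch, assuming stat_parti holds the three sample rows from the question:

```sql
-- Lists the partitions Hive has registered for the table, one line per city value.
show partitions stat_test;

-- Row count per partition; rows should now be grouped by the city value from the data.
select city, count(*) from stat_test group by city;
```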

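The data file test.txt was copied to the HDFS path '/testing/temp/city=hyd/test.txt', so all of its data goes into the partition 'city=hyd'.

Hive takes the partition value from the directory name, so the city field of every row comes from the directory name, hyd.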
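One way to observe this is Hive's INPUT__FILE__NAME virtual column, which reports the file each row was read from. A small sketch, assuming the table temp from the question:

```sql
-- For each row, show the partition value next to the path of the file that holds it.
-- The city value should match the city=... directory component of the path.
select id, name, city, INPUT__FILE__NAME from temp;
```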
