In my Hive table, I have a column "MYCOL" that contains the following:
{"id": "a651b57f",
"items": {
"ITEM1": {
"code": "CODE1",
"name": "NAME1"},
"ITEM2": {
"code": "CODE2",
"name": "NAME2"}},
"myinfo": {
"c7daf1a9": {
"id": "c7daf1a9",
"name": "newname",
"type": "newtype",
"appliedto": ["ITEM1", "ITEM2"]}},
"info2": 12}
I want to access the elements inside "myinfo". I tried something like this:
select GET_JSON_OBJECT(t.MYCOL,'$.myinfo') FROM MYTABLE
But it doesn't work...
Can anyone help me?
Thanks
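Make sure each JSON record in the HDFS file sits on a single line (not one record spread across multiple lines). With the default text file format, Hive treats every newline as a row boundary, so a multi-line record is split into fragments that are not valid JSON, and get_json_object returns NULL for invalid input.
- If a JSON record contains newlines, remove them from each record before storing the data in HDFS.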
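A minimal sketch of that clean-up step in Python, assuming the raw file is small enough to read into memory and holds one or more pretty-printed JSON objects back to back. The function names iter_json_values and to_json_lines are made up for this illustration; they are not part of Hive or any HDFS tooling.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found in text, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        yield value


def to_json_lines(text):
    """Re-serialize every JSON value in text as exactly one line."""
    return "\n".join(
        json.dumps(value, separators=(",", ":"), ensure_ascii=False)
        for value in iter_json_values(text)
    )
```

Write the string returned by to_json_lines to a file and upload that file to HDFS, so that every row Hive reads is one complete JSON document. Parsing and re-serializing, rather than blindly deleting newline characters, also surfaces malformed records early, because json raises an error on them.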
Example:
HDFS file data:
{"id": "a651b57f","items": {"ITEM1": {"code": "CODE1","name": "NAME1"},"ITEM2": {"code": "CODE2","name": "NAME2"}},"myinfo": {"c7daf1a9": {"id": "c7daf1a9","name": "newname","type": "newtype","appliedto": ["ITEM1", "ITEM2"]}},"info2": 12}
Hive:
with cte as (select string('{"id": "a651b57f","items": {"ITEM1": {"code": "CODE1","name": "NAME1"},"ITEM2": {"code": "CODE2","name": "NAME2"}},"myinfo": {"c7daf1a9": {"id": "c7daf1a9","name": "newname","type": "newtype","appliedto": ["ITEM1", "ITEM2"]}},"info2": 12}')my_col) --sample data
select get_json_object(my_col,'$.myinfo')jsn from cte;
Output:
{"c7daf1a9":{"id":"c7daf1a9","name":"newname","type":"newtype","appliedto":["ITEM1","ITEM2"]}}
Update
--to access the name subfield, specify the full JSON path down to that field
hive> select get_json_object(my_col,'$.myinfo.c7daf1a9.name')jsn from <table_name>;
--result
newname
hive> select get_json_object(my_col,'$.myinfo.c7daf1a9.appliedto')jsn from <table_name>;
--result
["ITEM1","ITEM2"]
hive> select get_json_object(my_col,'$.myinfo.c7daf1a9.appliedto[0]')jsn from <table_name>;
--result
ITEM1