Loading complex nested XML into a Hive table



I have a complex nested XML document, shown below.

<items>
<item id="0001" type="donut">
<name>Cake</name>
<ppu>0.55</ppu>
<batters>
<batter id="1001">Regular</batter>
<batter id="1002">Chocolate</batter>
<batter id="1003">Blueberry</batter>
<batter id="1003">Devil's Food</batter>
</batters>
<topping id="5001">None</topping>
<topping id="5002">Glazed</topping>
<topping id="5005">Sugar</topping>
<topping id="5007">Powdered Sugar</topping>
<topping id="5006">Chocolate with Sprinkles</topping>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
</item>
<item id="0002" type="donut">
<name>Raised</name>
<ppu>0.55</ppu>
<batters>
<batter id="1001">Regular</batter>
</batters>
<topping id="5001">None</topping>
<topping id="5002">Glazed</topping>
<topping id="5005">Sugar</topping>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
</item>
<item id="0003" type="donut">
<name>Buttermilk</name>
<ppu>0.55</ppu>
<batters>
<batter id="1001">Regular</batter>
<batter id="1002">Chocolate</batter>
</batters>
</item>
<item id="0004" type="bar">
<name>Bar</name>
<ppu>0.75</ppu>
<batters>
<batter id="1001">Regular</batter>
</batters>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
<fillings>
<filling id="7001">
<name>None</name>
<addcost>0</addcost>
</filling>
<filling id="7002">
<name>Custard</name>
<addcost>0.25</addcost>
</filling>
<filling id="7003">
<name>Whipped Cream</name>
<addcost>0.25</addcost>
</filling>
</fillings>
</item>
<item id="0005" type="twist">
<name>Twist</name>
<ppu>0.65</ppu>
<batters>
<batter id="1001">Regular</batter>
</batters>
<topping id="5002">Glazed</topping>
<topping id="5005">Sugar</topping>
</item>
<item id="0006" type="filled">
<name>Filled</name>
<ppu>0.75</ppu>
<batters>
<batter id="1001">Regular</batter>
</batters>
<topping id="5002">Glazed</topping>
<topping id="5007">Powdered Sugar</topping>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
<fillings>
<filling id="7002">
<name>Custard</name>
<addcost>0</addcost>
</filling>
<filling id="7003">
<name>Whipped Cream</name>
<addcost>0</addcost>
</filling>
<filling id="7004">
<name>Strawberry Jelly</name>
<addcost>0</addcost>
</filling>
<filling id="7005">
<name>Rasberry Jelly</name>
<addcost>0</addcost>
</filling>
</fillings>
</item>
</items>

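I can extract the ids 1001, 1002, and 1003, but I cannot extract the values that go with them. I loaded the XML into a Hive table and am extracting with XPath. I need to get the values Regular, Chocolate, and Blueberry.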

I loaded the XML above into a Hive table (Store.Choclate) and queried it as

select xpath(str, '/items/item/batters/batter/@id') from Store.Choclate

This returns the values 1001, 1002, 1003. How do I write a query that extracts Regular, Chocolate, and Blueberry instead?

I loaded the XML data into a single table and created a view on top of it. The query was written as

select xpath(str, '/items/item/batters/batter[@id="1001"]/text()')

This extracts the value "Regular". Queries for the other fields can be built along the same lines.
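As a rough sketch of how the same idea might extend to the other fields, the queries below drop or change the `@id` predicate. They assume the table and column names from the question (`Store.Choclate`, `str`) and that each row of `str` holds the whole XML document; adjust them for your own schema or view. `xpath` and `xpath_string` are Hive's built-in XPath UDFs.

```sql
-- All batter names across every item (xpath returns array<string>).
select xpath(str, '/items/item/batters/batter/text()')
from Store.Choclate;

-- Batter names for a single item, selected by the item's id attribute.
select xpath(str, '/items/item[@id="0001"]/batters/batter/text()')
from Store.Choclate;

-- One value as a plain string rather than a one-element array.
select xpath_string(str, '/items/item[@id="0001"]/batters/batter[@id="1002"]')
from Store.Choclate;
```

The difference from the query in the question is the final step of the path: `@id` selects the attribute, while `text()` selects the element's content.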
