How to extract data from XML into a dataset with R



Here is my source link: source

<AOSBS_XML Name="HR_ODDS_WIN" Timestamp="2018-09-02 17:59:16" Version="L2.2R1C" ID="355">
<Meetings>
..........
<Pools>
<PoolInfo Pool="WIN" OddsUpdateTime="16:06" Enabled="0">
<OddsSet>
<OddsInfo Number="1" Odds="16" Scratched="0" OddsDrop="0.00" Hot="0" WillPay="16300"/>
<OddsInfo Number="2" Odds="3.5" Scratched="0" OddsDrop="0.00" Hot="0" WillPay="3550"/>
<OddsInfo Number="3" Odds="12" Scratched="0" OddsDrop="14.28" Hot="0" WillPay="12950"/>
<OddsInfo Number="4" Odds="12" Scratched="0" OddsDrop="0.00" Hot="0" WillPay="12950"/>
<OddsInfo Number="5" Odds="2.4" Scratched="0" OddsDrop="27.27" Hot="1" WillPay="2400"/>
<OddsInfo Number="6" Odds="6.6" Scratched="0" OddsDrop="0.00" Hot="0" WillPay="6600"/>
<OddsInfo Number="7" Odds="35" Scratched="0" OddsDrop="23.91" Hot="0" WillPay="35300"/>
<OddsInfo Number="8" Odds="8.2" Scratched="0" OddsDrop="18.00" Hot="0" WillPay="8250"/>
</OddsSet>
</PoolInfo>
</Pools>
...........
</Meetings>
</AOSBS_XML>

Here is my code:

library(XML)
# A single string needs no paste()
url <- "http://iosbsinfo02.hkjc.com/infoA/AOSBS/HR_GetInfo.ashx?QT=HR_ODDS_win&Venue=*&Race=7"
doc <- xmlParse(url)
root <- xmlRoot(doc)
root

However, I don't know how to extract the OddsSet part into a dataset. Can anyone help me?

This should work:

library(xml2)
library(purrr)  # map(), map_df() and the %>% pipe; map_df() needs dplyr installed
url <- "http://iosbsinfo02.hkjc.com/infoA/AOSBS/HR_GetInfo.ashx?QT=HR_ODDS_win&Venue=*&Race=7"
doc <- read_xml(url)
# Take the attributes of every child of OddsSet and bind them into one data frame
OddsSet <- xml_find_all(doc, ".//OddsSet") %>%
  xml_children() %>%
  map(xml_attrs) %>%
  map_df(~as.list(.))
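
In case the live feed is not reachable, here is a minimal sketch of the same idea on an inline string. The string is a trimmed copy of the sample from the question (three of the eight rows), and the names xml_sample and odds are only illustrative. It uses base R instead of purrr to bind the rows, and assumes every OddsInfo node carries the same attributes in the same order. All values come back as character strings, so the last step converts them to numeric.

```r
library(xml2)

# Trimmed copy of the sample in the question; assumed to match the feed's layout
xml_sample <- '<AOSBS_XML Name="HR_ODDS_WIN">
  <Pools>
    <PoolInfo Pool="WIN" OddsUpdateTime="16:06" Enabled="0">
      <OddsSet>
        <OddsInfo Number="1" Odds="16" Scratched="0" OddsDrop="0.00" Hot="0" WillPay="16300"/>
        <OddsInfo Number="2" Odds="3.5" Scratched="0" OddsDrop="0.00" Hot="0" WillPay="3550"/>
        <OddsInfo Number="5" Odds="2.4" Scratched="0" OddsDrop="27.27" Hot="1" WillPay="2400"/>
      </OddsSet>
    </PoolInfo>
  </Pools>
</AOSBS_XML>'

# read_xml() accepts literal XML as well as a path or URL
doc <- read_xml(xml_sample)
nodes <- xml_find_all(doc, ".//OddsSet/OddsInfo")

# xml_attrs() gives one named character vector per node; stack them row-wise
odds <- as.data.frame(do.call(rbind, xml_attrs(nodes)), stringsAsFactors = FALSE)

# Attribute values are strings; convert every column to numeric
odds[] <- lapply(odds, as.numeric)
```

After this, odds has one row per OddsInfo node and one numeric column per attribute (Number, Odds, Scratched, OddsDrop, Hot, WillPay).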

For attribute-centric data extraction, consider the XML package's internal method xmlAttrsToDataFrame, accessible via the triple-colon operator:

library(XML)
...
df <- XML:::xmlAttrsToDataFrame(getNodeSet(doc, path='//OddsInfo'))
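
Because the triple-colon operator reaches a function the package does not export, it may change between releases. As a hedged alternative, the sketch below gets a similar result with exported XML functions only (xmlParse, getNodeSet, xmlAttrs). The inline string is a trimmed copy of the sample from the question, the name xml_sample is illustrative, and the code assumes all OddsInfo nodes share the same attributes.

```r
library(XML)

# Trimmed copy of the sample in the question; assumed to match the feed's layout
xml_sample <- '<AOSBS_XML Name="HR_ODDS_WIN">
  <Pools>
    <PoolInfo Pool="WIN" OddsUpdateTime="16:06" Enabled="0">
      <OddsSet>
        <OddsInfo Number="1" Odds="16" Scratched="0" OddsDrop="0.00" Hot="0" WillPay="16300"/>
        <OddsInfo Number="2" Odds="3.5" Scratched="0" OddsDrop="0.00" Hot="0" WillPay="3550"/>
      </OddsSet>
    </PoolInfo>
  </Pools>
</AOSBS_XML>'

# asText = TRUE tells xmlParse() the argument is XML content, not a file or URL
doc <- xmlParse(xml_sample, asText = TRUE)
nodes <- getNodeSet(doc, "//OddsInfo")

# xmlAttrs() returns a named character vector per node; stack them row-wise
df <- as.data.frame(do.call(rbind, lapply(nodes, xmlAttrs)), stringsAsFactors = FALSE)
```

The columns of df are character vectors; apply as.numeric() to the ones you need as numbers.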
