XML parsing: findall() returns an empty list



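I'm stuck on a task that involves fetching a URL and parsing XML. I've got the data, but I can't seem to get findall() to work. I know that once I get findall() working, I'll have a list I can loop through. Any insight would be great, and if possible I'd prefer a gentle hint rather than the straight answer. Many thanks.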

import urllib.request, urllib.parse, urllib.error
import xml.etree.ElementTree as ET
fhand = urllib.request.urlopen('http://py4e-data.dr-chuck.net/comments_42.xml')
raw_data = fhand.read().decode()
xml_data = ET.fromstring(raw_data)
lst = xml_data.findall('name')
print(lst)

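findall is not recursive: it will not find a node/element unless it sits directly beneath the element you call findall on (unless you use an XPath expression, that is).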
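To see the direct-children behaviour without touching the network, here is a minimal sketch. The inline `SAMPLE` document is a hypothetical stand-in that is assumed to mirror the nesting of the real feed (a root element, then `comments`, then `comment`, then `name`); it is not the actual data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, assumed to mimic the nesting of the real feed.
SAMPLE = """<commentinfo>
  <comments>
    <comment><name>Romina</name></comment>
    <comment><name>Laurie</name></comment>
  </comments>
</commentinfo>"""

root = ET.fromstring(SAMPLE)

# <name> is not a direct child of the root, so a bare tag name matches nothing.
direct = root.findall('name')

# Called on a <comment> element, the same bare tag name does match,
# because <name> sits directly beneath <comment>.
first_comment = root.find('comments/comment')
from_comment = [n.text for n in first_comment.findall('name')]

print(direct)        # []
print(from_comment)  # ['Romina']
```

The empty list in the question comes from the first case: the search starts at the root, and `name` lives three levels further down.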

Instead, use iter:

import urllib.request
import xml.etree.ElementTree as ET
fhand = urllib.request.urlopen('http://py4e-data.dr-chuck.net/comments_42.xml')
raw_data = fhand.read().decode()
xml_data = ET.fromstring(raw_data)
for name_node in xml_data.iter('name'):
    print(name_node.text)

Or use findall with an XPath expression:

xml_data.findall('comments/comment/name')

Both will output:

Romina
Laurie
Bayli
Siyona
Taisha
Alanda
Ameelia
Prasheeta
Asif
Risa
Zi
Danyil
Ediomi
Barry
Lance
Hattie
Mathu
Bowie
Samara
Uchenna
Shauni
Georgia
Rivan
Kenan
Hassan
Isma
Samanthalee
Alexa
Caine
Grady
Anne
Rihan
Alexei
Indie
Rhuairidh
Annoushka
Kenzi
Shahd
Irvine
Carys
Skye
Atiya
Rohan
Nuala
Maram
Carlo
Japleen
Breeanna
Zaaine
Inika
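As a variation on the XPath form above, ElementTree's path syntax also accepts `.//name`, which matches `name` elements at any depth below the starting element. The sketch below compares the three approaches on a hypothetical inline sample (assumed to have the same nesting as the real feed), so it runs offline.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, assumed to mimic the nesting of the real feed.
SAMPLE = """<commentinfo>
  <comments>
    <comment><name>Romina</name></comment>
    <comment><name>Laurie</name></comment>
  </comments>
</commentinfo>"""

root = ET.fromstring(SAMPLE)

# Explicit path from the root down to each <name>.
full_path = [n.text for n in root.findall('comments/comment/name')]

# './/' searches all descendants, so the intermediate tags need not be spelled out.
any_depth = [n.text for n in root.findall('.//name')]

# iter() walks the whole subtree and yields every element with the given tag.
via_iter = [n.text for n in root.iter('name')]

print(full_path)
print(full_path == any_depth == via_iter)
```

The explicit path is stricter (it only matches that exact nesting), while `.//name` and `iter('name')` would also pick up `name` elements that appear elsewhere in the tree.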

You can also do this with the requests library and BeautifulSoup:

import requests
from bs4 import BeautifulSoup
response = requests.get('http://py4e-data.dr-chuck.net/comments_42.xml')
soup = BeautifulSoup(response.text, 'html.parser')
names = soup.find_all('name')
for name in names:
    print(name.text)

Output: the same 50 names as listed above (Romina through Inika).
