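I'm working on an assignment for the Coursera Python course. The goal is to add up the count for each username to get a final total.
XML: http://py4e-data.dr-chuck.net/comments_42.xml
If I copy and paste the XML and parse it with the program below, it works fine.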
import xml.etree.ElementTree as ET
input = (XML string goes here)
ct = 0
stuff = ET.fromstring(input)
lst = stuff.findall('comments/comment')
for item in lst:
    print('Name', item.find('name').text)
    print('Count', item.find('count').text)
    ct = ct + int(item.find('count').text)
print(ct)
The problem comes when I try to fetch it directly from the URL. In that case I tried two approaches:
import urllib.request,urllib.parse, urllib.error
import xml.etree.ElementTree as ET
uh = urllib.request.urlopen('http://py4e-data.dr-chuck.net/comments_42.xml')
data = uh.read()
print(data.decode())
tree = ET.fromstring(data)
lst = commentinfo.findall('comments/comment')
for item in lst:
    print('Count', item.find('count').text)
This results in the following error:
Traceback (most recent call last):
  File "C:UserspatriDesktopPY4EMaterialscode3urllib1.py", line 10, in <module>
    lst = commentinfo.findall('comments/comment')
NameError: name 'commentinfo' is not defined
The second approach is the one the assignment suggests, which accesses the counts like this:
counts = tree.findall('.//count')
So I wrote the following code:
import urllib.request,urllib.parse, urllib.error
import xml.etree.ElementTree as ET
uh = urllib.request.urlopen('http://py4e-data.dr-chuck.net/comments_42.xml')
data = uh.read()
print(data.decode())
tree = ET.fromstring(data)
counts = tree.findall('.//count')
for item in counts:
    print('Count', item.find('count').text)
This evidently produces a None, and I can't figure out what to do about it:
Traceback (most recent call last):
  File "C:UserspatriDesktopPY4EMaterialscode3urllib1.py", line 12, in <module>
    print('Count', item.find('count').text)
AttributeError: 'NoneType' object has no attribute 'text'
In the first snippet, the error NameError: name 'commentinfo' is not defined occurs because the variable commentinfo was never declared:
import urllib.request,urllib.parse, urllib.error
import xml.etree.ElementTree as ET
uh = urllib.request.urlopen('http://py4e-data.dr-chuck.net/comments_42.xml')
data = uh.read()
print(data.decode())
tree = ET.fromstring(data)
# commentinfo not declared
lst = commentinfo.findall('comments/comment')
for item in lst:
    print('Count', item.find('count').text)
Replace it with the variable tree to make the code work:
import urllib.request,urllib.parse, urllib.error
import xml.etree.ElementTree as ET
uh = urllib.request.urlopen('http://py4e-data.dr-chuck.net/comments_42.xml')
data = uh.read()
print(data.decode())
tree = ET.fromstring(data)
lst = tree.findall('comments/comment')
for item in lst:
    print('Count', item.find('count').text)
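The path 'comments/comment' works on tree because ET.fromstring returns the root element itself, and paths are resolved relative to that element. Here is a minimal sketch of that behaviour. It assumes the feed has the shape the question's paths imply (a commentinfo root containing comments, which contains comment elements with name and count children); the sample string and its values are made up for illustration and are not the real data:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that mimics the assumed structure of the real feed.
SAMPLE = """<commentinfo>
  <comments>
    <comment><name>Alice</name><count>3</count></comment>
    <comment><name>Bob</name><count>4</count></comment>
  </comments>
</commentinfo>"""

root = ET.fromstring(SAMPLE)   # fromstring returns the root Element
root_tag = root.tag            # the tag of the outermost element

# Paths are relative to the root, so the root's own tag is not part of them.
with_root_prefix = root.findall('commentinfo/comments/comment')  # no match
relative = root.findall('comments/comment')                      # both comments

names = [c.find('name').text for c in relative]
print(root_tag, len(with_root_prefix), names)
```

So the root's tag never appears in the path; what goes in front of .findall is whatever variable holds the parsed element, here tree.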
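In the second snippet, the expression tree.findall('.//count') already returns a list of count elements. So when item.find('count') is called inside the loop, it looks for a child element named count inside each count element, finds none, and returns None, which causes the error AttributeError: 'NoneType' object has no attribute 'text'. To fix it, drop the item.find('count') call from the loop and read item.text directly: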
import urllib.request,urllib.parse, urllib.error
import xml.etree.ElementTree as ET
uh = urllib.request.urlopen('http://py4e-data.dr-chuck.net/comments_42.xml')
data = uh.read()
print(data.decode())
tree = ET.fromstring(data)
counts = tree.findall('.//count')
for item in counts:
    print('Count', item.text)
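To reach the stated goal of a final total, the same loop can accumulate the values instead of printing them. The following is only a sketch: sum_counts and fetch are hypothetical helper names, not part of the assignment, and the sample bytes are invented stand-ins for the real feed so the example runs without network access.

```python
import urllib.request
import xml.etree.ElementTree as ET


def sum_counts(xml_data):
    """Return the sum of every <count> element found anywhere in xml_data."""
    root = ET.fromstring(xml_data)
    total = 0
    for count in root.findall('.//count'):
        # Each item is already a <count> element, so read its text directly.
        total += int(count.text)
    return total


def fetch(url):
    """Download url and return the raw bytes (defined but not called here)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


# Hypothetical sample with the assumed structure; values are invented.
SAMPLE = b"""<commentinfo>
  <comments>
    <comment><name>Alice</name><count>3</count></comment>
    <comment><name>Bob</name><count>4</count></comment>
  </comments>
</commentinfo>"""

counts = ET.fromstring(SAMPLE).findall('.//count')
# A <count> element has no <count> child, so find() gives None here,
# which is what triggered the AttributeError in the question.
nested_lookup = counts[0].find('count')

total = sum_counts(SAMPLE)
print(nested_lookup, total)
```

Against the real feed, the call would be sum_counts(fetch('http://py4e-data.dr-chuck.net/comments_42.xml')).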