How to compare two strings from an XML URL in Python



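I need to work out whether two strings from a site are equal. If they are the same, move on to the next string and save it; if they are different, save it.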

from bs4 import BeautifulSoup as soup
from urllib.request import urlopen
import time

def get_data():
    site = 'http://www.televideo.rai.it/televideo/pub/rss102.xml'
    op = urlopen(site)
    rd = op.read()
    op.close()
    sp_page = soup(rd, 'xml')
    news_list = sp_page.find_all('item')
    for news in news_list:
        print(news.title.text)
        print(news.pubDate.text)
        print('-'*60)

while True:
    print(get_data())
    time.sleep(5)

Thanks

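The items in the XML are sorted by time. You can store the timestamp of the most recent item and process only the items that are newer than it.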

I can't test it, but it should be something like this:

from bs4 import BeautifulSoup as soup
from urllib.request import urlopen
import time
import datetime

def get_data():
    global prev_date_time
    site = 'http://www.televideo.rai.it/televideo/pub/rss102.xml'
    op = urlopen(site)
    rd = op.read()
    op.close()
    sp_page = soup(rd, 'xml')
    news_list = sp_page.find_all('item')
    for news in news_list:
        # Stop at the first item that is not newer than the last one seen
        if prev_date_time and prev_date_time >= datetime.datetime.strptime(news.pubDate.text, '%a, %d %b %Y %H:%M:%S %z'):
            break
        print(news.title.text)
        print(news.pubDate.text)
        print('-'*60)
    if news_list:
        # Remember the timestamp of the first (most recent) item
        prev_date_time = datetime.datetime.strptime(news_list[0].pubDate.text, '%a, %d %b %Y %H:%M:%S %z')

# Nothing has been seen before the first call
prev_date_time = None

while True:
    get_data()
    time.sleep(5)
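
The same idea can be tried without a network connection. Below is a minimal sketch using only the standard library. The helper `new_items` and the `SAMPLE_RSS` feed are made up for illustration and are not part of the original code. It parses `pubDate` with `email.utils.parsedate_to_datetime` instead of a hand-written `strptime` format, and it assumes every item has a `pubDate` and that all dates carry a time zone. Unlike the loop above, it does not rely on the order of the items.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def new_items(rss_text, last_seen=None):
    """Return (items, newest): the items newer than last_seen and the newest date found."""
    root = ET.fromstring(rss_text)
    items = []
    newest = last_seen
    for item in root.iter('item'):
        title = item.findtext('title', default='')
        published = parsedate_to_datetime(item.findtext('pubDate'))
        # Keep the item only if it is newer than the last one already processed
        if last_seen is None or published > last_seen:
            items.append((title, published))
        # Track the most recent timestamp for the next call
        if newest is None or published > newest:
            newest = published
    return items, newest


# Hypothetical sample feed, used here instead of downloading the real one
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Second</title><pubDate>Mon, 06 Jan 2020 10:05:00 +0100</pubDate></item>
<item><title>First</title><pubDate>Mon, 06 Jan 2020 10:00:00 +0100</pubDate></item>
</channel></rss>"""

items, last_seen = new_items(SAMPLE_RSS)
for title, published in items:
    print(title, published.isoformat())

# A second pass over the same feed finds nothing new
again, last_seen = new_items(SAMPLE_RSS, last_seen)
print(len(again))
```

In the polling loop, the text returned by `urlopen(site).read()` would be passed to `new_items` together with the `last_seen` value from the previous call.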
