How can I automatically refresh a web page, expand a + menu, and scroll down to that menu?



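I am tracking the daily COVID cases reported on this page: https://covid.cdc.gov/covid-data-tracker/#trends_dailytrendscases. The site updates the daily case count once a day by adding a new row to the "Data Table for Daily Case Trends - The United States". The table entries are only visible after clicking the + symbol.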

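To check for updates, I manually refresh the page, scroll down to the menu, and click the + symbol. I would like to automate this so that I don't have to keep repeating it by hand.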

I am not a web developer, so I would prefer a browser extension or something similarly simple if possible. I can write a little code if necessary.

You can automate this with Python, for example with BeautifulSoup. Alternatively, you can use this browser add-on: https://addons.mozilla.org/de/firefox/addon/rpa/

Recording the steps in the add-on produces the following macro:

{
  "Name": "Data",
  "CreationDate": "2021-8-5",
  "Commands": [
    {
      "Command": "open",
      "Target": "https://covid.cdc.gov/covid-data-tracker/#trends_dailytrendscases",
      "Value": "",
      "Description": ""
    },
    {
      "Command": "click",
      "Target": "id=us-trends-table-header-icon",
      "Value": "",
      "Targets": [
        "id=us-trends-table-header-icon",
        "xpath=//*[@id=\"us-trends-table-header-icon\"]",
        "xpath=//i[@id='us-trends-table-header-icon']",
        "xpath=//div[5]/div/div/i",
        "css=#us-trends-table-header-icon"
      ],
      "Description": ""
    },
    {
      "Command": "click",
      "Target": "id=btnUSTrendsTableExport",
      "Value": "",
      "Targets": [
        "id=btnUSTrendsTableExport",
        "xpath=//*[@id=\"btnUSTrendsTableExport\"]",
        "xpath=//button[@id='btnUSTrendsTableExport']",
        "xpath=//div[5]/div[2]/div/button",
        "css=#btnUSTrendsTableExport"
      ],
      "Description": ""
    },
    {
      "Command": "click",
      "Target": "xpath=/html/body/a",
      "Value": "",
      "Targets": [
        "xpath=/html/body/a",
        "xpath=//body/a",
        "css=body > a"
      ],
      "Description": ""
    }
  ]
}

This macro opens the page, expands the table, and saves the complete data as a CSV file.
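Once the data is exported, checking for an update can be done by comparing the new export with the previous one instead of reading the table by eye. The following is a minimal sketch, assuming both exports are available as plain CSV text. The function name `new_rows` and the sample column names and values are made up for illustration and are not taken from the CDC export.

```python
import csv
import io


def new_rows(previous_csv, current_csv):
    """Return the rows of current_csv that do not appear in previous_csv.

    Both arguments are the full text of a CSV file. Rows are compared
    as whole rows, so no particular column layout is assumed.
    """
    seen = {tuple(row) for row in csv.reader(io.StringIO(previous_csv))}
    return [
        row
        for row in csv.reader(io.StringIO(current_csv))
        if tuple(row) not in seen
    ]


if __name__ == "__main__":
    # Hypothetical sample data, only to show the call.
    old = "Date,New Cases\nday-1,10\n"
    new = "Date,New Cases\nday-2,12\nday-1,10\n"
    for row in new_rows(old, new):
        print("New row:", row)
```

In practice the two strings would come from reading yesterday's and today's downloaded files. An empty result means the table has not been updated since the last export.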

Does this help? Here is a small script written in Python with Selenium. You will need to set up a small development environment to run it, though.

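from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import re
import time
import datetime

url = "https://covid.cdc.gov/covid-data-tracker/#trends_dailytrendscases"
# Path to the local chromedriver; adjust it to your machine.
# Selenium 4 takes the path through a Service object.
driver = webdriver.Chrome(service=Service(r'd:\Python-Buch\RealPython\chromedriver.exe'))
driver.get(url)
try:
    driver.maximize_window()
    # Wait up to 10 seconds for the table toggle to appear in the DOM.
    element = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.ID, 'us-trends-table-toggle')))
    time.sleep(2)
    # Read the "Data as of: ..." note; this relies on jQuery being available on the page.
    update = driver.execute_script("return $('.general_note').text()")
    # Capture the day of the month from "Data as of: <weekday>, <month> <day>".
    day = re.findall(r'Data as of:\s+\w+,\s+\w+\s+(\d+)', update)
    today = datetime.datetime.today().day
    print('---')
    print('Upload string:', update)
    print('Day of data:', day)
    print('Today:', today)
    # Only the day of the month is compared; no match counts as old data.
    if day and int(day[0]) == today:
        print('NEW DATA')
    else:
        print('OLD DATA')
    print('---')

finally:
    driver.quit()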
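The script above only checks the date of the data. The three manual steps from the question (refresh, scroll to the table, click the + symbol) could be sketched as a small helper like the one below. It is a sketch under assumptions, not a tested solution for this site: the name `refresh_and_expand` is hypothetical, the element id `us-trends-table-header-icon` is taken from the recorded macro in the other answer, and the simple polling loop stands in for Selenium's `WebDriverWait` so that the helper works with any driver object offering `refresh`, `find_element` and `execute_script`.

```python
import time


def refresh_and_expand(driver, icon_id="us-trends-table-header-icon",
                       timeout=10.0, poll=0.5):
    """Reload the page, scroll the + icon into view and click it.

    `driver` is expected to behave like a Selenium WebDriver.
    Returns the icon element that was clicked.
    """
    driver.refresh()

    # Poll until the icon exists again after the reload, or give up
    # once `timeout` seconds have passed and re-raise the last error.
    deadline = time.monotonic() + timeout
    while True:
        try:
            # "id" is the string value of Selenium's By.ID locator.
            icon = driver.find_element("id", icon_id)
            break
        except Exception:
            if time.monotonic() >= deadline:
                raise
            time.sleep(poll)

    # Scroll the icon to the middle of the viewport, then expand the table.
    driver.execute_script(
        "arguments[0].scrollIntoView({block: 'center'});", icon
    )
    icon.click()
    return icon
```

With a driver created as in the script above, calling `refresh_and_expand(driver)` inside a loop with a long `time.sleep` between iterations would repeat the check without manual clicks.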
