Python, Jupyter Notebook: download an Excel file from a URL



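I am currently trying to access some data from the ABS website:

https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/latest-release (Data downloads, Table 5).

The name of the Excel file changes with every release. I would like to keep my data up to date by downloading the file automatically and loading it into a DataFrame.

Progress so far:

Thanks to BeautifulSoup, I can get the list of URLs on the page with the following code: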

##### Step 1: start by importing all of the necessary packages #####
import requests  # requesting URLs
import urllib.request  # requesting URLs
import pandas as pd  # for simplifying data operations (e.g. creating DataFrame objects)
from bs4 import BeautifulSoup  # for web-scraping operations

##### Step 2: connect to the URL in question for scraping #####
url = 'https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/latest-release'
response = requests.get(url)  # connect to the URL using the "requests" package
response  # if successful, this displays <Response [200]>

##### Step 3: parse the page with the "BeautifulSoup" package #####
soup = BeautifulSoup(response.text, 'html.parser')

##### Step 4: print every link on the page #####
for link in soup('a'):
    print(link.get('href'))

## how do I get the link to Table 5? ##
# url = ?

## last step: load into a DataFrame ##
ws = pd.read_excel(url, sheet_name='Payroll jobs index-SA4', skiprows=5)

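You can locate the div class that wraps the XLSX links on the page, use the find_all method to return a list of those elements, and take index 1 to get the href for Table 5: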

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = 'https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/latest-release'
response = requests.get(url)
response
soup = BeautifulSoup(response.text, 'html.parser')
# index 1 selects the second download block on the page
url = soup.find_all("div", class_="abs-data-download-right")[1].find("a")['href']
df = pd.read_excel(url, sheet_name='Payroll jobs index-SA4', skiprows=5, engine='openpyxl')
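
If an href on the page ever turns out to be relative, or if pd.read_excel has trouble fetching the URL directly, one possible workaround is to resolve the link against the page URL, download the bytes yourself, and hand pandas a file-like object. The following is only a sketch under those assumptions: the helper names absolute_url and fetch_workbook are hypothetical, and the User-Agent header is a guess at what a server might require, not something the ABS site is known to need.

```python
from io import BytesIO
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# Page URL taken from the question.
PAGE_URL = (
    "https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/"
    "weekly-payroll-jobs-and-wages-australia/latest-release"
)


def absolute_url(href, base=PAGE_URL):
    """Resolve href against the page URL; absolute hrefs come back unchanged."""
    return urljoin(base, href)


def fetch_workbook(url, timeout=30):
    """Download url and return its bytes wrapped in a file-like object."""
    # Assumption: a browser-like User-Agent; some servers reject the default one.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return BytesIO(response.read())


# Usage (needs network access, pandas and openpyxl):
# import pandas as pd
# href = soup.find_all("div", class_="abs-data-download-right")[1].find("a")["href"]
# df = pd.read_excel(fetch_workbook(absolute_url(href)),
#                    sheet_name="Payroll jobs index-SA4", skiprows=5,
#                    engine="openpyxl")
```

pd.read_excel accepts a file-like object as well as a URL, so the rest of the call stays the same.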

To list all of the download URLs:

urls = soup.find_all("div", class_="abs-data-download-right")
for i in urls:
    print(i.find("a")['href'])

Output:

https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/week-ending-31-july-2021/6160055001_DO004.xlsx
https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/week-ending-31-july-2021/6160055001_DO005.xlsx
....
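
Picking the link by position breaks if the page ever reorders or adds download blocks. Judging only from the two file names in the output above, the table number appears to be encoded as a _DO00N.xlsx suffix, so a possibly more robust option is to select the link by that suffix. This is a sketch resting on that naming assumption; pick_table_link is a hypothetical helper, not part of any library.

```python
import re


def pick_table_link(hrefs, table_number):
    """Return the first href whose file name ends in _DO<NNN>.xlsx.

    Assumption: the table number is the zero-padded three-digit suffix,
    as the two sample file names suggest.
    """
    pattern = re.compile(rf"_DO{table_number:03d}\.xlsx$", re.IGNORECASE)
    for href in hrefs:
        if href and pattern.search(href):
            return href
    raise LookupError(f"no download link found for table {table_number}")


# The two links shown in the output above.
sample_hrefs = [
    "https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/"
    "weekly-payroll-jobs-and-wages-australia/week-ending-31-july-2021/"
    "6160055001_DO004.xlsx",
    "https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/"
    "weekly-payroll-jobs-and-wages-australia/week-ending-31-july-2021/"
    "6160055001_DO005.xlsx",
]

table5_url = pick_table_link(sample_hrefs, 5)
print(table5_url)
```

In the notebook, the list of hrefs could come from the loop above, for example [i.find("a")['href'] for i in urls].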
