I can't scrape table content served with MIME type application/octet-stream using Python



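I'm trying to scrape some data from a website, but the data is embedded in an iframe. I first scraped the iframe's source link, but I couldn't scrape the data from that source either. I need help extracting the data from this source link. Here is the source link: https://chartviewer-europublic.bigapis.net/nzgaV/index.html

I've also shared a screenshot here showing the "a" tag, but I couldn't extract that link either.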


Here is the code I used. I used BeautifulSoup for the scraping.

# Libraries
from bs4 import BeautifulSoup
import requests
# Original website link
url_spain_total="https://anfac.com/cifras-clave/matriculaciones-turismos-y-todoterreno/"
page_total=requests.get(url_spain_total).text
soup_spain_total=BeautifulSoup(page_total,"lxml")
print(soup_spain_total.prettify())
# Getting the list of links in the iframe
result_spain=soup_spain_total.find_all("iframe")
result_spain
# Getting the required source link
total_main_link=result_spain[1]["src"]
total_main_link

After getting the source link, I'm unable to extract the table content.

Any help is appreciated. Thanks in advance!

Here is an example of how to get the data using Selenium:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
from io import StringIO
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
# chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-notifications")
chrome_options.add_argument("--window-size=1920,1080")
webdriver_service = Service("chromedriver/chromedriver")  # path to where you saved the chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)
wait = WebDriverWait(browser, 20)
url = "https://chartviewer-europublic.bigapis.net/nzgaV/index.html"
browser.get(url)
# Wait (up to 20 seconds) until the table has loaded and is clickable
table = wait.until(EC.element_to_be_clickable((By.ID, "datatable")))
# Wrap the HTML in StringIO: recent pandas versions deprecate passing a literal HTML string
df = pd.read_html(StringIO(table.get_attribute("outerHTML")))[0]
print(df)
browser.quit()

This retrieves the information as a DataFrame and prints it in the terminal:

[DataFrame output: columns include Categoría, Acumulado 2021 and %Variacion Acumulado; rows include Gasolina, Diesel, Resto, Total combustibles and Empresa]
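If you want to see what the last step does with the string returned by table.get_attribute("outerHTML"), or you'd rather not depend on pandas, here is a minimal sketch using only the standard library. The TableRows class and the sample_html string are hypothetical names made up for illustration, and the sample values are placeholders, not the real page's data.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Made-up sample standing in for table.get_attribute("outerHTML")
sample_html = """
<table id="datatable">
  <tr><th>Category</th><th>Value</th></tr>
  <tr><td style="text-align:left;">A</td><td style="text-align:right;">1</td></tr>
  <tr><td style="text-align:left;">B</td><td style="text-align:right;">2</td></tr>
</table>
"""

parser = TableRows()
parser.feed(sample_html)
header, *body = parser.rows  # first row is treated as the header
print(header)
print(body)
```

In the Selenium script you would feed the real outerHTML string to the parser instead of sample_html. Cell values come back as plain strings, so numeric columns still need converting, which pd.read_html otherwise does for you.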
