Scraping a table from a website with Python (no table tags)



I am trying to scrape the stock value of a product every day. The site is https://funds.ddns.net/f.php?isin=ES0110407097. This is the code I am trying:

import pandas as pd
from bs4 import BeautifulSoup
html_string = 'https://funds.ddns.net/f.php?isin=ES0110407097'
soup = BeautifulSoup(html_string, 'lxml')
new_table = pd.DataFrame(columns=range(0,2), index = [0])
row_marker = 0
column_marker = 0
for row in soup.find_all('tr'):
    columns = soup.find_all('td')
    for column in columns:
        new_table.iat[row_marker,column_marker] = column.get_text()
        column_marker += 1
print(new_table)

I would like to get the same format in Python that I see on the website, including both the data and the numbers. How can I get it?

For that particular page, there is a simpler way:

import requests
import pandas as pd
url = 'https://funds.ddns.net/f.php?isin=ES0110407097'    
resp = requests.get(url)
new_table = pd.read_html(resp.text)[0]
print(new_table.head(5))

Output:

            0          1
0       FECHA     VL:EUR
1  2019-12-20  120170000
2  2019-12-19  119600000
3  2019-12-18  119420000
4  2019-12-17  119390000
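
The digits in the output above suggest that the page may write its values with a decimal comma (for example 120,170000), which read_html would strip because its default thousands separator is a comma. That is a guess, not something the output confirms. A minimal sketch under that assumption, run against a made-up SAMPLE_HTML snippet rather than the live page, also passes header=0 so that the FECHA / VL:EUR row becomes the column names:

```python
import io

import pandas as pd

# Hypothetical stand-in for the page's HTML. The real markup and the
# decimal-comma number format are assumptions, not taken from the site.
SAMPLE_HTML = """
<table>
  <tr><td>FECHA</td><td>VL:EUR</td></tr>
  <tr><td>2019-12-20</td><td>120,170000</td></tr>
  <tr><td>2019-12-19</td><td>119,600000</td></tr>
</table>
"""

# header=0 uses the first row as column names.
# decimal="," and thousands="." tell pandas to read "120,170000" as 120.17
# instead of dropping the comma.
table = pd.read_html(
    io.StringIO(SAMPLE_HTML),
    header=0,
    decimal=",",
    thousands=".",
)[0]

print(table)
print(table.dtypes)
```

For the live page, the same keyword arguments could be passed along with io.StringIO(resp.text). If the page turns out to use a different number format, decimal and thousands would need to be adjusted to match.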

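As for why the loop in the question does not work: BeautifulSoup is given the URL string itself rather than the downloaded HTML, soup.find_all('td') returns every cell in the document instead of the cells of the current row, and the one-row DataFrame cannot hold more than two values. A rough sketch of a corrected loop follows. SAMPLE_HTML and rows_to_frame are hypothetical names used only for illustration, and the tr/td layout and the sample values are assumptions about the page:

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page's HTML. For the live page, the text
# would come from requests.get(url).text, as in the answer above.
SAMPLE_HTML = """
<table>
  <tr><td>FECHA</td><td>VL:EUR</td></tr>
  <tr><td>2019-12-20</td><td>120,170000</td></tr>
  <tr><td>2019-12-19</td><td>119,600000</td></tr>
</table>
"""


def rows_to_frame(html):
    """Collect the text of every <td> cell, row by row, into a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for row in soup.find_all("tr"):
        # Search inside the current row only, not the whole document.
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        if cells:
            rows.append(cells)
    if not rows:
        return pd.DataFrame()
    # Treat the first row as the header, the rest as data.
    header, *body = rows
    return pd.DataFrame(body, columns=header)


frame = rows_to_frame(SAMPLE_HTML)
print(frame)
```

Because every cell is kept as text, the values appear exactly as written in the HTML. Converting them to dates or numbers would be a separate step.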