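I'm practicing with BeautifulSoup in Python and trying to parse information from this website:
https://www.vogue.com/fashion/street-style
How would you print an image from the website? For example, I tried to parse the logo that appears in the HTML: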
from bs4 import BeautifulSoup
import requests
source = requests.get('https://www.vogue.com/fashion/street-style').text
soup = BeautifulSoup(source, 'lxml')
logo = soup.find('img', class_='logo--image')
print(logo)
This prints:
<img alt="Vogue Logo" class="logo--image" src="/img/vogue-logo-pride.svg"/>
How can I print the actual picture in the editor?
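Quick summary
Unfortunately, editors and terminals generally display text, not images, so print() cannot show the picture itself. There are a couple of alternatives: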
- Option 1: Download the image and open it as you normally would.
- Option 2: Open a new browser tab containing the image.
Solution
Option 1
from urllib.parse import urljoin
from bs4 import BeautifulSoup
import requests
WEBSITE = 'https://www.vogue.com/fashion/street-style'
SAVE_LOCATION = 'downloaded.svg'  # name of file to save image to
source = requests.get(WEBSITE).text
soup = BeautifulSoup(source, 'lxml')
# The src attribute holds a root-relative path such as /img/vogue-logo-pride.svg
relative_link_to_logo = soup.find('img', class_='logo--image')['src']
# Resolve it against the page URL instead of appending it to the page path
img = requests.get(urljoin(WEBSITE, relative_link_to_logo), stream=True)
with open(SAVE_LOCATION, 'wb') as f:
    for chunk in img:  # iterate over the response body in chunks
        f.write(chunk)
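Once the file is on disk, opening it can also be done from Python. This is only a minimal sketch, assuming the image was saved as downloaded.svg as in Option 1. The helper names local_file_uri and open_saved_image are made up for illustration. It builds a file:// URI with pathlib and hands it to webbrowser.open. How that URI is handled depends on the platform and the default browser.

```python
from pathlib import Path
import webbrowser


def local_file_uri(path):
    """Return a file:// URI for a file on disk (hypothetical helper)."""
    return Path(path).resolve().as_uri()


def open_saved_image(path='downloaded.svg'):
    """Ask the default browser to display an image that is already on disk."""
    # webbrowser.open returns True if a browser could be launched
    return webbrowser.open(local_file_uri(path))
```

Calling open_saved_image() after the download finishes should bring up the saved logo in a browser tab, since browsers can render SVG files directly.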
Option 2
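from urllib.parse import urljoin
from bs4 import BeautifulSoup
import requests
import webbrowser
WEBSITE = 'https://www.vogue.com/fashion/street-style'
source = requests.get(WEBSITE).text
soup = BeautifulSoup(source, 'lxml')
# The src attribute holds a root-relative path such as /img/vogue-logo-pride.svg
relative_link_to_logo = soup.find('img', class_='logo--image')['src']
# Resolve it against the page URL, then open the result in a new browser tab
webbrowser.open(urljoin(WEBSITE, relative_link_to_logo))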
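Both options need an absolute URL for the image, while the src attribute only holds a path relative to the site root. Here is a small sketch of how two ways of building that URL differ, using urllib.parse.urljoin from the standard library. PAGE_URL and SRC are just illustrative names, and whether the logo is still served at the resolved address depends on the site.

```python
from urllib.parse import urljoin

PAGE_URL = 'https://www.vogue.com/fashion/street-style'
SRC = '/img/vogue-logo-pride.svg'  # value of the src attribute printed earlier

# Plain concatenation keeps the page path and produces a double slash
concatenated = f'{PAGE_URL}/{SRC}'
# urljoin treats a leading "/" as relative to the site root
resolved = urljoin(PAGE_URL, SRC)

print(concatenated)  # https://www.vogue.com/fashion/street-style//img/vogue-logo-pride.svg
print(resolved)      # https://www.vogue.com/img/vogue-logo-pride.svg
```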
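Closing remarks
Some editors, such as Visual Studio Code, have plugins that let you preview image files (already downloaded to your computer) without leaving the editor. This still involves Option 1, i.e. downloading the image to disk.
Hope this helps!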