How to search an HTML page's source for a specific keyword with BeautifulSoup



I want to search the HTML source of a page for a specific keyword and get back True or False, depending on whether the keyword was found.

The specific keyword I am looking for is "cdn.secomapp.com".

Right now my code looks like this:

from urllib import request
from bs4 import BeautifulSoup

url_1 = "https://cheapchicsdesigns.com"
keyword ='cdn.secomapp.com'
page = request.urlopen(url_1)
soup = BeautifulSoup(page)
soup.find_all("head", string=keyword)

But when I run it, it returns an empty list:

[]

Can anyone help? Thanks in advance.

If your only goal is to check whether the keyword is present, you don't need to construct a BeautifulSoup object at all:

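from urllib import request

url_1 = "https://cheapchicsdesigns.com"
keyword = 'cdn.secomapp.com'
page = request.urlopen(url_1)
# read() returns bytes in Python 3, so decode before testing for a str keyword
html = page.read().decode('utf-8', errors='replace')
print(keyword in html)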

But I'd suggest using requests, since it is easier to work with:

import requests
url_1 = "https://cheapchicsdesigns.com"
keyword ='cdn.secomapp.com'
res = requests.get(url_1)
print(keyword in res.text)
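
One detail worth spelling out for the urllib variant: in Python 3, `urlopen(...).read()` returns bytes, while the keyword is a str, so the source has to be decoded before the `in` test. Below is a minimal sketch of that check wrapped in a helper. The name `contains_keyword` and the sample page source are made up for illustration, and UTF-8 is assumed as the page encoding.

```python
def contains_keyword(raw_html: bytes, keyword: str, encoding: str = "utf-8") -> bool:
    """Return True if keyword occurs anywhere in the decoded page source."""
    # errors="replace" keeps decoding from raising on stray or mis-encoded bytes
    text = raw_html.decode(encoding, errors="replace")
    return keyword in text


# Made-up page source standing in for the bytes returned by urlopen(...).read()
sample = b'<html><head><script src="https://cdn.secomapp.com/app.js"></script></head></html>'

print(contains_keyword(sample, "cdn.secomapp.com"))
print(contains_keyword(sample, "example.org"))
```

With requests this step is not needed, because `res.text` is already a decoded str.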

Try:

from urllib import request
from bs4 import BeautifulSoup

url_1 = "https://cheapchicsdesigns.com"
keyword ='cdn.secomapp.com'
page = request.urlopen(url_1)
soup = BeautifulSoup(page, 'html.parser')
print(keyword in soup.text)

Prints:

True

Or:

import requests
from bs4 import BeautifulSoup

url_1 = "https://cheapchicsdesigns.com"
keyword ='cdn.secomapp.com'
page = requests.get(url_1)
soup = BeautifulSoup(page.content, 'html.parser')
print(keyword in soup.text)

Prints:

True
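
A note on why the original `find_all("head", string=keyword)` comes back empty, and a caveat about `soup.text`. With `string=`, BeautifulSoup compares the keyword against a tag's `.string`, which is None for a tag with several children (as `<head>` usually has), so nothing matches. Also, `soup.text` only contains text nodes, so a keyword that appears solely inside an attribute such as a script's `src` would not be found that way. The sketch below searches `src` and `href` attribute values instead. The helper name `keyword_in_attributes` and the sample HTML are made up for illustration, and I have not checked where the keyword actually sits on the real site.

```python
import re

from bs4 import BeautifulSoup


def keyword_in_attributes(html: str, keyword: str) -> bool:
    """Return True if keyword occurs in any src or href attribute value."""
    soup = BeautifulSoup(html, "html.parser")
    # re.escape so the dots in the keyword are matched literally
    pattern = re.compile(re.escape(keyword))
    return bool(soup.find_all(src=pattern) or soup.find_all(href=pattern))


# Made-up page: the keyword appears only inside a script's src attribute
sample_html = """<html><head><title>Shop</title>
<script src="https://cdn.secomapp.com/app.js"></script>
</head><body><p>Hello</p></body></html>"""

keyword = "cdn.secomapp.com"
soup = BeautifulSoup(sample_html, "html.parser")

print(soup.find_all("head", string=keyword))        # the approach from the question
print(keyword in soup.text)                         # text nodes only
print(keyword_in_attributes(sample_html, keyword))  # attribute values
```

If it does not matter where in the page the keyword occurs, the plain substring test on the raw source stays the simplest option.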
