
Filtering out one string from a print statement in python/BeautifulSoup

Updated: 2023-09-08

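I am using BeautifulSoup to scrape the comments from many pages of a website. Every page of this site contains the comment "[[commentMessage]]". I want to filter this string out so that it is not printed every time the code runs. I am new to both Python and BeautifulSoup, and after some searching I could not find how to do this, though I may have been searching for the wrong thing. Any suggestions? My code is below: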

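# Python 2 code: uses the print statement and urllib.urlopen
from bs4 import BeautifulSoup
import urllib
r = urllib.urlopen('website url').read()
soup = BeautifulSoup(r, "html.parser")
# Each comment sits in a <div class="commentMessage">
comments = soup.find_all("div", class_="commentMessage")
for element in comments:
    # Print the text of the first <span> inside the div
    print element.find("span").get_text()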

All of the comments are inside divs with the class commentMessage, including the unwanted comment "[[commentMessage]]".

A simple if should do it:

for element in comments:
    text = element.find("span").get_text()
    # Skip the placeholder comment; print everything else
    if "[[commentMessage]]" not in text:
        print text
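
For anyone on Python 3, where print is a function rather than a statement, the same check might look roughly like the sketch below. It is only an illustration under a few assumptions: SAMPLE_HTML is made-up markup standing in for the fetched page, and extract_comments and PLACEHOLDER are hypothetical names that do not appear in the question. It also skips any div that has no span, since find() returns None in that case and calling get_text() on None would raise an error.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a fetched page (assumption).
SAMPLE_HTML = """
<div class="commentMessage"><span>[[commentMessage]]</span></div>
<div class="commentMessage"><span>Great article!</span></div>
<div class="commentMessage"></div>
<div class="commentMessage"><span>Thanks for sharing.</span></div>
"""

# The placeholder string the question wants to suppress.
PLACEHOLDER = "[[commentMessage]]"


def extract_comments(html, placeholder=PLACEHOLDER):
    """Return comment texts, skipping the placeholder and span-less divs."""
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    for element in soup.find_all("div", class_="commentMessage"):
        span = element.find("span")
        if span is None:  # find() returns None when the div has no <span>
            continue
        text = span.get_text()
        if placeholder not in text:  # same test as the if in the answer
            texts.append(text)
    return texts


for comment in extract_comments(SAMPLE_HTML):
    print(comment)
```

Collecting the texts in a list instead of printing inside the loop keeps the filtering separate from the output, so the same function could feed a file or a database later. Note that "not in" drops any comment that merely contains the placeholder; comparing with text.strip() != placeholder would drop only exact matches.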
