
How to get only the text from string-type data using Python's BeautifulSoup

Updated: 2023-09-17

I have data in Python as a str, shown below.

data
'  </h3>\n</div>\n<div class="wpb_text_column wpb_content_element " data-wow-delay="0.3s">\n<div class="wpb_wrapper">\n<p>\xa0</p>\n<h4><span style="font-weight: 400;">Our Backbone\xa0</span></h4>\n<p><span style="font-weight: 400;">We use various techniques of AI like Neural \n\n'

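I want to get the text out of this data. If it were a tag (<>) rather than a string, I could use .string or get_text() on the bs4.element.ResultSet type. That doesn't work here because the data is a plain string. How can I get all of the text from it?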

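Parse the string with BeautifulSoup first; then you can call get_text() (older spelling: getText()) directly on the whole document: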

from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'html.parser')
text = soup.get_text().replace("\n", "")
#  Our Backbone We use various techniques of AI like Neural  (the \xa0 non-breaking spaces are kept)
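Removing only the newlines leaves the non-breaking spaces and the leading and trailing blanks in place. One way to also collapse those is sketched below. The `sample` string is a shortened stand-in for the question's data, and `html_to_text` is a hypothetical helper name, not part of bs4.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the question's data (an assumption for illustration).
sample = (
    '  </h3>\n</div>\n<div class="wpb_wrapper">\n<p>\xa0</p>\n'
    '<h4><span>Our Backbone\xa0</span></h4>\n'
    '<p><span>We use various techniques of AI like Neural \n\n'
)


def html_to_text(html: str) -> str:
    """Return the text of an HTML fragment with all whitespace collapsed."""
    soup = BeautifulSoup(html, "html.parser")
    # separator=" " keeps words from adjacent tags from running together.
    raw = soup.get_text(separator=" ")
    # str.split() with no arguments splits on any Unicode whitespace,
    # which includes "\n" and the non-breaking space "\xa0".
    return " ".join(raw.split())


clean_text = html_to_text(sample)
print(clean_text)
```

The split-and-join step normalizes every run of whitespace to a single space, so no separate replace() call per character is needed.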

If you want to extract text from specific tags only, you can try something like this:

from bs4 import BeautifulSoup as bs
soup = bs(data, 'html.parser')
# Collect the stripped text of every <div class="wpb_wrapper">
a = [i.text.strip() for i in soup.find_all('div', {'class': 'wpb_wrapper'})]
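To see the tag-specific approach on its own, here is a minimal self-contained sketch. The `sample` markup is made up for illustration (it is not the question's data), and `wrapper_texts` is just an illustrative variable name.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration: one matching div and one that should be skipped.
sample = (
    '<div class="wpb_wrapper"><h4>Our Backbone</h4>'
    '<p>We use various techniques of AI</p></div>'
    '<div class="other"><p>Ignored</p></div>'
)

soup = BeautifulSoup(sample, "html.parser")

# class_ filters on the CSS class; strip=True trims each text piece and
# drops empty ones, and separator=" " joins the remaining pieces.
wrapper_texts = [
    div.get_text(separator=" ", strip=True)
    for div in soup.find_all("div", class_="wpb_wrapper")
]
print(wrapper_texts)
```

Each list entry holds the text of one matching div, so text from other parts of the document does not leak into the result.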
