I'm trying to use Beautiful Soup to extract the text value that follows each strong element. So far I've been able to pull the following bs4.element.ResultSet from this source:
<table style="width:100%;">
<tr>
<td style="width:38%; vertical-align:top;">
<strong>MLS#:</strong> 20003254<br/>
<strong>Town:</strong> West New York<br/>
<strong>Address:</strong> 6050 Boulevard East , Unit 7 J<br/>
<br/>
<strong>Current Price:</strong> $590,000<br/>
<strong>Previous Price:</strong> $599,000<br/> <strong>Original Price:</strong> $649,900
</td>
<td style="width:31%; vertical-align:top;">
<strong>Style:</strong>Condo<br/> <strong>Bedrooms:</strong> 2<br/> <strong>Full Baths:</strong> 2<br/> <strong>Half Baths:</strong> 0<br/> <strong>Basement:</strong> Bin/Storage<br/>
<strong>Garage:</strong> Pkg Space In
</td>
<td style="width:31%; vertical-align:top;">
<strong>List Date:</strong> 01/24/2020<br/>
<strong>Category:</strong> Condo/Coop/Townhouse<br/>
<br/>
<strong>Taxes:</strong> $9,408 <br/><strong>Monthly Maintenance:</strong><br/>$ 1,140.00
</td>
I'd like to pull all of the following values into a list:
[20003254,West New York, 6050 Boulevard East Unit 7 J, $590,000, $599,000] and so on for all values above.
I've been using the following to pull the labels of the data values I want into a list:
from bs4 import BeautifulSoup
from requests import get

# Send a browser-like User-Agent with the request
headers = {'User-Agent':
           'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'}
sapo = "http://www.njmls.com/listings/index.cfm?action=dsp.info&mlsnum=20003254&dayssince=15&countysearch=false"
response = get(sapo, headers=headers)
html_soup = BeautifulSoup(response.text, 'html.parser')

# The listing details live in a table with this exact inline style
containers = html_soup.find_all('table', {'style': 'width:100%;'})
for cont in containers:
    a = cont.find_all("strong")
    a = [i.text.strip() for i in a]  # label text only, e.g. 'MLS#:'
    print(a)
This returns:
['MLS#:', 'Town:', 'Address:', 'Current Price:', 'Previous Price:', 'Original Price:', 'Style:', 'Bedrooms:', 'Full Baths:', 'Half Baths:', 'Basement:', 'Garage:', 'List Date:', 'Category:', 'Taxes:', 'Monthly Maintenance:']
Changing
a = [i.text.strip() for i in a]
to
a = [i.next_sibling for i in a]
gives
>>> [' 20003254', ' West New York', ' 6050 Boulevard East , Unit 7 J', ' $590,000', ' $599,000', ' $649,900 \r\n\t\t\t\t\t\t\t\t \t', 'Condo', ' 2', ' 2', ' 0', ' Bin/Storage', ' Pkg Space In\r\n\t\t\t\t\t\t\t', ' 01/24/2020', ' Condo/Coop/Townhouse', ' $9,408 ', <br/>]
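The output above shows two problems: some values carry trailing whitespace characters, and for Monthly Maintenance the immediate next_sibling is a <br/> tag rather than the text. One possible way around both, as a rough sketch rather than a definitive fix, is to walk next_siblings until the first non-empty text node and collapse its whitespace. The helper name value_after and the shortened SAMPLE_HTML string are made up for illustration; the markup is an abbreviated copy of the table above so the snippet runs without a network request.

```python
from bs4 import BeautifulSoup, NavigableString

# Abbreviated copy of the table shown above (assumption: the live page has the same shape).
SAMPLE_HTML = """
<table style="width:100%;">
<tr>
<td style="width:38%; vertical-align:top;">
<strong>MLS#:</strong> 20003254<br/>
<strong>Town:</strong> West New York<br/>
<strong>Original Price:</strong> $649,900\r\n\t\t
</td>
<td style="width:31%; vertical-align:top;">
<strong>Taxes:</strong> $9,408 <br/><strong>Monthly Maintenance:</strong><br/>$ 1,140.00
</td>
</tr>
</table>
"""


def value_after(strong):
    """Hypothetical helper: first non-empty text node after `strong`, skipping tags such as <br/>."""
    for sibling in strong.next_siblings:
        if isinstance(sibling, NavigableString):
            # split() with no arguments drops \r, \n, \t and repeated spaces
            text = " ".join(sibling.split())
            if text:
                return text
        elif sibling.name == "strong":
            # Reached the next label without finding a value
            break
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
values = [value_after(s) for s in soup.find_all("strong")]
print(values)
```

With this approach the Monthly Maintenance entry should come back as the text after the <br/> instead of the tag itself. Stopping at the next strong is a design choice so that a label with no value yields None rather than borrowing the value of the following label.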