TOPIC CLOSED
This part of the code
containers = page_soup.findAll('div',{'class' : 'product-count d-flex align-items-center'})
output = ''
for container in containers:
    price = container.find('span',{'class':'lang'}).text.replace(",", "") if container.find('span',{'class':'lang'}) else ""
extracts the value I need from this HTML page:
<div class="product-count d-flex align-items-center">
<span class="icon-military_tech" style="color: #FFCF57; font-size: 16px"></span>
<span class="lang">bought 24 times</span>
</div>
The result is: bought 24 times
But for another site, the HTML is:
<div data-v-fd0de2e2=""><div data-v-fd0de2e2="" class="product-features"><!----> <div data-v-fd0de2e2=""><span data-v-fd0de2e2="" class="sold_products_count">bought 53 times</span></div></div> <!----> <!----> <div data-v-fd0de2e2="" class="product-meta"><div data-v-fd0de2e2="" class="product-sku"><strong data-v-fd0de2e2="">product code: </strong> <span data-v-fd0de2e2="">1200100045</span></div> <br data-v-fd0de2e2=""> <div data-v-fd0de2e2="" class="product-sku"><strong data-v-fd0de2e2="">weight: </strong> <span data-v-fd0de2e2="" style="direction: ltr; display: inline-block;">
0 kg
</span></div> <br data-v-fd0de2e2=""> <!----></div></div>
and the modified Python code produces an empty output file:
containers = page_soup.findAll('div',{'class' : 'product-features'})
output = ''
for container in containers:
    price = container.find('span',{'class':'sold_products_count'}).text.replace(",", "") if container.find('span',{'class':'sold_products_count'}) else ""
The result I need for this last site is: bought 53 times
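The code loops over all of the containers before exiting and overwrites price on every iteration, so the final value of price comes from the last container, whether or not that container holds the data you are looking for.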
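The overwriting can be seen without any HTML at all. The sketch below is only an illustration: the list `extracted` is a hypothetical stand-in for the text pulled from each container, assuming only the middle one contains the span.

```python
# Hypothetical stand-in for the text extracted from each container;
# only the middle container holds the span being searched for.
extracted = ["", "bought 53 times", ""]

price = ""
for text in extracted:
    price = text  # overwritten on every pass, match or no match

# price now holds whatever the last container produced
print(repr(price))
```

Here the match from the middle container is lost because the last container replaces it with an empty string.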
Once you have found the value you need, you can break out of the loop, like this:
containers = page_soup.findAll('div',{'class' : 'product-features'})
output = ''
for container in containers:
    price = container.find('span',{'class':'sold_products_count'}).text.replace(",", "") if container.find('span',{'class':'sold_products_count'}) else ""
    if price:
        break
print(price)
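A self-contained variant of the same idea, as a rough sketch rather than a drop-in fix: the sample markup and the helper name `first_sold_count` are assumptions made for illustration, and the function returns as soon as a container yields a match instead of using break. It also looks up the span only once per container and returns an empty string when no containers are found at all.

```python
from bs4 import BeautifulSoup

# Sample markup assumed for illustration: only the second container has the span.
html = """
<div class="product-features"></div>
<div class="product-features">
  <span class="sold_products_count">bought 53 times</span>
</div>
<div class="product-features"></div>
"""

def first_sold_count(soup):
    """Return the text of the first sold_products_count span, or '' if none."""
    for container in soup.find_all('div', {'class': 'product-features'}):
        span = container.find('span', {'class': 'sold_products_count'})
        if span:
            # Stop at the first match so later containers cannot overwrite it.
            return span.text.replace(",", "").strip()
    return ""

page_soup = BeautifulSoup(html, "html.parser")
price = first_sold_count(page_soup)
print(price)
```

If this still prints an empty line against the real page, the containers list itself is probably empty, which is worth checking with len() before looking at the loop.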