Group by div class



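Question: Can I group the elements I find by the class of the div that contains them, and store them as a list of lists? Is that possible? *Edit: I did some further testing, as mentioned earlier. It seems that even if you store a div in a variable and then search within that stored div, the search still covers the content of the whole page.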

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
result_text = []
# Let's say this is the class of the different divs I want to group by:
# @class='a-fixed-right-grid a-spacing-top-medium'

# These are the texts from all divs around the page that I'm looking for,
# but I can't tell which one belongs to which div
elements = driver.find_elements(By.XPATH, "//a[contains(@href, '/gp/product/')]")
for element in elements:
    result_text.append(element.text)
print(result_text)

Current result: I already get all the information I'm looking for from the different divs around the page, but I want it "grouped" by the topmost div.

['Text11', 'Text12', 'Text2', 'Text31', 'Text32']

The result I want to achieve:

The texts grouped by @class='a-fixed-right-grid a-spacing-top-medium':

[['Text11', 'Text12'], ['Text2'], ['Text31', 'Text32']]

HTML: it looks like this. (class="a-text-center a-fixed-left-grid-col a-col-left" is the first div that wraps a group from there on; we could use any of these divs to group by. At least that's what I think.)

</div>
</div>
</div>
</div>
<div class="a-fixed-right-grid a-spacing-top-medium"><div class="a-fixed-right-grid-inner a-grid-vertical-align a-grid-top">
<div class="a-fixed-right-grid-col a-col-left" style="padding-right:3.2%;float:left;">
<div class="a-row">
<div class="a-fixed-left-grid a-spacing-base"><div class="a-fixed-left-grid-inner" style="padding-left:100px">
<div class="a-text-center a-fixed-left-grid-col a-col-left" style="width:100px;margin-left:-100px;float:left;">
<div class="item-view-left-col-inner">




<a class="a-link-normal" href="/gp/product/B07YCW79/ref=ppx_yo_dt_b_asin_image_o0_s00?ie=UTF8&psc=1">
<img alt="" src="https://images-eu.ssl-images-amazon.com/images/I/41rcskoL._SY90_.jpg" aria-hidden="true" onload="if (typeof uet == 'function') { uet('cf'); uet('af'); }" class="yo-critical-feature" height="90" width="90" title="Same as the text I'm looking for" data-a-hires="https://images-eu.ssl-images-amazon.com/images/I/41rsxooL._SY180_.jpg">
</a>

</div>
</div>
<div class="a-fixed-left-grid-col a-col-right" style="padding-left:1.5%;float:left;">

<div class="a-row">


<a class="a-link-normal" href="/gp/product/B07YCR79/ref=ppx_yo_dt_b_asin_title_o00_s0?ie=UTF8&psc=1">
Text I'm looking for
</a>

</div>
<div class="a-row">

I haven't tested this against your link, but this may work for you:

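from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
# The leading "." makes the XPath relative to each div; a bare "//" would
# search the whole document again, even when called on an element.
result_text = [
    [a.text for a in div.find_elements(By.XPATH, ".//a[contains(@href, '/gp/product/')]")]
    for div in driver.find_elements(By.CLASS_NAME, 'a-fixed-right-grid')
]
print(result_text)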

Edit: added an alternative function:

# if that doesn't work try:
from selenium.webdriver.common.by import By

def get_results(selenium_driver, div_class, a_xpath):
    div_list = []
    for div in selenium_driver.find_elements(By.CLASS_NAME, div_class):
        a_list = []
        # a_xpath must start with "." to stay inside this div
        for a in div.find_elements(By.XPATH, a_xpath):
            a_list.append(a.text)
        div_list.append(a_list)
    return div_list

print(get_results(driver,
                  div_class='a-fixed-right-grid',
                  a_xpath=".//a[contains(@href, '/gp/product/')]"))

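If this doesn't work, there are two likely causes. Either the XPath returns every matching element each time even though it is called on the div (an XPath that starts with "//" is evaluated against the whole document; prefixing it with "." makes it relative to the element), or another element further down in the document has the same class name.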
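The effect of a relative search can be tried without a browser. The sketch below is not Selenium: it swaps in Python's standard-library xml.etree.ElementTree and runs on a simplified, made-up fragment (the SAMPLE markup, the group_links helper and the product IDs are all assumptions for illustration, not the real page). ElementTree's limited XPath has no contains(), so the href check is done in Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the page: three group divs,
# each holding one or more product links (plus one unrelated link).
SAMPLE = """
<root>
  <div class="a-fixed-right-grid a-spacing-top-medium">
    <div><a href="/gp/product/A1">Text11</a></div>
    <div><a href="/gp/product/A2">Text12</a><a href="/help">Help</a></div>
  </div>
  <div class="a-fixed-right-grid a-spacing-top-medium">
    <a href="/gp/product/B1">Text2</a>
  </div>
  <div class="a-fixed-right-grid a-spacing-top-medium">
    <a href="/gp/product/C1">Text31</a>
    <a href="/gp/product/C2">Text32</a>
  </div>
</root>
"""

def group_links(markup, div_class, href_part):
    """Return one list of link texts per div whose class equals div_class."""
    root = ET.fromstring(markup)
    groups = []
    for div in root.findall(f".//div[@class='{div_class}']"):
        # ".//a" is relative: it only looks at descendants of this div.
        texts = [
            (a.text or "").strip()
            for a in div.findall(".//a")
            if href_part in a.get("href", "")
        ]
        groups.append(texts)
    return groups

grouped = group_links(SAMPLE, "a-fixed-right-grid a-spacing-top-medium", "/gp/product/")
print(grouped)
# [['Text11', 'Text12'], ['Text2'], ['Text31', 'Text32']]
```

The same structure should carry over to Selenium: an outer loop over the group divs, and an inner search whose path begins with "." so it stays within the current div.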
