Selecting the previous td and clicking its link with Mechanize and Nokogiri

Hello, I am using Mechanize and Nokogiri to scrape a web page. I select a series of links (<a></a>):

    html_body = Nokogiri::HTML(body)
    links = html_body.css('.L1').xpath("//table/tbody/tr/td[2]/a[1]")

Then I need to check whether the content of each link (the text in <a>content</a>, not the href) matches something in my database. This is what I am doing:

    links.each do |link|
      # Compare the link text with ==; a single = would assign instead of compare
      if link.text == @tournament.homologation_number

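If the condition is met, I need to select the <td></td> immediately before the <td> that contains the link I checked, and then click the link inside it.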

<td><a href="link I want to click if condition is true"></a></td>
<td><a href="">content I check with my condition</a></td>

How can I do this with Mechanize and Nokogiri?

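I would iterate over the first td cells instead, because getting the following element is easier than getting the preceding one (with CSS, at any rate).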

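page.search('td[1]').each do |td|
  checked = td.at('+ td a')              # link in the cell right after this one
  if checked && checked.text == 'foo'    # skip rows that have no second link
    page2 = agent.get td.at('a')[:href]  # follow the link in the first cell
  end
end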

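First, you have to select all the <td></td> elements. The XPath //table/tbody/tr/td[2]/a[1] only selects the first <a></a> element (inside the second cell of each row), so you could try something like //table/tbody/tr/td instead, but it depends on your situation.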

Once you have the array of <td></td> elements, you can access their links like this:

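tds.each do |td|
  link = td.at_css('a')                          # first <a> inside this <td>, if any
  # Only the link text is compared; condition_is_matched is a placeholder for your own check
  next unless link && condition_is_matched(link.text)
  previous_td   = td.previous_element            # previous element sibling; #previous may return a text node
  previous_link = previous_td && previous_td.at_css('a')
  # goto_url is a placeholder for however you follow the link (e.g. agent.get)
  goto_url previous_link['href'] if previous_link
end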
