I am using the Mechanize gem. How can I get the content between two div tags?
"<div class='a'></div>content<div class='a'></div>"
The problem is that the content sits inside <p> tags between the divs:
<div>
<div class='a'>Content1</div>
<p></p>
<p></p>
<p></p>
<p></p>
<div class='a'>Content2</div>
<p></p>
<p></p>
<p></p>
<p></p>
</div>
After retrieving the page, you can parse it with Nokogiri:
require 'mechanize' # also loads Nokogiri

m = Mechanize.new
result = m.get("http://google.com")
html = Nokogiri::HTML(result.body)
# Map the text content of every div into an array; do whatever else you need with the divs here
divs = html.xpath('//div').map { |div| div.content }