I have this HTML:
<div id="main">
<li>
<h2>
<a href="https://www.congress.gov/bill/99th-congress/senate-joint-resolution/427">S.J.Res.427</a>
</h2>
</li>
<li>
....
</li>
</div>
I want to extract the href value of the <a> tag.
Using Mechanize and Nokogiri, I did this:
activity_list = member.search('#main li')
activity_list.each do |link|
activity_link = link.at("h2 a[href]")
end
But I get TypeError: no implicit conversion of nil into String
What is wrong?
You are looking for the #attr method:
require 'nokogiri'
html = Nokogiri::HTML('<div id="main"><li><h2>
<a href="https://www.congress.gov/bill/99th-congress/senate-joint-resolution/427">S.J.Res.427</a>
</h2></li></div>')
html.search('#main li').each do |link|
# ⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓
puts link.at("h2 a[href]").attr('href')
end
#⇒ https://www.congress.gov/bill/99th-congress/senate-joint-resolution/427
I would write it like this:
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<div id="main">
<li>
<h2>
<a href="foo">S.J.Res.427</a>
</h2>
</li>
<li>
<h2>
<a href="bar">S.J.Res.427</a>
</h2>
</li>
</div>
EOT
activity_list = doc.search('#main li')
activity_list.each do |link|
activity_link = link.at("h2 a[href]")
activity_link['href'] # => "foo", "bar"
end
When you are pointing at a node, you can use [] to access the value of one of its attributes.