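How can I use Nokogiri to scrape an image from a specific URL? If there is a better option than Nokogiri, please suggest it. The CSS selector for the image is .profilePic img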
If it is just an <img> with a URL in its src attribute:
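PAGE = "http://site.com/page.html"
require 'nokogiri'
require 'open-uri'

# Fetch and parse the page. URI.open comes from open-uri;
# a bare open(url) no longer fetches URLs on Ruby 3.0 and later.
html = Nokogiri::HTML(URI.open(PAGE))
# src attribute of the first <img> inside .profilePic
src = html.at('.profilePic img')['src']
# Write the image in binary mode so the bytes are not altered
File.open("foo.png", "wb") do |f|
  f.write(URI.open(src).read)
end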
If you need to convert a relative image path into an absolute one, see:
https://stackoverflow.com/a/4864170/405017
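As a rough sketch of that conversion (not necessarily what the linked answer does), Ruby's standard URI.join can resolve the src against the page URL. The helper name absolute_image_url and the sample paths are made up for illustration:

```ruby
require 'uri'

# Hypothetical helper: resolve an <img> src (relative or absolute)
# against the URL of the page it was found on.
def absolute_image_url(page_url, src)
  URI.join(page_url, src).to_s
end

page = "http://site.com/page.html"

# Path relative to the page's directory
puts absolute_image_url(page, "images/me.png")              # http://site.com/images/me.png
# Path relative to the site root
puts absolute_image_url(page, "/img/me.png")                # http://site.com/img/me.png
# An already absolute URL is returned unchanged
puts absolute_image_url(page, "http://cdn.site.com/me.png") # http://cdn.site.com/me.png
```

The result can then be passed to URI.open in place of the raw src value.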
The lazy approach is to use mechanize, since it works out the URL and the filename for you:
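require 'mechanize'
agent = Mechanize.new
# PAGE is the page URL defined in the first example
doc = agent.get(PAGE)
# Fetch the image (a relative src is resolved against the current page) and save it under its own filename
agent.get(doc.parser.at('.profilePic img')['src']).save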