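I have a problem with my Rails 5 application. I created a class, scrape.rb, that scrapes HTML with the Nokogiri gem so that the data can be saved in another model. But when I create a new object in the Rails console and call the scrape method, it returns nil and does not scrape any values: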
2.3.0 :018 > s = Scrape.new
=> #<Scrape:0x007fba68b79e98>
2.3.0 :019 > s.scrape_new_movie
=> nil
2.3.0 :020 >
Here is the scrape.rb model:
class Scrape
  attr_accessor :title, :vote, :image_url, :description,
  def scrape_new_movie
    begin
      doc = Nokogiri::HTML(open("https://zalukaj.com/zalukaj-film/26280/barbie_w_wiecie_gier_barbie_video_game_hero_2017_.html").read, nil, 'utf-8')
      doc.css('script').remove
      self.title = doc.css('#pw_title.about_movie_title').text
      v = doc.css('#success_vote').text
      self.vote = v.slice(2...5)
      self.image_url = doc.css('.about_movie img').attr('src').text
      self.description = doc.css('#pw_description.e_s3k').text
      return true
    rescue Exception => e
      self.failure = "Something went wrong with the scrape"
    end
  end
  def save_movie
    movie = Movie.new(
      title: self.title,
      vote: self.vote,
      image_url: self.image_url,
      description: self.description
    )
    movie.save
  end
end
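It returns nil because you ended the attr_accessor line with a comma. Since def evaluates to the method name as a symbol (Ruby 2.1 and later), the trailing comma passes :scrape_new_movie to attr_accessor as one more argument. That generates a reader called scrape_new_movie which replaces your method and simply returns the unset @scrape_new_movie, i.e. nil. The variable failure is also undefined, so I assume you need an attr_accessor for it as well.
You should change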
attr_accessor :title, :vote, :image_url, :description,
to
attr_accessor :title, :vote, :image_url, :description, :failure
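To see the effect of the trailing comma in isolation, here is a minimal sketch. The class names Broken and Fixed and the method scrape are made up for illustration; they are not part of the original model.

```ruby
# Hypothetical classes that isolate the trailing-comma problem.
class Broken
  # The trailing comma makes the following `def` one more argument.
  # `def` evaluates to :scrape, so attr_accessor also generates
  # `scrape` and `scrape=`, replacing the method defined just before.
  attr_accessor :title,
  def scrape
    true
  end
end

class Fixed
  attr_accessor :title

  def scrape
    true
  end
end

p Broken.new.scrape  # => nil (generated reader for the unset @scrape)
p Fixed.new.scrape   # => true
```

Running Ruby with -w should also print a "method redefined" warning for Broken, which is a quick way to spot this kind of mistake.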
Also replace the method:
def scrape_new_movie
  begin
    doc = Nokogiri::HTML(open("https://zalukaj.com/zalukaj-film/26280/barbie_w_wiecie_gier_barbie_video_game_hero_2017_.html").read, nil, 'utf-8')
    doc.css('script').remove
    self.title = doc.css('#pw_title.about_movie_title').text
    v = doc.css('#success_vote').text
    self.vote = v.slice(2...5)
    self.image_url = doc.css('.about_movie img').attr('src').text
    self.description = doc.css('#pw_description.e_s3k').text
    return true
  rescue Exception => e
    self.failure = "Something went wrong with the scrape"
  end
end
with
def scrape_new_movie
  doc = Nokogiri::HTML(open("https://zalukaj.com/zalukaj-film/26280/barbie_w_wiecie_gier_barbie_video_game_hero_2017_.html").read, nil, 'utf-8')
  doc.css('script').remove
  self.title = doc.css('#pw_title.about_movie_title').text
  v = doc.css('#success_vote').text
  self.vote = v.slice(2...5)
  self.image_url = doc.css('.about_movie img').attr('src').text
  self.description = doc.css('#pw_description.e_s3k').text
  return true
end
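Then any error that occurs will bubble up and be displayed in a way that lets you debug where the problem is.
This is a good example of why you should never rescue Exception: it swallows every error, including ones such as NoMethodError that point to bugs in your own code, and that always makes debugging harder. See: Why is it bad style to `rescue Exception => e` in Ruby?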
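If you do want to record a failure instead of letting everything bubble up, one option is to rescue only the errors you expect from the fetch. The following is a sketch under assumptions, not the original model: SafeScrape, the fetcher argument (a stand-in for the open(...).read call) and the regex-based title extraction are all hypothetical and only keep the example self-contained.

```ruby
# Hypothetical sketch: rescue only expected I/O failures, let bugs propagate.
class SafeScrape
  attr_accessor :title, :failure

  # `fetcher` is any object responding to `call` that returns an HTML string
  # (an assumption standing in for the real network request).
  def scrape(fetcher)
    html = fetcher.call
    self.title = html[/<title>(.*?)<\/title>/m, 1]
    true
  rescue IOError, SystemCallError => e
    # Only I/O style failures end up here; a NoMethodError caused by a
    # programming mistake is not rescued and still surfaces with a backtrace.
    self.failure = "#{e.class}: #{e.message}"
    false
  end
end

s = SafeScrape.new
s.scrape(-> { "<html><title>Example Movie</title></html>" })
puts s.title
```

Which error classes to list depends on the libraries involved, so treat IOError and SystemCallError as placeholders for whatever your HTTP client actually raises.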
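With the way this is set up, there is no need to instantiate the class at all. Just add self. to the methods and don't call new. There are also quite a few errors in this script; if you want to debug it, I would re-raise the exception as well so that you can see its message. You should also change self.title = to @title =, or, if you want to keep self.title, open the class's singleton class with class << self and put the attr_accessor inside it.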
require 'open-uri' # lets open(...) fetch HTTP(S) URLs on the Ruby version used here

class Scrape
  # Accessors declared inside class << self belong to the class itself,
  # so self.title = ... works inside the class methods below.
  class << self
    attr_accessor :title, :vote, :image_url, :description, :failure
  end
  def self.scrape_new_movie
    begin
      doc = Nokogiri::HTML(open("https://zalukaj.com/zalukaj-film/26280/barbie_w_wiecie_gier_barbie_video_game_hero_2017_.html").read, nil, 'utf-8')
      doc.css('script').remove
      self.title = doc.css('#pw_title.about_movie_title').text
      v = doc.css('#success_vote').text
      self.vote = v.slice(2...5)
      self.image_url = doc.css('.about_movie img').attr('src').text
      self.description = doc.css('#pw_description.e_s3k').text
      return true
    rescue Exception => e
      # Re-raise so the real error and its message stay visible.
      raise e
    end
  end
  def self.save_movie
    movie = Movie.new(
      title: self.title,
      vote: self.vote,
      image_url: self.image_url,
      description: self.description
    )
    movie.save
  end
end
Scrape.scrape_new_movie
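The class << self part is easy to get wrong, so here is a small standalone sketch of what it changes. Catalog, NoAccessor and load_title are hypothetical names used only for this illustration.

```ruby
# Hypothetical classes showing class-level accessors.
class Catalog
  class << self
    # Defines Catalog.title and Catalog.title= on the class itself.
    attr_accessor :title
  end

  def self.load_title
    self.title = "Example Movie" # calls Catalog.title=
    true
  end
end

class NoAccessor
  # No class-level accessor here, so self.title= does not exist.
  def self.load_title
    self.title = "Example Movie"
  end
end

Catalog.load_title
puts Catalog.title

begin
  NoAccessor.load_title
rescue NoMethodError => e
  puts "NoAccessor failed: #{e.class}"
end
```

Keep in mind that state stored on the class is shared by every caller in the process, which is worth considering before using this pattern inside a Rails app.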