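I have all the code needed to do the scraping and print the results to the console, but I'm confused about how to use it inside the application.
The way it should work is that, through the list#new action, I take user input for a single parameter: url. That URL is then passed to the scraping code, which fetches all the additional parameters and adds everything to a Postgres table. With all of that newly obtained data, the new list is rendered.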
My questions:
- The lists controller:
class UsersController < ApplicationController
  . . .
  def create
    @list = List.new(
      # what goes in here?
      # only one param comes from the user
    )
    if @list.save
      # how to set it up so that the save is successful
      # only if the extra params have been scraped?
  . . .
- I suppose this would go into models/list.rb:
class List < ActiveRecord::Base
  require 'open-uri'
  url = # assuming that the url is proper and for something this code is supposed to scrape
        # is it better to add the url to db first or send it straight from the input
        # and how is that defined here
  doc = Nokogiri::HTML(open(url))
  . . .
Can you give me some guidance here?
The service file:
class ScrapingService
  require 'open-uri'
  require 'nokogiri'

  def initialize(list)
    @list = list
  end

  url = :url
  doc = Nokogiri::HTML(open(url))
  name = doc.at_css(".currentverylong").text
  author = doc.at_css(".user").text

  def scraped_successfully?
    if name != nil && author != nil
      true
    else
      false
    end
  end

  private

  attr_reader :list
end
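Some questions I have:
- How do I correctly get :url into HTML(open... ? The way I'm doing it now throws a no implicit conversion of Symbol into String error.
- The part where :url, along with :name and :author, should be saved into a single database entry is really fuzzy to me.
Any suggestions for articles on this are welcome.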
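On the first question: a bare :url is only a Symbol literal, so nothing from the submitted form ever reaches open. The sketch below reproduces the same kind of TypeError with File.open and shows that reading the attribute from the record gives a String instead. ListStub and symbol_open_error are hypothetical names used only for illustration; in the real app the record would be the List model.

```ruby
# ListStub is a hypothetical stand-in for the ActiveRecord List model,
# so this runs outside Rails.
ListStub = Struct.new(:url)

# Passing the Symbol :url where a path or URL String is expected raises
# the TypeError quoted in the question.
def symbol_open_error
  File.open(:url)
rescue TypeError => e
  e.message
end

puts symbol_open_error

# Reading the attribute from the record yields the String the user typed.
list = ListStub.new('http://example.com/page')
puts list.url.class
```

Inside the service, that means calling list.url (via the private reader) from within a method, rather than assigning url = :url in the class body.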
app/controllers/lists_controller.rb
class ListsController < ApplicationController
  def create
    @list = List.new(list_params)
    if @list.save
      redirect_to @list
    else
      render :new
    end
  end

  private

  # Assuming that you are using Rails 4 or the strong_parameters gem
  def list_params
    params.require(:list).permit(:url)
  end
end
app/models/list.rb
class List < ActiveRecord::Base
  # This runs only when you try to create a list. If you want to run this
  # validation when the user updates it as well, then remove the on: :create
  before_validation :ensure_website_is_scrapable, on: :create

  private

  def ensure_website_is_scrapable
    if ScrapingService.new(self).scraped_successfully?
      true
    else
      errors.add(:url, 'The website is not scrapable')
    end
  end
end
app/services/scraping_service.rb
class ScrapingService
  def initialize(list)
    @list = list
  end

  def scraped_successfully?
    # Do the scraping logic here and return true if it was successful or false otherwise
    # Of course split the implementation to smaller methods
  end

  private

  attr_reader :list
end
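The scraped_successfully? placeholder could be filled in roughly as follows. This is a minimal sketch under several assumptions, not the answer's actual implementation: it assumes the List model has name and author columns, it reuses the .currentverylong and .user selectors from the question, and it introduces a hypothetical fetcher: argument (plus the ListStub stand-in and text_at helper) so the class can be exercised without Rails or network access. The default fetcher uses URI.open from open-uri together with Nokogiri::HTML.

```ruby
# ListStub is a hypothetical stand-in for the ActiveRecord List model.
ListStub = Struct.new(:url, :name, :author)

class ScrapingService
  # Default fetcher: download the page and parse it with Nokogiri.
  # The requires live inside the lambda so the class can be loaded and
  # tested with a fake fetcher when the gem or the network is unavailable.
  DEFAULT_FETCHER = lambda do |url|
    require 'open-uri'
    require 'nokogiri'
    Nokogiri::HTML(URI.open(url))
  end

  def initialize(list, fetcher: DEFAULT_FETCHER)
    @list = list
    @fetcher = fetcher
  end

  # Copies the scraped values onto the list and returns true only if
  # both a name and an author were found.
  def scraped_successfully?
    doc = fetcher.call(list.url.to_s) # list.url is a String, not the Symbol :url
    list.name = text_at(doc, '.currentverylong')
    list.author = text_at(doc, '.user')
    !list.name.nil? && !list.author.nil?
  rescue StandardError
    # Assumption: any fetch or parse failure simply means "not scrapable".
    false
  end

  private

  attr_reader :list, :fetcher

  # Returns the stripped text of the first matching node, or nil when the
  # node is missing or blank.
  def text_at(doc, selector)
    text = doc.at_css(selector)&.text&.strip
    text unless text.nil? || text.empty?
  end
end
```

Because the service assigns name and author to the same record that is being validated, a successful save writes url, name and author into one row, which is the part the question found unclear. Rescuing StandardError is deliberately broad for the sketch; a real app would probably rescue specific errors and log them.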