Using Nokogiri in a Rails Application



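I have all the code needed to do the scraping and print the results to the console, but I am confused about how to use it inside the application.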

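The way it should work is that, through the list#new action, I take user input for a single parameter, :url. That URL is then passed to the scraping code, which fetches all the additional parameters and adds everything to a Postgres table. The new list is then rendered with all of this newly fetched data.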

My questions:

  1. The lists controller:

    class UsersController < ApplicationController
      .
      .
      .
      def create
        @list = List.new ( #what goes in here? 
                           #only one param comes from the user
        if @list.save
          #how to set it up so that the save is successful 
          #only if the extra params have been scraped?
      .
      .
      .
    
  2. I assume this goes into models/list.rb:

    class List < ActiveRecord::Base
    require 'open-uri'
    url = #assuming that the url is proper and for something this code is supposed to scrape
          #is it better to add the url to db first or send it straight from the input 
          #and how is that defined here
    doc = Nokogiri::HTML(open(url))
    .
    .
    .
    

Could you give me some guidance here?


The service file:

class ScrapingService
  require 'open-uri'
  require 'nokogiri'
  def initialize(list)
    @list = list
  end
  url = :url
  doc = Nokogiri::HTML(open(url))
  name = doc.at_css(".currentverylong").text
  author = doc.at_css(".user").text
  def scraped_successfully?
    if name != nil && author != nil
      true
    else 
      false
    end 
  end
  private
    attr_reader :list
end

Some questions I have:

  1. How do I correctly get :url into HTML(open...? My current approach throws a "no implicit conversion of Symbol into String" error.

  2. The part where :url, along with :name and :author, should be saved into a single database record is really unclear to me.

Any suggestions for articles on this are welcome.
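On question 1: in the service file shown earlier, `:url` is a bare Ruby Symbol, not the value the user submitted, which is why `open` raises the error. The following is a minimal sketch of the difference. `FakeList` is a hypothetical stand-in for the real `List` record, and `File.open` is used only because it raises the same kind of TypeError without touching the network.

```ruby
# FakeList is a hypothetical stand-in for the ActiveRecord List model.
FakeList = Struct.new(:url)

list = FakeList.new("http://example.com/page")

# A Symbol is only a name. Passing it where a path or URL String is
# expected raises TypeError.
symbol_error = nil
begin
  File.open(:url)
rescue TypeError => e
  symbol_error = e.message
end

# The submitted URL lives on the record, so the service should read it
# from the object it was initialized with.
url = list.url
```

Inside the service, this corresponds to calling `list.url` from within an instance method rather than writing `url = :url` in the class body.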

app/controllers/lists_controller.rb

class ListsController < ApplicationController
  def create
    @list = List.new(list_params)
    if @list.save
      redirect_to @list
    else
      render :new
    end
  end

  private

  # Assuming that you are using Rails 4 or the strong_parameters gem
  def list_params
    params.require(:list).permit(:url)
  end
end

app/models/list.rb

class List < ActiveRecord::Base
  # This runs only when you try to create a list. If you also want to run this
  # check when the user updates it, then remove the on: :create
  before_validation :ensure_website_is_scrapable, on: :create

  private

  def ensure_website_is_scrapable
    if ScrapingService.new(self).scraped_successfully?
      true
    else
      errors.add(:url, 'The website is not scrapable')
    end
  end
end

app/services/scraping_service.rb

class ScrapingService
  def initialize(list)
    @list = list
  end
  def scraped_successfully?
    # Do the scraping logic here and return true if it was successful or false otherwise
    # Of course split the implementation to smaller methods
  end
  private
  attr_reader :list
end
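
The scraping logic left as a comment above could be filled in roughly as follows. This is only a sketch under some assumptions: the `lists` table is assumed to have `name` and `author` columns, the CSS selectors are copied from the question, any fetch error is treated as "not scrapable", and the libraries are required lazily inside `fetch_document` so the class can be loaded and stubbed without them.

```ruby
# Sketch of app/services/scraping_service.rb. The name/author attributes
# and the CSS selectors are assumptions taken from the question.
class ScrapingService
  def initialize(list)
    @list = list
  end

  # Scrapes the page at list.url and copies the results onto the unsaved
  # record. Returns true only when both values were found.
  def scraped_successfully?
    doc = fetch_document
    name = text_at(doc, ".currentverylong")
    author = text_at(doc, ".user")
    return false if name.nil? || author.nil?

    list.name = name
    list.author = author
    true
  rescue StandardError
    # Network failures, bad URLs and similar errors count as "not scrapable".
    false
  end

  private

  attr_reader :list

  # Required lazily so the class can be loaded without the gems being
  # present. On Ruby older than 2.5, use open(list.url) instead of URI.open.
  def fetch_document
    require 'open-uri'
    require 'nokogiri'
    Nokogiri::HTML(URI.open(list.url))
  end

  # at_css returns nil when nothing matches, so guard before calling text.
  def text_at(doc, selector)
    node = doc.at_css(selector)
    node && node.text.strip
  end
end
```

Because the model calls `ScrapingService.new(self)` before validation, the scraped `name` and `author` are already set on the record when `@list.save` runs, so `url`, `name` and `author` end up in the same database row.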
