Checking headers before downloading with Net::HTTP::Pipeline

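I'm trying to parse a list of image URLs and get some basic information about each one before I commit to actually downloading it.

Does the image exist? (Solved with response.code?)
Do I already have the image? (I'd like to look at its type and size?)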

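My script checks a large list every day (about 1,300 rows), and each row has 30-40 image URLs. My @photo_urls variable lets me keep track of what I have already downloaded. I'd really like to be able to use it as a hash (instead of an array, as in the sample code) so that I can work with it later and do the actual downloading.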

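Right now my problem (besides being new to Ruby) is that Net::HTTP::Pipeline only accepts an array of Net::HTTPRequest objects. The net-http-pipeline documentation indicates that response objects come back in the same order as the corresponding request objects that went in. The trouble is that I have no way to correlate a request with its response other than that order, and I don't know how to get the relative ordinal position inside the block. I assume I could just keep a counter variable, but how would I access a hash by ordinal position?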

          Net::HTTP.start uri.host do |http|
            # Init HTTP requests hash
            requests = {}
            photo_urls.each do |photo_url|          
              # make sure we don't process the same image again.
              hashed = Digest::SHA1.hexdigest(photo_url)         
              next if @photo_urls.include? hashed
              @photo_urls << hashed
              # change user agent and store in hash
              my_uri = URI.parse(photo_url)
              request = Net::HTTP::Head.new(my_uri.path)
              request.initialize_http_header({"User-Agent" => "My Downloader"})
              requests[hashed] = request
            end
            # process requests (send array of values - ie. requests) in a pipeline.
            http.pipeline requests.values do |response|
              if response.code=="200"
                  # Is there any way to reference the hash here so I can decide
                  # whether I want to do anything later?
              end
            end                
          end

Finally, if there's an easier way to do this, please feel free to offer any suggestions.

Thanks!

Make requests an array instead of a hash, and shift each request off the front as its response arrives:

require 'digest/sha1'
require 'net/http/pipeline' # from the net-http-pipeline gem

Net::HTTP.start uri.host do |http|
  # Init HTTP requests array
  requests = []
  photo_urls.each do |photo_url|
    # Make sure we don't process the same image again.
    hashed = Digest::SHA1.hexdigest(photo_url)
    next if @photo_urls.include? hashed
    @photo_urls << hashed
    # Set the user agent and store the request in the array.
    my_uri = URI.parse(photo_url)
    request = Net::HTTP::Head.new(my_uri.path)
    request["User-Agent"] = "My Downloader"
    requests << request
  end
  # Pipeline a copy of the array so our own copy can be shifted in step
  # with the responses, which arrive in request order.
  http.pipeline requests.dup do |response|
    request = requests.shift
    if response.code == "200"
      # Do whatever checking with request
    end
  end
end
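
If you would still rather keep a hash keyed by the SHA1 digest, as the question asks, the same ordering trick can be applied to the hash's keys. Ruby hashes preserve insertion order, so `requests.keys` and `requests.values` line up index for index. The following is only a rough sketch under that assumption: the helper names `build_head_requests`, `head_all` and `image_response?` are made up for illustration, and `http` is assumed to be a connection that responds to `pipeline` (that is, net-http-pipeline has been required).

```ruby
require 'digest/sha1'
require 'net/http'
# In real use you would also need: require 'net/http/pipeline' (net-http-pipeline gem)

# Hypothetical helper: build one HEAD request per unseen URL, keyed by the
# SHA1 digest of the URL. `seen` is any collection supporting include? and <<.
def build_head_requests(photo_urls, seen)
  photo_urls.each_with_object({}) do |photo_url, requests|
    key = Digest::SHA1.hexdigest(photo_url)
    next if seen.include?(key)
    seen << key
    # request_uri keeps the query string, unlike path
    request = Net::HTTP::Head.new(URI.parse(photo_url).request_uri)
    request["User-Agent"] = "My Downloader"
    requests[key] = request
  end
end

# Hypothetical helper: send the requests through the pipeline and return a
# hash of digest => response. It relies only on the documented guarantee
# that responses are yielded in the same order as the requests.
def head_all(http, requests_by_key)
  pending_keys = requests_by_key.keys
  responses_by_key = {}
  http.pipeline(requests_by_key.values) do |response|
    responses_by_key[pending_keys.shift] = response
  end
  responses_by_key
end

# Hypothetical check: the image exists and the server reports an image type.
def image_response?(response)
  response.code == "200" && response.content_type.to_s.start_with?("image/")
end
```

Afterwards the returned hash can be inspected by digest, for example reading `response["Content-Length"]` to compare against a file you already have before deciding whether to download. If the pipeline raises partway through, the pairing covers only the responses yielded up to that point.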
