Ruby and HTTPS: A socket operation was attempted to an unreachable network



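I am trying to download all of my class notes from Coursera. Since I'm learning Ruby, I figured that downloading all of their PDFs for future use would be good practice. Unfortunately, I get an exception saying that Ruby cannot connect for some reason. Here is my code: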

require 'net/http'
module Coursera 
  class Downloader
    attr_accessor :page_url
    attr_accessor :destination_directory
    attr_accessor :cookie
    def initialize(page_url,dest,cookie)
      @page_url=page_url
      @destination_directory = dest
      @cookie=cookie
    end
    def download
      puts @page_url
      request = Net::HTTP::Get.new(@page_url)
      puts @cookie.encoding
      request['Cookie']=@cookie
      # the line below is where the exception is thrown
      res = Net::HTTP.start(@page_url.hostname, use_ssl=true,@page_url.port) {|http|
        http.request(request)  
      }
      html_page = res.body
      pattern = /http[^"]+.pdf/
      i=0
      while (match = pattern.match(html_page,i)) != nil do
        # 0 is the entire string.
        url_string = match[0]
        # make sure that 'i' is updated
        i = match.begin(0)+1
        # we want just the name of the file.
        j = url_string.rindex("/")
        filename = url_string[j+1..url_string.length]
        destination = @destination_directory+"\\"+filename
        # I want to download that resource to that file.
        uri = URI(url_string)
        res = Net::HTTP.get_response(uri)
        # write that body to the file
        f=File.new(destination,mode="w")
        f.print(res.body)
      end
    end
  end
end
page_url_string = 'https://class.coursera.org/datasci-002/lecture'
puts page_url_string.encoding
dest='C:\Users\michael\training material\data_science'
page_url=URI(page_url_string)
# I copied this from my browser's developer tools. I'm omitting it since 
# it's long and has my session key in it
cookie="..."
downloader = Coursera::Downloader.new(page_url,dest,cookie)
downloader.download

When I run it, the following is written to the console:

Fast Debugger (ruby-debug-ide 0.4.22, debase 0.0.9) listens on 127.0.0.1:65485
UTF-8
https://class.coursera.org/datasci-002/lecture
UTF-8
Uncaught exception: A socket operation was attempted to an unreachable network. - connect(2)
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `initialize'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `open'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `block in connect'
    C:/Ruby200-x64/lib/ruby/2.0.0/timeout.rb:52:in `timeout'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:877:in `connect'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:862:in `do_start'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:851:in `start'
    C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:582:in `start'
    C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:20:in `download'
    C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:52:in `<top (required)>'
    C:/Ruby200-x64/bin/rdebug-ide:23:in `load'
    C:/Ruby200-x64/bin/rdebug-ide:23:in `<main>'
C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `initialize': A socket operation was attempted to an unreachable network. - connect(2) (Errno::ENETUNREACH)
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `open'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:878:in `block in connect'
    from C:/Ruby200-x64/lib/ruby/2.0.0/timeout.rb:52:in `timeout'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:877:in `connect'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:862:in `do_start'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:851:in `start'
    from C:/Ruby200-x64/lib/ruby/2.0.0/net/http.rb:582:in `start'
    from C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:20:in `download'
    from C:/Users/michael/Documents/Aptana Studio 3 Workspace/practice/CourseraDownloader.rb:52:in `<top (required)>'
    from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/ruby-debug-ide-0.4.22/lib/ruby-debug-ide.rb:86:in `debug_load'
    from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/ruby-debug-ide-0.4.22/lib/ruby-debug-ide.rb:86:in `debug_program'
    from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/ruby-debug-ide-0.4.22/bin/rdebug-ide:110:in `<top (required)>'
    from C:/Ruby200-x64/bin/rdebug-ide:23:in `load'
    from C:/Ruby200-x64/bin/rdebug-ide:23:in `<main>'

I wrote all of the HTTP code by following the instructions here. As far as I can tell, I have followed them correctly.

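I am using Windows 7, ruby 2.0.0p481, and Aptana Studio 3. When I paste the URL into my browser, it goes straight to the page without any problem. When I look at the request headers for that URL in the browser, I don't see anything else that I think I'm missing. I also tried setting the Host and Referer request headers, and it made no difference.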

I'm out of ideas. I have searched Stack Overflow for similar questions, but that didn't help. Please tell me what I'm missing.

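I got the same error message on another project, and the problem was that my machine really could not connect to that IP/port. Have you tried connecting with curl? If it works in your browser, the browser may be using a proxy or something else to actually get there. Testing the URL with curl is what resolved it for me.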
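The same connectivity check can be done from Ruby itself. The sketch below is only an illustration: `reachable?` and `fetch_page` are hypothetical helper names, not part of the question's code. `reachable?` tries a plain TCP connection, roughly what curl does first. `fetch_page` shows one more thing that may be worth ruling out: `Net::HTTP.start` takes the port as its second positional argument and options such as `use_ssl` in a trailing hash, so writing `use_ssl=true` in second position passes `true` as the port and shifts the real port into the proxy-address slot.

```ruby
require 'net/http'
require 'socket'
require 'uri'

# Hypothetical helper: returns true if a plain TCP connection to host:port
# can be opened within `timeout` seconds, false on any socket-level failure.
def reachable?(host, port, timeout = 5)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Hypothetical helper: sends a GET for `uri` with the given Cookie header.
# The port is the second positional argument; use_ssl goes in the options hash.
def fetch_page(uri, cookie)
  request = Net::HTTP::Get.new(uri)
  request['Cookie'] = cookie
  Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
end
```

If `reachable?(page_url.hostname, page_url.port)` returns false, the problem is at the network level (firewall, proxy, routing) and no change to the request will help. If it returns true, the way the connection is opened is the next thing to look at.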
