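I'm building a bookmarking site and want to extract all the URIs/links from incoming emails. The site is built with Ruby on Rails.
How can I extract every URL from the body of a received email?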
Ruby's built-in URI module already does this.
From the documentation for URI.extract:
require "uri"
URI.extract("text here http://foo.example.org/bla and here mailto:test@example.com and here also.")
# => ["http://foo.example.org/bla", "mailto:test@example.com"]
require 'uri'
text = %{"test
<a href="http://www.a.com/">http://www.a.com/</a>, and be sure
to check http://www.a.com/blog/. Email me at <a href="mailto:b@a.com">b@a.com</a>.}
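# Punctuation that usually ends the sentence rather than the URL.
END_CHARS = %{.,'?!:;}
# Keep only http URLs (add 'https' to the list to match HTTPS links too),
# then drop one trailing punctuation character from each match.
p URI.extract(text, ['http']).collect { |u| END_CHARS.index(u[-1]) ? u.chop : u }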
Source: http://www.java2s.com/Code/Ruby/Network/ExtractURL.htm
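For a bookmarking site you would probably want both http and https links, with duplicates removed, since an HTML email often repeats the same URL in the href and in the link text. Here is a minimal sketch that combines the two answers above. The helper name extract_links and the sample body are made up for illustration and are not part of Rails or the URI module. Also note that newer Ruby versions mark URI.extract as obsolete and print a warning in verbose mode, so treat this as a starting point rather than a definitive implementation.

```ruby
require "uri"

# Trailing punctuation that usually belongs to the sentence, not the link.
TRAILING_PUNCTUATION = /[.,'?!:;]+\z/

# Hypothetical helper: returns the unique http/https URLs found in a
# plain-text or HTML email body. A nil body yields an empty array.
def extract_links(body)
  URI.extract(body.to_s, %w[http https])
     .map { |url| url.sub(TRAILING_PUNCTUATION, "") } # strip trailing punctuation
     .uniq                                            # drop repeated links
end

# Sample input (assumed); in a Rails app this would be the decoded email body.
body = "See http://example.com/a, then https://example.org/b. Again: http://example.com/a"
p extract_links(body)
```

Unlike the snippet above, which chops a single trailing character, this version strips a whole run of trailing punctuation, so a link followed by "..." is also cleaned up.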