Ruby: how to write a "DRY" / dynamic / flexible tree-like loop structure



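I'm looking for the best way to solve the following structure/logic problem in Ruby:

A website needs to be crawled completely, collecting the titles of every page.

But:

  • The tree structure of the website is unknown (how many "levels", "branches", etc.)
  • The code should be "DRY" (= "Don't Repeat Yourself")

The following (simplified) example is, of course, very silly: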

url = some_root_url
@title_collection = Array.new
go_to_page(url)
@title_collection << find_all_titles_on_page
urls = find_all_urls_on_page
urls.each do |url|
    go_to_page(url)
    @title_collection << find_all_titles_on_page
    urls = find_all_urls_on_page
    urls.each do |url|
        go_to_page(url)
        @title_collection << find_all_titles_on_page
        urls = find_all_urls_on_page
        urls.each do |url|
            go_to_page(url)
            @title_collection << find_all_titles_on_page
            urls = find_all_urls_on_page
            urls.each do |url|
                go_to_page(url)
                @title_collection << find_all_titles_on_page
                urls = find_all_urls_on_page
                urls.each do |url|
                    go_to_page(url)
                    @title_collection << find_all_titles_on_page
                    urls = find_all_urls_on_page
                    [...]
                end
            end
        end
    end
end

So how would you accomplish this flexibly and efficiently in a "DRY" way?

Thanks a lot!

Tom

Recursion is your friend:

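def walk_tree(url)
  go_to_page(url)
  # Read titles and links before descending: the current page changes during recursion.
  @title_collection << find_all_titles_on_page
  urls = find_all_urls_on_page
  urls.each do |child_url|
    # The recursive call appends to @title_collection itself, so its
    # return value must not be appended again.
    walk_tree(child_url)
  end
  @title_collection
end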
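
One caveat worth sketching: on a real site the links usually form a graph rather than a strict tree (pages link back to the home page, for example), so a plain recursive walk can revisit pages endlessly. Below is a minimal sketch of one way to guard against that with a visited set. It assumes a hypothetical in-memory SITE hash and a fetch_page helper standing in for go_to_page / find_all_titles_on_page / find_all_urls_on_page; none of these names come from the original post. An explicit-stack variant is included in case the site is deep enough that recursion depth becomes a concern.

```ruby
require "set"

# Hypothetical in-memory "site" used only for illustration:
# each URL maps to the titles on that page and its outgoing links.
SITE = {
  "/"    => { titles: ["Home"], links: ["/a", "/b"] },
  "/a"   => { titles: ["A"],    links: ["/", "/a/1"] }, # links back to the root (cycle)
  "/b"   => { titles: ["B"],    links: ["/a/1"] },
  "/a/1" => { titles: ["A1"],   links: [] }
}.freeze

# Stand-in for go_to_page + find_all_titles_on_page + find_all_urls_on_page.
# Unknown URLs are treated as empty pages.
def fetch_page(url)
  SITE.fetch(url, { titles: [], links: [] })
end

# Recursive walk: one method handles every level of the site.
def collect_titles(url, visited = Set.new, titles = [])
  # Set#add? returns nil when the URL was already present, so each page is visited once.
  return titles unless visited.add?(url)

  page = fetch_page(url)
  titles.concat(page[:titles])
  page[:links].each { |child| collect_titles(child, visited, titles) }
  titles
end

# Same idea with an explicit stack instead of the call stack.
def collect_titles_iterative(root)
  visited = Set.new([root])
  stack = [root]
  titles = []

  until stack.empty?
    page = fetch_page(stack.pop)
    titles.concat(page[:titles])
    # reverse_each so the first link on the page is popped (visited) first.
    page[:links].reverse_each do |child|
      stack.push(child) if visited.add?(child)
    end
  end
  titles
end

p collect_titles("/")            # => ["Home", "A", "A1", "B"]
p collect_titles_iterative("/")  # => ["Home", "A", "A1", "B"]
```

Passing the visited set and the titles array as arguments keeps the method free of instance variables, so it can be called repeatedly without resetting any state. With a real crawler, fetch_page would be replaced by whatever navigation and scraping calls you already have.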
