Here is my code:
require "open-uri"
base_url = "http://en.wikipedia.org/wiki"
(1..5).each do |x|
  # sets up the url
  full_url = base_url + "/" + x.to_s
  # reads the url
  read_page = open(full_url).read
  # saves the contents to a file and closes it
  local_file = "my_copy_of-" + x.to_s + ".html"
  file = open(local_file,"w")
  file.write(read_page)
  file.close
  # open a file to store all entries in
  combined_numbers = open("numbers.html", "w")
  entrys = open(local_file, "r")
  combined_numbers.write(entrys.read)
  entrys.close
  combined_numbers.close
end
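As you can see, it basically fetches the contents of Wikipedia articles 1 through 5 and then tries to combine them into a single file called numbers.html.
It gets the first part right. But the second part goes wrong: numbers.html seems to end up with only the contents of the fifth article.
I can't see where I went wrong, though. Any help?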
You chose the wrong mode when opening the summary file: "w" overwrites an existing file, while "a" appends to it.
So use this to make your code work:
combined_numbers = open("numbers.html", "a")
Otherwise, on every pass through the loop, the contents of numbers.html are overwritten with the current article.
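To see the difference between the two modes in isolation, here is a minimal sketch that needs no network access. The helper name write_in_loop and the file demo.txt are made up for this illustration; it just reopens the same file three times in the given mode, the way the loop in the question does.

```ruby
require "tmpdir"

# Hypothetical helper for illustration: reopen `path` once per word in the
# given mode, write the word, then return the final file contents.
def write_in_loop(path, mode)
  File.delete(path) if File.exist?(path) # start from a clean slate
  %w[one two three].each do |word|
    File.open(path, mode) { |f| f.write(word + "\n") }
  end
  File.read(path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "demo.txt")
  p write_in_loop(path, "w") # => "three\n" (each open truncates the file)
  p write_in_loop(path, "a") # => "one\ntwo\nthree\n" (each open appends)
end
```

With "w", only the last write survives, which matches what happens to numbers.html in the question.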
Also, I think you should write the contents of read_page straight to numbers.html, rather than reading them back from the file you just wrote:
require "open-uri"
(1..5).each do |x|
  # set up and read url
  url = "http://en.wikipedia.org/wiki/#{x}"
  article = open(url).read
  # save the current article to its own file
  # (IO.write is only available on 1.9.x; use open here too if on 1.8.x)
  IO.write("my_copy_of-#{x}.html", article)
  # append the current article to the summary file
  open("numbers.html", "a") do |f|
    f.write(article)
  end
end
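
One thing to watch with "a": if you run the script a second time, the articles get appended to whatever numbers.html already contains. A possible alternative, sketched below, is to open the summary file once before the loop, in which case "w" is fine. The names combine_articles and fetch are my own for this sketch, not part of the code above; fetch is any callable that returns the page body for a given number.

```ruby
# Hypothetical helper: write every article into one output file that is
# opened a single time. `fetch` is an assumed callable (lambda or proc)
# that takes an article number and returns its contents as a string.
def combine_articles(numbers, out_path, fetch)
  File.open(out_path, "w") do |combined|
    numbers.each do |n|
      combined.write(fetch.call(n))
    end
  end
end

# Possible usage with open-uri. This assumes Ruby 2.5 or newer, where
# URI.open is available; it is commented out because it hits the network.
#
#   require "open-uri"
#   fetch = ->(n) { URI.open("http://en.wikipedia.org/wiki/#{n}").read }
#   combine_articles(1..5, "numbers.html", fetch)
```

Because the file is truncated once at the start and then written sequentially, rerunning the script replaces the old summary instead of growing it.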