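I'm slowly getting to where I want to be. I'm screen-scraping data and want to save it to my model, which has two columns, home_team and away_team. So far I have managed to scrape the data: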
FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"
def get_fixtures # Get me all Home and away Teams
  doc = Nokogiri::HTML(open(FIXTURE_URL))
  home_team = doc.css(".team-home.teams").map {|h| h.text.strip }
  away_team = doc.css(".team-away.teams").map {|a| a.text.strip }
  #team_clean = Hash[:home_team => home_team, :away_team => away_team]
  #team_clean = Hash[:team_clean => [Hash[:home_team => home_team, :away_team => away_team]]]
end
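I have tried putting the data into a hash in two ways (the two commented-out lines): one is a plain hash and the other is a hash inside a hash. I'm not sure which one I need, if either.
If I only want to save the data I get for home_team, I run a rake task that does this: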
def update_fixtures #rake task method
  Fixture.destroy_all
  get_fixtures.each {|home| Fixture.create(:home_team => home )}
end
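What I want to achieve is to save home_team and away_team at the same time. Do I need to access the data in the hash, and if so, how? I'm a little lost here, but this is my first attempt at this.
Thanks for your help.
Try this: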
require 'nokogiri'
require 'open-uri'
FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"
def get_fixtures # Get me all Home and away Teams
  # URI.open is the open-uri entry point; plain open(url) stopped accepting URLs in Ruby 3.0
  doc = Nokogiri::HTML(URI.open(FIXTURE_URL))
  matches = doc.css('tr.preview') # one table row per match
  matches.each do |match|
    home_team = match.css('.team-home').text.strip
    away_team = match.css('.team-away').text.strip
    Fixture.create!(home_team: home_team, away_team: away_team)
  end
end
This loops over the matches and creates a new Fixture for each one, containing both the home and the away team.
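If you would rather keep the two arrays that the code in the question already builds, a minimal sketch of pairing them is below. It assumes both arrays have the same length and order. FixtureStub is a hypothetical stand-in for the Fixture model, and the team names are made-up sample data, not scraped values.

```ruby
# Hypothetical stand-in for the ActiveRecord Fixture model, for illustration only.
FixtureStub = Struct.new(:home_team, :away_team, keyword_init: true)

# Made-up sample data in the shape the question's two .map calls return.
home_teams = ["Arsenal", "Liverpool", "Norwich City"]
away_teams = ["Aston Villa", "Stoke City", "Everton"]

# zip pairs the arrays element by element, giving [home, away] pairs.
fixtures = home_teams.zip(away_teams).map do |home, away|
  FixtureStub.new(home_team: home, away_team: away)
end

fixtures.each { |f| puts "#{f.home_team} v #{f.away_team}" }
```

zip only works while the two arrays stay in step: if one selector misses a row, every later pairing shifts. Reading both teams from the same table row, as in the answer's code, avoids that.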
Edit: added .text.strip
Edit 2:
This should get the date as well:
require 'nokogiri'
require 'open-uri'
FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"
def get_fixtures # Get me all Home and away Teams
  # URI.open is the open-uri entry point; plain open(url) stopped accepting URLs in Ruby 3.0
  doc = Nokogiri::HTML(URI.open(FIXTURE_URL))
  doc.css('#fixtures-data h2').each do |h2_tag|
    date = Date.parse(h2_tag.text.strip) # each h2 heading holds the match date
    # the element immediately after the h2 holds that day's matches
    matches = h2_tag.xpath('following-sibling::*[1]').css('tr.preview')
    matches.each do |match|
      home_team = match.css('.team-home').text.strip
      away_team = match.css('.team-away').text.strip
      Fixture.create!(home_team: home_team, away_team: away_team, date: date)
    end
  end
end
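This is a little more complex than the previous code, because it has to use some XPath to fetch the next HTML element after each h2 tag that contains a date.
It loops over all the h2 tags inside div#fixtures-data, then grabs the table tag directly below/after each h2.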
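To try the date handling on its own without hitting the network, here is a small sketch using only the standard library. The heading strings and team names are hypothetical sample data standing in for the h2 text and the rows under it; the real page's heading format may differ.

```ruby
require 'date'

# Hypothetical sample: heading text mapped to the [home, away] rows beneath it.
sections = {
  "Saturday 17th August 2013" => [["Arsenal", "Aston Villa"], ["Liverpool", "Stoke City"]],
  "Sunday 18th August 2013"   => [["Crystal Palace", "Tottenham"]]
}

rows = sections.flat_map do |heading, matches|
  # Date.parse copes with the weekday name and the ordinal suffix ("17th").
  date = Date.parse(heading)
  # Every match under a heading gets that heading's date.
  matches.map { |home, away| { home_team: home, away_team: away, date: date } }
end

rows.each { |r| puts "#{r[:date]}: #{r[:home_team]} v #{r[:away_team]}" }
```

Each hash has the same keys that Fixture.create! is given above, so the date column receives a Date object and not the raw heading string.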