Extracting elements with Nokogiri

I wonder if anyone can help with the following. I am using Nokogiri to scrape some data from http://www.bbc.co.uk/sport/football/tables

I would like to get the league table information, and so far I have this:

def get_league_table # Get me Premier League Table
  doc = Nokogiri::HTML(open(FIXTURE_URL))
  table = doc.css('.table-stats')
  teams = table.xpath('following-sibling::*[1]').css('tr.team')
  teams.each do |team|
    position = team.css('.position-number').text.strip
    League.create!(position: position)
  end
end

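So I thought I would grab .table-stats and then get every row in the table that has the class team. Those rows contain all the information I need, such as position number, games played, team name, etc.

Once I am at tr.team, I thought I could loop over the rows and pull out the relevant information.

It is the XPath part I am stuck on (unless I am approaching the whole thing wrong?): how do I get from .table-stats to the tr.team rows?

Can anyone offer some advice?

Thanks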

Here is a script I wrote for parsing tables dynamically; I have adapted it to your case:

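require 'open-uri'
require 'nokogiri'

url = 'http://www.bbc.co.uk/sport/football/tables'
# URI.open is needed on Ruby 3.0+, where Kernel#open no longer opens URLs
doc = Nokogiri::HTML.parse(URI.open(url))
# every table row that carries the "team" class
teams = doc.search('tbody tr.team')
# derive the hash keys from the class attribute of each cell in the first row
keys = teams.first.search('td').map do |k|
  k['class'].gsub('-', '_').to_sym
end
# build one hash per team, pairing each key with the matching cell text
hsh = teams.map do |team|
  Hash[keys.zip(team.search('td').map(&:text))]
end
puts hsh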
