Accessing the child elements of a Nokogiri element



After parsing an HTML table, I can get the first row of the table as a Nokogiri element.

2.2.1 :041 > pp content[1]; nil
#(Element:0x3feee917d1e0 {
  name = "tr",
  children = [
    #(Element:0x3feee917cfd8 {
      name = "td",
      attributes = [
        #(Attr:0x3feee917cf74 { name = "valign", value = "top" })],
      children = [
        #(Element:0x3feee917ca60 {
          name = "a",
          attributes = [
            #(Attr:0x3feee917c9fc {
              name = "href",
              value = "/cgi-bin/own-disp?action=getowner&CIK=0001513362"
              })],
          children = [ #(Text "Maestri Luca")]
          })]
      }),
    #(Text "\n"),
    #(Element:0x3feee917c150 {
      name = "td",
      children = [
        #(Element:0x3feee917d794 {
          name = "a",
          attributes = [
            #(Attr:0x3feee9179fb8 {
              name = "href",
              value = "/cgi-bin/browse-edgar?action=getcompany&CIK=0001513362"
              })],
          children = [ #(Text "0001513362")]
          })]
      }),
    #(Text "\n"),
    #(Element:0x3feee91796a8 {
      name = "td",
      children = [ #(Text "2016-09-04")]
      }),
    #(Text "\n"),
    #(Element:0x3feee9179194 {
      name = "td",
      children = [ #(Text "officer: Senior Vice President, CFO")]
      }),
    #(Text "\n")]
  })
 => nil 

This comes from a row with the following content:

Maestri Luca 0001513362 2016-09-04 officer: Senior Vice President, CFO

I need to access the Name, Number, Date, and Title from the Nokogiri element.

One way to do it is as follows:

2.2.1 :042 > pp content[1].text; nil
"Maestri Luca\n0001513362\n2016-09-04\nofficer: Senior Vice President, CFO\n"

However, I am looking for a way to access the elements individually, rather than as one long string with newlines in it. How can I do that?

name, number, date, title = *content[1].css('td').map(&:text)

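If content[1] is a tr, then content[1].css('td') finds all the td elements beneath it, and .map(&:text) calls td.text on each of those tds and collects the results into an array. We then splat that array with * so that we can do a multiple assignment.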

(Note: next time, please include the original HTML snippet rather than the Nokogiri node inspection output.)
