Counting the lines of a poem with Nokogiri/Ruby

I've been struggling to do this with a simple regex, but it has never been very accurate. It doesn't have to be perfect.

The source contains a mix of <br> and <p> tags. I don't want to count blank lines.

Old way:

  self.words = rendered.gsub(/<p>&nbsp;<\/p>/,'').gsub(/<p><br\s?\/?>|(?:<br\s?\/?>){2,}/,'<br>').scan(/<br>|<br \/>|<p/).size+1

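New way (doesn't work): try to convert every run of consecutive <br> tags into a paragraph, then feed the result to Nokogiri and count the paragraph tags that contain more than 3 characters. I don't know how to do that part. Counting 1-letter lines would be fine too. The same idea works well in JavaScript.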

  h = rendered
  h.gsub!(/<br>\s*<br>/gi,"<p>")
  h.gsub!(/<br>/gi,"<p>") if h =~ /<br>\s*<br>/
  h.prepend "<p>" if !h =~ /^\s*<p[^>]*>/i
  h.replace(/<p>\s*<p>/g,"<p>&nbsp;</p><p>")
  Nokogiri::HTML(rendered)
  # find+count p tags with at least 1-3 chars?
  # this is javascript not ruby, but you get the idea
  $('p', c).each(function(i) { // had to trim it to remove whitespaces from start/end.
    if ($(this).children('img').length) return; // skip if it's just an image.
    if ($.trim($(this).text()).length > 3)
      $(this).append("<div class='num'>"+ (n += 1) +"</div>");
  })

Other approaches are welcome!
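
One possible direction, shown only as a minimal sketch: rather than converting <br> runs into paragraphs, turn every <br> and <p> boundary into a newline, drop the remaining tags, and count the lines that still contain text. The helper name count_poem_lines and the min_chars parameter are hypothetical (they are not part of the code above), and the sketch uses plain Ruby string methods instead of Nokogiri, so it assumes reasonably simple markup like the sample poem.

```ruby
# Hypothetical helper (not from the original post): count non-blank poem lines
# by turning <br> and <p> boundaries into newlines and then dropping other tags.
def count_poem_lines(html, min_chars: 1)
  text = html.gsub(/<br\s*\/?>|<\/?p(?:\s[^>]*)?>/i, "\n") # line-breaking tags become newlines
             .gsub(/<[^>]+>/, "")                          # drop any remaining tags
             .gsub(/&nbsp;|\u00A0/, " ")                   # treat non-breaking spaces as blanks
  # A line counts only if it has at least min_chars characters after trimming.
  text.lines.count { |line| line.strip.length >= min_chars }
end

sample = <<~HTML
  <p>
      from the other side of silence<br>
      you met me with change and a pocket<br>
      of unhappy apples.</p>
  <p>
      &nbsp;</p>
  <p>
      <br>
      we bled together to black<br>
      and chose the path carefully to<br>
      france.<br><br>
      sometimes when you smile</p>
HTML

puts count_poem_lines(sample) # prints 7
```

Raising min_chars (for example to 4) would mimic the "more than 3 characters" rule from the JavaScript version, at the cost of skipping very short lines.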

Sample poem (http://allpoetry.com/poem/7429983-the_many_endings-by-Kevin):

<p>
    from the other side of silence<br>
    you met me with change and a pocket<br>
    of unhappy apples.</p>
<p>
     </p>
<p>
    <br>
    we bled together to black<br>
    and chose the path carefully to<br>
    france.<br><br>
    sometimes when you smile<br>
    your radiant footsteps fall<br>
    and all around us is silence:<br>
    each dream step is<br>
    false but full of such glory</p>
<p>
     </p>
<p>
    <br>
    unhappiness never made a student of you:<br>
    just two by two by two.  now three<br>
    this great we that overflows our<br>
    heart-cave<br><br>
    each jewel-like addition to the delicate<br>
    crown.  but flowers fall and dreams,<br>
    all dreams, come to and end with death.</p>

Thanks!

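For posterity, here is what I'm using now, and it seems to be quite accurate. Non-Latin characters sometimes cause problems in ckeditor, so for now I strip them out.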

  html = Nokogiri::HTML(rendered)
  text = html.at('body').inner_text rescue nil
  return self.words = rendered.gsub(/<p>&nbsp;<\/p>/,'').gsub(/<p><br\s?\/?>|(?:<br\s?\/?>){2,}/,'<br>').scan(/<br>|<br \/>|<p/).size+1 if !text
  #bonus points to strip lines entirely non-letter. idk
  #d "text is", text.gsub!(/([\x09|\x0D|\t])|(\xc2\xa0){1,}|[^A-z]/u,'')
  text.gsub!(/[^A-z\n]/u,'')
  #d "text is", text
  self.words = text.strip.scan(/(\s*\n\s*)+/).size+1
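
A note on the character class, with a small sketch: the range A-z covers ASCII codes 65 through 122, so it also keeps the six punctuation characters that sit between Z and a. The snippet below illustrates that and shows one way to get the "bonus points" behaviour by keeping only ASCII letters and newlines. The helper name count_text_lines is hypothetical, and it works on plain text that has already been extracted (for example, what inner_text returns), so it does not need Nokogiri to run.

```ruby
# Hypothetical helper (not from the original post): count lines in text that
# has already been extracted from the HTML.
def count_text_lines(text)
  cleaned = text.gsub(/[^A-Za-z\n]/, "") # keep only ASCII letters and newlines
  cleaned.strip.split(/\n+/).size        # a run of newlines counts as one break
end

# [^A-z] leaves [ \ ] ^ _ and the backtick in place; [^A-Za-z] removes them.
puts "[_]^".gsub(/[^A-z]/, "").inspect    # prints "[_]^"
puts "[_]^".gsub(/[^A-Za-z]/, "").inspect # prints ""

# The line "___" has no letters, so it collapses into the surrounding break.
puts count_text_lines("one\n \ntwo\n___\nthree\n") # prints 3
```

One difference from the scan(...).size+1 form: this sketch returns 0 for text with no letters at all, where the original expression would return 1.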
