Capture and replace every word in a bracketed list

  • Keywords: word, replace, list, ruby, regex


Suppose I have some text such as:

"tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn
tnhetnh [ me, figure, this, out ] ihnteahntanitnh
nhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen
"

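How can I capture each word inside the brackets so that I can wrap it in single quotes?

I tried \[\s?(?:(\w*),?\s?)+\]. It matches the bracketed parts, but it does not seem to capture anything.

The words inside the brackets can be anything.

I would like to use gsub on each line.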

r = /
    (?<=[ ])  # match a space in a positive lookbehind
    \p{L}+    # match one or more letters
    (?=       # begin a positive lookahead
      [^\[]+? # match one or more characters other than a left bracket, lazily
      \]      # match a right bracket
    )         # end the positive lookahead
    /x        # free-spacing regex definition mode

Letting str be the string defined in the question, we can wrap the words between brackets in single quotes as follows.

str.gsub(r) { |s| "'#{s}'" }
  #=> "tnheitanhiaiin [ 'hello', 'there', 'will', 'you', 'help' ]
  #    thitnahioetnaeitn\ntnhetnh [ 'me', 'figure', 'this', 'out' ]
  #    ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ 'please', 'because',
  #    'i', 'dont', 'know' ] thnthen\n"

If we instead wish to extract those words, we would use String#scan.

str.scan(r)
  #=> ["hello", "there", "will", "you", "help", "me", "figure", "this",
  #    "out", "please", "because", "i", "dont", "know"]

The question mark at the end of [^\[]+? (making the match lazy rather than greedy) is there for efficiency; it is not required.
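A quick way to check that claim is to run both variants over the same input. This is only a sketch: the constant names LAZY and GREEDY and the helper same_matches? are introduced here for illustration and do not appear in the answer.

```ruby
# Single-line forms of the regex, with and without the lazy quantifier.
# LAZY, GREEDY and same_matches? are illustrative names, not part of the answer.
LAZY   = /(?<= )\p{L}+(?=[^\[]+?\])/
GREEDY = /(?<= )\p{L}+(?=[^\[]+\])/

# True when both variants extract exactly the same words from text.
def same_matches?(text)
  text.scan(LAZY) == text.scan(GREEDY)
end

sample = "ab [ me, figure, this, out ] cd\nef [ please, help ] gh\n"
p sample.scan(LAZY)     #=> ["me", "figure", "this", "out", "please", "help"]
p same_matches?(sample) #=> true
```

Both quantifiers end up at the same right bracket, so the choice should affect only how much backtracking the engine does, not which words are matched.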

I used free-spacing definition mode to make the regex self-documenting. Written conventionally, it would be as follows.

     /(?<= )\p{L}+(?=[^\[]+?\])/

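This assumes (as in the example) that brackets are matched and not nested, and that each word within brackets is preceded by a space and followed by a comma or a space. If these assumptions about the characters surrounding the bracketed words do not hold, the regex can be adjusted.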
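As one possible adjustment, the sketch below drops the requirement for a leading space, so it also handles lists written like [hello,there]. The constant RELAXED and the method quote_bracketed_words are hypothetical names introduced here, and the variant still assumes brackets are matched and not nested.

```ruby
# A letter run counts as "inside brackets" when only non-bracket characters
# separate it from the next right bracket. RELAXED is an illustrative name.
RELAXED = /\p{L}+(?=[^\[\]]*\])/

# Wrap every bracketed word of text in single quotes (hypothetical helper).
def quote_bracketed_words(text)
  text.gsub(RELAXED) { |word| "'#{word}'" }
end

puts quote_bracketed_words("ab [hello,there] cd [ x, y ]")
# prints: ab ['hello','there'] cd [ 'x', 'y' ]
```

Because \p{L} matches letters only, a word such as don't or a token containing digits would be quoted piecewise or skipped; the character class would need widening for such input.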

You could try this:

original = "tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn\ntnhetnh [ me, figure, this, out ] ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen\n"
clone = original.dup # a real copy, so gsub! below leaves original untouched
original.scan(/\[(.*)\]/).flatten.map { |elem| [elem, elem.gsub(/\w+/) { |match| %Q('#{match}') }] }.each { |(pattern, replacement)| clone.gsub!(pattern, replacement) }
puts clone # =>
# tnheitanhiaiin [ 'hello', 'there', 'will', 'you', 'help' ] thitnahioetnaeitn
# tnhetnh [ 'me', 'figure', 'this', 'out' ] ihnteahntanitnh
# nhoietnaiotniaehntehtnea [ 'please', 'because', 'i', 'dont', 'know' ] thnthen

Perhaps a double gsub along the following lines:

s = "tnheitanhiaiin [ hello, there, will, you, help ] thitnahioetnaeitn\ntnhetnh [ me, figure, this, out ] ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ please, because, i, dont, know ] thnthen\n"
s.gsub(/\[.*?\]/) { |m| m.gsub(/\w+/, "'\\0'") } # \0 in the replacement stands for the whole match
 #=> "tnheitanhiaiin [ 'hello', 'there', 'will', 'you', 'help' ] thitnahioetnaeitn\ntnhetnh [ 'me', 'figure', 'this', 'out' ] ihnteahntanitnh\nnhoietnaiotniaehntehtnea [ 'please', 'because', 'i', 'dont', 'know' ] thnthen\n"
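Since the question asks for gsub to be applied to each line, the double-gsub idea could be wrapped up roughly as follows. The method names quote_line and quote_lines are hypothetical, and the sketch assumes each bracketed list opens and closes on the same line.

```ruby
# Quote every word inside each [...] group of a single line (illustrative helper).
def quote_line(line)
  line.gsub(/\[.*?\]/) do |list|
    list.gsub(/\w+/) { |word| "'#{word}'" }
  end
end

# Apply quote_line to each line; each_line keeps the trailing newlines.
def quote_lines(text)
  text.each_line.map { |line| quote_line(line) }.join
end

puts quote_lines("ab [ hello, there ] cd\nef [ me, too ] gh\n")
# prints:
# ab [ 'hello', 'there' ] cd
# ef [ 'me', 'too' ] gh
```

The outer gsub isolates each bracketed list and the inner one only ever sees that list, so text outside the brackets is never touched.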
