Regex: replace words except those in comments



In a language that doesn't support lookbehind, how can I modify a regex so that it ignores comments in the text it is matching?

My regex pattern is:

\b{Word}\b(?=([^"\\]*(\\.|"([^"\\]*\\.)*[^"\\]*"))*[^"]*$)
  • \b{Word}\b: the whole word; {Word} is substituted in turn with each entry of a vocab list
  • (?=([^"\\]*(\\.|"([^"\\]*\\.)*[^"\\]*"))*[^"]*$): don't replace anything inside quotes
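
To see what the quote lookahead does on its own, here is a minimal sketch. The procedure name `DemoNotInQuotes` and the sample input are made up for illustration, and the pattern assumes the backslashes shown above. The lookahead only succeeds when the text after the match contains balanced double-quoted strings up to the end of the input, so an occurrence inside a string literal is skipped.

```vba
' Hypothetical demo, not part of the original post.
Sub DemoNotInQuotes()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True

    ' \bvalue\b followed by the "not inside double quotes" lookahead.
    ' Quotes are doubled because this is a VBA string literal.
    re.Pattern = "\bvalue\b(?=([^""\\]*(\\.|""([^""\\]*\\.)*[^""\\]*""))*[^""]*$)"

    ' Input:  value = "value" & value
    ' Prints: Value = "value" & Value
    Debug.Print re.Replace("value = ""value"" & value", "Value")
End Sub
```

Note that the lookahead knows nothing about apostrophe comments, which is why words in comments are still rewritten.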

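My goal is to tidy up variables and words so that they always have the same casing. However, I don't want to touch anything in comments. (The IDE is terrible and there is no alternative.)

Comments in this language are prefixed with an apostrophe. Sample code follows: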

' This is a comment
This = "Is not" ' but this is 
' This is a comment, what is it's value?
Object.value = 1234 ' Set value
value = 123

Basically, I want the linter to take the code above and, for the word "value", update it to:

' This is a comment
This = "Is not" ' but this is 
' This is a comment, what is it's value?
Object.Value = 1234 ' Set value
Value = 123

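so that every occurrence of "value" in code is updated, but anything inside double quotes, inside a comment, or forming part of another word (such as valueadded) is left untouched.

I've tried a few solutions, but none of them worked.

  • ['.*]: not preceded by an apostrophe
  • (?<!\s*'): lookbehind that excludes an apostrophe preceded by any whitespace
  • (?<!\s*'): the second example seems incorrect, and in any case it doesn't work because the language doesn't support lookbehinds

Does anyone know how I can change my pattern so that I don't edit words inside comments?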

VBA-


Sub TestSO()
    Dim Code As String
    Dim Expected As String
    Dim Actual As String
    Dim Words As Variant

    Code = "item = object.value ' Put item in value" & vbNewLine & _
           "some.item <> some.otheritem" & vbNewLine & _
           "' This is a comment, what is it's value?" & vbNewLine & _
           "Object.value = 1234 ' Set value" & vbNewLine & _
           "value = 123" & vbNewLine
    Expected = "Item = object.Value ' Put item in value" & vbNewLine & _
               "some.Item <> some.otheritem" & vbNewLine & _
               "' This is a comment, what is it's value?" & vbNewLine & _
               "Object.Value = 1234 ' Set value" & vbNewLine & _
               "Value = 123" & vbNewLine

    Words = Array("Item", "Value")
    Actual = SOLint(Words, Code)
    Debug.Print Actual = Expected
    Debug.Print "CODE: " & vbNewLine & Code
    Debug.Print "Actual: " & vbNewLine & Actual
    Debug.Print "Expected: " & vbNewLine & Expected

End Sub
Public Function SOLint(ByVal Words As Variant, ByVal FileContents As String) As String
    ' Lookahead that only succeeds when the match is outside double quotes
    Const NotInQuotes As String = "(?=([^""\\]*(\\.|""([^""\\]*\\.)*[^""\\]*""))*[^""]*$)"
    Dim RegExp As Object
    Dim Regex As String
    Dim Index As Variant

    Set RegExp = CreateObject("VBScript.RegExp")
    With RegExp
        .Global = True
        .IgnoreCase = True
    End With

    For Each Index In Words
        ' Attempt at excluding comments; this prefix is the part that does not work
        Regex = "[('*)]\b" & Index & "\b" & NotInQuotes
        RegExp.Pattern = Regex

        FileContents = RegExp.Replace(FileContents, Index)
    Next Index

    SOLint = FileContents
End Function

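As mentioned in the comments above:

((?:".*")|(?:'.*))|\b(v)(alue)\b

This regex has 3 parts joined by alternation.

  1. A non-capturing group for text inside double quotes, since we don't need it.

  2. A non-capturing group for text that starts with a single quote.

  3. Finally, the string "value" is split into two parts, (v) and (alue), because in the replacement we can use \U$2 to convert v to V and keep the rest as-is with \E$3, where \U converts to uppercase and \E turns case conversion off.

  4. \b \b: word boundaries, used so that only the standalone word is matched and not text that merely contains it.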

https://regex101.com/r/mD9JeR/8
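
The same alternation idea can be sketched in VBA itself. As far as I know, VBScript.RegExp's Replace does not support the \U and \E case operators, so this sketch walks the matches and overwrites only those where the word group actually captured. The function name `LintOutsideComments` is hypothetical, and the string and comment alternatives use `[^"\r\n]*` and `[^\r\n]*` rather than `.*`. That is my own variation, chosen so that a match cannot run across several string literals or past the end of a line. It also assumes the vocabulary entries are plain identifiers with no regex metacharacters.

```vba
' Hypothetical helper, not from the original post: fixes the casing of each
' word in Words while skipping string literals and apostrophe comments.
Public Function LintOutsideComments(ByVal Words As Variant, ByVal FileContents As String) As String
    Dim RegExp As Object
    Dim Word As Variant
    Dim Match As Object

    Set RegExp = CreateObject("VBScript.RegExp")
    RegExp.Global = True
    RegExp.IgnoreCase = True

    For Each Word In Words
        ' Alternative 1 consumes a double-quoted string, alternative 2 consumes
        ' a comment up to the end of the line, alternative 3 captures the word.
        RegExp.Pattern = """[^""\r\n]*""|'[^\r\n]*|\b(" & Word & ")\b"

        For Each Match In RegExp.Execute(FileContents)
            ' The group is empty when a string or comment matched instead.
            If Len(Match.SubMatches(0)) > 0 Then
                ' The replacement has the same length as the match, so it can be
                ' overwritten in place (FirstIndex is 0-based, Mid is 1-based).
                Mid(FileContents, Match.FirstIndex + 1, Match.Length) = CStr(Word)
            End If
        Next Match
    Next Word

    LintOutsideComments = FileContents
End Function
```

Swapping `SOLint` for `LintOutsideComments` in the question's `TestSO` procedure should give a quick check against its `Expected` string. Because strings and comments are consumed by the earlier alternatives, no lookbehind is needed and the quote lookahead can be dropped.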
