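How can I modify a regex so that it ignores comments, in a language that does not support lookbehind?
My regex pattern is:
\b{Word}\b(?=([^"\\]*(\\.|"([^"\\]*\\.)*[^"\\]*"))*[^"]*$)
- \b{Word}\b: whole word; {Word} is replaced in turn by each entry in the vocab list
- (?=([^"\\]*(\\.|"([^"\\]*\\.)*[^"\\]*"))*[^"]*$): do not replace anything inside quotes
My goal is to lint variables and words so that they always have the same case. However, I do not want to lint anything inside comments. (The IDE is poor and there is no alternative.)
Comments in this language are prefixed by an apostrophe. Sample code follows: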
' This is a comment
This = "Is not" ' but this is
' This is a comment, what is it's value?
Object.value = 1234 ' Set value
value = 123
Basically, I want the linter to take the code above and, for the word "Value", update it to:
' This is a comment
This = "Is not" ' but this is
' This is a comment, what is it's value?
Object.Value = 1234 ' Set value
Value = 123
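So every "Value" that is actual code gets updated, but nothing inside double quotes, inside a comment, or forming part of another word (such as valueadded) is touched.
I have tried several solutions, but none of them worked.
- ['.*]: not preceded by an apostrophe
- (?<!\s*'): lookbehind so that the match is not preceded by an apostrophe and spaces
- (?<!\s*'): the second example looked incorrect anyway, and it cannot work because the language does not support lookbehind
Does anyone know how I can alter my pattern so that I do not edit variables inside comments?
VBA: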
Sub TestSO()
    Dim Code As String
    Dim Expected As String
    Dim Actual As String
    Dim Words As Variant
    Code = "item = object.value ' Put item in value" & vbNewLine & _
           "some.item <> some.otheritem" & vbNewLine & _
           "' This is a comment, what is it's value?" & vbNewLine & _
           "Object.value = 1234 ' Set value" & vbNewLine & _
           "value = 123" & vbNewLine
    Expected = "Item = object.Value ' Put item in value" & vbNewLine & _
               "some.Item <> some.otheritem" & vbNewLine & _
               "' This is a comment, what is it's value?" & vbNewLine & _
               "Object.Value = 1234 ' Set value" & vbNewLine & _
               "Value = 123" & vbNewLine
    Words = Array("Item", "Value")
    Actual = SOLint(Words, Code)
    Debug.Print Actual = Expected
    Debug.Print "CODE: " & vbNewLine & Code
    Debug.Print "Actual: " & vbNewLine & Actual
    Debug.Print "Expected: " & vbNewLine & Expected
End Sub
Public Function SOLint(ByVal Words As Variant, ByVal FileContents As String) As String
    Const NotInQuotes As String = "(?=([^""\\]*(\\.|""([^""\\]*\\.)*[^""\\]*""))*[^""]*$)"
    Dim RegExp As Object
    Dim Regex As String
    Dim Index As Variant
    Set RegExp = CreateObject("VBScript.RegExp")
    With RegExp
        .Global = True
        .IgnoreCase = True
    End With
    For Each Index In Words
        Regex = "[('*)]\b" & Index & "\b" & NotInQuotes
        RegExp.Pattern = Regex
        FileContents = RegExp.Replace(FileContents, Index)
    Next Index
    SOLint = FileContents
End Function
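As mentioned in the comments above:
((?:".*")|(?:'.*))|\b(v)(alue)\b
This regex has three parts, combined with alternation:
- A non-capturing group for text inside double quotes, since we do not need it.
- A non-capturing group for text that starts with a single quote.
- Finally, the string "value" is split into two parts, (v) and (alue), so that in the replacement we can use \U$2 to convert v to V and \E$3 to keep the rest as is, where \U converts to uppercase and \E turns case conversion off.
\b ... \b: word boundaries, used to avoid matching any text that is not the standalone word "value".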
https://regex101.com/r/mD9JeR/8
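The question's code uses VBScript.RegExp, and as far as I know its Replace method does not support the \U and \E case operators. The same alternation idea can still be applied by walking the matches and rebuilding the text. The following is a minimal sketch under stated assumptions, not a definitive implementation: LintSkippingComments is a hypothetical name, the string alternative is narrowed to "[^"\r\n]*" so that two strings on one line are not swallowed as a single match, and each word is assumed to contain no regex metacharacters.

```vba
' Sketch only: LintSkippingComments is a hypothetical helper, not part of the original post.
Public Function LintSkippingComments(ByVal Words As Variant, ByVal FileContents As String) As String
    Dim RegExp As Object
    Dim Matches As Object
    Dim Match As Object
    Dim Word As Variant
    Dim Result As String
    Dim LastPos As Long

    Set RegExp = CreateObject("VBScript.RegExp")
    RegExp.Global = True
    RegExp.IgnoreCase = True

    For Each Word In Words
        ' Group 1: a double-quoted string or an apostrophe comment (consumed, never edited).
        ' Group 2: the whole word, found outside strings and comments.
        ' Assumption: Word contains no regex metacharacters.
        RegExp.Pattern = "(""[^""\r\n]*""|'.*)|\b(" & Word & ")\b"
        Set Matches = RegExp.Execute(FileContents)

        Result = vbNullString
        LastPos = 1
        For Each Match In Matches
            ' Copy the untouched text between the previous match and this one.
            ' FirstIndex is zero-based, Mid$ is one-based.
            Result = Result & Mid$(FileContents, LastPos, Match.FirstIndex + 1 - LastPos)
            If Len(Match.SubMatches(0)) > 0 Then
                ' String or comment: keep it exactly as written.
                Result = Result & Match.Value
            Else
                ' Code word: substitute the canonical casing from the vocab list.
                Result = Result & Word
            End If
            LastPos = Match.FirstIndex + 1 + Match.Length
        Next Match

        ' Append whatever follows the last match.
        FileContents = Result & Mid$(FileContents, LastPos)
    Next Word

    LintSkippingComments = FileContents
End Function
```

Because the string and comment alternatives come first and the scan runs left to right, an apostrophe inside a string or a quote inside a comment is consumed as part of that string or comment before the word alternative can match. In the question's TestSO, calling LintSkippingComments(Words, Code) in place of SOLint(Words, Code) is one way to try it.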