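I have a verbose regular expression in Python 3 to capture Windows file paths. It captures an optional drive volume, then characters, a backslash, one or more characters, and then an optional file extension.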
(
    (
        ([A-Za-z]:)
        (\\){1,2}
    )? # group to catch optional drive volume
    (
        ([A-Za-z0-9_%~-])* # catch some letters/symbols
        (\\) # catch one backslash
        ([A-Za-z0-9_%~-])* # catch more letters/symbols
    )+ # at least one of this group
    (
        \.[a-zA-Z]{3,4}
    )? # catch optional file extension
)
As far as I can tell, all of the parentheses are terminated, but I still get an unterminated parenthesis error at line 3, column 17.
  File "C:\Users\mrea\Documents\Result Fingerprinting\lineidentifier.py", line 282, in identify_line
    for match_obj in re.finditer(reg, line, re.VERBOSE):
  File "C:\Users\mrea\AppData\Local\Programs\Python\Python37-32\lib\re.py", line 230, in finditer
    return _compile(pattern, flags).finditer(string)
  File "C:\Users\mrea\AppData\Local\Programs\Python\Python37-32\lib\re.py", line 286, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Users\mrea\AppData\Local\Programs\Python\Python37-32\lib\sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Users\mrea\AppData\Local\Programs\Python\Python37-32\lib\sre_parse.py", line 930, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "C:\Users\mrea\AppData\Local\Programs\Python\Python37-32\lib\sre_parse.py", line 426, in _parse_sub
    not nested and not items))
  File "C:\Users\mrea\AppData\Local\Programs\Python\Python37-32\lib\sre_parse.py", line 816, in _parse
    p = _parse_sub(source, state, sub_verbose, nested + 1)
  File "C:\Users\mrea\AppData\Local\Programs\Python\Python37-32\lib\sre_parse.py", line 426, in _parse_sub
    not nested and not items))
  File "C:\Users\mrea\AppData\Local\Programs\Python\Python37-32\lib\sre_parse.py", line 819, in _parse
    source.tell() - start)
re.error: missing ), unterminated subpattern at position 31 (line 3, column 17)
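I first tried all of this on one line and it threw this error, so I made it verbose to check it, and I still can't see what is wrong.
I assume this is some Python-specific syntax that I don't know about yet. Can anyone help?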
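Here are some options for writing the regex string when you use an extended (verbose) regex.
The easiest to read in source is Type 3, the triple-quoted """ string, but it is not a raw string, so it has to be escaped exactly as a single-quoted string would be. Python processes the backslashes before the regex engine sees them, so (\\) arrives as (\), an opening parenthesis followed by an escaped literal ). That would explain the unterminated subpattern error.
The reliable fix is to double every backslash in a non-raw string. Python also leaves a backslash in place when it precedes a character that is not a valid string escape, such as ) or . which
means
that even an even number of escapes can be written as an odd number (recent Python versions warn about the invalid escape sequence when you rely on this).
You can do this with the formula: num_esc_to_add = (actual_num_escapes - 1)
Example:
raw      :  \  :  \\   :  \\\    :  \\\\     :  \\\\\
quote '  :  \  :  \\\  :  \\\\\  :  \\\\\\\  :  \\\\\\\\\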
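To see the effect in isolation, here is a small sketch that compiles just the backslash group on its own. The variable names (broken, fixed, caught) are only for illustration and are not from the question.

```python
import re

# Non-raw literal: Python turns "\\" into ONE backslash before re sees it,
# so the regex engine receives (\) where \) is an escaped literal parenthesis.
broken = "(\\)"

# Raw literal: both backslashes reach the regex engine, which reads them
# as one literal backslash inside a properly closed group.
fixed = r"(\\)"

print(len(broken), len(fixed))  # 3 4

caught = None
try:
    re.compile(broken)
except re.error as exc:
    # exc.pos is the offset of the opening parenthesis that was never closed.
    caught = exc
    print(exc.msg, "at position", exc.pos)

# Doubling every backslash in a non-raw literal gives the same string as the raw one.
print("(\\\\)" == fixed)  # True

# The fixed pattern matches a single backslash character.
print(re.compile(fixed).fullmatch("\\") is not None)  # True
```

The pos attribute (and lineno / colno for multi-line patterns) on the exception points at the group that was left open, which is usually one or more levels outside the group that actually contains the stray escape.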
Type 1:
>>> import re
>>> expression1 = '\n\
... (  # (1 start)\n\
...     (  # (2 start)\n\
...         ([A-Za-z]:)  # (3)\n\
...         (\\\\){1,2}  # (4)\n\
...     )?  # (2 end), group to catch optional drive volume\n\
...     (  # (5 start)\n\
...         ([A-Za-z0-9_%~-])*  # (6), catch some letters/symbols\n\
...         (\\\\)  # (7), catch one backslash\n\
...         ([A-Za-z0-9_%~-])*  # (8), catch more letters/symbols\n\
...     )+  # (5 end), at least one of this group\n\
...     (  # (9 start)\n\
...         \\.[a-zA-Z]{3,4}\n\
...     )?  # (9 end), catch optional file extension\n\
... )  # (1 end)\n\
... '
>>> Rx = re.compile(expression1, re.X)
>>> print(expression1)

(  # (1 start)
    (  # (2 start)
        ([A-Za-z]:)  # (3)
        (\\){1,2}  # (4)
    )?  # (2 end), group to catch optional drive volume
    (  # (5 start)
        ([A-Za-z0-9_%~-])*  # (6), catch some letters/symbols
        (\\)  # (7), catch one backslash
        ([A-Za-z0-9_%~-])*  # (8), catch more letters/symbols
    )+  # (5 end), at least one of this group
    (  # (9 start)
        \.[a-zA-Z]{3,4}
    )?  # (9 end), catch optional file extension
)  # (1 end)
Type 2:
>>> import re
>>> expression2 = "\n\
... (  # (1 start)\n\
...     (  # (2 start)\n\
...         ([A-Za-z]:)  # (3)\n\
...         (\\\\){1,2}  # (4)\n\
...     )?  # (2 end), group to catch optional drive volume\n\
...     (  # (5 start)\n\
...         ([A-Za-z0-9_%~\\-])*  # (6), catch some letters/symbols\n\
...         (\\\\)  # (7), catch one backslash\n\
...         ([A-Za-z0-9_%~\\-])*  # (8), catch more letters/symbols\n\
...     )+  # (5 end), at least one of this group\n\
...     (  # (9 start)\n\
...         \\.[a-zA-Z]{3,4}\n\
...     )?  # (9 end), catch optional file extension\n\
... )  # (1 end)\n\
... "
>>> Rx = re.compile(expression2, re.X)
>>> print(expression2)

(  # (1 start)
    (  # (2 start)
        ([A-Za-z]:)  # (3)
        (\\){1,2}  # (4)
    )?  # (2 end), group to catch optional drive volume
    (  # (5 start)
        ([A-Za-z0-9_%~\-])*  # (6), catch some letters/symbols
        (\\)  # (7), catch one backslash
        ([A-Za-z0-9_%~\-])*  # (8), catch more letters/symbols
    )+  # (5 end), at least one of this group
    (  # (9 start)
        \.[a-zA-Z]{3,4}
    )?  # (9 end), catch optional file extension
)  # (1 end)
Type 3:
>>> import re
>>> expression3 = """
... (  # (1 start)
...     (  # (2 start)
...         ([A-Za-z]:)  # (3)
...         (\\\\){1,2}  # (4)
...     )?  # (2 end), group to catch optional drive volume
...     (  # (5 start)
...         ([A-Za-z0-9_%~-])*  # (6), catch some letters/symbols
...         (\\\\)  # (7), catch one backslash
...         ([A-Za-z0-9_%~-])*  # (8), catch more letters/symbols
...     )+  # (5 end), at least one of this group
...     (  # (9 start)
...         \\.[a-zA-Z]{3,4}
...     )?  # (9 end), catch optional file extension
... )  # (1 end)
... """
>>> Rx = re.compile(expression3, re.X)
>>> print(expression3)

(  # (1 start)
    (  # (2 start)
        ([A-Za-z]:)  # (3)
        (\\){1,2}  # (4)
    )?  # (2 end), group to catch optional drive volume
    (  # (5 start)
        ([A-Za-z0-9_%~-])*  # (6), catch some letters/symbols
        (\\)  # (7), catch one backslash
        ([A-Za-z0-9_%~-])*  # (8), catch more letters/symbols
    )+  # (5 end), at least one of this group
    (  # (9 start)
        \.[a-zA-Z]{3,4}
    )?  # (9 end), catch optional file extension
)  # (1 end)
Type 4:
>>> import re
>>> expression4 = (
...     r"(  # (1 start)" + "\n"
...     r"    (  # (2 start)" + "\n"
...     r"        ([A-Za-z]:)  # (3)" + "\n"
...     r"        (\\){1,2}  # (4)" + "\n"
...     r"    )?  # (2 end), group to catch optional drive volume" + "\n"
...     r"    (  # (5 start)" + "\n"
...     r"        ([A-Za-z0-9_%~-])*  # (6), catch some letters/symbols" + "\n"
...     r"        (\\)  # (7), catch one backslash" + "\n"
...     r"        ([A-Za-z0-9_%~-])*  # (8), catch more letters/symbols" + "\n"
...     r"    )+  # (5 end), at least one of this group" + "\n"
...     r"    (  # (9 start)" + "\n"
...     r"        \.[a-zA-Z]{3,4}" + "\n"
...     r"    )?  # (9 end), catch optional file extension" + "\n"
...     r")  # (1 end)" + "\n"
... )
>>> Rx = re.compile(expression4, re.X)
>>> print(expression4)
(  # (1 start)
    (  # (2 start)
        ([A-Za-z]:)  # (3)
        (\\){1,2}  # (4)
    )?  # (2 end), group to catch optional drive volume
    (  # (5 start)
        ([A-Za-z0-9_%~-])*  # (6), catch some letters/symbols
        (\\)  # (7), catch one backslash
        ([A-Za-z0-9_%~-])*  # (8), catch more letters/symbols
    )+  # (5 end), at least one of this group
    (  # (9 start)
        \.[a-zA-Z]{3,4}
    )?  # (9 end), catch optional file extension
)  # (1 end)
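If a raw string is acceptable, the readability of the triple-quoted form and the one-to-one backslashes of the raw form can be combined in a single raw triple-quoted literal. The following is a minimal sketch under that assumption; the name PATH_RX and the sample line are made up for illustration and are not from the question.

```python
import re

# Raw triple-quoted literal: what you type is what the regex engine receives,
# so the pattern reads the same in source as it does when printed.
PATH_RX = re.compile(
    r"""
    (                               # (1) the whole path
        (
            ([A-Za-z]:)             # drive letter and colon
            (\\){1,2}               # one or two backslashes
        )?                          # optional drive volume
        (
            ([A-Za-z0-9_%~-])*      # some letters/symbols
            (\\)                    # one backslash
            ([A-Za-z0-9_%~-])*      # more letters/symbols
        )+                          # at least one of this group
        (
            \.[a-zA-Z]{3,4}
        )?                          # optional file extension
    )
    """,
    re.VERBOSE,
)

# Hypothetical input line, written as a raw string so the path keeps its backslashes.
line = r"copied C:\Users\demo\notes.txt to the share"

# Group 1 is the outermost group, i.e. the full matched path.
matches = [m.group(1) for m in PATH_RX.finditer(line)]
print(matches)
```

With this sample line the list holds the single path C:\Users\demo\notes.txt. Note that print shows the backslashes doubled because it prints the repr of the strings inside the list.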