How to match a regex in Python

describe aws_security_group({:group_id=>"sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do

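I'm trying to extract sg-ezsrzerzer from the line above (that is, I want to match from sg- up to, but not including, the closing double quote). I'm using Python.

This is what I have so far: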

import re
a = 'describe aws_security_group({:group_id=>"sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do'
test = re.findall(r'\bsg-.*\b', a)
print(test)

The output is

['sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do']

How can I get only ['sg-ezsrzerzer']?

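If the goal is to extract the group_id value from a string formatted like the one in your example, the pattern (?<=group_id=>").+?(?=") will work well.

(?<=group_id=>") is a lookbehind: it requires the substring group_id=>" to appear immediately before the text being matched, without including it in the match.

.+? lazily matches one or more of any character.

(?=") is a lookahead: it requires the character " to follow the match (in effect, .+? matches everything up to but not including the closing ").

If you only want to extract group_id values that start with sg-, you can simply add that prefix to the matching part of the pattern, like this: (?<=group_id=>")sg-.+?(?=")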
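The sg- variant just described can be tried out as follows. This is a minimal sketch: the first string is the one from the question, while the second string and its xx-12345 value are made up here purely to show a group_id that does not start with sg-.

```python
import re

# Lookbehind for the literal prefix, then require the value to start with "sg-".
pattern = r'(?<=group_id=>")sg-.+?(?=")'

s = 'describe aws_security_group({:group_id=>"sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do'
# Hypothetical input whose group_id does not start with "sg-".
other = 'describe aws_security_group({:group_id=>"xx-12345", :vpc_id=>"vpc-zfds54zef4s"}) do'

sg_matches = re.findall(pattern, s)
other_matches = re.findall(pattern, other)

print(sg_matches)     # ['sg-ezsrzerzer']
print(other_matches)  # []
```

The vpc_id value is never picked up in either case, because the lookbehind only accepts text that directly follows group_id=>".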

import re
s = 'describe aws_security_group({:group_id=>"sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do'
results = re.findall('(?<=group_id=>").+?(?=")', s)
print(results)

Output

['sg-ezsrzerzer']

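Of course, you could also use re.search instead of re.findall to find only the first substring that matches the pattern above. Which one is better depends on your use case.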

import re
s = 'describe aws_security_group({:group_id=>"sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do'
result = re.search('(?<=group_id=>").+?(?=")', s)
if result:
    result = result.group()
print(result)

Output

sg-ezsrzerzer

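If you decide to use re.search, note that it returns None when no match is found in the input string and a re.Match object when one is. That is the reason for the if statement and the call to result.group(), which extracts the matched string when it exists, as in the example above.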
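The None check can be wrapped in a small helper so that callers never touch the match object directly. This is only a sketch: the function name extract_group_id and the second test string are hypothetical and not part of the question.

```python
import re

def extract_group_id(text):
    """Return the first group_id value found in text, or None if there is none."""
    match = re.search(r'(?<=group_id=>").+?(?=")', text)
    # re.search returns None on failure, so guard before calling .group().
    return match.group() if match else None

found = extract_group_id(
    'describe aws_security_group({:group_id=>"sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do'
)
# Hypothetical line that contains no group_id at all.
missing = extract_group_id('describe aws_vpc("vpc-zfds54zef4s") do')

print(found)    # sg-ezsrzerzer
print(missing)  # None
```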

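The pattern \bsg-.*\b matches too much because .* first runs all the way to the end of the string and then backtracks only until it reaches a word boundary. The first one it finds is the boundary after the o at the very end of the string.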
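The backtracking behaviour can be seen by running a few variants side by side on the string from the question. The lazy and negated-class variants are added here only for comparison and are not part of the original attempt.

```python
import re

a = 'describe aws_security_group({:group_id=>"sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do'

# Greedy: .* grabs everything, then backs off to the boundary after the final "o".
greedy = re.findall(r'\bsg-.*\b', a)

# Lazy: .*? stops at the first boundary, which is right after "sg-"
# (between the non-word "-" and the word character "e").
lazy = re.findall(r'\bsg-.*?\b', a)

# Negated class: consume everything up to, but not including, the next quote.
bounded = re.findall(r'\bsg-[^"]*', a)

print(greedy)   # ['sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do']
print(lazy)     # ['sg-']
print(bounded)  # ['sg-ezsrzerzer']
```

Making the quantifier lazy is not enough on its own here; the match has to be bounded by what may appear inside the value rather than by a word boundary.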

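If you use re.findall, you can also use a capture group instead of lookarounds; re.findall then returns the group's value in the result.

:group_id=>"(sg-[^"\r\n]+)"

The pattern matches:

:group_id=>" matches literally
(sg-[^"\r\n]+) capture group 1: matches sg- followed by one or more characters other than " or a newline
" matches the closing double quote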


For example

import re
pattern = r':group_id=>"(sg-[^"\r\n]+)"'
s = 'describe aws_security_group({:group_id=>"sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do'
print(re.findall(pattern, s))

Output

['sg-ezsrzerzer']

Use \w+ to match up to the first word boundary:

import re
a = 'describe aws_security_group({:group_id=>"sg-ezsrzerzer", :vpc_id=>"vpc-zfds54zef4s"}) do'
test = re.findall(r'\bsg-\w+', a)
print(test[0])


Explanation

--------------------------------------------------------------------------------
\b                       the boundary between a word char (\w) and
                         something that is not a word char
--------------------------------------------------------------------------------
sg-                      'sg-'
--------------------------------------------------------------------------------
\w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                         more times (matching the most amount
                         possible))

Result: sg-ezsrzerzer
