Regex to extract a key surrounded by brackets

a = "[abc]def - aaa"      # key = "abc" value = "def - aaa"
a2 = "[_abc def]def - aaa"  # key = "_abc def" value = "def - aaa"
b = "[abc]"
c = "abc]"                 # key = "abc"   value = ""
d = "[abc]]def/acd"       # key = "abc"   value = "def/acd"
f = "abc]]"               # key = "abc" value = ""

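The above are just a few examples of the patterns. I have thousands of similar string variables. The brackets can be single ("]", "[") or double ("]]", "[["), and the left bracket may be missing.

What I want is to get key-value pairs. The key is the string inside the brackets (the left bracket may be missing), e.g. abc or _abc def. The value is the string to the right of the closing bracket, e.g. def - aaa, def/acd, or an empty string.

How do I define the regex in Python? I tried a few, but none of them fit all the variables.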

I tried re.search(r"([^[].*?)(?:]|]])([^]].*)", a), which works, but re.search(r"([^[].*?)(?:]|]])([^]].*)", b) does not.

If you just want to ignore the brackets, you can use the following:

import re

words = re.split(r'[\[\]]+', key_value)   # split on any run of '[' or ']'
words = [w for w in words if w]           # remove empty words
key = words[0]
value = words[1] if len(words) > 1 else None
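As a rough illustration, the same idea can be wrapped in a function and tried on a few of the sample strings. The name `split_pair` is made up for this sketch. Note that this approach returns None rather than an empty string when there is no value, and it would also split on any brackets that appear inside the value.

```python
import re

def split_pair(key_value):
    """Hypothetical helper: split on runs of brackets and drop empty pieces."""
    words = [w for w in re.split(r'[\[\]]+', key_value) if w]
    key = words[0] if words else None
    value = words[1] if len(words) > 1 else None
    return key, value

for s in ["[abc]def - aaa", "abc]", "[abc]]def/acd"]:
    print(s, '->', split_pair(s))
```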

I copied this pattern from the documentation: re - Regular expression operations

Personally I would do this with .index(), but you asked for a regexp, so here you are.

>>> import re
>>> expr = r"^(?:\[?)(.*?)\]+(.*?)$"
>>> re.search(expr, a).group(0, 1, 2)
('[abc]def - aaa', 'abc', 'def - aaa')
>>> re.search(expr, a2).group(0, 1, 2)        
('[_abc def]def - aaa', '_abc def', 'def - aaa')
>>> re.search(expr, b).group(0, 1, 2)
('[abc]', 'abc', '')
>>> re.search(expr, c).group(0, 1, 2)
('abc]', 'abc', '')
>>> re.search(expr, d).group(0, 1, 2)
('[abc]]def/acd', 'abc', 'def/acd')
>>> re.search(expr, f).group(0, 1, 2)         
('abc]]', 'abc', '')

See the "Match information" section in the right-hand sidebar here.
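The .index() route mentioned at the top of this answer is not shown there. A minimal sketch of what it might look like follows. The helper name `split_by_index` is made up for illustration, and the sketch assumes every string contains at least one "]" (otherwise `str.index` raises `ValueError`).

```python
def split_by_index(s):
    """Hypothetical helper: split 'key]value' strings without a regex."""
    end = s.index(']')              # the first ']' ends the key
    key = s[:end].lstrip('[')       # drop a leading '[' or '[['
    value = s[end:].lstrip(']')     # drop ']' or ']]' in front of the value
    return key, value

for s in ["[abc]def - aaa", "[_abc def]def - aaa", "[abc]", "abc]",
          "[abc]]def/acd", "abc]]"]:
    print(s, '->', split_by_index(s))
```

Whether this is preferable to the regex is a matter of taste; it avoids pattern escaping but needs explicit handling if a string may lack a closing bracket.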

I would use rpartition here:

import re

txt = '''\
[abc]def - aaa
[_abc def]def - aaa
[abc]
abc]
[abc]]def/acd
abc]]'''

for e in txt.splitlines():
    li = e.rpartition(']')                           # split at the last ']'
    key = re.search(r'([^\[\]]+)', li[0]).group(1)   # first run of non-bracket characters
    value = li[-1]                                   # everything after the last ']'
    print('{:20}=> "{}":"{}"'.format(e, key, value))

If you would rather use a regular expression, you can use:

for e in txt.splitlines():
    m = re.search(r'\[*([^\[\]]+)\]*(.*)', e)
    print('{:20}=> "{}":"{}"'.format(e, *m.groups()))

Either way, it prints:

[abc]def - aaa      => "abc":"def - aaa"
[_abc def]def - aaa => "_abc def":"def - aaa"
[abc]               => "abc":""
abc]                => "abc":""
[abc]]def/acd       => "abc":"def/acd"
abc]]               => "abc":""
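Since the question mentions thousands of strings, it may be worth compiling the pattern once and returning None for inputs that do not fit. The following is only a sketch under that assumption: `PAIR_RE` and `parse_pair` are illustrative names, and the pattern is a slightly stricter variant of the one above that requires at least one closing bracket.

```python
import re

# Assumed shape: optional '[' run, a key without brackets,
# one or more ']', then the value (possibly empty).
PAIR_RE = re.compile(r'^\[*(?P<key>[^\[\]]+)\]+(?P<value>.*)$')

def parse_pair(s):
    """Hypothetical helper: return (key, value), or None if s does not fit."""
    m = PAIR_RE.match(s)
    return (m.group('key'), m.group('value')) if m else None

samples = ["[abc]def - aaa", "[_abc def]def - aaa", "[abc]",
           "abc]", "[abc]]def/acd", "abc]]", "no brackets"]
for s in samples:
    print('{:20}=> {}'.format(s, parse_pair(s)))
```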
