I have the following text:
fd types:["a"]
s types:
["b","c"]
types: [
"one"
]
types: "two"
types: ["three", "four", "five","six"]
no: ["Don't","read","this"]
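and I only want to extract all of the types values: a to c, and one to six. I do not want to extract other attributes, such as the no list. So far, I can do it in two steps. First: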
(?:types:\s*\[\s*)(.+?)(?:\s*\])|(?:types:\s*)(".+?")
This way I get the groups:
"a"
"b","c"
"one"
"two"
"three", "four", "five","six"
Then I apply
\s*"((?:[^",\s]*)*)"\s*
to each of the groups and get
a
b
c
one
two
three
four
five
six
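For reference, the two-step approach described above can be sketched with the standard-library `re` module. This is only an illustration of the two patterns from the question; the names `STEP1`, `STEP2`, and `extract_two_steps` are hypothetical, and the sample text is rebuilt from the lines shown at the top.

```python
import re

# Sample text, rebuilt from the lines in the question.
text = (
    'fd types:["a"] \n'
    's types:\n'
    '["b","c"]\n'
    'types: [\n'
    '"one"\n'
    ']\n'
    'types: "two"\n'
    'types: ["three", "four", "five","six"]\n'
    'no: ["Don\'t","read","this"]'
)

# Step 1: capture whatever follows "types:", either inside [...] (group 1)
# or as a single bare quoted string (group 2).
STEP1 = re.compile(r'(?:types:\s*\[\s*)(.+?)(?:\s*\])|(?:types:\s*)(".+?")')

# Step 2: pull each quoted item out of a captured group.
STEP2 = re.compile(r'\s*"((?:[^",\s]*)*)"\s*')


def extract_two_steps(s):
    """Run step 1 over the text, then step 2 over each captured group."""
    items = []
    for m in STEP1.finditer(s):
        group = m.group(1) or m.group(2)
        items.extend(STEP2.findall(group))
    return items


print(extract_two_steps(text))
```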
I would like to know whether this can be done in a single step.
You can `pip install regex` and use
(?:\G(?!^)\s*,\s*|types:(?:\s*\[)?\s*)"([^"]*)"
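See the regex demo. Details:
- `(?:\G(?!^)\s*,\s*|types:(?:\s*\[)?\s*)` - either of the two patterns:
  - `\G(?!^)\s*,\s*` - the end of the previous match, then a comma enclosed in zero or more whitespace characters
  - `|` - or
  - `types:(?:\s*\[)?\s*` - `types:`, an optional sequence of zero or more whitespace characters and a `[` character, then zero or more whitespace characters
- `"([^"]*)"` - a `"`, then zero or more characters other than `"` captured into Group 1, then a `"` character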
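To make the role of `\G` more concrete, here is a minimal sketch that emulates the same logic with the standard-library `re` module, which has no `\G` anchor. It is not the method used in this answer: the continuation is done by calling `Pattern.match(s, pos)`, which only succeeds exactly at `pos`, much like `\G`. The names `START`, `NEXT`, and `extract_types` are hypothetical.

```python
import re

# Second alternative plus the quoted item: starts a new run after "types:".
START = re.compile(r'types:(?:\s*\[)?\s*"([^"]*)"')
# What follows \G(?!^): a comma with optional whitespace, then the next item.
NEXT = re.compile(r'\s*,\s*"([^"]*)"')


def extract_types(s):
    """Collect quoted items after each "types:", following commas contiguously."""
    items = []
    pos = 0
    while True:
        m = START.search(s, pos)
        if m is None:
            break
        items.append(m.group(1))
        pos = m.end()
        # Keep going only while the next item starts right where the last ended.
        m = NEXT.match(s, pos)
        while m is not None:
            items.append(m.group(1))
            pos = m.end()
            m = NEXT.match(s, pos)
    return items
```

Because the continuation pattern is anchored at the end of the previous item, lists that do not follow `types:` are never entered, which is the same reason the `\G` alternative skips them.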
See the Python demo:
import regex
text = 'fd types:["a"] \ns types: \n ["b","c"]\ntypes: [\n"one"\n]\ntypes: "two"\ntypes: ["three", "four", "five","six"]\nno: ["Don\'t","read","this"]'
print(regex.findall(r'(?:\G(?!^)\s*,\s*|types:(?:\s*\[)?\s*)"([^"]*)"', text))
Output:
['a', 'b', 'c', 'one', 'two', 'three', 'four', 'five', 'six']