I have a list containing a set of strings:
mylist = ['abc','[apple','banana','cucumber]','efg','{','egg','[fff]','ginger }','end;','abc1','[','apple1','banana1','cucumber1',']','efg1','{','egg1','[fff1]','ginger1 }','end1;']
I join all the strings together so that a regular expression can be applied to remove the content inside square brackets:
newlist = ['|'.join(mylist)]
Output of newlist: ['abc|[apple|banana|cucumber]|efg|{|egg|[fff]|ginger }|end;|abc1|[ |apple1|banana1|cucumber1|]|efg1|{|egg1|[fff1]|ginger1 }|end1;']
Then I use a regex to remove the bracketed content:
import re
newlist1 = [re.sub(r'\[.+?\]', '', newlist[0])]
Output of newlist1: ['abc||efg|{|egg||ginger }|end;|abc1||efg1|{|egg1||ginger1 }|end1;']
This also removes the bracketed content that sits inside curly braces, which I want to keep.
Expected output: ['abc||efg|{|egg|[fff]|ginger }|end;|abc1||efg1|{|egg1|[fff1]|ginger1 }|end1;']
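To make the gap between the observed and the expected result concrete, here is a small reproduction sketch on the first half of the joined string. It assumes the pattern is written with escaped brackets, r'\[.+?\]', since that is the form that produces the output shown above. The names joined, observed and expected are only illustrative.

```python
import re

joined = 'abc|[apple|banana|cucumber]|efg|{|egg|[fff]|ginger }|end;'

# The non-greedy pattern removes every [...] span; it has no notion of
# whether the span sits inside curly braces.
observed = re.sub(r'\[.+?\]', '', joined)

# What is wanted instead: [fff] survives because it is inside { ... }.
expected = 'abc||efg|{|egg|[fff]|ginger }|end;'

print(observed)              # abc||efg|{|egg||ginger }|end;
print(observed == expected)  # False
```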
This may not be exactly the answer you were looking for, but I hope it helps. I preprocess the text first and then apply the regular expression.
import re

def preprocess(text):
    # Replace '[' and ']' that lie outside curly braces with a backtick
    # placeholder. Assumes the input never contains a backtick itself.
    in_curly_brace = False
    chars = list(text)
    for i, c in enumerate(chars):
        if c in '[]' and not in_curly_brace:
            chars[i] = '`'
        elif c == '{':
            in_curly_brace = True
        elif c == '}':
            in_curly_brace = False
    return ''.join(chars)

def main():
    text = 'abc|[apple|banana|cucumber]|efg|{|egg|[fff]|ginger }|end;|abc1|[ |apple1|banana1|cucumber1|]|efg1|{|egg1|[fff1]|ginger1 }|end1;'
    processed_text = preprocess(text)
    # Only the backtick-delimited spans (brackets outside braces) are removed.
    newlist1 = [re.sub('`.+?`', '', processed_text)]
    print(newlist1)

if __name__ == '__main__':
    main()
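The placeholder approach relies on the backtick never appearing in the input. As a hedged alternative, the same idea can be sketched as a single pass that skips bracketed spans directly, with no placeholder and no regex. The function name strip_brackets_outside_braces is hypothetical. Unlike the code above, this sketch ignores braces that appear inside a removed bracket span, removes empty [] pairs, and drops the rest of the string if a '[' is never closed.

```python
def strip_brackets_outside_braces(text):
    """Drop [...] spans unless they occur inside curly braces."""
    out = []
    brace_depth = 0      # number of currently open '{'
    in_bracket = False   # True while skipping a top-level [...] span
    for c in text:
        if in_bracket:
            if c == ']':
                in_bracket = False
            continue     # skip everything in the span, including ']'
        if c == '{':
            brace_depth += 1
        elif c == '}':
            brace_depth = max(brace_depth - 1, 0)
        elif c == '[' and brace_depth == 0:
            in_bracket = True
            continue
        out.append(c)
    return ''.join(out)

text = ('abc|[apple|banana|cucumber]|efg|{|egg|[fff]|ginger }|end;'
        '|abc1|[ |apple1|banana1|cucumber1|]|efg1|{|egg1|[fff1]|ginger1 }|end1;')
result = strip_brackets_outside_braces(text)
print([result])
```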
A more elegant solution, which uses a counter so that nested curly braces are handled as well:
import re

def preprocess(text):
    # Track curly-brace nesting depth; only brackets at depth 0 are
    # replaced with the backtick placeholder.
    brace_depth = 0
    chars = list(text)
    for i, c in enumerate(chars):
        if brace_depth <= 0 and c in '[]':
            chars[i] = '`'
        if c == '{':
            brace_depth += 1
        if c == '}':
            brace_depth -= 1
    return ''.join(chars)

def main():
    text = 'abc|[apple|banana|cucumber]|efg|{|egg|[fff]|ginger }|end;|abc1|[ |apple1|banana1|cucumber1|]|efg1|{|egg1|[fff1]|ginger1 }|end1;'
    processed_text = preprocess(text)
    newlist1 = [re.sub('`.+?`', '', processed_text)]
    print(newlist1)

if __name__ == '__main__':
    main()
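If the curly braces are never nested, the preprocessing step can arguably be folded into the regex itself. The sketch below is one possible variant, not part of the answers above: the pattern matches a whole {...} block first and puts it back unchanged, and only bracket spans found outside such blocks are deleted. The function name remove_outer_brackets is hypothetical, and unlike '.+?' this pattern also removes empty [] pairs.

```python
import re

# Alternative 1 (captured): a complete, non-nested {...} block, kept as-is.
# Alternative 2: a [...] span outside any such block, removed.
PATTERN = re.compile(r'(\{[^{}]*\})|\[[^\[\]]*\]')

def remove_outer_brackets(text):
    # group(1) is None when the bracket alternative matched.
    return PATTERN.sub(lambda m: m.group(1) or '', text)

text = ('abc|[apple|banana|cucumber]|efg|{|egg|[fff]|ginger }|end;'
        '|abc1|[ |apple1|banana1|cucumber1|]|efg1|{|egg1|[fff1]|ginger1 }|end1;')
newlist1 = [remove_outer_brackets(text)]
print(newlist1)
```

Because the regex engine scans left to right, a bracket inside a brace block is consumed as part of the brace match and never gets a chance to match on its own.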