Can't find a good regex pattern to replace strings in the correct order (Python)



I have a list of column names stored as strings, like this:

lst = ["plug", "[plug+wallet]", "(wallet-phone)"]

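Now I want to use regex to wrap each column name as df['...']. That works for plain names, but when the list contains a string like (wallet-phone), the brackets end up inside the quotes and I get output like df['(wallet']-df['phone)']. How can I get (df['wallet']-df['phone']) instead? Is my pattern wrong? See the code below: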

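import re

lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
x = []
y = []
# Step 1: quote every run of characters that is not an operator, a quote, or a digit
for l in lst:
    x.append(re.sub(r"([^+\-*/'\d]+)", r"'\1'", l))
# Step 2: wrap each quoted name in df[...]
for f in x:
    y.append(re.sub(r"('[^+\-*/'\d]+')", r"df[\1]", f))
print(x)
print(y)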

This gives:

x:["'plug'", "'[plug'+'wallet]'", "'(wallet'-'phone)'"]
y:["df['plug']", "df['[plug']+df['wallet]']", "df['(wallet']-df['phone)']"]

Is the pattern wrong? Expected output:

x:["'plug'", "['plug'+'wallet']", "('wallet'-'phone')"]
y:["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]

I also tried the pattern ([^+\-*/()[]'\d]+), but it does not keep () or [] out of the match either.
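For reference, here is a minimal sketch of how the character-class approach could be repaired. It relies on one fact about Python's `re` syntax: inside a character class, `]` has to be escaped (or placed first), otherwise it closes the class early. The name `NAME` is only an illustrative label, and folding the two substitutions into one is a simplification, not part of the original code.

```python
import re

lst = ["plug", "[plug+wallet]", "(wallet-phone)"]

# Negated class: any run of characters that is not an operator,
# a bracket, a quote, or a digit. Note the escaped \[ and \].
NAME = r"([^+\-*/()\[\]'\d]+)"

# Quote the name and wrap it in df[...] in a single substitution.
y = [re.sub(NAME, r"df['\1']", s) for s in lst]
print(y)
```

With the brackets excluded from the class, they stay outside the quoted name, so the three sample strings come out in the expected form.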

It may be easier to match the words themselves and wrap each one in a dictionary-style lookup:

import re

lst = ["plug", "[plug+wallet]", "(wallet-phone)"]
# \w+ matches each run of word characters; brackets and operators are left untouched
z = [re.sub(r"(\w+)", r"df['\1']", w) for w in lst]
print(z)
# ["df['plug']", "[df['plug']+df['wallet']]", "(df['wallet']-df['phone'])"]
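One caveat: `\w+` also matches digits, so a numeric constant in an expression such as `plug*2` would be wrapped as well. The sketch below is a possible variant, assuming column names look like Python identifiers (letters, digits, underscores, not starting with a digit). The helper `wrap_columns`, its `frame_name` parameter, and the extra `"plug*2"` input are hypothetical names and data added for illustration.

```python
import re

# Assumption: column names are identifier-like, so they never start with a digit.
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")

def wrap_columns(expr, frame_name="df"):
    """Wrap every identifier-like token in expr as frame_name['token']."""
    return IDENTIFIER.sub(lambda m: f"{frame_name}['{m.group(0)}']", expr)

lst = ["plug", "[plug+wallet]", "(wallet-phone)", "plug*2"]
print([wrap_columns(s) for s in lst])
```

If some column names do start with a digit or contain spaces, this pattern will not fit and would need to be adapted to the actual naming scheme.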
