Python regex to remove a string from a list of dictionaries



I have the following list of dictionaries:

d = [
    {
        "Business": "Company A",
        "Category": "Supply Chain",
        "Date": "Posted Date\nDecember 21 2021",
    },
    {
        "Business": "Company B",
        "Category": "Manufacturing",
        "Date": "Posted Date\nDecember 21 2021",
    },
]

I am trying to use re to remove the string "Posted Date\n" from the dictionaries, but I get the following error:

TypeError: expected string or bytes-like object

My code is as follows:

regex = re.compile('Posted Date\n')
filtered = [i for i in d if not regex.match(i)]
print(filtered)

If I do the same thing on a plain list of strings, without dictionaries, it works. Do I have to convert the dictionaries to strings first?

Thanks!

Assuming d is a list of dictionaries, your comprehension loops over the dictionaries themselves. So, on the first iteration:

i = {
    "Business": "Company A",
    "Category": "Supply Chain",
    "Date": "Posted Date\nDecember 21 2021",
}
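As a quick check of that diagnosis, the sketch below reproduces the error outside the comprehension. The names `pattern` and `first` are only illustrative. A compiled pattern's `match` method expects a string or bytes-like object, so passing the whole dictionary raises `TypeError`, while passing one of its string values works.

```python
import re

pattern = re.compile(r"Posted Date\n")
first = {
    "Business": "Company A",
    "Category": "Supply Chain",
    "Date": "Posted Date\nDecember 21 2021",
}

# Passing the whole dictionary raises the TypeError from the question.
try:
    pattern.match(first)
    raised = False
except TypeError:
    raised = True

# Passing a single string value works: match() returns a match object or None.
date_match = pattern.match(first["Date"])
print(raised, date_match is not None)
```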

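Indeed, you cannot use a regex on a dictionary. You need to go one level deeper and loop over the keys and values of each dictionary. Be aware, though, that adding or removing keys while iterating over a dictionary raises a RuntimeError, so iterate over a copy of the items instead.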

import re

d = [{
    "Business": "Company A",
    "Category": "Supply Chain",
    "Date": "Posted Date\nDecember 21 2021",
}, {
    "Business": "Company B",
    "Category": "Manufacturing",
    "Date": "Posted Date\nDecember 21 2021",
}]
regex = re.compile('Posted Date\n')
for dikt in d:
    # Iterate over a list copy of the items so that deleting keys
    # does not raise a RuntimeError.
    for key, value in list(dikt.items()):
        if regex.match(value):  # value starts with "Posted Date\n"
            del dikt[key]

This drops the Date key entirely:

d = [{
    "Business": "Company A",
    "Category": "Supply Chain",
}, {
    "Business": "Company B",
    "Category": "Manufacturing",
}]
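If mutating the dictionaries in place is undesirable, one alternative is to build new dictionaries, which also avoids deleting keys during iteration. This is only a minimal sketch, assuming every value is a string as in the question; `rows` and `without_date` are illustrative names.

```python
import re

rows = [
    {"Business": "Company A", "Category": "Supply Chain",
     "Date": "Posted Date\nDecember 21 2021"},
    {"Business": "Company B", "Category": "Manufacturing",
     "Date": "Posted Date\nDecember 21 2021"},
]
pattern = re.compile(r"Posted Date\n")

# Keep only the items whose value does not start with the pattern.
# Assumes all values are strings; the original rows are left untouched.
without_date = [
    {key: value for key, value in row.items() if not pattern.match(value)}
    for row in rows
]
print(without_date)
```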

If you just want to get rid of "Posted Date\n", this is enough:

d = [{
    "Business": "Company A",
    "Category": "Supply Chain",
    "Date": "Posted Date\nDecember 21 2021",
}, {
    "Business": "Company B",
    "Category": "Manufacturing",
    "Date": "Posted Date\nDecember 21 2021",
}]

for dikt in d:
    # Reassigning existing keys does not change the dictionary's size,
    # so iterating over dikt.items() directly is safe here.
    for key, value in dikt.items():
        dikt[key] = value.replace('Posted Date\n', '')  # naive: strips the string from every value

Result:

d = [{
    "Business": "Company A",
    "Category": "Supply Chain",
    "Date": "December 21 2021",
}, {
    "Business": "Company B",
    "Category": "Manufacturing",
    "Date": "December 21 2021",
}]
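Because the loop above calls `replace` on every value, it would fail with an `AttributeError` if any value were not a string. A slightly more targeted sketch, assuming the prefix only ever appears under the "Date" key (an assumption based on the sample data), uses `re.sub` anchored at the start of the string. `strip_prefix`, `PREFIX`, `rows`, and `cleaned_rows` are hypothetical names introduced for illustration.

```python
import re

# Anchored so that only a leading "Posted Date\n" is removed.
PREFIX = re.compile(r"^Posted Date\n")

def strip_prefix(row, key="Date"):
    """Return a copy of row with the prefix removed from row[key], if present."""
    cleaned = dict(row)
    value = cleaned.get(key)
    if isinstance(value, str):  # skip missing or non-string values
        cleaned[key] = PREFIX.sub("", value)
    return cleaned

rows = [
    {"Business": "Company A", "Category": "Supply Chain",
     "Date": "Posted Date\nDecember 21 2021"},
    {"Business": "Company B", "Category": "Manufacturing",
     "Date": "Posted Date\nDecember 21 2021"},
]
cleaned_rows = [strip_prefix(row) for row in rows]
print(cleaned_rows)
```

Returning copies keeps the original list intact, which can make it easier to compare the data before and after cleaning.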
