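I have a text string in which I want to replace two-word phrases with a single token. For example, if the phrase is artificial intelligence, I want to replace it with artificial_intelligence. This needs to be done for a list of 200 phrases and a text file of about 5 MB. I tried string.replace, but it only works for one element, not for a list.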
Example
Text = 'Artificial intelligence is useful for us in every situation of deep learning.'
list_a : list_b
Artificial intelligence : artificial_intelligence
Deep learning : deep_learning
...
Text.replace('Artificial intelligence', 'Artificial_intelligence')
works, but
for i in range(len(list_a)):
    Text = Text.replace(list_a[i], list_b[i])
does not work.
I suggest using a dict for your replacements:
text = "Artificial intelligence is useful for us in every situation of deep learning."
replacements = {"Artificial intelligence": "Artificial_intelligence",
                "deep learning": "deep_learning"}
Then your approach works (although it is case-sensitive):
>>> for rep in replacements:
...     text = text.replace(rep, replacements[rep])
...
>>> print(text)
Artificial_intelligence is useful for us in every situation of deep_learning.
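To make the case-sensitivity caveat concrete, here is a small sketch (the variable names unchanged and changed are illustrative, not from the question). It shows that str.replace returns the text as-is when only the capitalization differs, which is a likely reason the loop in the question appeared not to work:

```python
text = "Artificial intelligence is useful for us in every situation of deep learning."

# The search phrase is capitalized, but the text contains lowercase
# "deep learning", so str.replace finds nothing and returns the text unchanged.
unchanged = text.replace("Deep learning", "deep_learning")

# With matching capitalization the replacement goes through.
changed = text.replace("deep learning", "deep_learning")

print(unchanged == text)  # True
print(changed)
```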
For other approaches (such as the suggested regex approach), see SO: Python replace multiple strings.
Because the case of the list entries does not match the case in the string, you can use the re.sub() function with the IGNORECASE flag to get what you want:
import re

list_a = ['Artificial intelligence', 'Deep learning']
list_b = ['artificial_intelligence', 'deep_learning']
text = 'Artificial intelligence is useful for us in every situation of deep learning.'

# Pair each phrase with its replacement and substitute it, ignoring case.
# re.escape() makes each phrase match literally even if it contains regex metacharacters.
for from_, to in zip(list_a, list_b):
    text = re.sub(re.escape(from_), to, text, flags=re.IGNORECASE)

print(text)
# artificial_intelligence is useful for us in every situation of deep_learning.
Note that the zip() function lets you iterate over both lists at the same time.
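As a quick illustration of what zip() produces, the sketch below pairs up the two lists from the example; the names pairs and subs are only illustrative. Note that zip() stops at the end of the shorter list, so the two lists should have the same length.

```python
list_a = ['Artificial intelligence', 'Deep learning']
list_b = ['artificial_intelligence', 'deep_learning']

# zip() yields one (from, to) tuple per position in the two lists.
pairs = list(zip(list_a, list_b))
print(pairs)

# The same pairs can be turned into a dictionary in one step.
subs = dict(zip(list_a, list_b))
print(subs['Deep learning'])  # deep_learning
```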
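Also note that Christian is right: a dictionary is a better fit for your replacement data. To get exactly the same result, the previous code would then become: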
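import re

subs = {'Artificial intelligence': 'artificial_intelligence',
        'Deep learning': 'deep_learning'}
text = 'Artificial intelligence is useful for us in every situation of deep learning.'

# Iterate over (phrase, replacement) pairs and substitute each one, ignoring case.
# re.escape() makes each phrase match literally even if it contains regex metacharacters.
for from_, to in subs.items():
    text = re.sub(re.escape(from_), to, text, flags=re.IGNORECASE)

print(text)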
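With around 200 phrases, the loop above calls re.sub() once per phrase, so the whole text is scanned about 200 times. One possible alternative is to compile a single alternation pattern and look each match up in the dictionary. This is only a sketch: the helper name replace_all is hypothetical, it assumes the phrases are plain literal strings, and it has not been benchmarked against the loop.

```python
import re

subs = {'Artificial intelligence': 'artificial_intelligence',
        'Deep learning': 'deep_learning'}

def replace_all(text, subs):
    """Replace every key of subs found in text, ignoring case, in one pass."""
    # Lowercase the keys so a match can be looked up regardless of its case.
    lookup = {key.lower(): value for key, value in subs.items()}
    # Try longer phrases first so a shorter phrase cannot shadow a longer one.
    keys = sorted(lookup, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys), re.IGNORECASE)
    # Fall back to the matched text itself if the lowercase lookup misses.
    return pattern.sub(lambda m: lookup.get(m.group(0).lower(), m.group(0)), text)

text = 'Artificial intelligence is useful for us in every situation of deep learning.'
print(replace_all(text, subs))
# artificial_intelligence is useful for us in every situation of deep_learning.
```

A 5 MB file fits comfortably in memory, so its contents can be read into one string and passed to the function in a single call.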