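How to remove lowercase words from a sentence using regex



I am trying to remove lowercase words using a regular expression, but I am not getting the desired output.

My input is "apply to this bill and are made a part thereof Illiam B Geissler"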

import re 
text = "apply to this bill and are made a part thereof Illam B GEISSLER"  
result = re.sub(r"\w[a-z]", "", text)
print(result) 

The output I get is " I B Geissler", but the desired output is " Illiam B Geissler".

Try matching the pattern \b[a-z]+\s* and replacing it with an empty string:

import re

text = "apply to this bill and are made a part thereof Illam B GEISSLER"
# \b anchors the match at the start of a word; \s* also removes trailing whitespace
result = re.sub(r'\b[a-z]+\s*', "", text).strip()
print(result)

This prints:

Illam B GEISSLER

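The idea behind the pattern \b[a-z]+\s* is that the leading \b anchors each match at the start of a word, so the lowercase letters inside a capitalized word such as Illam are left alone. Note that we call strip() to remove any leftover whitespace at the ends.

Another subtle point is that the pattern also consumes any whitespace to the right of each matched lowercase word. This keeps the text readable when matched words sit between words that do not match, for example: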
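To see what the leading \b contributes, here is a small sketch that runs the pattern with and without it on the same input. The variable names are only for illustration.

```python
import re

text = "apply to this bill and are made a part thereof Illam B GEISSLER"

# Without \b, lowercase runs inside capitalized words are removed too.
without_boundary = re.sub(r"[a-z]+\s*", "", text).strip()

# With \b, a match can only start at the beginning of a word.
with_boundary = re.sub(r"\b[a-z]+\s*", "", text).strip()

print(without_boundary)  # IB GEISSLER
print(with_boundary)     # Illam B GEISSLER
```

Dropping \b strips "llam " out of "Illam", which is the same kind of damage seen in the question's output.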

text = "United States Of a bunch of states called America"  
result = re.sub(r'\b[a-z]+\s*', "", text).strip()
print(result)

This correctly prints:

United States Of America
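As a rough illustration of why the trailing \s* matters, the sketch below compares the result with and without it. The variable names are made up for this example, and repr() is used so the leftover spaces are visible.

```python
import re

text = "United States Of a bunch of states called America"

# Without \s*, the space after each removed word is left behind.
no_trailing_ws = re.sub(r"\b[a-z]+", "", text)

# With \s*, each removed word takes its trailing whitespace with it.
with_trailing_ws = re.sub(r"\b[a-z]+\s*", "", text).strip()

print(repr(no_trailing_ws))
print(repr(with_trailing_ws))
```

The first result keeps a run of six spaces between "Of" and "America", one for each gap around the five removed words; the second has a single space there.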

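Alternatively, you can search for the uppercase words instead. The link below has an example:

Regex - find uppercase words in a string

This expression might also work:

\s*\b[a-z][a-z]*

Demo 1

Test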

import re
regex = r"\s*\b[a-z][a-z]*"
test_str = "apply to this bill and are made a part thereof Illam B GEISSLER apply to this bill and are made a part thereof Illam B GEISSLER"
subst = ""
# count=0 replaces every match; set count to limit the number of replacements
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)
if result:
    print(result)

Or this one:

([A-Z].*?\b\s*)

Test

import re
regex = r"([A-Z].*?\b\s*)"
test_str = "apply to this bill and are made a part thereof Illam B GEISSLER apply to this bill and are made a part thereof Illam B GEISSLER"
# Collect every piece that starts with an uppercase letter and join them back together
print("".join(re.findall(regex, test_str)))

Output

Illam B GEISSLER Illam B GEISSLER

Demo 2

Try this:
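To make the findall approach a little more concrete, this sketch prints the individual pieces before they are joined. It uses a shorter input than the test above and drops the capturing group, which does not change what findall returns here; the names are illustrative only.

```python
import re

text = "apply to this bill and are made a part thereof Illam B GEISSLER"

# Each match starts at an uppercase letter, runs lazily to the next
# word boundary, and then takes any whitespace that follows.
pieces = re.findall(r"[A-Z].*?\b\s*", text)

print(pieces)           # ['Illam ', 'B ', 'GEISSLER']
print("".join(pieces))  # Illam B GEISSLER
```

Because each piece carries its own trailing whitespace, joining with an empty string restores the original spacing between the kept words.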

import re
text = "apply to this bill and are made a part thereof Illam B GEISSLER"
result = re.sub(r"(\b[a-z]+)", '', text).strip()
print(result)

Output

Illam B GEISSLER
