I'm having trouble using a regular expression to solve this problem.
For example, given the call below:
regex_exp(address, "OG 56432")
it should return
"OG 56432: Middle Street Pollocksville | 686"
address is a list of strings:
address = [
"622 Gordon Lane St. Louisville OH 52071",
"432 Main Long Road St. Louisville OH 43071",
"686 Middle Street Pollocksville OG 56432"
]
My solution currently looks like this (Python):
import re
def regex_exp(address, zipcode):
    for i in address:
        if zipcode in i:
            postal_code = (re.search(r"[A-Z]{2}\s[0-9]{5}", i)).group(0)
            # returns "OG 56432"
            digits = (re.search(r"\d+", i)).group(0)
            # returns "686"
            address = (re.search(r"\D+", i)).group(0)
            # returns " Middle Street Pollocksville OG " (with surrounding spaces)
            print(postal_code + ":" + address + "| " + digits)
regex_exp(address, "OG 56432")
# prints OG 56432: Middle Street Pollocksville OG | 686
As you can see from my second paragraph, this is not the correct answer. The value I need returned is
"OG 56432: Middle Street Pollocksville | 686"
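How do I change the regex search for my address variable so that it excludes the 2 consecutive uppercase letters? I tried something like this
address = (re.search(r"?!\D+", i)).group(0)
based on an approach for removing two consecutive uppercase letters with a regex in order to exclude a word/string, but I think this is a step in the wrong direction.
PS: I know there are easier ways to solve this, but I want to use regex to improve my fundamentals.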
If you only want to remove the two consecutive uppercase letters that precede the postal code (5 digits), use this:
import re
text = "432 Main Long PC Market Road St. Louisville OG 43071"
address = re.sub(r'([A-Z]{2}[\s]{1})(?=[\d]{5})','',text)
print(address)
# Output: 432 Main Long PC Market Road St. Louisville 43071
To remove all occurrences of two consecutive uppercase letters:
import re
text = "432 Main Long PC Market Road St. Louisville OG 43071"
address = re.sub(r'([A-Z]{2}[\s]{1})','',text)
print(address)
# Output: 432 Main Long Market Road St. Louisville 43071
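The same lookahead idea can also be applied to the search from the question, so that the street part stops before the two-letter code instead of removing it afterwards. This is only a minimal sketch: the helper name street_part is made up for illustration, and it assumes each line ends with two uppercase letters, whitespace, and a five-digit postal code.

```python
import re

def street_part(line):
    # Hypothetical helper, not from the original post.
    # Lazily match non-digit characters and stop right before
    # whitespace + two uppercase letters + whitespace + five digits.
    match = re.search(r"\D+?(?=\s+[A-Z]{2}\s+\d{5})", line)
    # Strip the space left over after the house number.
    return match.group(0).strip() if match else None

print(street_part("686 Middle Street Pollocksville OG 56432"))
# Middle Street Pollocksville
```

The lookahead only checks what follows without consuming it, so "OG" never becomes part of the match.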
Using re.sub() and capture groups, you can use:
import re
s = "686 Middle Street Pollocksville OG 56432"
re.sub(r"(\d+)\s+(.*)\s+([A-Z]+\s+\d+)", r"\3: \2 | \1", s)
Out: 'OG 56432: Middle Street Pollocksville | 686'
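Putting the capture-group idea back into the function from the question might look roughly like the sketch below. It assumes every line has the layout "house number, street, two uppercase letters, five digits", and it returns the string rather than printing it; the pattern name ADDRESS_RE and the named groups are choices made here for illustration, not something from the original post.

```python
import re

# Assumed layout: "<house number> <street ...> <two letters> <five digits>"
ADDRESS_RE = re.compile(
    r"(?P<number>\d+)\s+(?P<street>.*?)\s+(?P<code>[A-Z]{2}\s\d{5})$"
)

def regex_exp(address, zipcode):
    # Return the reformatted line for the first address whose
    # trailing code equals zipcode, or None if nothing matches.
    for line in address:
        match = ADDRESS_RE.match(line)
        if match and match.group("code") == zipcode:
            return f"{match.group('code')}: {match.group('street')} | {match.group('number')}"
    return None

address = [
    "622 Gordon Lane St. Louisville OH 52071",
    "432 Main Long Road St. Louisville OH 43071",
    "686 Middle Street Pollocksville OG 56432",
]
print(regex_exp(address, "OG 56432"))
# OG 56432: Middle Street Pollocksville | 686
```

Named groups keep the three parts readable, and anchoring the code with `$` means the lazy street group cannot stop early on some other pair of uppercase letters.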