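I'm learning how to use regular expressions to parse location/address strings. Unfortunately, the data I've been given is inconsistent, and most of the addresses are unconventional. Below is what I have so far. The problem is that I need to parse the string several times to get it into the proper format.
Take the following string as an example: 102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649
The end result I want is: 110 Spruce, Greenwood, SC 29649
Code: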
l = nil
location_str = "102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649"
1.upto(4).each do |attempt|
l = Location.from_string(location_str)
puts "TRYING: #{location_str}"
break if !l.nil?
location_str.gsub!(/^[^,:-]+\s*/, '')
end
Output:
TRYING: 102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: , 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: , 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: , 108 Spruce, 110 Spruce, Greenwood, SC 29649
Expected output:
TRYING: 102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: 110 Spruce, Greenwood, SC 29649
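The loop in the question stalls because the pattern removes the first segment but leaves its comma behind, so on the next pass `^[^,:-]+` has nothing to match at the start of the string. One possible fix is to consume the comma and the whitespace after it as well. The following is only a sketch: `first_parseable` and `fake_parser` are hypothetical names, and `fake_parser` merely stands in for `Location.from_string`, whose real behavior is not shown in the question.

```ruby
# Sketch: drop leading comma-separated segments until the parser accepts the string.
# The parser is passed as a block and is expected to return nil on failure.
def first_parseable(location_str, max_attempts: 4)
  candidate = location_str.dup
  max_attempts.times do
    result = yield(candidate)
    return [candidate, result] unless result.nil?
    # Remove the first segment together with its trailing comma and whitespace.
    stripped = candidate.sub(/\A[^,]*,\s*/, '')
    break if stripped == candidate # no comma left, nothing more to strip
    candidate = stripped
  end
  nil
end

# Hypothetical stand-in for Location.from_string (an assumption, not the real class):
# it accepts only strings with exactly three comma-separated parts.
fake_parser = ->(s) { s.split(/\s*,\s*/).size == 3 ? s : nil }

input = "102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649"
accepted, _location = first_parseable(input) { |s| fake_parser.call(s) }
puts accepted
```

With the stand-in parser, the third attempt is the first one that succeeds, which mirrors the expected output listed above.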
This is one of those things where there's more than one way to do it. Here's another:
def address_from_location_string(location)
*_, address, city, state_zip = location.split(/\s*,\s*/)
"#{address}, #{city}, #{state_zip}"
end
address_from_location_string("102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649")
# => "110 Spruce, Greenwood, SC 29649"
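Since the data is described as inconsistent, it may be worth guarding against strings with fewer than three parts, where the destructuring above would leave some variables as `nil`. A minimal sketch, assuming that returning `nil` is an acceptable way to signal "could not extract an address" (the method name `address_or_nil` is made up for illustration):

```ruby
# Sketch: same split-based idea, but returns nil when there are not enough parts.
def address_or_nil(location)
  # Split on commas, trim whitespace, and ignore empty pieces such as "a,,b".
  parts = location.split(",").map(&:strip).reject(&:empty?)
  return nil if parts.size < 3
  parts.last(3).join(", ")
end

puts address_or_nil("102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649")
p address_or_nil("Greenwood, SC 29649")
```

The first call yields the last three parts joined with ", ", and the second returns `nil` because only two parts are present.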
Assuming the format is:
"Stuff you aren't interested in, more stuff, more stuff, etc., house, city, state zip"
then you only need to anchor the last 3 parts to the end of the string with a dollar sign:
location_str[/[^,]*,[^,]*,[^,]*$/]
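Two details of the one-liner above are worth noting. The match begins right after the preceding comma, so it keeps the space that follows that comma. Also, in Ruby `$` matches at the end of any line, while `\z` matches only at the end of the whole string. A small sketch that accounts for both (the method name `last_three_parts` is hypothetical):

```ruby
# Sketch: take the last three comma-separated parts and trim the result.
# Returns nil when the string contains fewer than two commas.
def last_three_parts(location_str)
  # \z anchors at the very end of the string; &. skips strip when nothing matched.
  location_str[/[^,]*,[^,]*,[^,]*\z/]&.strip
end

puts last_three_parts("102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649")
```

This keeps the regex from the answer intact and only changes the anchor and the trimming.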
Try it without a regex:
address = "102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649"
elements = address.split(",").map(&:strip)
city, state_and_zip = elements.last(2)
addresses = elements[0...-2]
p [addresses.last, city, state_and_zip].join(", ")
Output: "110 Spruce, Greenwood, SC 29649"