Java regex for Google Maps URLs?



I want to parse all Google Maps links inside a string. The formats are as follows:

First example: https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z/data=!3m1!4b1!4m5!3m4!1s0x89b7b7bcdecbb1df:0x715969d86d0b76bf!8m2!3d38.8976763!4d-77.0365298

https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z

https://www.google.com/maps/place//@38.8976763,-77.0387185,17z

https://maps.google.com/maps/place//@38.8976763,-77.0387185,17z

https://www.google.com/maps/place/@38.8976763,-77.0387185,17z

https://google.com/maps/place/@38.8976763,-77.0387185,17z

http://google.com/maps/place/@38.8976763,-77.0387185,17z

https://www.google.com.tw/maps/place/@38.8976763,-77.0387185,17z

These are all valid Google Maps URLs (linking to the White House).

Here is what I tried:

String gmapLinkRegex = "(http|https)://(www\\.)?google\\.com(\\.\\w*)?/maps/(place/.*)?@(.*z)[^ ]*";
Pattern patternGmapLink = Pattern.compile(gmapLinkRegex, Pattern.CASE_INSENSITIVE);
Matcher m = patternGmapLink.matcher(s);
while (m.find()) {
    logger.info("group0 = {}", m.group(0));
    String place = m.group(4); // optional "place/..." part, null when absent
    place = StringUtils.stripEnd(place, "/"); // remove trailing '/'
    place = StringUtils.stripStart(place, "place/"); // remove leading 'place/'
    logger.info("place = '{}'", place);
    String latLngZ = m.group(5); // "lat,lng,zoom" up to and including the 'z'
    logger.info("latLngZ = '{}'", latLngZ);
}

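It works in simple cases, but it still has bugs. For example:

It needs post-processing to get the optional place information.

And it cannot extract two URLs from a single line, such as: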

s = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z " +
" and http://google.com/maps/place/@38.8976763,-77.0387185,17z";

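There should be two URLs, but the regex matches the whole line.

The points:

  • The whole URL should be matched by group(0) (including the trailing data part in the first example).
  • In the first example, if the zoom level (17z) is removed, it is still a valid gmap URL, but my regex cannot match it.
  • Easier extraction of the optional place information.
  • Lat/lng extraction is a must; the zoom level is optional.
  • Able to parse multiple URLs in one line.
  • Able to handle maps.google.com(.xx)/maps. I tried (www|maps.)? but it still seems buggy.

Any suggestions to improve this regex? Thanks a lot!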

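The dot-asterisk

.*

will always allow anything up to the end of the last URL. You need a "tighter" regex that matches a single URL but not several URLs with anything in between. The "[^ ]*" might also include the next URL if it is separated by something other than " ", which includes newlines, tabs, shift-spaces...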
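The difference can be seen on the two-URL line from the question. The following is a minimal sketch, not a tested production pattern: the class name GreedyDemo, the helper findAll, and the shortened TIGHT pattern are introduced here for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // The question's pattern: ".*" is free to run across the space into the next URL.
    static final Pattern GREEDY = Pattern.compile(
        "(http|https)://(www\\.)?google\\.com(\\.\\w*)?/maps/(place/.*)?@(.*z)[^ ]*",
        Pattern.CASE_INSENSITIVE);

    // Tightened (illustrative): "[^@]*" stops at the first '@', and the
    // coordinate part may only contain digits, dots, commas and minus signs.
    static final Pattern TIGHT = Pattern.compile(
        "(http|https)://(www\\.)?google\\.com(\\.\\w*)?/maps/(place/[^@]*)?@([0-9.,-]*z)",
        Pattern.CASE_INSENSITIVE);

    // Collects every match of p in s, in order of appearance.
    static List<String> findAll(Pattern p, String s) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z "
                 + " and http://google.com/maps/place/@38.8976763,-77.0387185,17z";
        System.out.println("greedy matches: " + findAll(GREEDY, s).size());
        System.out.println("tight matches:  " + findAll(TIGHT, s).size());
    }
}
```

With the greedy pattern the single match spans the whole line; with the tightened one each URL is reported separately.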

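I propose (sorry, not tested in Java) using "anything but @", "digit, minus, comma or dot", and "an optional special string followed by a tailored character set, many times".

"(http|https)://(www\.)?google\.com(\.\w*)?/maps/(place/[^@]*)?@([0123456789.,-]*z)(/data=[!:.\-0123456789abcdefmsx]+)?"

I tested the one above on a Perl-regex compatible engine (np++).
Please adapt it yourself if I guessed anything wrong. The explicit list of digits can probably be replaced by "\d"; I tried to minimise assumptions about the regex flavour.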

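To match "URL" or "URL and URL", store the regex in a variable and then use "(URL and )*URL", replacing "URL" with the regex variable. If the question is how to retrieve multiple matches: that is Java, and I cannot help there. Let me know and I will delete this answer rather than provoke deserved downvotes ;-)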

(Edited to catch the data part, which I had not seen before, in the first example on the first line; and multiple URLs in one line.)
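For the Java side, retrieving several matches is usually done by calling Matcher.find() in a loop. The sketch below builds on the tightened pattern above and adds a few assumptions that are not part of the answer: named groups (place, lat, lng, zoom), an optional "maps." host prefix, an optional zoom level, and a data part that runs to the next whitespace. The class GmapLinkFinder and its Link holder are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GmapLinkFinder {
    /** One parsed link; place and zoom are null when absent from the URL. */
    static final class Link {
        final String url, place, lat, lng, zoom;

        Link(String url, String place, String lat, String lng, String zoom) {
            this.url = url;
            this.place = place;
            this.lat = lat;
            this.lng = lng;
            this.zoom = zoom;
        }
    }

    // Assumed pattern: only google.com and google.com.xx hosts are accepted.
    static final Pattern GMAP = Pattern.compile(
        "https?://(?:www\\.|maps\\.)?google\\.com(?:\\.\\w+)?/maps/"
        + "(?:place/(?<place>[^/@\\s]*)/*)?"                      // optional place
        + "@(?<lat>-?\\d+(?:\\.\\d+)?),(?<lng>-?\\d+(?:\\.\\d+)?)" // required lat,lng
        + "(?:,(?<zoom>\\d+(?:\\.\\d+)?)z)?"                       // optional zoom
        + "(?:/data=\\S*)?",                                       // optional data part
        Pattern.CASE_INSENSITIVE);

    // Returns every Google Maps link found in text, in order of appearance.
    static List<Link> findLinks(String text) {
        List<Link> links = new ArrayList<>();
        Matcher m = GMAP.matcher(text);
        while (m.find()) {
            String place = m.group("place");
            if (place != null && place.isEmpty()) {
                place = null; // "place//@..." carries no place name
            }
            links.add(new Link(m.group(), place,
                m.group("lat"), m.group("lng"), m.group("zoom")));
        }
        return links;
    }

    public static void main(String[] args) {
        String s = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z "
                 + " and http://google.com/maps/place/@38.8976763,-77.0387185,17z";
        for (Link l : findLinks(s)) {
            System.out.println(l.url + " -> lat=" + l.lat + " lng=" + l.lng
                + " zoom=" + l.zoom + " place=" + l.place);
        }
    }
}
```

Named groups avoid the post-processing with stripStart/stripEnd, and m.group() (equivalent to group(0)) still returns the whole URL including any trailing data part.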

I wrote this regex to validate Google Maps links:

"(http:|https:)?\/\/(www\.)?(maps.)?google\.[a-z.]+\/maps/?([\?]|place/*[^@]*)?/*@?(ll=)?(q=)?(([\?=]?[a-zA-Z]*[+]?)*/?@{0,1})?([0-9]{1,3}\.[0-9]+(,|&[a-zA-Z]+=)-?[0-9]{1,3}\.[0-9]+(,?[0-9]+(z|m))?)?(\/?data=[\!:\.\-0123456789abcdefmsx]+)?"

I tested it with the following list of Google Maps links:

String location1 = "http://www.google.com/maps/place/21.01196755,105.86306012";
String location2 = "https://www.google.com.tw/maps/place/@38.8976763,-77.0387185,17z";
String location3 = "http://www.google.com/maps/place/21.01196755,105.86306012";
String location4 = "https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z/data=!3m1!4b1!4m5!3m4!1s0x89b7b7bcdecbb1df:0x715969d86d0b76bf!8m2!3d38.8976763!4d-77.0365298";
String location5 = "https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z";
String location6 = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z";
String location7 = "https://maps.google.com/maps/place//@38.8976763,-77.0387185,17z";
String location8 = "https://www.google.com/maps/place/@38.8976763,-77.0387185,17z";
String location9 = "https://google.com/maps/place/@38.8976763,-77.0387185,17z";
String location10 = "http://google.com/maps/place/@38.8976763,-77.0387185,17z";
String location11 = "https://www.google.com/maps/place/@/data=!4m2!3m1!1s0x3135abf74b040853:0x6ff9dfeb960ec979";
String location12 = "https://maps.google.com/maps?q=New+York,+NY,+USA&hl=no&sll=19.808054,-63.720703&sspn=54.337928,93.076172&oq=n&hnear=New+York&t=m&z=10";
String location13 = "https://www.google.com/maps";
String location14 = "https://www.google.fr/maps";
String location15 = "https://google.fr/maps";
String location16 = "http://google.fr/maps";
String location17 = "https://www.google.de/maps";
String location18 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location19 = "https://www.google.de/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location20 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4&layer=t&lci=com.panoramio.all,com.google.webcams,weather";
String location21 = "https://www.google.com/maps?ll=37.370157,0.615234&spn=45.047033,93.076172&t=m&z=4&layer=t";
String location22 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location23 = "https://www.google.de/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location24 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4&layer=t&lci=com.panoramio.all,com.google.webcams,weather";
String location25 = "https://www.google.com/maps?ll=37.370157,0.615234&spn=45.047033,93.076172&t=m&z=4&layer=t";
String location26 = "http://www.google.com/maps/place/21.01196755,105.86306012";
String location27 = "http://google.com/maps/bylatlng?lat=21.01196022&lng=105.86298748";
String location28 = "https://www.google.com/maps/place/C%C3%B4ng+vi%C3%AAn+Th%E1%BB%91ng+Nh%E1%BA%A5t,+354A+%C4%90%C6%B0%E1%BB%9Dng+L%C3%AA+Du%E1%BA%A9n,+L%C3%AA+%C4%90%E1%BA%A1i+H%C3%A0nh,+%C4%90%E1%BB%91ng+%C4%90a,+H%C3%A0+N%E1%BB%99i+100000,+Vi%E1%BB%87t+Nam/@21.0121535,105.8443773,13z/data=!4m2!3m1!1s0x3135ab8ee6df247f:0xe6183d662696d2e9";
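A list like this can be run against any candidate pattern with a small harness that reports the inputs that do not match in full. This is only a sketch: the class GmapRegexCheck and the method rejected are hypothetical names, and CANDIDATE is a deliberately simple placeholder that requires "@lat,lng", not the validation regex above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class GmapRegexCheck {
    // Placeholder pattern (an assumption): accepts only URLs carrying "@lat,lng".
    static final Pattern CANDIDATE = Pattern.compile(
        "https?://(?:www\\.|maps\\.)?google(?:\\.[a-z]+)+/maps/(?:place/[^@\\s]*)?"
        + "@-?\\d+\\.\\d+,-?\\d+\\.\\d+(?:,\\d+z)?(?:/data=\\S*)?",
        Pattern.CASE_INSENSITIVE);

    /** Returns the inputs that the pattern does NOT match in full. */
    static List<String> rejected(Pattern p, List<String> urls) {
        List<String> out = new ArrayList<>();
        for (String u : urls) {
            if (!p.matcher(u).matches()) {
                out.add(u);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A few entries taken from the list above.
        List<String> urls = Arrays.asList(
            "https://www.google.com.tw/maps/place/@38.8976763,-77.0387185,17z",
            "https://maps.google.com/maps/place//@38.8976763,-77.0387185,17z",
            "https://www.google.com/maps",
            "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4");
        for (String u : rejected(CANDIDATE, urls)) {
            System.out.println("no match: " + u);
        }
    }
}
```

Using matches() rather than find() checks the entire string, which suits validation; for pulling links out of free text, find() in a loop is the better fit.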
