Regex to match a URL/link that contains line breaks



I want to match URLs in citations, but some of the URLs have line breaks inside them.

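Sample text = Yale Project on Climate Change Communication. New Haven, CT: xxx University and George Mason University; 2015. p. 1–62. Available from: https://example.xxx.edu/wp-content/
uploads/2015/04/Global-Warming-CCAM-March-2015.pdf.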

Expected match: https://example.xxx.edu/wp-content/uploads/2015/04/Global-Warming-CCAM-March-2015.pdf

Try this (regex demo):

txt = """
Yale Project on Climate Change Communication. New Haven, CT: xxx University and George
Mason University; 2015. p. 1–62. Available from: https://example.xxx.edu/wp-content/
uploads/2015/04/Global-Warming-CCAM-March-2015.pdf. This is another text just for example"""

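import re
# [\S\n]+ matches any run of non-whitespace characters or newlines,
# so the match continues across a line break but stops at a space.
pat = re.compile(r"https?://[\S\n]+")
for url in pat.findall(txt):
    # Remove the embedded newline, then trim the sentence-ending period.
    print(url.replace("\n", "").strip("."))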

Prints:

https://example.xxx.edu/wp-content/uploads/2015/04/Global-Warming-CCAM-March-2015.pdf
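If the same extraction is needed in more than one place, the pattern can be wrapped in a small helper. The following is only a sketch under a few assumptions: the function name `extract_urls` and the set of trailing punctuation to trim (`.,;`) are illustrative choices, not part of the original answer.

```python
import re

# Same idea as above: after the scheme, accept non-whitespace characters or newlines.
URL_PATTERN = re.compile(r"https?://[\S\n]+")


def extract_urls(text):
    """Return the URLs found in text, rejoining parts split by a line break.

    Hypothetical helper for illustration only.
    """
    urls = []
    for match in URL_PATTERN.findall(text):
        # Drop embedded newlines, then trim trailing punctuation
        # (the characters ".,;" are an assumption about citation style).
        urls.append(match.replace("\n", "").rstrip(".,;"))
    return urls


sample = (
    "Available from: https://example.xxx.edu/wp-content/\n"
    "uploads/2015/04/Global-Warming-CCAM-March-2015.pdf. Another sentence."
)
print(extract_urls(sample))
```

One caveat with this approach: because a newline is always treated as part of the URL, a URL that genuinely ends at the end of a line will absorb the first word of the next line (for example, `https://a.com` followed by a line starting with `Next` comes out as `https://a.comNext`). It works best when a line break inside a URL is the only case where a URL touches a newline.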
