Shell: extracting broken URLs



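I'm new to shell scripting. I'm extracting some URLs from emails with Python, but the URLs the script decodes come out broken. So my idea is to write some code that extracts only the URLs I need.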

This is the file:

http://stackoverflow.com/questions/17988756/=
how-to-select-lines-between-two-marker-patterns-which-may-occur-multiple-times-w
.
.
.(some text)
http://stackoverflow.com/questions/9605232/=
merge-two-lines-into-one
.
.
.

The desired output is:

http://stackoverflow.com/questions/17988756/how-to-select-lines-between-two-marker-patterns-which-may-occur-multiple-times-w
http://stackoverflow.com/questions/9605232/merge-two-lines-into-one

Thanks in advance.

Use the following sed command:

sed ':loop; /^http:.*=$/{N;s/=\n//g; t loop}' file
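The same script may be easier to follow spread over several lines with comments. This is only a sketch: the function name join_wrapped_urls is made up for illustration, and it assumes the input uses plain LF line endings.

```shell
# Hypothetical wrapper around the sed loop; reads the named files or stdin.
join_wrapped_urls() {
    sed '
        # Label to jump back to after each successful join.
        :loop
        # Only act on lines that start with "http:" and end with "=".
        /^http:.*=$/ {
            # Append the next input line to the pattern space.
            N
            # Drop the trailing "=" together with the newline that follows it.
            s/=\n//
            # If the substitution happened, re-check the joined line.
            t loop
        }
    ' "$@"
}
```

Calling join_wrapped_urls file should print the file with each wrapped URL joined onto one line, leaving all other lines as they are.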

Test:

$ cat file
(some text)
http://stackoverflow.com/questions/9605232/=
merge-two-lines=
-into-one
(some text)
$ sed ':loop; /^http:.*=$/{N;s/=\n//; t loop}' file
(some text)
http://stackoverflow.com/questions/9605232/merge-two-lines-into-one
(some text)
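Note that the sed command keeps the surrounding text, while the question asks for the URLs only. One possible variation, sketched here with awk, joins the wrapped lines and prints nothing else. The function name extract_urls is an assumption for illustration, and the sketch assumes that a trailing "=" always means the URL continues on the next line (it looks like a quoted-printable soft line break) and that the input uses LF line endings.

```shell
# Hypothetical helper: print only the URLs, with wrapped lines joined.
extract_urls() {
    awk '
    {
        # Handle a line if it starts a URL or continues a wrapped one.
        if (joining || $0 ~ /^https?:/) {
            if ($0 ~ /=$/) {
                # Trailing "=": strip it and keep collecting.
                buf = buf substr($0, 1, length($0) - 1)
                joining = 1
            } else {
                # Last piece of the URL: print it and reset.
                print buf $0
                buf = ""
                joining = 0
            }
        }
    }
    # Flush a URL that was still open at end of input.
    END { if (joining) print buf }
    ' "$@"
}
```

Running extract_urls file on the test file above should print just the joined URL line. Unlike the sed pattern, this sketch also accepts https: URLs.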
