Extract the path between single quotes using grep

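I am using wget to download files, and while doing so I save the log messages (see below) for later use. The most important part is the line Saving to: ‘/path/somefile.gz’.

I figured out how to extract this snippet using grep Saving. Now my question is: how can I extract only the path between the single quotes? ‘/path/somefile.gz’ => /path/somefile.gz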

HTTP request sent, awaiting response... 200 OK
Length: 15391 (15K) [application/octet-stream]
Saving to: ‘/path/somefile.gz’
0K .......... .....                                      100% 79,7M=0s
2020-07-06  - ‘/path/somefile.gz’ saved [15391/15391]

Total wall clock time: 0,1s
Downloaded: 1 files, 15K in 0s (79,7 MB/s)

Edit

Is there a way to do the processing directly in a pipeline of this form?

wget -m --no-parent -nd https://someurl/somefile.gz -P ~/src/  2>&1 |
grep Saving |
tee ~/src/log.txt

Thanks in advance!

Sample output from wget:

$ cat wget.out
HTTP request sent, awaiting response... 200 OK
Length: 15391 (15K) [application/octet-stream]
Saving to: '/path/somefile.gz'
0K .......... .....                                      100% 79,7M=0s
2020-07-06  - '/path/somefile.gz' saved [15391/15391]

Total wall clock time: 0,1s
Downloaded: 1 files, 15K in 0s (79,7 MB/s)

One awk solution to extract the desired path/file:

$ awk -F"'" '                        # define input delimiter as single quote
/Saving to:/   { print $2 }          # if line contains string "Saving to:" then print 2nd input field
' wget.out                           # our input
/path/somefile.gz                    # our output

To save the result in a variable:

$ wget_path=$(awk -F"'" '/Saving to:/ {print $2}' wget.out)
$ echo "${wget_path}"
/path/somefile.gz
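Because wget -m can fetch more than one file, the log may contain several "Saving to:" lines, and the awk command then prints one path per line. The following is a minimal sketch of looping over such a result in POSIX sh. The log content and both file names are made up for illustration, and log_file, paths, saved_path, and count are hypothetical names.

```shell
# Hypothetical log containing two downloads; both file names are made up.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
Saving to: '/path/first.gz'
Total wall clock time: 0,1s
Saving to: '/path/second.gz'
Total wall clock time: 0,1s
EOF

# awk prints field 2 of every "Saving to:" line, one path per line.
paths=$(awk -F"'" '/Saving to:/ {print $2}' "$log_file")

count=0
# Feed the list through a here-document so the loop runs in the current
# shell and the counter is still set after the loop ends.
while IFS= read -r saved_path; do
    count=$((count + 1))
    printf 'file %d: %s\n' "$count" "$saved_path"
done <<EOF
$paths
EOF

rm -f "$log_file"
```

Piping awk straight into the while loop would also work for printing, but in many shells the loop would then run in a subshell and count would be lost afterwards.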

Following up on the OP's edit to the question, here is wget's output piped into the awk solution:

wget -m --no-parent -nd https://someurl/somefile.gz -P ~/src/ 2>&1 | awk -F"'" '/Saving to:/ {print $2}' | tee ~/src/log.txt
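In the pipeline above, log.txt receives only the extracted path. If the complete log should also be kept for later use, one option is to move tee in front of awk. The sketch below assumes this is what is wanted. fake_wget is a hypothetical stand-in for the real wget call so that the example runs offline, and log_dir, full.log, and path.txt are made-up names.

```shell
# Stand-in for the real wget call so the sketch runs offline; it writes a
# shortened sample log to stderr, which is where wget sends its messages.
fake_wget() {
    cat >&2 <<'EOF'
HTTP request sent, awaiting response... 200 OK
Saving to: '/path/somefile.gz'
Total wall clock time: 0,1s
EOF
}

log_dir=$(mktemp -d)

# tee stores the complete log first; awk then reduces the stream to the path.
fake_wget 2>&1 |
    tee "$log_dir/full.log" |
    awk -F"'" '/Saving to:/ {print $2}' > "$log_dir/path.txt"

cat "$log_dir/path.txt"
```

With the real command, fake_wget would be replaced by the wget invocation from the question.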

Since the question asks for a grep solution, a single GNU grep command that extracts the specified path could be:

grep -Po "^Saving to: .\K[^']*"

This assumes that your grep implements Perl-compatible regular expressions (not every grep does).

Of course, it can also be used in a pipeline:
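Where it is unclear whether the installed grep understands -P, a script can probe for it and fall back to sed. This is only a sketch of one possible approach: extract_path, line, and result are hypothetical names, and the sed expression is an alternative to the grep pattern rather than part of the original answer.

```shell
# Sample line with ASCII quotes, as in the sample output.
line="Saving to: '/path/somefile.gz'"

extract_path() {
    # Probe for PCRE support: a grep without -P exits non-zero here.
    if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
        grep -Po "^Saving to: .\K[^']*"
    else
        # sed fallback: capture everything between the two single quotes.
        sed -n "s/^Saving to: '\([^']*\)'.*/\1/p"
    fi
}

result=$(printf '%s\n' "$line" | extract_path)
printf '%s\n' "$result"
```

Both branches read the log on standard input, so extract_path can take the place of the grep call in a pipeline.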

wget_command | grep -Po "^Saving to: .\K[^']*" | tee log.txt

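Note that I used the single quote (') character in the pattern to anchor the end of the path, whereas the sample input in the question uses the Unicode characters LEFT SINGLE QUOTATION MARK (U+2018) (‘) and RIGHT SINGLE QUOTATION MARK (U+2019) (’). If that is really intended, simply replace [^'] with [^’] in the pattern above.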
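The awk solution with -F"'" makes the same assumption about ASCII quotes. As far as I know, wget chooses the quote style based on the locale, so the log may contain either form. The sketch below is one way to accept both, using sed with the quote characters written out literally. extract_path and the two sample variables are hypothetical names.

```shell
# Two sample lines: one with ASCII quotes, one with U+2018/U+2019 quotes.
ascii_line="Saving to: '/path/somefile.gz'"
unicode_line="Saving to: ‘/path/somefile.gz’"

extract_path() {
    # Try the curly-quote form first, then the ASCII form. With -n, only
    # lines where a substitution succeeded are printed (p flag).
    sed -n \
        -e 's/^Saving to: ‘\(.*\)’$/\1/p' \
        -e "s/^Saving to: '\(.*\)'\$/\1/p"
}

printf '%s\n' "$ascii_line" | extract_path
printf '%s\n' "$unicode_line" | extract_path
```

Because the quotes appear as literal text rather than inside a bracket expression, the match does not depend on the tool treating a multi-byte character as a single unit.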
