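Regex to get the name of an executed script from the command history

I am trying to write a regular expression that parses the syntax for invoking a script and captures the script name.

All of these are valid syntax for the call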

# normal way
run cred=username/password script.bi
# single quoted username password, also separated in a different way
run cred='username password' script.bi
# username/password is optional
run script.bi
# script extension is optional
run script
# the call might be broken into multiple lines using \
# THIS ONE SHOULD NOT MATCH
run cred=username/password \
script.bi

This is what I have so far

my $r = q{run +(?:cred=(?:[^\s']*|'.*') +)?([^\s\\]+)};

to capture the value in $1.

But I get

Unmatched [ before HERE mark in regex m/run +(?:cred=(?:[^\s']*|'.*') +)?([ << HERE ^\s\]+)/

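Inside q{}, \\ is treated as a single \, so in the regex it becomes \], which escapes the ] and leaves the [ unmatched.

Replace it with run +(?:cred=(?:[^\s']*|'.*') +)?([^\s\\\\]+) (note the \\\\) and try again.

Also, as pointed out in the comments, you should use qr for the regex rather than just q.

(I only looked at the error, not at the validity or efficiency of the regex for your problem.)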
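The difference between the two quoting operators can be checked directly. The following is a minimal sketch, not code from the thread; the variable names are made up for illustration. It builds the same character class with q{} and with qr{} and shows that only the string form loses a backslash.

```perl
use strict;
use warnings;

# q{} is a single-quoted string: \\ collapses to one backslash, \s is kept.
my $as_string = q{[^\s\\]+};

# qr{} passes the text to the regex compiler unchanged, so \\ still
# means "one literal backslash" inside the character class.
my $as_regex = qr{[^\s\\]+};

print "string form: $as_string\n";   # the class is now [^\s\]+

# Compiling the string form is a fatal error; eval traps it.
my $compiled = eval { qr/$as_string/ };
print "string form compiles: ", (defined $compiled ? "yes" : "no"), "\n";

print "regex form matches script.bi: ",
      ("script.bi" =~ /\A$as_regex\z/ ? "yes" : "no"), "\n";
```

With qr there is only one layer of escaping to reason about, which is why the doubled \\\\ is unnecessary there.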

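Fixing the immediate problem is a one-character difference: q versus qr. You're writing a regex, so call it what it is. Treating the pattern as a string means you have to deal with the rules of string quoting on top of the rules of regex escaping.

As for the language your regex matches, add anchors to force the pattern to match the entire line. The regex engine is determined and will keep working until it finds a match. Without anchors, it is happy to find a substring.

Sometimes this gives you surprising results. Have you ever dealt with a cranky child (or a childish adult) who takes a narrow, hyperliteral interpretation of what you said? The regex engine is like that, but it is trying to help.

It matches the last example because

With the ? quantifier, you said the cred=... subpattern can match zero times, so the regex engine skipped it.
You said the script name is the substring that follows, a run of one or more non-whitespace, non-backslash characters, so the regex engine saw cred=username/password, none of which are whitespace or backslash characters, and matched. Regexes are greedy: they consume what is right in front of them without regard to whether a given substring "should" be matched by another subpattern.

The last example matches, although not in the way you wanted. An important lesson with regexes is that any quantifier that can match zero times, such as ? or *, always succeeds!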
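Both points above can be observed in a few lines. This is an illustrative sketch with made-up variable names: the first match shows a zero-or-more quantifier succeeding on nothing, and the second shows the optional cred=... group being skipped on the continuation line.

```perl
use strict;
use warnings;

# A zero-or-more quantifier succeeds even when there is nothing to match.
my $text = "no digits here";
my $star_matched = ($text =~ /(\d*)/) ? 1 : 0;
my $star_capture = $1;          # empty string, not undef
print "star matched: $star_matched, captured [$star_capture]\n";

# The optional cred=... group behaves the same way on the continuation
# line: it matches zero times and the script-name class takes the cred text.
my $line  = 'run cred=username/password \\';   # ends in a single backslash
my $loose = qr{run +(?:cred=(?:[^\s']*|'.*') +)?([^\s\\]+)};
my ($taken) = $line =~ $loose;
print "loose pattern captured [$taken]\n";
```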

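Without the $ anchor, the pattern from the question leaves the trailing backslash unmatched, which you can see with a slight modification to $runpat.

my $runpat = qr{run +(?:cred=(?:[^\s']*|'.*') +)?([^\s\\]+)(.*)}; # ' SO hiliter hack

Note the (.*) at the end to grab any non-newline characters that may be left over. Changing the loop to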

while (<DATA>) {
  next unless /$runpat/;
  print "line $.: \$1=[$1]; \$2=[$2]\n";
}

gives the following output for line 15.

line 15: $1=[cred=username/password]; $2=[ \]

As a complete program, it becomes

#! /usr/bin/env perl
use strict;
use warnings;
# The goofy comment on the next line is a hack to
# help Stack Overflow's syntax highlighter recover
# from its confusion after seeing the quotes. It's
# for presentation only: you won't need it in your
# real code.
my $runpat = qr{^\s*run +(?:cred=(?:[^\s']*|'.*') +)?([^\s\\]+)$}; # '
while (<DATA>) {
  next unless /$runpat/;
  print "line $.: \$1=[$1]\n";
}
__DATA__
# normal way
run cred=username/password script.bi

# single quoted username password, also separated in a different way
run cred='username password' script.bi

# username/password is optional
run script.bi

# script extension is optional
run script

# the call might be broken into multiple lines using \
# THIS ONE SHOULD NOT MATCH
run cred=username/password \
script.bi

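Output:

line 2: $1=[script.bi]
line 5: $1=[script.bi]
line 8: $1=[script.bi]
line 11: $1=[script]

Brevity doesn't always help with regexes. Consider the following alternative but equivalent specification: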

my $runpat = qr{
  ^ \s*
  (?:
    run \s+ cred=(?:[^\s']*|'.*?') \s+ (?<script> [^\s\\]+)  # ' hiliter
  | run \s+ (?!cred=)                  (?<script> [^\s\\]+)
  )
  \s* $
}x;

Yes, it takes more room to write, but it states the acceptable alternatives more clearly. Your loop is nearly the same

while (<DATA>) {
  next unless /$runpat/;
  print "line $.: script=[$+{script}]\n";
}

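and it even spares the poor reader from counting parentheses.

To use named capture buffers, e.g., (?<script>...), be sure to add

use 5.10.0;

to the top of your program, which serves as executable documentation of the minimum required version of perl.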
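Putting the pieces of this answer together, the named-capture pattern can be wrapped in a small helper. This is only a sketch: script_name is a hypothetical function introduced here for illustration, and the sample lines are taken from the question.

```perl
use 5.10.0;
use strict;
use warnings;

my $runpat = qr{
  ^ \s*
  (?:
    run \s+ cred=(?:[^\s']*|'.*?') \s+ (?<script> [^\s\\]+)
  | run \s+ (?!cred=)                  (?<script> [^\s\\]+)
  )
  \s* $
}x;

# Hypothetical helper: returns the script name, or undef when the line
# is not a complete single-line call.
sub script_name {
    my ($line) = @_;
    return $line =~ $runpat ? $+{script} : undef;
}

for my $line ('run cred=username/password script.bi',
              'run script',
              'run cred=username/password \\') {   # trailing backslash
    my $name = script_name($line);
    print defined($name) ? "script=[$name]\n" : "no match\n";
}
```

Returning undef for a non-match lets the caller tell "no script found" apart from any real script name.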

Do the scripts ever take arguments? If not, why not:

/^run(?:\s.*\s|\s)(\S+)\s*$/

I guess that doesn't work for the line-continuation bit.
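A quick check confirms that suspicion. In this sketch (variable names are illustrative), the simple pattern is applied to the continuation line from the question and ends up capturing the backslash itself as the "script name".

```perl
use strict;
use warnings;

my $simple = qr/^run(?:\s.*\s|\s)(\S+)\s*$/;
my $line   = 'run cred=username/password \\';   # ends in a single backslash

# \S+ accepts any non-whitespace character, including the backslash.
my ($captured) = $line =~ $simple;
print "captured [$captured]\n";
```

Excluding the backslash from the captured character class is what rules this line out.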

/^run(?:\s+cred=(?:[^'\s]*|'[^']*')\s+|\s+)([^\\\s]+)\s*$/

Test program:

#!/usr/bin/perl
use strict;
use warnings;

my $foo = "# normal way
run cred=username/password script.bi

# single quoted username password, also separated in a different way
run cred='username password' script.bi

# username/password is optional
run script.bi

# script extension is optional
run script

# the call might be broken into multiple lines using 
# THIS ONE SHOULD NOT MATCH
run cred=username/password \\
script.bi
";
foreach my $line (split(/\n/, $foo))
{
  print "Looking >$line<\n";
  print "Match >$1<\n"
    if ($line =~ /^run(?:\s+cred=(?:[^'\s]*|'[^']*')\s+|\s+)([^\\\s]+)\s*$/);
}

Sample output:

Looking ># normal way<
Looking >run cred=username/password script.bi<
Match >script.bi<
Looking ><
Looking ># single quoted username password, also separated in a different way<
Looking >run cred='username password' script.bi<
Match >script.bi<
Looking ><
Looking ># username/password is optional<
Looking >run script.bi<
Match >script.bi<
Looking ><
Looking ># script extension is optional<
Looking >run script<
Match >script<
Looking ><
Looking ># the call might be broken into multiple lines using <
Looking ># THIS ONE SHOULD NOT MATCH<
Looking >run cred=username/password \<
Looking >script.bi<
