Search for a pattern and store the captured placeholder in a variable



I want to assign people to groups based on a file. The file looks like this:

group1 = john dave jim collin; 
group2 = abc def ghi jkl mno
      pqr stu vxz; 
group3 = marc;

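So I have to match the people listed between the equals sign and the semicolon (there may be newlines in between, see group2) and attribute them to a group.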

I tried the following, without success:

my $person2ascr = "sarah";
open (grp_file, "<$group_file");
   # the line below will only match if the group list is on a single line
   while(<grp_file>) {my $grp = $1 if (/(.*)\s*=\s*.*\n*.*$person2ascr.*\n*.*;/i)};
   # the following line won't match anything. Of course I close/open the file again
   while(<grp_file>) {my $grp = $1 if /(\w+)\s*=\s*(\w+)*\s*$person2ascr(\s+\w+)*\s*;/i};

But from reading the manual I concluded that I am doing it right :-/ Any help?

How about:

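use feature 'say';   # needed for say
$/=";";              # read one ";"-terminated record at a time
my @grps = <DATA>;
s/\n+//g for @grps;  # join multi-line records
my $person2ascr = "ghi";
for(@grps) {
    say "group: $1" if /^([^=]+)=.*\b$person2ascr\b/;
}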
__DATA__
group1 = john dave jim collin; 
group2 = abc def ghi jkl mno
      pqr stu vxz; 
group3 = marc;

Output:

group:  group2 
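The extra spaces in that output come from the whitespace captured around the group name. As a minimal sketch of a variant that returns the bare name, assuming group names contain no whitespace and member names consist of word characters only (`find_group` is a hypothetical helper, not part of the answer above):

```perl
use strict;
use warnings;

# Hypothetical helper: return the first group whose member list contains
# $person as a whole word, or undef if no group lists that person.
sub find_group {
    my ($text, $person) = @_;
    # Each record looks like "name = members;". [^;]* also spans newlines,
    # so multi-line member lists are handled without extra work.
    while ($text =~ /(\S+)\s*=\s*([^;]*);/g) {
        my ($group, $members) = ($1, $2);
        # \Q...\E quotes any regex metacharacters in the name.
        return $group if $members =~ /\b\Q$person\E\b/;
    }
    return;
}

# Sample data from the question, held in a string for the sketch.
my $text = "group1 = john dave jim collin; \n"
         . "group2 = abc def ghi jkl mno\n      pqr stu vxz; \n"
         . "group3 = marc;\n";

my $group = find_group($text, "ghi");
print "group: ", (defined $group ? $group : "(none)"), "\n";
```

The \b anchors only behave as intended when names start and end with word characters; names such as "o'neil" or "jean-luc" would need a different delimiter test.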

When the file has a well-defined end-of-record marker, there is a very simple way to read it one record at a time.

#Enclosing braces to ensure local $/ stays very local
{
    #Use 3-arg open (safer)
    open my $fh, '<', $group_file or die "Can't open $group_file: $!";
    #Set "newline" separator to the end-of-record token
    local $/ = ";\n";
    while(my $record = <$fh>) {
        #$record will contain "groupN = some name or other;\n"
        chomp $record;
        #$record now contains "groupN = some name or other" without the trailing ";\n"
        my ($group, $data) = split / = /, $record, 2;
        #$group contains "groupN"; $data contains "some name or other"
        $grp = $group if $data =~ /$person2ascr/; #Add i modifier if you want case insensitive matching
    }
    #It's paranoid, but close _can_ fail
    close $fh or warn "Closing $group_file failed: $!";
}
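A separator of ";\n" only matches when the newline follows the semicolon immediately, so a file with a space after the semicolon would be read as one long record. The following is a sketch of the same record-at-a-time idea with a bare ";" as the separator, under the assumption that trailing whitespace may occur. The helper name `group_of` and the in-memory sample are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical helper: read ";"-terminated records from an open filehandle
# and return the name of the first group that lists $person, or undef.
sub group_of {
    my ($fh, $person) = @_;
    local $/ = ";";                 # bare ";" tolerates "; \n" as well as ";\n"
    while (my $record = <$fh>) {
        chomp $record;              # strips the trailing ";" when present
        my ($group, $data) = split /\s*=\s*/, $record, 2;
        next unless defined $data;  # whitespace left after the final ";"
        $group =~ s/^\s+//;         # newline carried over from the previous record
        return $group if $data =~ /\b\Q$person\E\b/;
    }
    return;
}

# In-memory filehandle so the sketch runs without a real group file.
my $sample = "group1 = john dave jim collin; \n"
           . "group2 = abc def ghi jkl mno\n      pqr stu vxz; \n"
           . "group3 = marc;\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $grp = group_of($fh, "stu");
close $fh;
print defined($grp) ? "$grp\n" : "no group\n";
```

Because $/ is localized inside the helper, the caller's line-reading behaviour is unaffected once the function returns.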

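This solution may be overkill. It parses the group file and builds a complete data structure. That can be appropriate if you query group information repeatedly. If you only need to grep the group file for a few names, you probably don't want this solution, because it does far more than that task requires.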

I wrote a general-purpose parser for the groups file that returns two maps: one from names to groups and one from groups to names.

sub parse_name_groups
{
    my $file  = shift;          # file name of group file
    my %group_to_names;         # Hash mapping groups to lists of names
    my %name_to_groups;         # Hash mapping names to a list of groups
    my $group = "<UNKNOWN>";    # If we see a name outside of a group, assign it to <UNKNOWN>
    my $last_line_in_group = 0; # Flag: If we see a semicolon, this is the last line in a group.
    open my $fh, "<", $file
        or die "Cannot open group file '$file'\n";
    foreach my $line (<$fh>)
    {
        chomp $line;
        # Trim white space from front and back
        $line =~ s/^\s*//g;
        $line =~ s/\s*$//g;
        # Does line begin with a group specifier (ie. "group = ")?
        # If so, grab it and make it our current group.
        if ($line =~ s/^\s*(\S+)\s*=\s*//)
        {
            $group = $1;
        }
        # Does line have a semicolon?  Ignore it and everything
        # after.  Also, reset $group to <UNKNOWN> after this line.
        if ($line =~ s/;.*$//)
        {
            $last_line_in_group = 1;
        }
        # Split the rest of the line into a list of names
        # and make the name-to-group and group-to-name 
        # association.
        foreach my $name (split /\s+/, $line)
        {
            push @{ $group_to_names{ $group } }, $name;
            push @{ $name_to_groups{ $name  } }, $group;
        }
        if ($last_line_in_group)
        {
            $group = "<UNKNOWN>";
        }
        $last_line_in_group = 0;
    }
    close $fh;
    return ( \%group_to_names, \%name_to_groups );  # references, so the two hashes stay separate
}

Here is a sample program that looks up a name in the group file and tells you which group(s) the name belongs to, if any:

# Example program that looks up the group(s) associated with a name.  
# Usage:
# 
#   ./lookup_name group_file name
if ($#ARGV != 1)
{
    die "Usage: lookup_name group_file name\n";
}
my ( $file, $name ) = @ARGV;
my ($group_to_names, $name_to_groups) = parse_name_groups( $file );
my $groups = $name_to_groups->{ $name };
if (!defined $groups)
{
    print "$name does not belong to any groups\n";
} else
{
    print join("\n", @$groups), "\n";
}

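Since the group file format was not fully specified, I made some judgment calls in the parser. Specifically, if it sees something that looks like a name before it has seen any "group =" specifier, it assigns that name to the group <UNKNOWN>. Likewise, once it has seen a semicolon, any names it sees from the following line onward, but before the next "group =", are assigned to the group <UNKNOWN>.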

The code also treats a semicolon as an "end of line" indicator. Anything after a semicolon on the same line is ignored.

There should be enough comments in the code above for you to change these behaviors to suit your application.
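To make those judgment calls concrete, here is a condensed sketch of the same rules applied to a list of lines rather than a file. `parse_lines` and the sample lines are hypothetical and exist only for this illustration; the sketch restates the rules described above and is not a replacement for the parser.

```perl
use strict;
use warnings;

# Hypothetical, condensed restatement of the parsing rules described above,
# working on a list of lines instead of a file.
sub parse_lines {
    my @lines = @_;
    my (%group_to_names, %name_to_groups);
    my $group = "<UNKNOWN>";
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;                    # trim both ends
        $group = $1 if $line =~ s/^(\S+)\s*=\s*//;  # "group =" starts a group
        my $ends_group = ($line =~ s/;.*$//);       # ";" ends it; the rest is ignored
        for my $name (split ' ', $line) {
            push @{ $group_to_names{$group} }, $name;
            push @{ $name_to_groups{$name} }, $group;
        }
        $group = "<UNKNOWN>" if $ends_group;
    }
    return (\%group_to_names, \%name_to_groups);
}

my ($g2n, $n2g) = parse_lines(
    "stray",                        # before any "group =": goes to <UNKNOWN>
    "group1 = john dave; ignored",  # text after ";" is dropped
    "orphan",                       # after ";" and before the next "group ="
    "group2 = abc",
    "      john;",                  # continuation line of group2
);

for my $name (sort keys %$n2g) {
    print "$name: @{ $n2g->{$name} }\n";
}
```

A name that appears in several groups, such as john here, ends up with one entry per group in the name-to-groups map.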
