Exclude a word if a condition is met

I have this text file, and I want to exclude the word "access" because a is followed by a, b, or c in the second, third, or fourth position.

# cat tt.txt
access
ample
taxing

I tried this, but it returns all 3 words.

# grep '[a-c][^a-c][^a-c][^a-c]' tt.txt
access
ample
taxing

Update 1:

The example above was simplified. Here is a fuller one:

# cat tt.txt
access
bccess
ample
taxing
tacking
not
# grep -Ev '[a-c].{0,2}[a-c]' tt.txt
ample
taxing
not
# grep -E '[a-c].{0,2}[^a-c]' tt.txt
access
bccess
ample
taxing
tacking
# Expected
ample
taxing

I want to exclude the word access because a is followed by a, b, or c in the second, third, or fourth position.

You can use this awk:

awk '/[a-c]/ && !/[a-c].{0,2}[a-c]/' file
ample
taxing

Regex breakdown:

[a-c]: matches a, b, or c
.{0,2}: matches 0 to 2 of any character
[a-c]: matches a, b, or c
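
To see the breakdown in action, here is a small sketch that prints only the part of each line the exclusion pattern matches. It assumes a grep that supports -o (GNU and BSD grep both do) and writes the question's sample words to a temporary file; the variable names are my own.

```shell
# Sample words from the question, written to a temporary file.
words=$(mktemp)
printf '%s\n' access bccess ample taxing tacking not > "$words"

# -o prints only the matched text, -n prefixes the line number.
# Lines without two a/b/c letters within three positions print nothing.
matches=$(grep -noE '[a-c].{0,2}[a-c]' "$words")
printf '%s\n' "$matches"

rm -f "$words"
```

On the sample file this prints 1:acc, 2:bcc, and 5:ac, which are exactly the three words the negated pattern drops.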

Or use lookarounds in GNU grep:

grep -P '^(?=.*[a-c])(?!.*[a-c].{0,2}[a-c])' file
ample
taxing

The same solution in perl:

perl -ne 'print if /[a-c]/ && !/[a-c].{0,2}[a-c]/' file
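
If neither PCRE support nor an awk with interval expressions is available, the same two conditions can probably be chained as two plain grep calls. This is a sketch rather than part of the answer above; it assumes only the POSIX -E and -v options and uses the question's sample words.

```shell
# Sample words from the question, written to a temporary file.
words=$(mktemp)
printf '%s\n' access bccess ample taxing tacking not > "$words"

# The first grep keeps lines containing a, b, or c.
# The second grep (-v) drops lines where two such letters
# lie within three positions of each other.
kept=$(grep '[a-c]' "$words" | grep -Ev '[a-c].{0,2}[a-c]')
printf '%s\n' "$kept"

rm -f "$words"
```

This prints ample and taxing, matching the expected output in the question.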

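As I understand it, your conditions are:

The string must contain one of a, b, or c
Within positions 0-3, no a, b, or c may be immediately preceded by another a, b, or c

So why not write the code that way?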

use strict;
use warnings;
while (<DATA>) {
    next unless /[a-c]/;                            # skip if no abc
    next if substr($_, 0, 4) =~ /(?<=[a-c])[a-c]/;  # skip if an abc is preceded by an abc
    print;                                          # otherwise print
}
__DATA__
access
bccess
ample
taxing
tacking
not

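This way of writing the code simulates the diamond operator <> that the -n and -p switches, common in Perl one-liners, use. The DATA filehandle stands in for a file. As a one-liner, it would look like this: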

$ perl -ne' next unless /[a-c]/; next if substr($_, 0, 4) =~ /(?<=[a-c])[a-c]/; print; ' file.txt

I tested it on your sample word list, and it seems to work as expected.
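
Note that this check is narrower than the [a-c].{0,2}[a-c] pattern: it only looks at adjacent letters, and only within the first four characters. The sketch below illustrates the difference. It uses an awk port of the Perl logic (my rewrite, not the answerer's code) and two hypothetical words, arch and feedback, that are not in the question's file.

```shell
# Hypothetical test words, not from the question.
words=$(mktemp)
printf '%s\n' arch feedback > "$words"

# Window-based check: drop lines where an a/b/c is followed
# within three positions by another a/b/c.
# "|| true" guards against grep's exit status 1 when nothing is kept.
window=$(grep '[a-c]' "$words" | grep -Ev '[a-c].{0,2}[a-c]' || true)

# awk port of the substr-based check: only adjacent a/b/c letters
# inside the first four characters cause a line to be skipped.
prefix=$(awk '/[a-c]/ && substr($0, 1, 4) !~ /[a-c][a-c]/' "$words")

echo "window check keeps:"
printf '%s\n' "$window"
echo "prefix check keeps:"
printf '%s\n' "$prefix"

rm -f "$words"
```

The window check keeps neither word, while the prefix check keeps both, so which one is right depends on how the original condition is meant to be read.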
