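>The title says it all. I'm trying to build a regex and failing miserably. The task is to return the first string in a comma-delimited list that does not match a "forbidden" constant string. The "forbidden" string can appear anywhere in the list and (in theory) can appear more than once.
For example (with the "forbidden" string = "TBD"):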
"TBD,Smith" --> need to return Smith
"TBD,TBD,TBD,Jones,Edwards" --> need to return Jones
"ABC,TBD,Smith" --> need to return ABC
"TBD,DEF-9gh,GHI,JKLMNOpqrst,Any old string" --> need to return DEF-9gh
Any regex ninjas out there who know how to do this?
Using grep -P:
s="ABC,TBD,Smith"
echo "$s"|grep -oP '(^|,)\K(?!TBD)[^,]+'|head -1
ABC
s="TBD,TBD,TBD,Jones,Edwards"
echo "$s"|grep -oP '(^|,)\K(?!TBD)[^,]+'|head -1
Jones
s="TBD,DEF-9gh,GHI,JKLMNOpqrst,Any old string"
echo "$s"|grep -oP '(^|,)\K(?!TBD)[^,]+'|head -1
DEF-9gh
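One thing to watch with the pattern above: (?!TBD) rejects any token that merely starts with TBD, so a name such as TBDX would be skipped as well. A minimal sketch of a stricter variant, assuming GNU grep built with PCRE support; first_not_tbd is a made-up helper name, not part of the answer above:

```shell
# Hypothetical helper: print the first token that is not exactly "TBD".
# Assumes GNU grep with -P (PCRE) support.
first_not_tbd() {
    # (^|,)\K       start of a token; \K drops the comma from the reported match
    # (?!TBD(,|$))  reject only a token that is exactly TBD
    # [^,]+         the token itself
    printf '%s\n' "$1" | grep -oP '(^|,)\K(?!TBD(,|$))[^,]+' | head -n 1
}

first_not_tbd "TBD,TBDX,Smith"   # prints TBDX
```

With the original lookahead the same input would give Smith, because TBDX is rejected by its prefix.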
If your grep doesn't support -P, here is an awk solution:
echo "$s" | awk -F '(TBD,)*|,' '{print $1$2; exit}'
DEF-9gh
Did I understand your question correctly?
awk:
$ awk -F',' '{for(i=1;i<=NF;i++){if($i!="TBD"){print $i;next}}}' input.txt
Smith
Jones
ABC
DEF-9gh
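If the forbidden string is not always TBD, the same loop can take it as a parameter. This is only a sketch: first_allowed_awk and the awk variable bad are made-up names. Note that awk -v interprets backslash escapes in the value, so a forbidden string containing backslashes would need extra care.

```shell
# Hypothetical wrapper: $1 = forbidden string, $2 = comma-separated list.
first_allowed_awk() {
    printf '%s\n' "$2" | awk -F',' -v bad="$1" '{
        for (i = 1; i <= NF; i++) {
            # Appending "" forces a string comparison; without it awk may
            # compare numeric-looking values such as "1" and "1.0" as numbers.
            if (($i "") != (bad "")) { print $i; next }
        }
    }'
}

first_allowed_awk TBD "TBD,TBD,TBD,Jones,Edwards"   # prints Jones
```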
A POSIX-compliant shell solution:
$ cat t.sh
#!/bin/sh
while read -r line; do
    IFS=,
    for token in ${line}; do
        if [ "${token}" != TBD ]; then
            echo "${token}"
            continue 2
        fi
    done
done <<EOT
done <<EOT
TBD,Smith
TBD,TBD,TBD,Jones,Edwards
ABC,TBD,Smith
TBD,DEF-9gh,GHI,JKLMNOpqrst,Any old string
EOT
$ ./t.sh
Smith
Jones
ABC
DEF-9gh
Or just
get_token()
(
    IFS=,
    for t in $@; do
        [ "$t" != TBD ] && echo "$t" && break
    done
)
get_token "TBD,TBD,TBD,Jones,Edwards" # => "Jones"
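A slightly more defensive take on the same idea, as a sketch only: first_allowed is a made-up name, the forbidden string is passed as the first argument, and globbing is switched off so that a token like * is not expanded into file names. The body runs in a subshell, as in get_token, so the IFS and set -f changes do not leak into the caller.

```shell
# Hypothetical generalisation of get_token: $1 = forbidden string, $2 = list.
first_allowed()
(
    forbidden=$1
    IFS=,
    set -f                      # no pathname expansion while splitting $2
    for t in $2; do
        if [ "$t" != "$forbidden" ]; then
            printf '%s\n' "$t"
            exit 0              # exits the subshell only: token found
        fi
    done
    exit 1                      # every token was forbidden, or the list was empty
)

first_allowed TBD "TBD,DEF-9gh,GHI,JKLMNOpqrst,Any old string"   # prints DEF-9gh
```

The non-zero exit status lets a caller tell "nothing usable in the list" apart from a real token, e.g. first_allowed TBD "$s" || echo "no name found".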