awk: capture an element up to a space or a special escape character

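I am trying to extract the 5th element (splitting on -) from $1, up to a space or a \. If a / is used instead, the script awk -F'[-/]' 'NR==0{print; next} {print $0"\t""\t"$5}' file works as expected. Thank you :).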

file --tab-delimited--

00-0000-L-F-Male    \pathto   xxx xxx
00-0001-L-F-Female  \pathto   xxx xxx

desired (two tabs before the last field)

00-0000-L-F-Male    \pathto   xxx xxx         Male
00-0001-L-F-Female  \pathto   xxx xxx         Female

awk

awk -F'-[[:space:]][[:space:]]+' 'NR==0{print; next} {print $0"\t""\t"$5}' file
00-0000-L-F-Male        \pathto       xxx     xxx
00-0001-L-F-Female      \pathto       xxx     xxx

awk 2

awk -F'[-\]' 'NR==0{print; next} {print $0"\t""\t"$5}' file
awk: fatal: Unmatched [ or [^: /[-]/

Using any awk:

$ awk -F'[-\t]' -v OFS='\t\t' '{print $0, $5}' file
00-0000-L-F-Male    \pathto   xxx xxx     Male
00-0001-L-F-Female  \pathto   xxx xxx     Female

Regarding your scripts:

awk
awk -F'-[[:space:]][[:space:]]+' 'NR==0{print; next} {print $0"\t""\t"$5}' file

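-F'-[[:space:]][[:space:]]+' says that your fields are separated by a - followed by two or more whitespace characters, but they are not.
NR==0{print; next} says "for line number 0, do print; next", but no input ever has a line number 0, because NR starts at 1.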
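The point about NR can be checked with a small sketch. The function name nr_zero_vs_one and the two-line input are made up for illustration; only the NR==0 condition comes from the question.

```shell
# Hypothetical helper: shows which of the two conditions ever fires.
nr_zero_vs_one() {
  awk 'NR==0 { print "NR==0 matched: " $0 }   # never runs, NR starts at 1
       NR==1 { print "NR==1 matched: " $0 }'
}

# Illustrative input, not the file from the question.
printf 'first\nsecond\n' | nr_zero_vs_one
```

If the intent was to pass a header line through untouched, NR==1 is the condition to use.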

awk 2
awk -F'[-\]' 'NR==0{print; next} {print $0"\t""\t"$5}' file

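-F'[-\]' appears to be trying to set FS to a minus sign or a backslash, but you told us your fields are tab-separated, not backslash-separated.
When FS is set this way, it goes through several stages of interpretation: the shell string is converted to an awk string, the awk string is converted to a regexp, and the regexp is used as the field separator. You therefore need several layers of escaping (not just 1) to produce a literal backslash. If in doubt, keep adding backslashes until the warnings and errors go away.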
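As a rough sketch of those layers, the following splits on a minus sign or a literal backslash. The helper name fifth_dash_or_backslash and the one-line input are invented for the example, and the count of four backslashes is an assumption that holds for awks that process escape sequences in the -F value (gawk and mawk do); other implementations may need a different number.

```shell
# Hypothetical helper: print the 5th field when splitting on "-" or "\".
fifth_dash_or_backslash() {
  # Layer 1 (shell): single quotes pass [-\\\\] to awk unchanged.
  # Layer 2 (awk string): escape processing turns \\\\ into \\.
  # Layer 3 (regexp): inside the brackets, \\ is one literal backslash.
  awk -F'[-\\\\]' '{ print $5 }'
}

# Illustrative input: the printf format \\ emits a single backslash.
printf '00-0000-L-F-Male\\pathto\n' | fifth_dash_or_backslash
```

With only one backslash, as in -F'[-\]', the regexp layer sees \] as an escaped bracket and the bracket expression is never closed, which matches the "Unmatched [" error in the question.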

You can use this awk:

awk -F'\t' '{n=split($1, a, /-/); print $0 FS FS a[(n > 4 ? 5 : n)]}' file
00-0000-L-F-Male        \pathto       xxx xxx         Male
00-0001-L-F-Female      \pathto       xxx xxx         Female

The a[(n > 4 ? 5 : n)] expression takes the 5th element of the array if it has 5 or more elements; otherwise it takes the last element.
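To see the fallback in action, here is a minimal sketch that prints only the selected element. The helper name fifth_or_last and the second input line (with just three elements) are made up for illustration.

```shell
# Hypothetical helper: print the 5th "-"-separated element of $1,
# or the last element when there are fewer than 5.
fifth_or_last() {
  awk '{ n = split($1, a, /-/); print a[(n > 4 ? 5 : n)] }'
}

# First line has 5 elements, second line has only 3.
printf '00-0000-L-F-Male\n00-0001-L\n' | fifth_or_last
```

The first line yields Male (element 5) and the second yields L (element 3, the last one).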

Assuming your file is '\t'-delimited, with a single tab between fields, and you want an empty field before the Male/Female output, you can use:

awk -F"\t" '{ split($1,arr,"-"); print $0 "\t\t" arr[5] }' filetabs.txt

Example Use/Output

With your sample data, using tab field separators, in filetabs.txt, you would get:

$ awk -F"\t" '{ split($1,arr,"-"); print $0 "\t\t" arr[5] }' filetabs.txt
00-0000-L-F-Male        \pathto       xxx xxx         Male
00-0001-L-F-Female      \pathto       xxx xxx         Female

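With a perl one-liner, which supports lazy matching, we can try the following code. Written and tested only with the shown samples.

perl -pe 's/^((?:.*?-)+)([^[:space:]]+)([[:space:]]+.*)$/\1\2\3\t\t\2/'  Input_file

The above can also be written as:

perl -pe 's/^((?:.*?-)+)(\S+)(\s+.*)$/\1\2\3\t\t\2/' Input_file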

Explanation: a detailed breakdown of the regex used above.

^(                 ## Start of the line; open the 1st capturing group.
  (?:              ## Open a non-capturing group.
    .*?-           ## Lazily match up to the next "-".
  )+               ## Close the non-capturing group; repeat it 1 or more times.
)                  ## Close the 1st capturing group.
([^[:space:]]+)    ## 2nd capturing group: one or more non-whitespace characters.
([[:space:]]+.*)$  ## 3rd capturing group: whitespace, then everything to the end of the line.
