Adding extra strings based on the number of fields - Sed/Awk



I have data in the following format in a text file.

null,"ABC:MNO"
"hjgy","ABC:PQR"
"mn","qwe","ABC:WER"
"mn","qwe","mno","ABC:WER"

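Every line should end up with 4 fields (three leading fields plus the final "ABC:..." field), like the last line. I want the data in the following format.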

"","","","ABC:MNO"
"hjgy","","","ABC:PQR"
"mn","qwe","","ABC:WER"
"mn","qwe","mno","ABC:WER" 

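If a line starts with null, the null should be replaced with "","",""

If there are only 2 fields, "","", should be added after the first string.

If there are 3 fields, "", should be added after the second string.

If there are 4 fields, nothing should be done.

I was able to handle the first case with sed 's/null/"","",""/' test.txt

but I don't know how to handle the next two cases.

Regards.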

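With perl:

$ perl -pe 's/^null,/"","","",/; s/.*,\K/q("",) x (3 - tr|,||)/e' ip.txt
"","","","ABC:MNO"
"hjgy","","","ABC:PQR"
"mn","qwe","","ABC:WER"
"mn","qwe","mno","ABC:WER"
  • s/^null,/"","","",/ takes care of the null field first
  • .*,\K matches up to the last , in the line
    • \K drops everything matched so far from the text being replaced, so that portion does not have to be put back
  • 3 - tr|,|| tells you how many fields are missing (the return value of tr here is the number of occurrences of ,)
  • in q("",), q() represents a single-quoted string, so " does not need escaping
  • x is the string repetition operator
  • the e flag lets you use Perl code in the replacement section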

If lines starting with null, always have exactly two fields, you can also use:

perl -pe 's/.*,\K/q("",) x (3 - tr|,||)/e; s/^null,/"",/'

With awk, using similar logic:

awk -v q='"",' 'BEGIN{FS=OFS=","} {sub(/^null,/, q q q);
c=4-NF; while (c--) $NF = q $NF} 1'
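
The same idea can be wrapped in a small function with the target field count as a parameter. This is only a sketch: the name pad_fields, the variable n, and the > 0 guard are assumptions added here, not part of the answer. The guard keeps the loop from running when a line already has more than n fields, in which case c would start out negative.

```shell
# Hypothetical helper: pad each comma-separated line on stdin to n fields
# (n defaults to 4) by inserting "", in front of the last field.
pad_fields() {
    awk -v n="${1:-4}" -v q='"",' '
    BEGIN { FS = OFS = "," }
    {
        sub(/^null,/, q)     # a leading null becomes one empty field
        c = n - NF           # number of fields still missing
        while (c-- > 0)      # guard: do nothing when the line has n or more fields
            $NF = q $NF      # prepend "", to the last field; NF itself is unchanged
    }
    1'
}

printf '%s\n' 'null,"ABC:MNO"' '"mn","qwe","ABC:WER"' | pad_fields 4
```

Assigning to $NF rebuilds the line with OFS but does not re-split it, which is why NF stays fixed while the loop runs.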

Based only on the samples you have shown, please try the following.

awk '
BEGIN{
  FS=OFS=","
}
{
  sub(/^null/,"\"\",\"\",\"\"")
}
NF==2{
  $1=$1",\"\",\"\""
}
NF==3{
  $2=$2",\"\""
}
1' Input_file

Or, with "" held in a variable, you could also try the following:

awk -v s1="\"\"" '
BEGIN{
FS=OFS=","
}
{
sub(/^null/,s1 "," s1","s1)
}
NF==2{
$1=$1"," s1 "," s1
}
NF==3{
$2=$2"," s1
}
1'  Input_file

Explanation: a detailed explanation of the first command above.

awk '                            ##Start of the awk program.
BEGIN{                           ##Start of the BEGIN section.
  FS=OFS=","                     ##Set FS and OFS to a comma.
}
{
  sub(/^null/,"\"\",\"\",\"\"")  ##Replace a leading null with "","","" in the current line.
}
NF==2{                           ##If the line has 2 fields, do the following.
  $1=$1",\"\",\"\""              ##Append ,"","" to the 1st field.
}
NF==3{                           ##If the line has 3 fields, do the following.
  $2=$2",\"\""                   ##Append ,"" to the 2nd field.
}
1                                ##Print the current line.
' Input_file                     ##Name of the input file.

Another solution using awk:

awk -F "," 'BEGIN{ OFS=FS }
{ gsub(/^ /,"",$1)
if($1=="null") print "\x22\x22","\x22\x22","\x22\x22", $2
else if(NF==2) print $1,"\x22\x22","\x22\x22",$2
else if(NF==3) print $1,$2,"\x22\x22",$3
else print $0 }' input

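This might work for you (GNU sed):

sed 's/^\s*null,/"",/;:a;ta;s/,/&/3;t;s/.*,/&"",/;ta' file

If the line begins with null, replace that field with an empty field (i.e. "",).

Use ta to branch back to :a, which resets the substitution-success flag (this branch is only taken when the first field was null and has just been replaced).

If a third field separator exists, the line is complete.

Otherwise, insert an empty field before the last field and repeat.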
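
As a quick check of the loop described above, the sketch below pipes sample lines from the question through the same command. The wrapper name pad_with_sed is hypothetical, and the script assumes GNU sed, since \s and labels terminated by ; are GNU extensions.

```shell
# Hypothetical wrapper around the GNU sed command; reads lines on stdin.
pad_with_sed() {
    sed 's/^\s*null,/"",/;:a;ta;s/,/&/3;t;s/.*,/&"",/;ta'
}

printf '%s\n' 'null,"ABC:MNO"' '"hjgy","ABC:PQR"' '"mn","qwe","ABC:WER"' | pad_with_sed
# prints:
# "","","","ABC:MNO"
# "hjgy","","","ABC:PQR"
# "mn","qwe","","ABC:WER"
```

Each pass through the loop adds one "", after the last comma, and s/,/&/3 succeeds only once a third comma exists, at which point t ends processing of that line.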
