How to remove line breaks inside rows of 10 comma-separated columns



I need to remove the line breaks inside each row, where every row has 10 columns separated by commas. This is the input:

EXP_TRANSF_DE_PARA,N/A,Input,1,1,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow
... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA,N/A,Input,1,2,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow
... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS,N/A,Input,1,3,1,04/30/2020 19:52:20,1588287140,8361,Error loading row to target table [PREST_TELEFONE_HIS]. Error message [
FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column ""*N"" is not allowed.  SQLSTATE=23502
],2,N/A,N/A,N/A,-1,-1,N/A

This should be the output:

EXP_TRANSF_DE_PARA,N/A,Input,1,1,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA,N/A,Input,1,2,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS,N/A,Input,1,3,1,04/30/2020 19:52:20,1588287140,8361,Error loading row to target table [PREST_TELEFONE_HIS]. Error message [FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column ""*N"" is not allowed.  SQLSTATE=23502],2,N/A,N/A,N/A,-1,-1,N/A

So far I have tried this awk command:

awk -F"," 'NF=10{printf("%s",$0);getline;print;next}1'

Output:

EXP_TRANSF_DE_PARA N/A Input 1 1 1 04/30/2020 19:52:20 1588287140 11131 Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA N/A Input 1 2 1 04/30/2020 19:52:20 1588287140 11131 Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS N/A Input 1 3 1 04/30/2020 19:52:20 1588287140 8361 Error loading row to target table [PREST_TELEFONE_HIS]. Error message [FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column *N is not allowed.  SQLSTATE=23502
] 2 N/A N/A N/A -1 -1 N/A  ] 2 N/A N/A N/A -1 -1 N/A

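I don't know why the command removes the comma delimiters from the rows. I know that line 6 doesn't have 10 columns, which is why its line break isn't removed... Any suggestions?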
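
On why the commas vanish: in awk, `NF=10` is an assignment, not a comparison. Assigning to NF makes awk rebuild $0 from its fields, joined with the output field separator OFS, which defaults to a single space. The comparison operator is `==`. A minimal sketch of the difference, using a made-up three-field line rather than the real data:

```shell
# Made-up sample line (not taken from the original data).
line='a,b,c'

# Assignment: NF is set to 3, so awk rebuilds $0 using OFS (a space by default).
assigned=$(printf '%s\n' "$line" | awk -F',' 'NF=3')

# Comparison: NF is only tested, so $0 is printed unchanged.
compared=$(printf '%s\n' "$line" | awk -F',' 'NF==3')

printf '%s\n' "$assigned"   # a b c
printf '%s\n' "$compared"   # a,b,c
```

This also matches the last output line of the attempt: the final input line has fewer than 10 fields, so the assignment pads it with empty fields, and the failed getline at end of input leaves it to be printed twice.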

Try this:

awk -F","  '{OFS=",";  if ($3 != "Input") {printf "%s", $0} else {printf "\n%s" ,$0}}' |sed '1d'  | sed  -e '$a\'

Demo:

$ cat file.txt
EXP_TRANSF_DE_PARA,N/A,Input,1,1,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow
... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA,N/A,Input,1,2,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow
... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS,N/A,Input,1,3,1,04/30/2020 19:52:20,1588287140,8361,Error loading row to target table [PREST_TELEFONE_HIS]. Error message [
FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column ""*N"" is not allowed.  SQLSTATE=23502
],2,N/A,N/A,N/A,-1,-1,N/A
$ awk -F","  '{OFS=",";  if ($3 != "Input") {printf "%s", $0} else {printf "\n%s" ,$0}}' file.txt  | sed '1d'  | sed  -e '$a\'
EXP_TRANSF_DE_PARA,N/A,Input,1,1,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6711630149',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
EXP_TRANSF_DE_PARA,N/A,Input,1,2,1,04/30/2020 19:52:20,1588287140,11131,Transformation [EXP_TRANSF_DE_PARA] had an error evaluating variable column [v_NUFONE]. Error message is [<<Expression Error>> [TO_INTEGER]: decimal operation overflow... i:TO_INTEGER(u:RTRIM(u:LTRIM(u:'6342311300',u:' ���'),u:' ���'),i:0)].,3,N/A,N/A,N/A,-1,-1,N/A
PREST_TELEFONE_HIS,N/A,Input,1,3,1,04/30/2020 19:52:20,1588287140,8361,Error loading row to target table [PREST_TELEFONE_HIS]. Error message [FnName: Execute -- [IBM][CLI Driver][DB2] SQL0407N  Assignment of a NULL value to a NOT NULL column ""*N"" is not allowed.  SQLSTATE=23502],2,N/A,N/A,N/A,-1,-1,N/A
$

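Explanation:

awk -F"," <-- sets the field separator to ,

'{OFS=","; <-- sets the output field separator to , since we will use printf to format the text

if ($3 != "Input") {printf "%s", $0} <-- if the third column of the current line is not "Input", print the line as is. Note that no newline is added, so the record is not terminated.

else {printf "\n%s" ,$0}}' <-- otherwise the current line starts a new record, so a newline \n is printed before the line

sed '1d' <-- deletes the first line of the output. It is an empty line, because the first record has "Input" in its third column and therefore gets a newline printed before it.

sed -e '$a\' <-- adds the missing newline at the end of the output.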
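
The same idea can also be written as a single awk process, without the two sed calls. This is only a sketch under the same assumption as above: a line starts a new record exactly when its third comma-separated field is "Input". The function name `join_records` is made up for this example.

```shell
# join_records: hypothetical helper; reads stdin, writes the joined records to stdout.
join_records() {
    awk -F',' '
        # A line whose 3rd field is "Input" starts a new record:
        # end the previous record first, except before the very first line.
        $3 == "Input" && NR > 1 { printf "\n" }
        # Print every line without a trailing newline, so continuation lines are glued on.
        { printf "%s", $0 }
        # Terminate the last record, if there was any input at all.
        END { if (NR > 0) printf "\n" }
    '
}
```

It would be called as `join_records < file.txt`. Like the pipeline above, it breaks if a continuation line happens to have "Input" as its third comma-separated field.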

Here is a Bash script that solves your problem:

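#!/bin/bash
set -o errexit
set -o nounset
# expected number of comma-separated fields in a complete record
fieldCount=20
# filter out newlines which are not record separators
fieldNum=1
while read -N1 -r ch; do
    if [ "$ch" = "," ]; then
        fieldNum="$((fieldNum + 1))"
    elif [ "$ch" = $'\n' ] && [ "$fieldNum" = "$fieldCount" ]; then
        # newline after the last field: this one really ends the record
        fieldNum=1
    fi
    if [ "$ch" != $'\n' ] || [ "$fieldNum" = 1 ]; then
        # use a format string so that a literal % in the data is printed as is
        printf '%s' "$ch"
    fi
done
printf '\n'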

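The -N1 option reads one character at a time (rather than one line at a time), and the -r option makes read treat backslashes as ordinary characters.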

The problem can also be solved with a simple C program of similar size:

#include <stdio.h>

int main(void)
{
    /* expected number of comma-separated fields in a complete record */
    const int fieldCount = 20;
    int fieldNum, ch;

    /* filter out newlines which are not record separators */
    fieldNum = 1;
    ch = getchar();
    while (ch != EOF) {
        if (ch == ',') {
            fieldNum++;
        } else if ((ch == '\n') && (fieldNum == fieldCount)) {
            /* newline after the last field: this one really ends the record */
            fieldNum = 1;
        }
        if ((ch != '\n') || (fieldNum == 1)) {
            putchar(ch);
        }
        ch = getchar();
    }
    putchar('\n');
    return 0;
}
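
The character-by-character logic of both programs can also be approximated line by line in awk: keep appending input lines to a buffer while counting commas, and only emit the buffer once it holds the expected number of fields. This is a sketch, not a drop-in replacement. The function name `join_by_field_count` and its argument are made up here, and like the programs above it assumes that every complete record contains the same number of commas.

```shell
# join_by_field_count N: hypothetical helper; joins physical lines from stdin
# until the buffered record holds at least N comma-separated fields.
join_by_field_count() {
    awk -v want="$1" '
        {
            buf = buf $0                  # glue this line onto the pending record
            commas += gsub(/,/, "&")      # gsub returns how many commas the line has
            if (commas + 1 >= want) {     # number of fields = commas + 1
                print buf
                buf = ""; commas = 0
            }
        }
        # Flush an incomplete trailing record rather than dropping it.
        END { if (buf != "") print buf }
    '
}
```

It would be called as `join_by_field_count 20 < file.txt`. Note that a fixed count is fragile for this data: the message column itself contains commas in the first two sample records (19 commas each) but not in the third (16 commas), which only comes out right here because it is the last record.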
