How to remove unwanted characters from a file (using a shell script)



I have a file that looks like this (file.txt):

AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL","component  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL","component  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL","component  MMP-FileService  nipun.dith@wt.com     CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR",               MMP-FileService  gbhasrajkn@vir.com  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER",             MMP-FileService  nipun.dith@wt.com      CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR",               MMP-FileService  nipun.dith@wt.com      CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER",             MMP-FileService  nipun.dith@wt.com   BUG
AYODwmuBknsgv2StRqkr  MINOR",               MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR",               MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR",               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR",               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR",               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR",               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR",               MMP-component    sam.curren@wt.com   CODE_SMELL

I have to remove the unwanted characters ","component from the second column.

The expected output is:

AYOnanl3knsgv2StRr44  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL  MMP-FileService  nipun.dith@wt.com    CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR     MMP-FileService  gbhasrajkn@vir.com  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER   MMP-FileService  nipun.dith@wt.com    CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR     MMP-FileService  nipun.dith@wt.com    CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER   MMP-FileService  nipun.dith@wt.com    BUG
AYODwmuBknsgv2StRqkr  MINOR     MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR     MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR     MMP-component  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR     MMP-component  sam.curren@wt.com   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR     MMP-component  sam.curren@wt.com   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR     MMP-component  sam.curren@wt.com   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR     MMP-component  sam.curren@wt.com   CODE_SMELL

This is what I tried:

cat file.txt | tr -d '",' | sed 's/component//'

And this is the output I get:

AYOnanl3knsgv2StRr44  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL  MMP-FileService  nipun.dith@wt.com     CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR               MMP-FileService  gbhasrajkn@vir.com  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER             MMP-FileService  nipun.dith@wt.com      CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR               MMP-FileService  nipun.dith@wt.com      CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER             MMP-FileService  nipun.dith@wt.com   BUG
AYODwmuBknsgv2StRqkr  MINOR               MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR               MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR               MMP-    sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR               MMP-    sam.curren@wt.com   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR               MMP-    sam.curren@wt.com   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR               MMP-    sam.curren@wt.com   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR               MMP-    sam.curren@wt.com   CODE_SMELL

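The shell command I ran is applied to the other columns too (in this example it also hits the third column, turning MMP-component into MMP-), and that is my problem. Is there a way to apply it to the second column only?

Can someone help me figure this out? Thanks in advance!

Note: I am not allowed to use jq or other scripting languages such as JavaScript, Python, etc.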

This can be done with a single sub:

awk '{sub(/"[^[:blank:]]*$/, "", $2)} 1' file | column -t
AYOnanl3knsgv2StRr44  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL  MMP-FileService  nipun.dith@wt.com    CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR     MMP-FileService  gbhasrajkn@vir.com   CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER   MMP-FileService  nipun.dith@wt.com    CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR     MMP-FileService  nipun.dith@wt.com    CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER   MMP-FileService  nipun.dith@wt.com    BUG
AYODwmuBknsgv2StRqkr  MINOR     MMP-FileService  sam.curren@wt.com    CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR     MMP-FileService  sam.curren@wt.com    CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR     MMP-component    sam.curren@wt.com    CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR     MMP-component    sam.curren@wt.com    CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR     MMP-component    sam.curren@wt.com    CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR     MMP-component    sam.curren@wt.com    CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR     MMP-component    sam.curren@wt.com    CODE_SMELL

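Here:

  • "[^[:blank:]]*$: matches the text in the input ($2) from the first " through the end of the field, and sub replaces it with an empty string
  • column -t is only there for tabular output; drop it if you do not need it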
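One detail worth seeing in isolation: assigning to $2 makes awk rebuild the record with the output field separator (a single space by default), which is why column -t is used to realign the result. The sketch below shows this on two sample lines. The function name strip_second_field is made up for illustration, and the regex is shortened to /".*/ on the assumption that a field never contains blanks.

```shell
# Hypothetical helper: delete everything from the first double quote in field 2.
# A field holds no blanks, so /".*/ is sufficient inside $2.
strip_second_field() {
  awk '{ sub(/".*/, "", $2) } 1'
}

# Two sample records from the question, covering both shapes of the second column.
printf '%s\n' \
  'AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService  James.curren@wt.com  CODE_SMELL' \
  'AYODwmuBknsgv2StRqks  MINOR",               MMP-component    sam.curren@wt.com   CODE_SMELL' |
  strip_second_field
```

The third column keeps its MMP-component value because only $2 is touched; the runs of spaces collapse to one space each until column -t pads them again.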

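If you are sure that the characters to remove occur only in the second column (borrowed and adapted from https://unix.stackexchange.com/questions/492500/awk-replace-one-character-only-in-a-certain-column):

awk '{gsub("\",(\"component)?", "", $2)} 1' file.txt
  • gsub("\",(\"component)?", "", $2): for each input line, replaces every match of ",("component)? in the second field with nothing. This is a regular expression that means: find ",, then optionally the part in parentheses, "component; ? is the "optional" operator
  • 1 is an awk idiom that prints the contents of $0 (the current record)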
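The same substitution can also be written with a regex literal, which avoids escaping the double quotes inside an awk string. This is only a sketch of that variant, not part of the original answer; the function name remove_marker and the shortened sample lines are made up for illustration.

```shell
# Hypothetical helper: remove ", optionally followed by "component, from field 2 only.
remove_marker() {
  awk '{ gsub(/",("component)?/, "", $2) } 1'
}

# Shortened sample records; the third field must come through unchanged.
printf '%s\n' \
  'ID1 CRITICAL","component MMP-component' \
  'ID2 MAJOR", MMP-component' |
  remove_marker
```

Both lines keep MMP-component in the third field, since the pattern is only applied to $2.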

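First solution: Assuming your string ","component always occurs at the same position, try the following awk code. It overwrites the matched text with the same number of spaces, so the spacing between fields is preserved, as in the samples shown.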

awk '
match($0,/","*[^[:space:]]*/){
print substr($0,1,RSTART-1) sprintf("%-"(RLENGTH) "s",OFS) substr($0,RSTART+RLENGTH)
next
}
1
'  Input_file
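To see why this keeps the columns lined up, the sketch below applies the same idea to one sample line and compares the line length before and after. It is a simplified illustration, not the answer's exact code: the regex is reduced to "a double quote plus the non-blank characters after it", and the variable names line and padded are made up.

```shell
line='AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService  James.curren@wt.com  CODE_SMELL'

# Replace the quote-led token with RLENGTH spaces instead of deleting it,
# so every later column stays at the same offset.
padded=$(printf '%s\n' "$line" | awk '
match($0, /"[^ \t]*/) {
  print substr($0, 1, RSTART - 1) sprintf("%" RLENGTH "s", "") substr($0, RSTART + RLENGTH)
  next
}
1')

printf '%s\n' "$padded"
printf 'before=%d after=%d\n' "${#line}" "${#padded}"
```

The two lengths are equal because the matched text is overwritten rather than removed, which is the trade-off against the shorter sub-based answers: alignment is kept, but the second column is left with trailing padding.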

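Second solution: With GNU awk, using the match function with a regex (the third argument to match, the array arr, is a gawk extension). The regex assumes that your string occurs only in the second field, as in the samples shown. It also handles the whitespace in Input_file and preserves it in the output.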

awk '
match($0,/^([^[:space:]]+[[:space:]]+)([^"]*)(","*[^[:space:]]*)(.*$)/,arr){
print arr[1] arr[2] sprintf("%-"length(arr[3]) "s",OFS) arr[4]
next
}
1
'  Input_file

You can use GNU sed for this task. Let the contents of file.txt be

AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL","component  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL","component  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL","component  MMP-FileService  nipun.dith@wt.com     CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR",               MMP-FileService  gbhasrajkn@vir.com  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER",             MMP-FileService  nipun.dith@wt.com      CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR",               MMP-FileService  nipun.dith@wt.com      CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER",             MMP-FileService  nipun.dith@wt.com   BUG
AYODwmuBknsgv2StRqkr  MINOR",               MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR",               MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR",               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR",               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR",               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR",               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR",               MMP-component    sam.curren@wt.com   CODE_SMELL

then

sed 's/"[^[:space:]]*//' file.txt

gives the output

AYOnanl3knsgv2StRr44  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr45  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr46  CRITICAL  MMP-FileService  James.curren@wt.com  CODE_SMELL
AYOnanl3knsgv2StRr47  CRITICAL  MMP-FileService  nipun.dith@wt.com     CODE_SMELL
AYOnanmeknsgv2StRr48  MAJOR               MMP-FileService  gbhasrajkn@vir.com  CODE_SMELL
AYOnanm-knsgv2StRr4-  BLOCKER             MMP-FileService  nipun.dith@wt.com      CODE_SMELL
AYOnanm6knsgv2StRr49  MAJOR               MMP-FileService  nipun.dith@wt.com      CODE_SMELL
AYOnannKknsgv2StRr4_  BLOCKER             MMP-FileService  nipun.dith@wt.com   BUG
AYODwmuBknsgv2StRqkr  MINOR               MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqkt  MINOR               MMP-FileService  sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqks  MINOR               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODwmuBknsgv2StRqku  MINOR               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODsI7Fknsgv2StRqac  MAJOR               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODsI7Nknsgv2StRqad  MAJOR               MMP-component    sam.curren@wt.com   CODE_SMELL
AYODsI-Qknsgv2StRqai  MAJOR               MMP-component    sam.curren@wt.com   CODE_SMELL

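Explanation: replace the first " and all the non-whitespace characters that follow it with an empty string, i.e. delete them. Assumptions: " occurs only in the second column, and column alignment does not need to be preserved.

(tested in GNU sed 4.2.2)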
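If the assumption that " occurs only in the second column is too strong, the substitution can be anchored to the second field explicitly. The following is a sketch under stated assumptions rather than part of the tested answer: it assumes lines do not start with whitespace, and the function name strip_quote_in_col2 is made up for illustration.

```shell
# Hypothetical helper: remove a quote-led token only when it sits in field 2.
# \( ... \) captures field 1, the separating whitespace, and the part of
# field 2 before the quote; everything from the quote to the next blank is dropped.
strip_quote_in_col2() {
  sed 's/^\([^[:space:]]*[[:space:]]\{1,\}[^"[:space:]]*\)"[^[:space:]]*/\1/'
}

printf '%s\n' \
  'AYOnanl3knsgv2StRr44  CRITICAL","component  MMP-FileService' \
  'ID2  MINOR  MMP-"x"' |
  strip_quote_in_col2
```

The second line has a quote only in its third field, so the anchored pattern does not match and the line is printed unchanged.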
