Awk: merge the results of three files into one file, separated by any character



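I use awk to replace the whitespace separator between the fields of each line so that the line becomes a single field. I want to merge the processed output of any number of files into one result file, with the per-file values separated by a space.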

awk -v OFS='^' '{for(i=1; i<=NF; i++)printf("%s%s", $i,(i==NF)?ORS:OFS)}' filename > outputFile

File1 after the awk command:

777^Brockton^Avenue,^Abington^MA^2351
30^Memorial^Drive,^Avon^MA^2322
250^Hartford^Avenue,^Bellingham^MA^2019
....

When awk is applied to file 2, the command has no effect, because each line has only one field.

madanm@comcast.net
skajan@verizon.net
barnett@hotmail.com
sbmrjbr@sbcglobal.net
mastinfo@sbcglobal.net
....

I tried merging the three files and then applying the awk command:

paste listOf* |  awk -v OFS='^' '{for(i=1; i<=NF; i++)printf("%s%s", $i,(i==NF)?ORS:OFS)}' > outputFile

But my result looks like this:

777^Brockton^Avenue,^Abington^MA^2351^madanm@comcast.net^Manual^Ordway
30^Memorial^Drive,^Avon^MA^2322^skajan@verizon.net^Yuonne^Cajigas
250^Hartford^Avenue,^Bellingham^MA^2019^barnett@hotmail.com^Pattie^Darsey
700^Oak^Street,^Brockton^MA^2301^sbmrjbr@sbcglobal.net^Cammie^Knoles
66-4^Parkhurst^Rd,^Chelmsford^MA^1824^mastinfo@sbcglobal.net^Evia^Fallen
591^Memorial^Dr,^Chicopee^MA^1020^carcus@aol.com^Soo^Sanfilippo

I want it to look like this:

Home Address[delimiter]Email[delimiter]Name[delimiter]
777^Brockton^Avenue,^Abington^MA^2351 madanm@comcast.net Manual^Ordway
30^Memorial^Drive,^Avon^MA^2322 skajan@verizon.net Yuonne^Cajigas
250^Hartford^Avenue,^Bellingham^MA^2019 barnett@hotmail.com Pattie^Darsey
700^Oak^Street,^Brockton^MA^2301 sbmrjbr@sbcglobal.net Cammie^Knoles
66-4^Parkhurst^Rd,^Chelmsford^MA^1824 mastinfo@sbcglobal.net Evia^Fallen
591^Memorial^Dr,^Chicopee^MA^1020 carcus@aol.com Soo^Sanfilippo

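Don't do what you're trying to do; you would really be mangling your files. In particular, ^ is a poor choice of character because it is a regexp metacharacter, so it makes any further processing much harder than it has to be. Why not keep the blank characters in the input (converting any tabs to blanks, if tabs can be present) and use a tab as the separator, or convert the whole thing to CSV?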
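To illustrate the metacharacter problem, here is a minimal sketch (the sample line and the variable names are made up for this illustration). An unescaped ^ in a sed pattern anchors to the start of the line, so every later tool has to remember to escape it:

```shell
# One line containing literal carets, like the question's intermediate files.
line='777^Brockton^Avenue'

# Unescaped, ^ matches the start of the line, so sed inserts a blank there
# and leaves the carets alone.
unescaped=$(printf '%s\n' "$line" | sed 's/^/ /g')

# Escaped, \^ matches the literal character.
escaped=$(printf '%s\n' "$line" | sed 's/\^/ /g')

printf '[%s]\n[%s]\n' "$unescaped" "$escaped"
```

The first result still contains every caret; only the second one turns them back into blanks.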

For example, given this input:

$ cat file1
777 Brockton Avenue, Abington MA 2351
30 Memorial Drive, Avon MA 2322
250 Hartford Avenue, Bellingham MA 2019
700 Oak Street, Brockton MA 2301
66-4 Parkhurst Rd, Chelmsford MA 1824
591 Memorial Dr, Chicopee MA 1020
$ cat file2
madanm@comcast.net
skajan@verizon.net
barnett@hotmail.com
sbmrjbr@sbcglobal.net
mastinfo@sbcglobal.net
carcus@aol.com
$ cat file3
Manual Ordway
Yuonne Cajigas
Pattie Darsey
Cammie Knoles
Evia Fallen
Soo Sanfilippo

you can produce TSV:

$ cat tst.awk
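BEGIN {
    OFS = "\t"                  # tab between the per-file columns
    ofmt = "%s%s"               # a value followed by OFS or ORS
    numFiles = ARGC - 1
}
FNR == 1 {
    fileNr++                    # count the input files as each one starts
}
{
    gsub(/[[:space:]]+/," ")    # squeeze runs of white space (tabs included) to one blank
    gsub(/^ | $/,"")            # trim leading and trailing blanks
    val[FNR,ARGIND] = $0        # ARGIND is gawk-specific; it equals fileNr here
}
fileNr == numFiles {
    # while reading the last file, print line FNR of every file
    for (i=1; i<=numFiles; i++) {
        printf ofmt, val[FNR,i], (i<numFiles ? OFS : ORS)
    }
}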
$ awk -f tst.awk file1 file2 file3
777 Brockton Avenue, Abington MA 2351   madanm@comcast.net      Manual Ordway
30 Memorial Drive, Avon MA 2322 skajan@verizon.net      Yuonne Cajigas
250 Hartford Avenue, Bellingham MA 2019 barnett@hotmail.com     Pattie Darsey
700 Oak Street, Brockton MA 2301        sbmrjbr@sbcglobal.net   Cammie Knoles
66-4 Parkhurst Rd, Chelmsford MA 1824   mastinfo@sbcglobal.net  Evia Fallen
591 Memorial Dr, Chicopee MA 1020       carcus@aol.com  Soo Sanfilippo
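As a rough sketch of why the tab-separated layout is easier to work with later (the two rows are copied from the sample data; the variable names are only for this illustration), a downstream command can split on tabs alone and the blanks inside the address and name stay inside their fields:

```shell
# Two rows in the TSV layout shown above: address, email, name.
tsv=$(printf '%s\t%s\t%s\n' \
  '777 Brockton Avenue, Abington MA 2351' 'madanm@comcast.net' 'Manual Ordway' \
  '30 Memorial Drive, Avon MA 2322' 'skajan@verizon.net' 'Yuonne Cajigas')

# Split on tabs only, so each line has exactly three fields.
counts=$(printf '%s\n' "$tsv" | awk -F'\t' '{ print NF }')

# The email is simply the second field.
emails=$(printf '%s\n' "$tsv" | awk -F'\t' '{ print $2 }')

printf '%s\n' "$counts" "$emails"
```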

or CSV (only the values of OFS and ofmt change):

$ cat tst.awk
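BEGIN {
    OFS = ","                   # comma between the per-file columns
    ofmt = "\"%s\"%s"           # a double-quoted value followed by OFS or ORS
    numFiles = ARGC - 1
}
FNR == 1 {
    fileNr++                    # count the input files as each one starts
}
{
    gsub(/[[:space:]]+/," ")    # squeeze runs of white space (tabs included) to one blank
    gsub(/^ | $/,"")            # trim leading and trailing blanks
    val[FNR,ARGIND] = $0        # ARGIND is gawk-specific; it equals fileNr here
}
fileNr == numFiles {
    # while reading the last file, print line FNR of every file
    for (i=1; i<=numFiles; i++) {
        printf ofmt, val[FNR,i], (i<numFiles ? OFS : ORS)
    }
}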
$ awk -f tst.awk file1 file2 file3
"777 Brockton Avenue, Abington MA 2351","madanm@comcast.net","Manual Ordway"
"30 Memorial Drive, Avon MA 2322","skajan@verizon.net","Yuonne Cajigas"
"250 Hartford Avenue, Bellingham MA 2019","barnett@hotmail.com","Pattie Darsey"
"700 Oak Street, Brockton MA 2301","sbmrjbr@sbcglobal.net","Cammie Knoles"
"66-4 Parkhurst Rd, Chelmsford MA 1824","mastinfo@sbcglobal.net","Evia Fallen"
"591 Memorial Dr, Chicopee MA 1020","carcus@aol.com","Soo Sanfilippo"

or any other common file format. Both of the above can be read by, for example, MS Excel.

Just to show the minimal changes needed to get what you actually asked for (again, don't do this!):

$ cat tst.awk
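BEGIN {
    OFS  = " "                  # blank between the per-file columns
    ofmt = "%s%s"               # a value followed by OFS or ORS
    numFiles = ARGC - 1
}
FNR == 1 {
    fileNr++                    # count the input files as each one starts
}
{
    gsub(/[[:space:]^]+/,"^")   # runs of white space or carets become one caret
    gsub(/^\^|\^$/,"")          # trim a leading or trailing caret (note the escapes)
    val[FNR,ARGIND] = $0        # ARGIND is gawk-specific; it equals fileNr here
}
fileNr == numFiles {
    # while reading the last file, print line FNR of every file
    for (i=1; i<=numFiles; i++) {
        printf ofmt, val[FNR,i], (i<numFiles ? OFS : ORS)
    }
}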
$ awk -f tst.awk file1 file2 file3
777^Brockton^Avenue,^Abington^MA^2351 madanm@comcast.net Manual^Ordway
30^Memorial^Drive,^Avon^MA^2322 skajan@verizon.net Yuonne^Cajigas
250^Hartford^Avenue,^Bellingham^MA^2019 barnett@hotmail.com Pattie^Darsey
700^Oak^Street,^Brockton^MA^2301 sbmrjbr@sbcglobal.net Cammie^Knoles
66-4^Parkhurst^Rd,^Chelmsford^MA^1824 mastinfo@sbcglobal.net Evia^Fallen
591^Memorial^Dr,^Chicopee^MA^1020 carcus@aol.com Soo^Sanfilippo

A delimiter isn't always a space. So basically, your first command doesn't remove the delimiter; it changes it to ^.

This command does the same thing:

awk '{$1=$1}1' OFS='^' file > newfile

sed would be better:

sed 's/ /^/g' file > newfile
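One caveat worth checking before swapping one command for the other: the two only agree when fields are separated by exactly one blank. A small sketch, using a made-up line with a doubled blank and a temporary file created just for the demonstration:

```shell
tmp=$(mktemp)
printf '700  Oak Street\n' > "$tmp"     # two blanks after 700

# awk rebuilds the record from its fields, so a run of blanks becomes one ^.
awk_out=$(awk '{$1=$1}1' OFS='^' "$tmp")

# sed replaces every single blank, so the doubled blank becomes ^^.
sed_out=$(sed 's/ /^/g' "$tmp")

rm -f "$tmp"
printf '%s\n%s\n' "$awk_out" "$sed_out"
```

The awk result is 700^Oak^Street while the sed result is 700^^Oak^Street, so for input with irregular spacing the awk form is the safer of the two.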

For the second command, paste uses a tab delimiter by default, but you can change that. If you want a space, pass a space to the -d option:

paste -d" " file* > newfile
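Putting the two commands together might look like the sketch below. It assumes the inputs are named listOf1, listOf2 and listOf3 (guessed from the listOf* glob in the question), uses a hypothetical .out suffix for the converted copies, and works in a temporary directory with one sample line per file:

```shell
dir=$(mktemp -d)
printf '%s\n' '777 Brockton Avenue, Abington MA 2351' > "$dir/listOf1"
printf '%s\n' 'madanm@comcast.net'                    > "$dir/listOf2"
printf '%s\n' 'Manual Ordway'                         > "$dir/listOf3"

# Convert each file on its own first, so only the blanks inside a file become ^.
for f in "$dir"/listOf*; do
  awk '{$1=$1}1' OFS='^' "$f" > "$f.out"
done

# Then join the converted files line by line with a blank between them.
merged=$(paste -d' ' "$dir"/listOf*.out)

rm -rf "$dir"
printf '%s\n' "$merged"
```

Converting before pasting is what keeps the blank between the columns: in the original pipeline awk ran after paste, and its default field splitting treats the tabs that paste inserted like any other white space.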

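Keep in mind that you can choose any delimiter for reading or writing CSV (despite what the name implies). If your input files use a space delimiter, you can use a comma delimiter in the paste command and that's it.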
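A hedged note on the comma variant for this particular data: the sample addresses already contain a comma, so a reader that splits naively on commas will see one field more than paste joined. The sketch below uses one sample line per file in a temporary directory to show the count:

```shell
dir=$(mktemp -d)
printf '%s\n' '777 Brockton Avenue, Abington MA 2351' > "$dir/file1"
printf '%s\n' 'madanm@comcast.net'                    > "$dir/file2"

# Join the two files with a comma.
joined=$(paste -d, "$dir/file1" "$dir/file2")

# Count comma-separated fields the way a naive reader would.
fields=$(printf '%s\n' "$joined" | awk -F, '{ print NF }')

rm -rf "$dir"
printf '%s\n%s\n' "$joined" "$fields"
```

Two files were joined but three fields are counted, which is why the CSV script in the other answer wraps each value in double quotes.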
