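I'm using awk to replace the space separators between the fields of each line so that the line becomes a single field. I want to combine the output of processing any number of files into one result file, with the per-file values separated by spaces.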
awk -v OFS='^' '{for(i=1; i<=NF; i++)printf("%s%s", $i,(i==NF)?ORS:OFS)}' filename > outputFile
File1 after the awk command:
777^Brockton^Avenue,^Abington^MA^2351
30^Memorial^Drive,^Avon^MA^2322
250^Hartford^Avenue,^Bellingham^MA^2019
....
When awk is applied to file 2, the command has no effect because each line has only one field:
madanm@comcast.net
skajan@verizon.net
barnett@hotmail.com
sbmrjbr@sbcglobal.net
mastinfo@sbcglobal.net
....
I tried merging the three files and applying the awk command to the result:
paste listOf* | awk -v OFS='^' '{for(i=1; i<=NF; i++)printf("%s%s", $i,(i==NF)?ORS:OFS)}' > outputFile
But my result looks like this:
777^Brockton^Avenue,^Abington^MA^2351^madanm@comcast.net^Manual^Ordway
30^Memorial^Drive,^Avon^MA^2322^skajan@verizon.net^Yuonne^Cajigas
250^Hartford^Avenue,^Bellingham^MA^2019^barnett@hotmail.com^Pattie^Darsey
700^Oak^Street,^Brockton^MA^2301^sbmrjbr@sbcglobal.net^Cammie^Knoles
66-4^Parkhurst^Rd,^Chelmsford^MA^1824^mastinfo@sbcglobal.net^Evia^Fallen
591^Memorial^Dr,^Chicopee^MA^1020^carcus@aol.com^Soo^Sanfilippo
I want it to look like this:
Home Address[delimiter]Email[delimiter]Name[delimiter]
777^Brockton^Avenue,^Abington^MA^2351 madanm@comcast.net Manual^Ordway
30^Memorial^Drive,^Avon^MA^2322 skajan@verizon.net Yuonne^Cajigas
250^Hartford^Avenue,^Bellingham^MA^2019 barnett@hotmail.com Pattie^Darsey
700^Oak^Street,^Brockton^MA^2301 sbmrjbr@sbcglobal.net Cammie^Knoles
66-4^Parkhurst^Rd,^Chelmsford^MA^1824 mastinfo@sbcglobal.net Evia^Fallen
591^Memorial^Dr,^Chicopee^MA^1020 carcus@aol.com Soo^Sanfilippo
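Don't do what you're trying to do, you're really messing up your files. In particular, ^ is a bad choice of character because it's a regexp metacharacter, so it will make any further processing much harder than it has to be. Why not keep the blank characters in your input (converting any tabs among them to blanks, if tabs can be present) and use a tab as the separator, or convert the whole thing to CSV?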
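To make the metacharacter problem concrete, here is a small sketch using grep on one of the caret-joined lines from the question. The variable names are just for illustration; the point is that an unescaped ^ is read as a start-of-line anchor, so every later search has to remember to escape it.

```shell
line='777^Brockton^Avenue,^Abington^MA^2351'

# Unescaped, ^ anchors the pattern to the start of the line, so "Brockton"
# (which is not at the start) is not found. grep -c prints 0 and exits
# non-zero in that case, hence the "|| true".
anchored=$(printf '%s\n' "$line" | grep -c '^Brockton' || true)

# Escaped, the caret is matched literally.
escaped=$(printf '%s\n' "$line" | grep -c '\^Brockton')

echo "anchored=$anchored escaped=$escaped"
```

This should print anchored=0 escaped=1.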
For example, given this input:
$ cat file1
777 Brockton Avenue, Abington MA 2351
30 Memorial Drive, Avon MA 2322
250 Hartford Avenue, Bellingham MA 2019
700 Oak Street, Brockton MA 2301
66-4 Parkhurst Rd, Chelmsford MA 1824
591 Memorial Dr, Chicopee MA 1020
$ cat file2
madanm@comcast.net
skajan@verizon.net
barnett@hotmail.com
sbmrjbr@sbcglobal.net
mastinfo@sbcglobal.net
carcus@aol.com
$ cat file3
Manual Ordway
Yuonne Cajigas
Pattie Darsey
Cammie Knoles
Evia Fallen
Soo Sanfilippo
you can produce TSV:
$ cat tst.awk
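BEGIN {
    OFS = "\t"                      # separator between the per-file columns
    ofmt = "%s%s"                   # value, then OFS or ORS
    numFiles = ARGC - 1
}
FNR == 1 {
    fileNr++                        # count the files as they are opened
}
{
    gsub(/[[:space:]]+/," ")        # squeeze runs of white space to one blank
    gsub(/^ | $/,"")                # trim a leading or trailing blank
    # ARGIND is a GNU awk extension; with other awks use fileNr here
    val[FNR,ARGIND] = $0
}
fileNr == numFiles {                # reading the last file: print the joined row
    for (i=1; i<=numFiles; i++) {
        printf ofmt, val[FNR,i], (i<numFiles ? OFS : ORS)
    }
}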
$ awk -f tst.awk file1 file2 file3
777 Brockton Avenue, Abington MA 2351	madanm@comcast.net	Manual Ordway
30 Memorial Drive, Avon MA 2322	skajan@verizon.net	Yuonne Cajigas
250 Hartford Avenue, Bellingham MA 2019	barnett@hotmail.com	Pattie Darsey
700 Oak Street, Brockton MA 2301	sbmrjbr@sbcglobal.net	Cammie Knoles
66-4 Parkhurst Rd, Chelmsford MA 1824	mastinfo@sbcglobal.net	Evia Fallen
591 Memorial Dr, Chicopee MA 1020	carcus@aol.com	Soo Sanfilippo
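Assuming the columns in that output are separated by real tab characters, standard tools can pull them apart again without any escaping. A minimal sketch on one hand-built row (the variable names are illustrative only):

```shell
# Build one tab-separated row like the first line of the output above.
tsv=$(printf '%s\t%s\t%s\n' \
    '777 Brockton Avenue, Abington MA 2351' 'madanm@comcast.net' 'Manual Ordway')

# cut splits on tabs by default; awk needs to be told with -F.
email=$(printf '%s\n' "$tsv" | cut -f2)
name=$(printf '%s\n' "$tsv" | awk -F'\t' '{print $3}')

echo "$email"
echo "$name"
```

The blanks inside the address and the name survive because only the tab is treated as a separator.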
or CSV (only changing the values of OFS and ofmt):
$ cat tst.awk
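BEGIN {
    OFS = ","                       # separator between the per-file columns
    ofmt = "\"%s\"%s"               # double-quoted value, then OFS or ORS
    numFiles = ARGC - 1
}
FNR == 1 {
    fileNr++                        # count the files as they are opened
}
{
    gsub(/[[:space:]]+/," ")        # squeeze runs of white space to one blank
    gsub(/^ | $/,"")                # trim a leading or trailing blank
    # ARGIND is a GNU awk extension; with other awks use fileNr here
    val[FNR,ARGIND] = $0
}
fileNr == numFiles {                # reading the last file: print the joined row
    for (i=1; i<=numFiles; i++) {
        printf ofmt, val[FNR,i], (i<numFiles ? OFS : ORS)
    }
}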
$ awk -f tst.awk file1 file2 file3
"777 Brockton Avenue, Abington MA 2351","madanm@comcast.net","Manual Ordway"
"30 Memorial Drive, Avon MA 2322","skajan@verizon.net","Yuonne Cajigas"
"250 Hartford Avenue, Bellingham MA 2019","barnett@hotmail.com","Pattie Darsey"
"700 Oak Street, Brockton MA 2301","sbmrjbr@sbcglobal.net","Cammie Knoles"
"66-4 Parkhurst Rd, Chelmsford MA 1824","mastinfo@sbcglobal.net","Evia Fallen"
"591 Memorial Dr, Chicopee MA 1020","carcus@aol.com","Soo Sanfilippo"
or any other common file format. Both of the above can be read by, for example, MS Excel.
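One caveat with the CSV variant: the format string only wraps each value in double quotes. If a value could itself contain a double quote, the usual CSV convention (RFC 4180) is to double it. A minimal sketch of that extra step, using a made-up address that is not in the sample files:

```shell
# Hypothetical input line containing embedded double quotes.
csv=$(printf '%s\n' '12 "Old" Mill Rd, Avon MA 2322' |
    awk '{
        gsub(/"/, "\"\"")        # double every embedded quote
        printf "\"%s\"\n", $0    # then wrap the whole value in quotes
    }')

echo "$csv"
```

This should print "12 ""Old"" Mill Rd, Avon MA 2322". In the script above the same gsub() would go just before the value is stored in val[].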
Just to show the minimal changes required to get what you actually asked for (again, don't do this!), that would be:
$ cat tst.awk
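BEGIN {
    OFS = " "                       # one blank between the per-file columns
    ofmt = "%s%s"                   # value, then OFS or ORS
    numFiles = ARGC - 1
}
FNR == 1 {
    fileNr++                        # count the files as they are opened
}
{
    gsub(/[[:space:]^]+/,"^")       # runs of white space (or carets) become one ^
    gsub(/^\^|\^$/,"")              # drop a leading or trailing literal ^
    # ARGIND is a GNU awk extension; with other awks use fileNr here
    val[FNR,ARGIND] = $0
}
fileNr == numFiles {                # reading the last file: print the joined row
    for (i=1; i<=numFiles; i++) {
        printf ofmt, val[FNR,i], (i<numFiles ? OFS : ORS)
    }
}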
$ awk -f tst.awk file1 file2 file3
777^Brockton^Avenue,^Abington^MA^2351 madanm@comcast.net Manual^Ordway
30^Memorial^Drive,^Avon^MA^2322 skajan@verizon.net Yuonne^Cajigas
250^Hartford^Avenue,^Bellingham^MA^2019 barnett@hotmail.com Pattie^Darsey
700^Oak^Street,^Brockton^MA^2301 sbmrjbr@sbcglobal.net Cammie^Knoles
66-4^Parkhurst^Rd,^Chelmsford^MA^1824 mastinfo@sbcglobal.net Evia^Fallen
591^Memorial^Dr,^Chicopee^MA^1020 carcus@aol.com Soo^Sanfilippo
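A delimiter doesn't have to be a space. So, basically, your first command doesn't remove the delimiter, it just changes it to ^.
This command does the same job: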
awk '{$1=$1}1' OFS='^' file > newfile
sed would be better:
sed 's/ /^/g' file > newfile
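The two commands are only interchangeable when fields are separated by exactly one space. A quick sketch of the difference on a line with a doubled space (the sample line and variable names are made up for illustration, and OFS is passed with -v here, which should behave the same as the trailing assignment above):

```shell
line='30  Memorial Drive'    # note the two spaces after "30"

# awk re-splits on runs of white space, so the doubled space collapses.
from_awk=$(printf '%s\n' "$line" | awk -v OFS='^' '{$1=$1}1')

# sed replaces every single space, so the doubled space becomes ^^.
from_sed=$(printf '%s\n' "$line" | sed 's/ /^/g')

echo "$from_awk"
echo "$from_sed"
```

This should print 30^Memorial^Drive and then 30^^Memorial^Drive.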
As for the second command, paste uses a tab as its delimiter by default, but you can change that. If you want a space, pass a space to the -d option:
paste -d" " file* > newfile
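Keep in mind that you can pick any delimiter you like when reading or writing CSV (despite what the name suggests). If your input files are space-delimited, you can use a comma as the delimiter in the paste command and that's all there is to it.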
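Putting the pieces together, the order of the two steps is what matters: rewrite the separators inside each file first, then paste. Running awk after paste also rewrites the separator that paste just added, which is how the output in the question lost its column boundaries. A rough sketch under some assumptions: the listOf1..listOf3 names, the temporary directory, and the .caret suffix are all placeholders, and only the first two sample rows are used.

```shell
tmp=$(mktemp -d)

# Sample inputs (first two rows of the files shown earlier).
printf '%s\n' '777 Brockton Avenue, Abington MA 2351' \
              '30 Memorial Drive, Avon MA 2322'        > "$tmp/listOf1"
printf '%s\n' 'madanm@comcast.net' 'skajan@verizon.net' > "$tmp/listOf2"
printf '%s\n' 'Manual Ordway' 'Yuonne Cajigas'          > "$tmp/listOf3"

# Step 1: rewrite the separators inside each file on its own.
for f in "$tmp"/listOf*; do
    awk -v OFS='^' '{$1=$1}1' "$f" > "$f.caret"
done

# Step 2: join the rewritten files line by line with a single space.
result=$(paste -d' ' "$tmp"/listOf*.caret)
printf '%s\n' "$result"

rm -rf "$tmp"
```

With these inputs the first output line should be 777^Brockton^Avenue,^Abington^MA^2351 madanm@comcast.net Manual^Ordway. The same loop works for any number of files, as long as they all have the same number of lines.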