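I have a file from which I am trying to print a column named 'grant (actual)', selecting it dynamically by its column name. I can already get the column by giving the field number explicitly, as in the command below; its current position is field 6: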
$ awk '/--/,/Datacenter/ ' cas.txt | awk '{print $6}'
(actual)
49.9%
55.4%
53.5%
48.7%
(actual)
53.1%
50.0%
47.6%
48.3%
(actual)
50.0%
51.1%
48.9%
51.3%
But I want to determine the column number dynamically, so that my script keeps working if the position of the column changes.
$ cat cas.txt
Datacenter: DC01
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.138  221.03 MiB  256   49.9%           dd09f7aa  STG1
DN  10.0.0.139  173.47 MiB  256   55.4%           53179492  STG1
DN  10.0.0.136  200.08 MiB  256   53.5%           89a28140  STG1
DN  10.0.0.137  318.69 MiB  256   48.7%           8cc9dfac  STG1
Datacenter: DC02
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.142  270.01 MiB  256   53.1%           04210b53  STG1
DN  10.0.0.143  166.65 MiB  256   50.0%           d5469c9b STG1
DN  10.0.0.140  199.51 MiB  256   47.6%           fcc38a17  STG1
DN  10.0.0.141  170.52 MiB  256   48.3%           3d7b4e59  STG1
Datacenter: DC03
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.150  229.2 MiB   256   50.0%           0fa51a1a  STG1
DN  10.0.0.151  195.88 MiB  256   51.1%           e329ac17  STG1
DN  10.0.0.148  147.01 MiB  256   48.9%           c14bd7ae  STG1
DN  10.0.0.149  298.34 MiB  256   51.3%           6c73d2b5  STG1
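Using GNU awk for FIELDWIDTHS and the 4th argument to split(), you can create an array (f[] below) that maps the column names to their field numbers. You can then print, compare, reorder or do anything else you like with the columns just by indexing that array with the column names: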
$ cat tst.awk
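/^--/ {
    if ( FIELDWIDTHS == "" ) {
        # First header seen: derive the fixed field widths from it.
        wids = ""
        numFlds = split($0,flds,/  +/,seps)    # seps[] (4th arg, gawk only) holds the separators
        for ( fldNr=1; fldNr<=numFlds; fldNr++ ) {
            f[flds[fldNr]] = fldNr              # column name -> field number
            wids = (fldNr>1 ? wids " " : "") length(flds[fldNr] seps[fldNr])
        }
        FIELDWIDTHS = wids
        $0 = $0                                 # re-split the header using FIELDWIDTHS
    }
    inBlock = 1
}
inBlock {
    if ( /^Datacenter:/ ) {
        print ""
        inBlock = 0
        next
    }
    for ( i=1; i<=NF; i++ ) {
        gsub(/^\s+|\s+$/,"",$i)                 # trim leading/trailing white space
    }
    print $(f["grant (actual)"])
}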
$ awk -f tst.awk cas.txt
grant (actual)
49.9%
55.4%
53.5%
48.7%

grant (actual)
53.1%
50.0%
47.6%
48.3%

grant (actual)
50.0%
51.1%
48.9%
51.3%
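The claim that a name-to-number map lets you compare or reorder columns can be illustrated with a small sketch. This is not the FIELDWIDTHS script itself: to stay portable to non-GNU awks it swaps in a plain "two or more spaces" field separator (written `[ ][ ]+`). The file sample.txt, the function name over_threshold and the threshold of 50 are all assumptions made up for the illustration.

```shell
# Hypothetical two-block sample in the same layout as cas.txt
# (columns separated by two or more spaces).
cat > sample.txt <<'EOF'
Datacenter: DC01
====================
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.138  221.03 MiB  256   49.9%           dd09f7aa  STG1
DN  10.0.0.139  173.47 MiB  256   55.4%           53179492  STG1
Datacenter: DC02
====================
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.142  270.01 MiB  256   53.1%           04210b53  STG1
DN  10.0.0.141  170.52 MiB  256   48.3%           3d7b4e59  STG1
EOF

# over_threshold MIN FILE (hypothetical helper):
# print "grant (actual)" followed by Address for rows whose grant exceeds MIN.
over_threshold() {
    awk -F'[ ][ ]+' -v min="$1" '
    /^Datacenter/ { inBlock = 0; next }          # new block: wait for its header
    $1 == "--" {                                 # header: rebuild name -> number map
        split("", f)                             # portable way to empty an array
        for (i = 1; i <= NF; i++) f[$i] = i
        inBlock = 1
        next
    }
    # "49.9%" + 0 evaluates to 49.9, so the comparison is numeric
    inBlock && ($(f["grant (actual)"]) + 0) > min {
        print $(f["grant (actual)"]), $(f["Address"])
    }
    ' "$2"
}

over_threshold 50 sample.txt
```

Both the filter and the output order refer to the columns only by name, so moving a column in the header should not require any change to the script.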
Combining the ideas from @Dan and @Daweo:
awk -F' {2,}' -v col='grant (actual)' '
/^Datacenter/ {i=0}
$1 == "--" {for (i=1; i<=NF; i++) if ($i == col) break; next}
i {print $i}
' cas.txt
49.9%
55.4%
53.5%
48.7%
53.1%
50.0%
47.6%
48.3%
50.0%
51.1%
48.9%
51.3%
If you want to see the col header in the output, just remove the next.
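As a minimal sketch of that variant, here is the same command with the next dropped, so the header line falls through to the print rule. The file sample.txt and the wrapper function col_with_header are hypothetical, and the separator is written as `[ ][ ]+` instead of ` {2,}` only because some older awks do not support interval expressions.

```shell
# Hypothetical one-block sample in the same layout as cas.txt.
cat > sample.txt <<'EOF'
Datacenter: DC01
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.138  221.03 MiB  256   49.9%           dd09f7aa  STG1
DN  10.0.0.139  173.47 MiB  256   55.4%           53179492  STG1
EOF

# col_with_header NAME FILE (hypothetical helper)
col_with_header() {
    awk -F'[ ][ ]+' -v col="$1" '
    /^Datacenter/ { i = 0 }                                            # stop printing until the next header
    $1 == "--"    { for (i = 1; i <= NF; i++) if ($i == col) break }   # no "next" here
    i             { print $i }                                         # header line is printed too
    ' "$2"
}

col_with_header 'grant (actual)' sample.txt
```

On this sample the output starts with the header text grant (actual), followed by 49.9% and 55.4%.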
Consider the following simple example. Let the content of file.txt be
-- Able Baker Charlie
DN 1 2 3
DN 4 5 6
DN 7 8 9
-- Charlie
DN 10
DN 11
DN 12
then
awk 'BEGIN{colname="Charlie"}/--/{delete names;for(i=1;i<=NF;i+=1){names[$i]=i};next}{print $(names[colname])}' file.txt
gives the output
3
6
9
10
11
12
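Explanation: I use the colname variable to store the desired column name. When a line containing -- is encountered, it is treated as a header holding the column names. The names array is cleared, so that nothing is left over from the previous block, and then filled so that each column name (key) maps to its position (value). After doing that I instruct GNU AWK to go to the next line, i.e. nothing is printed for the header itself. For every other line, I tell GNU AWK to look up the number that corresponds to the chosen name and to print that column.
(tested in gawk 4.2.1)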
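One edge case of the lookup described above: if a block's header does not contain the chosen name, names[colname] is empty and $(names[colname]) evaluates to $0, so the whole line is printed. A minimal sketch of one way to guard against that follows; the wrapper function col_if_present is a hypothetical name, and skipping such blocks silently is just one possible policy.

```shell
# Same sample data as file.txt above.
cat > file.txt <<'EOF'
-- Able Baker Charlie
DN 1 2 3
DN 4 5 6
DN 7 8 9
-- Charlie
DN 10
DN 11
DN 12
EOF

# col_if_present NAME FILE (hypothetical helper)
col_if_present() {
    awk -v colname="$1" '
    $1 == "--" {                         # header line: rebuild the map
        split("", names)                 # portable way to empty an array
        for (i = 1; i <= NF; i++) names[$i] = i
        next
    }
    (colname in names) {                 # skip blocks that lack the column
        print $(names[colname])
    }
    ' "$2"
}

col_if_present Baker file.txt
```

With Baker as the name, only the first block produces output (2, 5 and 8); the second block has no Baker column and is skipped instead of being printed in full.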
Looking at your data, we'll use split() to split the records on 2 or more spaces (/  +/):
$ awk '$1~/^--$/ {                     # -- starts the header record
    n=split($0,h,/  +/)                # get field count n of header record
    for(i=1;i<=n;i++)                  # iterate fields
        if(h[i]=="grant (actual)")     # looking for desired header
            break                      # break once found, i is the field number
}
split($0,a,/  +/)==n {                 # process records with equal amount of fields
    print a[i]                         # and output ith field
}' file
Output:
grant (actual)
49.9%
55.4%
53.5%
48.7%
grant (actual)
53.1%
47.6%
48.3%
grant (actual)
50.0%
51.1%
48.9%
51.3%
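The above fails for records whose last field is separated from the previous one by only 1 space, which is why 50.0% is missing from the output:
DN  10.0.0.143  166.65 MiB  256   50.0%           d5469c9b STG1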
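One possible workaround, sketched here under the assumption that the wanted column lies to the left of the badly spaced fields, is to require only that a record has at least as many fields as the wanted position instead of exactly as many as the header. The file sample.txt and the function name col_tolerant are hypothetical; the separator regex is written as `[ ][ ]+`, which means the same as two spaces followed by +.

```shell
# Hypothetical one-block sample; the 10.0.0.143 row has a single space before STG1.
cat > sample.txt <<'EOF'
Datacenter: DC02
====================
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.142  270.01 MiB  256   53.1%           04210b53  STG1
DN  10.0.0.143  166.65 MiB  256   50.0%           d5469c9b STG1
DN  10.0.0.140  199.51 MiB  256   47.6%           fcc38a17  STG1
EOF

# col_tolerant NAME FILE (hypothetical helper)
col_tolerant() {
    awk -v col="$1" '
    $1 == "--" {                                   # header record
        want = 0                                   # 0 means: column not found
        n = split($0, h, /[ ][ ]+/)
        for (k = 1; k <= n; k++) if (h[k] == col) want = k
    }
    # accept any record that has at least "want" fields
    want && split($0, a, /[ ][ ]+/) >= want { print a[want] }
    ' "$2"
}

col_tolerant 'grant (actual)' sample.txt
```

On this sample the 50.0% row is no longer dropped. The sketch would still pick the wrong text if the single-space glitch occurred at or before the wanted column.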
In brief, an awk-based solution that:
- doesn't require gnu-gawk for FIELDWIDTHS/fixed width fields
- doesn't require fudging with FS/OFS/RS/FPAT
- doesn't require a specialized regex engine,
e.g. with back-references support
- doesn't require array-splitting or dealing with the
painfully slow match() function
- doesn't *even* require a single call to any function
Input:
Datacenter: DC01
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.138  221.03 MiB  256   49.9%           dd09f7aa  STG1
DN  10.0.0.139  173.47 MiB  256   55.4%           53179492  STG1
DN  10.0.0.136  200.08 MiB  256   53.5%           89a28140  STG1
DN  10.0.0.137  318.69 MiB  256   48.7%           8cc9dfac  STG1
Datacenter: DC02
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.142  270.01 MiB  256   53.1%           04210b53  STG1
DN  10.0.0.143  166.65 MiB  256   50.0%           d5469c9b STG1
DN  10.0.0.140  199.51 MiB  256   47.6%           fcc38a17  STG1
DN  10.0.0.141  170.52 MiB  256   48.3%           3d7b4e59  STG1
Datacenter: DC03
====================
Status=TRUE/FALSE
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        USER  grant (actual)  Host ID   Vol
DN  10.0.0.150  229.2 MiB   256   50.0%           0fa51a1a  STG1
DN  10.0.0.151  195.88 MiB  256   51.1%           e329ac17  STG1
DN  10.0.0.148  147.01 MiB  256   48.9%           c14bd7ae  STG1
DN  10.0.0.149  298.34 MiB  256   51.3%           6c73d2b5  STG1
< cas.txt |
{m,g}awk ' !NF ? !_ : /^[=]+/ ? ($!_=!__ ? "" : " ")
: --NF<+_ ? !_ : __+=($!_=(/%/?"":$(_-_^!_)" ")($_))^!_' _=6
1 grant (actual)
2 49.9%
3 55.4%
4 53.5%
5 48.7%
6
7 grant (actual)
8 53.1%
9 50.0%
10 47.6%
11 48.3%
12
13 grant (actual)
14 50.0%
15 51.1%
16 48.9%
17 51.3%