I have a file containing lines of the following pattern:
date=2020-02-22 time=13:32:41 type=text subtype=text ip=1.2.3.4 country="China" service="foo" id=47291 msg="foo: bar.baz," value=50
date=2020-03-17 time=11:49:54 type=text subtype=anothertext ip=1.2.3.5 country="Russian Federation" service="bar" id=47324 msg="foo: bar.baz," value=30
date=2020-03-30 time=16:29:24 type=text subtype=someothertext ip=1.2.3.6 country="Korea, Republic of" service="grault, garply" id=47448 msg="foo: bar.baz," value=60
I want to remove type, subtype, and service, along with the values of those fields (the part after the =).
Expected output:
date=2020-02-22 time=13:32:41 ip=1.2.3.4 country="China" id=47291 msg="foo: bar.baz," value=50
date=2020-03-17 time=11:49:54 ip=1.2.3.5 country="Russian Federation" id=47324 msg="foo: bar.baz," value=30
date=2020-03-30 time=16:29:24 ip=1.2.3.6 country="Korea, Republic of" id=47448 msg="foo: bar.baz," value=60
I have been trying with cut, awk, and sed, but I have not gotten close to a solution. I searched online for hours without success. Can anyone help?
Here is something you might want to reuse or build on later:
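$ cat tst.awk
BEGIN {
    # s is a space-separated list of tag names to drop
    split(s,tmp)
    for (i in tmp) {
        skip[tmp[i]]
    }
    # a field is name=value, where the value may be a quoted string containing spaces
    FPAT = "[^ ]+(=\"[^\"]+\")?"
}
{
    c=0
    for (i=1; i<=NF; i++) {
        # the tag is everything before the first =
        tag = gensub(/=.*/,"",1,$i)
        if ( !(tag in skip) ) {
            printf "%s%s", (c++ ? OFS : ""), $i
        }
    }
    print ""
}
$ awk -v s='type subtype service' -f tst.awk file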
date=2020-02-22 time=13:32:41 ip=1.2.3.4 country="China" id=47291 msg="foo: bar.baz," value=50
date=2020-03-17 time=11:49:54 ip=1.2.3.5 country="Russian Federation" id=47324 msg="foo: bar.baz," value=30
date=2020-03-30 time=16:29:24 ip=1.2.3.6 country="Korea, Republic of" id=47448 msg="foo: bar.baz," value=60
The above uses GNU awk for FPAT and gensub().
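If GNU awk is not available, the same idea can be approximated with only POSIX awk features. The following is a minimal sketch, not a drop-in replacement: the function name strip_fields is hypothetical, and it assumes every field has the form name=value, where the value is either a double-quoted string without embedded quotes or a run of non-space characters.

```shell
#!/bin/sh
# strip_fields (hypothetical helper name): reads lines on stdin and drops
# every name=value pair whose name is in the space-separated list "$1".
strip_fields() {
    awk -v s="$1" '
    BEGIN {
        n = split(s, tmp, " ")
        for (i = 1; i <= n; i++) skip[tmp[i]] = 1
    }
    {
        line = $0
        out = ""
        # Consume one name=value token at a time. The value is either a
        # double-quoted string or a run of non-space characters.
        while (match(line, /[^ =]+=("[^"]*"|[^ ]*)/)) {
            tok = substr(line, RSTART, RLENGTH)
            line = substr(line, RSTART + RLENGTH)
            tag = tok
            sub(/=.*/, "", tag)      # keep only the part before the first =
            if (!(tag in skip))
                out = (out == "" ? tok : out " " tok)
        }
        print out
    }'
}
```

It would be called as strip_fields 'type subtype service' < file. Tokenizing with match() and substr() stands in for FPAT, and sub() on a copy of the token stands in for gensub().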
You can use this sed, which accepts either a double-quoted value or a run of non-blank characters after the =:
sed -E 's/(^|[[:blank:]]+)(subtype|type|service)=("[^"]*"|[^[:blank:]]+)//g' file
date=2020-02-22 time=13:32:41 ip=1.2.3.4 country="China" id=47291 msg="foo: bar.baz," value=50
date=2020-03-17 time=11:49:54 ip=1.2.3.5 country="Russian Federation" id=47324 msg="foo: bar.baz," value=30
date=2020-03-30 time=16:29:24 ip=1.2.3.6 country="Korea, Republic of" id=47448 msg="foo: bar.baz," value=60
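You can try the following:
awk -F " " '{ $3=""; $4=""; print }' file
This sets the given columns to an empty string. It only works for fields at a fixed position: here $3 and $4 are type and subtype. The separators around the emptied fields stay in the output, and service cannot be removed this way, because quoted values containing spaces shift its column from line to line.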
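Deleting by position only works when every line splits into the same columns. A quick way to check this is to print which whitespace-delimited field holds service= on each line. This is a small sketch, and the function name service_column is hypothetical.

```shell
#!/bin/sh
# service_column (hypothetical helper name): for each line on stdin, print
# the number of the whitespace-split field that starts with "service=".
service_column() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^service=/) print i
    }'
}
```

Run as service_column < file on the three sample lines, it prints 7, 8 and 9, because country="Russian Federation" and country="Korea, Republic of" are split into several fields. A fixed field number therefore cannot address service.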