sed/awk: replace a megabyte suffix with bytes (insert zeros)



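I have a .csv file whose fields are separated by commas and whose lines are separated by \n. On some lines a value carries a megabyte suffix. I want to replace that suffix with zeros so that the value is (more or less) the correct size in bytes.

I have

,2.6 M,

I want

,2600000,

Example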

2015-06-01 00:04:52.736,10.0.0.2,10.0.0.4,443,443,56923,2.6 M,10.156.119.1
2015-06-01 00:04:56.736,10.0.0.2,10.0.0.4,443,58935,55658,1.3 M,10.156.126.1
2015-06-01 00:04:56.736,10.0.0.2,10.0.0.4,443,86,54801,1256,10.156.119.1
2015-06-01 00:04:52.736,10.0.0.2,10.0.0.4,443,49652,443,1.6 M,10.156.119.1
2015-06-01 00:04:53.732,10.0.0.2,10.0.0.4,443,443,55770,4.9 M,10.156.119.1
2015-06-01 00:04:54.732,10.0.0.2,10.0.0.4,443,80,45980,639,10.156.119.1
2015-06-01 00:04:54.732,10.0.0.2,10.0.0.4,443,63951,27058,1.2 M,10.156.119.1
2015-06-01 00:04:54.732,10.0.0.2,10.0.0.4,443,80,41035,13.8 M,10.156.119.1
2015-06-01 00:04:55.736,10.0.0.2,10.0.0.4,443,80,40078,7.9 M,10.156.119.1
2015-06-01 00:04:56.732,10.0.0.2,10.0.0.4,443,42008,4.5 M,10.156.119.1

Desired output

2015-06-01 00:04:52.736,10.0.0.2,10.0.0.4,443,443,56923,2600000,10.156.119.1
2015-06-01 00:04:56.736,10.0.0.2,10.0.0.4,443,58935,55658,1300000,10.156.126.1
2015-06-01 00:04:56.736,10.0.0.2,10.0.0.4,443,86,54801,1256,10.156.119.1
2015-06-01 00:04:52.736,10.0.0.2,10.0.0.4,443,49652,443,1600000,10.156.119.1
2015-06-01 00:04:53.732,10.0.0.2,10.0.0.4,443,443,55770,4900000,10.156.119.1
2015-06-01 00:04:54.732,10.0.0.2,10.0.0.4,443,80,45980,639,10.156.119.1
2015-06-01 00:04:54.732,10.0.0.2,10.0.0.4,443,63951,27058,1200000,10.156.119.1
2015-06-01 00:04:54.732,10.0.0.2,10.0.0.4,443,80,41035,13800000,10.156.119.1
2015-06-01 00:04:55.736,10.0.0.2,10.0.0.4,443,80,40078,7900000,10.156.119.1
2015-06-01 00:04:56.732,10.0.0.2,10.0.0.4,443,42008,4500000,10.156.119.1

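This is complicated by the last line of the sample data, which is missing a column, so the size field is addressed from the end of the line as $(NF-1) rather than by a fixed index. The / M$/ guard leaves rows that are already in bytes (such as 1256) untouched.

awk 'BEGIN {FS=OFS=","} $(NF-1) ~ / M$/ {$(NF-1) = $(NF-1) * 1000000} 1' file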

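If you sometimes have "M" and sometimes "K", this can be adapted:

awk '
    BEGIN {
      FS=OFS=","
      # multiplier for each suffix; the empty key covers plain byte counts
      mult[""]=1
      mult["K"]=1000
      mult["M"]=1000000
      mult["G"]=1000000000
    }
    {
      # "2.6 M" -> a[1]="2.6", a[2]="M"; "1256" -> a[1]="1256", a[2]=""
      split($(NF-1), a, " ")
      $(NF-1) = a[1] * mult[a[2]]
      print
    }
' file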
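As a quick check of the suffix table, the same idea can be wrapped in a shell function and fed a few made-up rows. The name to_bytes and the input rows are illustrative, and the sprintf("%.0f", ...) call is an addition here, intended to keep the result as plain digits in case a product is not an exact integer.

```shell
# Hypothetical wrapper around the suffix-table approach.
to_bytes() {
  awk '
    BEGIN {
      FS = OFS = ","
      mult[""] = 1; mult["K"] = 1000; mult["M"] = 1000000; mult["G"] = 1000000000
    }
    {
      split($(NF-1), a, " ")                        # "2.6 M" -> a[1]="2.6", a[2]="M"
      $(NF-1) = sprintf("%.0f", a[1] * mult[a[2]])  # plain digits, no exponent
      print
    }
  '
}

printf '%s\n' 'x,1256,y' 'x,2.6 M,y' 'x,7.5 K,y' 'x,1.5 G,y' | to_bytes
# x,1256,y
# x,2600000,y
# x,7500,y
# x,1500000000,y
```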

sed 's/\([0-9]*\)\.\([0-9]*\) M/\1\200000/' file

sed 's/ \([KMG]\)/000000000\1/
     s/\.\([0-9]\{3\}\)[0-9]*K/\1/
     s/\.\([0-9]\{6\}\)[0-9]*M/\1/
     s/\.\([0-9]\{9\}\)[0-9]*G/\1/
    ' YourFile
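  • Based on your sample, this assumes a single-letter unit in multiples of 1000; it would need to be changed otherwise.

If, as in your sample, only M occurs (and values with M have just one digit after the dot), this can be simplified to sed 's/\.\([^,]*\) M/\100000/' YourFile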
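The one-digit condition matters. A small comparison, using made-up rows and the hypothetical function names simple and padded, sketches how the simplified substitution and the pad-then-trim version behave once a value has two digits after the dot:

```shell
# Simplified form: assumes exactly one digit after the dot.
simple() { sed 's/\.\([^,]*\) M/\100000/'; }

# Pad-then-trim form: append nine zeros, then keep six digits after the dot for M.
padded() {
  sed 's/ \([KMG]\)/000000000\1/
       s/\.\([0-9]\{6\}\)[0-9]*M/\1/'
}

echo 'x,2.6 M,y'  | simple   # x,2600000,y
echo 'x,1.25 M,y' | simple   # x,12500000,y  (ten times too large)
echo 'x,1.25 M,y' | padded   # x,1250000,y
```

Neither form touches a value without a decimal point, such as "3 M", so inputs like that would need separate handling.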
