使用带分隔符的AWK打印特定列



我的文件如下:

+------------------------------------------+---------------+----------------+------------------+------------------+-----------------+
| Message                                  | Status        | Adress         | Changes          | Test             | Calibration     |
|------------------------------------------+---------------+----------------+------------------+------------------+-----------------|
| Hello World                              | Active        | up             |                1 |               up |            done |
| Hello Everyone Here                      | Passive       | up             |                2 |             down |            none |
| Hi there. My name is Eric. How are you?  | Down          | up             |                3 |         inactive |            done |
+------------------------------------------+---------------+----------------+------------------+------------------+-----------------+
+----------------------------+---------------+----------------+------------------+------------------+-----------------+
| Message                    | Status        | Adress         | Changes          | Test             | Calibration     |
|----------------------------+---------------+----------------+------------------+------------------+-----------------|
| What's up?                 | Active        | up             |                1 |               up |            done |
| Hi. I'm Otilia             | Passive       | up             |                2 |             down |            none |
| Hi there. This is Marcus   | Up            | up             |                3 |         inactive |            done |
+----------------------------+---------------+----------------+------------------+------------------+-----------------+

我想使用AWK提取一个特定的列。我可以用CUT来做;但是,当每个表的长度根据每个列中存在的字符数量而变化时,我无法获得所需的输出。

cat File.txt | cut -c -44
+------------------------------------------+
| Message                                  |
|------------------------------------------+
| Hello World                              |
| Hello Everyone Here                      |
| Hi there. My name is Eric. How are you?  |
+------------------------------------------+
+----------------------------+--------------
| Message                    | Status
|----------------------------+--------------
| What's up?                 | Active
| Hi. I'm Otilia             | Passive
| Hi there. This is Marcus   | Up
+----------------------------+--------------

cat File.txt | cut -c 44-60
+---------------+
| Status        |
+---------------+
| Active        |
| Passive       |
| Down          |
+---------------+
--+--------------
  | Adress
--+--------------
  | up
  | up
  | up
--+--------------

我尝试使用AWK,但我不知道如何添加两个不同的分隔符,这将照顾到所有的行。

cat File.txt | awk 'BEGIN {FS="|";}{print $2,$3}'
 Message                                    Status
------------------------------------------+---------------+----------------+------------------+------------------+-----------------
 Hello World                                Active
 Hello Everyone Here                        Passive
 Hi there. My name is Eric. How are you?    Down

 Message                      Status
----------------------------+---------------+----------------+------------------+------------------+-----------------
 What's up?                   Active
 Hi. I'm Otilia               Passive
 Hi there. This is Marcus     Up

我正在寻找的输出:

+------------------------------------------+
| Message                                  |
|------------------------------------------+
| Hello World                              |
| Hello Everyone Here                      |
| Hi there. My name is Eric. How are you?  |
+------------------------------------------+
+----------------------------+
| Message                    |
|----------------------------+
| What's up?                 | 
| Hi. I'm Otilia             | 
| Hi there. This is Marcus   | 
+----------------------------+

+------------------------------------------+---------------+
| Message                                  | Status        |
|------------------------------------------+---------------+
| Hello World                              | Active        |
| Hello Everyone Here                      | Passive       |
| Hi there. My name is Eric. How are you?  | Down          |
+------------------------------------------+---------------+
+----------------------------+---------------+
| Message                    | Status        | 
|----------------------------+---------------+
| What's up?                 | Active        | 
| Hi. I'm Otilia             | Passive       | 
| Hi there. This is Marcus   | Up            | 
+----------------------------+---------------+

或随机其他列

+------------------------------------------+----------------+------------------+
| Message                                  | Adress         | Test             |
|------------------------------------------+----------------+------------------+
| Hello World                              | up             |               up |
| Hello Everyone Here                      | up             |             down |
| Hi there. My name is Eric. How are you?  | up             |         inactive |
+------------------------------------------+----------------+------------------+
+----------------------------+---------------+------------------+
| Message                    |Adress         | Test             |
|----------------------------+---------------+------------------+
| What's up?                 |up             |               up |
| Hi. I'm Otilia             |up             |             down |
| Hi there. This is Marcus   |up             |         inactive |
+----------------------------+---------------+------------------+

提前感谢。

使用GNU awk的一个想法:

awk -v fldlist="2,3" '
BEGIN { fldcnt=split(fldlist,fields,",") }                      # split fldlist into array fields[]
      { split($0,arr,/[|+]/,seps)                               # split current line on dual delimiters "|" and "+"
        for (i=1;i<=fldcnt;i++)                                 # loop through our array of fields (fldlist)
            printf "%s%s", seps[fields[i]-1], arr[fields[i]]    # print leading separator/delimiter and field
        printf "%sn", seps[fields[fldcnt]]                     # print trailing separator/delimiter and terminate line
      }
' File.txt

指出:

  • 需要GNU awk作为split()函数的第四个参数(seps ==分隔符数组;
  • 假设我们的字段分隔符(|, +)不作为数据
  • 的一部分显示
  • 输入变量fldlist是一个逗号分隔的列列表,模仿将传递给cut的内容(例如,当一行以分隔符开始时,字段#1为空白)

对于fldlist="2,3",生成:

+------------------------------------------+---------------+
| Message                                  | Status        |
|------------------------------------------+---------------+
| Hello World                              | Active        |
| Hello Everyone Here                      | Passive       |
| Hi there. My name is Eric. How are you?  | Down          |
+------------------------------------------+---------------+
+----------------------------+---------------+
| Message                    | Status        |
|----------------------------+---------------+
| What's up?                 | Active        |
| Hi. I'm Otilia             | Passive       |
| Hi there. This is Marcus   | Up            |
+----------------------------+---------------+

对于fldlist="2,4,6",生成:

+------------------------------------------+----------------+------------------+
| Message                                  | Adress         | Test             |
|------------------------------------------+----------------+------------------+
| Hello World                              | up             |               up |
| Hello Everyone Here                      | up             |             down |
| Hi there. My name is Eric. How are you?  | up             |         inactive |
+------------------------------------------+----------------+------------------+
+----------------------------+----------------+------------------+
| Message                    | Adress         | Test             |
|----------------------------+----------------+------------------+
| What's up?                 | up             |               up |
| Hi. I'm Otilia             | up             |             down |
| Hi there. This is Marcus   | up             |         inactive |
+----------------------------+----------------+------------------+

对于fldlist="4,3,2",生成:

+----------------+---------------+------------------------------------------+
| Adress         | Status        | Message                                  |
+----------------+---------------|------------------------------------------+
| up             | Active        | Hello World                              |
| up             | Passive       | Hello Everyone Here                      |
| up             | Down          | Hi there. My name is Eric. How are you?  |
+----------------+---------------+------------------------------------------+
+----------------+---------------+----------------------------+
| Adress         | Status        | Message                    |
+----------------+---------------|----------------------------+
| up             | Active        | What's up?                 |
| up             | Passive       | Hi. I'm Otilia             |
| up             | Up            | Hi there. This is Marcus   |
+----------------+---------------+----------------------------+

再说一遍?(fldlist="3,3,3"):

+---------------+---------------+---------------+
| Status        | Status        | Status        |
+---------------+---------------+---------------+
| Active        | Active        | Active        |
| Passive       | Passive       | Passive       |
| Down          | Down          | Down          |
+---------------+---------------+---------------+
+---------------+---------------+---------------+
| Status        | Status        | Status        |
+---------------+---------------+---------------+
| Active        | Active        | Active        |
| Passive       | Passive       | Passive       |
| Up            | Up            | Up            |
+---------------+---------------+---------------+

如果你犯了错误,试图打印"第一"列,即fldlist="1":

+
|
|
|
|
|
+
+
|
|
|
|
|
+

如果GNU awk可用,请尝试markp-fuso的nice解决方案。如果没有,这里有一个posix兼容的替代方案:

#!/bin/bash
# define bash variables
cols=(2 3 6)                            # bash array of desired columns
col_list=$(IFS=,; echo "${cols[*]}")    # create a csv string
awk -v cols="$col_list" '
NR==FNR {
    if (match($0, /^[|+]/)) {           # the record contains a table
        if (match($0, /^[|+]-/))        # horizontally ruled line
            n = split($0, a, /[|+]/)    # split into columns
        else                            # "cell" line
            n = split($0, a, /|/)
        len = 0
        for (i = 1; i < n; i++) {
            len += length(a[i]) + 1     # accumulated column position
            pos[FNR, i] = len
        }
    }
    next
}
{
    n = split(cols, a, /,/)             # split the variable `cols` on comma into an array
    for (i = 1; i <= n; i++) {
        col = a[i]
        if (pos[FNR, col] && pos[FNR, col+1]) {
            printf("%s", substr($0, pos[FNR, col], pos[FNR, col + 1] - pos[FNR, col]))
        }
    }
    print(substr($0, pos[FNR, col + 1], 1))
}
' file.txt file.txt

cols=(2 3 6)的结果如上面所示:

+---------------+----------------+-----------------+
| Status        | Adress         | Calibration     |
+---------------+----------------+-----------------|
| Active        | up             |            done |
| Passive       | up             |            none |
| Down          | up             |            done |
+---------------+----------------+-----------------+
+---------------+----------------+-----------------+
| Status        | Adress         | Calibration     |
+---------------+----------------+-----------------|
| Active        | up             |            done |
| Passive       | up             |            none |
| Up            | up             |            done |
+---------------+----------------+-----------------+

它在第一次传递中检测列宽度,然后在第二次传递中根据列位置拆分行。
可以使用在脚本开头分配的bash数组cols控制要打印的列。请按递增顺序将数组分配给所需列号列表。如果您想以不同的方式使用bash变量,请告诉我。

最新更新