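I want to fix the problem shown below in a CSV file using Unix tools. I don't have access to the source, so I have to repair the CSV file itself. I need the desired output shown further down. Is this achievable? Please help.
I tried the command below, but it does not work.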
perl -p00e 's/\n|/|/g' test.csv
Issue:
DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS
2021-05-25|2202|Bras|1310
2021-05-25|1119|No Longer in Use - Depleted by 2019 Reclass|0805
2021-05-25|0949|No Longer in Use - Depleted by 2021 Reclass|0231
2021-05-25|1928|Fishing Gloves|1155
2021-05-25|1604|Training FW|1080
2021-05-25|0894|Hunting Waders|0894
2021-05-25|1873|Small Game|0326
2021-05-25|9950|EVENT
REGISTRATION FEE|9950
2021-05-25|0476|Regular Golf Gloves|0476
2021-05-25|1366|
Shorts|0988
2021-05-25|1914|Wade Shoes|0894
2021-05-25|0537|No Longer in Use - Depleted by 2019 Reclass|0537
2021-05-25|1635|Pickleball FW|
0021
2021-05-25|0679|Case Sunglasses|0679
2021-05-25|1544|Sandals|0001
2021-05-25|
1527|Golf/Tennis Accessories|1059
2021-05-25|1582|Lifestyle FW|0502
Expected result:
DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS
2021-05-25|2202|Bras|1310
2021-05-25|1119|No Longer in Use - Depleted by 2019 Reclass|0805
2021-05-25|0949|No Longer in Use - Depleted by 2021 Reclass|0231
2021-05-25|1928|Fishing Gloves|1155
2021-05-25|1604|Training FW|1080
2021-05-25|0894|Hunting Waders|0894
2021-05-25|1873|Small Game|0326
2021-05-25|9950|EVENT REGISTRATION FEE|9950
2021-05-25|0476|Regular Golf Gloves|0476
2021-05-25|1366|Shorts|0988
2021-05-25|1914|Wade Shoes|0894
2021-05-25|0537|No Longer in Use - Depleted by 2019 Reclass|0537
2021-05-25|1635|Pickleball FW|0021
2021-05-25|0679|Case Sunglasses|0679
2021-05-25|1544|Sandals|0001
2021-05-25|1527|Golf/Tennis Accessories|1059
2021-05-25|1582|Lifestyle FW|0502
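Using awk
The output can be fixed very simply with 3 rules. Specifically, you check whether each line has all 4 fields and ends with 4 digits (i.e. the 4th field, $4). If so, simply print the line (rule 1). If not, and the line begins with a date in your format, output it without the '\n' so that the next line is appended to it (rule 2). If you reach a line that satisfies neither rule 1 nor rule 2, it is the end of the previous line, so simply output it with a '\n' to complete the previous line (rule 3).
This can be done with: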
awk -F'|' '
NF==4 && $4~/^[[:digit:]]{4}$/ { print; next }
$1~/[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/ {
printf "%s",$0
next
}
{ print }
' f.csv
Example Use/Output
With your input file in f.csv, you would get:
$ awk -F'|' '
> NF==4 && $4~/^[[:digit:]]{4}$/ { print; next }
> $1~/[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/ {
> printf "%s",$0
> next
> }
> { print }
> ' f.csv
DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS
2021-05-25|2202|Bras|1310
2021-05-25|1119|No Longer in Use - Depleted by 2019 Reclass|0805
2021-05-25|0949|No Longer in Use - Depleted by 2021 Reclass|0231
2021-05-25|1928|Fishing Gloves|1155
2021-05-25|1604|Training FW|1080
2021-05-25|0894|Hunting Waders|0894
2021-05-25|1873|Small Game|0326
2021-05-25|9950|EVENTREGISTRATION FEE|9950
2021-05-25|0476|Regular Golf Gloves|0476
2021-05-25|1366|Shorts|0988
2021-05-25|1914|Wade Shoes|0894
2021-05-25|0537|No Longer in Use - Depleted by 2019 Reclass|0537
2021-05-25|1635|Pickleball FW|0021
2021-05-25|0679|Case Sunglasses|0679
2021-05-25|1544|Sandals|0001
2021-05-25|1527|Golf/Tennis Accessories|1059
2021-05-25|1582|Lifestyle FW|0502
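This is the output you specified, with one caveat: where the line break fell inside a field (EVENT / REGISTRATION FEE), the two halves are joined with nothing between them, so you get EVENTREGISTRATION FEE unless the first half ends with a trailing space.
You can also write it in a condensed form, one rule per line: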
awk -F'|' '
NF==4 && $4~/^[[:digit:]]{4}$/ { print; next }
$1~/[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/ { printf "%s",$0; next }
{ print }
' f.csv
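If you need the space back in EVENT REGISTRATION FEE, one possible variation is to count fields instead of matching the date. The sketch below is only an illustration, not a drop-in replacement: the function name join_records is made up here, and it assumes that a record is complete once it has 4 fields with a non-empty last field, that the last field itself is never split across lines, and that a break in the middle of a field should become a single space.

```shell
# join_records: hypothetical helper, reads the broken file on stdin.
join_records() {
    awk '
    {
        # Glue the pieces together. Add a space only when the break fell
        # inside a field, i.e. not directly next to a "|" delimiter.
        if (buf == "")                          buf = $0
        else if (buf ~ /[|]$/ || $0 ~ /^[|]/)   buf = buf $0
        else                                    buf = buf " " $0

        # Assumption: a record is complete once it has 4 fields and
        # the 4th field is not empty.
        n = split(buf, parts, "|")
        if (n >= 4 && parts[4] != "") { print buf; buf = "" }
    }
    # Flush whatever is left if the file ends in an incomplete record.
    END { if (buf != "") print buf }
    '
}

printf '%s\n' \
    'DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS' \
    '2021-05-25|9950|EVENT' \
    'REGISTRATION FEE|9950' \
    '2021-05-25|1366|' \
    'Shorts|0988' \
    '2021-05-25|1635|Pickleball FW|' \
    '0021' | join_records
```

On your real file you would call it as join_records < f.csv > fixed.csv. Because it does not look at the date at all, it does not care which year the rows start with, but it will misjoin rows if a break ever lands inside the CLASS field.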
Look things over and let me know if you have further questions.
There is also a very simple solution:
perl -pe 's/\n/ /g;s/2021-/\n2021-/g;s/\| */|/g' input.txt
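This gives you the rows below (shown here as a table; the actual output is pipe-delimited). Note that it hard-codes 2021- as the start of a record and leaves a trailing space at the end of each line: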
+------------+--------------+---------------------------------------------+--------+
| DATECODE | SUBCLASSCODE | SUBCLASS_NAME | CLASS |
+------------+--------------+---------------------------------------------+--------+
| 2021-05-25 | 2202 | Bras | 1310 |
| 2021-05-25 | 1119 | No Longer in Use - Depleted by 2019 Reclass | 0805 |
| 2021-05-25 | 0949 | No Longer in Use - Depleted by 2021 Reclass | 0231 |
| 2021-05-25 | 1928 | Fishing Gloves | 1155 |
| 2021-05-25 | 1604 | Training FW | 1080 |
| 2021-05-25 | 0894 | Hunting Waders | 0894 |
| 2021-05-25 | 1873 | Small Game | 0326 |
| 2021-05-25 | 9950 | EVENT REGISTRATION FEE | 9950 |
| 2021-05-25 | 0476 | Regular Golf Gloves | 0476 |
| 2021-05-25 | 1366 | Shorts | 0988 |
| 2021-05-25 | 1914 | Wade Shoes | 0894 |
| 2021-05-25 | 0537 | No Longer in Use - Depleted by 2019 Reclass | 0537 |
| 2021-05-25 | 1635 | Pickleball FW | 0021 |
| 2021-05-25 | 0679 | Case Sunglasses | 0679 |
| 2021-05-25 | 1544 | Sandals | 0001 |
| 2021-05-25 | 1527 | Golf/Tennis Accessories | 1059 |
| 2021-05-25 | 1582 | Lifestyle FW | 0502 |
+------------+--------------+---------------------------------------------+--------+
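Whichever command you use, it may be worth checking the repaired file before loading it anywhere. A minimal sketch, assuming every row (header included) should have exactly 4 pipe-separated fields; the function name check_fields is made up for this example:

```shell
# check_fields: hypothetical helper, reads a pipe-delimited file on stdin
# and reports every line whose field count differs from the expected one
# (default 4). Exit status is non-zero if any such line was found.
check_fields() {
    awk -F'|' -v want="${1:-4}" '
        NF != want { printf "line %d: %d fields\n", NR, NF; bad = 1 }
        END { exit bad + 0 }
    '
}

printf '%s\n' \
    'DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS' \
    '2021-05-25|1366|' \
    'Shorts|0988' | check_fields || echo 'broken rows found'
```

On a real file that would be check_fields < fixed.csv. It only counts fields, so it will not notice a row that has 4 fields but wrong content.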