How to fix line breaks in a CSV exported by a shell script



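I want to fix the problem below in a CSV file using Unix tools. I don't have access to the source code that produces the file, so I have to fix the CSV file itself. I need the desired output shown below. Is this achievable? Please help.

I tried the following, but it does not work.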

perl -p00e 's/\n|/|/g' test.csv

Problem:

DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS
2021-05-25|2202|Bras|1310
2021-05-25|1119|No Longer in Use - Depleted by 2019 Reclass|0805
2021-05-25|0949|No Longer in Use - Depleted by 2021 Reclass|0231
2021-05-25|1928|Fishing Gloves|1155
2021-05-25|1604|Training FW|1080
2021-05-25|0894|Hunting Waders|0894
2021-05-25|1873|Small Game|0326
2021-05-25|9950|EVENT
REGISTRATION FEE|9950
2021-05-25|0476|Regular Golf Gloves|0476
2021-05-25|1366|
Shorts|0988
2021-05-25|1914|Wade Shoes|0894
2021-05-25|0537|No Longer in Use - Depleted by 2019 Reclass|0537
2021-05-25|1635|Pickleball FW|
0021
2021-05-25|0679|Case Sunglasses|0679
2021-05-25|1544|Sandals|0001
2021-05-25|
1527|Golf/Tennis Accessories|1059
2021-05-25|1582|Lifestyle FW|0502

Desired output:

DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS
2021-05-25|2202|Bras|1310
2021-05-25|1119|No Longer in Use - Depleted by 2019 Reclass|0805
2021-05-25|0949|No Longer in Use - Depleted by 2021 Reclass|0231
2021-05-25|1928|Fishing Gloves|1155
2021-05-25|1604|Training FW|1080
2021-05-25|0894|Hunting Waders|0894
2021-05-25|1873|Small Game|0326
2021-05-25|9950|EVENT REGISTRATION FEE|9950
2021-05-25|0476|Regular Golf Gloves|0476
2021-05-25|1366|Shorts|0988
2021-05-25|1914|Wade Shoes|0894
2021-05-25|0537|No Longer in Use - Depleted by 2019 Reclass|0537
2021-05-25|1635|Pickleball FW|0021
2021-05-25|0679|Case Sunglasses|0679
2021-05-25|1544|Sandals|0001
2021-05-25|1527|Golf/Tennis Accessories|1059
2021-05-25|1582|Lifestyle FW|0502

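Using awk you can fix the output very simply with three rules. Specifically, you check whether each line begins with a date in your format and ends with a four-digit value (the fourth field, $4). If so, just print the line (rule 1). If not, and the line begins with a date in your format, output its contents without a trailing '\n' so that the next line is appended to it (rule 2). If you reach a line that satisfies neither rule 1 nor rule 2, it is the end of the previous line, so just output it with a '\n' to complete that line (rule 3).

This can be done with: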

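awk -F'|' '
    # rule 1: complete record (4 fields, 4-digit CLASS), print as is
    NF==4 && $4~/^[[:digit:]]{4}$/ { print; next }
    # rule 2: starts with a date but is incomplete, print without a newline
    $1~/[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/ {
        printf "%s",$0
        next
    }
    # rule 3: remainder of the previous line, print with a newline
    { print }
' f.csv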

Example Use/Output

With your input file in f.csv, you would get:

$ awk -F'|' '
>     NF==4 && $4~/^[[:digit:]]{4}$/ { print; next }
>     $1~/[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/ {
>         printf "%s",$0
>         next
>     }
>     { print }
> ' f.csv
DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS
2021-05-25|2202|Bras|1310
2021-05-25|1119|No Longer in Use - Depleted by 2019 Reclass|0805
2021-05-25|0949|No Longer in Use - Depleted by 2021 Reclass|0231
2021-05-25|1928|Fishing Gloves|1155
2021-05-25|1604|Training FW|1080
2021-05-25|0894|Hunting Waders|0894
2021-05-25|1873|Small Game|0326
2021-05-25|9950|EVENTREGISTRATION FEE|9950
2021-05-25|0476|Regular Golf Gloves|0476
2021-05-25|1366|Shorts|0988
2021-05-25|1914|Wade Shoes|0894
2021-05-25|0537|No Longer in Use - Depleted by 2019 Reclass|0537
2021-05-25|1635|Pickleball FW|0021
2021-05-25|0679|Case Sunglasses|0679
2021-05-25|1544|Sandals|0001
2021-05-25|1527|Golf/Tennis Accessories|1059
2021-05-25|1582|Lifestyle FW|0502

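This matches the output you specified, with one exception: the break inside EVENT REGISTRATION FEE is closed up without a space (EVENTREGISTRATION FEE), because rule 2 appends the next line directly to the current one.

You can also write it in condensed form, one rule per line: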
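If the space inside a broken field matters, rule 2 can emit one whenever the held line does not end in a |. The following is only a minimal sketch under some assumptions: the sample variable is an abbreviated copy of the question's data, and the character classes are spelled out as [0-9] so the patterns also work in awk versions without interval expressions. A break that falls immediately before a | would still pick up an unwanted space.

```shell
#!/bin/sh
# Abbreviated copy of the data from the question (assumed for this sketch).
sample='DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS
2021-05-25|2202|Bras|1310
2021-05-25|9950|EVENT
REGISTRATION FEE|9950
2021-05-25|1366|
Shorts|0988
2021-05-25|1635|Pickleball FW|
0021
2021-05-25|
1527|Golf/Tennis Accessories|1059'

fixed=$(printf '%s\n' "$sample" | awk -F'|' '
    # Rule 1: four fields and a 4-digit CLASS, so the record is complete.
    NF == 4 && $4 ~ /^[0-9][0-9][0-9][0-9]$/ { print; next }
    # Rule 2: starts with a date but is incomplete. Hold the line open and
    # add a space only when the break fell inside a field.
    $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
        printf "%s%s", $0, ($0 ~ /[|]$/ ? "" : " ")
        next
    }
    # Rule 3: remainder of the previous line.
    { print }
')
printf '%s\n' "$fixed"
```

On this sample the split record comes out as 2021-05-25|9950|EVENT REGISTRATION FEE|9950, while the records broken right after a | are joined without a space.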

awk -F'|' '
    NF==4 && $4~/^[[:digit:]]{4}$/ { print; next }
    $1~/[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/ { printf "%s",$0; next }
    { print }
' f.csv
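As an alternative sketch that does not depend on the date format, the lines can be glued together until a record has the expected number of fields. This is not the approach used above. The want variable, the buf and part names, and the abbreviated sample data are assumptions made for illustration. It also assumes that the last field of a real record is never empty and that a break never falls inside the last field.

```shell
#!/bin/sh
# Abbreviated copy of the data from the question (assumed for this sketch).
sample='DATECODE|SUBCLASSCODE|SUBCLASS_NAME|CLASS
2021-05-25|2202|Bras|1310
2021-05-25|9950|EVENT
REGISTRATION FEE|9950
2021-05-25|1366|
Shorts|0988
2021-05-25|1635|Pickleball FW|
0021
2021-05-25|
1527|Golf/Tennis Accessories|1059'

joined=$(printf '%s\n' "$sample" | awk -v want=4 '
    {
        # Glue this line onto the pending record. Add a space only when
        # the break fell inside a field rather than at a "|" boundary.
        if (buf == "")                          buf = $0
        else if (buf ~ /[|]$/ || $0 ~ /^[|]/)   buf = buf $0
        else                                    buf = buf " " $0

        # Emit once the record has enough fields and a non-empty last one.
        n = split(buf, part, "|")
        if (n >= want && part[n] != "") { print buf; buf = "" }
    }
    END { if (buf != "") print buf }   # flush an unfinished tail, if any
')
printf '%s\n' "$joined"
```

Because it counts fields rather than matching dates, the same program should work unchanged if the first column holds something other than a date. The price is the stated assumption about the last field.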

Look things over and let me know if you have further questions.

There is also a very simple solution:

perl -pe 's/\n/ /g;s/2021-/\n2021-/g;s/\| */|/g' input.txt

which gives you the following rows (shown here as a table):

+------------+--------------+---------------------------------------------+--------+
| DATECODE   | SUBCLASSCODE | SUBCLASS_NAME                               | CLASS  |
+------------+--------------+---------------------------------------------+--------+
| 2021-05-25 | 2202         | Bras                                        | 1310   |
| 2021-05-25 | 1119         | No Longer in Use - Depleted by 2019 Reclass | 0805   |
| 2021-05-25 | 0949         | No Longer in Use - Depleted by 2021 Reclass | 0231   |
| 2021-05-25 | 1928         | Fishing Gloves                              | 1155   |
| 2021-05-25 | 1604         | Training FW                                 | 1080   |
| 2021-05-25 | 0894         | Hunting Waders                              | 0894   |
| 2021-05-25 | 1873         | Small Game                                  | 0326   |
| 2021-05-25 | 9950         | EVENT REGISTRATION FEE                      | 9950   |
| 2021-05-25 | 0476         | Regular Golf Gloves                         | 0476   |
| 2021-05-25 | 1366         | Shorts                                      | 0988   |
| 2021-05-25 | 1914         | Wade Shoes                                  | 0894   |
| 2021-05-25 | 0537         | No Longer in Use - Depleted by 2019 Reclass | 0537   |
| 2021-05-25 | 1635         | Pickleball FW                               | 0021   |
| 2021-05-25 | 0679         | Case Sunglasses                             | 0679   |
| 2021-05-25 | 1544         | Sandals                                     | 0001   |
| 2021-05-25 | 1527         | Golf/Tennis Accessories                     | 1059   |
| 2021-05-25 | 1582         | Lifestyle FW                                | 0502   |
+------------+--------------+---------------------------------------------+--------+
