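I'm not good at Unix.
I have a CSV file with several columns. One of the columns contains newlines and ^M characters. I need to replace all of them that fall between two " (that is, inside one cell value) with ~~, so that the cell value can be treated as a single field.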
"id","notes"
"N001","this is^M
test.
Again test
"
"N002","this is perfect"
"N00345","this is
having ^M
problem"
I need the file to look like this:
"id","notes"
"N001","this is~~test.~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is~~~~having ~~problem"
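This way the whole cell value can be read as a single field value.
I need to add one more case to this requirement: the data in a cell may itself contain a " (double quote). In that case, the end of the cell can be identified by a " followed by a comma. Here is the updated sample data: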
"id","notes"
"N001","this is^M
test. "Again test."
Again test
"
"N002","this is perfect"
"N00345","this is
having ^M
problem as it contains "
test"
The inner " may be kept or removed. The expected output is:
"id","notes"
"N001","this is~~test. "Again test."~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is ~~~~having ~~problem as it contains "~~test"
Using sed:
sed -i -e 's/^M//g' -e '/"$/!{:a N; s/\n/~~/; /"$/b; ba}' file
Note: to enter ^M, type Ctrl+V followed by Ctrl+M.
File content after running the command:
"id","notes"
"N001","this is~~test.~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is~~~~having ~~problem"
Or run dos2unix first and then sed:
dos2unix file
sed -i '/"$/!{:a N; s/\n/~~/; /"$/b; ba}' file
Short description: the idea is to remove the newline at the end of every line that does not end with ".
sed -i '       # -i edits in place, i.e. it modifies the file itself
/"$/!{         # if the line does not end with ", do the following
# label a: target for the loop below
:a
N;             # append the next line of input to the pattern space
s/\n/~~/;      # replace the embedded newline with ~~, i.e. join the lines
/"$/b;         # if the pattern space now ends with ", branch out of the loop
# otherwise branch back to label a, which forms the loop
ba
}
' file         # input file name
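The loop described above can be tried without typing a literal ^M, because printf writes a carriage return for \r. The following is only a minimal sketch under a few assumptions: the function names make_sample and join_cells are made up for illustration, carriage returns are dropped with tr instead of dos2unix, and the result goes to standard output instead of being written back with -i. The sed loop is an equivalent form of the one in the answer.

```shell
#!/bin/sh
# Write the first sample from the question to the given path.
# \r in the printf format is the carriage return shown as ^M.
make_sample() {
  printf '"id","notes"\n"N001","this is\r\ntest.\nAgain test\n"\n"N002","this is perfect"\n"N00345","this is\nhaving \r\nproblem"\n' > "$1"
}

# Drop every carriage return, then join each line that does not end
# with " to the lines after it, writing ~~ where the newline was.
join_cells() {
  tr -d '\r' < "$1" |
    sed -e ':a' -e '/"$/!{' -e 'N' -e 's/\n/~~/' -e 'ba' -e '}'
}
```

Running make_sample sample.csv and then join_cells sample.csv should print:
"id","notes"
"N001","this is~~test.~~Again test~~"
"N002","this is perfect"
"N00345","this is~~having ~~problem"
Because the carriage return is deleted here, each line break becomes a single ~~, which differs from the ~~~~ shown in some places in the question.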
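For more details, see man sed.
Edit: sometimes the data in a cell itself contains a ". In that case, using sed:
sed -i ':a N; s/\n/~~/; $s/"~~"/"\n"/g; ba' file
File content after running the command on the updated case data: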
"id","notes"
"N001","this is~~test. "Again test."~~~~Again test~~~~"
"N002","this is perfect"
"N00345","this is~~~~having ~~problem as it contains "~~test"
Using a perl one-liner:
perl -0777 -i -pe 's/\n/~~/g; s/"~~("|$)/"\n$1/g;' file
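The same rule for the updated case (a record boundary is a line ending in " followed by a line starting with ") can also be written in awk without reading the whole file into memory. This is a hedged sketch, not a drop-in replacement: the function name join_records is hypothetical, carriage returns are simply deleted, and the output goes to standard output.

```shell
#!/bin/sh
# Join physical lines into records. A new record starts only when the
# text collected so far ends with " and the current line starts with ".
join_records() {
  tr -d '\r' < "$1" | awk '
    NR == 1 { buf = $0; next }
    # Boundary found: emit the finished record and start a new one.
    buf ~ /"$/ && $0 ~ /^"/ { print buf; buf = $0; next }
    # Otherwise the line still belongs to the current cell.
    { buf = buf "~~" $0 }
    END { if (NR > 0) print buf }
  '
}
```

Like the sed and perl versions, this heuristic cannot tell a real record boundary from a cell whose text happens to end a line with " and continue on the next line with ", so such data would be split in the wrong place.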
You can do this with the sed command.
To replace only ^M:
sed -i 's|^M|~~|g' file_name
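Edit: thanks for the comment. Added a command that replaces ^M together with the newline.
To replace ^M and the newline:
sed -i ':a;N;$!ba;s|^M\n|~~|g' file_name
To get ^M in the console, hold Ctrl and press V, then M.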
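If typing a literal ^M is inconvenient, for example inside a script file, the carriage return can be kept in a shell variable instead. A small sketch, assuming a POSIX shell; the names cr and mark_cr are illustrative only.

```shell
#!/bin/sh
# Hold a real carriage return in a variable, so no Ctrl+V Ctrl+M is needed.
# Command substitution strips trailing newlines only, so the \r survives.
cr=$(printf '\r')

# Replace the carriage return at the end of each line with ~~.
# Lines are not joined here; only the ^M itself is replaced.
mark_cr() {
  sed "s/${cr}\$/~~/" "$1"
}
```

For example, mark_cr file_name prints the file with every trailing ^M turned into ~~ and leaves the file itself unchanged.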
Using tr (\r is the ^M character):
$ tr '\r' '~' < file
sed 's/^M/~~/;t nextline
b
: nextline
N
s/\n/~~/
/^\([^"]*"[^"]*"\)\{1,\}[^"]*$/b
b nextline
' file
This changes not only the ^M but also the newlines between quotes.
In a Unix session, ^M is entered with Ctrl+V followed by Ctrl+M on the keyboard.