Search and replace (escape) double quotes inside CSV values



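I want to replace every occurrence of " between the "," separators with ''' (three single quotes). This will be run on a CSV file and should handle all possible nested quotes, so that they do not break the format.

For example,
"test","""","test"
becomes
"test","''''''","test"
Another example:
"test","quotes "inside" quotes","test"
becomes
"test","quotes '''inside''' quotes","test"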

I am using https://sed.js.org/ to test the substitution.

What I have so far is:

sed "s/([^,])(")(.)/\1'\''\3/g"

But it seems incomplete: it does not cover all the cases I need.

Works:
"anything","inside "quotes"","anything"
"anything","inside '''quotes'''","anything"
Does not work:
"anything","inside "test" quotes","anything"
"anything''',"inside '''test''' quotes''',"anything"
Expected:

"anything","inside '''test''' quotes","anything"

Maybe someone who is good with regular expressions can help?

Using sed

$ cat input_file
"test","""","test"
"test","quotes "inside" quotes","test"
"anything","inside "quotes"","anything"
"anything","inside "test" quotes","anything"
$ sed -E ':a;s/(,"[^,]*('"'"'+)?)"([^,]*"(,|$))/\1'"'''"'\3/;ta' input_file
"test","''''''","test"
"test","quotes '''inside''' quotes","test"
"anything","inside '''quotes'''","anything"
"anything","inside '''test''' quotes","anything"

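To make the looped substitution above easier to follow, here is a sketch of the same expression split into separate -e parts, with the three single quotes kept in a variable so the shell quoting is simpler. The function name replace_inner_quotes and the variable qs are only illustrative, and the sketch assumes a sed that supports -E (GNU or BSD). Each pass replaces one inner double quote in a field that is preceded by a comma, and ta jumps back to the label a for as long as a substitution was made.

```shell
#!/bin/sh
# Illustrative helper (name is an assumption): reads CSV lines on stdin.
qs="'''"

replace_inner_quotes() {
    # :a   defines the label "a"
    # s    replaces ONE inner double quote per pass: \1 is the text from the
    #      opening ," up to that quote, \3 is the rest up to the closing quote
    # ta   branches back to "a" if the substitution changed the line
    sed -E -e ':a' \
           -e "s/(,\"[^,]*('+)?)\"([^,]*\"(,|\$))/\1${qs}\3/" \
           -e 'ta'
}

printf '%s\n' \
    '"test","""","test"' \
    '"anything","inside "test" quotes","anything"' | replace_inner_quotes
# "test","''''''","test"
# "anything","inside '''test''' quotes","anything"
```

Because the pattern starts with a comma and uses [^,]*, it does not touch the first field of a line and it assumes that field values contain no commas.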
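Use a variable ${qs} to avoid having to escape the three single quotes.
First replace every double quote with ${qs}.
Then restore the double quotes at the start of the line, at the end of the line, and around each comma.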

qs="'''"
sed "s/\"/${qs}/g; s/^${qs}/\"/; s/${qs}$/\"/; s/${qs},${qs}/\",\"/g" csvfile
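As a quick check of the replace-everything-then-restore idea, the sketch below wraps the same four substitutions in a function and feeds it the row that the original attempt got wrong. The function name escape_inner_quotes is hypothetical, and the input is assumed to be one record per line with every field quoted.

```shell
#!/bin/sh
# Hypothetical wrapper (name is an assumption): reads CSV lines on stdin.
qs="'''"

escape_inner_quotes() {
    # 1. turn every double quote into '''
    # 2. restore the quote that opens the line
    # 3. restore the quote that closes the line
    # 4. restore the quotes around each field separator
    sed "s/\"/${qs}/g; s/^${qs}/\"/; s/${qs}\$/\"/; s/${qs},${qs}/\",\"/g"
}

printf '%s\n' '"anything","inside "test" quotes","anything"' | escape_inner_quotes
# "anything","inside '''test''' quotes","anything"
```

One trade-off of this approach: a literal "," sequence inside a field value cannot be told apart from a real field separator, so it would be restored as one.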
