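I want to replace every occurrence of " that appears between ," and ", with ''' (three single quotes). This will be run on a CSV file and must handle all possible nested quotes, so that they do not break the formatting.
For example, "test","""","test"
becomes "test","''''''","test".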
Another example: "test","quotes "inside" quotes","test"
becomes "test","quotes '''inside''' quotes","test".
I am using https://sed.js.org/ to test the replacement.
What I have so far is:
sed "s/([^,])(")(.)/\1'\''\3/g"
but it seems incomplete: it does not cover all the cases I want.
Works: "anything","inside "quotes"","anything"
→"anything","inside '''quotes'''","anything"
Does not work: "anything","inside "test" quotes","anything"
→"anything''',"inside '''test''' quotes''',"anything"
Expected:
→"anything","inside '''test''' quotes","anything"
Maybe someone who is good with regular expressions can help?
Using sed
$ cat input_file
"test","""","test"
"test","quotes "inside" quotes","test"
"anything","inside "quotes"","anything"
"anything","inside "test" quotes","anything"
$ sed -E ':a;s/(,"[^,]*('"'"'+)?)"([^,]*"(,|$))/\1'"'''"'\3/;ta' input_file
"test","''''''","test"
"test","quotes '''inside''' quotes","test"
"anything","inside '''quotes'''","anything"
"anything","inside '''test''' quotes","anything"
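The loop in the command above can be unpacked as in the sketch below. The function name `fix_inner_quotes` and the variable `q` are hypothetical, introduced only for illustration. The sketch assumes a sed that supports `-E`, that no field contains a comma, and that the first field has no nested quotes (the pattern only matches fields that follow a comma). Splitting the script across several `-e` options is a choice made here so that the label `:a` sits on its own line.

```shell
#!/bin/sh
# Hypothetical helper: replace quotes nested inside a field with '''.
# Assumes fields contain no commas and the first field has no nested quotes.
fix_inner_quotes() {
    q="'''"   # replacement text, kept in a variable to avoid quote juggling
    # :a    label marking the top of the loop
    # s///  replaces one remaining inner quote of a field per pass:
    #         group 1 = ," plus the text before the inner quote
    #         group 3 = the text after it, up to the closing quote and , or end of line
    # ta    jumps back to :a if the substitution changed the line
    sed -E \
        -e ':a' \
        -e "s/(,\"[^,]*('+)?)\"([^,]*\"(,|\$))/\\1${q}\\3/" \
        -e 'ta'
}

printf '%s\n' '"anything","inside "test" quotes","anything"' | fix_inner_quotes
# prints: "anything","inside '''test''' quotes","anything"
```

Each pass rewrites a single inner quote, so a field with several nested quotes needs several passes; the loop stops once no field has an inner quote left.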
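Use a variable, ${qs}, to avoid escaping the three single quotes.
Start by replacing every double quote with ${qs}.
Then undo the replacement at the start of the line, at the end of the line, and around each comma.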
qs="'''"
sed "s/\"/${qs}/g; s/^${qs}/\"/; s/${qs}$/\"/; s/${qs},${qs}/\",\"/g" csvfile
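For testing without a file, the same four substitutions can be wrapped in a small function that reads standard input. This is a sketch only: the name `requote` is hypothetical, and it assumes that every field is quoted and that the sequence "," never occurs inside a field, since that sequence would be taken for a field boundary.

```shell
#!/bin/sh
# Hypothetical wrapper around the four substitutions described above.
# Assumes every field is quoted and that "," never occurs inside a field.
requote() {
    qs="'''"
    # 1. turn every double quote into '''
    # 2. restore the opening quote of the first field
    # 3. restore the closing quote of the last field
    # 4. restore the quotes on both sides of each field separator
    sed \
        -e "s/\"/${qs}/g" \
        -e "s/^${qs}/\"/" \
        -e "s/${qs}\$/\"/" \
        -e "s/${qs},${qs}/\",\"/g"
}

printf '%s\n' '"test","quotes "inside" quotes","test"' | requote
# prints: "test","quotes '''inside''' quotes","test"
```

Unlike the looping approach, this version also handles nested quotes in the first field, because it does not rely on a preceding comma.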