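I define a paragraph as text that ends with a "." followed by a '\n' (it may also contain "." in the middle). For example, here are two paragraphs: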
first
paragraph.
second paragraph.
I want to convert this text to
first paragraph.
second paragraph.
I tried this
sed 's/\([^\n.]\)\(\n\)/\1 /g' file.txt
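But this had no effect on the output, although it seemed like an obvious solution after I learned about grouping matches in the pattern space and returning a group from the hold space. My idea was to replace every '\n' that does not follow a "." with a single space. I looked up some one-liners that convert Unix newlines to DOS newlines, but they did not solve my problem. The important part is that I want to keep every '\n' that follows a ".", so a plain 's/\n//g' does not work for me.
I am just curious whether it is possible to do this with sed. If anyone can point me in the direction I should take to learn sed, I would appreciate it. Thanks in advance!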
This is just to show that it is possible with sed, but I would recommend other tools (shell, awk, python, ...). Doing logic in sed is not easy.
[STEP 101] $ cat file
this
is the
1st paragraph.
this is the
2nd paragraph.
this is the 3rd paragraph.
this is the
4th paragraph?
[STEP 102] $ sed -e :go -e '$q;N;/[.]\n/{P;D;};s/\n/ /;bgo' file
this is the 1st paragraph.
this is the 2nd paragraph.
this is the 3rd paragraph.
this is the 4th paragraph?
[STEP 103] $
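For readers new to sed, the one-liner above can be spread over several lines with comments. This is only a sketch of the same commands; the function name join_paragraphs is made up for illustration. It also shows why the attempt in the question had no effect: sed strips the trailing newline from each line before running the script, so a pattern containing \n can only match after N has appended another line to the pattern space.

```shell
# join_paragraphs is a hypothetical wrapper name, not part of sed.
join_paragraphs() {
  sed -e '
    :go
    # On the last input line, print what has been collected and stop.
    $q
    # Append the next input line to the pattern space after a newline.
    N
    # A period right before that newline ends a paragraph: print the
    # first line, delete it, and restart the script with the remainder.
    /[.]\n/{
      P
      D
    }
    # Otherwise join the two lines with a single space and loop.
    s/\n/ /
    bgo
  ' "$@"
}
```

It reads the named files, or standard input when none are given, for example: join_paragraphs file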
Handling blank lines:
[STEP 104] $ cat file
this is the
1st paragraph.
this is the
2nd paragraph.
this is the 3rd paragraph.
this is the
4th paragraph?
[STEP 105] $ sed -e :go -e '$q;N;/[.]\n/{P;D;};s/^\n//;s/[ ]*\n[ ]*/ /;bgo' file
this is the 1st paragraph.
this is the 2nd paragraph.
this is the 3rd paragraph.
this is the 4th paragraph?
[STEP 106] $
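To try the blank-line variant without creating a file, the input can be piped in. The sample text below is an assumption made up for this sketch (blank lines inside and between paragraphs), and the function name join_paragraphs_skip_blank is hypothetical.

```shell
# Hypothetical wrapper around the blank-line-aware sed command.
join_paragraphs_skip_blank() {
  sed -e :go -e '$q;N;/[.]\n/{P;D;};s/^\n//;s/[ ]*\n[ ]*/ /;bgo' "$@"
}

# Made-up sample input: one argument per line, '' is a blank line.
printf '%s\n' 'this' '' 'is the' '1st paragraph.' '' 'this is the' '2nd paragraph.' |
  join_paragraphs_skip_blank
# Prints:
# this is the 1st paragraph.
# this is the 2nd paragraph.
```

The extra s/^\n// drops a blank line left at the start of the pattern space after a paragraph has been printed, and s/[ ]*\n[ ]*/ / collapses any spaces around the joined newline into one.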
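You can do this with sed, but you have to join lines and use labels. awk is better suited to this task. It is also nice to add a space after each line that does not end with a period.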
awk '{s=($0~/\.$/)? "\n": " "; printf "%s%s", $0, s}' file
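A possible variation on the awk approach, offered as a sketch and not as the answerer's code: collect lines in a buffer and print it whenever a line ends with a period. This avoids a trailing space and still ends the output with a newline when the last paragraph has no closing period. The names join_paragraphs_awk and buf are made up for illustration.

```shell
# Hypothetical wrapper; buf holds the paragraph collected so far.
join_paragraphs_awk() {
  awk '
    # Append the current line to the buffer, separated by one space.
    { buf = (buf == "" ? $0 : buf " " $0) }
    # A line ending in "." closes the paragraph: print and reset.
    /\.$/ { print buf; buf = "" }
    # Flush an unfinished paragraph at end of input.
    END { if (buf != "") print buf }
  ' "$@"
}
```

Usage is the same as for the one-liner: join_paragraphs_awk file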