Edit data: remove line breaks and put everything on one line



Hi, I'm new to shell scripting and I can't work out how to do this:

My data looks like this (the real file is much larger):

 >SampleName_ZN189A
 01000001000000000000100011100000000111000000001000
 00110000100000000000010000000000001100000010000000
 00110000000000001110000010010011111000000100010000
 00000110000001000000010100000000010000001000001110
 0011
 >SampleName_ZN189B
 00110000001101000001011100000000000000000000010001
 00010000000000000010010000000000100100000001000000
 00000000000000000000000010000000000010111010000000
 01000110000000110000001010010000001111110101000000
 1100

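Note: there is a line break after every 50 characters, but the last line of each record can be shorter, where the data ends and a new sample name begins.

I would like the line breaks after every 50 characters to be removed, so that my data looks like this: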

 >SampleName_ZN189A
 0100000100000000000010001110000000011100000000100000110000100000000000010000000000001100000010000000...
 >SampleName_ZN189B
 0011000000110100000101110000000000000000000001000100010000000000000010010000000000100100000001000000...

I tried using tr, but got an error:

tr '\n' '' < my_file
tr: empty string2

Thanks in advance.

tr with "-d" deletes the specified characters:

$ cat input.txt
00110000001101000001011100000000000000000000010001
00010000000000000010010000000000100100000001000000
00000000000000000000000010000000000010111010000000
01000110000000110000001010010000001111110101000000
1100
$ cat input.txt | tr -d "\n"
001100000011010000010111000000000000000000000100010001000000000000001001000000000010010000000100000000000000000000000000000010000000000010111010000000010001100000001100000010100100000011111101010000001100
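
One caveat, as far as I can tell: run on the whole multi-record file, tr -d '\n' also deletes the newlines around the ">" header lines, so headers and data all end up on one line. Below is a minimal sketch of a plain-shell loop that keeps each header on its own line. The function name join_records and the two-record sample are made up for illustration, and it assumes header lines start with ">" in the first column.

```shell
# join_records: hypothetical helper; reads FASTA-like text on stdin.
# Header lines (starting with ">") stay on their own line;
# all other lines are concatenated without newlines.
join_records() {
    first=1
    # "|| [ -n "$line" ]" also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '>'*)
                # Finish the previous record's data line before a new header.
                [ "$first" -eq 1 ] || printf '\n'
                printf '%s\n' "$line"
                ;;
            *)
                # Data line: print it without a newline.
                printf '%s' "$line"
                ;;
        esac
        first=0
    done
    # Terminate the last data line (skipped for empty input).
    [ "$first" -eq 1 ] || printf '\n'
}

# Invented two-record sample, just to show the difference.
printf '>A\n01\n10\n>B\n11\n' | tr -d '\n'; echo   # prints: >A0110>B11
printf '>A\n01\n10\n>B\n11\n' | join_records       # header and data on separate lines
```

A shell loop like this is slow on large files; the awk and sed answers are likely a better fit there.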

You can use this awk:

awk '/^ *>/{if (s) print s; print; s="";next} {s=s $0;next} END {print s}' file
>SampleName_ZN189A
010000010000000000001000111000000001110000000010000011000010000000000001000000000000110000001000000000110000000000001110000010010011111000000100010000000001100000010000000101000000000100000010000011100011
>SampleName_ZN189B
001100000011010000010111000000000000000000000100010001000000000000001001000000000010010000000100000000000000000000000000000010000000000010111010000000010001100000001100000010100100000011111101010000001100
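
For readers new to awk, here is the same idea spelled out over several lines with comments. This is only a sketch: the wrapper name join_fasta is hypothetical, and it tests the buffer with s != "" instead of the bare if (s), which also avoids printing an empty line for empty input.

```shell
# join_fasta: hypothetical wrapper; reads stdin, writes joined records to stdout.
join_fasta() {
    awk '
        /^ *>/ {                      # header line, optionally indented
            if (s != "") print s      # flush the data collected for the previous record
            print                     # print the header itself
            s = ""                    # start a new, empty buffer
            next
        }
        { s = s $0 }                  # data line: append to the buffer, no newline
        END { if (s != "") print s }  # flush the last record
    '
}
```

It could then be called as join_fasta < my_file > joined.txt, where joined.txt is just an example output name.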

Using awk:

awk '/>/{print ((NR==1) ? $0 : RS $0); next} {printf "%s", $0} END{print ""}' file

If you don't mind an extra blank line at the start of the output, here is a shorter version:

awk '{printf "%s", (/>/ ? RS $0 RS : $0)}' file

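This might work for you (GNU sed):

sed '/^\s*>/!{H;$!d};x;s/\n\s*//2gp;x;h;d' file

Build up each record in the hold space; when the start of the next record or the end of the file is reached, remove the newlines and print the record.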
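
The hold-space steps described above can be sketched as a commented script. This assumes GNU sed (for \s and the combined 2g flag), and the function name join_with_sed is invented for illustration.

```shell
# join_with_sed: hypothetical wrapper around the GNU sed one-liner.
join_with_sed() {
    sed '
        # Data line: append it to the hold space; unless it is the last
        # line of the file, delete the pattern space and read the next line.
        /^\s*>/!{
            H
            $!d
        }
        # Header line or last line: swap, so the finished record is in
        # the pattern space and the current line is in the hold space.
        x
        # Delete every newline (plus following blanks) except the first,
        # which separates header and data; p prints only if a
        # replacement was actually made.
        s/\n\s*//2gp
        # Swap back, copy the current line into the hold space as the
        # start of the next record, then delete to suppress auto-print.
        x
        h
        d
    '
}
```

As far as I can tell, a record with only a single data line is not printed by this script, because there is no second newline for the substitution to replace. That should not matter for data wrapped at 50 characters with several lines per record.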

You can use this sed:

sed '/^>Sample/!{ :loop; N; /\n>Sample/{n}; s/\n//; b loop; }' file.txt

Try this:

cat SampleName_ZN189A | tr -d '\r'
# tr -d deletes the specified character from the input; here '\r' (carriage return). Use '\n' to delete Unix line feeds.

This can also be done with simple awk.

 awk 'BEGIN{ORS=""} {print}' SampleName_ZN189A
 # The output has no line break at the end. If you want one, this works:
 awk 'BEGIN{ORS=""} {print} END{print "\n"}' SampleName_ZN189A
 # Choose the line-break character that matches the file format: \r, \n, or \r\n.
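
If the file uses Windows-style \r\n line endings, removing only \n would leave a stray \r inside the joined data. Here is a minimal sketch, assuming that situation, which strips a trailing carriage return from each line and keeps the ">" headers on their own lines. The function name join_crlf_safe and the variable name open are made up for illustration.

```shell
# join_crlf_safe: hypothetical helper; reads stdin, tolerates \r\n line endings.
join_crlf_safe() {
    awk '
        { sub(/\r$/, "") }                 # drop a trailing carriage return, if any
        /^>/ {                             # header line
            if (open) printf "\n"          # close the previous data line first
            open = 0
            print
            next
        }
        { printf "%s", $0; open = 1 }      # data line: append without a newline
        END { if (open) printf "\n" }      # terminate the last data line
    '
}
```

Files that already use plain \n endings pass through the sub() call unchanged, so the same function should work for both formats.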
