I have a file that looks like this:
FirstSentences1 bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj
SecondSentences2 fjlskdjfjoijrgeojrgijgoejrgrjgiorjofgjeirjgoergd
.
.
.
NthhhSentencesN klkdlffjsldfsljflsfjlskfjldkjflsfjlfkdjfdfjojjij
I need to get the following output:
FirstSentences1 bfjkjhdfhizhfzibfkje
FirstSentences1 zfzfiuzehfizdjfldfsd
FirstSentences1 fsljfklj
SecondSentences2 fjlskdjfjoijrgeojrgi
SecondSentences2 jgoejrgrjgiorjofgjei
SecondSentences2 rjgoergd
.
.
.
NthhhSentencesN klkdlffjsldfsljflsfj
NthhhSentencesN lskfjldkjflsfjlfkdjf
NthhhSentencesN dfjojjij
Explanation: take the first line, for example:
FirstSentences1 bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj
We take the string "bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj" and wrap it every 20 characters, repeating the prefix on each wrapped line.
Do you know how to do this?
You can use a short script that relies on string indexing and nested loops:
#!/bin/bash
declare -i len=${2:-20}                     ## take length as 2nd arg (filename is 1st)
while read -r line; do                      ## read each line
    while [ ${#line} -gt 0 ]; do            ## while characters remain
        printf "%s\n" "${line:0:$((len))}"  ## print len chars
        line="${line:$((len))}"             ## strip len chars from line
    done
done < "$1"
Example Input File
$ cat dat/longsent.txt
bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj
fjlskdjfjoijrgeojrgijgoejrgrjgiorjofgjeirjgoergd
Example Use/Output
Default, 20 chars per line:
$ bash wrap.sh dat/longsent.txt
bfjkjhdfhizhfzibfkje
zfzfiuzehfizdjfldfsd
fsljfklj
fjlskdjfjoijrgeojrgi
jgoejrgrjgiorjofgjei
rjgoergd
Wrapping at 10 chars per line:
$ bash wrap.sh dat/longsent.txt 10
bfjkjhdfhi
zhfzibfkje
zfzfiuzehf
izdjfldfsd
fsljfklj
fjlskdjfjo
ijrgeojrgi
jgoejrgrjg
iorjofgjei
rjgoergd
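Note: you should validate that len is greater than 0 (with a length of 0 nothing is ever stripped from the line, so the inner loop never ends), and you can add || test -n "$line" to the first while clause to handle a final line that lacks the POSIX end-of-line (omitted for brevity).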
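The two points in the note could be sketched roughly as follows. This is only a minimal illustration, not a replacement for the script above: the function name wrap_file and the error message are my own choices, and it assumes bash (for `local -i` and substring expansion).

```shell
#!/bin/bash
# wrap_file FILE [LEN]  (hypothetical helper name, used only for illustration)
# Print each line of FILE in chunks of at most LEN characters (default 20).
wrap_file() {
    local file=$1 line
    local -i len=${2:-20}       # non-numeric input evaluates to 0 here

    # Reject zero or negative lengths; a length of 0 would loop forever.
    if (( len <= 0 )); then
        printf 'error: len must be greater than 0\n' >&2
        return 1
    fi

    # '|| [ -n "$line" ]' keeps a final line that has no trailing newline.
    while read -r line || [ -n "$line" ]; do
        while [ "${#line}" -gt 0 ]; do          # while characters remain
            printf '%s\n' "${line:0:len}"       # print up to len chars
            line=${line:len}                    # strip them from the line
        done
    done < "$file"
}

# Run only when arguments are given, e.g.: bash wrap.sh dat/longsent.txt 10
if [ "$#" -gt 0 ]; then
    wrap_file "$@"
fi
```

The length check comes before the file is opened, so a bad length fails fast without reading any input.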
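Including the Line Prefix
If your data file contains prefixes (e.g. FirstSentence1, ...) and you need to include them in the output, just add a read of prefix before line, and output prefix before each wrapped line (with the same field width, left-justified). For example: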
#!/bin/bash
declare -i len=${2:-20}     ## take length as 2nd arg (filename is 1st)
declare -i wdth=22          ## set min field width for prefix (so cols align)
while read -r prefix line; do               ## read prefix and rest of each line
    while [ ${#line} -gt 0 ]; do            ## while characters remain
        ## print len chars w/prefix width set to wdth, left-justified
        printf "%-*s %s\n" $wdth "$prefix" "${line:0:$((len))}"
        line="${line:$((len))}"             ## strip len chars from line
    done
done < "$1"
Example Input File w/Prefixes
$ cat dat/longsentpfx.txt
FirstSentence1 bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj
SecondSentences2 fjlskdjfjoijrgeojrgijgoejrgrjgiorjofgjeirjgoergd
Example Use/Output
$ bash wrap.sh dat/longsentpfx.txt
FirstSentence1         bfjkjhdfhizhfzibfkje
FirstSentence1         zfzfiuzehfizdjfldfsd
FirstSentence1         fsljfklj
SecondSentences2       fjlskdjfjoijrgeojrgi
SecondSentences2       jgoejrgrjgiorjofgjei
SecondSentences2       rjgoergd
$ bash wrap.sh dat/longsentpfx.txt 10
FirstSentence1         bfjkjhdfhi
FirstSentence1         zhfzibfkje
FirstSentence1         zfzfiuzehf
FirstSentence1         izdjfldfsd
FirstSentence1         fsljfklj
SecondSentences2       fjlskdjfjo
SecondSentences2       ijrgeojrgi
SecondSentences2       jgoejrgrjg
SecondSentences2       iorjofgjei
SecondSentences2       rjgoergd
Let me know if you have further questions.
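Note: to set the width to exactly one character more than the longest prefix, you need to read all the prefix values before actually writing the wrapped lines, find the longest one, and then add 1. If your data file is short, you can read the prefixes and lines into a pair of indexed arrays and scan the prefix array for the longest length first. If the data file is large, you can scan the file twice (not optimal), or you can simply set some predetermined width as done above.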
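For the short-file case, the indexed-array approach from the note might look something like the sketch below. The function name wrap_prefixed and the array names are hypothetical. Note also that, unlike the script above, the format string here has no literal space between the two fields, so the padded width (longest prefix + 1) provides the single-space gap by itself.

```shell
#!/bin/bash
# wrap_prefixed FILE [LEN]  (hypothetical helper name, used only for illustration)
# Wrap the second field of each line at LEN chars (default 20), repeating the
# prefix in a column one character wider than the longest prefix in FILE.
wrap_prefixed() {
    local file=$1 prefix line i
    local -i len=${2:-20} wdth=0
    local -a prefixes=() lines=()

    if (( len <= 0 )); then
        printf 'error: len must be greater than 0\n' >&2
        return 1
    fi

    # Pass 1: store each prefix/line pair and track the longest prefix.
    while read -r prefix line || [ -n "$prefix" ]; do
        prefixes+=("$prefix")
        lines+=("$line")
        if (( ${#prefix} > wdth )); then
            wdth=${#prefix}
        fi
    done < "$file"
    wdth+=1                     # integer attribute makes this arithmetic: longest + 1

    # Pass 2: walk the stored arrays and print the wrapped chunks.
    for i in "${!lines[@]}"; do
        line=${lines[i]}
        while [ "${#line}" -gt 0 ]; do
            printf '%-*s%s\n' "$wdth" "${prefixes[i]}" "${line:0:len}"
            line=${line:len}
        done
    done
}

# Run only when arguments are given, e.g.: bash wrap.sh dat/longsentpfx.txt 10
if [ "$#" -gt 0 ]; then
    wrap_prefixed "$@"
fi
```

This holds the whole file in memory, which is why it only suits short data files. For large files the fixed width or a two-pass read remains the better trade-off.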
Using substr (awk string positions start at 1):
awk '{ for(i=1;i<=length($2);i=i+20) print $1,substr($2,i,20) }' file
For your example, you could use the following (patsplit requires GNU awk):
awk '{n=patsplit($2, a, /.{1,20}/); for(i=1;i<=n;i++) print $1, a[i] }' file