What I want to do is get the two characters before and after each character.
Input file:
hello
reader
.....
Expected output:
# # h e l //before character h is null, so it is padded with "#". After character h are "e" and "l".
# h e l l //before character e is "h". After character e are "l" and "l".
h e l l o //before character l are "h" and "e". After character l are "l" and "o".
e l l o # //before character l are "e" and "l". After character l is "o".
l l o # # //before character o are "l" and "l". After character o is null, so it is padded with "#".
# # r e a
# r e a d
r e a d e
e a d e r
a d e r #
d e r # #
Here is the code (credit to RudiC):
awk '
{ L = length * 2
M = int (L / 4)
X = sprintf ("%*sY%*s", M, "", M, "")
gsub (/ /, "#", X)
sub (/Y/, $1, X)
gsub (/./, "& ", X)
for (i=1; i<=L; i+=2) print substr (X, i, L-1)
}
' $1
But it only works for the first word:
# # h e l
# h e l l
h e l l o
e l l o #
l l o # #
# # # r e a
# # r e a d
# r e a d e
r e a d e r
e a d e r #
a d e r # #
I would use something like this:
awk '{n=length($0) # get the length n of the line
$0 = "##" $0 "##" # prepend and append "##"
gsub(/./, "& ") # add a space after every character
for (i=1; i<=2*n; i+=2) # step through the spaced-out line two positions at a time
print substr($0, i, 5*2-1) # print 5 characters plus the 4 spaces between them
print ""}' file # print an empty line to separate blocks
See it in action:
$ awk '{n=length($0); $0 = "##" $0 "##"; gsub(/./, "& "); {for (i=1; i<=2*n; i+=2) print substr($0, i, 5*2)} print ""}' file
# # h e l
# h e l l
h e l l o
e l l o #
l l o # #
# # r e a
# r e a d
r e a d e
e a d e r
a d e r #
d e r # #
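As you can see, the key here is to hardcode the number of characters to print rather than derive it from the length of the string. In my case, I set it to 5.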
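The same idea can be parameterised instead of hardcoding 5. Below is a minimal sketch, assuming a hypothetical shell function `windows` and a hypothetical variable `w` (the number of context characters on each side); neither name comes from the answer above.

```shell
# windows W: read lines on stdin, print every (2*W+1)-character sliding window.
# `windows` and `w` are hypothetical names used only for this illustration.
windows() {
  awk -v w="$1" '
  {
    n = length($0)                       # number of characters in the line
    pad = ""
    for (j = 0; j < w; j++) pad = pad "#"   # build w "#" characters
    s = pad $0 pad                       # pad both ends of the line
    for (i = 1; i <= n; i++) {           # one window per original character
      win = substr(s, i, 2 * w + 1)
      gsub(/./, "& ", win)               # put a space after every character
      sub(/ $/, "", win)                 # drop the trailing space
      print win
    }
  }'
}

printf 'hello\n' | windows 2
```

With `w` set to 2 this should print the same five lines as the `hello` block above; the window width depends only on `w`, never on the length of the line.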
The problem is that the length of the output should not depend on the length of the line that was read.
Try this:
awk '
{
L = length($0) * 2
M = int (maxlen / 2)
X = sprintf ("%*sY%*s", M, "", M, "")
gsub (/ /, "#", X)
sub (/Y/, $0, X)
gsub (/./, "& ", X)
for (i=1; i<=L; i+=2) print substr (X, i, (2*maxlen)-1)
} ' maxlen=5 "${1}"
maxlen=5 passes a parameter to awk. awk automatically detects whether an argument is variable=value or a filename. Here it sets the number of non-space characters printed on each line, and M, the padding on each side, is derived from it rather than from the length of the line.
Test:
$ cat file
hello
reader
wonderful
$ awk '
{
L = length($0) * 2
M = int (maxlen / 2)
X = sprintf ("%*sY%*s", M, "", M, "")
gsub (/ /, "#", X)
sub (/Y/, $0, X)
gsub (/./, "& ", X)
for (i=1; i<=L; i+=2) print substr (X, i, (2*maxlen)-1)
} ' maxlen=5 file
# # h e l
# h e l l
h e l l o
e l l o #
l l o # #
# # r e a
# r e a d
r e a d e
e a d e r
a d e r #
d e r # #
# # w o n
# w o n d
w o n d e
o n d e r
n d e r f
d e r f u
e r f u l
r f u l #
f u l # #
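One detail of `variable=value` operands is worth illustrating: awk applies the assignment when it reaches that operand in the argument list, so the value is not yet set inside a `BEGIN` block, whereas an assignment made with `-v` is. A small sketch, assuming a hypothetical variable `n` and hypothetical function names, with stdin standing in for a real file:

```shell
# An operand of the form name=value is an assignment, not a file name.
# It is applied when awk reaches it, so BEGIN runs before n is set.
show_operand() {
  printf 'hello\n' | awk 'BEGIN { print "BEGIN n=[" n "]" } { print "main n=[" n "]" }' n=5 -
}

# With -v the assignment happens before BEGIN runs.
show_dash_v() {
  printf 'hello\n' | awk -v n=5 'BEGIN { print "BEGIN n=[" n "]" } { print "main n=[" n "]" }'
}

show_operand
show_dash_v
```

For a script like the one above, which only uses the value in the main rule, either form works.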
Here is a one-liner:
$ cat data
hello
reader
$ sed 's/^/##/;s/$/##/' data | while read -r line || [[ -n "$line" ]]; do for i in $(seq 0 $((${#line}-4))); do temp="${line:$i:5}"; [[ "${#temp}" -eq 5 ]] && echo "${line:$i:5}"; done; done | sed 's/./& /g'
# # h e l
# h e l l
h e l l o
e l l o #
l l o # #
# # r e a
# r e a d
r e a d e
e a d e r
a d e r #
d e r # #
Subtracting 4 accounts for the # characters added by sed.
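The offset arithmetic, and the length check that catches the one offset too many, can also be sidestepped by consuming the padded line from the left until fewer than 5 characters remain. A minimal sketch in POSIX sh, assuming a hypothetical helper named `windows5`; it is a variation on the loop above, not the answer's own code.

```shell
# windows5: hypothetical helper; reads lines on stdin, prints 5-character windows.
windows5() {
  while IFS= read -r line || [ -n "$line" ]; do
    rest="##${line}##"                   # same padding the sed commands add
    while [ "${#rest}" -ge 5 ]; do       # stop when fewer than 5 characters remain
      after=${rest#?????}                # everything after the first 5 characters
      printf '%s\n' "${rest%"$after"}"   # so this prints the first 5 characters
      rest=${rest#?}                     # slide the window one character right
    done
  done | sed 's/./& /g; s/ $//'          # space out the characters, trim the end
}

printf 'reader\n' | windows5
```

A line of n characters is n + 4 characters long after padding, so the inner loop runs exactly n times, once per original character.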
With some line breaks for better readability:
$ sed 's/^/##/;s/$/##/' data | while read -r line || [[ -n "$line" ]]; do
> for i in $(seq 0 $((${#line}-4))); do
> temp="${line:$i:5}"
> [[ "${#temp}" -eq 5 ]] && echo "${line:$i:5}"
> done
> done | sed 's/./& /g'
Posting an awk solution that uses the same logic, for completeness:
$ cat data
hello
reader
$ awk '{$0="##" $0 "##"; for(i=1;i<=(length($0)-4);i++) { temp=substr($0, i, 5); if(length(temp)==5) { gsub(/./, "& ", temp); print temp; }}}' data
# # h e l
# h e l l
h e l l o
e l l o #
l l o # #
# # r e a
# r e a d
r e a d e
e a d e r
a d e r #
d e r # #
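Here is the awk script:
{
$0="##" $0 "##"; # pad both ends of the line
for(i=1;i<=(length($0)-4);i++) # awk strings are 1-indexed
{
temp=substr($0, i, 5); # 5-character window starting at i
gsub(/./, "& ", temp); # add a space after every character
if(length(temp)==10) # 5 characters plus 5 spaces
print temp;
}
}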
A solution using awk and sed:
sed 's/^/##/;s/$/##/' input.txt | awk '{ for(i = 1; i < length-3; i++) print substr($0, i, 5) }' | sed 's/./& /g'
Output:
# # h e l
# h e l l
h e l l o
e l l o #
l l o # #
# # r e a
# r e a d
r e a d e
e a d e r
a d e r #
d e r # #