How to define a parameter for the word



What I want to do is get the 2 characters before and the 2 characters after each character.

Input file:

hello
reader
.....

The expected output is:

# # h e l //nothing precedes h, so both leading slots are filled with "#"; "e" and "l" follow h
# h e l l //only "h" precedes e, so one leading slot is filled with "#"; "l" and "l" follow e
h e l l o //"h" and "e" precede l; "l" and "o" follow l
e l l o # //"e" and "l" precede l; only "o" follows l, so one trailing slot is filled with "#"
l l o # # //"l" and "l" precede o; nothing follows o, so both trailing slots are filled with "#"
# # r e a
# r e a d
r e a d e
e a d e r
a d e r #
d e r # #

This is the code (credit to RudiC):

awk '
        { L = length * 2
          M = int (L / 4)
          X = sprintf ("%*sY%*s", M, "", M, "")
          gsub (/ /, "#", X)
          sub (/Y/, $1, X)
          gsub (/./, "& ", X)
          for (i=1; i<=L; i+=2) print substr (X, i, L-1)
        }
' $1 

But it only works for the first word:

# # h e l
# h e l l
h e l l o
e l l o #
l l o # #
# # # r e a
# # r e a d
# r e a d e
r e a d e r
e a d e r #
a d e r # #

I would use something like this:

awk '{n=length($0)                    # get the length N of the string
      $0 = "##" $0 "##"               # prepend and append "##"
      gsub(/./, "& ")                 # add a space after every character
      for (i=1; i<=2*n; i+=2)         # loop X from position 1 to length of the string
          print substr($0, i, 5*2-1)  # print 5*2 chars from position 2X (-1 for the trailing space)
      print ""}' file                 # print an empty line to separate blocks

See it in action:

$ awk '{n=length($0); $0 = "##" $0 "##"; gsub(/./, "& "); {for (i=1; i<=2*n; i+=2) print substr($0, i, 5*2)} print ""}' file
# # h e l
# h e l l
h e l l o
e l l o #
l l o # #
# # r e a
# r e a d
r e a d e
e a d e r
a d e r #
d e r # #

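As you can see, the key here is to hardcode the number of characters to print instead of deriving it from the length of the string. In my case, I set it to 5.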
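The commented version above prints 5*2-1 characters per line, while the one-liner prints 5*2. The only difference is one trailing space per line. A small check on the padded form of hello, with brackets added just to make the trailing space visible:

```shell
# Print the first window twice: 10 characters wide, then 9 characters wide.
out=$(printf 'hello\n' | awk '{
    $0 = "##" $0 "##"                          # pad both ends
    gsub(/./, "& ")                            # add a space after every character
    printf "[%s]\n", substr($0, 1, 5*2)        # keeps the trailing space
    printf "[%s]\n", substr($0, 1, 5*2 - 1)    # drops the trailing space
}')
printf '%s\n' "$out"
# [# # h e l ]
# [# # h e l]
```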

The problem is that the length of the output should not depend on the length of the line being read.

Try this:

awk '
 {
    L = length($0) * 2
    M = int (L / 4)
    X = sprintf ("%*sY%*s", M, "", M, "")
    gsub (/ /, "#", X)
    sub (/Y/, $0, X)
    gsub (/./, "& ", X)
    for (i=1; i<=L; i+=2) print substr (X, i, (2*maxlen)-1)
 } ' maxlen=5 "${1}"

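maxlen=5 passes a parameter to awk. awk automatically detects whether a command-line argument is a variable=value assignment or a filename. Use it to set the number of non-space characters printed on each line of standard output.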
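One detail worth knowing about this form: an assignment given as an operand is processed only when awk reaches it in the argument list, so the variable is still unset inside a BEGIN block, whereas the -v option sets it before BEGIN runs. A small check of that difference (the probe program is only an illustration, not part of the answer):

```shell
# Probe: print maxlen once in BEGIN and once per input line.
prog='BEGIN { printf "[%s]", maxlen } { printf "[%s]\n", maxlen }'

operand=$(printf 'x\n' | awk "$prog" maxlen=5)      # assignment as an operand
option=$(printf 'x\n' | awk -v maxlen=5 "$prog")    # assignment via -v

printf '%s\n' "operand assignment: $operand"   # operand assignment: [][5]
printf '%s\n' "with -v:            $option"    # with -v:            [5][5]
```

Since the script above only uses maxlen in the main rule, either form works there.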

Test:

$ cat file
hello

reader
wonderful
$ awk '
{
  L = length($0) * 2
  M = int (L / 4)
  X = sprintf ("%*sY%*s", M, "", M, "")
  gsub (/ /, "#", X)
  sub (/Y/, $0, X)
  gsub (/./, "& ", X)
  for (i=1; i<=L; i+=2) print substr (X, i, (2*maxlen)-1)
} ' maxlen=5 file
# # h e l
# h e l l
h e l l o
e l l o #
l l o # #
# # # r e
# # r e a
# r e a d
r e a d e
e a d e r
a d e r #
# # # # w
# # # w o
# # w o n
# w o n d
w o n d e
o n d e r
n d e r f
d e r f u
e r f u l
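In the test output above, the number of leading # characters still follows the line length (three for reader, four for wonderful), because M is computed from length($0). A minimal sketch, assuming the padding should instead be int(maxlen / 2) on each side as in the expected output of the question, and that maxlen is odd; the function name windows is a hypothetical helper:

```shell
# windows [maxlen]: read lines on stdin, print one window per input character.
windows() {
    awk -v maxlen="${1:-5}" '
    BEGIN {
        # Build the padding once; it depends on maxlen only, not on the line.
        for (k = 0; k < int(maxlen / 2); k++) pad = pad "#"
    }
    {
        n = length($0)                  # one window per original character
        X = pad $0 pad
        gsub(/./, "& ", X)              # add a space after every character
        for (i = 1; i <= 2 * n; i += 2)
            print substr(X, i, 2 * maxlen - 1)
    }'
}

printf 'hello\nreader\n' | windows 5
```

For hello and reader with maxlen=5 this should print the eleven lines listed as the expected output in the question.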

Here is a one-liner:

$ cat data 
hello
reader
$ sed 's/^/##/;s/$/##/' data | while read -r line || [[ -n "$line" ]]; do for i in $(seq 0 $((${#line}-4))); do temp="${line:$i:5}"; [[ "${#temp}" -eq 5 ]] && echo "${line:$i:5}"; done; done | sed 's/./& /g'
# # h e l 
# h e l l 
h e l l o 
e l l o # 
l l o # # 
# # r e a 
# r e a d 
r e a d e 
e a d e r 
a d e r # 
d e r # # 

Subtracting 4 accounts for the # characters added by sed.
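To make that arithmetic concrete: with offsets running from 0 to length-4, the last offset leaves only 4 characters, which is why the one-liner also checks that each substring is 5 long. A rough sketch of the counts (count_windows is a hypothetical helper used only for this illustration):

```shell
# Count the offsets the loop tries and how many of them yield a full window.
count_windows() {
    padded="##$1##"
    len=${#padded}
    tried=$((len - 4 + 1))   # offsets 0 .. len-4, as produced by seq
    full=$((len - 5 + 1))    # offsets 0 .. len-5 still have 5 characters left
    echo "$1: length $len after padding, $tried offsets tried, $full full windows"
}

count_windows hello    # hello: length 9 after padding, 6 offsets tried, 5 full windows
count_windows reader   # reader: length 10 after padding, 7 offsets tried, 6 full windows
```

The number of full windows equals the length of the original word, one per character.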

The same command with some line breaks for better readability:

$ sed 's/^/##/;s/$/##/' data | while read -r line || [[ -n "$line" ]]; do
> for i in $(seq 0 $((${#line}-4))); do
> temp="${line:$i:5}"
> [[ "${#temp}" -eq 5 ]] && echo "${line:$i:5}"
> done
> done | sed 's/./& /g'

For completeness, here is an awk solution using the same logic:

$ cat data 
hello
reader
$ awk '{$0="##" $0 "##"; for(i=0;i<=(length($0)-4);i++) { temp=substr($0, i, 5); if(length(temp)==5) { gsub(/./, "& ", temp); print temp; }}}' data 
# # h e l 
# h e l l 
h e l l o 
e l l o # 
l l o # # 
# # r e a 
# r e a d 
r e a d e 
e a d e r 
a d e r # 
d e r # # 

Here is the awk script:

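{
    $0 = "##" $0 "##"                        # pad the line with "##" on both sides
    for (i = 0; i <= length($0) - 4; i++)    # candidate start positions
    {
        temp = substr($0, i, 5)              # window of up to 5 characters
        gsub(/./, "& ", temp)                # add a space after every character
        if (length(temp) == 10)              # keep full windows only (5 characters + 5 spaces)
            print temp
    }
}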

A solution using awk and sed:

sed 's/^/##/;s/$/##/' input.txt | awk '{ for(i = 1; i < length-3; i++) print substr($0, i, 5) }' | sed 's/./& /g'

Output:

# # h e l 
# h e l l 
h e l l o 
e l l o # 
l l o # # 
# # r e a 
# r e a d 
r e a d e 
e a d e r 
a d e r # 
d e r # #
