Split a file into multiple files when a special marker is met

I have a main file like this:

/* ------------- AAAAAAAA ------------- */
some
lines 
here
/* ------------- BBBBBBBB ------------- */
more
things
/* ------------- CCCCCCCC ------------- */
there
a 
few
more
lines

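My final goal is to create a file that contains only the blocks containing a specific string. For example, if the string is lines, I would get an output file like this: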

/* ------------- AAAAAAAA ------------- */
some
lines 
here
/* ------------- CCCCCCCC ------------- */
there
a 
few
more 
lines

To reach my goal, I first tried to split the main file into one sub-file per block, to get something like

file-1
file-2
file-3

Then I plan to check each file and, if it contains the searched string, append it back to my new main file.

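To be honest, I don't know if this is the best approach. Besides, my main file has more than 1600 blocks over 30139 lines, so there is a lot to parse.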
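For the recombination step described above, here is a minimal sketch under a few assumptions: the sub-files are named file-1, file-2, ... with no gaps in the numbering, and the names collect_blocks and new-main are made up for illustration. Looping on a counter rather than on a file-* glob keeps the blocks in their original order, since a glob would sort file-10 before file-2.

```shell
# Sketch: append every sub-file that contains the search string to a new file.
# collect_blocks and new-main are hypothetical names, not from the original post.
collect_blocks() {
    pattern=$1
    out=$2
    : > "$out"                      # start with an empty output file
    i=1
    while [ -f "file-$i" ]; do      # numeric loop preserves block order
        if grep "$pattern" "file-$i" > /dev/null; then
            cat "file-$i" >> "$out"
        fi
        i=$((i+1))
    done
}

# Usage: collect_blocks lines new-main
```

With 1600+ blocks this runs one grep per sub-file, which is why a single-pass approach may be preferable.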

However, even if I keep going this way, my code still has a problem:

#!/bin/ksh
i=0
while IFS=\| read -r "line"; do
        if [ `echo $line | grep '/* ------' | wc -l` -eq 1 ]; then
                i=$((i+1))
        fi
        echo $line > "file-$i"
done < $1

Since each block is separated by /* --------, when I run echo $line the output is the contents of my root directory (/etc, /tmp, etc.) rather than $line itself.

So I know this is really two questions, but since the second one could be bypassed by writing the script another way, they are definitely linked.

Edit:

The solution must be in Korn shell, because I cannot install anything on this machine.
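On the echo $line problem: an unquoted $line undergoes filename expansion, so the leading /* is expanded to the entries of /. Quoting the variable avoids that. The following is a minimal sketch of the split step, not a definitive fix; it assumes only POSIX shell features that ksh provides, and split_blocks is a made-up helper name. It also appends with >> because > would overwrite the sub-file on every line.

```shell
# Sketch: split "$1" into file-1, file-2, ... at each "/* ------" header line.
# split_blocks is a hypothetical helper name, not from the original post.
split_blocks() {
    i=0
    while IFS= read -r line; do
        case $line in
            '/* ------'*)           # header line: start the next sub-file
                i=$((i+1))
                : > "file-$i"       # truncate any leftover from an earlier run
                ;;
        esac
        if [ "$i" -gt 0 ]; then
            # Quoting "$line" keeps the shell from expanding /* as a glob.
            printf '%s\n' "$line" >> "file-$i"
        fi
    done < "$1"
}

# Usage: split_blocks mainfile
```

Using case for the header test avoids spawning grep and wc for each of the 30139 lines.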

Another one in awk:

$ awk '
function dump() {         # define a function to avoid duplicate code in END
    if(b~/lines/)         # if buffer has "lines" in it
        print b           # output and ...
    b="" }                # reset buffer
/^\/\*/ { dump() }        # at the start of a new block dump existing buffer
{ b=b (b==""?"":ORS) $0 } # gather buffer
END{ dump() }             # dump the last buffer also
' file
/* ------------- AAAAAAAA ------------- */
some
lines 
here
/* ------------- CCCCCCCC ------------- */
there
a 
few
more
lines
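The search string in the script above is hard-coded. A variant that takes it from a shell argument could look like the sketch below. filter_blocks is a made-up wrapper name, and index() is used on the assumption that the string should be matched literally rather than as a regular expression.

```shell
# Sketch: print only the blocks of file "$2" that contain the literal string "$1".
# filter_blocks is a hypothetical wrapper name, not from the original answer.
filter_blocks() {
    awk -v s="$1" '
        function dump() {                 # print the buffered block if it matched
            if (index(b, s)) print b
            b = ""
        }
        /^\/\*/ { dump() }                # a header line closes the previous block
        { b = b (b == "" ? "" : ORS) $0 } # append the current line to the buffer
        END { dump() }                    # flush the last block
    ' "$2"
}

# Usage: filter_blocks lines mainfile > new-main
```

Note that awk interprets backslash escapes in values passed with -v, so a search string containing backslashes would need extra care.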

If you really want to use a while read construct, try to avoid extra files and system calls.

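matched=0
all=
while IFS= read -r line; do
  # A header line starts a new block: print the previous one if it matched
  if [[ ${line} == "/* ----"* ]]; then
      if [ "${matched}" -eq 1 ]; then
         printf "%s" "${all}"   # all already ends with a newline
      fi
      all=
      matched=0
  fi
  all="${all}${line}
"
  if [[ ${line} == *lines* ]]; then
    matched=1
  fi
# The extra header line appended here forces the last block to be flushed
done < <(cat mainfile; echo "/* ---- The End --- */" )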
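The loop above relies on process substitution and on a sentinel header appended to the input. As an alternative sketch, the last block can be flushed after the loop instead, using only case patterns so that it does not depend on [[ ]] or <( ). The function name print_matching_blocks is an assumption for illustration.

```shell
# Sketch: print the blocks of file "$2" that contain the literal string "$1",
# without a sentinel line. print_matching_blocks is a hypothetical name.
print_matching_blocks() {
    matched=0
    all=
    while IFS= read -r line; do
        case $line in
            '/* ----'*)                       # header: flush the previous block
                if [ "$matched" -eq 1 ]; then printf '%s' "$all"; fi
                all=
                matched=0
                ;;
        esac
        all="${all}${line}
"
        case $line in
            *"$1"*) matched=1 ;;              # remember that this block matched
        esac
    done < "$2"
    # Flush the final block, which has no header after it.
    if [ "$matched" -eq 1 ]; then printf '%s' "$all"; fi
}

# Usage: print_matching_blocks lines mainfile > new-main
```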

If you don't mind using perl, there is a nice one-liner that makes this task easy.

The only thing you need to do is add a line like this:

/* ------------- END ------------- */

at the end of the file, so that it becomes:

/* ------------- AAAAAAAA ------------- */
some
lines 
here
/* ------------- BBBBBBBB ------------- */
more
things
/* ------------- CCCCCCCC ------------- */
there
a 
few
more
lines
/* ------------- END ------------- */

Now, with the help of this regex pattern:

`\/\*.*?(?=\/\*)`

you can match each section separately. For example, this section:

/* ------------- AAAAAAAA ------------- */
some
lines 
here

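So if you store the matches in an array, you end up with an array containing 3 sections. Finally, you can search each section for lines. If it is found, that section is printed.

One-liner

perl -ne 'BEGIN{$/=undef;}push(@arr,$&) while/\/\*.*?(?=\/\*)/smg;END{for (@arr){print if /lines/g }}' file

and the output will be: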

/* ------------- AAAAAAAA ------------- */
some
lines 
here
/* ------------- CCCCCCCC ------------- */
there
a 
few
more
lines

and if you search for more:

/* ------------- BBBBBBBB ------------- */
more
things
/* ------------- CCCCCCCC ------------- */
there
a 
few
more
lines

Based on @batman's solution

Command-line solution (grep -P requires GNU grep):

tr '\n' ';' < file | grep -Po '/\*.*?(?=/\*)' | grep lines | tr ';' '\n'

Its output:

/* ------------- AAAAAAAA ------------- */
some
lines 
here
/* ------------- CCCCCCCC ------------- */
there
a 
few
more
lines

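Using awk (a multi-character RS is treated as a regex by gawk and mawk; POSIX leaves it unspecified)

awk -v RS="/[*]" '/lines/{printf "/*%s", $0}' file

Output: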

/* ------------- AAAAAAAA ------------- */
some
lines
here
/* ------------- CCCCCCCC ------------- */
there
a
few
more
lines
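For the splitting step named in the title, awk can also write each block to its own file in a single pass. This is a sketch that assumes the file-N naming from the question; awk_split is a made-up wrapper name. Closing each output file before opening the next one matters with 1600+ blocks, because many awk implementations limit the number of simultaneously open files.

```shell
# Sketch: write each block of "$1" to file-1, file-2, ... in a single awk pass.
# awk_split is a hypothetical wrapper name, not from the original answers.
awk_split() {
    awk '
        /^\/\* ------/ {                 # header line: switch to the next output file
            if (out != "") close(out)    # avoid running out of open file handles
            out = "file-" (++n)
        }
        out != "" { print > out }        # lines before the first header are skipped
    ' "$1"
}

# Usage: awk_split mainfile
```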
