Updated question: OK, so I have a file with lines like this:
44:) 2.884E-02 0.000E+00 0.000E+00 2.780E+02 0.000E+00 0.000E+00 9.990E+02
45:) 2.884E-02 0.000E+00 0.000E+00 2.780E+02 0.000E+00 0.000E+00 9.990E+02
1:) 3.593E-02 0.000E+00 0.000E+00 2.780E+02 0.000E+00 0.000E+00 1.000E+05
2:) 3.593E-02 0.000E+00 0.000E+00 2.780E+02 0.000E+00 0.000E+00 1.000E+05
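The number in the first column runs from 1 to x (45 in this case) and then starts again at 1. I want to move some of the columns into separate files. The indices of the columns I want to move are stored in the variable/array $selected_columns (2, 5 and 8 in this case), and the number of columns I want to move is stored in $number_of_columns (3 in this case).
I then want to create 45 files: one with the selected columns of all the 1:) lines, one with the selected columns of all the 2:) lines, and so on. I want to keep this as general as possible, because both the number of columns and the range from 1 to x will change. The number x is always known, and the columns to extract are chosen by the user.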
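Original question:
I have a string extracted by egrep. I then want to print some of the columns (words) in that string. The positions (column indices) are known in a list in my bash script. At the moment it looks like this: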
line=$(egrep " ${i}:)" $1)
for ((j=1; j<=$number_of_columns; j++))
do
awk $line -v current_column=${selected_columns[$j]} '{printf $(current_column)}' > "history_files/history${i}"
done
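where number_of_columns is the number of columns to print and selected_columns holds the corresponding column indices. As an example, number_of_columns = 3 and selected_columns = [2 5 8], so I want to print word numbers 2, 5 and 8 from the string line to the file history${i}.
I am not sure what is wrong; I got this far by trial and error. The current error is awk: cannot open 0.000E+00 (No such file or directory).
Any help is appreciated!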
I guess you have to change the awk line to:
echo $line | awk -v current_column=${selected_columns[$j]} ...
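The reason the piped form helps is that awk reads its data from standard input or from named files, while extra words on its command line are taken as program text or file names. A minimal sketch of the piped form follows; the sample line and the column index are made up for illustration and are not taken from the original script:

```shell
#!/usr/bin/env bash
# Hypothetical sample line, standing in for the egrep output.
line='1:)   3.593E-02   0.000E+00   0.000E+00   2.780E+02'
col=5   # assumed column index, for illustration only

# Feed the line to awk on stdin; quoting "$line" keeps it as one string.
# -v c="$col" passes the shell value into the awk variable c.
result=$(printf '%s\n' "$line" | awk -v c="$col" '{ print $c }')
echo "$result"
```

Here $c inside awk means "the field whose number is stored in c", so the fifth whitespace-separated word is printed.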
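For your updated question, assuming the columns are in the array $selected_columns: in your example file the columns are separated by multiple adjacent spaces. If that is not the case in the original file, you can omit the sed before the grep.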
columns=`echo ${selected_columns[*]} | sed 's/ /,/g'`
for i in `seq 45`; do
    sed -e 's/  */ /g' file | grep "^$i:)" | cut -d' ' -f "$columns" >"file-$i"
done
In:
awk $line -v ...
$line holds the output of grep, which is probably not something awk expects to see on its command line. Also, this:
for ((j=1; j<=$number_of_columns; j++))
do
anything > "history_files/history${i}"
done
will overwrite the history file on every pass through the loop. I don't know what you really want there.
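To make the overwriting visible, here is a small sketch using a temporary file and three made-up values (both are placeholders, not part of the original script). It contrasts redirecting inside the loop with redirecting the loop as a whole:

```shell
#!/usr/bin/env bash
out=$(mktemp)   # hypothetical output file for this demo

# ">" inside the loop truncates the file on every pass.
for w in a b c; do
    printf '%s\n' "$w" > "$out"
done
overwritten=$(cat "$out")   # only the value from the last pass is left

# Redirecting the whole loop opens the file once and keeps every line.
for w in a b c; do
    printf '%s\n' "$w"
done > "$out"
kept=$(cat "$out")

rm -f "$out"
printf 'inside the loop:\n%s\n' "$overwritten"
printf 'around the loop:\n%s\n' "$kept"
```

Using ">>" inside the loop would also keep every line, but then the file has to be emptied before the loop starts if the script is run more than once.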
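However, your script has a number of other problems. You said: "As an example number_of_columns = 3 and selected_columns = [2 5 8], so I want to print word numbers 2, 5 and 8 from the string line to the file history${i}."
That is entirely trivial, and you don't need a separate grep outside of awk either, so you can do the whole thing with something like: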
# "${selected_columns[*]}" joins the bash array into one space-separated string
awk -v pat=" ${i}:)" -v selected_columns="${selected_columns[*]}" '
BEGIN { number_of_columns = split(selected_columns,selected_columnsA) }
$0 ~ pat {
    sep=""
    for (j=1;j<=number_of_columns;j++) {
        current_column = selected_columnsA[j]
        # $current_column is the field whose number is in current_column
        printf "%s%s",sep,$current_column
        sep = "\t"
    }
    print ""
}
' "$1" > "history_files/history${i}"
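If that doesn't work for you, let's fix that rather than trying to repair the original script. It sounds like you have an enclosing loop around the above, which could most likely become part of the awk script as well.
Based on the updated OP:
EDIT: I added a lot of comments, but let me know if you have questions: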
$ cat file
44:) 2.884E-02 0.000E+00 0.000E+00 2.780E+02 0.000E+00 0.000E+00 9.990E+02
45:) 2.884E-02 0.000E+00 0.000E+00 2.780E+02 0.000E+00 0.000E+00 9.990E+02
1:) 3.593E-02 0.000E+00 0.000E+00 2.780E+02 0.000E+00 0.000E+00 1.000E+05
2:) 3.593E-02 0.000E+00 0.000E+00 2.780E+02 0.000E+00 0.000E+00 1.000E+05
$
$ cat tst.sh
selected_columns=(2 5 8)
selCols="${selected_columns[@]}"
awk -v selCols="$selCols" '
BEGIN { # Executed before the first line of the input file is read
# Split the string of selected column numbers, selCols, into
# an array selColsA where selColsA[1] has the value of the
# first space-separated sub-string of selCols (i.e. the number
# of the first column to print). Note that we dont need the
# number of columns passed into the script as a result of
# splitting the string is the count of elements put into the
# array as a return code from the split() builtin function.
numCols = split(selCols,selColsA)
}
{ # Executed once for every line of the input file
# Create a numeric suffix like "45" from the first column
# in the current line of the input file, e.g. "45:)" by
# just getting rid of all non-digit characters.
sfx = $1
gsub(/[^[:digit:]]/,"",sfx)
# Create the name of the output file by attaching that
# numeric suffix to the base value for all output files.
#histfile = "history_files/history" sfx
histfile = "tmp" sfx
# Loop through every column we want printed. selColsA[<index>]
# gives us a column number which we can then use to access the
# columns of the current line. Awk uses the builtin variable $0
# to hold the current line, and it automatically splits it so
# that $1 holds the first column, $2 is the second, etc. So
# if selColsA[1] has the value 3, then $(selColsA[1]) would be
# the value of the 3rd column of the current input line.
sep=""
for (i=1;i<=numCols;i++) {
curCol = selColsA[i]
# Print the current column, prefixed by a tab for all but
# the first column, and without a terminating newline so the
# next column gets appended to the end of the current output line.
# Note that in awk "> file" has different semantics from shell
# and opens the file for writing the first time the line is hit
# like "> file" in shell, but then appends to it every time its
# hit afterwards, like ">> file" in shell.
printf "%s%s",sep,$curCol > histfile
sep = "\t"
}
# Add a newline to the end of the current output line
print "" > histfile
}
' "$1"
$
$ ./tst.sh file
$
$ cat tmp1
3.593E-02 2.780E+02 1.000E+05
$ cat tmp2
3.593E-02 2.780E+02 1.000E+05
$ cat tmp44
2.884E-02 2.780E+02 9.990E+02
$ cat tmp45
2.884E-02 2.780E+02 9.990E+02
By the way, since you are just learning, I used "columns" and "lines" above, but the awk terms are actually "fields" and "records".
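As a small illustration of those terms, the sketch below feeds two sample records (shortened copies of lines from the example file) to awk and prints the builtin variables NR (the current record number) and NF (the number of fields in the current record):

```shell
#!/usr/bin/env bash
# Two sample records in the same shape as the lines in the example file.
summary=$(printf '%s\n' \
    '44:) 2.884E-02 0.000E+00' \
    '45:) 2.884E-02 0.000E+00' |
  awk '{ printf "record %d has %d fields, first field %s\n", NR, NF, $1 }')
echo "$summary"
```

By default a record is one line of input and fields are separated by runs of blanks, which is why the multiple adjacent spaces in the example file are not a problem for awk.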
I think you can use cut to do what you want, i.e.
echo "$line" | cut -d" " -f2,5,8 > "history_files/history${i}"
-d is your delimiter; I tested with a single space, hence the " ".
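One caveat: cut treats every single delimiter character as a field boundary, so a run of several spaces yields empty fields and shifts the numbering. A sketch of one way around this, assuming a made-up line with uneven spacing, is to squeeze the runs with tr -s first:

```shell
#!/usr/bin/env bash
# Hypothetical line with runs of spaces between the columns.
line='1:)   3.593E-02  0.000E+00   0.000E+00 2.780E+02'

# tr -s ' ' squeezes each run of spaces into one space, so the field
# numbers given to cut line up with the visible columns.
picked=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f2,5)
echo "$picked"
```

Note that a line with leading spaces would still start with one space after squeezing, which makes cut see an empty first field.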
Hope this helps.