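The proposal is a function that splits strings using only awk and accepts any string as the delimiter and any string as the input.
How to create a function for split that uses only awk and accepts any string as input and delimiter?
There are many suggestions for splitting strings with bash commands (see this example), but all of them work only in specific cases and do not meet our proposal.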
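To illustrate one such specific case (our own minimal sketch, not taken from the linked example): splitting with IFS treats every character of the delimiter as a separate single-character delimiter, so a multi-character delimiter such as "./" does not act as one unit. The variable names below are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical input, chosen only for illustration.
input="a./b.c"

# IFS is a set of single characters, not a string:
# "." and "/" each split the input on their own.
IFS='./' read -r -a parts <<< "$input"

# A string-aware split on "./" would give 2 pieces ("a" and "b.c").
printf '%s pieces\n' "${#parts[@]}"   # prints "4 pieces": "a", "", "b", "c"
```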
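We decided to use our own code as an example. Although it is fully functional, we believe several points could be improved, adjusted, or corrected.
Example function (f_split)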
F_PRESERVE_BLANK_LINES_R=""
f_preserve_blank_lines() {
: 'Remove "single quotes" used to prevent blank lines being erroneously removed.
The "single quotes" are used at the beginning and end of the strings to prevent
blank lines with no other characters in the sequence being erroneously removed.
We do not know the reason for this side effect. This problem occurs, for example,
in commands that involve "awk".
Args:
STR_TO_TREAT_P (str): String to be treated.
Returns:
F_PRESERVE_BLANK_LINES_R (str): String treated.
'
F_PRESERVE_BLANK_LINES_R=""
STR_TO_TREAT_P=$1
STR_TO_TREAT_P=${STR_TO_TREAT_P%?}
F_PRESERVE_BLANK_LINES_R=${STR_TO_TREAT_P#?}
}
F_SPLIT_R=()
f_split() {
: 'Splits a given string and returns an array.
Args:
TARGET_P (str): Target string to "split".
DELIMITER_P (Optional[str]): Delimiter used to "split". If not informed the
split will be done by spaces.
Returns:
F_SPLIT_R (array): Array with the provided string separated by the informed
delimiter.
'
F_SPLIT_R=()
TARGET_P=$1
DELIMITER_P=$2
if [ -z "$DELIMITER_P" ] ; then
DELIMITER_P=" "
fi
REMOVE_N=1
if [ "$DELIMITER_P" == "\n" ] ; then
REMOVE_N=0
fi
# PROBLEM: This was the only parameter that has been a problem so far... There are
# probably others. Maybe a scheme using "sed" would solve the problem...
if [ "$DELIMITER_P" == "./" ] ; then
DELIMITER_P="[.]/"
fi
if [ ${REMOVE_N} -eq 1 ] ; then
# PROBLEM: Due to certain limitations we have trouble getting the output
# of a split by awk into an array, so we need to use "line break" (\n)
# to succeed. For that reason we temporarily remove the line breaks and
# reintegrate them afterwards. The problem is that if there is a line break
# in the "string" informed, this line break will be lost, that is, it is
# erroneously removed in the output...
TARGET_P=$(awk 'BEGIN {RS="dn"} {gsub("\n", "3F2C417D448C46918289218B7337FCAF"); printf $0}' <<< "${TARGET_P}")
fi
# PROBLEM: The replace of "\n" by "3F2C417D448C46918289218B7337FCAF" results in
# more occurrences of "3F2C417D448C46918289218B7337FCAF" than the amount of "\n"
# that there was originally in the string (one more occurrence at the end of
# the string). We can not explain the reason for this side effect. The line below
# corrects this problem...
TARGET_P=${TARGET_P%????????????????????????????????}
SPLIT_NOW=$(awk -F "$DELIMITER_P" '{for(i=1; i<=NF; i++){printf "%s\n", $i}}' <<< "${TARGET_P}")
while IFS= read -r LINE_NOW ; do
if [ ${REMOVE_N} -eq 1 ] ; then
LN_NOW_WITH_N=$(awk 'BEGIN {RS="dn"} {gsub("3F2C417D448C46918289218B7337FCAF", "\n"); printf $0}' <<< "'${LINE_NOW}'")
# PROBLEM: It would be perfect if we didn't need to use the function below...
f_preserve_blank_lines "$LN_NOW_WITH_N"
LN_NOW_WITH_N="$F_PRESERVE_BLANK_LINES_R"
F_SPLIT_R+=("$LN_NOW_WITH_N")
else
F_SPLIT_R+=("$LINE_NOW")
fi
done <<< "$SPLIT_NOW"
}
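One possible direction for the points marked PROBLEM above (a sketch under our own assumptions, not a drop-in replacement for f_split): awk's index() searches for a literal string, so a delimiter such as "./" needs no escaping, whereas -F interprets a multi-character separator as a regular expression. In the sketch, awk only reports where each piece starts and how long it is, and bash slices the original string, so line breaks in the input never have to be encoded. The names f_split_sketch, F_SPLIT_SKETCH_R, SPLIT_TARGET and SPLIT_DELIM are hypothetical, and LC_ALL=C is assumed so that awk and bash both count bytes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the original f_split: literal-string split.
F_SPLIT_SKETCH_R=()
f_split_sketch() {
    local target=$1 delimiter=${2:-" "} start length
    # Assumption: byte-wise counting in awk and bash keeps offsets consistent.
    local LC_ALL=C
    F_SPLIT_SKETCH_R=()
    while read -r start length; do
        # Slice the untouched input; embedded line breaks are preserved.
        F_SPLIT_SKETCH_R+=("${target:start:length}")
    done < <(
        # ENVIRON hands the strings to awk without escape processing.
        SPLIT_TARGET=$target SPLIT_DELIM=$delimiter LC_ALL=C awk '
            BEGIN {
                s = ENVIRON["SPLIT_TARGET"]
                d = ENVIRON["SPLIT_DELIM"]
                pos = 0  # zero-based offset of the current piece in the input
                # index() matches the delimiter literally, not as a regex.
                while ((i = index(s, d)) > 0) {
                    print pos, i - 1
                    pos += i - 1 + length(d)
                    s = substr(s, i + length(d))
                }
                print pos, length(s)  # last piece (possibly empty)
            }'
    )
}

f_split_sketch $'one\ntwo./three' "./"
printf '%s\n' "${#F_SPLIT_SKETCH_R[@]}"   # prints 2
```

Because the input travels through the environment, its size is bounded by the system limit on environment data, so this sketch is only suitable for moderately sized strings.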
Usage
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
It may also be helpful to note (though understandably you had no room to do so) that the -d option to readarray first appears in Bash 4.4. – fbicknel, Aug 18, 2017 at 15:57
Great answer (+1). If you change your awk to awk '{ gsub(/,[ ]+|$/,"