Intersecting/unioning comma-separated lists of ranges (1,2,4-5) in bash



The data structure I'm working with is essentially a sorted list of non-overlapping integer ranges:

1,2,3
1-5
0,1,3-4,6,8-10

I need to perform basic operations (union and intersection) on pairs of these lists, for example:

$ list_and '1,2,3' '0,1,3-4,6,8-10'
1,3
$ list_or  '1-5'   '0,1,3-4,6,8-10'
0-6,8-10

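I could explicitly build each set in an associative array, perform the operation on pairs of members, and then collapse the resulting set back into this form, but that would be very slow for large ranges.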

What is the most efficient way to union and intersect such "range lists" in pure Bash?

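Set operations are a lot easier with the help of commands like join(1), comm(1), and sort(1), but they are doable with nothing but bash. I do agree with oguz, though, that a different language would be more appropriate: with better data structures than the shell supports, you can get very fancy and efficient with sparse integer sets.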

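With a bit of shell arithmetic, it's also easy to split the input ranges into one number per line, which is the form those programs expect as input.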

#!/usr/bin/env bash

# Turn a string like 1,3,5-10 into one number per line, filling in ranges
expand() {
    local IFS=, elem i
    # For each comma-separated element
    for elem in $1; do
        # If it's a range, print out each number in the range
        if [[ $elem =~ ^([0-9]+)-([0-9]+)$ ]]; then
            for (( i = BASH_REMATCH[1]; i <= BASH_REMATCH[2]; i++ )); do
                printf "%d\n" "$i"
            done
        else
            # And if it's just a scalar, print that number.
            printf "%d\n" "$elem"
        fi
    done
}

list_and() {
    # Add each element in the first argument to an associative array,
    # and then for each element of the second argument, see if it also
    # exists in that array. If so, add it to the result.
    local -A arr1
    local elem
    for elem in $(expand "$1"); do
        arr1[$elem]=1
    done

    local -a intersection
    for elem in $(expand "$2"); do
        if [[ -v arr1[$elem] ]]; then
            intersection+=("$elem")
        fi
    done
    # Join the result with commas
    local IFS=,
    printf "%s\n" "${intersection[*]}"
}

list_or() {
    # Populate a sparse array using the numbers from arguments
    # as indexes. Indexed arrays expand in index order, so the
    # result comes out sorted and free of duplicates.
    local -a union
    local elem
    for elem in $(expand "$1") $(expand "$2"); do
        union[$elem]=$elem
    done
    local IFS=,
    printf "%s\n" "${union[*]}"
}

printf "%s\n" "$(list_and '1,2,3' '0,1,3-4,6,8-10')"
printf "%s\n" "$(list_or '1-5' '0,1,3-4,6,8-10')"
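
For comparison, here is a rough sketch of how the same two operations might look if the external tools mentioned above were allowed. The names list_and_comm and list_or_comm are made up for this illustration, and the sketch assumes comm(1), sort(1), and paste(1) are available. It repeats expand so that it runs on its own.

```shell
#!/usr/bin/env bash

# Same idea as the expand function in the script above: one number per line.
expand() {
    local IFS=, elem i
    for elem in $1; do
        if [[ $elem =~ ^([0-9]+)-([0-9]+)$ ]]; then
            for (( i = BASH_REMATCH[1]; i <= BASH_REMATCH[2]; i++ )); do
                printf "%d\n" "$i"
            done
        else
            printf "%d\n" "$elem"
        fi
    done
}

# Hypothetical helper: intersection via comm(1).
# comm needs lexically sorted input, so sort first, then restore
# numeric order and join the lines with commas.
list_and_comm() {
    comm -12 <(expand "$1" | sort) <(expand "$2" | sort) | sort -n | paste -sd, -
}

# Hypothetical helper: union via a numeric, de-duplicating sort.
list_or_comm() {
    sort -nu <(expand "$1") <(expand "$2") | paste -sd, -
}

list_and_comm '1,2,3' '0,1,3-4,6,8-10'   # prints 1,3
list_or_comm '1-5' '0,1,3-4,6,8-10'      # prints 0,1,2,3,4,5,6,8,9,10
```

Like the pure-bash version, this still expands every range into individual numbers, so it does nothing for the cost of very large ranges.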

Turning things like 1,2,3 in the output into 1-3 is left as an exercise for the reader. (I don't want to do all your homework for you...)
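
For anyone who wants a starting point for that step, one possible sketch follows. The function name collapse is hypothetical and not part of the answer above. It assumes the input is already sorted and free of duplicates, which is what list_and and list_or produce, and it writes every run of two or more consecutive numbers as a range (so 1,2 becomes 1-2).

```shell
#!/usr/bin/env bash

# Hypothetical helper: collapse a sorted, duplicate-free list such as
# 0,1,2,3,4,5,6,8,9,10 into range form (0-6,8-10).
collapse() {
    local IFS=, n start prev
    local -a out=()
    for n in $1; do
        if [[ -z $start ]]; then
            # First number: open a new run
            start=$n prev=$n
        elif (( n == prev + 1 )); then
            # Consecutive: extend the current run
            prev=$n
        else
            # Gap: emit the finished run, then open a new one
            if (( start == prev )); then
                out+=("$start")
            else
                out+=("$start-$prev")
            fi
            start=$n prev=$n
        fi
    done
    # Emit the final run, if there was any input at all
    if [[ -n $start ]]; then
        if (( start == prev )); then
            out+=("$start")
        else
            out+=("$start-$prev")
        fi
    fi
    # IFS is a comma here, so [*] joins the pieces with commas
    printf "%s\n" "${out[*]}"
}

collapse '0,1,2,3,4,5,6,8,9,10'   # prints 0-6,8-10
collapse '1,3'                    # prints 1,3
```

The output of list_or could be piped through it, for example collapse "$(list_or '1-5' '0,1,3-4,6,8-10')".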
