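We have about 1k topics on a live Kafka cluster. We currently have 6 brokers (IDs 1, 2, 3, 4, 5, 6) spread across 3 data centers, and the cluster-level default replication factor is set to 3. Due to unavoidable circumstances we lost one DC (broker IDs 1 and 2), so we ran a partition reassignment that moved the partitions onto brokers 3, 4, 5, and 6. On top of that, for better fault tolerance, we want to increase the replication factor of all existing topics to 4.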
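Below is a small sample of the resulting topic partitions. The plan is to keep the existing partition assignment and only append the missing broker, for example:
my_topic_1 p0 has replicas [4,5,3], and I want to update it to [4,5,3,6]
my_topic_2 p0 has replicas [3,6,4], and I want to update it to [3,6,4,5]
my_topic_2 p1 has replicas [6,4,5], and I want to update it to [6,4,5,3]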
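A sample of the JSON is shown below. I have been trying a combination of grep, sed, and jq to get the replica list of each partition, compare it with the master list (the brokers that currently exist in the cluster), [3,4,5,6], and append the missing broker to the partition's list. For example, the replica list of my_topic_1 p0 is [4,5,3]; broker 6 is not in it, so 6 is appended and the list becomes [4,5,3,6].
Any suggestions are appreciated.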
{
  "version": 1,
  "partitions": [{
    "topic": "my_topic_1",
    "partition": 0,
    "replicas": [4, 5, 3],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_2",
    "partition": 0,
    "replicas": [3, 6, 4],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_2",
    "partition": 1,
    "replicas": [6, 4, 5],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_2",
    "partition": 2,
    "replicas": [4, 5, 3],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_2",
    "partition": 3,
    "replicas": [5, 3, 6],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_2",
    "partition": 4,
    "replicas": [3, 5, 6],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_2",
    "partition": 5,
    "replicas": [6, 3, 4],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_3",
    "partition": 0,
    "replicas": [4, 6, 5],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_3",
    "partition": 1,
    "replicas": [5, 4, 3],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_3",
    "partition": 2,
    "replicas": [3, 5, 6],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_3",
    "partition": 3,
    "replicas": [6, 3, 4],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_3",
    "partition": 4,
    "replicas": [4, 6, 5],
    "log_dirs": ["any", "any", "any"]
  }, {
    "topic": "my_topic_3",
    "partition": 5,
    "replicas": [5, 4, 3],
    "log_dirs": ["any", "any", "any"]
  }]
}
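I don't know what any of this means, but to add the missing item from [3,4,5,6] to a given array, just append an item whose value is 18 (that is, 3+4+5+6) minus the current sum of the array (add). This works because every replica list in the sample is missing exactly one of the four brokers: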
jq '.partitions[].replicas |= . + [18-add]' file.json
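As a quick sanity check of the arithmetic, here is a minimal sketch that runs the same filter on a single partition copied from the sample. The sample and result shell variables are just names chosen for this illustration, and the -c flag is only there to keep the output on one line.

```shell
# One partition copied from the question's sample (log_dirs left out for brevity).
# 18 is 3+4+5+6, so the appended value is 18 - (4+5+3) = 6.
sample='{"partitions":[{"topic":"my_topic_1","partition":0,"replicas":[4,5,3]}]}'

# Same filter as above, applied to the inline sample instead of file.json.
result=$(printf '%s' "$sample" | jq -c '.partitions[].replicas |= . + [18 - add]')

printf '%s\n' "$result"
```

The printed replicas array should be [4,5,3,6], matching what the question asks for.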
To make it more general, you can pass the full list in with --argjson:
jq --argjson full '[3,4,5,6]' '
.partitions[].replicas |= . + [($full | add) - add]
' file.json
Or generate the full list from the arrays at hand by building an array of the unique values:
jq '
([.partitions[].replicas[]] | unique | add) as $sum
| .partitions[].replicas |= . + [$sum - add]
' file.json
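The sum trick relies on exactly one broker being missing from each list. If that might not hold, a sketch using jq's array subtraction may be safer, since it appends whatever brokers are absent. The version below also rebuilds log_dirs to match the new replica count; this rests on my assumption that the reassignment tool expects one log_dirs entry per replica, so please verify that against your Kafka version. The input and expanded variable names are made up for the example.

```shell
# Two partitions copied from the question's sample.
input='{"version":1,"partitions":[
  {"topic":"my_topic_1","partition":0,"replicas":[4,5,3],"log_dirs":["any","any","any"]},
  {"topic":"my_topic_2","partition":1,"replicas":[6,4,5],"log_dirs":["any","any","any"]}]}'

expanded=$(printf '%s' "$input" | jq -c --argjson full '[3,4,5,6]' '
  .partitions[] |= (
    # $full - .replicas is array subtraction: the brokers not yet in the list.
    .replicas = .replicas + ($full - .replicas)
    # Assumption: log_dirs needs one "any" entry per replica.
    | .log_dirs = [.replicas[] | "any"]
  )
')

printf '%s\n' "$expanded"
```

Existing replicas keep their order, so the current leader preference is untouched and the missing brokers are only appended at the end.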