R - Integer partitions of size K



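Given a vector v of F non-negative integers, I want to generate, one at a time, all possible sets of K vectors of size F whose sum is v. I call C the matrix formed by these K vectors; the row sums of C give v.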

For example, if we set K=2, the vector (1,2) of size F=2 can be decomposed as:

# all sets of K vectors such that their sum is (1,2)
C_1 = 1,0   C_2 = 1,0   C_3 = 1,0   C_4 = 0,1   C_5 = 0,1   C_6 = 0,1
      2,0         1,1         0,2         2,0         1,1         0,2

Currently I use this code, which precomputes all possible C and then iterates over them.

library(partitions)
K <- 3
F <- 5
v <- 1:F
partitions <- list()
for(f in 1:F){
  partitions[[f]] <- compositions(n=v[f],m=K)
}
# Each v[f] has multiple partitions. Now we create an index to consider
# all possible combinations of partitions for the whole vector v.
npartitions <- sapply(partitions, ncol)
indices <- lapply(npartitions, function(x) 1:x)
grid <- as.matrix(do.call(expand.grid, indices)) # breaks if too big
for(n in 1:nrow(grid)){
  selected <- c(grid[n,])
  C <- t(sapply(1:F, function(f) partitions[[f]][,selected[f]]))
  # Do something with C
  #...
  print(C)
}

However, when F and K are large, the number of combinations explodes and expand.grid cannot handle it.
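To get a feel for how fast this grows, the number of matrices C can be computed without enumerating anything. The sketch below assumes that zeros are allowed in the compositions (as in the example above), so each v[f] contributes choose(v[f] + K - 1, K - 1) compositions by the stars-and-bars formula. The helper name count_compositions is hypothetical and not part of the partitions package.

```r
# Number of ways to write n as an ordered sum of K non-negative integers
# (stars and bars): choose(n + K - 1, K - 1). Vectorized over n.
count_compositions <- function(n, K) choose(n + K - 1, K - 1)

# Small example from the question: v = (1, 2), K = 2 gives 2 * 3 matrices.
small_total <- prod(count_compositions(c(1, 2), 2))
print(small_total)   # 6

# Setting used in the code above: K = 3, v = 1:5.
K <- 3
v <- 1:5
per_row <- count_compositions(v, K)  # compositions available for each v[f]
total <- prod(per_row)               # number of rows expand.grid would need
print(per_row)                       # 3 6 10 15 21
print(total)                         # 56700
```

The product is what expand.grid has to materialize, so it is the quantity to check before deciding whether a precomputed grid is feasible.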

I know that, for a given position v[f], I can create the partitions one at a time:

partition <- firstcomposition(n=v[f],m=K)
nextcomposition(partition, v[f],m=K)

But how can I use this to generate all possible C, as in the code above?

npartitions <- ......
indices <- lapply(npartitions, function(x) 1:x)
grid <- as.matrix(do.call(expand.grid, indices))

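You can avoid generating grid altogether and instead produce its rows one at a time using a Cantor expansion (a mixed-radix representation of the row number).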

Here is a function that returns the Cantor expansion of an integer n with respect to the given sizes:

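aryExpansion <- function(n, sizes){
  # l[i] is the weight of digit i: 1, sizes[1], sizes[1]*sizes[2], ...
  l <- c(1, cumprod(sizes))
  nmax <- tail(l,1)-1
  if(n > nmax){
    # %.0f rather than %d, so the message also works when nmax exceeds the integer range
    stop(sprintf("n cannot exceed %.0f", nmax))
  }
  epsilon <- numeric(length(sizes))
  while(n>0){
    # which.min on a logical vector returns the first FALSE, i.e. the first weight larger than n
    k <- which.min(l<=n)
    e <- floor(n/l[k-1])
    epsilon[k-1] <- e
    n <- n-e*l[k-1]
  }
  return(epsilon)
}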

For example:

expand.grid(1:2, 1:3)
##   Var1 Var2
## 1    1    1
## 2    2    1
## 3    1    2
## 4    2    2
## 5    1    3
## 6    2    3
aryExpansion(0, sizes = c(2,3)) + 1
## [1] 1 1
aryExpansion(1, sizes = c(2,3)) + 1
## [1] 2 1
aryExpansion(2, sizes = c(2,3)) + 1
## [1] 1 2
aryExpansion(3, sizes = c(2,3)) + 1
## [1] 2 2
aryExpansion(4, sizes = c(2,3)) + 1
## [1] 1 3
aryExpansion(5, sizes = c(2,3)) + 1
## [1] 2 3
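The outputs above suggest the reverse mapping as well: a row of 1-based indices corresponds to the row number 1 + sum((index - 1) * weight), with the weights 1, sizes[1], sizes[1]*sizes[2], and so on. The following is a minimal sketch of that check against expand.grid; the helper name gridRowNumber is hypothetical and introduced only for illustration.

```r
# Hypothetical helper: 1-based row number in expand.grid for a vector of
# 1-based indices, using the weights c(1, cumprod(sizes)) without the last one.
gridRowNumber <- function(selected, sizes) {
  weights <- c(1, cumprod(sizes))[seq_along(sizes)]
  1 + sum((selected - 1) * weights)
}

sizes <- c(2, 3)
grid <- as.matrix(expand.grid(1:2, 1:3))

# Map every row of the grid back to its position; this should recover 1..6.
rows <- unname(apply(grid, 1, gridRowNumber, sizes = sizes))
print(rows)
```

Because the first factor varies fastest in expand.grid, it gets weight 1, which is why the same ordering is recovered.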

So, instead of generating the grid:

npartitions <- ......
indices <- lapply(npartitions, function(x) 1:x)
grid <- as.matrix(do.call(expand.grid, indices))
for(n in 1:nrow(grid)){
  selected <- grid[n,]
  ......
}  

you can do:

npartitions <- ......
for(n in seq_len(prod(npartitions))){
  selected <- 1 + aryExpansion(n-1, sizes = npartitions)
  ......
}  
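Putting the pieces together, the loop can build each matrix C without ever creating the grid. The sketch below is one possible way to do it, under some assumptions: weak_compositions is a hypothetical base-R stand-in for compositions(n, m) from the partitions package (so the example runs without that package), and the index is decoded with integer division and modulo, which plays the same role as aryExpansion but is not the function shown above.

```r
# Hypothetical base-R stand-in for partitions::compositions(n, m):
# returns a matrix with m rows whose columns are all ordered ways of
# writing n as a sum of m non-negative integers.
weak_compositions <- function(n, m) {
  if (m == 1) return(matrix(n, nrow = 1))
  cols <- lapply(0:n, function(first) {
    rest <- weak_compositions(n - first, m - 1)
    rbind(first, rest, deparse.level = 0)
  })
  do.call(cbind, cols)
}

K <- 2
v <- c(1, 2)
F <- length(v)

partitions <- lapply(v, weak_compositions, m = K)
npartitions <- sapply(partitions, ncol)

# Weight of each position in the mixed-radix representation of the row number.
strides <- c(1, cumprod(npartitions))[seq_along(npartitions)]

# Results are stored only because this demo is tiny; in a real run,
# process C inside the loop and discard it.
results <- vector("list", prod(npartitions))
for (n in seq_len(prod(npartitions))) {
  # Decode n into one column index per element of v.
  selected <- 1 + ((n - 1) %/% strides) %% npartitions
  C <- t(sapply(seq_len(F), function(f) partitions[[f]][, selected[f]]))
  stopifnot(all(rowSums(C) == v))  # each row of C must sum to v[f]
  results[[n]] <- C
}
print(length(results))
```

Only the per-element composition tables are held in memory here; the cross product is visited one index at a time.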
