r - Average the elements of every 12th matrix in an array, repeating the sequence 12 times, without a for loop



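I have an array with dimensions [360, 180, 396]. The dimensions are longitude, latitude, and month-year, covering 33 years of monthly data. Each element is a percentage at that lat/lon.

I want to create a summary array from this, which I will use in later analysis, without defaulting to a for loop. I want the mean of each calendar month across all 33 years, and then the annual mean across all years. This is the empty summary array I am using to hold the data.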

mca <- array(data = NA, 
dim = c(360,180,13), 
dimnames = list(lon, 
lat, 
c(month.abb, "Ann")))

Below are smaller test input and output arrays for this example.

#input
set.seed(42)
smallin <- array(data = rnorm(n = 600, mean = 60, sd = 20),
dim = c(5, 5, 24))

#output to fill
smallout <- array(data = NA, 
dim = c(5,5,13), 
dimnames = list(c("1", "2", "3", "4", "5"), 
c("-89.5", "-88.5", "-87.5", "-86.5", "-85.5"), 
c(month.abb, "Ann")))

Based on the second answer to this question, I tried

jan <- apply(ca, c(seq(from = 1, to = 385, by = 12)), mean)
#also 
ind_jan <- c(seq(from = 1, to = 385, by = 12))
jan <- apply(ca, ind_jan, mean)

which I think is equivalent to

jan <- apply(smallin, c(seq(from = 1, to = 13, by = 12)), mean)

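thinking that the margin was where I needed to put the third-dimension indices I wanted to average over, but I received the error:

apply(ca, c(seq(from = 1, to = 385, by = 12)), mean): 'MARGIN' does not match dim(X)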

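I went back to the question linked above and realised that margin = 1:2 must select all of every matrix (dimensions 1 and 2). Using that, I can get the mean over all the matrices, which should be my output array [,,13], the annual mean of the percentages.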
smallout[,,13] <- apply(smallin, 1:2, mean)
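The role of MARGIN can be checked on a toy array. The following is a minimal sketch; the array `x` and the names `m12`, `m3`, `sub`, and `res` are made up for illustration and are not part of the original data. MARGIN names the dimensions to keep, so for a 3-D array it can only contain 1, 2, or 3; which slices are used is controlled by subsetting instead.

```r
# Toy 3-D array: 2 x 3 x 4 (assumed data, for illustration only)
x <- array(1:24, dim = c(2, 3, 4))

# MARGIN = 1:2 keeps dimensions 1 and 2 and averages over dimension 3
m12 <- apply(x, 1:2, mean)
dim(m12)    # 2 3

# MARGIN = 3 keeps dimension 3: one mean per 2 x 3 slice
m3 <- apply(x, 3, mean)
length(m3)  # 4

# Slices are chosen by subsetting the array, not through MARGIN
sub <- apply(x[, , c(1, 3)], 1:2, mean)

# A MARGIN entry larger than the number of dimensions raises an error
res <- tryCatch(apply(x, c(1, 13), mean),
                error = function(e) conditionMessage(e))
```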

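But I still can't work out how to make it average every 12th matrix starting from 1, then starting from 2, then from 3, and so on.

I have read the apply documentation, but found it unhelpful/hard to understand in this context. All the suggested questions that come up seem to be about Python (or other languages).

I'm also not sure whether I can do this all in one go, or whether I have to go matrix by matrix, passing the index above into the output array.

The closest I can come up with is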

ind_jan <- c(seq(from = 1, to = 13, by = 12))
smallout[,,1] <- apply(smallin[,,c(ind_jan)], 1:2, mean)

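repeated for each output matrix in the array. Is there a less manual / more efficient / better way?

Consider this simplified array A (see the data below).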

str(A)
# int [1:2, 1:3, 1:6] 1 1 1 1 1 1 2 2 2 2 ...

We can use sapply to "loop", with the option simplify='array' so that it returns an array of the annual means,

yrs <- seq_len(dim(A)[3]/nm)
## year i occupies slices (i - 1)*nm + 1 ... i*nm; \(i) needs R >= 4.1
sapply(yrs, \(i) apply(A[, , 1:nm + (i - 1)*nm], 1:2, mean), simplify='array')
# , , 1
# 
#      [,1] [,2] [,3]
# [1,]    2    2    2
# [2,]    2    2    2
# 
# , , 2
# 
#      [,1] [,2] [,3]
# [1,]    2    2    2
# [2,]    2    2    2

and, correspondingly, the monthly means across the years:

mnt <- seq_len(nm)
## month i occupies slices i, i + nm, i + 2*nm, ...
sapply(mnt, \(i) apply(A[, , seq(i, dim(A)[3], by=nm)], 1:2, mean), simplify='array')
# , , 1
# 
#      [,1] [,2] [,3]
# [1,]    1    1    1
# [2,]    1    1    1
# 
# , , 2
# 
#      [,1] [,2] [,3]
# [1,]    2    2    2
# [2,]    2    2    2
# 
# , , 3
# 
#      [,1] [,2] [,3]
# [1,]    3    3    3
# [2,]    3    3    3

Data:

nm <- 3  ## no. "months"  ## actually 12 months in real years
ny <- 2  ## no. "years"  ## in your case 33
A <- array(rep(1:nm, each=nm*ny), c(2, 3, nm*ny))  ## think this is your `ca`
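Because every "year" in A holds identical values, A cannot reveal whether the slice indices pick out the intended months or years. As a rough check, here is a sketch using a hypothetical array `B` (not from the original post) in which slice k is filled with the value k, so the expected means can be worked out by hand. `function(i)` is used instead of `\(i)` so that it also runs on R versions before 4.1.

```r
nm <- 3  # "months" per year (12 in the real data)
ny <- 2  # "years" (33 in the real data)

# Hypothetical check array: slice k is filled with the value k
B <- array(rep(seq_len(nm * ny), each = 6), c(2, 3, nm * ny))

# Monthly means across years: month i lives in slices i, i + nm, i + 2*nm, ...
monthly <- sapply(seq_len(nm),
                  function(i) apply(B[, , seq(i, dim(B)[3], by = nm)], 1:2, mean),
                  simplify = "array")

# Annual means: year j lives in slices (j - 1)*nm + 1 ... j*nm
annual <- sapply(seq_len(ny),
                 function(j) apply(B[, , (j - 1) * nm + seq_len(nm)], 1:2, mean),
                 simplify = "array")

monthly[1, 1, ]  # 2.5 3.5 4.5
annual[1, 1, ]   # 2 5
```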

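You can add another dimension to the array by splitting the last dimension, which holds both month and year, so that month and year become separate dimensions.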

i <- dim(smallin)
dim(smallin) <- c(i[1:2], 12L, i[3]/12L)

With this you can get the mean of each month across all years:

apply(smallin, 1:3, mean)
#, , 1
#
#         [,1]     [,2]     [,3]     [,4]     [,5]
#[1,] 73.66338 58.35988 72.33907 62.19628 52.08766
#[2,] 61.95544 79.93891 75.27725 49.30859 44.07820
#[3,] 64.02119 68.98285 35.76780 35.06961 58.79089
#[4,] 73.67935 67.72028 50.90479 23.22819 72.14434
#[5,] 62.57796 59.03798 64.53486 83.65987 97.04576
#
#...
#
#, , 12
#
#         [,1]     [,2]     [,3]     [,4]     [,5]
#[1,] 83.55254 68.77645 48.88358 52.99573 56.82992
#[2,] 83.47723 39.02472 95.08051 65.97988 54.00097
#[3,] 47.59936 36.93396 38.35189 57.86126 83.99976
#[4,] 73.00906 53.71818 36.93229 80.85843 39.27094
#[5,] 81.67441 64.50031 62.71359 56.27758 54.01388

The annual mean for each individual year:

apply(smallin, c(1,2,4), mean)
#, , 1
#
#         [,1]     [,2]     [,3]     [,4]     [,5]
#[1,] 60.77253 60.15417 54.71206 67.31820 62.05012
#[2,] 56.60298 59.14604 73.17469 57.66912 53.36540
#[3,] 56.52924 56.31096 58.73874 67.47850 59.06819
#[4,] 67.75999 56.45636 49.43743 55.14660 65.46497
#[5,] 60.28056 62.17656 55.08681 54.15788 60.05240
#
#, , 2
#
#         [,1]     [,2]     [,3]     [,4]     [,5]
#[1,] 60.55035 65.21223 59.92112 59.75500 69.77088
#[2,] 60.89782 54.59722 55.17699 59.06815 60.03906
#[3,] 58.85733 54.02893 47.31326 63.10434 59.56569
#[4,] 60.96362 61.82648 55.45109 54.50272 45.21176
#[5,] 59.94452 54.31497 60.64839 64.65777 80.86525

The annual mean across all years:

apply(smallin, 1:2, mean)
#         [,1]     [,2]     [,3]     [,4]     [,5]
#[1,] 60.66144 62.68320 57.31659 63.53660 65.91050
#[2,] 58.75040 56.87163 64.17584 58.36864 56.70223
#[3,] 57.69329 55.16994 53.02600 65.29142 59.31694
#[4,] 64.36180 59.14142 52.44426 54.82466 55.33836
#[5,] 60.11254 58.24577 57.86760 59.40782 70.45883
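Putting the pieces together, the 13-slice summary array from the question could be filled roughly as follows. This is a sketch under the assumption that the third dimension is ordered by month within year (Jan..Dec of year 1, then Jan..Dec of year 2, and so on); the name `x4` is made up here, and the row/column dimnames are left out for brevity.

```r
set.seed(42)
smallin <- array(rnorm(600, mean = 60, sd = 20), dim = c(5, 5, 24))

d <- dim(smallin)
# Reshape the 24 slices into 12 months x 2 years (month varies fastest)
x4 <- array(smallin, dim = c(d[1:2], 12L, d[3] / 12L))

smallout <- array(NA_real_, dim = c(d[1:2], 13L),
                  dimnames = list(NULL, NULL, c(month.abb, "Ann")))

smallout[, , 1:12] <- apply(x4, 1:3, mean)       # per month, averaged over years
smallout[, , 13]   <- apply(smallin, 1:2, mean)  # averaged over all 24 slices

# Base R's rowMeans() with dims = 3 averages over the 4th dimension
# and should agree with the apply() result above
same <- all.equal(smallout[, , 1:12], rowMeans(x4, dims = 3),
                  check.attributes = FALSE)
```

On an array the size of the real data, `rowMeans(x4, dims = 3)` is likely to be noticeably faster than `apply()`, though that has not been measured here.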

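I'm sure there is a better way, and I'm still keen to learn it if anyone has one, but the approach below works. It came together once I figured out how to do the indexing so that each month's data is selected and then averaged with apply.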

mca[,,1] <- apply(ca[,,c(seq(from = 1, to = 396, by = 12))], 1:2, mean)
mca[,,2] <- apply(ca[,,c(seq(from = 2, to = 396, by = 12))], 1:2, mean)
mca[,,3] <- apply(ca[,,c(seq(from = 3, to = 396, by = 12))], 1:2, mean)
mca[,,4] <- apply(ca[,,c(seq(from = 4, to = 396, by = 12))], 1:2, mean)
mca[,,5] <- apply(ca[,,c(seq(from = 5, to = 396, by = 12))], 1:2, mean)
mca[,,6] <- apply(ca[,,c(seq(from = 6, to = 396, by = 12))], 1:2, mean)
mca[,,7] <- apply(ca[,,c(seq(from = 7, to = 396, by = 12))], 1:2, mean)
mca[,,8] <- apply(ca[,,c(seq(from = 8, to = 396, by = 12))], 1:2, mean)
mca[,,9] <- apply(ca[,,c(seq(from = 9, to = 396, by = 12))], 1:2, mean)
mca[,,10] <- apply(ca[,,c(seq(from = 10, to = 396, by = 12))], 1:2, mean)
mca[,,11] <- apply(ca[,,c(seq(from = 11, to = 396, by = 12))], 1:2, mean)
mca[,,12] <- apply(ca[,,c(seq(from = 12, to = 396, by = 12))], 1:2, mean)
mca[,,13] <- apply(ca, 1:2, mean)
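The thirteen assignments follow one pattern, so they can be collapsed. The sketch below uses a small random stand-in for `ca` (assumed data, 4 x 3 x 24) rather than the real array, and `n_slices` is a helper name introduced here. Note that `vapply` still iterates over the months internally; it only removes the repetition.

```r
# Small stand-in for `ca` (assumed data): 4 x 3 grid, 2 years of monthly slices
set.seed(1)
ca <- array(runif(4 * 3 * 24, 0, 100), dim = c(4, 3, 24))
n_slices <- dim(ca)[3]

mca <- array(NA_real_, dim = c(dim(ca)[1:2], 13L),
             dimnames = list(NULL, NULL, c(month.abb, "Ann")))

# Month m occupies slices m, m + 12, m + 24, ...
mca[, , 1:12] <- vapply(
  1:12,
  function(m) apply(ca[, , seq(from = m, to = n_slices, by = 12)], 1:2, mean),
  FUN.VALUE = matrix(0, dim(ca)[1], dim(ca)[2])
)

# Slice 13 holds the mean over every month of every year
mca[, , 13] <- apply(ca, 1:2, mean)
```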
