R - Compute the row-wise weighted sum of a set of columns



I have the following data frame:

> library(tidyverse)
> dd <- tibble(a = rep(1,10), b = rep(1,10), c = rep(1,10))
> dd
# A tibble: 10 × 3
       a     b     c
   <dbl> <dbl> <dbl>
 1     1     1     1
 2     1     1     1
 3     1     1     1
 4     1     1     1
 5     1     1     1
 6     1     1     1
 7     1     1     1
 8     1     1     1
 9     1     1     1
10     1     1     1

and a weight vector:

> weight <- c(1, 5, 10)
> weight
[1]  1  5 10

When I want to compute the row-wise weighted sum over all columns of the data frame, I do this:

> dd %>% mutate(m = rowSums(map2_dfc(dd, weight,`*`)))
# A tibble: 10 × 4
       a     b     c     m
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     1    16
 2     1     1     1    16
 3     1     1     1    16
 4     1     1     1    16
 5     1     1     1    16
 6     1     1     1    16
 7     1     1     1    16
 8     1     1     1    16
 9     1     1     1    16
10     1     1     1    16

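But I don't know how to compute the row-wise weighted sum over a subset of the data frame's columns. I tried the code below, but it gives a confusing result: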

> dd %>% rowwise() %>% mutate(m = rowwise(map2_dfc(c_across(b:c), weight[2:3],`*`)))
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
# A tibble: 10 × 4
# Rowwise: 
       a     b     c m$...1 $...2
   <dbl> <dbl> <dbl>  <dbl> <dbl>
 1     1     1     1      5    10
 2     1     1     1      5    10
 3     1     1     1      5    10
 4     1     1     1      5    10
 5     1     1     1      5    10
 6     1     1     1      5    10
 7     1     1     1      5    10
 8     1     1     1      5    10
 9     1     1     1      5    10
10     1     1     1      5    10
Can anyone give me a hint on how to handle this?

This is matrix multiplication. The original computation is equivalent to as.matrix(dd) %*% weight. For a subset of columns inside mutate, you can do:

dd %>% mutate(m = ((across(b:c) %>% as.matrix()) %*% weight[2:3])[, 1])
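
To check the matrix-multiplication view on data where the rows actually differ, here is a small base R sketch. The data frame `df` and its values are made up for illustration; only `weight` comes from the question. The point it tries to show is that the weights have to be subset to the same positions as the selected columns.

```r
# Hypothetical data with distinct values so each result can be checked by hand
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
weight <- c(1, 5, 10)

# All columns: every row is multiplied element-wise by weight and summed
all_cols <- drop(as.matrix(df) %*% weight)

# Columns b and c only: take the weights at the matching positions (2 and 3)
sub_cols <- drop(as.matrix(df[c("b", "c")]) %*% weight[2:3])

all_cols  # first row: 1*1 + 4*5 + 7*10 = 91
sub_cols  # first row: 4*5 + 7*10 = 90
```

`drop()` is used here only to turn the one-column matrix returned by `%*%` into a plain vector.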

With a tidyverse approach, we can turn 'weight' into a named vector, loop across columns 'b' to 'c', subset the 'weight' value by column name (cur_column()), multiply, and take the rowSums:

library(dplyr)
names(weight) <- names(dd)
dd %>%
  mutate(m = rowSums(across(b:c, ~ .x * weight[cur_column()])))

with output:

# A tibble: 10 × 4
       a     b     c     m
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     1    15
 2     1     1     1    15
 3     1     1     1    15
 4     1     1     1    15
 5     1     1     1    15
 6     1     1     1    15
 7     1     1     1    15
 8     1     1     1    15
 9     1     1     1    15
10     1     1     1    15
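
One possible benefit of the named-vector approach is that the weight is looked up by column name rather than by position, so the selection order should not matter. The sketch below tries to illustrate that; the tibble `df`, its values, and the reversed selection via `all_of()` are assumptions made for the example, not part of the original answer.

```r
library(dplyr)

# Hypothetical data; values differ by row so each sum can be checked by hand
df <- tibble(a = 1:3, b = 4:6, c = 7:9)

# Weights named after the columns they belong to
weight <- c(a = 1, b = 5, c = 10)

# Columns are selected in reverse order (c, then b); cur_column() still
# picks the weight that matches each column's name
res <- df %>%
  mutate(m = rowSums(across(all_of(c("c", "b")), ~ .x * weight[cur_column()])))

res$m  # first row: 7*10 + 4*5 = 90
```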

Or, if we want to use rowwise (not recommended, since it is slower):

dd %>%
  rowwise %>%
  mutate(m = sum(c_across(b:c) * weight[2:3])) %>%
  ungroup

Or use crossprod:

dd %>%
  mutate(m = crossprod(t(pick(b:c)), weight[2:3])[, 1])
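
The reason this works is that `crossprod(x, y)` computes `t(x) %*% y`, so passing the transposed columns gives back the ordinary matrix product. A minimal base R sketch, using a made-up matrix `X` and weights `w` as stand-ins for the selected columns and `weight[2:3]`:

```r
# Hypothetical stand-ins for columns b:c and their weights
X <- cbind(b = 4:6, c = 7:9)
w <- c(5, 10)

# crossprod(t(X), w) is t(t(X)) %*% w, which is X %*% w
via_crossprod <- crossprod(t(X), w)[, 1]
via_matmul <- (X %*% w)[, 1]

via_crossprod  # first row: 4*5 + 7*10 = 90
```

Both forms return a one-column matrix, which is why `[, 1]` is applied before the result is stored as a column.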

Or in base R:

dd$m <- rowSums(dd[2:3] * weight[2:3][col(dd[2:3])])
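
The `col()` indexing in the base R version can be hard to read at first, so here is a sketch that unpacks it on made-up data (`df` and its values are assumptions for illustration). It also shows `sweep()` as one alternative way to multiply each column by its weight; that alternative is not from the original answer.

```r
# Hypothetical data; values differ by row so each sum can be checked by hand
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
weight <- c(1, 5, 10)

sub <- df[2:3]  # columns b and c

# col(sub) holds the column index of every cell, so indexing the two
# weights with it expands them to one weight per cell (column by column)
w_per_cell <- weight[2:3][col(sub)]
w_per_cell  # 5 5 5 10 10 10

m_col <- rowSums(sub * w_per_cell)

# Alternative: sweep() multiplies each column (MARGIN = 2) by its weight
m_sweep <- rowSums(sweep(as.matrix(sub), 2, weight[2:3], `*`))

unname(m_col)  # first row: 4*5 + 7*10 = 90
```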
