I have the following data frame:
> library(tidyverse)
> dd <- tibble(a = rep(1,10), b = rep(1,10), c = rep(1,10))
> dd
# A tibble: 10 × 3
       a     b     c
   <dbl> <dbl> <dbl>
 1     1     1     1
 2     1     1     1
 3     1     1     1
 4     1     1     1
 5     1     1     1
 6     1     1     1
 7     1     1     1
 8     1     1     1
 9     1     1     1
10     1     1     1
and a weight vector:
> weight <- c(1, 5, 10)
> weight
[1] 1 5 10
When I want to compute the row-wise weighted sum over all columns of the data frame, I do this:
> dd %>% mutate(m = rowSums(map2_dfc(dd, weight,`*`)))
# A tibble: 10 × 4
       a     b     c     m
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     1    16
 2     1     1     1    16
 3     1     1     1    16
 4     1     1     1    16
 5     1     1     1    16
 6     1     1     1    16
 7     1     1     1    16
 8     1     1     1    16
 9     1     1     1    16
10     1     1     1    16
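But I don't know how to compute the row-wise weighted sum over only a subset of the data frame's columns. I tried the code below, but it gives a messy result: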
> dd %>% rowwise() %>% mutate(m = rowwise(map2_dfc(c_across(b:c), weight[2:3],`*`)))
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
New names:
• `` -> `...1`
• `` -> `...2`
# A tibble: 10 × 4
# Rowwise:
       a     b     c m$...1 $...2
   <dbl> <dbl> <dbl>  <dbl> <dbl>
 1     1     1     1      5    10
 2     1     1     1      5    10
 3     1     1     1      5    10
 4     1     1     1      5    10
 5     1     1     1      5    10
 6     1     1     1      5    10
 7     1     1     1      5    10
 8     1     1     1      5    10
 9     1     1     1      5    10
10     1     1     1      5    10
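Can anyone give me a hint on how to handle this?

This is matrix multiplication. The original computation is equivalent to as.matrix(dd) %*% weight. For a subset of columns inside mutate, pair columns b:c with the matching weights weight[2:3]:
# [, 1] drops the 10 x 1 matrix returned by %*% to a plain vector
dd %>% mutate(m = ((across(b:c) %>% as.matrix()) %*% weight[2:3])[, 1])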
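To make the matrix view concrete, here is a minimal base R sketch. It assumes the same toy data as the question, stored in a plain data.frame; the helper names cols, idx, m_all and m_sub are illustrative only. The point is that the weights must be subset to the same positions as the selected columns.

```r
# Base R sketch: a row-wise weighted sum is a matrix-vector product.
dd <- data.frame(a = rep(1, 10), b = rep(1, 10), c = rep(1, 10))
weight <- c(1, 5, 10)

# All columns: every row is multiplied element-wise by `weight` and summed.
m_all <- as.vector(as.matrix(dd) %*% weight)

# Subset: look up the positions of the chosen columns so that the
# weights line up with them (b -> weight[2], c -> weight[3]).
cols <- c("b", "c")
idx <- match(cols, names(dd))
m_sub <- as.vector(as.matrix(dd[cols]) %*% weight[idx])

print(m_all)
print(m_sub)
```

With every cell equal to 1, m_all is 1 + 5 + 10 = 16 in each row and m_sub is 5 + 10 = 15.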
Using a tidyverse approach, we can turn 'weight' into a named vector, loop across columns 'b' to 'c', subset the 'weight' value by column name (cur_column()), multiply, and take the rowSums:
library(dplyr)
names(weight) <- names(dd)
dd %>%
  mutate(m = rowSums(across(b:c, ~ .x * weight[cur_column()])))
Output:
# A tibble: 10 × 4
       a     b     c     m
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     1    15
 2     1     1     1    15
 3     1     1     1    15
 4     1     1     1    15
 5     1     1     1    15
 6     1     1     1    15
 7     1     1     1    15
 8     1     1     1    15
 9     1     1     1    15
10     1     1     1    15
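One possible benefit of looking weights up by name is that the result should not depend on the order in which columns are selected. The sketch below is an illustration under assumptions: it uses different values per column (1, 2, 3) so that a mismatch would be visible, and the column names m_bc and m_cb are made up for the example.

```r
library(dplyr)

# Distinct values per column so a wrong column/weight pairing would show up.
dd <- tibble(a = rep(1, 10), b = rep(2, 10), c = rep(3, 10))
weight <- c(a = 1, b = 5, c = 10)  # named weights, one per column

res <- dd %>%
  mutate(
    # columns selected in their natural order
    m_bc = rowSums(across(b:c, ~ .x * weight[cur_column()])),
    # same columns selected in reverse order; weights still match by name
    m_cb = rowSums(across(all_of(c("c", "b")), ~ .x * weight[cur_column()]))
  )

print(res)
```

Both new columns are 2 * 5 + 3 * 10 = 40 in every row, whereas a positional pairing such as weight[2:3] would silently give a different answer if the column order changed.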
Or, if we want to use rowwise (not recommended, because it is slower):
dd %>%
  rowwise() %>%
  mutate(m = sum(c_across(b:c) * weight[2:3])) %>%
  ungroup()
Or use crossprod:
dd %>%
  mutate(m = crossprod(t(pick(b:c)), weight[2:3])[, 1])
Or with base R:
dd$m <- rowSums(dd[2:3] * weight[2:3][col(dd[2:3])])
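If this is needed in several places, the base R variant could be wrapped in a small helper. The function below is a hypothetical sketch, not part of any package: the name weighted_row_sum and its interface (a named weight vector whose names choose the columns) are assumptions made for illustration.

```r
# Hypothetical helper: the names of `w` select the columns of `df`,
# and the values of `w` are the weights applied to those columns.
weighted_row_sum <- function(df, w) {
  stopifnot(
    is.numeric(w),
    !is.null(names(w)),
    all(names(w) %in% names(df))
  )
  # Matrix-vector product over the selected columns, returned as a plain vector.
  as.vector(as.matrix(df[names(w)]) %*% w)
}

dd <- data.frame(a = rep(1, 10), b = rep(1, 10), c = rep(1, 10))
dd$m <- weighted_row_sum(dd, c(b = 5, c = 10))
print(dd)
```

Here dd$m is 15 in every row, matching the output shown for the other approaches. Passing the weights by name avoids having to keep positional indices such as 2:3 in sync on both sides.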