Element-wise multiplication, but only on selected columns

I have a large data frame in which I want to do element-wise multiplication, but only for certain columns. Here is a sample of the DF:

Name   Age State Student_A  Student_B  Height_Student_A Height_Student_B 
A     2   NZ    0.5         NA          0.5                0.2     
B     1   AS    0.5         NA          0.5                0.2     
C     4   MU    NA         0.6          0.5                0.2     
D     5   BY    NA         0.7          0.5                0.2

The goal is to multiply each Student column by its matching "Height" column. The output should look like this:

Name   Age State Student_A  Student_B  Height_Student_A Height_Student_B Score_Student_A Score_Student_B
A     2   NZ    0.5         NA          0.5                0.2           0.25           NA     
B     1   AS    0.5         NA          0.5                0.2           0.25           NA
C     4   MU    NA         0.6          0.5                0.2           NA             0.12
D     5   BY    NA         0.7          0.5                0.2           NA             0.14

I suspect element-wise multiplication is the way to go, but I'm not sure how to specify which columns to use. Thanks for your help.

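You can capture the Student columns and the Height columns and multiply them directly. Sorting both name vectors keeps each Student column lined up with its Height column.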

student_cols <- sort(grep('^Student', names(df), value = TRUE))
height_cols <- sort(grep('^Height', names(df), value = TRUE))
df[paste0('Score_', student_cols)] <- df[student_cols] * df[height_cols]
df
#  Name Age State Student_A Student_B Height_Student_A Height_Student_B Score_Student_A Score_Student_B
#1    A   2    NZ       0.5        NA              0.5              0.2            0.25              NA
#2    B   1    AS       0.5        NA              0.5              0.2            0.25              NA
#3    C   4    MU        NA       0.6              0.5              0.2              NA            0.12
#4    D   5    BY        NA       0.7              0.5              0.2              NA            0.14
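
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data and pairs the columns by name instead of relying on sort order. The `data.frame()` call is reconstructed from the table in the question, and deriving `height_cols` with `paste0()` assumes every Student column has a counterpart named `Height_<student column>`, which holds for this sample but may not hold for other data.

```r
# Sample data reconstructed from the table in the question.
df <- data.frame(
  Name = c("A", "B", "C", "D"),
  Age = c(2L, 1L, 4L, 5L),
  State = c("NZ", "AS", "MU", "BY"),
  Student_A = c(0.5, 0.5, NA, NA),
  Student_B = c(NA, NA, 0.6, 0.7),
  Height_Student_A = c(0.5, 0.5, 0.5, 0.5),
  Height_Student_B = c(0.2, 0.2, 0.2, 0.2)
)

# Columns whose names start with "Student_".
student_cols <- grep("^Student_", names(df), value = TRUE)

# Assumption: the matching height column is "Height_" + student column name.
height_cols <- paste0("Height_", student_cols)

# Fail early if a Student column has no matching Height column.
stopifnot(all(height_cols %in% names(df)))

# Element-wise product of the two blocks of columns; NA stays NA.
df[paste0("Score_", student_cols)] <- df[student_cols] * df[height_cols]
df
```

Building the height names from the student names means the pairing does not depend on how the two sets of names happen to sort.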

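Late to the game, but this may help you (and address @latlio's comment) generalize this to other processing, since Ronak's answer is a general solution for this particular data.

Reshape the data a bit. Unfortunately I'm not the strongest with the tidyr::pivot functions, so there may be a better way.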

library(dplyr)
library(tidyr)
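# Split each column name at its last underscore into a type ("Student" / "Height_Student") and a suffix ("A" / "B").
pivot_longer(dat, -c("Name", "Age", "State"), names_pattern = "(.*)_([^_]+)$", names_to = c("type", "AB")) %>%
  # Name id_cols explicitly; recent tidyr versions do not accept it as a positional argument.
  pivot_wider(id_cols = c(Name:State, AB), names_from = "type", values_from = "value") %>%
  mutate(Score = Student * Height_Student)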
# # A tibble: 8 x 7
#   Name    Age State AB    Student Height_Student  Score
#   <chr> <int> <chr> <chr>   <dbl>          <dbl>  <dbl>
# 1 A         2 NZ    A         0.5            0.5  0.25 
# 2 A         2 NZ    B        NA              0.2 NA    
# 3 B         1 AS    A         0.5            0.5  0.25 
# 4 B         1 AS    B        NA              0.2 NA    
# 5 C         4 MU    A        NA              0.5 NA    
# 6 C         4 MU    B         0.6            0.2  0.12 
# 7 D         5 BY    A        NA              0.5 NA    
# 8 D         5 BY    B         0.7            0.2  0.140
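
If the wide layout from the question is needed again, the long result can be spread back out. The following is only a sketch: the `long` data frame is typed in by hand to mirror the tibble shown above, and the resulting score columns come out as `Score_A` and `Score_B` (tidyr's default `<value column>_<AB>` naming), not `Score_Student_A` and `Score_Student_B`, so a rename would still be needed to match the requested output exactly.

```r
library(tidyr)

# Long-format data mirroring the tibble above (entered by hand for this sketch).
long <- data.frame(
  Name = rep(c("A", "B", "C", "D"), each = 2),
  Age = rep(c(2L, 1L, 4L, 5L), each = 2),
  State = rep(c("NZ", "AS", "MU", "BY"), each = 2),
  AB = rep(c("A", "B"), times = 4),
  Student = c(0.5, NA, 0.5, NA, NA, 0.6, NA, 0.7),
  Height_Student = rep(c(0.5, 0.2), times = 4)
)
long$Score <- long$Student * long$Height_Student

# One row per Name again; each value column is suffixed with the AB label.
wide <- pivot_wider(
  long,
  names_from = AB,
  values_from = c(Student, Height_Student, Score)
)
names(wide)
```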

Because R is vectorized, you can multiply the columns directly to create the new columns:

DF$Score_Student_A <- DF$Student_A * DF$Height_Student_A
DF$Score_Student_B <- DF$Student_B * DF$Height_Student_B
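
To see why the score is NA wherever the student value is missing, here is a small sketch on plain vectors. The names `student_a`, `height_a`, and `score_a` are made up for illustration; the values are the Student_A and Height_Student_A columns from the question.

```r
# Values copied from the Student_A and Height_Student_A columns.
student_a <- c(0.5, 0.5, NA, NA)
height_a <- c(0.5, 0.5, 0.5, 0.5)

# `*` works position by position; any product involving NA is NA.
score_a <- student_a * height_a
score_a
# [1] 0.25 0.25   NA   NA

# The NA positions in the result are exactly those of the input.
identical(is.na(score_a), is.na(student_a))
# [1] TRUE
```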
