Is there an R function to mutate a variable using if-else or conditions on multiple columns?



I am trying to create a variable in my data based on the following conditions:

x y Z S T G
1 0 1 0 1 0
1 0 0 0 0 0 
1 1 1 0 0 0 
1 1 1 1 1 1
if x=1 then 1, 
if y=1 then 2 if s=1 then 3, 
if t=1 then 4 if G=1 then 5 if X==y==z==1 then 6 and so on.

Please tell me how to write this using if-else.

Use if-else? You can compute it without if-else:

v <- 1:6
# this vector gives each column its value
# 1 2 3 ... 6
# the most tedious part is getting your notes into the R terminal
# as an R matrix.
# I used the fact that a string in R can span multiple lines:
s <- "x y Z S T G
1 0 1 0 1 0
1 0 0 0 0 0 
1 1 1 0 0 0 
1 1 1 1 1 1"
# it looks like this:
s
## [1] "x y Z S T G\n1 0 1 0 1 0\n1 0 0 0 0 0 \n1 1 1 0 0 0 \n1 1 1 1 1 1"
# after a long struggle with the base R functions,
# which led to errors and various problems, I found the most elegant way
# to turn this string into a matrix-like table
# is to use tidyverse's read_delim().
# install.packages("tidyverse")
# load tidyverse:
require(tidyverse) # or: library(tidyverse)
tb <- read_delim(s, delim=" ") ## it complains about parsing failures (likely the trailing spaces), but
tb
# A tibble: 4 x 6
      x     y     Z     S     T     G
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     0     1     0     1     0
2     1     0     0     0     0     0
3     1     1     1     0     0     0
4     1     1     1     1     1     1
# so it is read in correctly!

# what you actually want to do is
# multiply each row by `v` and sum the result:
tb[1, ]
# A tibble: 1 x 6
      x     y     Z     S     T     G
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     0     1     0     1     0
# you do:
v * tb[1, ]
  x y Z S T G
1 1 0 3 0 5 0
# summing this gives the desired number for that row:
sum(v * tb[1, ])
## [1] 9

# row-wise operations on a matrix/data.frame/tibble are done with apply():
apply(tb, MARGIN=1, FUN=function(row) v * row)
  [,1] [,2] [,3] [,4]
x    1    1    1    1
y    0    0    2    2
Z    3    0    3    3
S    0    0    0    4
T    5    0    0    5
G    0    0    0    6
# apply() returns each row's result as a column, so flip it back
# with the transpose function `t()`:
t(apply(tb, MARGIN=1, FUN=function(row) v * row))
     x y Z S T G
[1,] 1 0 3 0 5 0
[2,] 1 0 0 0 0 0
[3,] 1 2 3 0 0 0
[4,] 1 2 3 4 5 6
# to get directly the sum by row, do:
apply(tb, MARGIN=1, FUN=function(row) sum(v * row))
## [1]  9  1  6 21
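# these are the values you wanted, aren't they?
# a vectorized version is also possible, but note that plain `tb * v`
# recycles `v` down the columns, so the weights do not line up with them.
# transposing first (and back afterwards) fixes that:
t(t(tb) * v)
     x y Z S T G
[1,] 1 0 3 0 5 0
[2,] 1 0 0 0 0 0
[3,] 1 2 3 0 0 0
[4,] 1 2 3 4 5 6
# therefore the row sums are:
rowSums(t(t(tb) * v))
## [1]  9  1  6 21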

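So this is the usual (messy) way of arriving at a solution.

In the end it boils down to this (the kind of short answer you usually find on Stack Overflow):

Short answer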

require(tidyverse)
v <- 1:6
s <- "x y Z S T G
1 0 1 0 1 0
1 0 0 0 0 0 
1 1 1 0 0 0 
1 1 1 1 1 1"
tb <- read_delim(s, delim=" ")
rowSums(t(t(tb) * v))  # not `tb * v`: that recycles `v` down the columns
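
If you would rather avoid the tidyverse dependency, the same numbers can be reproduced in base R. This is only a sketch, not the code from the answer above: `df`, `score` and `naive` are names picked here for illustration, and `read.table(text = ...)` stands in for `read_delim()`. It also checks that multiplying the data frame by `v` directly recycles the weights down the columns rather than across them, which is why a matrix product is used instead.

```r
# Same data as above, kept as a multi-line string
s <- "x y Z S T G
1 0 1 0 1 0
1 0 0 0 0 0 
1 1 1 0 0 0 
1 1 1 1 1 1"

# read.table() splits on any run of whitespace,
# so the trailing spaces in some rows are harmless
df <- read.table(text = s, header = TRUE)

# one weight per column, in column order: 1 2 3 4 5 6
v <- seq_len(ncol(df))

# matrix product: multiply each row's entries by the weights and sum them
score <- drop(as.matrix(df) %*% v)
score  # values: 9 1 6 21

# pitfall: `df * v` recycles `v` column by column (down the rows),
# so the weights are misaligned and the row sums come out different
naive <- rowSums(df * v)
any(naive != score)  # TRUE
```

The matrix product does the multiply-and-sum in one step, so neither `apply()` nor a transpose is needed.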

That is the beauty of R: if you know exactly what to do, it takes only 1-3 lines of code (or a few more)...
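
If explicit if-else logic is still what you are after, nested `ifelse()` calls can express the rules from the question literally (`dplyr::case_when()` would be the tidyverse way to write the same chain). The sketch below rests on assumptions the question does not settle: the rules are checked from most specific to least specific, so the "x, y and Z all equal 1" rule wins over the single-column rules, and rows matching no rule get `NA`. The column name `code` is made up for illustration. Note that this produces a different variable from the weighted sum above.

```r
# The example data from the question, typed in as a data frame
df <- data.frame(
  x = c(1, 1, 1, 1),
  y = c(0, 0, 1, 1),
  Z = c(1, 0, 1, 1),
  S = c(0, 0, 0, 1),
  T = c(1, 0, 0, 1),
  G = c(0, 0, 0, 1)
)

# Assumed priority (not stated in the question): most specific rule first.
# ifelse() is vectorized, so the whole column is computed at once.
# Inside with(df, ...), `T` refers to the column, not to TRUE.
df$code <- with(df,
  ifelse(x == 1 & y == 1 & Z == 1, 6,
  ifelse(G == 1, 5,
  ifelse(T == 1, 4,
  ifelse(S == 1, 3,
  ifelse(y == 1, 2,
  ifelse(x == 1, 1, NA)))))))

df$code  # values: 4 1 6 6
```

Changing the order of the nested calls changes which rule takes precedence, so the order should follow whatever priority you actually intend.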
