I have the following tibble:
library(tidyverse)
dat <- structure(list(V1 = c("Number of input reads", "Uniquely mapped reads number",
"Uniquely mapped reads %", "Average mapped length"), V2 = c("26265603",
"13330431", "50.75%", "47.37")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L))
It looks like this:
V1 V2
<chr> <chr>
1 Number of input reads 26265603
2 Uniquely mapped reads number 13330431
3 Uniquely mapped reads % 50.75%
4 Average mapped length 47.37
What I want to do is convert the V2 column to numeric. The expected final result is:
V1 V2
<chr> <dbl>
1 Number of input reads 26265603
2 Uniquely mapped reads number 13330431
3 Uniquely mapped reads % 0.5075
4 Average mapped length 47.37
I tried this:
dat %>%
mutate(V2 = case_when(V1 == "Uniquely mapped reads %" ~ as.numeric(sub("%","",V2))/100,
TRUE ~ as.numeric(V2)))
But it gives me a warning:
Warning message:
In eval_tidy(pair$rhs, env = default_env) : NAs introduced by coercion
What is the right way to do this?
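Doing this with pipes can get a bit complicated, since we only want to update a few rows. In base R, we can first find the rows that contain the specific string and then update only those V2 values.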
# Logical index of the percentage row
inds <- dat$V1 == "Uniquely mapped reads %"
# Strip the "%" and rescale only that row
dat$V2[inds] <- as.numeric(sub("%", "", dat$V2[inds]))/100
dat
# A tibble: 4 x 2
# V1 V2
# <chr> <chr>
#1 Number of input reads 26265603
#2 Uniquely mapped reads number 13330431
#3 Uniquely mapped reads % 0.5075
#4 Average mapped length 47.37
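Note that V2 is still <chr> in the output above, because assigning a number into a character column stores it as a string; dat$V2 <- as.numeric(dat$V2) finishes the conversion. With pipes, one approach is: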
library(dplyr)
dat %>%
mutate(V2 = as.numeric(sub("%", "", V2))/
(c(1, 100)[(V1 == "Uniquely mapped reads %") + 1]))
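As for the warning in the question: case_when() evaluates every right-hand side for all rows before picking values, so as.numeric(V2) also sees "50.75%" and produces an NA with a warning, even though that NA is never used. A minimal sketch of a variant that avoids this by stripping the "%" before converting is shown below. It assumes that any V2 value ending in "%" should be divided by 100, which is a slightly broader rule than matching the V1 label; the name res is only illustrative.

```r
library(dplyr)

# Same data as in the question, rebuilt with tibble() for brevity
dat <- tibble(
  V1 = c("Number of input reads", "Uniquely mapped reads number",
         "Uniquely mapped reads %", "Average mapped length"),
  V2 = c("26265603", "13330431", "50.75%", "47.37")
)

res <- dat %>%
  mutate(
    # Remove a trailing "%" first so as.numeric() never sees it,
    # then divide by 100 only where the original string had a "%"
    V2 = as.numeric(sub("%$", "", V2)) / ifelse(grepl("%$", V2), 100, 1)
  )

res
```

Because the "%" is removed before as.numeric() is called, no coercion warning should be raised, and V2 ends up as a <dbl> column.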