How to sort out these two columns using R

I have a large data set; below is a sample:

df<-read.table (text=" Col1 Col2
65  NA
NA  91
56  NA
71  71
67  100
NA  45
44  NA
NA  90
NA  40
84  71
44  63
NA  20
", header=TRUE)

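I want to prepend "1" to the values in Col1 and fill the NAs in Col1 with the values from Col2. Consider row 2: the NA in Col1 should become 91, and here we do not prepend "1". Where Col1 is not NA, however, we prepend 1 at the start of the value.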

The desired result is:

Is this what you mean?

> with(df, as.numeric(ifelse(is.na(Col1), Col2, sprintf("1%s", Col1))))
[1] 165  91 156 171 167  45 144  90  40 184 144  20

Or

> with(df,ifelse(is.na(Col1), Col2, 100 + Col1))
[1] 165  91 156 171 167  45 144  90  40 184 144  20

We can use pmax:

with(df, pmax(Col1 + 100, Col2, na.rm = TRUE))
[1] 165  91 156 171 167  45 144  90  40 184 144  20
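
Note that pmax keeps the larger of the two values whenever both are present, so it matches the ifelse version only while Col1 + 100 exceeds Col2, as it does in the sample. A minimal sketch of the difference, using a small made-up data frame (toy is hypothetical, not taken from the question):

```r
# Hypothetical data: in row 1, Col2 is larger than Col1 + 100
toy <- data.frame(Col1 = c(5, NA, 65), Col2 = c(200, 91, NA))

# pmax keeps the larger of the two non-missing values
via_pmax <- with(toy, pmax(Col1 + 100, Col2, na.rm = TRUE))

# ifelse falls back to Col2 only where Col1 is missing
via_ifelse <- with(toy, ifelse(is.na(Col1), Col2, Col1 + 100))

via_pmax    # 200  91 165
via_ifelse  # 105  91 165
```

If Col2 can exceed Col1 + 100 in the real data, the ifelse or coalesce forms are probably the safer choice.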

With the help of coalesce, we can do:

library(dplyr)
df %>%
  transmute(Out = coalesce(suppressWarnings(as.numeric(paste0('1', Col1))), Col2))
#   Out
#1  165
#2   91
#3  156
#4  171
#5  167
#6   45
#7  144
#8   90
#9   40
#10 184
#11 144
#12  20

If the values in Col1 always have two digits, we can simplify this to:

df %>% transmute(Out = coalesce(Col1 + 100, Col2))
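
The two forms agree only for two-digit values: pasting a leading 1 shifts the value by a power of ten that depends on the number of digits, while + 100 always adds one hundred. A small sketch with made-up values (x is hypothetical):

```r
# Hypothetical values with one, two, and three digits
x <- c(7, 65, 123)

# Prepend the character "1", then convert back to numeric
pasted <- as.numeric(paste0("1", x))

# Add a fixed 100
added <- x + 100

pasted  # 17  165 1123
added   # 107 165  223
```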

Using within:

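within(df, {
  na <- is.na(Col1)       # rows where Col1 is missing
  Col1 <- 100L + Col1     # prepend 1 to the two-digit values
  Col1[na] <- Col2[na]    # fill the missing rows from Col2
  rm(na, Col2)            # drop the helper and Col2 from the result
})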
#    Col1
# 1   165
# 2    91
# 3   156
# 4   171
# 5   167
# 6    45
# 7   144
# 8    90
# 9    40
# 10  184
# 11  144
# 12   20

library(dplyr)
library(purrr)
df<-read.table (text=" Col1 Col2
65  NA
NA  91
56  NA
71  71
67  100
NA  45
44  NA
NA  90
NA  40
84  71
44  63
NA  20
", header=TRUE)
df %>%
  mutate(
    Col1 = Col1 %>% map2(Col2, ~ ifelse(is.na(.x), .y, .x + 1))
  )
#>    Col1 Col2
#> 1    66   NA
#> 2    91   91
#> 3    57   NA
#> 4    72   71
#> 5    68  100
#> 6    45   45
#> 7    45   NA
#> 8    90   90
#> 9    40   40
#> 10   85   71
#> 11   45   63
#> 12   20   20

Created on 2021-09-28 by the reprex package (v2.0.1)
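
The map2 call above adds 1 numerically and stores the result as a list column. If the goal is to prepend the digit 1, as in the other answers, one possible variant (a sketch, assuming two-digit values in Col1 and using only the first four rows of the sample) uses map2_dbl so that Col1 stays a plain numeric vector:

```r
library(dplyr)
library(purrr)

# First four rows of the sample data from the question
df <- read.table(text = "Col1 Col2
65 NA
NA 91
56 NA
71 71", header = TRUE)

out <- df %>%
  mutate(
    # map2_dbl returns a double vector instead of a list;
    # + 100 stands in for prepending "1" to a two-digit value
    Col1 = map2_dbl(Col1, Col2, ~ if (is.na(.x)) .y else .x + 100)
  )

out$Col1  # 165  91 156 171
```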

library(tidyverse)
df %>%
  mutate(across(c(Col1, Col2), as.numeric),
         Out = if_else(is.na(Col1), Col2, Col1 + 100))
