r - check if values in a series of columns are within a certain number of values from those in another series of columns

I have a data frame in R that looks like this:

|---------------------------------------------------------|
| col1 | col2   | col3  | col4  | col5  | col6   | col7   |
|______|________|_______|_______|_______|________|________|
| x    | 2003   | 2004  | 2009  | 2002  | 2011   | NA     |
|------|--------|-------|-------|-------|--------|--------|
| y    | 2004   |  NA   | NA    | 2002  | 2004   | NA     |
|------|--------|-------|-------|-------|--------|--------|
| x    | 2007   |  2009 | NA    | 2010  | 2012   | 2013   |
|---------------------------------------------------------|
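
For anyone who wants to reproduce this, the table above could be built roughly as follows. This is only a sketch: the name mydf is borrowed from the code later in the thread, and storing the years as plain numerics is an assumption.

```r
# Reproducible version of the example table.
# Assumption: years are plain numerics and col1 is a character column.
mydf <- data.frame(
  col1 = c("x", "y", "x"),
  col2 = c(2003, 2004, 2007),
  col3 = c(2004, NA, 2009),
  col4 = c(2009, NA, NA),
  col5 = c(2002, 2002, 2010),
  col6 = c(2011, 2004, 2012),
  col7 = c(NA, NA, 2013),
  stringsAsFactors = FALSE
)
```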

I want to count, for each category in col1, the number of times a value in col5:col7 occurs 0 to 2 years after any of the values in col2:col4.

So the desired result would be something like this:

[[x]] 
2
[[y]]
1

Or as a data frame like this:

col1 | count |
______________
x    | 2
--------------
y    | 1

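I figure there must be a dplyr way to do this, something like gather() and filter()? Or some way of using sapply to take the differences between the values and then count only those between 0 and 2?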

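The main problem I'm having is what the syntax looks like when not every column has a value in every row, and when I want to compare the values in col2:col4 against all of the values in col5:col7 rather than against one specific column.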
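
One way to picture the "compare everything with everything" step is a base R sketch built on outer(), which forms all pairwise differences within a row and lets NA entries drop out of the count. The helper count_pairs, its max_gap argument, and the variables per_row and counts are hypothetical names made up for this illustration; mydf is rebuilt from the example table so the snippet runs on its own.

```r
# Example data rebuilt from the table in the question
mydf <- data.frame(
  col1 = c("x", "y", "x"),
  col2 = c(2003, 2004, 2007),
  col3 = c(2004, NA, 2009),
  col4 = c(2009, NA, NA),
  col5 = c(2002, 2002, 2010),
  col6 = c(2011, 2004, 2012),
  col7 = c(NA, NA, 2013),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one row, count the (earlier, later) pairs
# where the later year falls 0 to max_gap years after the earlier year.
count_pairs <- function(earlier, later, max_gap = 2) {
  gap <- outer(later, earlier, "-")             # every later year minus every earlier year
  sum(gap >= 0 & gap <= max_gap, na.rm = TRUE)  # pairs involving NA are ignored
}

# Apply the helper row by row: col2:col4 are the earlier years, col5:col7 the later ones
per_row <- sapply(seq_len(nrow(mydf)), function(i) {
  count_pairs(unlist(mydf[i, 2:4]), unlist(mydf[i, 5:7]))
})

# Add up the per-row counts within each col1 category
counts <- tapply(per_row, mydf$col1, sum)
print(counts)
```

Note that this counts pairs, so a col5:col7 value that lies within 2 years of two different col2:col4 values would be counted twice. If each later value should count at most once, the pairwise comparison would need to be collapsed per later value first.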

OK, thanks @NelsonGon, this works, but I think there may be a simpler way:

library(dplyr)
library(tidyr)

# convert to long format: one row per (col2:col4 value, col5:col7 value) pair
test <- mydf %>%
  gather(first_group, year.1, col2:col4) %>%
  gather(second_group, year.2, col5:col7)
# remove the NA values (filter() also behaves correctly when there are no NAs)
test <- test %>%
  filter(!is.na(year.1), !is.na(year.2))
# count the pairs where year.2 falls 0 to 2 years after year.1
test2 <- test %>%
  group_by(col1) %>%
  filter(year.2 >= year.1 & year.2 <= year.1 + 2) %>%
  summarise(n = n())
## result
# test2
## A tibble: 2 x 2
#   col1      n
#   <chr> <int>
# 1 x         2
# 2 y         1
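
As for a simpler way, one possibility is pivot_longer(), which tidyr offers as the successor to gather() and which can drop NA values while reshaping. The following is a sketch rather than a definitive answer: it assumes tidyr 1.0.0 or later, rebuilds mydf from the example table, and the names second_group, count and result are chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Example data rebuilt from the table in the question
mydf <- data.frame(
  col1 = c("x", "y", "x"),
  col2 = c(2003, 2004, 2007),
  col3 = c(2004, NA, 2009),
  col4 = c(2009, NA, NA),
  col5 = c(2002, 2002, 2010),
  col6 = c(2011, 2004, 2012),
  col7 = c(NA, NA, 2013),
  stringsAsFactors = FALSE
)

result <- mydf %>%
  # reshape col2:col4 to long form, dropping NA years as we go
  pivot_longer(col2:col4, names_to = "first_group",
               values_to = "year.1", values_drop_na = TRUE) %>%
  # reshape col5:col7 as well, giving every within-row pair of years
  pivot_longer(col5:col7, names_to = "second_group",
               values_to = "year.2", values_drop_na = TRUE) %>%
  # keep pairs where year.2 is 0 to 2 years after year.1
  filter(year.2 >= year.1, year.2 <= year.1 + 2) %>%
  # number of qualifying pairs per category
  count(col1, name = "count")

print(result)
```

One thing to keep in mind with either version: a category with no qualifying pairs disappears at the filter() step instead of showing up with a count of 0.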
