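I want to calculate whether an individual survived from one year to the next. 0 means it died, 1 means it survived. The dataset covers the years 2007 to 2020, and the calculation should start from 2008, so I only want R to use part of the data I have.
My dataset looks like this (an excerpt of the first 17 rows):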
> ID 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
1 0 1 0 0 0 0 0 0 0 0 0 0 0 0
3 0 1 1 1 0 0 0 0 0 0 0 0 0 0
4 0 1 1 1 0 0 0 0 0 0 0 0 0 0
9 0 1 0 0 0 0 0 0 0 0 0 0 0 0
24 0 0 1 1 1 1 1 1 1 1 1 1 1 0
...
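I have 1121 entries in total, with 16 columns.
I want R to start in the first row at 2008 and check whether there is a 1. If there is a 1, I want R to look at the next column (2009) and see whether there is also a 1 (which should give me 1 as output) or a 0 (which should give me 0 as output). If there is no 1, I want R to check the next column until it finds a year with a 1, and then check the following column as described above. After it has found a 1 and done the check, it should ignore the remaining columns, move to the next row and repeat the process. The output should be saved in a new column.
I tried loops and if-else statements, as well as if, else if ...
The closest I got to my goal was with the following code: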
for (x in foal_fates_2) {
if (foal_fates_2$`2008`=="1" && foal_fates_2$`2009` =="1") {
print("1")
} else if (foal_fates_2$`2008`== "1" && foal_fates_2$`2009` =="0") {
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="1" && foal_fates_2$`2010` == "1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="1" && foal_fates_2$`2010`== "0") {
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="1" &&
foal_fates_2$`2011`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="1" &&
foal_fates_2$`2011`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="1" && foal_fates_2$`2012`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="1" && foal_fates_2$`2012`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="1" && foal_fates_2$`2013`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="1" && foal_fates_2$`2013`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="1" &&
foal_fates_2$`2014`== "1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="1" &&
foal_fates_2$`2014`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "1" && foal_fates_2$`2015`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "1" && foal_fates_2$`2015`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="1" && foal_fates_2$`2016` =="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="1" && foal_fates_2$`2016` =="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="1" &&
foal_fates_2$`2017`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="1" &&
foal_fates_2$`2017`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="1" && foal_fates_2$`2018`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="1" && foal_fates_2$`2018`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="0" && foal_fates_2$`2018`=="1" && foal_fates_2$`2019`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="0" && foal_fates_2$`2018`=="1" && foal_fates_2$`2019`=="0"){
print("0")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="0" && foal_fates_2$`2018`=="0" && foal_fates_2$`2019`=="1" &&
foal_fates_2$`2020`=="1"){
print("1")
} else if (foal_fates_2$`2008`== "0" && foal_fates_2$`2009` =="0" && foal_fates_2$`2010` =="0" &&
foal_fates_2$`2011`=="0" && foal_fates_2$`2012`=="0" && foal_fates_2$`2013`=="0" &&
foal_fates_2$`2014`== "0" && foal_fates_2$`2015`=="0" && foal_fates_2$`2016` =="0" &&
foal_fates_2$`2017`=="0" && foal_fates_2$`2018`=="0" && foal_fates_2$`2019`=="1" &&
foal_fates_2$`2020`=="0"){
print("0")
}
}
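With this code R at least does something, and the result has the correct number of entries, but the output is incorrect. R gives me 0s and 1s, but not in the right places. For example, for the first five rows R gives me the results "0" "0" "0" "0" "1" "0", but it should be "0" "1" "1" "1" "0", at least if I understand it correctly. I'm new to R, so maybe loops and if-else are not the right tools for what I want to do. So the question is: how can I reach my goal? Any help would be appreciated.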
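I would write a function and apply it to each row. Something like the following (it could of course be more elaborate, but it should get the job done):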
numberAfterFirstOne <- function(myRow){
  x <- which(myRow == 1)[1]              # index of the first 1; NA if there is none
  if (!is.na(x) && x < length(myRow))    # is there a value after the first 1?
    return(myRow[x + 1])
  else
    return(NA)
}
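Explanation:
- find which indices are equal to 1 and keep only the first one; if none is 1, x will be NA
- if there is a value after the first 1, return it
- otherwise return NA (this could also be 0 or any "key value" you like)
For testing, here is an example dataset: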
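To make the first step of the explanation concrete, here is a small sketch of how `which(...)[1]` behaves. The vectors `withOnes` and `noOnes` are made-up illustrations, not part of the original data.

```r
# Hypothetical example vectors, for illustration only
withOnes <- c(0, 0, 1, 1, 0)
noOnes   <- c(0, 0, 0)

allOnes  <- which(withOnes == 1)     # positions of every 1: 3 and 4
firstOne <- which(withOnes == 1)[1]  # only the first position: 3
noMatch  <- which(noOnes == 1)[1]    # no 1 at all, so this is NA

withOnes[firstOne + 1]               # value right after the first 1: 1
```

Because `which()` returns an empty vector when nothing matches, taking element `[1]` of it yields NA, which is what the function relies on for rows without any 1.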
n <- 5
m <- 16
set.seed(1562) # for reproducibility
dataset <- as.data.frame(matrix(ncol = m, nrow = n, data = round(runif(m * n, 0, 0.7))))
dataset <- rbind(dataset, rep(0, 16))
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16
1 1 0 0 1 0 0 0 1 0 1 0 1 0 0 1 0
2 1 1 0 0 0 1 1 0 0 1 1 0 0 0 1 0
3 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1
4 1 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0
5 0 1 1 0 0 1 0 1 0 1 0 1 0 0 1 0
6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Then apply the function numberAfterFirstOne to each row (apply is similar to a for loop, but more convenient to write and read):
apply(dataset, 1, numberAfterFirstOne)
[1] 0 1 0 0 1 NA
This is similar to the clunkier construct with a for loop:
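result <- numeric(nrow(dataset))   # preallocate the result vector
for (i in seq_len(nrow(dataset))){
  # unlist() turns the one-row data frame into a plain vector, as apply() does
  result[i] <- numberAfterFirstOne(unlist(dataset[i, ]))
}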
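You can now adapt the function to return whatever you want. At the moment it may return 0, 1 or NA; perhaps you only want 1 and 0, or 1 and NA. The if check inside the function is not strictly needed, because myRow[x + 1] already returns NA when the index is out of bounds (or when x itself is NA), which makes the function even simpler.
You can also modify the code so that it returns the year as well: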
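The simplification mentioned above could look like the following minimal sketch. The name `numberAfterFirstOneSimple` is hypothetical, chosen here so it does not overwrite the original function, and the sketch assumes the row arrives as a plain vector (as it does inside `apply`).

```r
# Hypothetical simplified variant: no explicit bounds check
numberAfterFirstOneSimple <- function(myRow) {
  x <- which(myRow == 1)[1]  # NA if the row contains no 1
  unname(myRow[x + 1])       # NA if x is NA or x + 1 is past the end
}

numberAfterFirstOneSimple(c(0, 1, 1, 0))  # 1
numberAfterFirstOneSimple(c(0, 1, 0, 0))  # 0
numberAfterFirstOneSimple(c(0, 0, 0, 1))  # NA: the first 1 is in the last position
numberAfterFirstOneSimple(c(0, 0, 0, 0))  # NA: no 1 at all
```

If you would rather get 0 than NA in the last two cases, you could replace NA values in the result afterwards; which of the two is appropriate depends on how you want to treat individuals with no follow-up year.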
colnames(dataset) <- 2007:2020 # name the columns of the example dataset
numberAfterFirstOne <- function(myRow){
x <- which(myRow == 1)[1]
return(c(x, myRow[x + 1])) # return the column index + the value
}
result <- apply(dataset, 1, numberAfterFirstOne) #save the result
result[1, ] <- names(dataset)[result[1, ]] # set column index to name of dataset column
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "2007" "2007" "2012" "2007" "2008" NA
[2,] "0" "1" "0" "0" "1" NA
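Applied to the layout from the question, the same idea might be sketched as follows. This assumes the data frame is called `foal_fates_2` with year columns named "2007", "2008", and so on; only the five rows and the columns 2007 to 2011 from the excerpt are rebuilt here, and the new column name `survived` is a hypothetical choice.

```r
# Rebuild a small part of the excerpt shown in the question (columns 2007-2011 only)
foal_fates_2 <- data.frame(
  ID     = c(1, 3, 4, 9, 24),
  `2007` = 0,
  `2008` = c(1, 1, 1, 1, 0),
  `2009` = c(0, 1, 1, 0, 1),
  `2010` = c(0, 1, 1, 0, 1),
  `2011` = c(0, 0, 0, 0, 1),
  check.names = FALSE              # keep the numeric-looking column names as they are
)

# Only look at the columns from 2008 onwards (2008:2020 for the full dataset)
year_cols <- as.character(2008:2011)

# Value in the column right after the first 1, stored in a new column
foal_fates_2$survived <- apply(foal_fates_2[year_cols], 1, function(myRow) {
  x <- which(myRow == 1)[1]
  unname(myRow[x + 1])
})

foal_fates_2$survived              # 0 1 1 0 1
```

Selecting the year columns by name before calling `apply` keeps the ID and 2007 columns out of the calculation, which matches the requirement that the check starts at 2008.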