r - Using a loop (or anything else) to write a vector of interval strings



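I am doing this the hard way and would like to know how to use a loop or some faster method. I am creating level labels for a cut statement, working with age groups.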

levels(age_group) <- c("<10","10-19","20-29","30-39","40-49","50-59","60-69","70-79","80-89","90-99","100-109",
                       "110-119","120-129","130+")

Does anyone have a good idea for how to do this? The "<10" and "130+" labels can be added by hand, but I am sure there is a faster way to generate the rest.

Thanks

It is best to use the levels generated by cut, because your current intervals do not define which end is included.

s <- c(-Inf,seq(10,130,10),Inf)
levels(cut(s,s))
#  [1] "(-Inf,10]"  "(10,20]"    "(20,30]"    "(30,40]"    "(40,50]"   
#  [6] "(50,60]"    "(60,70]"    "(70,80]"    "(80,90]"    "(90,100]"  
# [11] "(100,110]"  "(110,120]"  "(120,130]"  "(130, Inf]"
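To make the point about endpoints concrete, here is a minimal sketch of how the `right` argument of `cut` decides which bin a boundary value lands in. The `ages` vector is made-up sample data, not something from the question.

```r
# Hypothetical ages chosen to sit on the bin boundaries
ages <- c(9, 10, 19, 20)
s <- c(-Inf, seq(10, 130, 10), Inf)

# Default (right = TRUE): bins are (lower, upper], so 10 joins the first bin
right_closed <- cut(ages, s)

# right = FALSE: bins are [lower, upper), so 10 starts the second bin
left_closed <- cut(ages, s, right = FALSE)

data.frame(ages,
           right_closed = as.character(right_closed),
           left_closed  = as.character(left_closed))
```

With integer ages, the left-closed form is the one that corresponds to labels such as "10-19".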

If you must use your current intervals, you can use this simple function:

strInterval <- function(start, end, by) {
  s <- seq(start, end, by)
  i <- paste(head(s,-1), s[-1]-1, sep="-")
  c(paste0("<",start), i, paste0(end,"+"))
}
strInterval(10,130,10)
#  [1] "<10"     "10-19"   "20-29"   "30-39"   "40-49"   "50-59"   "60-69"  
#  [8] "70-79"   "80-89"   "90-99"   "100-109" "110-119" "120-129" "130+" 
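Labels built this way can then be handed to `cut` through its `labels` argument. The following is only a sketch: the `ages` vector is invented for illustration, and `right = FALSE` is an assumption that ages are whole numbers and that "10-19" means 10 up to but not including 20.

```r
# Rebuild the labels here so the sketch is self-contained
# (same values as strInterval(10, 130, 10))
s <- seq(10, 130, 10)
labs <- c("<10", paste(head(s, -1), s[-1] - 1, sep = "-"), "130+")

# Hypothetical integer ages
ages <- c(3, 10, 19, 20, 75, 129, 130, 140)

# 15 break points give 14 bins, one per label;
# right = FALSE makes each bin [lower, upper)
age_group <- cut(ages, breaks = c(-Inf, s, Inf), labels = labs, right = FALSE)
as.character(age_group)
```

`cut` requires exactly one label per bin, so the label vector must be one element shorter than the vector of breaks.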

cts <- seq(10,130, by=10)
paste(c("<=", cts), c(cts-1, "+") , sep="-")
# [1] "<=-9"    "10-19"   "20-29"   "30-39"   "40-49"   "50-59"   "60-69"  
# [8] "70-79"   "80-89"   "90-99"   "100-109" "110-119" "120-129" "130-+"  

You said you can adjust the two ends as needed, right?

Just plug in max/min and run the rest of the code.
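For completeness, adjusting the two ends of the `paste` result could look like the sketch below. The variable name `labs` is introduced here only for illustration.

```r
cts <- seq(10, 130, by = 10)
labs <- paste(c("<=", cts), c(cts - 1, "+"), sep = "-")

# Overwrite the two end labels, which come out as "<=-9" and "130-+"
labs[1] <- paste0("<", cts[1])
labs[length(labs)] <- paste0(cts[length(cts)], "+")
labs
```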

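min <- 10
max <- 130
seq1 <- seq(min, max, by = 10)          # lower bounds: 10, 20, ..., 130
seq2 <- seq(min - 1, max - 1, by = 10)  # upper bounds: 9, 19, ..., 129
# first label is "<10"; the "foo" entries are placeholders filled in by the loop
age_group <- c(paste("<", min, sep = ""), rep("foo", length(seq1) - 1))
for (i in 1:(length(seq1) - 1)) {
  grp1 <- seq1[i]
  grp2 <- seq2[i + 1]
  group <- paste(grp1, "-", grp2, sep = "")
  age_group[i + 1] <- group
}
age_group <- c(age_group, paste(max, "+", sep = ""))  # append the open-ended last label
age_group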

My solution was posted earlier, but here it is with some changes (this only applies if you use the cut function and the intervals it produces):

mydata <- round(seq(1, 20, length.out = 5))
mydata <- as.data.frame(mydata)
names(mydata) <- "V"  # name the column V
mydata$V1 <- cut(mydata$V, 5)  # break the data into five intervals, stored in column V1
# extract the lower value (backslashes must be doubled inside R strings)
mydata$lower <- with(mydata, round(as.numeric(sub("\\((.+),.*", "\\1", V1))))
# extract the upper value
mydata$upper <- with(mydata, round(as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", V1))))
myfinaldata <- mydata[, c("lower", "upper")]  # data frame of lower and upper values
myfinaldata$interval <- with(myfinaldata, paste(lower, upper, sep = "-"))
myfinaldata
#   lower upper interval
# 1     1     5      1-5
# 2     5     9      5-9
# 3     9    12     9-12
# 4    12    16    12-16
# 5    16    20    16-20
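The same two regular expressions can also be applied to `levels()` of the factor rather than to each row, so that the factor itself carries the "lower-upper" labels. This is a sketch under the assumption that the rounded bounds stay distinct (duplicated labels would merge bins); the names `x`, `f` and `lv` are illustrative only.

```r
# Same five values as round(seq(1, 20, length.out = 5))
x <- c(1, 6, 10, 15, 20)
f <- cut(x, 5)

# Parse the bounds out of labels of the form "(a,b]"
lv <- levels(f)
lower <- round(as.numeric(sub("\\((.+),.*", "\\1", lv)))
upper <- round(as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", lv)))

# Relabel the factor in place
levels(f) <- paste(lower, upper, sep = "-")
as.character(f)
```

Note that the rounded labels share endpoints ("1-5", "5-9"), so on their own they no longer show which bin a boundary value belongs to.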
