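I'm trying to accomplish what I originally thought would be a simple task, but my R coding skills are apparently very rusty. Simply put, I have a data frame set up in R, "test". X1 is a factor and X2 is a column with null values: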
X1  X2
F1  .
F1  .
F2  .
F3  .
F3  .
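My ultimate goal is to create a function or program that iterates over each level of the factor, asks the user for a value with which to fill X2 over the current level of X1, and then moves on to the next level of X1.
How would you program this?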
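My problem comes from the loop itself. Either the loop doesn't overwrite the values of X2 (it is treated as a local variable, I assume), or I get a "condition has length > 1" error. Here are a couple of versions I've tried: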
someValue<-0
for (i in levels(test$X1)){
if (identical(test$X1,i)) {
test$X2<-someValue}
someValue+1
}
#This doesn't seem to overwrite X2
someValue<-0
for (i in levels(test$X1)){
if (test$X1==i) {
test$X2<-someValue}
someValue+1
}
#This throws the 'condition has length >1' warning. I understand why this is happening.
#However, ifelse isn't an option because I want it to do nothing
#and iterate to the next level of i if false.
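I don't want to use a lookup table or a join for this, because that would eliminate the time I'm trying to save by writing it. But clearly I'm not very good at writing loops in R!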
This function does what you describe in your question:
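fillfac <- function(vec){
  # one (initially empty) fill value per element of vec
  fill <- character(length(vec))
  # "iterate over each level of the factor"
  for(i in levels(vec)){
    # "ask the user for a value with which to fill X2
    #  over the current level of X1"
    print(paste("What should be the fill for", i, "?"))
    value <- scan(what = "character", n = 1)
    # compare against the level names; labels(vec) would return element
    # labels ("1", "2", ...), which only match levels that look like that
    fill[which(vec == i)] <- value
  }
  return(fill)
}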
Example:
> X1 <- factor(sample(1:5, size = 20, replace = TRUE))
> X2 <- fillfac(X1)
[1] "What should be the fill for 1 ?"
1: "one"
Read 1 item
[1] "What should be the fill for 2 ?"
1: "two"
Read 1 item
[1] "What should be the fill for 3 ?"
1: "three"
Read 1 item
[1] "What should be the fill for 4 ?"
1: "four"
Read 1 item
[1] "What should be the fill for 5 ?"
1: "five"
Read 1 item
> (df <- data.frame(X1, X2))
   X1    X2
1   1   one
2   3 three
3   1   one
4   2   two
5   5  five
6   3 three
7   3 three
8   4  four
9   2   two
10  3 three
11  2   two
12  3 three
13  4  four
14  5  five
15  2   two
16  1   one
17  2   two
18  2   two
19  5  five
20  4  four
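As for why the two loops in the question misbehave: identical(test$X1, i) compares the whole column with a single string, so it is always FALSE and nothing gets overwritten; test$X1 == i is a vector with one entry per row, which if() cannot use as a condition; and someValue+1 computes a result but never stores it. Here is a minimal sketch of the same loop using logical indexing instead of if(). It assumes the "test" data frame looks like the one in the question (rebuilt here by hand) and uses the counter as a stand-in for the value the user would type:

```r
# Rebuild the example data frame from the question (assumed layout);
# X2 starts out empty and is recycled to one NA per row.
test <- data.frame(
  X1 = factor(c("F1", "F1", "F2", "F3", "F3")),
  X2 = NA_real_
)

someValue <- 0  # stand-in for the value the user would supply
for (i in levels(test$X1)) {
  # test$X1 == i is a logical vector with one entry per row;
  # used as an index, it overwrites only the rows at level i.
  test$X2[test$X1 == i] <- someValue
  # The result has to be assigned back; a bare someValue + 1 is discarded.
  someValue <- someValue + 1
}

print(test)
```

Rows where the comparison is FALSE are simply left alone, which is the "do nothing and move on to the next level" behaviour asked for, so neither if() nor ifelse() is needed. To prompt the user instead, replace the counter with the scan() call from fillfac.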