I have a problem with the "for" loop in R



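First of all, I want to point out that English is not my native language and I do not speak it well, so I apologize for any mistakes.

Below are the lines I am having trouble with.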

vetor = iris$Sepal.Length
for (i in vetor) {
mean = mean(vetor)
SD = sd(vetor)
zscore = ((i-mean)/SD)
print(paste("The z score number", 1:150, "is",  zscore))
}

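What I want it to return is

"The z score number 1 is …"

"The z score number 2 is …"

and so on. It does give me that, but once it finishes it starts over again, repeating at least 13 times. I don't know why this happens.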

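If I understand correctly, you want to calculate the z-score of each value in iris$Sepal.Length. You don't need a loop for that: arithmetic in R is vectorized, which means an operation such as subtraction or division is applied to every element of a vector at once.

So you can do what I think you want without a loop: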

# Calculate the mean
mean = mean(vetor)
# Calculate the SD
SD = sd(vetor)
# Calculate the z-scores of every value in the vector, all at the same time
zscore = ((vetor-mean)/SD)
# Format the output as required.
print(paste("The z score number", 1:150, "is",  zscore))
[1] "The z score number 1 is -0.897673879196766"   
[2] "The z score number 2 is -1.13920048346495"    
[3] "The z score number 3 is -1.38072708773314"    
[4] "The z score number 4 is -1.50149038986724"    
[5] "The z score number 5 is -1.01843718133086"    
[6] "The z score number 6 is -0.535383972794483"   
[7] "The z score number 7 is -1.50149038986724"    
[8] "The z score number 8 is -1.01843718133086"    
[9] "The z score number 9 is -1.74301699413542"    
[10] "The z score number 10 is -1.13920048346495"   
<output truncated>
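
To make the repetition in the question concrete, here is a minimal sketch on a three-element stand-in vector (the name x and its values are assumptions for illustration, not the iris data). Inside the original loop zscore is a single number, but paste() is vectorized too: it recycles that one number across every index in 1:150, so each iteration prints a full block of labels. With a vector of z-scores, the same call pairs each index with its own value.

```r
# Hypothetical stand-in for iris$Sepal.Length, kept short for readability
x <- c(5.1, 4.9, 4.7)

# Inside the loop: ONE z-score, but an index vector of length 3.
# paste() recycles the single value, so one iteration yields 3 strings.
z_one <- (x[1] - mean(x)) / sd(x)
labels_loop <- paste("The z score number", 1:3, "is", z_one)
length(labels_loop)

# Vectorized: one z-score per element, so index i is paired with value i.
z_all <- (x - mean(x)) / sd(x)
labels_vec <- paste("The z score number", seq_along(x), "is", z_all)
print(labels_vec)

# Cross-check against base R's scale(), which centers and scales by default
all.equal(z_all, as.numeric(scale(x)))
```

On the full data the same recycling means each of the 150 iterations prints 150 lines, which is why the output appears to start over.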
data("iris")
vetor = iris$Sepal.Length
for (i in vetor) {
mean = mean(vetor)
SD = sd(vetor)
zscore = ((i-mean)/SD)
}
print(paste("The z score number", 1:150, "is",  zscore))
Output:

[1] "The z score number 1 is 0.0684325378759866"   "The z score number 2 is 0.0684325378759866"  
[3] "The z score number 3 is 0.0684325378759866"   "The z score number 4 is 0.0684325378759866"  
[5] "The z score number 5 is 0.0684325378759866"   "The z score number 6 is 0.0684325378759866"  
[7] "The z score number 7 is 0.0684325378759866"   "The z score number 8 is 0.0684325378759866"  
[9] "The z score number 9 is 0.0684325378759866"   "The z score number 10 is 0.0684325378759866" 
[11] "The z score number 11 is 0.0684325378759866"  "The z score number 12 is 0.0684325378759866" 
[13] "The z score number 13 is 0.0684325378759866" ... and so on
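
Every line of the output above shows the same number because zscore is overwritten on each pass through the loop, so only the value for the last element is left when print() runs. A minimal sketch of one way to keep all of them while still using a loop is to preallocate a result vector and fill it by index (the names zscores, m and s are introduced here for illustration):

```r
vetor <- iris$Sepal.Length
m <- mean(vetor)   # computed once, outside the loop
s <- sd(vetor)

# Preallocate one slot per element instead of reusing a single variable
zscores <- numeric(length(vetor))
for (i in seq_along(vetor)) {
  zscores[i] <- (vetor[i] - m) / s
}

# Now each index is paired with its own z-score
head(paste("The z score number", seq_along(zscores), "is", zscores))
```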

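You can modify your code in the following way:

  • It is more efficient to move the calculation of mean and SD out of the loop, since they only need to be computed once rather than on every iteration
  • I decided to loop over the sequence of indices of vetor produced by seq_along, and to get the actual values by subsetting with [[]]

When you apply seq_along to a vector, it returns the index of every element; in this case it returns 1, 2, 3, ..., 150. As for the double brackets, a single bracket would work here as well, because for one element of an unnamed atomic vector the two give the same result. When subsetting a data frame or a list, however, single brackets return an object of the same class (a data frame or a list), whereas double brackets extract the element itself, for example a data frame column as a plain vector.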

vetor <- iris$Sepal.Length
mean <- mean(vetor)
SD <- sd(vetor)
for (i in seq_along(vetor)) {
zscore <- ((vetor[[i]] - mean) / SD)
print(paste("The z score number", i, "is",  zscore))
}
[1] "The z score number 1 is -0.897673879196766"
[1] "The z score number 2 is -1.13920048346495"
[1] "The z score number 3 is -1.38072708773314"
[1] "The z score number 4 is -1.50149038986724"
[1] "The z score number 5 is -1.01843718133086"
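
The difference between single and double brackets described above can be checked with a short sketch. The objects lst, df and v are made up for illustration; only iris comes from the original code.

```r
# A list: [ keeps the container, [[ extracts the element
lst <- list(a = 1:3, b = "x")
class(lst["a"])      # "list"
class(lst[["a"]])    # "integer"

# A data frame: [ returns a data frame, [[ returns the column as a vector
df <- iris[1:5, ]
class(df["Sepal.Length"])     # "data.frame"
class(df[["Sepal.Length"]])   # "numeric"

# An atomic vector: same value either way, but [[ drops the name
v <- c(first = 5.1, second = 4.9)
v[1]
v[[1]]

# seq_along() gives one index per element
seq_along(v)         # 1 2
```

Because iris$Sepal.Length carries no names, vetor[i] and vetor[[i]] are interchangeable inside the loop.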

To learn more, you can gain considerable insight from this book by Professor Roger D. Peng.
