R not calling object in memory

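I am building a function with several steps, where each step creates an object. One step fails (temp3) and cannot find the object from the preceding step (Error: object "temp2" not found). I'm not sure why: I have similar functions that follow exactly the same structure, with each step building on previously created objects, and they run fine inside a function. When you run the code outside the function it works (so the code itself looks fine), and debug() shows that the step that supposedly fails to create the data (temp2) does in fact store it in local memory (so I can see the object "temp2"), but for some reason R does not seem to recognise or use it. I'm stumped! Maybe I just don't understand how R evaluates the steps and calls objects in local memory? Am I writing the function the wrong way?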

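I can easily prepare a working example if that would be more useful, since this function calls some unusual packages, but for now I think the problem is more that I misunderstand how R assigns objects to local memory within a function. There is a similar query here, "How does R handle object in function call?", but in my case I am actually assigning each new object within the function. Can you help?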

    # Requires plyr (ddply) and glmulti; randomRows() is a user-defined helper.
    glm.random <- function(df) {
      reps <- 5
      output <- matrix(NA, ncol = 1, nrow = 0)
      while (length(output[, 1]) < reps) {
        temp1 <- ddply(df, .(study_id), randomRows, 1)
        temp2 <- subset(temp1, select = c(continent, taxatype, metric, nullm, yi_pos))
        temp3 <- glmulti(yi_pos ~ ., data = temp2, family = gaussian(link = log),
                         crit = aic, plotty = FALSE, report = FALSE)
        temp4 <- noquote(paste(summary(temp3)$bestmodel[1]))
        output <- rbind(output, temp4)
      }
      write.table(output, "output.glm.random1.txt", append = TRUE, sep = "\t", quote = FALSE)
    }

Reply:

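Hi again,

Andrie: 1) So I have removed the use of subset (but out of curiosity, which "unexpected results" do you mean?). 2) I have always found the data at hand hard to work with, but I see what you mean and need to improve my coding approach here. 3) Good tip! But here it is only there to check that the thing is working; I will probably just use that output object for further analysis.

Gavin: 1) Will do! 2+3) So the error seems to lie in creating (or calling) "temp1".

I hope the code below is reproducible. If it helps, the approach I am trying to replicate can be found in Gibson et al. 2011, Nature 478: 378 (see "Generalized linear models" in the detailed Methods).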

Thank you!

    # rm(list = ls())
    library("plyr")
    library("glmulti")
    # random rows function
    randomRows = function(df,n){
      return(df[sample(nrow(df),n),])
    }
    # Dataframe example
    study_id <- c(1,1,1,1,2,2,3,3,3,4)
    continent <- c("AF","AF","AF","AF","AF","AF", "AS", "AS", "AS", "SA")
    taxatype <- c("bird","bird","bird","mam","mam","arthro", "arthro", "arthro", "arthro", "arthro")
    metric<- c("sppr","sppr","sppr","sppr","abund","abund", "abund", "abund", "abund", "abund")
    extra.data <- 34:43
    yi_pos <- runif(10)  # 10 random response values in (0, 1)
    df <- data.frame(study_id = study_id, continent = continent, metric = metric,
                     taxatype = taxatype, extra.data = extra.data, yi_pos = yi_pos)
    df
    # Function. Goal: repeat x10000 (here reps = 5): select one random row per study_id,
    # run glmulti {glmulti}, select the best-ranked model, append it to the output, and export.
    glm.random <- function(df) {
      reps <- 5
      output <- matrix(NA, ncol = 1, nrow = 0)
      while (length(output[, 1]) < reps) {
        temp1 <- ddply(df, .(study_id), randomRows, 1)
        temp3 <- glmulti(yi_pos ~ continent + taxatype + metric, data = temp1,
                         family = gaussian(link = log), crit = aic, plotty = FALSE, report = FALSE)
        temp4 <- noquote(paste(summary(temp3)$bestmodel[1]))
        output <- rbind(output, temp4)
      }
      write.table(output, "output.glm.random1.txt", append = TRUE, sep = "\t", quote = FALSE)
    }
    # run function to obtain error
    glm.random(df)
    # debug(glm.random)
    # glm.random(df)
    # undebug(glm.random)

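From ?glmulti:

If [the data argument is] not specified, glmulti will try to find the data in the environment of the formula, from a fitted model passed as the y argument, or from the global environment.

However, when you specify data = temp1, glmulti apparently looks for this object in the global environment. So you may need to assign the randomly selected data to the global environment (I have renamed things slightly, to check how names and objects are resolved):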

    glm.random2 <- function(df) {
      reps <- 5
      output <- matrix(NA, ncol = 1, nrow = 0)
      while (length(output[, 1]) < reps) {
        ## Here things are different
        temp2 <- ddply(df, .(study_id), randomRows, 1)
        names(temp2)[2] <- "cOntinent"
        assign("temp1", temp2, envir = .GlobalEnv)
        ## Note the slightly modified formula, to check whether
        ## glmulti looks for terms in temp1 or simply as named objects in the environment.
        ## It looks like the former, which is good.
        temp3 <- glmulti(yi_pos ~ cOntinent + taxatype + metric, data = temp1,
                         family = gaussian(link = log), crit = aic, plotty = FALSE, report = FALSE)
        temp4 <- noquote(paste(summary(temp3)$bestmodel[1]))
        output <- rbind(output, temp4)
        ## Remove the object temp1 from the global environment
        rm(temp1, envir = .GlobalEnv)
      }
      write.table(output, "output.glm.random1.txt", append = TRUE, sep = "\t", quote = FALSE)
    }
    # run function - no error for me!
    glm.random2(df)
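
The kind of lookup that appears to be happening here can be reproduced with base R alone. The sketch below is not glmulti's code: `lookup_global` and `lookup_caller` are hypothetical helpers that only mimic a function resolving a data object by name in the global environment versus in the caller's frame, and `wrapper` stands in for a user function such as glm.random.

```r
# Hypothetical helper: resolves an object by name starting from the global
# environment, so objects local to the calling function are invisible to it.
lookup_global <- function(name) {
  get(name, envir = .GlobalEnv)
}

# Hypothetical helper: resolves the name starting from the caller's frame,
# which is where a function's local objects live.
lookup_caller <- function(name) {
  caller <- parent.frame()
  get(name, envir = caller)
}

wrapper <- function() {
  local_df <- data.frame(x = 1:3)  # exists only inside wrapper()
  list(
    # Fails: local_df is not reachable from the global environment.
    global = tryCatch(lookup_global("local_df"),
                      error = function(e) conditionMessage(e)),
    # Succeeds: local_df is found in wrapper()'s own frame.
    caller = lookup_caller("local_df")
  )
}

res <- wrapper()
res$global  # a character string holding the "not found" error message
res$caller  # the local data frame
```

This is why the object can be visible under debug() and still be reported as missing: it exists in the function's frame, but the lookup never searches there.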

You may want to contact the package maintainers to see whether this is the intended way for glmulti to work.
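
One weakness of the assign()/rm() pattern is that the temporary global object is left behind if the model call fails partway through. A minimal sketch of a safer variant, assuming a hypothetical helper `with_global` (it is not part of glmulti or plyr), uses on.exit() so that the cleanup runs even when the wrapped code throws an error:

```r
# Hypothetical helper: expose `value` under `name` in the global environment
# while `code` is evaluated, then remove it again, even if `code` fails.
with_global <- function(name, value, code) {
  if (exists(name, envir = .GlobalEnv, inherits = FALSE)) {
    stop("refusing to overwrite existing global object: ", name)
  }
  assign(name, value, envir = .GlobalEnv)
  on.exit(rm(list = name, envir = .GlobalEnv), add = TRUE)
  force(code)  # `code` is lazy, so it is only evaluated after the assign()
}

# Usage with base R only: the temporary object is visible while the code runs
# and is gone afterwards.
result <- with_global("tmp_global_df", data.frame(x = 1:3), {
  nrow(get("tmp_global_df", envir = .GlobalEnv))
})
result
exists("tmp_global_df", envir = .GlobalEnv, inherits = FALSE)
```

Inside glm.random2 the glmulti call could presumably be wrapped the same way, for example with_global("temp1", temp2, glmulti(..., data = temp1, ...)), but that combination is an untested assumption here.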
