R - Why does my code throw an error in a loop? It works when I increment the index "by hand", but fails when I put it in a loop



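I want to append the values from one data frame as column names to another data frame.

I have written code that generates one column at a time if I assign the index value "by hand":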

df_searchtable <- data.frame(category = c("air", "ground", "ground", "air"), wiggy = c("soar", "trot", "dive", "gallop"))  
df_host <- data.frame(textcolum = c("run on the ground", "fly through the air"))
#create vector of categories
categroups <- as.character(unique(df_searchtable$category))

##### if I assign colum names one at a time using index numbers no prob:
group = categroups[1]
df_host[, group] <- NA

##### if I use a loop to assign the column names:
for (i in categroups) {
  group = categroups[i]
  df_host[, group] <- NA
}

The code fails with:

Error in [<-.data.frame(`*tmp*`, , group, value = NA) : 
missing values are not allowed in subscripted assignments of data frames

How can I fix this?

Here is a simple base R solution:

df_host[categroups] <- NA
df_host
            textcolum air ground
1   run on the ground  NA     NA
2 fly through the air  NA     NA
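
The problem with your loop is that for (i in categroups) iterates over the elements themselves ("air", "ground"), while your code assumes i takes the values 1, 2, ..., n. Indexing an unnamed character vector by a string, as in categroups["air"], returns NA, and that NA is then used as the column name, which triggers the error.

For example: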

for (i in categroups) {
  print(i)
  print(categroups[i])
}
[1] "air"
[1] NA
[1] "ground"
[1] NA
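
To see why the lookup returns NA, it may help to compare indexing by position, by name on an unnamed vector, and by name on a named vector. This is only a small sketch; the variable named is introduced here for illustration and is not part of the original code.

```r
categroups <- c("air", "ground")

# Indexing by position works as expected
by_position <- categroups[1]

# Indexing by a string looks the string up in names(categroups).
# The vector has no names, so nothing matches and the result is NA.
by_string <- categroups["air"]

# Hypothetical variant: give the vector names equal to its values.
# Now the same string lookup finds a match.
named <- setNames(categroups, categroups)
by_name <- named[["air"]]

print(by_position)
print(is.na(by_string))
print(by_name)
```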

To fix the loop, you can do either of the following:

for (group in categroups) {
  df_host[, group] <- NA
}
# or
for (i in seq_along(categroups)) {
  group <- categroups[i]
  df_host[, group] <- NA
}
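
As a quick sanity check, the two loop styles can be run on separate copies of the data frame and compared. This is a self-contained sketch; the copies by_value and by_index are names made up for the comparison.

```r
df_host <- data.frame(textcolum = c("run on the ground", "fly through the air"))
categroups <- c("air", "ground")

# Version 1: iterate over the values themselves
by_value <- df_host
for (group in categroups) {
  by_value[, group] <- NA
}

# Version 2: iterate over positions, then look up each value
by_index <- df_host
for (i in seq_along(categroups)) {
  group <- categroups[i]
  by_index[, group] <- NA
}

# Both versions should produce the same data frame
print(identical(by_value, by_index))
print(names(by_value))
```

seq_along(categroups) is generally a safer choice than 1:length(categroups), because the latter evaluates to c(1, 0) when the vector is empty and the loop body would still run.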

Here is a solution using map_dfc from purrr.

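library(purrr)  # map_dfc
library(dplyr)  # bind_cols, tibble

# build one all-NA numeric column per category, then attach them to df_host
bind_cols(df_host,
          map_dfc(categroups, 
                  function(group) tibble(!!group := rep(NA_real_, nrow(df_host)))))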

This gives:

            textcolum air ground
1   run on the ground  NA     NA
2 fly through the air  NA     NA
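  • map_dfc maps over the input categroups, creates a single-column tibble for each element, and column-binds the newly created tibbles into one data frame
  • bind_cols joins the original data frame to the new tibble

Alternatively, you can use walk: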

walk(categroups, function(group){df_host <<- mutate(df_host, !!group := rep(NA_real_, nrow(df_host)))})

Here is an ugly base R solution: create an empty matrix that carries the column names and cbind it to the second data frame.

df_searchtable <- data.frame(category = c("air", "ground", "ground", "air"), 
                             wiggy = c("soar", "trot", "dive", "gallop"),
                             stringsAsFactors = FALSE)
df_host <- data.frame(textcolum = c("run on the ground", "fly through the air"),
                      stringsAsFactors = FALSE)
cbind(df_host, 
      matrix(nrow = nrow(df_host), 
             ncol = length(unique(df_searchtable$category)), 
             dimnames = list(NULL, unique(df_searchtable$category))))

Result:

            textcolum air ground
1   run on the ground  NA     NA
2 fly through the air  NA     NA
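
One detail that the printed output hides is the type of the new columns. A bare NA is logical in R, so the base R approaches produce logical columns, whereas the purrr version above uses NA_real_ and so produces numeric ones. The sketch below checks this for the base R variants; the names via_assign, via_cbind and via_numeric are made up for illustration.

```r
df_host <- data.frame(textcolum = c("run on the ground", "fly through the air"))
categroups <- c("air", "ground")

# Vectorised assignment with a bare NA
via_assign <- df_host
via_assign[categroups] <- NA

# cbind with an empty matrix (matrix() fills with logical NA by default)
via_cbind <- cbind(df_host,
                   matrix(nrow = nrow(df_host),
                          ncol = length(categroups),
                          dimnames = list(NULL, categroups)))

# Assigning a typed NA instead gives numeric columns
via_numeric <- df_host
via_numeric[categroups] <- NA_real_

print(sapply(via_assign[categroups], class))
print(sapply(via_cbind[categroups], class))
print(sapply(via_numeric[categroups], class))
```

If the columns are going to hold numbers or strings later, starting from NA_real_ or NA_character_ may avoid a surprise when checking column types, although R will also coerce a logical NA column when other values are assigned into it.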
