How to debug the invalid subscript type 'integer' error in R



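I am trying the code from "R: Extracting the maximum value in a vector under certain conditions", but I keep running into this error: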

Error in list(id.2 = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,  : 
  invalid subscript type 'integer'

The code is as follows:

library(dplyr)
dat <- read.table(header = TRUE, text = "id    name    year    job    job2 cumu_job2
1   Jane    1980    Worker  0   0
1   Jane    1981    Manager 1   1
1   Jane    1982    Sales   0   0
1   Jane    1983    Sales   0   0
1   Jane    1984    Manager 1   1
1   Jane    1985    Manager 1   2
1   Jane    1986    Boss    0   0
2   Bob     1985    Worker  0   0
2   Bob     1986    Sales   0   0
2   Bob     1987    Manager 1   1
2   Bob     1988    Manager 1   2
2   Bob     1989    Boss    0   0
3   Jill    1989    Worker  0   0
3   Jill    1990    Boss    0   0")
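# %>% is used here in place of %.%, the pipe from early dplyr releases that has since been deprecated
dat %>%
  group_by(id) %>%
  mutate(
    all_jobs = sum(unique(job) %in% c("Sales","Manager","Boss")),
    cumu_max = max(cumu_job2)
  ) %>%
  filter(all_jobs == 3, job %in% c("Sales","Boss"))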
Source: local data frame [5 x 8]
Groups: id
  id name year   job job2 cumu_job2 all_jobs cumu_max
1  1 Jane 1982 Sales    0         0        3        2
2  1 Jane 1983 Sales    0         0        3        2
3  1 Jane 1986  Boss    0         0        3        2
4  2  Bob 1986 Sales    0         0        3        2
5  2  Bob 1989  Boss    0         0        3        2

The sample code works for me too. However, I found that I can reproduce a similar error if I try this:

dat %>%
 group_by(dat$id) %>%
 mutate(
     all_jobs = sum(unique(job) %in% c("Sales","Manager","Boss")),
     cumu_max = max(cumu_job2)
 ) %>%
 filter(all_jobs == 3, job %in% c("Sales","Boss"))

That is, the error appears if I type "group_by(dat$id)" instead of "group_by(id)".


The sample code works for me as well. However, as Schnee mentioned, you can produce a similar error by replacing group_by(id) with group_by(dat$id). Reproducible code:

dat1 <- data.frame(x=c('A','A','B','B'), y=c('A','B','C','D'), val = 1:4)
dat2 <- data.frame(val = 1:4)
dat_group <- data.frame(x=c('A','A','B','B'))
# invalid subscript type 'integer'
dat1 %>%
  group_by(dat1$x) %>%
  mutate(y = sum(unique(y) %in% c("A","B","C")))
# invalid subscript type 'list'
dat2 %>%
  group_by(dat_group$x) %>%
  mutate(y = sum(unique(y) %in% c("A","B","C")))

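The first case is usually just a typo (you can replace dat1$x with x), but the second may be a valid use case (although I would suggest a join to make it cleaner).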
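One way to read the join suggestion is sketched below. It assumes the rows of dat_group line up one-to-one with the rows of dat2, which the example does not state explicitly. The key column row_id and the result column total are hypothetical names introduced only for illustration.

```r
library(dplyr)

dat2 <- data.frame(val = 1:4)
dat_group <- data.frame(x = c("A", "A", "B", "B"))

# Assumption: row i of dat_group describes row i of dat2,
# so a row index can serve as the join key (row_id is a made-up name).
dat2_keyed <- dat2 %>% mutate(row_id = row_number())
dat_group_keyed <- dat_group %>% mutate(row_id = row_number())

result <- dat2_keyed %>%
  inner_join(dat_group_keyed, by = "row_id") %>%  # x becomes a real column
  group_by(x) %>%                                 # group by bare name, no $
  mutate(total = sum(val)) %>%                    # per-group sum of val
  ungroup()

print(result)
```

Once the grouping variable is an ordinary column of the data frame, group_by() can refer to it by bare name and the subscript error no longer has a chance to occur.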

Solution

The dplyr package does not like the use of "$" inside group_by(). Try using "[" instead, for example:

dat1[,'x']

Quoting the variable name also works:

dat1$'x'

Full code:

dat1 %>%
  group_by(dat1[,'x']) %>%
  mutate(y = sum(unique(y) %in% c("A","B","C")))
dat1 %>%
  group_by(dat1$'x') %>%
  mutate(y = sum(unique(y) %in% c("A","B","C")))
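For the common case where the grouping column already lives in the data frame being piped, the simplest fix is to pass the bare column name. The sketch below uses the dat1 example from this thread. The result column n_match is a hypothetical name chosen so that the original y column is not overwritten. The error discussed here was reported against older dplyr releases, so behavior with "$" may differ in the version you have installed.

```r
library(dplyr)

dat1 <- data.frame(
  x   = c("A", "A", "B", "B"),
  y   = c("A", "B", "C", "D"),
  val = 1:4
)

result <- dat1 %>%
  group_by(x) %>%   # bare column name instead of dat1$x
  # n_match (made-up name): how many distinct y values in the group are in the set
  mutate(n_match = sum(unique(y) %in% c("A", "B", "C"))) %>%
  ungroup()

print(result)
```

Group A contains the y values A and B, both of which are in the set, while group B contains C and D, of which only C matches.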

See also https://github.com/hadley/dplyr/issues/433 or https://github.com/hadley/dplyr/issues/1554
