r - dplyr: group by and get counts - error message "Evaluation error: no applicable method for 'summarise_' applied to an object of class "logical""



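I have a data frame (df) with two variables, location and weather.

I want a wide data frame (dfgoal) in which the data are grouped by location and there are three new variables (weather_1 to weather_3), each holding the count of observations with that value in the original weather variable.

The problem is that when I try to use dplyr::mutate(), I only get TRUE/FALSE output instead of counts, or an error message: Evaluation error: no applicable method for 'summarise_' applied to an object of class "logical".

Any help would be greatly appreciated.

Starting point: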

df <- data.frame(location=c("az","az","az","az","bi","bi","bi","ca","ca","ca","ca","ca"),weather=c(1,1,2,3,2,3,2,1,2,3,1,2))

Desired result (dfgoal):

dfgoal <- data.frame(location=c("az","bi","ca"),weather_1=c(2,0,2),weather_2=c(1,2,2),weather_3=c(1,1,1))

Current code:

library(dplyr)
df %>% group_by(location)  %>% mutate(weather_1 = (weather == 1)) %>% mutate(weather_2 = (weather == 2)) %>% mutate(weather_3 = (weather == 3))
df %>% group_by(location)  %>% mutate(weather_1 = summarise(weather == 1)) %>% mutate(weather_2 = summarise(weather == 2)) %>% mutate(weather_3 = summarise(weather == 3))

It's very simple; the function is called table:

df %>% table  
        weather
location 1 2 3
      az 2 1 1
      bi 0 2 1
      ca 2 2 1
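
The table() output above is a contingency table rather than a data frame. If a data frame shaped like dfgoal is needed, one possible base R sketch is shown below. The names tab and wide are illustrative and not part of the original answer.

```r
df <- data.frame(
  location = c("az","az","az","az","bi","bi","bi","ca","ca","ca","ca","ca"),
  weather  = c(1,1,2,3,2,3,2,1,2,3,1,2)
)

# Cross-tabulate location against weather (rows = location, columns = weather)
tab <- table(df$location, df$weather)

# Turn the table into a data frame with one column per weather value
wide <- as.data.frame.matrix(tab)
names(wide) <- paste0("weather_", names(wide))

# Move the row names into a regular location column
wide <- cbind(location = rownames(wide), wide)
rownames(wide) <- NULL

wide
```

Because table() fills unobserved combinations with 0, no extra NA handling is needed here.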

Krzysztof's solution is the way to go, but if you insist on using the tidyverse, here is a dplyr + tidyr solution:

library(dplyr)
library(tidyr)
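df %>%
  group_by(location, weather) %>%
  # n() counts the rows in each location/weather group;
  # count() expects a data frame, not a column, so it fails here
  summarize(count = n()) %>%
  spread(weather, count, sep = "_") %>%
  # missing combinations come back as NA; replace them with 0
  mutate_all(~ coalesce(., 0L))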

Result:

# A tibble: 3 x 4
# Groups:   location [3]
  location weather_1 weather_2 weather_3
    <fctr>     <int>     <int>     <int>
1       az         2         1         1
2       bi         0         2         1
3       ca         2         2         1
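
On newer tidyr versions (1.0.0 or later, which is an assumption about your setup), spread() is superseded by pivot_wider(). A minimal sketch of the same idea with count() and pivot_wider() might look like this; the name wide is illustrative only.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  location = c("az","az","az","az","bi","bi","bi","ca","ca","ca","ca","ca"),
  weather  = c(1,1,2,3,2,3,2,1,2,3,1,2)
)

wide <- df %>%
  # count() adds a column n with the number of rows per location/weather pair
  count(location, weather) %>%
  # one column per weather value, prefixed with "weather_";
  # combinations that never occur are filled with 0
  pivot_wider(
    names_from   = weather,
    values_from  = n,
    names_prefix = "weather_",
    values_fill  = list(n = 0L)
  )

wide
```

Here count() replaces the group_by() + summarize() pair, and values_fill takes over the role of the coalesce() step.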

Krzysztof's answer wins on simplicity, but if you want a tidyverse-only solution (dplyr + tidyr):

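df %>% 
    group_by(location, weather) %>% 
    # weather == weather is TRUE for every row, so the sum is the group size (same as n())
    summarize(bin = sum(weather == weather)) %>%
    # fill = 0 replaces missing location/weather combinations with 0
    spread(weather, bin, fill = 0, sep = '_')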

This gives:

location weather_1 weather_2 weather_3
az               2         1         1
bi               0         2         1
ca               2         2         1
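
As for why the attempts in the question fail: weather == 1 is a logical vector, so mutate() just stores TRUE/FALSE for each row, and summarise() called on that vector (instead of on a data frame) has no method for class "logical". Summing the logical vector inside summarise() gives the count directly. The sketch below assumes the three weather values are known in advance; the name fixed is illustrative.

```r
library(dplyr)

df <- data.frame(
  location = c("az","az","az","az","bi","bi","bi","ca","ca","ca","ca","ca"),
  weather  = c(1,1,2,3,2,3,2,1,2,3,1,2)
)

fixed <- df %>%
  group_by(location) %>%
  # sum() of a logical vector counts the TRUE values within each group
  summarise(
    weather_1 = sum(weather == 1),
    weather_2 = sum(weather == 2),
    weather_3 = sum(weather == 3)
  )

fixed
```

This keeps the shape of the original code but hard-codes the weather values, so the table or reshape-based answers scale better when there are many categories.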
