r - Calculating the standard deviation for a stacked bar chart



I want to calculate the standard deviation and the standard error so that I can show error bars on a stacked bar chart.

Management    Habitat   Intensity     Var2   
A           Urban        High        6   
A          Farmland      High        9   
A          Farmland      Medium     10 
B          Forest        Medium     17 
B          Peatland      Medium     23     
C          Peatland      Low        22    
C          Urban         Low        10     

My stacked bar chart code is:

ggplot(df, aes(fill = Habitat, y = Var1, x = Intensity)) +
  geom_bar(position = "stack", stat = "identity") +
  labs(y = "Area of habitat (hectares)") +
  theme(legend.title = element_text())

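I tried the ddply function to calculate the standard deviation and standard error of Var2 by Intensity, which gives an overall error for each bar, and then set the ymin and ymax limits from it, but I got an error: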

Error: Aesthetics must be either length 1 or the same as the data (96): ymax and ymin

library(plyr)  # ddply() comes from plyr

# One summary row per Intensity level
EB <- ddply(Mean_PFB, c("Intensity"), summarise,
            N    = length(Var2),
            mean = mean(Var2),
            sd   = sd(Var2),
            se   = sd / sqrt(N))

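Is this your complete dataset? If so, it is not possible to calculate a standard deviation or standard error for the individual bar segments, because there is no proper replication: each Intensity and Habitat combination has a single observation. See the following: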

library(tidyverse)
#> Warning: package 'tidyr' was built under R version 3.6.2
#> Warning: package 'dplyr' was built under R version 3.6.2
df <- read.table(text = "Management    Habitat   Intensity     Var2   
A          Urban         High        6   
A          Farmland      High        9   
A          Farmland      Medium     10 
B          Forest        Medium     17 
B          Peatland      Medium     23     
C          Peatland      Low        22    
C          Urban         Low        10", header=T)
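# Standard deviation summary per group.
# mean_sdl() needs the Hmisc package installed and defaults to mult = 2,
# so ymin and ymax are mean -/+ 2 * SD.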
df %>% 
group_by(Habitat) %>% 
summarise(new = list(mean_sdl(Var2))) %>% 
unnest(new)
#> # A tibble: 4 x 4
#>   Habitat      y   ymin  ymax
#>   <fct>    <dbl>  <dbl> <dbl>
#> 1 Farmland   9.5   8.09  10.9
#> 2 Forest    17   NaN    NaN  
#> 3 Peatland  22.5  21.1   23.9
#> 4 Urban      8     2.34  13.7
df %>% 
group_by(Management) %>% 
summarise(new = list(mean_sdl(Var2))) %>% 
unnest(new)
#> # A tibble: 3 x 4
#>   Management     y   ymin  ymax
#>   <fct>      <dbl>  <dbl> <dbl>
#> 1 A           8.33  4.17   12.5
#> 2 B          20    11.5    28.5
#> 3 C          16    -0.971  33.0
df %>% 
group_by(Intensity) %>% 
summarise(new = list(mean_sdl(Var2))) %>% 
unnest(new)
#> # A tibble: 3 x 4
#>   Intensity     y   ymin  ymax
#>   <fct>     <dbl>  <dbl> <dbl>
#> 1 High        7.5  3.26   11.7
#> 2 Low        16   -0.971  33.0
#> 3 Medium     16.7  3.65   29.7
# Grouping by both Intensity and Habitat gives NaN limits,
# because each combination has only one observation (no replication)
df %>% 
group_by(Intensity, Habitat) %>% 
summarise(new = list(mean_sdl(Var2))) %>% 
unnest(new)
#> # A tibble: 7 x 5
#> # Groups:   Intensity [3]
#>   Intensity Habitat      y  ymin  ymax
#>   <fct>     <fct>    <dbl> <dbl> <dbl>
#> 1 High      Farmland     9   NaN   NaN
#> 2 High      Urban        6   NaN   NaN
#> 3 Low       Peatland    22   NaN   NaN
#> 4 Low       Urban       10   NaN   NaN
#> 5 Medium    Farmland    10   NaN   NaN
#> 6 Medium    Forest      17   NaN   NaN
#> 7 Medium    Peatland    23   NaN   NaN
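
The NaN limits follow from the definition of the sample standard deviation, which divides by n - 1 and so is undefined for a single value. A minimal base-R sketch, using only Var2 values from the table in the question (the variable names are just for illustration); note that base R's `sd()` reports NA where the output above shows NaN:

```r
# Var2 values taken from the table in the question
forest   <- 17        # the only Forest observation
farmland <- c(9, 10)  # the two Farmland observations

sd_forest   <- sd(forest)    # NA: a single value has no sample variance
sd_farmland <- sd(farmland)

# mean_sdl() defaults to mult = 2, so its limits are mean -/+ 2 * SD
limits <- mean(farmland) + c(-2, 2) * sd_farmland
print(sd_forest)
print(round(limits, 2))
```

The rounded limits should agree with the Farmland row of the Habitat summary above.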

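The same applies to the standard error: just use mean_se instead of mean_sdl (mean_se defaults to mult = 1, that is, mean -/+ 1 SE).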
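
Once replicated data are available, the "Aesthetics must be either length 1 or the same as the data" error can usually be avoided by giving the error-bar layer its own summary data frame instead of mixing summary columns into a layer that uses the full data. The sketch below is one possible approach, not the only one. It assumes a hypothetical data frame `rep_df` with an invented `Plot` column identifying replicates; all values are made up for illustration. It stacks the mean per Habitat and draws one error bar per Intensity from the standard error of the per-plot totals.

```r
library(dplyr)
library(ggplot2)

# Hypothetical replicated data (NOT from the question):
# three plots for every Intensity x Habitat combination
rep_df <- data.frame(
  Intensity = rep(c("High", "Low"), each = 6),
  Habitat   = rep(rep(c("Farmland", "Urban"), each = 3), times = 2),
  Plot      = rep(1:3, times = 4),
  Var2      = c(9, 11, 10, 6, 7, 5, 20, 22, 24, 10, 12, 8)
)

# Stack segments: mean Var2 per Intensity x Habitat
seg <- rep_df %>%
  group_by(Intensity, Habitat) %>%
  summarise(Var2 = mean(Var2)) %>%
  ungroup()

# Whole-bar error: total per plot, then mean and SE of those totals
eb <- rep_df %>%
  group_by(Intensity, Plot) %>%
  summarise(total = sum(Var2)) %>%
  group_by(Intensity) %>%
  summarise(N          = n(),
            mean_total = mean(total),
            se         = sd(total) / sqrt(N))

# The error-bar layer gets its own data and does not inherit the bar aesthetics
p <- ggplot(seg, aes(x = Intensity, y = Var2)) +
  geom_col(aes(fill = Habitat), position = "stack") +
  geom_errorbar(data = eb,
                aes(x = Intensity,
                    ymin = mean_total - se,
                    ymax = mean_total + se),
                inherit.aes = FALSE, width = 0.2) +
  labs(y = "Area of habitat (hectares)")
```

Because the stacked means add up to the mean total per Intensity, the error bar sits at the top of each stack. Printing `p` draws the plot.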

Created on 2020-04-27 by the reprex package (v0.3.0)
