R - Coding a summary table with summarised means based on factor levels (with a total column)



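I would like code that creates a summary table in which several means are calculated according to two criteria (i.e. the levels of a factor variable). The levels currently sit in a single column, but I would like to split them into their own columns in the table and add a total column (i.e. the mean of the two levels combined). I have the following example code:

I would like to use the table as a tidy data summary in R Markdown, and possibly convert it to Word.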

Depth <- c('0', '0.1-2.0', '2.1-10.0', '10.1-20.0', '20.1-50.0', '50.1-100.0',
           '0', '0.1-2.0', '2.1-10.0', '10.1-20.0', '20.1-50.0', '50.1-100.0')
Tag <- c('Tag.1', 'Tag.1', 'Tag.1', 'Tag.1', 'Tag.1', 'Tag.1',
         'Tag.2', 'Tag.2', 'Tag.2', 'Tag.2', 'Tag.2', 'Tag.2')
Proportion <- c(2.287356322, 5.896551724, 9.528735632, 7.229885057, 73.54022989, 1.517241379,
                0.5, 86.3, 13.2, 0.1, 0.1, 0.1)
Season <- c('Autumn', 'Autumn', 'Autumn', 'Autumn', 'Autumn', 'Autumn',
            'Summer', 'Summer', 'Summer', 'Summer', 'Summer', 'Summer')
df <- data.frame(Depth, Tag, Proportion, Season)

From this I can create the following table:

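library(dplyr)  # provides %>%, group_by() and summarise()
library(knitr)  # provides kable()
df$Proportion <- as.numeric(df$Proportion)
df$Depth <- as.factor(df$Depth)
# Mean proportion for each Season/Depth combination
tt1 <- df %>%
  group_by(Season, Depth) %>%
  summarise(Mean = mean(Proportion))
kable(tt1)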

|Season |Depth      |      Mean|
|:------|:----------|---------:|
|Autumn |0          |  2.287356|
|Autumn |0.1-2.0    |  5.896552|
|Autumn |10.1-20.0  |  7.229885|
|Autumn |2.1-10.0   |  9.528736|
|Autumn |20.1-50.0  | 73.540230|
|Autumn |50.1-100.0 |  1.517241|
|Summer |0          |  0.500000|
|Summer |0.1-2.0    | 86.300000|
|Summer |10.1-20.0  |  0.100000|
|Summer |2.1-10.0   | 13.200000|
|Summer |20.1-50.0  |  0.100000|
|Summer |50.1-100.0 |  0.100000|

But a further summary would benefit the reader (i.e. a table with only four columns: 1 Depth, 2 mean Autumn, 3 mean Summer and 4 Total).

I tried:

ttt1 <- df %>%
  group_by(Depth) %>%
  mutate(meanAut = case_when(Season == 'Autumn' ~
    summarise(mean(Proportion)))) %>%
  mutate(meanSum = case_when(Season == 'Summer' ~
    summarise(mean(Proportion)))) %>%
  bind_rows(summarise_all(., funs(if (is.numeric(.)) sum(.) else "Total")))

But I get the error: Error in mutate_impl(.data, dots) : Evaluation error: no applicable method for 'summarise_' applied to an object of class "c('double', 'numeric')".

Expected output:

Depth       meanAut meanSum Total
0           2.2     NA      2.2
0.1-2.0     5.8     86.3    46.05
10.1-20.0   7.2     0.1     3.65
2.1-10.0    9.5     13.2    11.35
20.1-50.0   73.5    0.1     36.8
50.1-100.0  1.5     0.1     0.8

Any suggestions on how to format the table would be greatly appreciated!

One tidyverse possibility could be:

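library(dplyr)
library(tidyr)  # provides spread()

df %>%
  group_by(Depth, Season) %>%
  # mean per depth and season
  summarise(mean_season = mean(Proportion, na.rm = TRUE)) %>%
  # prefix the season names so the new columns read Mean_Autumn, Mean_Summer
  mutate(Season = paste("Mean", Season, sep = "_")) %>%
  spread(Season, mean_season) %>%
  # overall mean per depth, joined on Depth
  left_join(df %>%
              group_by(Depth) %>%
              summarise(Mean_Total = mean(Proportion, na.rm = TRUE)),
            by = c("Depth" = "Depth"))

  Depth      Mean_Autumn Mean_Summer Mean_Total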
  <fct>            <dbl>       <dbl>      <dbl>
1 0                 2.29         0.5      1.39 
2 0.1-2.0           5.90        86.3     46.1  
3 10.1-20.0         7.23         0.1      3.66 
4 2.1-10.0          9.53        13.2     11.4  
5 20.1-50.0        73.5          0.1     36.8  
6 50.1-100.0        1.52         0.1      0.809

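Here, it first calculates the mean for each depth and season. Second, it creates new variable names that contain "Mean". Third, it spreads the new variable names into columns, with the means as values. Fourth, it calculates the overall mean for each depth. Finally, it joins the overall and seasonal means together on "Depth".

And adding kable() from knitr: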
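The same steps can also be sketched with pivot_wider(), which supersedes spread() in tidyr 1.0.0 and later. This is only an alternative under stated assumptions, not the approach given above: the object names season_means, total_means and summary_tbl are made up for illustration, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(
  Depth = rep(c('0', '0.1-2.0', '2.1-10.0', '10.1-20.0', '20.1-50.0', '50.1-100.0'), 2),
  Tag = rep(c('Tag.1', 'Tag.2'), each = 6),
  Proportion = c(2.287356322, 5.896551724, 9.528735632, 7.229885057, 73.54022989, 1.517241379,
                 0.5, 86.3, 13.2, 0.1, 0.1, 0.1),
  Season = rep(c('Autumn', 'Summer'), each = 6)
)

# Steps 1 to 3: mean per depth and season, then one column per season.
# names_prefix replaces the separate paste("Mean", ...) step.
season_means <- df %>%
  group_by(Depth, Season) %>%
  summarise(mean_season = mean(Proportion, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Season, values_from = mean_season, names_prefix = "Mean_")

# Step 4: overall mean per depth
total_means <- df %>%
  group_by(Depth) %>%
  summarise(Mean_Total = mean(Proportion, na.rm = TRUE))

# Step 5: join the seasonal and overall means on Depth
summary_tbl <- left_join(season_means, total_means, by = "Depth")
summary_tbl
```

The result should have the same four columns as the spread() version (Depth, Mean_Autumn, Mean_Summer, Mean_Total) and one row per depth band.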

library(dplyr)
library(tidyr)
library(knitr)  # provides kable()

df %>%
  group_by(Depth, Season) %>%
  summarise(mean_season = mean(Proportion, na.rm = TRUE)) %>%
  mutate(Season = paste("Mean", Season, sep = "_")) %>%
  spread(Season, mean_season) %>%
  left_join(df %>%
              group_by(Depth) %>%
              summarise(Mean_Total = mean(Proportion, na.rm = TRUE)),
            by = c("Depth" = "Depth")) %>%
  kable()

|Depth      | Mean_Autumn| Mean_Summer| Mean_Total|
|:----------|-----------:|-----------:|----------:|
|0          |    2.287356|         0.5|  1.3936782|
|0.1-2.0    |    5.896552|        86.3| 46.0982759|
|10.1-20.0  |    7.229885|         0.1|  3.6649425|
|2.1-10.0   |    9.528736|        13.2| 11.3643678|
|20.1-50.0  |   73.540230|         0.1| 36.8201149|
|50.1-100.0 |    1.517241|         0.1|  0.8086207|
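Since the question also asks about formatting, here is a minimal sketch of two optional tweaks: rounding and renaming columns through kable()'s digits and col.names arguments, and ordering the depth bands from shallow to deep instead of alphabetically. The summary values are copied from the table above; the names summary_tbl, depth_levels and formatted, the column labels, and the assumption that shallow-to-deep is the intended order are illustrative only.

```r
library(knitr)

# Summary values copied from the kable() output above
summary_tbl <- data.frame(
  Depth = c('0', '0.1-2.0', '10.1-20.0', '2.1-10.0', '20.1-50.0', '50.1-100.0'),
  Mean_Autumn = c(2.287356, 5.896552, 7.229885, 9.528736, 73.540230, 1.517241),
  Mean_Summer = c(0.5, 86.3, 0.1, 13.2, 0.1, 0.1),
  Mean_Total = c(1.3936782, 46.0982759, 3.6649425, 11.3643678, 36.8201149, 0.8086207)
)

# Assumed natural ordering of the depth bands (shallow to deep);
# the default factor ordering is alphabetical, which puts 10.1-20.0 before 2.1-10.0
depth_levels <- c('0', '0.1-2.0', '2.1-10.0', '10.1-20.0', '20.1-50.0', '50.1-100.0')
summary_tbl$Depth <- factor(summary_tbl$Depth, levels = depth_levels)
summary_tbl <- summary_tbl[order(summary_tbl$Depth), ]

# Round to two decimals, use readable headers, and drop the reordered row names
formatted <- kable(summary_tbl, digits = 2, row.names = FALSE,
                   col.names = c("Depth", "Mean Autumn", "Mean Summer", "Mean Total"))
formatted
```

Setting the factor levels on the original df before summarising would have the same ordering effect, because group_by() sorts its output by factor level.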
