How do I find the total for specific rows of a data frame in R?

I have a data frame (df). This is a sample of a larger version:

txnID   date        product          sold   repID   lastName
1001    8/5/2020    Clobromizen      600    203     Kappoorthy
1002    6/28/2020   Alaraphosol      276    887     da Silva
1003    6/28/2020   Alaraphosol      184    887     da Silva
1004    4/16/2020   Diaprogenix       36    887     da Silva
1005    6/14/2020   Diaprogenix       40    887     da Silva
1006    5/19/2020   Xinoprozen      5640    332     McRowe
1007    8/23/2020   Diaprogenix       60    332     McRowe
1008    11/14/2020  Clobromizen     2880    332     McRowe
1009    9/26/2020   Colophrazen      738    203     Kappoorthy
1010    2/5/2020    Diaprogenix       20    332     McRowe
1011    9/23/2020   Gerantrazeophem 3740    100     Schwab
1012    12/4/2020   Clobromizen     1584    221     Sixt
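For anyone who wants to reproduce this, the sample rows above can be rebuilt roughly as follows. The column types are an assumption (dates are kept as plain character strings, numbers as numeric), since the real df may store them differently.

```r
# Rebuild the sample rows shown above.
# Assumption: column types are guesses; the real df may differ.
df <- data.frame(
  txnID    = 1001:1012,
  date     = c("8/5/2020", "6/28/2020", "6/28/2020", "4/16/2020",
               "6/14/2020", "5/19/2020", "8/23/2020", "11/14/2020",
               "9/26/2020", "2/5/2020", "9/23/2020", "12/4/2020"),
  product  = c("Clobromizen", "Alaraphosol", "Alaraphosol", "Diaprogenix",
               "Diaprogenix", "Xinoprozen", "Diaprogenix", "Clobromizen",
               "Colophrazen", "Diaprogenix", "Gerantrazeophem", "Clobromizen"),
  sold     = c(600, 276, 184, 36, 40, 5640, 60, 2880, 738, 20, 3740, 1584),
  repID    = c(203, 887, 887, 887, 887, 332, 332, 332, 203, 332, 100, 221),
  lastName = c("Kappoorthy", "da Silva", "da Silva", "da Silva", "da Silva",
               "McRowe", "McRowe", "McRowe", "Kappoorthy", "McRowe",
               "Schwab", "Sixt"),
  stringsAsFactors = FALSE
)

# Quick look at the structure
str(df)
```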

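I want to create a new data frame that holds the total of all products sold by each employee shown (with every employee listed). It should look like this: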

View(df1)
lastName    totalSold
1  Kappoorthy  sum(df$sold)
2  da Silva    sum(df$sold)
3  McRowe      sum(df$sold)
4  Schwab      sum(df$sold)
5  Sixt        sum(df$sold)

In base R you can do it like this:

aggregate(sold~lastName, df, sum)
lastName sold
1    da Silva  536
2 Kappoorthy  1338
3     McRowe  8600
4     Schwab  3740
5       Sixt  1584
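The result above names the total column sold and sorts the names, while the question asks for a totalSold column with employees in the order they appear. A minimal sketch of one way to get that shape, using only the lastName and sold columns from the sample (df1 is the name used in the question):

```r
# Only the two columns needed for the totals, copied from the sample.
df <- data.frame(
  lastName = c("Kappoorthy", "da Silva", "da Silva", "da Silva", "da Silva",
               "McRowe", "McRowe", "McRowe", "Kappoorthy", "McRowe",
               "Schwab", "Sixt"),
  sold     = c(600, 276, 184, 36, 40, 5640, 60, 2880, 738, 20, 3740, 1584),
  stringsAsFactors = FALSE
)

# Total per employee
df1 <- aggregate(sold ~ lastName, data = df, FUN = sum)

# Rename the summed column to match the requested output
names(df1)[names(df1) == "sold"] <- "totalSold"

# Reorder rows to follow each employee's first appearance in df
df1 <- df1[match(unique(df$lastName), df1$lastName), ]
rownames(df1) <- NULL

df1
```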

To exclude certain products, for example "Xinoprozen" and "Diaprogenix":

aggregate(sold ~ lastName, df, sum, subset = !product %in% c("Xinoprozen", "Diaprogenix"))
lastName sold
1    da Silva  460
2 Kappoorthy  1338
3     McRowe  2880
4     Schwab  3740
5       Sixt  1584

If you have NAs:

aggregate(sold ~ lastName, df, sum, na.rm = TRUE)
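One thing worth knowing here: the formula method of aggregate uses na.action = na.omit by default, so rows with a missing sold value are dropped before sum ever sees them. An employee whose rows are all NA then disappears from the result. The sketch below uses a small made-up data frame (df_na, dropped and kept are hypothetical names, not from the question) to show the difference when na.action = na.pass is supplied.

```r
# Hypothetical toy data: Schwab has only a missing value.
df_na <- data.frame(
  lastName = c("Kappoorthy", "Kappoorthy", "Schwab"),
  sold     = c(600, NA, NA),
  stringsAsFactors = FALSE
)

# Default na.action = na.omit: NA rows are removed first,
# so Schwab has no rows left and is missing from the result.
dropped <- aggregate(sold ~ lastName, data = df_na, FUN = sum, na.rm = TRUE)

# na.pass keeps the NA rows; na.rm = TRUE then lets sum() skip them,
# so Schwab is kept with a total of 0.
kept <- aggregate(sold ~ lastName, data = df_na, FUN = sum,
                  na.rm = TRUE, na.action = na.pass)

dropped
kept
```

Which behaviour is right depends on whether an employee with no recorded sales should show up with a zero or not at all.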

Here is one way to do it with dplyr:

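library(dplyr)

# Drop the two excluded products, then total sales per employee
df %>%
  filter(!(product %in% c("Xinoprozen", "Diaprogenix"))) %>%
  group_by(lastName) %>%
  summarize(totalSold = sum(sold, na.rm = TRUE))

Another dplyr option, totaling sales for every employee:

library(dplyr)

df %>%
  group_by(lastName) %>%
  summarise(Totalsold = sum(sold))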

If you want to exclude any products, for example "Xinoprozen" and "Diaprogenix":

df %>%
  filter(!(product %in% c("Xinoprozen", "Diaprogenix"))) %>%
  group_by(lastName) %>%
  summarise(Totalsold = sum(sold))

Using base R aggregate:

aggregate(sold ~ lastName, data = df, FUN = sum, na.rm = TRUE)
