R - Creating a count column



I have a data frame in R that looks like this:

ID   REGION  FACTOR  
01    north    1
02    north    1
03    north    0
04    south    1
05    south    1
06    south    1
07    south    0
08    south    0

I want to create a column containing the number of rows in each REGION, counting only the rows that match a condition (FACTOR == 1).

I know how to count the values, but I can't find a function that produces this output:

ID   REGION  FACTOR  COUNT
01    north     1      2
02    north     1      2
03    north     0      2
04    south     1      3
05    south     1      3
06    south     1      3
07    south     0      3 
08    south     0      3

Can anyone help me?

We can use add_count, which adds the number of rows in each REGION:

library(dplyr)
df1 %>%
  add_count(REGION)
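As a small sketch of what add_count computes here: by default it counts every row in each REGION, which is not yet the filtered count asked for. Assuming the `wt` and `name` arguments of dplyr's add_count, weighting by FACTOR sums the 0/1 values per group instead. The names `with_n` and `with_count` are only illustrative, and the data is the same as df1 in this answer.

```r
library(dplyr)

df1 <- data.frame(
  ID = 1:8,
  REGION = c("north", "north", "north",
             "south", "south", "south", "south", "south"),
  FACTOR = c(1L, 1L, 0L, 1L, 1L, 1L, 0L, 0L)
)

# Default: adds a column n with the number of rows per REGION
# (3 for north, 5 for south), ignoring FACTOR
with_n <- df1 %>%
  add_count(REGION)

# Weighted: sums FACTOR within each REGION, so only FACTOR == 1 rows
# contribute (2 for north, 3 for south)
with_count <- df1 %>%
  add_count(REGION, wt = FACTOR, name = "COUNT")

with_count
```

This only works as a count because FACTOR holds nothing but 0 and 1.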

If what is needed is the sum of FACTOR within each REGION:

df1 %>%
  group_by(REGION) %>%
  mutate(COUNT = sum(FACTOR))
#or use
# mutate(COUNT = sum(FACTOR != 0))
# A tibble: 8 x 4
# Groups:   REGION [2]
#     ID REGION FACTOR COUNT
#  <int> <chr>   <int> <int>
#1     1 north       1     2
#2     2 north       1     2
#3     3 north       0     2
#4     4 south       1     3
#5     5 south       1     3
#6     6 south       1     3
#7     7 south       0     3
#8     8 south       0     3

Or use data.table:

library(data.table)
setDT(df1)[, COUNT := sum(FACTOR), by = REGION]

Data

df1 <- structure(list(ID = 1:8, REGION = c("north", "north", "north", 
"south", "south", "south", "south", "south"), FACTOR = c(1L, 
1L, 0L, 1L, 1L, 1L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-8L))

A base R solution using ave:

dfout <- within(df, COUNT <- ave(FACTOR,REGION, FUN = sum))

which gives

> dfout
ID REGION FACTOR COUNT
1  1  north      1     2
2  2  north      1     2
3  3  north      0     2
4  4  south      1     3
5  5  south      1     3
6  6  south      1     3
7  7  south      0     3
8  8  south      0     3

Data

df <- structure(list(ID = 1:8, REGION = c("north", "north", "north", 
"south", "south", "south", "south", "south"), FACTOR = c(1L, 
1L, 0L, 1L, 1L, 1L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-8L))
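If FACTOR could ever hold values other than 0 and 1, summing it would no longer be a count. A hedged variant of the ave idea above counts only the rows where FACTOR == 1 by turning the condition into 0/1 first. `dfout2` is a hypothetical name used just for this sketch.

```r
df <- data.frame(
  ID = 1:8,
  REGION = c("north", "north", "north",
             "south", "south", "south", "south", "south"),
  FACTOR = c(1L, 1L, 0L, 1L, 1L, 1L, 0L, 0L)
)

# Convert the condition to 0/1, then sum it within each REGION;
# ave() returns a vector as long as the input, so every row gets its group total
dfout2 <- within(df, COUNT <- ave(as.integer(FACTOR == 1), REGION, FUN = sum))

dfout2
```

Because ave returns one value per input row, no merge back onto the original data frame is needed.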

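Group by REGION, then create (mutate) a new column named count holding the number of observations in each group, n(). Note that n() counts every row in the group, whatever the value of FACTOR.

library(tidyverse)
group_by(df, REGION) %>%
  mutate(count = n()) %>%
  ungroup()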

You will want to ungroup() at the end so that later calculations are not performed at the group level.
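To make the difference between the two counts concrete, here is a minimal sketch that puts n() and a conditional sum side by side in one grouped mutate. The column name `n_rows` and the object name `res` are assumptions made for illustration; the data matches the question.

```r
library(dplyr)

df <- data.frame(
  ID = 1:8,
  REGION = c("north", "north", "north",
             "south", "south", "south", "south", "south"),
  FACTOR = c(1L, 1L, 0L, 1L, 1L, 1L, 0L, 0L)
)

res <- df %>%
  group_by(REGION) %>%
  mutate(
    n_rows = n(),               # all rows in the group: 3 for north, 5 for south
    COUNT  = sum(FACTOR == 1)   # only rows with FACTOR == 1: 2 for north, 3 for south
  ) %>%
  ungroup()                     # drop the grouping so later steps run on the whole table

res
```

After ungroup(), a later summarise() or mutate() on `res` works across all eight rows rather than per REGION.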
