I want to transform the data so that all rows sharing the same col1 value are collapsed into a single row.
col1 Col2 Col3 Col4 Col5
1 344230. masalas & spices 4 14 2
2 344231. hair care 4 14 1
3 344231. otc 4 14 1
4 344231. personal hygiene 4 14 1
5 344232. detergents 4 14 2
6 344233. biscuits 4 14 2
7 344233. chocolates & sweets 4 14 1
8 344233. dry fruits 4 14 2
The output should look like this:
col1 Col2 Col5
344230 masalas & spices 2
344231 hair care,otc,personal hygiene 1+1+1=3
344232 detergents 2
344233 biscuits,chocolates & sweets,dry fruits 2+1+2=5
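library(dplyr)

df %>%
  group_by(col1) %>%
  # within each col1 group, total Col5 and join the Col2 labels with commas
  mutate(Col5 = sum(Col5),
         Col2 = paste(Col2, collapse = ",")) %>%
  # every row in a group now carries the same Col2 and Col5, so keep only the first
  slice(1)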
col1 Col2 Col3 Col4 Col5
<dbl> <chr> <int> <int> <int>
1 344230 masalas&spices 4 14 2
2 344231 haircare,otc,personalhygiene 4 14 3
3 344232 detergents 4 14 2
4 344233 biscuits,chocolates&sweets,dryfruits 4 14 5
If you don't need Col3 and Col4, you can replace mutate with summarise and skip slice(1).
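The summarise variant can be sketched roughly as follows. This is a minimal sketch, not tested against your real data: it builds a small stand-in data frame with data.frame() (an assumption made only so the snippet runs on its own), and the name result is just illustrative.

```r
library(dplyr)

# Stand-in for the sample data, reduced to the three columns that are kept
df <- data.frame(
  col1 = c(344230, 344231, 344231, 344231, 344232, 344233, 344233, 344233),
  Col2 = c("masalas&spices", "haircare", "otc", "personalhygiene",
           "detergents", "biscuits", "chocolates&sweets", "dryfruits"),
  Col5 = c(2, 1, 1, 1, 2, 2, 1, 2)
)

result <- df %>%
  group_by(col1) %>%
  # summarise returns one row per group, so slice(1) is no longer needed
  summarise(Col2 = paste(Col2, collapse = ","),
            Col5 = sum(Col5))

print(result)
```

Because summarise keeps only the grouping column and the columns you compute, Col3 and Col4 drop out of the result automatically.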
Data:
df <- read.table(text = "
col1 Col2 Col3 Col4 Col5
1 344230. masalas&spices 4 14 2
2 344231. haircare 4 14 1
3 344231. otc 4 14 1
4 344231. personalhygiene 4 14 1
5 344232. detergents 4 14 2
6 344233. biscuits 4 14 2
7 344233. chocolates&sweets 4 14 1
8 344233. dryfruits 4 14 2", header = TRUE)