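I am struggling to reshape a data frame in R. My data frame, pippo, looks like this:
and so on. I would like a single time dimension while keeping both share and volume, like this:
Brand Year Share Volume
1 2012
1 2013
1 2014
How can I do this with pivot_longer()? I am stuck. I tried two reshaping steps, but neither works: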
longhino_df_cpsNBO <- pivot_longer(pippo, cols = `share2012`:`share2021`, names_to = "Year", values_to = "CNBO_pct")
long_df_cpsNBO <- pivot_longer(longhino_df_cpsNBO, cols = starts_with("volume"), names_to = "Year_Unit", values_to = "Volumes_CNBO")
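We can do this in a single pivot_longer() call by specifying names_to as a vector of ".value" and "Year", which correspond to the two groups captured in names_pattern: (.*) matches zero or more characters, and (\d{4})$ matches four digits at the end ($) of the string.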
library(tidyr)
# the backslash must be doubled inside an R string: "\\d", not "\d"
pivot_longer(pippo, cols = -Brand,
  names_to = c(".value", "Year"),
  names_pattern = "(.*)(\\d{4})$")
-output
# A tibble: 6 × 4
Brand Year share volume
<int> <chr> <dbl> <chr>
1 1 2012 0.2 15
2 1 2021 0.1 18
3 2 2012 0.5 bla
4 2 2021 0.2 urc
5 3 2012 0.3 blob
6 3 2021 0.7 sdo
data
pippo <- structure(list(Brand = 1:3, share2012 = c(0.2, 0.5, 0.3),
share2021 = c(0.1,
0.2, 0.7), volume2012 = c("15", "bla", "blob"), volume2021 = c("18",
"urc", "sdo")), class = "data.frame", row.names = c(NA, -3L))
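To see what the two capture groups in names_pattern pick out, the column names can be split by hand in base R. This is only an illustrative sketch: the helper names nms, parts and split_names are made up for the example, and pivot_longer() does this matching internally rather than through these calls.

```r
# Column names from the question's data (Brand is excluded, as in cols = -Brand)
nms <- c("share2012", "share2021", "volume2012", "volume2021")

# Same pattern as in the pivot_longer() call; the backslash is doubled in an R string
pattern <- "(.*)(\\d{4})$"

# regexec() returns the full match followed by one entry per capture group
parts <- regmatches(nms, regexec(pattern, nms))

# Collect the two capture groups into a small lookup table
split_names <- data.frame(
  column = nms,
  value_name = vapply(parts, `[`, character(1), 2), # becomes an output column (.value)
  Year = vapply(parts, `[`, character(1), 3),       # becomes an entry in the Year column
  stringsAsFactors = FALSE
)
print(split_names)
```

The first group ("share" or "volume") is what ".value" turns into output columns, and the second group fills Year.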
Sometimes it is easier to rename the columns first so that they have a clean separator:
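library(dplyr)
library(tidyr)
# pippo is the data frame from the question
pippo %>%
  # add "_" after each "e", so the names become share_2012, volume_2012, ...
  rename_with(~ gsub("e", "e_", .)) %>%
  # rename_with(~ str_replace(., "e", "e_")) %>% # alternative; needs library(stringr)
  pivot_longer(-Brand,
               names_to = c(".value", "Year"),
               names_sep = "_")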
Brand Year share volume
<int> <chr> <dbl> <chr>
1 1 2012 0.2 15
2 1 2021 0.1 18
3 2 2012 0.5 bla
4 2 2021 0.2 urc
5 3 2012 0.3 blob
6 3 2021 0.7 sdo
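The gsub("e", "e_", ...) rename works here because "share" and "volume" each contain exactly one "e" and "Brand" contains none. A hypothetical name such as "revenue2012" would get several underscores. A possibly more robust variant, sketched below under the assumption that every measure column ends in a four-digit year, inserts the separator in front of the trailing digits instead:

```r
library(dplyr)
library(tidyr)

# Data from the question
pippo <- data.frame(
  Brand = 1:3,
  share2012 = c(0.2, 0.5, 0.3),
  share2021 = c(0.1, 0.2, 0.7),
  volume2012 = c("15", "bla", "blob"),
  volume2021 = c("18", "urc", "sdo")
)

long <- pippo %>%
  # insert "_" before a trailing four-digit year; names without one (Brand) are unchanged
  rename_with(~ sub("(\\d{4})$", "_\\1", .x)) %>%
  pivot_longer(-Brand, names_to = c(".value", "Year"), names_sep = "_")

long
```

This gives the same Brand, Year, share, volume layout as above, without depending on which letters appear in the column names.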
It looks like you need to use pivot_longer() first and then pivot_wider():
library(tidyverse)
df <- tribble(
  ~Brand, ~share2012, ~share2021, ~volume2012, ~volume2021,
  1, 0.2, 0.1, 15, 18,
  2, 0.5, 0.2, 16, 19,
  3, 0.3, 0.7, 14, 25
)
df |>
  pivot_longer(
    cols = !Brand,
    names_pattern = "(share|volume)(\\d{4})",
    names_to = c("type", "year")
  ) |>
  pivot_wider(
    names_from = type,
    values_from = value
  )
#> # A tibble: 6 × 4
#> Brand year share volume
#> <dbl> <chr> <dbl> <dbl>
#> 1 1 2012 0.2 15
#> 2 1 2021 0.1 18
#> 3 2 2012 0.5 16
#> 4 2 2021 0.2 19
#> 5 3 2012 0.3 14
#> 6 3 2021 0.7 25
Created on 2023-04-03 with reprex v2.0.2
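In all of the outputs above the year column comes back as character (<chr>), because it is parsed out of the column names. If a numeric time dimension is wanted, one option is the names_transform argument of pivot_longer(). The sketch below assumes tidyr 1.1.0 or later and reuses the question's data:

```r
library(tidyr)

# Data from the question
pippo <- data.frame(
  Brand = 1:3,
  share2012 = c(0.2, 0.5, 0.3),
  share2021 = c(0.1, 0.2, 0.7),
  volume2012 = c("15", "bla", "blob"),
  volume2021 = c("18", "urc", "sdo")
)

long <- pivot_longer(
  pippo,
  cols = -Brand,
  names_to = c(".value", "Year"),
  names_pattern = "(.*)(\\d{4})$",
  names_transform = list(Year = as.integer) # convert the captured year from character
)

str(long$Year)
```

Calling as.integer() on the Year column after pivoting, for example inside dplyr::mutate(), would achieve the same thing.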