I have a data frame like this:
x = data.frame(retailer = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2),
store = c(5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6),
week = c(2021100301, 2021092601, 2021091901, 2021091201, 2020082901, 2020082201, 2020081501, 2020080801, 2021080101, 2021072501, 2021071801,
2021071101, 2020070401, 2020062701, 2020062001, 2020061301))
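The week column holds values that correspond to dates: for example, 2021100301 is 10/03/2021 and 2021071101 is 07/11/2021. I want to add an extra column that labels each week by its month, so that 2021100301 becomes 10 because it falls in October, 2021090301 becomes 09 because it falls in September, and so on. I want to do this for every week in the column; basically, label them by month in a separate column. The output I am trying to get is: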
x = data.frame(retailer = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2),
store = c(5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6),
week = c(2021100301, 2021092601, 2021091901, 2021091201, 2020082901, 2020082201, 2020081501, 2020080801, 2021080101, 2021072501, 2021071801,
2021071101, 2020070401, 2020062701, 2020062001, 2020061301),
month = c(10, 09, 09, 09, 08, 08, 08, 08, 08, 07, 07, 07, 07, 06, 06, 06))
Is there a way to do this? Thanks.
Try with substr:
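library(dplyr)
# characters 5 and 6 of the week code hold the month
# (substr coerces the numeric week to character first)
x %>%
  mutate(month = as.numeric(substr(week, 5, 6)))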
which gives the output:
retailer store week month
1 2 5 2021100301 10
2 2 5 2021092601 9
3 2 5 2021091901 9
4 2 5 2021091201 9
5 2 5 2020082901 8
6 2 5 2020082201 8
7 2 5 2020081501 8
8 2 5 2020080801 8
9 2 6 2021080101 8
10 2 6 2021072501 7
11 2 6 2021071801 7
12 2 6 2021071101 7
13 2 6 2020070401 7
14 2 6 2020062701 6
15 2 6 2020062001 6
16 2 6 2020061301 6
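Note that as.numeric drops the leading zero, so September shows as 9 rather than the 09 asked for in the question. If the zero-padded label matters, one option is to keep the month as text. This is only a minimal sketch in base R, assuming week is stored as a number and always has 10 digits; the column name month_chr and the three-row sample are made up for illustration:

```r
# small sample taken from the week column in the question
x <- data.frame(week = c(2021100301, 2021092601, 2020061301))

# render each week as plain digits (sprintf avoids any exponent notation)
wk <- sprintf("%.0f", x$week)

# characters 5-6 hold the month; leaving them as text keeps the leading zero
x$month_chr <- substr(wk, 5, 6)

x$month_chr
```

The trade-off is that month_chr is a character column, so it sorts and compares as text rather than as a number.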
Here is another approach. The logic:
- Remove the last two digits
- Convert to a date
- Extract the month
library(dplyr)
library(lubridate)
x %>%
  mutate(month1 = gsub('.{2}$', '', week),   # drop the last two digits, leaving YYYYMMDD
         month1 = month(ymd(month1)))        # parse as a date, then extract the month
retailer store week month1
1 2 5 2021100301 10
2 2 5 2021092601 9
3 2 5 2021091901 9
4 2 5 2021091201 9
5 2 5 2020082901 8
6 2 5 2020082201 8
7 2 5 2020081501 8
8 2 5 2020080801 8
9 2 6 2021080101 8
10 2 6 2021072501 7
11 2 6 2021071801 7
12 2 6 2021071101 7
13 2 6 2020070401 7
14 2 6 2020062701 6
15 2 6 2020062001 6
16 2 6 2020061301 6
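For comparison, the same month can also be pulled out with plain arithmetic and no packages. This is a different technique from the two answers above, sketched under the assumption that every week value is numeric and laid out as YYYYMMDD followed by two extra digits; the column name month2 and the three-row sample are made up for illustration:

```r
# small sample taken from the week column in the question
x <- data.frame(week = c(2021100301, 2021092601, 2020061301))

# integer-divide by 10000 to drop the last four digits (leaves YYYYMM),
# then take the remainder modulo 100 to keep only MM
x$month2 <- (x$week %/% 10000) %% 100

x$month2
```

Unlike the lubridate version, this does not check that the digits form a real date, so a malformed week code would pass through silently.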