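I have monthly data and I want to add another column for the period. The column should say M01 for January, M02 for February, M03 for March, and so on. Is there a way to do this?
This is what I have: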
unemployment <- data.frame(
  Month = c("Sept 2002", "Oct 2002", "Nov 2002", "Dec 2002", "Jan 2003", "Feb 2003"),
  Total = c(5.7, 5.7, 5.9, 6, 5.8, 5.9)
)
> unemployment
      Month Total
1 Sept 2002   5.7
2  Oct 2002   5.7
3  Nov 2002   5.9
4  Dec 2002   6.0
5  Jan 2003   5.8
6  Feb 2003   5.9
This is what I want:
      Month Period Total
1 Sept 2002    M09   5.7
2  Oct 2002    M10   5.7
3  Nov 2002    M11   5.9
4  Dec 2002    M12   6.0
5  Jan 2003    M01   5.8
6  Feb 2003    M02   5.9
Edit: updated the code to show all 12 months:
structure(list(Month = c("Jan", "Feb", "Mar", "Apr", "May", "June"
), Year = c("2003", "2003", "2003", "2003", "2003", "2003"),
Unemp_percent = c(5.8, 5.9, 5.9, 6, 6.1, 6.3)), row.names = 5:10, class = "data.frame")
Using dplyr:
library(dplyr)

# grepl() does substring matching, so "Jun" also matches "June" and "Sep" matches "Sept"
unemployment %>%
  mutate(Period = case_when(grepl("Jan", Month) ~ "M01",
                            grepl("Feb", Month) ~ "M02",
                            grepl("Mar", Month) ~ "M03",
                            grepl("Apr", Month) ~ "M04",
                            grepl("May", Month) ~ "M05",
                            grepl("Jun", Month) ~ "M06",
                            grepl("Jul", Month) ~ "M07",
                            grepl("Aug", Month) ~ "M08",
                            grepl("Sep", Month) ~ "M09",
                            grepl("Oct", Month) ~ "M10",
                            grepl("Nov", Month) ~ "M11",
                            grepl("Dec", Month) ~ "M12"))
You can match the first 3 letters of your data against the built-in month.abb, using mutate, gsub and match:
library(dplyr)

unemployment |>
  mutate(.after = Month,
         # keep the first 3 letters, then look up their position in month.abb
         Period = paste0("M", match(gsub("(.{3})(.*)", "\\1", Month), month.abb)))
      Month Period Total
1 Sept 2002     M9   5.7
2  Oct 2002    M10   5.7
3  Nov 2002    M11   5.9
4  Dec 2002    M12   6.0
5  Jan 2003     M1   5.8
6  Feb 2003     M2   5.9
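Note that the output above gives M9 rather than the M09 asked for in the question. If the zero-padded form is needed, one option is to format the month number with sprintf. This is only a minimal base R sketch, assuming every entry starts with a spelling whose first three letters appear in month.abb:

```r
# Sample data from the question
unemployment <- data.frame(
  Month = c("Sept 2002", "Oct 2002", "Nov 2002", "Dec 2002", "Jan 2003", "Feb 2003"),
  Total = c(5.7, 5.7, 5.9, 6, 5.8, 5.9)
)

# First three letters of each entry, e.g. "Sept 2002" -> "Sep"
abbr <- substr(unemployment$Month, 1, 3)

# Position in month.abb is the month number; %02d pads it to two digits
unemployment$Period <- sprintf("M%02d", match(abbr, month.abb))

# Put Period right after Month
unemployment <- unemployment[, c("Month", "Period", "Total")]
print(unemployment)
```

The same sprintf("M%02d", ...) call could replace paste0("M", ...) inside the mutate version.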
Here is another option:
library(dplyr)

unemployment %>%
  mutate(month = gsub("(^.{3}).*", "\\1", Month),
         # factor levels follow month.abb, so the integer code is the month number
         Period = paste0("M", as.numeric(factor(month, levels = month.abb)))) %>%
  select(Month, Period, Total)
Output:
      Month Period Total
1 Sept 2002     M9   5.7
2  Oct 2002    M10   5.7
3  Nov 2002    M11   5.9
4  Dec 2002    M12   6.0
5  Jan 2003     M1   5.8
6  Feb 2003     M2   5.9
Just make a reference table of months and periods, then do a left_join.
library(tidyverse)

months <- data.frame(mo = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sept", "Oct", "Nov", "Dec"),
                     Period = paste0("M", 1:12))

unemployment %>%
  mutate(mo = str_extract(Month, "^[:alpha:]+")) %>%
  left_join(months, by = "mo")
      Month Total   mo Period
1 Sept 2002   5.7 Sept     M9
2  Oct 2002   5.7  Oct    M10
3  Nov 2002   5.9  Nov    M11
4  Dec 2002   6.0  Dec    M12
5  Jan 2003   5.8  Jan     M1
6  Feb 2003   5.9  Feb     M2
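The reference-table idea can also be sketched in base R with a named lookup vector instead of a join. The names spellings and period_lookup are hypothetical, and listing both short and long spellings (Jun/June, Jul/July, Sep/Sept) is an assumption about what the data may contain, not something the join answer does:

```r
# Sample data from the question
unemployment <- data.frame(
  Month = c("Sept 2002", "Oct 2002", "Nov 2002", "Dec 2002", "Jan 2003", "Feb 2003"),
  Total = c(5.7, 5.7, 5.9, 6, 5.8, 5.9)
)

# Hypothetical lookup: month spelling -> month number
spellings <- c(Jan = 1, Feb = 2, Mar = 3, Apr = 4, May = 5, Jun = 6, June = 6,
               Jul = 7, July = 7, Aug = 8, Sep = 9, Sept = 9, Oct = 10,
               Nov = 11, Dec = 12)

# Same names, values turned into zero-padded period codes such as "M09"
period_lookup <- setNames(sprintf("M%02d", spellings), names(spellings))

# Leading run of letters, e.g. "Sept 2002" -> "Sept"
mo <- sub("^([[:alpha:]]+).*", "\\1", unemployment$Month)

# Index the lookup by name; unname() drops the names from the result
unemployment$Period <- unname(period_lookup[mo])
print(unemployment)
```

A spelling that is missing from the lookup comes back as NA, which makes unexpected month names easy to spot.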
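If the 4th character of the Month column is a word character, remove it so that we are left with the standard 3-letter month abbreviation. Then convert the result to a yearmon object and format it as desired. Finally, relocate the Period column to just after the Month column. If it does not matter whether Period comes right after Month, omit the last line of code.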
library(dplyr, exclude = c("filter", "lag"))
library(zoo)

unemployment %>%
  # "Sept 2002" -> "Sep 2002", then parse as yearmon and format as M plus 2-digit month
  mutate(Period = format(as.yearmon(sub("^(...)\\w?", "\\1", Month)), "M%m")) %>%
  relocate(Period, .after = Month)
giving:
      Month Period Total
1 Sept 2002    M09   5.7
2  Oct 2002    M10   5.7
3  Nov 2002    M11   5.9
4  Dec 2002    M12   6.0
5  Jan 2003    M01   5.8
6  Feb 2003    M02   5.9
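One thing to watch with the sub step: the pattern drops only a single extra letter, so it suits four-letter spellings such as Sept or June, while a full name like March would be left as Marh. A hedged variant, assuming you want to trim every letter after the third, uses \\w* in place of \\w?. The strings below are illustrative, and "March 2003" is not in the question's data:

```r
# Illustrative inputs; "March 2003" is added only to show the difference
Month <- c("Sept 2002", "June 2003", "March 2003", "Oct 2002")

# Original pattern: removes at most one letter after the first three
one_letter <- sub("^(...)\\w?", "\\1", Month)

# Variant: removes all remaining letters of the month name
all_letters <- sub("^(...)\\w*", "\\1", Month)

print(one_letter)
print(all_letters)
```

The result of the variant should be usable with as.yearmon in the same way, provided the session locale uses English month abbreviations.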