Let's take the following data:
library(plm)
data("Produc", package="plm")
head(Produc)
state year region pcap hwy water util pc gsp emp unemp
ALABAMA-1970 ALABAMA 1970 6 15032.67 7325.80 1655.68 6051.20 35793.80 28418 1010.5 4.7
ALABAMA-1971 ALABAMA 1971 6 15501.94 7525.94 1721.02 6254.98 37299.91 29375 1021.9 5.2
ALABAMA-1972 ALABAMA 1972 6 15972.41 7765.42 1764.75 6442.23 38670.30 31303 1072.3 4.7
ALABAMA-1973 ALABAMA 1973 6 16406.26 7907.66 1742.41 6756.19 40084.01 33430 1135.5 3.9
ALABAMA-1974 ALABAMA 1974 6 16762.67 8025.52 1734.85 7002.29 42057.31 33749 1169.8 5.5
ALABAMA-1975 ALABAMA 1975 6 17316.26 8158.23 1752.27 7405.76 43971.71 33604 1155.4 7.7
I want to add a lag for every variable from pcap onward. I know I can use:
Produc %>%
group_by(state) %>%
mutate(pcap = dplyr::lag(pcap, n = 1, default = NA))
But I find this inefficient, because I have to repeat it separately for each variable. Is it possible to do it all in one step?
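Try this, grouping by state so that values are not lagged across state boundaries:
library(dplyr)
New_Produc <- Produc %>%
  group_by(state) %>%
  mutate(across(c(pcap, hwy, water, util), ~ dplyr::lag(.x, n = 1, default = NA))) %>%
  ungroup()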
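If you want the lagged columns to get new names instead of overwriting the originals:
New_Produc <- Produc %>%
  group_by(state) %>%
  mutate(across(c(pcap, hwy, water, util), ~ dplyr::lag(.x, n = 1, default = NA), .names = "lag_{.col}")) %>%
  ungroup()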
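To see why grouping matters for panel data, here is a minimal sketch on a made-up two-state data frame (the toy data frame and its values are my own illustration, not part of Produc). It assumes the rows are already sorted by year within each state, since dplyr::lag shifts by row position.

```r
library(dplyr)

# Hypothetical toy panel: two states, three years each
toy <- data.frame(
  state = rep(c("A", "B"), each = 3),
  year  = rep(1970:1972, times = 2),
  pcap  = c(10, 11, 12, 20, 21, 22),
  hwy   = c(1, 2, 3, 4, 5, 6)
)

# Ungrouped: the first row of state B receives the last value of state A
ungrouped <- toy %>%
  mutate(across(c(pcap, hwy), ~ dplyr::lag(.x, n = 1), .names = "lag_{.col}"))

# Grouped: the first row of every state is NA
grouped <- toy %>%
  group_by(state) %>%
  mutate(across(c(pcap, hwy), ~ dplyr::lag(.x, n = 1), .names = "lag_{.col}")) %>%
  ungroup()

ungrouped$lag_pcap  # NA 10 11 12 20 21
grouped$lag_pcap    # NA 10 11 NA 20 21
```

Without group_by(state), the first year of the second state silently picks up a value that belongs to the first state.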
We can simply use:
# mutate_at() is superseded; across() selects pcap through unemp by name
View(Produc %>%
  group_by(state) %>%
  mutate(across(pcap:unemp, dplyr::lag)))
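You can use the plm package and the tools it provides to achieve this. First, turn your data into a pdata.frame so that it is panel-aware (so the lag is taken within each group, i.e., each state). Then lapply plm's lag function over the columns and rebuild the pdata.frame: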
pProduc <- pdata.frame(Produc)
# stats::lag dispatches to plm's pseries method even if dplyr::lag masks lag()
l <- lapply(as.list(pProduc[ ,-c(1:3)], keep.attributes = TRUE), stats::lag)
df <- as.data.frame(l)
df <- data.frame(Produc[ , 1:3], df) # plug-in index columns again
pdf <- pdata.frame(df)
head(pdf)
state year region pcap hwy water util pc gsp emp unemp
ALABAMA-1970 ALABAMA 1970 6 NA NA NA NA NA NA NA NA
ALABAMA-1971 ALABAMA 1971 6 15032.67 7325.80 1655.68 6051.20 35793.80 28418 1010.5 4.7
ALABAMA-1972 ALABAMA 1972 6 15501.94 7525.94 1721.02 6254.98 37299.91 29375 1021.9 5.2
ALABAMA-1973 ALABAMA 1973 6 15972.41 7765.42 1764.75 6442.23 38670.30 31303 1072.3 4.7
ALABAMA-1974 ALABAMA 1974 6 16406.26 7907.66 1742.41 6756.19 40084.01 33430 1135.5 3.9
ALABAMA-1975 ALABAMA 1975 6 16762.67 8025.52 1734.85 7002.29 42057.31 33749 1169.8 5.5
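One practical difference from the dplyr approach is worth a sketch. As far as I know, plm's lag for a pseries takes a shift argument that defaults to "time", so it looks up the previous period rather than the previous row. The toy data below is hypothetical and has a deliberate gap (no 1972 row); under that assumption the time-based lag should leave an NA where the prior year is missing, while shift = "row" behaves like a positional lag within the state.

```r
library(plm)

# Hypothetical single-state panel with a missing year (1972)
toy <- data.frame(
  state = c("A", "A", "A"),
  year  = c(1970, 1971, 1973),
  pcap  = c(10, 11, 13)
)
ptoy <- pdata.frame(toy, index = c("state", "year"))

# Time-based lag (assumed default): uses the value from year - 1, if present
lag_time <- as.numeric(stats::lag(ptoy$pcap, k = 1, shift = "time"))

# Row-based lag: uses the previous row within the same state
lag_row <- as.numeric(stats::lag(ptoy$pcap, k = 1, shift = "row"))

print(lag_time)
print(lag_row)
```

For a balanced panel with consecutive years such as Produc, both variants should agree, so the distinction only matters when periods are missing.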