R中的面板数据如何取多年平均值?



我的数据集如下(完整的数据集包含了世界上所有的国家):

structure(list(Country = c("Afghanistan", "Afghanistan", "Afghanistan", 
"Afghanistan", "Afghanistan", "Albania"), CountryCode = c("AFG", 
"AFG", "AFG", "AFG", "AFG", "ALB"), Time = c("2017", "2018", 
"2019", "2020", "2021", "2017"), Time.Code = c("YR2017", "YR2018", 
"YR2019", "YR2020", "YR2021", "YR2017"), gdp_pc_g = c(0.0647641950115201, 
-1.19490038335884, 1.53563667424, -4.57503186436271, NA, 3.89774060601475
), gdp_pc = c(2058.40022106979, 2033.80438893717, 2065.03623501599, 
1970.56016925338, NA, 12770.9642910053), population_g = c(2.54783303994568, 
2.38430907940751, 2.31307334941778, 2.30381209309418, 2.30341213305193, 
-0.0919722937442495), savings = c(NA, NA, NA, NA, NA, 16.5293641587755
), inflation = c(2.40365618844285, 2.07134857890898, 6.52147984507825, 
7.82166712677619, NA, 1.45106269924993), unemployment = c(11.1800003051758, 
11.1520004272461, 11.2170000076294, 11.710000038147, 13.2829999923706, 
13.6199998855591), crime = c(6.6784014408596, 6.65556115177472, 
NA, NA, NA, 2.01097785878705), health = c(0.64313394, 0.54922014, 
1.08443093, NA, NA, 2.67507172), health_new = c(2.31911564, 1.89807308, 
3.87310839, NA, NA, 9.17743587), co2 = c(4780.00020980835, 6070.00017166138, 
6079.99992370605, NA, NA, 5139.9998664856), debt_wb = c(NA, NA, 
NA, NA, NA, 75.8887515969161), debt_imf = c(7.9976602, 7.3844129, 
6.1304099, 7.3974947, NA, 70.11991), qualityofgovernment = c(-1.60108568271001, 
-1.59558055798213, -1.56942588090897, -1.67851787805557, -1.68115723133087, 
0.0151268268624942), migration = c(-47.09, -47.205, -8.082, 166.821, 
-183.672, -9.768), gdp_pc_log = c(7.62968436836638, 7.61766340135312, 
7.63290305246442, 7.58607313118708, NA, 9.45492945839223), gdp_pc_1996 = c(NA, 
NA, NA, NA, NA, 4908), gdp_pc_1996_log = c(NA, NA, NA, NA, NA, 
8.4986218058308), unemployment_log = c(2.41412649502354, 2.41161889232933, 
2.41743048534325, 2.46044318086727, 2.58648502157599, 2.61153929231895
), crime_log = c(1.89887865364994, 1.8954527686191, NA, NA, NA, 
0.69862110067871), health_new_log = c(0.841185923351801, 0.640839203134581, 
1.35405738619833, NA, NA, 2.21674784862268), co2_log = c(8.47219586937846, 
8.71111391233384, 8.71275996241187, NA, NA, 8.54480833247365), 
migration_t5 = c(105.905, 48.076, 255.611, -281.739, -90.238, 
-19.946), debt_imf_1996 = c(NA, NA, NA, NA, NA, 34.69009), 
debt_imf_1996_log = c(NA, NA, NA, NA, NA, 3.5464540553859
), debt_imf_2001 = c(NA, NA, NA, NA, NA, 60.609729), debt_imf_2006 = c(22.984644, 
22.984644, 22.984644, 22.984644, 22.984644, 56.690318), debt_imf_2011 = c(7.5010844, 
7.5010844, 7.5010844, 7.5010844, 7.5010844, 59.399812), debt_imf_2016 = c(8.4372661, 
8.4372661, 8.4372661, 8.4372661, 8.4372661, 72.380049), politics_1996 = c(-1.94518780708313, 
-1.94518780708313, -1.94518780708313, -1.94518780708313, 
-1.94518780708313, -0.621049637595813), politics_2006 = c(-1.63495093584061, 
-1.63495093584061, -1.63495093584061, -1.63495093584061, 
-1.63495093584061, -0.443528113265832), politics_2016 = c(-1.55767250061035, 
-1.55767250061035, -1.55767250061035, -1.55767250061035, 
-1.55767250061035, 0.00307970804472781), politics_t5 = c(0.943790654341378, 
0.931954483191173, 1.02754203478495, 1.0176876783371, 0.94232749938965, 
2.25874653831124), politics_2001 = c(NA, NA, NA, NA, NA, 
NA), politics_2011 = c(-1.72906082868576, -1.72906082868576, 
-1.72906082868576, -1.72906082868576, -1.72906082868576, 
-0.213528143862883), politics_t10 = c(-1.68152981996536, 
-1.76271136601766, -1.78180730342865, -1.74940039714178, 
-1.72906082868576, -0.309388402849436), debt_imf_2001_log = c(NA, 
NA, NA, NA, NA, 4.10445542474383), debt_imf_2006_log = c(3.13482634077627, 
3.13482634077627, 3.13482634077627, 3.13482634077627, 3.13482634077627, 
4.03760343777541), debt_imf_2011_log = c(2.01504759675729, 
2.01504759675729, 2.01504759675729, 2.01504759675729, 2.01504759675729, 
4.08429106138043), debt_imf_2016_log = c(2.13265833433292, 
2.13265833433292, 2.13265833433292, 2.13265833433292, 2.13265833433292, 
4.28193069511796), debt_imf_log = c(2.07914902390068, 1.9993714138387, 
1.81326161557553, 2.00114138883371, NA, 4.25020677653823), 
gdp_pc_t5 = c(2075.49161435331, 2116.46525771251, 2102.38460375974, 
2068.26590413364, 2057.06797755329, 11227.9663308153), population_g_t5 = c(3.40760003260595, 
3.49459192739625, 3.35560233082199, 3.07705362914441, 2.77803478841236, 
-0.165151040121679), savings_t5 = c(NA, NA, NA, NA, NA, 19.6251036733499
), inflation_t5 = c(7.30175647136598, 4.82278547931647, 0.566944540816536, 
2.44756300204637, -2.19752646590604, 1.04271460643408), unemployment_t5 = c(11.3409996032715, 
11.1929998397827, 11.1420001983643, 11.1269998550415, 11.1579999923706, 
13.3800001144409), crime_t5 = c(6.25132816655284, NA, NA, 
9.78392178232544, 6.5511634561067, 5.38761486858166), health_new_t5 = c(1.37000275, 
1.77506948, 1.8606739, 2.01440573, 2.14110541, 8.69735146
), co2_t5 = c(8079.99992370605, 5989.99977111816, 4880.00011444092, 
5949.99980926514, 5300.00019073486, 4360.0001335144), debt_imf_t5 = c(6.7594679, 
6.9044792, 8.700291, 9.1540829, 8.4372661, 62.119953), qualityofgovernment_t5 = c(-1.55620934565862, 
-1.56804551680883, -1.47245796521505, -1.4823123216629, -1.55767250061035, 
-0.241253461688757), gdp_pc_t5_log = c(7.63795332718298, 
7.65750264485461, 7.65082750508984, 7.63446580769799, 7.62903693604581, 
9.32616293879677), inflation_t5_log = c(1.98811493172768, 
1.57335166141558, -0.5674937916493, 0.895092836478912, NA, 
0.0418275109957932), unemployment_t5_log = c(2.4284244428514, 
2.41528856859307, 2.41072176940192, 2.40937457414793, 2.41215672873958, 
2.59376106325321), crime_t5_log = c(1.83279394782042, NA, 
NA, 2.28074040390539, 1.87964266075989, 1.68410277651703), 
health_new_t5_log = c(0.314812747137289, 0.573839565823258, 
0.620938733933986, 0.70032422882336, 0.761322242398395, 2.1630185494273
), co2_t5_log = c(8.99714714207282, 8.69784665289884, 8.4929005222982, 
8.6911464664834, 8.57546213552792, 8.38022736696565), debt_imf_t5_log = c(1.91094417393909, 
1.93217036041067, 2.16335647337702, 2.21419999834076, 2.13265833433292, 
4.12906724168007), politics = c(0.898914317289989, 0.904419442017872, 
0.930574119091033, 0.821482121944428, 0.818842768669128, 
2.51512682686249), politics_t5_log = c(-0.0578509018988914, 
-0.0704713032668603, 0.0271695763178487, 0.0175330717798688, 
-0.0594024009312679, 0.814810030354778), migration_rate = c(-1.32, 
-1.286, -0.214, 4.289, -4.569, -3.387), migration_rate_t5 = c(3.481, 
1.525, 7.842, -8.31, -2.602, -6.872), inflation_new = c(1.02403656188443, 
1.02071348578909, 1.06521479845078, 1.07821667126776, NA, 
1.0145106269925), inflation_new_t5 = c(1.07301756471366, 
1.04822785479316, 1.00566944540816, 1.02447563002046, 0.97802473534094, 
1.01042714606434), inflation_new_t5_log = c(0.0704748332384, 
0.0471009809625991, 0.00565343458901919, 0.0241809012205975, 
-0.0222203175074989, 0.0103731583444112)), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

现在我想取每个国家的平均值,例如population_g,并将其放在同一数据集的新列中。

谁能帮我创建这个新列,它将有5年的平均人口为每个国家的g ?

所以我想得到的东西是:

阿富汗2017(所有以前的变量)(新变量为阿富汗2017年至2021年的人口平均值)

阿富汗2018(所有以前的变量)(新变量为阿富汗2017年至2021年的人口平均值)

阿富汗2019(所有以前的变量)(新变量为阿富汗2017年至2021年的人口平均值)

阿富汗2020(所有以前的变量)(新变量为阿富汗2017年至2021年的人口平均值)

阿富汗2021(所有以前的变量)(新变量为阿富汗2017年至2021年的人口平均值)

我认为这是你想要的。我利用plm包的面板数据函数得到每个国家的平均值。继续处理您的数据并将其命名为dat:

library("plm")
pdat <- pdata.frame(dat, index = c("CountryCode", "Time"))
pdat$avg_population_g <- plm::Between(pdat$population_g)
pdat[ , c("CountryCode", "Time", "population_g", "avg_population_g")]
##          CountryCode Time population_g avg_population_g
## AFG-2017         AFG 2017   2.54783304       2.37048794
## AFG-2018         AFG 2018   2.38430908       2.37048794
## AFG-2019         AFG 2019   2.31307335       2.37048794
## AFG-2020         AFG 2020   2.30381209       2.37048794
## AFG-2021         AFG 2021   2.30341213       2.37048794
## ALB-2017         ALB 2017  -0.09197229      -0.09197229

最新更新