R - Return subset vectors of one column based on the unique values in a second column



I have a data frame that looks like this:

my.col  gathered
1       Country 2008_2019
2          Year 2008_2019
3         avgbw 2008_2019
4         avgmx 2008_2019
5         adspd 2008_2019 
12      ecom_cb 2011_2019
13     ecom_mbl 2011_2019
14      iaccess 2008_2019
15        ibank 2012_2019
16         ibus 2008_2019
17       ictinv 2008_2019
18       iusage 2008_2019
19       laptop 2008_2019
20   mbl_access 2014_2019
21        midle 2008_2019
22        pcomp 2008_2019
23        phone 2008_2019
24        rtlpc 2008_2019
25        smart 2008_2019
26          tpc 2008_2019
27   ibuy_class 2014_2017
28  rdply_class 2014_2017
29  mdply_class 2014_2017
30   acnt_class 2011_2017
31     ibuy_gen 2014_2017
32    rdply_gen 2014_2017
33    mdply_gen 2014_2017
34     acnt_gen 2011_2017
35   ibuy_rural 2014_2017
36  rdply_rural 2014_2017
37  mdply_rural 2014_2017
38   acnt_rural 2011_2017

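I am looking for a script that returns one vector per unique value in gathered, each containing all the my.col values for that value and named after the gathered value.

Example output: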

`2014_2017` <- c("ibuy_class", "rdply_class", ... , "mdply_rural")
`2011_2017` <- c("acnt_class", "acnt_gen", "acnt_rural")

Thanks.

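We can use split on the unique rows of the dataset to create a list of vectors, and then use list2env to create objects in the global environment. The list2env step is not recommended: it fills the global environment with non-syntactic names that must be quoted with backticks, whereas the list itself is usually easier to work with.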

lst1 <- with(unique(df), split(my.col, gathered))
list2env(lst1, .GlobalEnv)
`2008_2019`
#[1] "Country" "Year"    "avgbw"   "avgmx"   "adspd"   "iaccess" "ibus"    "ictinv"  "iusage"  "laptop"  "midle"  
#[12] "pcomp"   "phone"   "rtlpc"   "smart"   "tpc" 
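
If separate global objects are not strictly required, the list returned by split can be used directly. The following is a minimal sketch of that approach, assuming a shortened copy of the question's data (only seven of the rows, chosen for illustration); the loop at the end is just one possible way to visit each group.

```r
# Abbreviated copy of the question's data (a few rows only, for illustration).
df <- data.frame(
  my.col = c("Country", "Year", "ecom_cb", "ecom_mbl",
             "acnt_class", "acnt_gen", "acnt_rural"),
  gathered = c("2008_2019", "2008_2019", "2011_2019", "2011_2019",
               "2011_2017", "2011_2017", "2011_2017"),
  stringsAsFactors = FALSE
)

# One list element per unique value of `gathered`, named after that value.
lst1 <- split(df$my.col, df$gathered)

# Look up a single group by name; no backticks or global objects needed.
lst1[["2011_2017"]]

# Visit every group in turn.
for (nm in names(lst1)) {
  cat(nm, ":", paste(lst1[[nm]], collapse = ", "), "\n")
}
```

Keeping everything in lst1 means the groups can be passed to lapply or subset by name, and nothing in the workspace is overwritten by accident.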

Data

df <- structure(list(my.col = c("Country", "Year", "avgbw", "avgmx", 
"adspd", "ecom_cb", "ecom_mbl", "iaccess", "ibank", "ibus", "ictinv", 
"iusage", "laptop", "mbl_access", "midle", "pcomp", "phone", 
"rtlpc", "smart", "tpc", "ibuy_class", "rdply_class", "mdply_class", 
"acnt_class", "ibuy_gen", "rdply_gen", "mdply_gen", "acnt_gen", 
"ibuy_rural", "rdply_rural", "mdply_rural", "acnt_rural"), gathered = c("2008_2019", 
"2008_2019", "2008_2019", "2008_2019", "2008_2019", "2011_2019", 
"2011_2019", "2008_2019", "2012_2019", "2008_2019", "2008_2019", 
"2008_2019", "2008_2019", "2014_2019", "2008_2019", "2008_2019", 
"2008_2019", "2008_2019", "2008_2019", "2008_2019", "2014_2017", 
"2014_2017", "2014_2017", "2011_2017", "2014_2017", "2014_2017", 
"2014_2017", "2011_2017", "2014_2017", "2014_2017", "2014_2017", 
"2011_2017")), class = "data.frame", row.names = c("1", "2", 
"3", "4", "5", "12", "13", "14", "15", "16", "17", "18", "19", 
"20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", 
"31", "32", "33", "34", "35", "36", "37", "38"))
