How can I speed up a list iteration that looks up values in a data frame in R?



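I have the code below, which I run many times because it appears in several of my models with different parameters. This particular piece is the most time-consuming part. Is there a faster way?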

membership_labels = unlist(lapply(comun$membership,
    function(x) membership_table[
        membership_table$membership == x, ]$membership_label))

Description of the data:

  • comun$membership
c(4, 6, 9, 6, 7, 7, 6, 3, 3, 6, 3, 9, 7, 3, 3, 3, 3, 3, 7, 6, 
3, 7, 7, 3, 7, 7, 6, 6, 7, 6, 7, 3, 5, 7, 7, 3, 3, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
8, 3, 3, 3, 3, 9, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
3, 3, 9, 7, 9, 3, 19, 8, 6, 9, 8, 6, 19, 2, 2, 3, 2, 19, 7, 3, 
3, 7, 7, 2, 9, 3, 3, 19, 2, 3, 7, 11, 3, 17, 7, 9, 3, 9, 6, 7, 
6, 3, 17, 8, 3, 3, 19, 3, 3, 6, 1, 7, 9, 7, 7, 6, 3, 9, 6, 6, 
6, 7, 8, 4, 6, 6, 3, 7, 6, 6, 8, 6, 6, 3, 6, 3, 3, 3, 6, 3, 6, 
2, 3, 3, 3, 18, 6, 3, 3, 9, 3, 3, 3, 3, 8, 3, 6, 9, 6, 6, 6, 
6, 6, 6, 8, 3, 3, 6, 6, 3, 3, 3, 6, 1, 5, 7, 7, 7, 1, 7, 9, 7, 
7, 7, 7, 7, 7, 7, 7, 9, 7, 7, 7, 8, 3, 3, 17, 1, 1, 1, 7, 7, 
17, 6, 6, 6, 6, 6, 6, 6, 6, 3, 6, 7, 7, 7, 7, 6, 6, 7, 7, 7, 
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 7, 7, 7, 6, 7, 7, 7, 7, 6, 
7, 6, 7, 6, 7, 6, 7, 2, 7, 6, 6, 7, 6, 12, 12, 8, 2, 6, 7, 6, 
6, 7, 3, 6, 12, 6, 6, 8, 7, 7, 7, 7, 7, 7, 1, 7, 7, 7, 1, 16, 
7, 7, 6, 12, 7, 7, 2, 6, 6, 7, 3, 3, 6, 8, 9, 9, 6, 3, 6, 6, 
6, 18, 6, 3, 3, 6, 3, 17, 6, 8, 6, 6, 8, 10, 3, 14, 16, 6, 1, 
7, 16, 7, 5, 1, 11, 1, 19, 7, 3, 2, 7, 5, 8, 7, 14, 2, 17, 7, 
7, 11, 7, 1, 7, 1, 6, 1, 16, 6, 1, 7, 6, 7, 14, 2, 7, 2, 10, 
7, 7, 1, 2, 7, 7, 7, 5, 1, 7, 9, 1, 1, 1, 12, 1, 5, 13, 1, 9, 
2, 2, 2, 4, 2, 4, 2, 3, 2, 2, 3, 9, 9, 6, 6, 3, 6, 18, 15, 7, 
12, 7, 6, 7, 6, 17, 6, 7, 6, 7, 7, 6, 8, 5, 2, 7, 2, 8, 18, 6, 
8, 6, 6, 6, 12, 12, 6, 11, 2, 12, 8, 8, 8, 6, 3, 7, 3, 1, 1, 
8, 7, 6, 9, 6, 18, 6, 6, 6, 7, 7, 5, 2, 18, 6, 5, 7, 6, 6, 1, 
14, 6, 6, 9, 9, 7, 3, 6, 6, 9, 6, 3, 3, 6, 3, 6, 6, 3, 9, 9, 
3, 7, 3, 6, 6, 9, 6, 6, 3, 9, 6, 9, 11, 11, 1, 1, 5, 5, 7, 10, 
7, 9, 13, 10, 10, 2, 1, 5, 1, 16, 10, 10, 5, 10, 7, 10, 10, 5, 
10, 10, 1, 7, 5, 10, 1, 2, 2, 10, 10, 1, 1, 14, 16, 7, 1, 1, 
1, 7, 7, 16, 1, 5, 6, 1, 9, 1, 6, 5, 9, 9, 9, 9, 1, 18, 18, 16, 
1, 6, 1, 1, 1, 1, 6, 10, 1, 1, 1, 1, 1, 11, 1, 1, 10, 9, 1, 10, 
1, 1, 7, 1, 1, 1, 10, 10, 1, 1, 1, 9, 6, 7, 5, 1, 1, 1, 1, 1, 
1, 1, 9, 1, 1, 12, 1, 1, 6, 1, 1, 1, 1, 9, 1, 1, 1, 9, 1, 1, 
1, 9, 10, 1, 5, 1, 1, 1, 6, 1, 6, 6, 16, 12, 6, 2, 5, 1, 12, 
1, 6, 1, 12, 5, 1, 5, 12, 7, 5, 2, 10, 2, 1, 7, 1, 1, 5, 9, 10, 
5, 10, 10, 1, 5, 10, 1, 9, 2, 1, 6, 10, 1, 6, 1, 12, 9, 1, 3, 
6, 13, 9, 5, 1, 15, 7, 2, 12, 1, 12, 6, 5, 1, 2, 1, 1, 6, 2, 
7, 1, 5, 2, 6, 7, 7, 5, 6, 5, 5, 2, 1, 18, 8, 6, 1, 14, 1, 1, 
16, 15, 8, 1, 12, 2, 1, 4, 2, 4, 13, 2, 14, 16, 17, 4, 2, 4, 
4, 4, 2, 13, 2, 13, 16, 4, 10, 1, 2, 10, 13, 13, 14, 7, 2, 7, 
10, 5, 10, 5, 10, 1, 10, 5, 10, 7, 4, 1, 5, 1, 6, 5, 11, 7, 7, 
7, 10, 5, 10, 10, 4, 10, 4, 10, 6, 10, 5, 10, 5, 19, 1, 19, 2, 
10, 5, 10, 5, 10, 10, 5, 5, 10, 5, 14, 10, 6, 10, 2, 4, 4, 4, 
19, 17, 4, 2, 2, 2, 2, 2, 9, 7, 9, 6, 9, 5, 8, 6, 2, 1, 1, 9, 
1, 1, 1, 5, 1, 1, 5, 14, 7, 18, 1, 1, 12, 1, 1, 1, 6, 3, 15, 
1, 10, 6, 2, 6, 10, 15, 2, 12, 1, 5, 1, 1, 1, 11, 6, 1, 2, 18, 
15, 5, 8, 5, 9, 1, 13, 3, 8, 11, 8, 6, 5, 7, 6, 8, 3, 8, 2, 1, 
15, 1, 15, 2, 9, 9, 9, 6, 13, 2, 10, 11, 10, 8, 6, 4, 2, 4, 4, 
4, 4, 2, 4, 2, 2, 4, 10, 4, 10, 4, 13, 4, 4, 4, 2, 14, 13, 10, 
5, 10, 10, 10, 5, 2, 6, 1, 11, 1, 5, 1, 5, 2, 1, 2, 15, 5, 1, 
5, 10, 10, 10, 10, 4, 10, 5, 10, 5, 10, 10, 10, 10, 10, 4, 10, 
5, 10, 11, 10, 2, 10, 5, 10, 5, 4, 5, 10, 10, 10, 5, 10, 5, 5, 
10, 5, 10, 10, 10, 10, 5, 10, 5, 10, 5, 10, 2, 2, 10, 1, 1, 5, 
1, 10, 10, 5, 10, 5, 10, 10, 11, 5, 1, 6, 10, 5, 10, 5, 10, 5, 
10, 5, 10, 6, 1, 2, 1, 1, 1, 15, 2, 4, 4, 4, 10, 10, 4, 10, 10, 
4, 4, 10, 6, 4, 4, 4, 10, 4, 10, 4, 4, 4, 6, 2, 8, 10, 4, 10, 
14, 6, 2, 6, 10, 9, 1, 13, 1, 1, 1, 5, 1, 1, 2, 1, 2, 1, 1, 1, 
1, 5, 5, 1, 5, 8, 6, 1, 2, 2, 5, 6, 1, 11, 1, 5, 15, 5, 10, 10, 
10, 10, 10, 1, 10, 6, 2, 4, 2, 4, 4, 2, 4, 4, 2, 4, 2, 4, 2, 
4, 10, 2, 10, 4, 4, 19, 15, 10, 4, 13, 2, 17, 4, 17, 4, 4, 4, 
2, 4, 4, 2, 4, 2, 4, 4, 4, 4, 2, 4, 16, 2, 4, 4, 2, 2, 4, 4, 
4, 4, 2, 2, 4, 4, 4, 15, 4, 4, 5, 11, 10, 2, 2, 2, 14, 2, 18, 
5, 2, 8, 4, 10, 10, 4, 5, 4, 2, 10, 10, 11, 2, 15, 10, 11, 1, 
4, 4, 14, 4, 2, 1, 4, 4, 4, 14, 8, 8, 8, 4, 2, 4, 1, 11, 1, 4, 
2, 2, 6, 2, 6, 9, 1, 6, 2, 8, 6, 2, 2, 2, 6, 9, 2, 4, 2, 2, 14, 
6, 9, 8, 2, 2, 2, 2, 4, 4, 6, 4, 4, 15, 2, 2, 2, 4, 2, 2, 8, 
2, 2, 2, 2, 4, 4, 4, 2, 4, 4, 2, 4, 4, 4, 2, 4, 4, 4, 4, 4, 4, 
2, 4, 6, 4, 2, 4, 2, 2, 4, 4, 4, 2, 4, 4, 4, 4, 4, 2, 2, 4, 2, 
4, 2, 4, 2, 4, 2, 4, 2, 4, 2, 4, 4, 4, 2, 4, 2, 4, 2, 17, 2, 
2, 4, 2, 2, 4, 2, 4, 2, 4, 4, 2, 2, 4, 2, 4, 2, 2, 4, 4, 4, 2, 
10, 5, 4, 4, 4, 2, 2, 2, 2, 2, 6, 9, 2, 9, 4, 11, 4, 2, 15, 2, 
4, 10, 4, 1, 4, 4, 2, 2, 2, 8, 6, 4, 4, 2, 8, 2, 9, 2, 1, 4, 
2, 8, 2, 8, 4, 10, 4, 4, 4, 5, 10, 4, 10, 4)
  • membership_table
   membership community_count community_maj_count membership_label
        <int>           <int>               <int>            <int>
 1          1             276                 171                2
 2          2             319                 206                2
 3          3             482                 322               -2
 4          4             404                 293                2
 5          5             161                  88                2
 6          6             271                 110               -2
 7          7             332                 167               -2
 8          8              56                  46                2
 9          9              77                  37                2
10         10             434                 244                2
11         11              27                  12               -2
12         12              24                  11                2
13         13              19                   8                2
14         14              17                   8               -2
15         15              18                  16                2
16         16              14                   5               -2
17         17              13                   8               -2
18         18              16                  13                2
19         19              12                   9               -2

dput:

structure(list(membership = 1:19, community_count = c(276L, 319L, 
482L, 404L, 161L, 271L, 332L, 56L, 77L, 434L, 27L, 24L, 19L, 
17L, 18L, 14L, 13L, 16L, 12L), community_maj_count = c(171L, 
206L, 322L, 293L, 88L, 110L, 167L, 46L, 37L, 244L, 12L, 11L, 
8L, 8L, 16L, 5L, 8L, 13L, 9L), membership_label = c(2L, 2L, -2L, 
2L, 2L, -2L, -2L, 2L, 2L, 2L, -2L, 2L, 2L, -2L, 2L, -2L, -2L, 
2L, -2L)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-19L))

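Note that the membership values in membership_table happen to be consecutive starting from 1 here, but that is not always the case: they can start at any index and can have gaps in between.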

Interim answer:

membership_index_list <- list()
membership_index_list[membership_table$membership] = membership_table$membership_label
membership_labels <- membership_index_list[comun$membership]

Changing my comment to use match instead of %in%:

membership_table$membership_label[match(comun$membership, membership_table$membership)]
#    [1]  2 -2  2 -2 -2 -2 -2 -2 -2 -2 -2  2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2  2 -2 -2 -2 -2 -2 -2
#   [40] -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2
#   [79] -2 -2 -2 -2 -2  2 -2 -2 -2 -2  2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2  2 -2  2 -2 -2  2 -2  2  2 -2 -2
### ...truncated...
# [1366]  2  2  2  2  2  2  2  2  2  2  2  2  2  2 -2  2  2  2  2 -2  2  2  2  2  2  2  2  2  2  2  2  2  2  2 -2  2  2  2  2
# [1405]  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2
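
As a rough sanity check, the sketch below compares the original lapply lookup with the match version on a small made-up table whose ids are non-consecutive. The names tbl and ids are assumptions for illustration only, not the data from the question. The point is that match() finds all row positions in one vectorized call instead of subsetting the data frame once per element, and it does not care whether the ids start at 1 or have gaps.

```r
# Toy lookup table with gaps in the ids (hypothetical data, for illustration only).
tbl <- data.frame(
  membership       = c(3L, 7L, 12L, 40L),
  membership_label = c(2L, -2L, 2L, -2L)
)
ids <- c(7, 3, 40, 40, 12, 3)

# Original approach: one data frame subset per element of ids.
slow <- unlist(lapply(ids, function(x) tbl[tbl$membership == x, ]$membership_label))

# Vectorized approach: match() returns the row position of each id in the table,
# and those positions are then used to index the label column in one step.
pos  <- match(ids, tbl$membership)
fast <- tbl$membership_label[pos]

# Both approaches should agree when every id is present in the table.
stopifnot(identical(slow, fast))
```

On real data, wrapping each variant in system.time() or a benchmarking package would show how large the difference is for your sizes.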

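Depending on your use case and how much you trust the integrity of your data, you may see NA when comun$membership contains a value that is not found in membership_table. For example,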

membership_table$membership_label[match(c(99, head(comun$membership)), membership_table$membership)]
# [1] NA  2 -2  2 -2 -2 -2

where 99 is not among the available membership_table$membership values.
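
If unmatched values are a real possibility, one option is to inspect the positions returned by match() before using them. The following is a minimal sketch under assumed toy data (tbl, ids, and the default label 0L are all hypothetical choices, not part of the question):

```r
# Hypothetical table and ids; 99 is deliberately absent from the table.
tbl <- data.frame(
  membership       = c(1L, 2L, 3L),
  membership_label = c(2L, 2L, -2L)
)
ids <- c(99, 2, 3, 1)

pos <- match(ids, tbl$membership)

# Report which ids had no match, so bad input is noticed
# instead of silently turning into NA labels.
missing_ids <- unique(ids[is.na(pos)])
if (length(missing_ids) > 0) {
  message("ids not found in table: ", paste(missing_ids, collapse = ", "))
}

# Unmatched ids produce NA here.
labels <- tbl$membership_label[pos]

# Optionally substitute a default label for unmatched ids
# (0L is an arbitrary placeholder chosen for this sketch).
labels_filled <- replace(labels, is.na(labels), 0L)
```

Whether to stop, warn, or fill in a default depends on what a missing membership means in your models.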
