r - Count number of zeros in each row of large data.frame using purrr::map function

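>I have a very large data frame, 280,000 x 20, and many of the rows (obs) contain only 1 or 0 values. The function I am using needs at least 2 values per operation. I can iterate with a for loop, but that takes a long time. I would like to use one of the purrr map functions to speed this up, since I will be doing it many times. This is how I would do it with a for loop: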

library(Matrix)
M1 <- as.matrix(rsparsematrix(100, 20, .1, rand.x = runif))
x <- vector("integer")
for(i in 1:dim(M1)[1]){
  l <- (length(which(M1[i,] == 0)))
  x <- c(x,l)
}
ind <- which(x == 19 | x == 20)
M1 <- M1[-ind,]

I have not found the right way to do this with map. I think it needs mutate to create another column.

M1 %>% mutate(zero_count = length(map(which(. == 0))))

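The expected output is not entirely clear. Here, we first convert the matrix to a tibble or data.frame, then mutate the columns into logical columns, reduce them to a single vector by adding (+) all the TRUE values in each row, and cbind the result with the original matrix ("M1"):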

library(tidyverse)
M1 %>% 
  as_tibble %>%
  mutate_all(~ . == 0) %>%  # TRUE where the value is zero; funs() is deprecated since dplyr 0.8.0
  reduce(`+`) %>% 
  cbind(M1, Count = .)
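
Since the question asks specifically about purrr, a row-wise count with purrr::map_int might look like the sketch below. The small matrix m and the name zero_count are made up for illustration and do not come from the question. This still calls a function once per row, so on a 280,000-row matrix it is unlikely to be faster than a vectorized approach.

```r
library(purrr)

# Hypothetical 3 x 4 example matrix (not from the question)
m <- matrix(c(0, 0, 0, 0,
              1, 0, 0, 0,
              1, 2, 0, 3),
            nrow = 3, byrow = TRUE)

# One integer per row: the number of entries equal to zero
zero_count <- map_int(seq_len(nrow(m)), function(i) sum(m[i, ] == 0))
zero_count
```

map_int is used rather than map so that the result is a plain integer vector instead of a list.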

Update

To subset the rows based on the sum:

M1 %>% 
  as_tibble %>% 
  mutate_all(~ . == 0) %>%  # TRUE where the value is zero; funs() is deprecated since dplyr 0.8.0
  reduce(`+`) %>% 
  `%in%`(19:20)  %>%
  magrittr::extract(M1, .,)

With base R, use rowSums on the logical matrix and cbind the result with the original matrix:

cbind(M1, Count = rowSums(!M1))

or subset with rowSums:

M1[rowSums(!M1) %in% 19:20, ]
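
Note that the subset above keeps the rows with 19 or 20 zeros, whereas the loop in the question removes them. If the goal is to drop those rows, a logical mask is one possible sketch. The matrix m and the names keep and m_kept are hypothetical. The last lines illustrate why a logical mask may be safer than a negative index like M1[-ind,]: when no row matches, the negative index selects nothing instead of everything.

```r
# Hypothetical 4 x 5 matrix standing in for M1
m <- matrix(c(0, 0, 0, 0, 0,
              7, 0, 0, 0, 0,
              1, 2, 0, 0, 0,
              1, 2, 3, 4, 5),
            nrow = 4, byrow = TRUE)

# Keep only rows with at least two non-zero values
keep <- rowSums(m != 0) >= 2
m_kept <- m[keep, , drop = FALSE]  # drop = FALSE keeps a matrix even for one row

# Pitfall with negative indices: an empty index drops every row
ind <- which(rowSums(m == 0) > ncol(m))  # integer(0), no row can match
nrow(m[-ind, , drop = FALSE])            # 0 rows, not 4
```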

You can achieve the same thing with apply:

apply(M1, 1 , function(x) sum(!x))
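
As a quick sanity check, the apply and rowSums versions can be compared on a random matrix. This is only a sketch: the matrix m, its dimensions, and the seed are arbitrary choices for illustration.

```r
set.seed(1)

# Hypothetical test matrix, roughly 90% zeros
m <- matrix(rbinom(2000, 1, 0.1) * runif(2000), nrow = 100, ncol = 20)

by_apply   <- apply(m, 1, function(x) sum(!x))  # integer vector
by_rowsums <- rowSums(!m)                       # double vector

# all.equal compares values and ignores the integer/double difference;
# identical() would return FALSE here because the types differ
all.equal(by_apply, by_rowsums)
```

Both give the same counts. apply calls the function once per row, so rowSums is usually the better fit for a matrix with hundreds of thousands of rows.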
