R - Split a data frame into equal-sized overlapping groups



I am looking for a way to split my data into overlapping groups:

      Chrom     Start   End        
      chr1       1    10      
      chr1       11   20      
      chr1       21   30      
      chr1       31   40 

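For example, if I want a window size of 20, the groups would be: 1-20, 11-30, 21-40.
Rows keep being added to the same group as long as the group's size does not exceed 20.

I tried using the split function, but could not get it to work. Is there a way to do this?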

A vector (or a column of a data frame) can be split into overlapping windows:

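# Size of overlap; also used as the step between window starts,
# which matches the overlap only because n = 2 * o here
o <- 10
# Size of sliding window
n <- 20
# Dummy data
x <- sample(LETTERS, size = 40, replace = TRUE)
# Define start and end point (s and e)
s <- 1
e <- n
# Loop to create fragments; stop at the last window that fits entirely
# inside x, so no fragment is padded with NA
for (i in seq_len((length(x) - n) %/% o + 1)) {
  assign(paste0("x", i), x[s:e])
  s <- s + o
  e <- (s + n) - 1
}
# Call fragments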
x1
x2
x3

Result:

> x
 [1] "F" "E" "G" "X" "R" "S" "L" "F" "F" "C" "I" "X" "A" "C" "B" "Z" "Q" "T" "W" "L" "G" "I" "B" "I" "O" "V" "J" "Z" "C" "R" "W" "Z" "F" "T" "N" "U" "F" "R" "A" "V"
> x1
 [1] "F" "E" "G" "X" "R" "S" "L" "F" "F" "C" "I" "X" "A" "C" "B" "Z" "Q" "T" "W" "L"
> x2
 [1] "I" "X" "A" "C" "B" "Z" "Q" "T" "W" "L" "G" "I" "B" "I" "O" "V" "J" "Z" "C" "R"
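
The same idea can be written without `assign()`, collecting the fragments in a list instead of separate variables. This is only a sketch: `sliding_windows` and its `step` argument are hypothetical names introduced here, not part of the answer above, and the step between window starts is passed explicitly rather than derived from the overlap.

```r
# Hypothetical helper: split x into windows of length n whose
# start positions advance by `step` elements.
sliding_windows <- function(x, n, step) {
  # No complete window fits if x is shorter than n
  if (length(x) < n) return(list())
  # Start index of every window that fits entirely inside x
  starts <- seq(1, length(x) - n + 1, by = step)
  lapply(starts, function(s) x[s:(s + n - 1)])
}

# Positions 1..40 stand in for real data
windows <- sliding_windows(1:40, n = 20, step = 10)
length(windows)       # number of windows
range(windows[[2]])   # first and last position of the second window
```

Returning a list keeps the number of fragments flexible and lets the result be passed straight to `lapply()` or `sapply()`.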
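
Alternatively, the Bioconductor package GenomicRanges can report which rows fall into which window:

library(IRanges)
library(GenomicRanges)
# Rows from the question: four ranges of width 10
(gr1 <- GRanges("chr1", IRanges(c(1, 11, 21, 31), width = 10), strand = "*"))
# The overlapping windows of width 20
(gr2 <- GRanges("chr1", IRanges(c(1, 11, 21), width = 20), strand = "*"))

# Pair each row (query) with every window (subject) it overlaps
fo <- findOverlaps(gr1, gr2)
queryHits(fo)    # row indices in gr1
subjectHits(fo)  # matching window indices in gr2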

See http://genomicsclass.github.io/book/pages/bioc1_igranges.html#intrarange for more details.
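
For a small table like the one in the question, the grouping can also be sketched in base R alone. The example below assumes a single chromosome, a window size of 20, and windows that start every 10 positions (the `step` value is an assumption read off the example groups 1-20, 11-30, 21-40); the names `window`, `step`, `win_start` and `groups` are illustrative only.

```r
# Data frame from the question
df <- data.frame(
  Chrom = "chr1",
  Start = c(1, 11, 21, 31),
  End   = c(10, 20, 30, 40)
)

window <- 20  # window size from the question
step   <- 10  # assumed distance between window starts

# Start coordinate of every window that fits inside the covered region
win_start <- seq(min(df$Start), max(df$End) - window + 1, by = step)

# For each window, keep the rows that lie entirely inside it
groups <- lapply(win_start, function(s) {
  df[df$Start >= s & df$End <= s + window - 1, ]
})
names(groups) <- paste0(win_start, "-", win_start + window - 1)

names(groups)
groups[["11-30"]]
```

Each element of `groups` is a data frame holding the rows of one window, so a row that belongs to two windows appears in both.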
