Introducing a counter into a loop in Scala

I'm writing a small program that splits a very large file into multiple smaller files, each containing 100 lines.

I'm iterating over a line iterator:

  while (lines.hasNext) {
      val line = lines.next()
  }

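I'd like to introduce a counter that is reset when it reaches a certain value, after which the loop continues. In Java, I would do something like this: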

int counter = 0;
while (lines.hasNext()) {
    String line = lines.next();
    // Reset the counter once it reaches 100
    if (counter == 100) {
        counter = 0;
    }
    ++counter;
}

Is there something similar in Scala, or an alternative approach?

The traditional way to do this in Scala is .zipWithIndex:

scala> List("foo","bar")
res0: List[java.lang.String] = List(foo, bar)
scala> for((x,i) <- res0.zipWithIndex) println(i + " : " +x)
0 : foo
1 : bar

(This also works for your lines, as long as they are in an Iterator, i.e. something with hasNext and next() methods, or in some other Scala collection.)
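As a minimal sketch of the same idea on an Iterator rather than a List (the object name, the `numbered` helper and the sample lines are made up for illustration):

```scala
object ZipWithIndexOnIterator {
  // Pairs each line with its zero-based position; the iterator is consumed lazily.
  def numbered(lines: Iterator[String]): Iterator[(String, Int)] =
    lines.zipWithIndex

  def main(args: Array[String]): Unit = {
    // Stand-in for the lines of a file
    val lines = Iterator("foo", "bar", "baz")
    for ((line, i) <- numbered(lines)) println(s"$i : $line")
  }
}
```

Because zipWithIndex on an Iterator returns another Iterator, the whole file never has to be held in memory.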

However, if you need more complex logic, such as resetting the counter, you can write it the same way as in Java:

var counter = 0
while (lines.hasNext) {
  val line = lines.next()
  if (counter % 100 == 0) {
    // now write to another file
  }
  // Advance the counter for every line read
  counter += 1
}

Could you tell us why you want to reset the counter, so that we can suggest a better way?

Edit: given your update, it is better to use the grouped method, as @pr1001 suggested:

lines.grouped(100).foreach(l => l.foreach(/* write line to file*/))

If resetting the counter reflects repeating groups of data in the original list, you may want to use the grouped method:

scala> val l = List("one", "two", "three", "four")
l: List[java.lang.String] = List(one, two, three, four)
scala> l.grouped(2).toList
res0: List[List[java.lang.String]] = List(List(one, two), List(three, four))

Update: since you are reading a file, you should be able to iterate over it quite efficiently:

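val bigFile = io.Source.fromFile("/tmp/verybigfile")
// Group the lines (two per group here) and pair each group with its index
val groupedLines = bigFile.getLines().grouped(2).zipWithIndex
groupedLines.foreach(group => {
  val (lines, index) = group
  // Write each group to its own file, named after the group index
  val p = new java.io.PrintWriter("/tmp/" + index)
  lines.foreach(p.println)
  p.close()
})
bigFile.close()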

Of course, this could also be written as a for comprehension...
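A rough sketch of what the for comprehension form might look like. The `SplitWithFor` object, the `split` helper and its parameters are hypothetical names introduced here, and the output directory is assumed to exist already:

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

object SplitWithFor {
  // Writes every `linesPerFile` lines of `inputPath` to a file named after the
  // chunk index inside `outputDir`, and returns the number of files written.
  def split(inputPath: String, outputDir: String, linesPerFile: Int): Int = {
    val source = Source.fromFile(inputPath)
    try {
      var written = 0
      for ((chunk, index) <- source.getLines().grouped(linesPerFile).zipWithIndex) {
        val out = new PrintWriter(new File(outputDir, index.toString))
        // Close the writer even if writing a line fails
        try chunk.foreach(line => out.println(line)) finally out.close()
        written += 1
      }
      written
    } finally source.close()
  }
}
```

Calling `SplitWithFor.split("/tmp/verybigfile", "/tmp", 100)` would then correspond to the 100-line split asked about in the question.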

You might even get better performance by converting groupedLines to a parallel collection with .par before writing each group of lines to its own file.

This will work:

lines grouped 100 flatMap (_.zipWithIndex) foreach {
  case (line, count) => //whatever
}
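To make the effect of this one-liner visible, here is a small sketch with a group size of 2 instead of 100. The `ResettingCounter` object and the `withResettingCounter` helper are illustrative names, not part of the answer above:

```scala
object ResettingCounter {
  // Pairs every element with a counter that restarts at 0 after `size` elements.
  def withResettingCounter[A](items: Iterator[A], size: Int): Iterator[(A, Int)] =
    items.grouped(size).flatMap(_.zipWithIndex)

  def main(args: Array[String]): Unit = {
    val lines = Iterator("a", "b", "c", "d", "e")
    withResettingCounter(lines, 2).foreach {
      case (line, count) => println(s"$count: $line")
    }
  }
}
```

The counter restarts at the start of every group, so a count of 0 marks the first line of a new chunk.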

You can use zipWithIndex together with a transformation of the index.

scala> List(10, 20, 30, 40, 50).zipWithIndex.map(p => (p._1, p._2 % 3))
res0: List[(Int, Int)] = List((10,0), (20,1), (30,2), (40,0), (50,1))
