What opportunities are there to improve performance/concurrency in the following Scala + Akka code?



I'm looking for opportunities to improve concurrency and performance in my Scala 2.9 / Akka 2.0 RC2 code. Given the following code:

import akka.actor._
case class DataDelivery(data:Double)
class ComputeActor extends Actor {
    var buffer = scala.collection.mutable.ArrayBuffer[Double]()
    val functionsToCompute = List("f1","f2","f3","f4","f5")
    var functionMap = scala.collection.mutable.LinkedHashMap[String,(Map[String,Any]) => Double]()  
    functionMap += {"f1" -> f1}
    functionMap += {"f2" -> f2}
    functionMap += {"f3" -> f3}
    functionMap += {"f4" -> f4}
    functionMap += {"f5" -> f5}
    def updateData(data:Double):scala.collection.mutable.ArrayBuffer[Double] = {
        buffer += data
        buffer
    }
    def f1(map:Map[String,Any]):Double = {
//    println("hello from f1")
      0.0
    }
    def f2(map:Map[String,Any]):Double = {
//    println("hello from f2")
      0.0
    }
    def f3(map:Map[String,Any]):Double = {
//    println("hello from f3")
      0.0
    }
    def f4(map:Map[String,Any]):Double = {
//    println("hello from f4")
      0.0
    }
    def f5(map:Map[String,Any]):Double = {
//    println("hello from f5")
      0.0
    }
    def computeValues(immutableBuffer:IndexedSeq[Double]):Map[String,Double] = {
        var map = Map[String,Double]()
        try {
            functionsToCompute.foreach(function => {
                val value = functionMap(function)
                function match {
                    case "f1" =>
                        var v = value(Map("lookback"->10,"buffer"->immutableBuffer,"parm1"->0.0))
                        map += {function -> v}
                    case "f2" =>
                        var v = value(Map("lookback"->20,"buffer"->immutableBuffer))
                        map += {function -> v}
                    case "f3" =>
                        var v = value(Map("lookback"->30,"buffer"->immutableBuffer,"parm1"->1.0,"parm2"->false))
                        map += {function -> v}
                    case "f4" =>
                        var v = value(Map("lookback"->40,"buffer"->immutableBuffer))
                        map += {function -> v}
                    case "f5" =>
                        var v = value(Map("buffer"->immutableBuffer))
                        map += {function -> v}
                    case _ => 
                        println(this.unhandled())
                }
            })
        } catch {
            case ex: Exception =>
              ex.printStackTrace()
        }
        map
    }
    def receive = {
      case DataDelivery(data) =>
        val startTime = System.nanoTime()/1000
        val answers = computeValues(updateData(data))
        val endTime = System.nanoTime()/1000
        val elapsedTime = endTime - startTime
        println("elapsed time is " + elapsedTime)
        // reply or forward
      case msg =>
        println("msg is " + msg)
    }
}
object Test {
    def main(args:Array[String]) {
        val system = ActorSystem("actorSystem") 
        val computeActor = system.actorOf(Props(new ComputeActor),"computeActor")
        var i = 0
        while (i < 1000) {  
            computeActor ! DataDelivery(i.toDouble)
            i += 1
        }
    }
}

When I run this, the output (converted to microseconds) is

elapsed time is 4898
elapsed time is 184
elapsed time is 144
    .
    .
    .
elapsed time is 109
elapsed time is 103

You can see the JVM's incremental compiler kicking in.

I thought that one quick win might be to change

    functionsToCompute.foreach(function => {

to

    functionsToCompute.par.foreach(function => {

but this results in the following elapsed times

elapsed time is 31689
elapsed time is 4874
elapsed time is 622
    .
    .
    .
elapsed time is 698
elapsed time is 2171

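Some info:

1) I'm running this on a Macbook Pro with 2 cores.

2) In the full version, the functions are long-running operations that loop over portions of the mutable shared buffer. This doesn't appear to be a problem, since retrieving messages from the actor's mailbox controls the flow, but I suspect it could become an issue with increased concurrency. This is why I've converted to an IndexedSeq.

3) In the full version, the functionsToCompute list may vary, so not all items in functionMap are necessarily called (i.e. functionMap.size may be much larger than functionsToCompute.size).

4) The functions can be computed in parallel, but the resulting map must be complete before returning.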
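Points 2 to 4 can be sketched without any shared mutable state. In the posted code, `map += ...` on a `var` inside `par.foreach` is an unsynchronized read-modify-write from several threads, so entries can be lost, and the `immutableBuffer` parameter is still the same `ArrayBuffer` seen through a read-only interface. The sketch below is only an illustration under assumptions: the three functions and the simplified `Vector[Double] => Double` signature are hypothetical stand-ins for f1 to f5, and `buffer.toIndexedSeq` (or a `Vector`) is assumed to be used to take a real immutable snapshot.

```scala
object ComputeSketch {
  // Hypothetical stand-ins for f1..f5, simplified to take only the snapshot.
  val functionMap: Map[String, Vector[Double] => Double] = Map(
    "sum"  -> ((buf: Vector[Double]) => buf.sum),
    "max"  -> ((buf: Vector[Double]) => buf.max),
    "last" -> ((buf: Vector[Double]) => buf.last)
  )

  // Builds the result from (name, value) pairs instead of mutating a var.
  // Names that are not registered in functionMap are skipped.
  def computeValues(names: Seq[String],
                    snapshot: Vector[Double]): Map[String, Double] = {
    val pairs = names.collect {
      case name if functionMap.contains(name) =>
        name -> functionMap(name)(snapshot)
    }
    // The map only exists once every pair has been computed.
    pairs.toMap
  }
}
```

Because no closure writes to shared state, the same shape should carry over to a parallel collection or to futures; whether that pays off depends on how long each function runs compared with the cost of handing work to other threads.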

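Some questions:

1) What can I do to make the parallel version run faster?

2) Where does it make sense to add non-blocking and blocking futures?

3) Where does it make sense to forward a computation to another actor?

4) What are some opportunities for increasing immutability/safety?

Thanks, Bruce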

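An example, as requested (sorry about the delay... I don't have notifications on for SO).

There's a great example in the Akka documentation on "Composing Futures", but I'll give you something a bit more tailored to your situation.

Now, after reading this, please take some time to read through the tutorials and docs on Akka's website. You're missing a lot of key information that those docs will provide for you.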

import akka.dispatch.{Await, Future, ExecutionContext}
import akka.util.duration._
import java.util.concurrent.Executors
object Main {
  // This just makes the example work.  You probably have enough context
  // set up already to not need these next two lines
  val pool = Executors.newCachedThreadPool()
  implicit val ec = ExecutionContext.fromExecutorService(pool)
  // I'm simulating your function.  It just has to return a tuple, I believe
  // with a String and a Double
  def theFunction(s: String, d: Double) = (s, d)
  def main(args: Array[String]) {
    // Here we run your functions - I'm just doing a thousand of them
    // for fun.  You do what you need to do
    val listOfFutures = (1 to 1000) map { i =>
      // Run them in parallel in the future
      Future {
        theFunction(i.toString, i.toDouble)
      }
    }
    // These lines can be composed better, but breaking them up should
    // be more illustrative.
    //
    // Turn the list of Futures (i.e. Seq[Future[(String, Double)]]) into a
    // Future with a sequence of results (i.e. Future[Seq[(String, Double)]])
    val futureOfResults = Future.sequence(listOfFutures)
    // Convert that future into another future that contains a map
    // instead of a sequence
    val intermediate = futureOfResults map { _.toList.toMap }
    // Wait for it to complete.  Ideally you don't do this.  Continue to
    // transform the future into other forms or use pipeTo() to get it to go
    // as a result to some other Actor.  "Await" is really just evil... the
    // only place you should really use it is in silly programs like this or
    // some other special purpose app.
    val resultingMap = Await.result(intermediate, 1 second)
    println(resultingMap)
    // Again, just to make the example work
    pool.shutdown()
  }
}

All you need on the classpath to run this is the akka-actor jar. The Akka website will tell you how to set up what you need, but it's really dead simple.
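The same pattern can be applied to a map of named functions, which is closer to the shape of the question. What follows is a minimal sketch, not a drop-in replacement: it assumes Scala 2.10 or later, where a very similar Future API lives in `scala.concurrent`, instead of `akka.dispatch` from Akka 2.0. The functions in `functionMap`, the name `computeAll`, and the fixed pool of 2 threads are all hypothetical choices made for illustration.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureMapSketch {
  // Hypothetical stand-ins for the question's f1..f5.
  val functionMap: Map[String, Vector[Double] => Double] = Map(
    "mean" -> ((buf: Vector[Double]) => buf.sum / buf.size),
    "min"  -> ((buf: Vector[Double]) => buf.min),
    "max"  -> ((buf: Vector[Double]) => buf.max)
  )

  // One Future per requested function; the resulting Future completes only
  // when every value is available, so the map is never seen half-built.
  def computeAll(names: List[String], snapshot: Vector[Double])
                (implicit ec: ExecutionContext): Future[Map[String, Double]] = {
    val futures: List[Future[(String, Double)]] =
      names.filter(functionMap.contains).map { name =>
        Future(name -> functionMap(name)(snapshot))
      }
    Future.sequence(futures).map(_.toMap)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: 2 threads, matching the 2-core machine in the question.
    val pool = Executors.newFixedThreadPool(2)
    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutorService(pool)
    try {
      val future = computeAll(List("mean", "max"), Vector(1.0, 2.0, 3.0))
      // Blocking only because this is a standalone demo.
      println(Await.result(future, 5.seconds))
    } finally pool.shutdown()
  }
}
```

Inside an actor you would normally not block on the result; as the comments in the example above suggest, keep transforming the future or pipe it to whichever actor needs the answer. For functions as cheap as the placeholders in the question, the scheduling overhead will probably outweigh any gain, so measure with the real long-running functions before concluding anything.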
