Scala Streams:如何避免保留对头(和其他元素)的引用

假设我有一个尾部递归方法，它在Stream的元素上循环，如下所示(简化代码，未测试)：

@tailrec
def loop(s: Stream[X], acc: Y): Y = 
s.headOption match {
case None => acc
case Some(x) => loop(s.tail, accumulate(x, acc))
}

我在迭代时是否保留了对流头(和所有其他元素)的引用，我知道应该避免这种情况？如果是这样的话，那么实现同样目标的更好方法是什么呢？

调用此操作的代码(我希望)没有保留引用。假设list是List[X]，则代码调用

loop(list.sliding(n).toStream, initialY)

编辑：我知道这可以很容易地在没有尾部递归的情况下完成(例如使用foldLeft)，但非简化代码并不是一次只循环一个元素(有时使用s而不是s.tail，有时使用s.tail.dropWhile(...)。所以我想知道如何正确使用Stream。

tl；dr:您的方法loop是正确的，将不会引用流的头。你可以在无限的Stream上测试它。

让我们将您的代码示例简化到极限：

class Test {
private[this] var next: Test = _
final def fold(): Int = {
next = new Test
next.fold()
}
}

请注意，您的loop方法也是某个对象的方法。

方法是final(就像Stream#foldLeft一样)——它非常重要。

使用尾部递归优化后的scalac -Xprint:all test.scala，您将得到以下内容：

final def fold(): Int = {
<synthetic> val _$this: Test = Test.this;
_fold(_$this: Test){
({
Test.this.next = new Test();
_fold(Test.this.next)
}: Int)
}
};

这个代码不会帮助你理解发生了什么

通往理解的神奇之地的唯一道路是java字节码。

但你应该记住一件事：没有method of object这样的东西。所有方法都是"静态"的。CCD_ 17只是该方法的第一个参数。如果方法是虚拟的，那么就有vtable这样的东西，但我们的方法是最终的，所以在这种情况下不会有动态调度。

还要注意，并没有参数这样的东西：所有参数都只是变量，在方法执行之前初始化。

因此this只是该方法的第一个变量(索引0)。

让我们来看看字节码(javap -c Test.class)：

public final int fold();
Code:
0: aload_0       
1: new           #2                  // class Test
4: dup           
5: invokespecial #16                 // Method "<init>":()V
8: putfield      #18                 // Field next:LTest;
11: aload_0       
12: getfield      #18                 // Field next:LTest;
15: astore_0      
16: goto          0

让我们用类似伪标量的代码来写这个方法：

static foo(var this: Test): Int {
:start // label for goto jump
// place variable `this` onto the stack:
//   0: aload_0       
// create new `Test`
//   1: new           #2                  // class Test
// invoke `Test` constructor
//   4: dup           
//   5: invokespecial #16                 // Method "<init>":()V
// assign `this.next` field value
//   8: putfield      #18                 // Field next:LTest;
this.next = new Test
// place `this.next` onto the stack
//  11: aload_0       
//  12: getfield      #18                 // Field next:LTest;
// assign `this.next` to variable `this`!
//  15: astore_0      
this = this.next // we have no reference to the previous `this`!
//  16: goto          0
goto :start
}

在this = this.next之后，我们没有引用堆栈上或第一个变量中的前一个this。并且之前的this可以被GC移除！

因此Stream#foldLeft中的tail.foldLeft(...)将被替换为this = this.tail, ...; goto :start。由于this只是方法@tailrec的第一个自变量，因此foldLeft声明是有意义的。

现在我们终于可以理解scalac -Xprint:all test.scala的结果：

final def method(a: A, b: B, ...): Res = {
<synthetic> val _$this: ThisType = ThisType.this;
_method(_$this: Test, a: A, b: B, ...){
({
// body
_method(nextThis, nextA, nextB, ...)
}: Res)
}
};

表示：

final def method(var this: ThisType, var a: A, var b: B, ...): Res = {
// _method(_$this: Test, a: A, b: B, ...){
:start
// body
//   _method(nextThis, nextA, nextB, ...)
this = nextThis
a = nextA
b = nextB
...
goto :start
};

这正是在loop方法上scalac -Xprint:all之后得到的结果，但body将是巨大的。所以在你的情况下：

...
case Some(x) =>
this = this
s = s.tail
acc = accumulate(x, acc)
goto :start
...

在s = s.tail之后，您没有引用流的头。

在大多数出现此问题的情况下，更重要的问题是"调用代码是否挂在流的头上？">

真正重要的是，您将另一个方法的输出直接传递给loop，而不是先将其分配给val。

也就是说，我会使用一种更简单的方法来避免所有可能的混乱：

list.sliding(n).foldLeft(initialY)(accumulate)

相关内容

最新更新

热门标签：