这个rope的实现是如何正确的?



我试图解决一个涉及到大量插入到列表中的编程问题。然而,问题的细节与问题无关。

为了解决这个问题,我最终编写了某种rope数据结构,试图从我能在互联网上找到的稀缺资源中拼凑出一个实现。

我最后这样写:

from dataclasses import dataclass
import typing as t
from hypothesis import given, strategies as st
T = t.TypeVar("T")

@dataclass
class Branch(t.Generic[T]):
weight: int
left: t.Union[t.List[T], "Branch[T]"]
right: t.Union[t.List[T], "Branch[T]"]

Rope = t.Union[t.List[T], Branch[T]]

def insert(rope: Rope, index: int, item: T) -> Rope:
if isinstance(rope, list):  # leaf
if index == len(rope):  # at the end, just append
rope.append(item)
return rope
# split
left, right = rope[:index], [item, *rope[index:]]
return Branch(len(left), left, right)
else:  # branch
if rope.right and rope.weight <= index:
rope.right = insert(rope.right, index - rope.weight, item)
else:
rope.weight += 1
rope.left = insert(rope.left, index, item)
return rope

def straighten(rope: Rope) -> t.Iterator[T]:
if isinstance(rope, list):
yield from rope
else:
yield from straighten(rope.left)
yield from straighten(rope.right)

@st.composite
def instructions(draw, elements=st.integers(0, 255)):
items = draw(st.lists(elements, min_size=1))
result = []
for i, item in enumerate(items):
insertion_point = draw(st.integers(min_value=0, max_value=i))
result.append((insertion_point, item))
return result

@given(instructions())
def test_correctness(seq: t.List[t.Tuple[int, int]]) -> None:
manual: t.List[int] = []
for (i, item) in seq:
manual.insert(i, item)
rope: Rope = []
for (i, item) in seq:
rope = insert(rope, i, item)
assert manual == list(
straighten(rope)
), f"{manual!r} != {rope!r} => {list(straighten(rope))!r}"

if __name__ == "__main__":
test_correctness()
print("All good")
如您所见,代码传递了假设,因此,它解决了问题。

我无法理解,然而为什么我写的这段代码是正确的,因为试图使用分支和叶子(也就是说,不使用列表)重写它会产生一个不正确的实现:

from dataclasses import dataclass
import typing as t
from hypothesis import given, strategies as st
T = t.TypeVar("T")

@dataclass
class Branch(t.Generic[T]):
weight: int
left: t.Union[T, "Branch[T]"]
right: t.Union[T, "Branch[T]"]

Rope = t.Union[T, Branch[T]]

def insert(rope: Rope, index: int, item: T) -> Rope:
if isinstance(rope, Branch):  # leaf
if rope.right and rope.weight <= index:
rope.right = insert(rope.right, index - rope.weight, item)
else:
rope.weight += 1
rope.left = insert(rope.left, index, item)
return rope
if index == 0:
left, right = item, rope
else:
left, right = rope, item
return Branch(1, left, right)

def straighten(rope: Rope) -> t.Iterator[T]:
if isinstance(rope, Branch):
yield from straighten(rope.left)
yield from straighten(rope.right)
else:
yield rope

@st.composite
def instructions(draw, elements=st.integers(0, 255)):
items = draw(st.lists(elements, min_size=1))
result = []
for i, item in enumerate(items):
insertion_point = draw(st.integers(min_value=0, max_value=i))
result.append((insertion_point, item))
return result

@given(instructions())
def test_correctness(seq: t.List[t.Tuple[int, int]]) -> None:
it = iter(seq)
_, head = next(it)
rope: Rope = head
manual = [head]
for (i, item) in it:
manual.insert(i, item)
rope = insert(rope, i, item)
straight_rope: t.List[int] = list(straighten(rope))
assert manual == straight_rope, f"{manual!r} != {straight_rope!r} ({rope!r})"

if __name__ == "__main__":
test_correctness()
print("All good")

所以,如果有更深入了解这个数据结构的人能告诉我这两种实现之间的区别,以及为什么一种可以工作,而另一种不行,我会很感激。提前谢谢你。

您的代码失败了,因为在if rope.right and rope.weight <= index:中,当rope.right == 0时条件计算为False。在第二段代码中,当右节点是值为0的叶节点时,就会发生这种情况,并且会导致该值被插入到错误的子树中。

根本没有理由检查rope.right,所以这个检查应该简单地为if rope.weight <= index:

在这两种rope实现中都存在另一个潜在的问题:您正在改变节点,而rope实现通常将节点视为不可变的,并创建新节点以进行任何更改。这不会影响您的测试,因为您一次只有一个rope实例。一个可能导致问题的示例:

r = create_some_rope()
r2 = r
r = insert(r, idx, val)
# now r2 may or may not have changed, depending on its structure.

如果添加rope连接,也会导致问题。在这种情况下,如果您将一个rope连接到它自己(或它自己的子树),那么任何进一步的操作都可能无意中影响到rope的多个部分。

最新更新