Itertool.产品如何计算笛卡尔产品而不保留中间结果

根据此处的文档iterpools.products.products不在内存中保持中间结果（它计算输入列表的笛卡尔产品）。但是给出的算法的粗略草图使我相信它确实如此。注意结果如何通过在>结果中中的元素进行更多的元素而在每次迭代中进行更新。

def product(*args, repeat=1):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)

我尝试在这里嘲笑基础 c 代码，但不能。我想了解 c 代码如何工作，而无需保持中间结果。我遇到了一种递归方法（在下面重现），该方法无法将中间结果保持在内存中，除了递归呼叫堆栈。 C代码还使用递归方法，否则它如何在不保留内存中的中间结果的情况下工作？

// Recursive approach
def product(input, ans, i): 
    if i == len(input): 
        print(ans) 
        return 
    j = 0 
    while j < len(input[i]): 
        ans.append(input[i][j]) 
        product(input, ans, i+1) 
        ans.pop() 
        j += 1
input = [] 
input.append(["hi", "hey"]) 
input.append(["there", "here"]) 
input.append(["cute", "handsome"])  
ans = [] 
print(product(input, ans, 0))
hi there cute
hi there handsome
....

它将输入存储在内存中（AS tuple s），以及每个tuple的索引，并且反复循环除了第一个。每次请求新的输出值时，IT：

将索引推向最右tuple
如果它超过了末端，则将其重置为零，并推进下一个最右边的索引
重复步骤2，直到发现可以增加索引而无需超过其特定迭代器的末端
通过在每个数据源的当前索引上拉出值来创建一个新的tuple

它具有第一个拉动的特殊情况，它只是从每个 tuple中拉出0个值，否则每次都遵循该模式。

对于一个非常简单的示例，内部状态：

for x, y in product('ab', 'cd'):

最初是前面创建元组('a', 'b')和('c', 'd')，最初是[0, 0]。在第一个拉动中，它产生('a', 'b')[0], ('c', 'd')[0]或('a', 'c')。在下一个拉动中，它将索引数组推向[0, 1]，并产生('a', 'b')[0], ('c', 'd')[1]或('a', 'd')。下一个拉动将最右边的索引提升到2，意识到它已经溢出，将其放回0，并将下一个索引推进了[1, 0]，并产生('a', 'b')[1], ('c', 'd')[0]或('b', 'c')。这一直持续到最左侧的索引溢出，此时迭代完成。

实际上等效的Python代码看起来更像：

def product(*iterables, repeat=1):
    tuples = [tuple(it) for it in iterables] * repeat
    if not all(tuples):
        return # A single empty input means nothing to yield
    indices = [0] * len(tuples)
    yield tuple(t[i] for i, t in zip(indices, tuples))
    while True:
        # Advance from rightmost index moving left until one of them
        # doesn't cycle back to 0
        for i in range(len(indices))[::-1]:
            indices[i] += 1
            if indices[i] < len(tuples[i]):
                break  # Done advancing for this round
            indices[i] = 0  # Cycle back to 0, advance next
        else:
            # The above loop will break at some point unless
            # the leftmost index gets cycled back to 0
            # (because all the leftmost values have been used)
            # so if reach the else case, all products have been computed
            return
        yield tuple(t[i] for i, t in zip(indices, tuples))

，但是如您所见，它比提供的简单版本要复杂得多。

如您所见，创建时，每个输出 tuple均立即为yield；仅将这些输入的输入和当前索引保留为迭代态状态。因此，只要呼叫者不存储结果，而只是现场迭代，就几乎不需要内存。

相关内容

最新更新

热门标签：