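Where are the 'elements' stored in a generator?

The code below sums all the numbers in the list all_numbers. That makes sense, because every number to be summed is held in the list.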

def firstn(n):
    '''Return a list of the numbers from 0 to n - 1.'''
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums

# all numbers are held in a list, which is memory intensive
all_numbers = firstn(100000000)
sum_of_first_n = sum(all_numbers)
# Uses 3.8 GB during processing and 1.9 GB to store variables
# 13.9 seconds to process
sum_of_first_n

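When I converted the function above into a generator function, I got the same result with much less memory (code below). What I don't understand is how all_numbers can be summed if it does not contain all the numbers, as the list above does.

If the numbers are generated on demand, and all of them have to be generated in order to sum them, then where are those numbers stored, and how does that translate into lower memory usage?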

def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

# all numbers are held in a generator
all_numbers = firstn(100000000)
sum_of_first_n = sum(all_numbers)
# Uses < 100 MB during processing and to store variables
# 9.4 seconds to process
sum_of_first_n

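I understand how to create generator functions and why you would want to use them, but I don't understand how they work.

A generator does not store values. Think of a generator as a function with a context: it saves its state and GENERATES a value each time it is asked for one. It gives you a value, then "discards" it, saves the context of the computation, and waits until you ask for more. It keeps doing this until the generator is exhausted.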

def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

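In the example you provided, the "only" memory used is num, which is where the computation is stored. The firstn generator keeps num (together with the argument n) in its context until the while loop finishes.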
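As a rough illustration of that point, here is a minimal sketch that compares object sizes with sys.getsizeof. The variable names are made up for the example, and note the assumption spelled out in the comments: getsizeof on a list counts only the list's own pointer array, not the int objects it refers to, so the real gap is even larger than what is printed.

```python
import sys

def firstn(n):
    """Yield the integers 0, 1, ..., n - 1 one at a time."""
    num = 0
    while num < n:
        yield num
        num += 1

small_gen = firstn(10)
huge_gen = firstn(10**9)               # nothing has been computed yet
number_list = list(firstn(100_000))    # every value is materialized here

gen_size_small = sys.getsizeof(small_gen)
gen_size_huge = sys.getsizeof(huge_gen)
# Only the list's own storage; the int objects it points to are not included.
list_size = sys.getsizeof(number_list)

print("generator for n=10:   ", gen_size_small, "bytes")
print("generator for n=10**9:", gen_size_huge, "bytes")
print("list for n=100000:    ", list_size, "bytes")
```

The generator object has the same size whatever n is, because it only has to hold n, num, and its position in the code.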

I think a concrete look at what your first and second functions do under the hood will help you understand what is going on.

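Let's use locals() to print what Python normally hides while each function is running:

locals(): Update and return a dictionary representing the current local symbol table. Free variables are returned by locals() when it is called in function blocks, but not in class blocks.

>>> def firstn(n):
...     '''Return a list of the numbers from 0 to n - 1.'''
...     num, nums = 0, []
...     while num < n:
...         nums.append(num)
...         num += 1
...         print(locals())
...     return nums
...
>>> firstn(10)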

This prints:

{'nums': [0], 'n': 10, 'num': 1}
{'nums': [0, 1], 'n': 10, 'num': 2}
{'nums': [0, 1, 2], 'n': 10, 'num': 3}
{'nums': [0, 1, 2, 3], 'n': 10, 'num': 4}
{'nums': [0, 1, 2, 3, 4], 'n': 10, 'num': 5}
{'nums': [0, 1, 2, 3, 4, 5], 'n': 10, 'num': 6}
{'nums': [0, 1, 2, 3, 4, 5, 6], 'n': 10, 'num': 7}
{'nums': [0, 1, 2, 3, 4, 5, 6, 7], 'n': 10, 'num': 8}
{'nums': [0, 1, 2, 3, 4, 5, 6, 7, 8], 'n': 10, 'num': 9}
{'nums': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 'n': 10, 'num': 10}
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

But:

>>> def firstn(n):
...     num = 0
...     while num < n:
...         yield num
...         num += 1
...         print(locals())
...
>>> list(firstn(10))

This prints:

{'n': 10, 'num': 1}
{'n': 10, 'num': 2}
{'n': 10, 'num': 3}
{'n': 10, 'num': 4}
{'n': 10, 'num': 5}
{'n': 10, 'num': 6}
{'n': 10, 'num': 7}
{'n': 10, 'num': 8}
{'n': 10, 'num': 9}
{'n': 10, 'num': 10}
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

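So, as you can see, the second function (your generator) does not care about past results or upcoming ones. It only remembers the latest value (the one tested by the while loop condition) and produces results on demand.

In your first example, however, the function has to store and remember every step, along with the value used to stop the while loop, and only then returns the final result. Compared with the generator, this makes the process much more expensive.

This example may help you see how and when the items are computed: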

def firstn(n):
    num = 0
    while num < n:
        yield num
        print('incrementing num')
        num += 1

gen = firstn(n=10)
a0 = next(gen)
print(a0)       # 0
a1 = next(gen)  # prints: incrementing num
print(a1)       # 1
a2 = next(gen)  # prints: incrementing num
print(a2)       # 2

The function does not return; it keeps its internal state (its stack frame) and resumes from the point where it last yielded.
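That saved state can be inspected directly. The sketch below assumes CPython, where a generator object exposes its suspended frame as gi_frame; the variable names are chosen only for this illustration.

```python
def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

gen = firstn(3)

first = next(gen)                                # runs until the first yield
state_after_first = dict(gen.gi_frame.f_locals)  # snapshot of the paused frame

second = next(gen)                               # resumes, increments num, yields again
state_after_second = dict(gen.gi_frame.f_locals)

remaining = list(gen)                            # drain whatever is left
frame_after_exhaustion = gen.gi_frame            # the frame is released once finished

print(first, state_after_first)
print(second, state_after_second)
print(remaining, frame_after_exhaustion)
```

At every pause the frame holds only n and the current num; none of the earlier values are kept anywhere.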

A for loop simply calls next repeatedly.
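To make that concrete, here is a sketch of roughly what a for loop does, written out by hand with iter, next, and StopIteration. It is a simplified model rather than the interpreter's actual implementation, and the list names are only for the comparison.

```python
def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

# The ordinary for loop.
collected_by_for = []
for value in firstn(5):
    collected_by_for.append(value)

# Roughly the same thing, spelled out by hand.
collected_by_next = []
iterator = iter(firstn(5))       # for a generator, iter() returns the generator itself
while True:
    try:
        value = next(iterator)   # resume the generator until its next yield
    except StopIteration:        # raised when the generator function finishes
        break
    collected_by_next.append(value)

print(collected_by_for)
print(collected_by_next)
```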

The next value is computed on demand; not all possible values need to be in memory.

If the sum function were written in Python, it might look something like this:

def sum(iterable, start=0):
    part_sum = start
    for element in iterable:
        part_sum += element
    return part_sum

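(Of course, this function differs from the real sum in many ways, but it works very similarly for your example.)

If sum(all_numbers) is called with a generator, the variable element holds only the current element, and the variable part_sum holds only the sum of the numbers seen so far. This way the whole sum can be computed with just two variables, which obviously needs far less space than a list storing all 100000000 numbers. As others have pointed out, the generator itself only stores its current state and resumes the computation from there when next is called on it, so in your example only n and num need to be stored.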
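If you want to check this yourself, a minimal sketch using the standard tracemalloc module could look like the following. The names firstn_list, firstn_gen, and peak_bytes are made up for this example, n is kept small so it runs quickly, and the exact byte counts will vary by Python version and platform.

```python
import tracemalloc

def firstn_list(n):
    """Build and return the full list 0 .. n - 1."""
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums

def firstn_gen(n):
    """Yield 0 .. n - 1 one value at a time."""
    num = 0
    while num < n:
        yield num
        num += 1

def peak_bytes(make_iterable, n):
    """Return (total, peak traced memory in bytes) for sum(make_iterable(n))."""
    tracemalloc.start()
    total = sum(make_iterable(n))
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return total, peak

n = 100_000
list_total, list_peak = peak_bytes(firstn_list, n)
gen_total, gen_peak = peak_bytes(firstn_gen, n)

print("list:     ", list_total, list_peak, "bytes at peak")
print("generator:", gen_total, gen_peak, "bytes at peak")
```

Both versions produce the same total, but only the list version has to keep every number alive at once.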
