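I am trying to compute the product of two iterators "efficiently". Each of them takes some time to yield each result, and there are many results. Since itertools.product seems to compute all the items first, it takes quite a long time to get the first pair.
An MCVE: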
import time
from itertools import product

def costlygen(n):
    for i in range(n):
        time.sleep(1)
        yield i

g1 = costlygen(5)
g2 = costlygen(5)
now = time.time()
g = product(g1,g2)
for x in g:
    print(x)
    print(time.time()-now)
The output is:
(0, 0)
10.027392148971558
(0, 1)
10.027477979660034
(0, 2)
10.027528285980225
...
(4, 3)
10.028220176696777
(4, 4)
10.028250217437744
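The output makes it clear that product consumes every item from each generator up front, so the first result arrives only after 10 seconds, although it could have been available after just 2 seconds.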
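One way to confirm this without waiting on time.sleep is to record every item as it is pulled from the inputs. The sketch below assumes CPython, where product drains its arguments when it is constructed; the helper logged and the list log are hypothetical names introduced only for this illustration.

```python
from itertools import product

def logged(name, n, log):
    """Yield 0..n-1, appending (name, i) to `log` each time an item is produced."""
    for i in range(n):
        log.append((name, i))
        yield i

log = []
pairs = product(logged("a", 3, log), logged("b", 3, log))

# No pair has been requested yet, but both inputs are already fully consumed.
consumed_before_first_pair = len(log)
first = next(pairs)
print(consumed_before_first_pair, first)
```

With the one-second sleep of costlygen, those up-front pulls are what account for the 10-second wait before the first pair.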
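Is there a way to get each result as soon as it can be produced?
One possible solution caches the items that have already been produced in a gone list: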
import time
from itertools import product

def costlygen(n):
    for i in range(n):
        time.sleep(1)
        yield i

def simple_product(it1, it2):
    gone = []  # items of it2 that have already been produced
    x = next(it1)
    # First pass: pull from it2 lazily, remembering each item.
    for y in it2:
        gone.append(y)
        yield x, y
    # Later passes: it2 is exhausted, so replay the cached items.
    for x in it1:
        for y in gone:
            yield x, y

def complex_product(*iterables):
    if len(iterables) == 2:
        yield from simple_product(*iterables)
        return
    it1, *rest = iterables
    gone = []  # tuples already produced from the remaining iterables
    x = next(it1)
    for t in complex_product(*rest):
        gone.append(t)
        yield (x,) + t
    for x in it1:
        for t in gone:
            yield (x,) + t

g1 = costlygen(5)
g2 = costlygen(5)
g3 = costlygen(5)
now = time.time()
g = complex_product(g1,g2,g3)
for x in g:
    print(x)
    print(time.time()-now)
Timings:
(0, 0, 0)
3.002698898315429 # as soon as possible
(0, 0, 1)
4.003920316696167 # after one second
(0, 0, 2)
5.005135536193848
(0, 0, 3)
6.006361484527588
(0, 0, 4)
7.006711721420288
(0, 1, 0)
8.007975101470947
(0, 1, 1)
8.008066892623901 # the third generator's items are already cached, so (*, *, 1) follows (*, *, 0) instantly
(0, 1, 2)
8.008140802383423
(0, 1, 3)
8.00821304321289
(0, 1, 4)
8.008255004882812
(0, 2, 0)
9.009203910827637
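The same caching idea can be sketched for any number of arbitrary iterables, not only iterators. The function lazy_product below is a hypothetical variant, not part of itertools. It assumes that an empty input should give an empty product, and that calling it with no arguments should yield a single empty tuple, as itertools.product does. It also guards the initial next call, because a StopIteration escaping from inside a generator is turned into a RuntimeError on Python 3.7+ (PEP 479).

```python
from itertools import product

def lazy_product(*iterables):
    """Yield tuples in the same order as itertools.product, pulling input items on demand."""
    if not iterables:
        yield ()
        return
    head, *tail = iterables
    head = iter(head)  # accept lists, ranges, strings, not only iterators
    try:
        x = next(head)
    except StopIteration:
        return  # an empty input means an empty product
    cache = []  # tuples already produced from the remaining iterables
    for t in lazy_product(*tail):
        cache.append(t)
        yield (x,) + t
    # The tail is exhausted now; replay the cache for every later head item.
    for x in head:
        for t in cache:
            yield (x,) + t

# Sanity check against itertools.product on small, cheap inputs.
expected = list(product(range(3), "ab", [True, False]))
result = list(lazy_product(range(3), "ab", [True, False]))
print(result == expected)
```

As in the answer's code, the cache grows to hold the full product of every iterable except the first, so this variant reduces the wait before the first result rather than the memory used.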