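I've been reading for a couple of hours, trying to understand membership testing and speed, because I fell down that rabbit hole. I thought I had it figured out until I ran my own little timeit test.
Here's the code:
import timeit
range_ = range(20, -1, -1)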
w = timeit.timeit('0 in {seq}'.format(seq=list(range_)))
x = timeit.timeit('0 in {seq}'.format(seq=tuple(range_)))
y = timeit.timeit('0 in {seq}'.format(seq=set(range_)))
z = timeit.timeit('0 in {seq}'.format(seq=frozenset(range_)))
print('list:', w)
print('tuple:', x)
print('set:', y)
print('frozenset:', z)
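Here are the results:
list: 0.3762843
tuple: 0.38087859999999996
set: 0.06568490000000005
frozenset: 1.5114070000000002
It makes sense that list and tuple take about the same time. I expected set and frozenset to take the same time as well, but frozenset is extremely slow, even compared to list?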
Changing the code to the following still gives me similar results:
list_ = list(range(20, -1, -1))
tuple_ = tuple(range(20, -1, -1))
set_ = set(range(20, -1, -1))
frozenset_ = frozenset(range(20, -1, -1))
w = timeit.timeit('0 in {seq}'.format(seq=list_))
x = timeit.timeit('0 in {seq}'.format(seq=tuple_))
y = timeit.timeit('0 in {seq}'.format(seq=set_))
z = timeit.timeit('0 in {seq}'.format(seq=frozenset_))
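It's not the membership test that takes the time, but the construction of the object. Formatting the container into the statement string means timeit evaluates its repr on every iteration.
Consider the following: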
import timeit
list_ = list(range(20, -1, -1))
tuple_ = tuple(range(20, -1, -1))
set_ = set(range(20, -1, -1))
frozenset_ = frozenset(range(20, -1, -1))
w = timeit.timeit('0 in list_', globals=globals())
x = timeit.timeit('0 in tuple_', globals=globals())
y = timeit.timeit('0 in set_', globals=globals())
z = timeit.timeit('0 in frozenset_', globals=globals())
print('list:', w)
print('tuple:', x)
print('set:', y)
print('frozenset:', z)
With that, I get the following timings in Python 3.5:
list: 0.28041897085495293
tuple: 0.2775509520433843
set: 0.0552431708201766
frozenset: 0.05547476885840297
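To see how the cost splits between building the container and testing membership, the two can be timed separately. This is a minimal sketch rather than a rigorous benchmark: the `number` value and the names `stmt_build`, `build`, `lookup` and `fs` are my own choices, and absolute timings will vary by machine and Python version.

```python
import timeit

# Statement that rebuilds the frozenset from its repr on every iteration,
# which is what the string-formatted benchmark ends up doing.
stmt_build = '0 in {seq}'.format(seq=frozenset(range(20, -1, -1)))

# Time the formatted statement: construction plus membership test.
build = timeit.timeit(stmt_build, number=100000)

# Time only the membership test; the frozenset is created once in setup.
lookup = timeit.timeit(
    '0 in fs',
    setup='fs = frozenset(range(20, -1, -1))',
    number=100000,
)

print('construct + lookup:', build)
print('lookup only:', lookup)
```

Passing the object in through `setup` does the same job as `globals=globals()` and also works on Python versions older than 3.5, where `timeit.timeit` has no `globals` parameter.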
Disassembling the code you are benchmarking shows why frozenset is so slow:
import dis
def print_dis(code):
    print('{code}:'.format(code=code))
    dis.dis(code)
range_ = range(20, -1, -1)
print_dis('0 in {seq}'.format(seq=list(range_)))
print_dis('0 in {seq}'.format(seq=tuple(range_)))
print_dis('0 in {seq}'.format(seq=set(range_)))
print_dis('0 in {seq}'.format(seq=frozenset(range_)))
The output is pretty self-explanatory:
0 in [20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]:
1 0 LOAD_CONST 0 (0)
3 LOAD_CONST 21 ((20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0))
6 COMPARE_OP 6 (in)
9 RETURN_VALUE
0 in (20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0):
1 0 LOAD_CONST 0 (0)
3 LOAD_CONST 21 ((20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0))
6 COMPARE_OP 6 (in)
9 RETURN_VALUE
0 in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}:
1 0 LOAD_CONST 0 (0)
3 LOAD_CONST 21 (frozenset({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}))
6 COMPARE_OP 6 (in)
9 RETURN_VALUE
0 in frozenset({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}):
1 0 LOAD_CONST 0 (0)
3 LOAD_NAME 0 (frozenset)
6 LOAD_CONST 0 (0)
9 LOAD_CONST 1 (1)
12 LOAD_CONST 2 (2)
15 LOAD_CONST 3 (3)
18 LOAD_CONST 4 (4)
21 LOAD_CONST 5 (5)
24 LOAD_CONST 6 (6)
27 LOAD_CONST 7 (7)
30 LOAD_CONST 8 (8)
33 LOAD_CONST 9 (9)
36 LOAD_CONST 10 (10)
39 LOAD_CONST 11 (11)
42 LOAD_CONST 12 (12)
45 LOAD_CONST 13 (13)
48 LOAD_CONST 14 (14)
51 LOAD_CONST 15 (15)
54 LOAD_CONST 16 (16)
57 LOAD_CONST 17 (17)
60 LOAD_CONST 18 (18)
63 LOAD_CONST 19 (19)
66 LOAD_CONST 20 (20)
69 BUILD_SET 21
72 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
75 COMPARE_OP 6 (in)
78 RETURN_VALUE
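This is because, of the 4 data types the range object is converted to, frozenset is the only one in Python 3 whose repr requires a name lookup when evaluated as a literal. The other three compile down to a single constant, whereas the frozenset expression has to look up the name, rebuild the 21-element set, and call frozenset on every iteration. Name lookups are expensive because they require hashing the name string and then searching the local, global, and built-in namespaces in turn: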
>>> repr(list(range(3)))
'[0, 1, 2]'
>>> repr(tuple(range(3)))
'(0, 1, 2)'
>>> repr(set(range(3)))
'{0, 1, 2}'
>>> repr(frozenset(range(3)))
'frozenset({0, 1, 2})' # requires a name lookup when evaluated by timeit
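The name lookup can also be confirmed without reading the whole disassembly, by compiling each formatted statement and checking which names its code object refers to. This is a small sketch; `names_used` is a hypothetical helper written for this illustration, not part of any library.

```python
def names_used(source):
    """Return the names an expression looks up at run time."""
    # co_names holds the global/builtin names referenced by the compiled code.
    return compile(source, '<benchmark>', 'eval').co_names

range_ = range(20, -1, -1)
for kind in (list, tuple, set, frozenset):
    # Same statement construction as in the benchmark above.
    stmt = '0 in {seq}'.format(seq=kind(range_))
    print(kind.__name__, names_used(stmt))
```

Only the frozenset statement should report a non-empty tuple of names, since the other three reprs are plain literals.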
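In Python 2, a set also requires a name lookup when converted with repr, which is why @NPE reported in the comments that there is little performance difference between frozenset and set in Python 2: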
>>> repr(set(range(3)))
'set([0, 1, 2])'