Memory optimization / interning in Python



Consider the following Python output:

>>> x=254
>>> y=254
>>> id(x)
2039624591696  --> same as that of y
>>> id(y)
2039624591696  --> same as that of x
>>> x=300
>>> y=300
>>> id(x)
2039667477936  --> different from that of y once the value exceeds 256
>>> id(y)
2039667477968  --> different from that of x
>>> str7='g'*4096
>>> id(str7)
2039639279632  ---> same as that of str8
>>> str8='g'*4096
>>> id(str8)
2039639279632 ---> same as that of str7
>>> str9='g'*4097
>>> id(str9)
2039639275392  --> same content as str10, but a different address
>>> str10='g'*4097
>>> id(str10)
2039639337008

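Here, when I define str9 as 'g'*4097, it gets a different memory address from str10, so there seems to be some limit at work. My question is how to find out what these limits are for a particular Python version.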

Which integers and strings get interned automatically in Python is implementation-specific, and it has changed between versions.

Here are some principles and limits that seem to hold at least for my current installation (CPython 3.10.7):

All integers in the range [-5, 256] are automatically interned (CPython keeps them in a preallocated small-integer cache):

>>> x = 256
>>> y = 256
>>> x is y
True
>>> x = 257
>>> y = 257
>>> x is y
False
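
To check the integer range on a particular interpreter rather than take it on trust, something like the following sketch could be used. The helper names `is_cached` and `cached_range` are made up for this illustration, and the approach assumes a CPython-like implementation where `int(str(n))` builds the value at runtime, so any sharing comes from the interpreter's cache and not from the compiler reusing a constant.

```python
def is_cached(n: int) -> bool:
    """Return True if two ints built independently at runtime are one object."""
    a = int(str(n))  # constructed at runtime, not taken from a shared constant
    b = int(str(n))  # 'a' is still alive, so a fresh object cannot reuse its address
    return a is b


def cached_range(search_limit: int = 10_000) -> tuple:
    """Walk outward from 0 until the interpreter stops returning a shared object."""
    low = 0
    while low > -search_limit and is_cached(low - 1):
        low -= 1
    high = 0
    while high < search_limit and is_cached(high + 1):
        high += 1
    return low, high


low, high = cached_range()
print(f"small-integer cache appears to cover [{low}, {high}]")
```

On the CPython versions discussed here this should report -5 and 256; on other implementations the result may simply hit `search_limit`, which only means identity is not a useful probe there.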

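In CPython (version >= 3.7), strings are automatically interned if they are <= 4096 characters long and consist only of ASCII letters, digits, and underscores. (In CPython versions <= 3.6, the limit was 20 characters.)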

>>> x = "foo"
>>> y = "foo"
>>> x is y
True
>>> x = "foo bar"
>>> y = "foo bar"
>>> x is y
False
>>> x = "A" * 4096
>>> y = "A" * 4096
>>> x is y
True
>>> x = "A" * 4097
>>> y = "A" * 4097
>>> x is y
False
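
As for finding the string limit on a given version, one option is to probe it empirically. The sketch below is only an illustration: `same_object` and `folding_limit` are hypothetical helpers, and the binary search assumes that sharing is monotonic in the length (shared up to some n, not shared beyond it), which matches the behaviour shown above but is not guaranteed by the language.

```python
def same_object(expr: str) -> bool:
    """Compile and evaluate expr twice; True if both results are one object."""
    first = eval(expr)   # kept alive so the second result cannot reuse its address
    second = eval(expr)  # compiled separately from the first evaluation
    return first is second


def folding_limit(char: str = "A", upper: int = 1 << 16) -> int:
    """Largest n (up to upper) for which char * n comes back as a shared object.

    Assumes sharing is monotonic in n (an assumption, not a documented rule).
    """
    if same_object(f'"{char}" * {upper}'):
        return upper  # no limit found below the search bound
    lo, hi = 1, upper  # lo is assumed shared, hi is known not to be shared
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if same_object(f'"{char}" * {mid}'):
            lo = mid
        else:
            hi = mid
    return lo


print("largest shared length:", folding_limit())
```

Running this under each interpreter you care about gives the threshold for that build, without relying on numbers remembered from another version.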

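In some versions, the rule was apparently to intern strings that look like valid identifiers (e.g., not strings starting with a digit), but that does not seem to be the rule in my installation: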

>>> x = "5myvar"
>>> y = "5myvar"
>>> x is y
True
>>> 5myvar = 5
  File "<stdin>", line 1
    5myvar = 5
    ^
SyntaxError: invalid decimal literal

Also, strings are interned at compile time, not at runtime:

>>> x = "bar"
>>> y = "".join(["b","a","r"])
>>> x
'bar'
>>> y
'bar'
>>> x is y
False
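
One way to see the compile-time side of this is to look at the constants the compiler stores in a code object. The following is a rough sketch; `string_constants` is a helper invented for this example, and the exact contents of `co_consts` vary between CPython versions, so only the presence or absence of 'bar' is meaningful.

```python
def string_constants(source: str) -> list:
    """Return the string constants the compiler stored for the given source."""
    code = compile(source, "<demo>", "exec")
    return [c for c in code.co_consts if isinstance(c, str)]


# Concatenation of literals is folded by the compiler, so 'bar' exists
# as a constant before the code ever runs.
print(string_constants('x = "b" + "a" + "r"'))

# The join() call only happens at runtime, so no 'bar' constant is stored.
print(string_constants('y = "".join(["b", "a", "r"])'))
```

Only strings that exist as constants at compile time are candidates for automatic interning, which is why the joined string above ends up as a separate object.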

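Relying on automatic string interning is risky (it depends on the implementation and may change). To make sure a string is interned, you can use the sys.intern() function: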

>>> x = "a string which would not normally be interned!"
>>> y = "a string which would not normally be interned!"
>>> x is y
False
>>> import sys
>>> x = sys.intern(x)
>>> y = sys.intern(y)
>>> x is y
True
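
The same call is useful for strings created at runtime, for example tokens read from input, where many equal strings would otherwise be stored separately. A minimal sketch, with `intern_all` as a hypothetical helper name:

```python
import sys


def intern_all(words):
    """Return the canonical interned object for each word."""
    return [sys.intern(w) for w in words]


raw = "spam eggs spam eggs spam".split()  # strings built at runtime
tokens = intern_all(raw)

# Equal tokens now refer to one shared object each.
print(tokens[0] is tokens[2])
print(len({id(t) for t in tokens}))  # distinct objects after interning
```

After interning, equal tokens are guaranteed to be the same object, so the identity check holds regardless of the automatic rules described above.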
