Consider the following Python output:
>>> x=254
>>> y=254
>>> id(x)
2039624591696 --> same as that of y
>>> id(y)
2039624591696 --> same as that of x
>>> x=300
>>> y=300
>>> id(x)
2039667477936 --> differs from y once the value exceeds 256
>>> id(y)
2039667477968
>>> str7='g'*4096
>>> id(str7)
2039639279632 ---> same as that of str8
>>> str8='g'*4096
>>> id(str8)
2039639279632 ---> same as that of str7
>>> str9='g'*4097
>>> id(str9)
2039639275392 --> same content as str10, but a different address
>>> str10='g'*4097
>>> id(str10)
2039639337008
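Here, when I define str9 as 'g'*4097, it gets a different memory address than str10, so there seems to be some limit. My question is how to find these limits for a particular Python version.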
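Which integers and strings are automatically interned in Python is implementation-specific and has changed between versions.
Here are some principles and limits that seem to hold at least for my current installation (CPython 3.10.7):
All integers in the range [-5, 256] are automatically interned: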
>>> x = 256
>>> y = 256
>>> x is y
True
>>> x = 257
>>> y = 257
>>> x is y
False
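To check the integer range on whatever interpreter you are running, here is a minimal sketch. The helper names `is_cached` and `cached_int_range` are made up for this example. It builds each integer at run time with `int(str(n))`, on the assumption that this keeps the compiler from merging identical constants, and then walks outward from 0 until two separately built values stop being the same object. This is CPython-specific behaviour; other implementations may report something quite different.

```python
def is_cached(n):
    """Return True if two run-time constructions of n are the same object."""
    a = int(str(n))  # built at run time, so no compile-time constant merging
    b = int(str(n))  # a is still alive here, so b cannot reuse its address
    return a is b


def cached_int_range(limit=100_000):
    """Walk outward from 0 and return the (low, high) bounds of shared ints.

    `limit` is an arbitrary safety bound for this sketch, so the loops also
    terminate on implementations where every int compares identical.
    """
    high = 0
    while high < limit and is_cached(high + 1):
        high += 1
    low = 0
    while low > -limit and is_cached(low - 1):
        low -= 1
    return low, high


if __name__ == "__main__":
    print(cached_int_range())
```

On the installation described above, the result should agree with the [-5, 256] range; on other versions, treat whatever it prints as an observation rather than a guarantee.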
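CPython (version >= 3.7) also automatically interns a string if it is <= 4096 characters long and consists only of ASCII letters, digits, and underscores. (In CPython versions <= 3.6, the limit was 20 characters.)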
>>> x = "foo"
>>> y = "foo"
>>> x is y
True
>>> x = "foo bar"
>>> y = "foo bar"
>>> x is y
False
>>> x = "A" * 4096
>>> y = "A" * 4096
>>> x is y
True
>>> x = "A" * 4097
>>> y = "A" * 4097
>>> x is y
False
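The length limit can be probed the same way instead of being looked up. The sketch below uses hypothetical helpers, `literal_is_shared` and `largest_shared_repeat`. It compiles the same expression twice as two separate code objects, which is meant to mimic typing it on two separate REPL lines, and then binary-searches the largest repeat count that still yields one shared object. It assumes sharing is monotonic in the length (shared up to some size, never beyond it), which matches the transcript above but is not guaranteed.

```python
def literal_is_shared(expr):
    """Compile expr twice, separately, and report whether both results are one object."""
    first = eval(compile(expr, "<probe-1>", "eval"))
    second = eval(compile(expr, "<probe-2>", "eval"))
    return first is second


def largest_shared_repeat(char="A", upper=1 << 16):
    """Binary-search the largest n for which `"<char>" * n` is shared.

    Assumption of this sketch: sharing is monotonic in n. `upper` is an
    arbitrary search bound; if even that length is shared, it is returned.
    """
    if literal_is_shared(f'"{char}" * {upper}'):
        return upper
    lo, hi = 1, upper  # invariant: length lo is shared, length hi is not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if literal_is_shared(f'"{char}" * {mid}'):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    print(largest_shared_repeat())
```

Note that running the REPL examples as a single script can give different answers, because equal constants compiled together may be stored only once. That is why the probe compiles each copy on its own.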
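In some versions, the rule is apparently to intern strings that look like valid identifiers (for example, strings that do not start with a digit), but that does not seem to be the rule in my installation: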
>>> x = "5myvar"
>>> y = "5myvar"
>>> x is y
True
>>> 5myvar = 5
File "<stdin>", line 1
5myvar = 5
^
SyntaxError: invalid decimal literal
Also, strings are interned at compile time, not at run time:
>>> x = "bar"
>>> y = "".join(["b","a","r"])
>>> x
'bar'
>>> y
'bar'
>>> x is y
False
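One way to see the compile-time side of this is to look at the constants stored in a code object. The following is only an illustrative sketch (the function name `compile_time_vs_run_time` is invented here): the literal "bar" is the very object held in `co_consts`, while the string assembled by `join` is created when the code runs, so on CPython it is an equal but distinct object.

```python
def compile_time_vs_run_time():
    """Compare a string literal with an equal string built at run time."""
    code = compile('"bar"', "<demo>", "eval")
    literal = eval(code)              # the constant stored in the code object
    built = "".join(["b", "a", "r"])  # assembled while the program runs
    return {
        "literal_in_co_consts": any(c is literal for c in code.co_consts),
        "equal": built == literal,
        "same_object": built is literal,
    }


if __name__ == "__main__":
    print(compile_time_vs_run_time())
```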
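Relying on automatic string interning is risky (it depends on the implementation and may change). To make sure a string is interned, you can use the sys.intern() function: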
>>> x = "a string which would not normally be interned!"
>>> y = "a string which would not normally be interned!"
>>> x is y
False
>>> import sys
>>> x = sys.intern(x)
>>> y = sys.intern(y)
>>> x is y
True
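A typical use is interning strings that only exist at run time, such as tokens split out of input. The sketch below (with a made-up helper `intern_all`) interns every word so that repeated words end up as a single object. For ordinary comparisons, `==` remains the safe choice; `is` only makes sense once you have interned both sides yourself.

```python
import sys


def intern_all(words):
    """Return the same words, each replaced by its interned copy."""
    return [sys.intern(w) for w in words]


if __name__ == "__main__":
    raw = "spam eggs spam".split()  # the pieces are created at run time
    tokens = intern_all(raw)
    print(tokens[0] is tokens[2])   # both "spam" entries are now one object
    print(tokens == raw)            # the contents are unchanged
```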