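dill updates the class definition of the dilled/undilled object itself, but it does not update the class definitions of the objects that object contains.
pickle updates the class definitions in both cases.
Why doesn't dill follow the same behavior as pickle?
pickle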
import os
import pickle
import tempfile
from dataclasses import dataclass, field


def pickle_save(x):
    # delete=False keeps the file on disk after the with-block closes it
    with tempfile.NamedTemporaryFile(delete=False) as f:
        pickle.dump(x, f)
    return f


def pickle_load(f):
    with open(f.name, "rb") as f:
        x = pickle.load(f)
    os.unlink(f.name)
    return x


@dataclass
class B:
    attribute: str = "old"

    def method_1(self):
        print(f"old class: {self.attribute=}")


@dataclass
class A:
    attribute_1: str = "old"
    instances_of_B: list[B] = field(default_factory=list)

    def method_1(self):
        print(f"old class: {self.attribute_1=}, {self.instances_of_B=}")

    def add_b_instance(self):
        self.instances_of_B.append(B())


old_a = A()
old_a.add_b_instance()
old_a.method_1()
old_a.instances_of_B[0].method_1()
print(f"{old_a = }")
temp_file = pickle_save(old_a)

# old_a has been saved to file
# Next we update our class definitions
# then load old_a from file,
# and see whether the added methods exist


@dataclass
class A:
    attribute_1: str = "new"
    attribute_2: str = "new attribute 2"
    instances_of_B: list[B] = field(default_factory=list)

    def method_1(self):
        print(f"new class: {self.attribute_1=}, {self.instances_of_B=}")

    def method_2(self):
        print("this method from A did not exist before")
        print(f"this attribute did not exist before: {self.attribute_2=}")


@dataclass
class B:
    attribute: str = "new"

    def method_1(self):
        print(f"new class: {self.attribute=}")

    def method_2(self):
        print("this method from B did not exist before")


new_a = pickle_load(temp_file)
print(f"{new_a=}")
new_a.method_1()
new_a.method_2()
new_a.instances_of_B[0].method_1()
new_a.instances_of_B[0].method_2()
After loading, the new method_2 is available on both the pickled A instance and the B instance it contains:
old class: self.attribute_1='old', self.instances_of_B=[B(attribute='old')]
old class: self.attribute='old'
old_a = A(attribute_1='old', instances_of_B=[B(attribute='old')])
new_a=A(attribute_1='old', attribute_2='new attribute 2', instances_of_B=[B(attribute='old')])
new class: self.attribute_1='old', self.instances_of_B=[B(attribute='old')]
this method from A did not exist before
this attribute did not exist before: self.attribute_2='new attribute 2'
new class: self.attribute='old'
this method from B did not exist before
dill
# Same script as above, with only the pickle import replaced:
import dill as pickle
After loading, only the pickled A instance has the new method_2; the B instance it contains does not:
old class: self.attribute_1='old', self.instances_of_B=[B(attribute='old')]
old class: self.attribute='old'
old_a = A(attribute_1='old', instances_of_B=[B(attribute='old')])
new_a=A(attribute_1='old', attribute_2='new attribute 2', instances_of_B=[B(attribute='old')])
new class: self.attribute_1='old', self.instances_of_B=[B(attribute='old')]
this method from A did not exist before
this attribute did not exist before: self.attribute_2='new attribute 2'
old class: self.attribute='old'
Traceback (most recent call last):
File "c:question_dill_pickle.py", line 78, in <module>
new_a.instances_of_B[0].method_2()
AttributeError: 'B' object has no attribute 'method_2'
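I'm the dill author. dill doesn't follow the behavior of pickle here because pickle serializes classes by reference (i.e. it has no choice but to use whatever class definition is in use in the current context), while dill stores the class definition together with the pickled instance... so you have a choice of behavior. The default is to use the stored class, so that you get back what you asked for (more often than not, this is what's wanted). However, if you want to ignore the stored class and use the updated definition, you can use the ignore=True keyword in load (or change it globally in dill.settings).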
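To make "serializes classes by reference" concrete, here is a minimal sketch using only the standard library. The module name reference_demo and the OLD_B/NEW_B source strings are made up for this illustration; a throwaway module is used instead of __main__ so that the example runs the same way however it is launched.

```python
import pickle
import sys
import types

# Hypothetical throwaway module that will own class B (stands in for __main__).
demo = types.ModuleType("reference_demo")
sys.modules[demo.__name__] = demo

OLD_B = """
class B:
    def method_1(self):
        return "old class"
"""

NEW_B = """
class B:
    def method_1(self):
        return "new class"

    def method_2(self):
        return "this method from B did not exist before"
"""

# Define the old class and pickle a list containing one instance of it.
exec(OLD_B, demo.__dict__)
payload = pickle.dumps([demo.B()])

# The payload names the module and the class, but carries no method code.
names_class = b"reference_demo" in payload
stores_methods = b"method_1" in payload
print(names_class, stores_methods)  # True False

# Redefine the class, then unpickle: pickle looks the class up again by name.
exec(NEW_B, demo.__dict__)
restored = pickle.loads(payload)
first = restored[0]
print(first.method_1())  # new class
print(first.method_2())  # this method from B did not exist before
```

Because the lookup happens by name at load time, every instance in the payload, nested or not, ends up bound to whatever definition currently lives under that name.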
From the docs:
If *ignore=False* then objects whose class is defined in the module *__main__* are updated to reference the existing class in *__main__*, otherwise they are left to refer to the reconstructed type, which may be different.
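The docstring talks about an object being "updated to reference the existing class". As a rough illustration of what that kind of update means, and of why it would reach only the top-level object, the sketch below reassigns `__class__` by hand. This is plain Python, not dill's code; the names OldA, NewA, OldB and NewB are invented so that both generations of each class can coexist in one script.

```python
class OldB:
    def method_1(self):
        return "old B"


class NewB:
    def method_1(self):
        return "new B"

    def method_2(self):
        return "added to B"


class OldA:
    def __init__(self):
        self.instances_of_B = [OldB()]


class NewA:
    def method_2(self):
        return "added to A"


# Stands in for an object that was just reconstructed with its stored classes.
obj = OldA()

# Point only the top-level object at the current class definition.
obj.__class__ = NewA
top_has_method_2 = hasattr(obj, "method_2")
inner_has_method_2 = hasattr(obj.instances_of_B[0], "method_2")
print(top_has_method_2, inner_has_method_2)  # True False

# Contained objects keep their old class until they are rebound as well.
for b in obj.instances_of_B:
    b.__class__ = NewB
inner_after_rebind = hasattr(obj.instances_of_B[0], "method_2")
print(inner_after_rebind)  # True
```

The first print matches the pattern in the question: the outer object gains method_2 while the contained one still raises AttributeError. Rebinding the contained instances by hand, as in the loop, is one possible workaround under these assumptions.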