Suppose I want to initialize the following dataclass:
from dataclasses import dataclass
@dataclass
class Req:
    id: int
    description: str
I can of course do it like this:
data = make_request() # gives me a dict with id and description as well as some other keys.
# {"id": 123, "description": "hello", "data_a": "", ...}
req = Req(data["id"], data["description"])
However, given that the keys I need are always a subset of the dict, can I use dictionary unpacking instead?
req = Req(**data) # TypeError: __init__() got an unexpected keyword argument 'data_a'
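Here is a solution that works generically for any dataclass. It simply filters the input dict to exclude keys that are not names of fields of the class with init == True: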
from dataclasses import dataclass, fields
@dataclass
class Req:
    id: int
    description: str

def classFromArgs(className, argDict):
    # Keep only the keys that name a field accepted by the generated __init__.
    fieldSet = {f.name for f in fields(className) if f.init}
    filteredArgDict = {k : v for k, v in argDict.items() if k in fieldSet}
    return className(**filteredArgDict)

data = {"id": 123, "description": "hello", "data_a": ""}
req = classFromArgs(Req, data)
print(req)
Output:
Req(id=123, description='hello')
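To make the filtering rule concrete, here is a minimal sketch of how the same function treats a field declared with init=False and a dict that lacks a required key. The Item class and its checksum field are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field, fields

@dataclass
class Item:  # hypothetical class, not part of the original question
    id: int
    description: str
    checksum: int = field(init=False, default=0)  # not accepted by __init__

def classFromArgs(className, argDict):
    fieldSet = {f.name for f in fields(className) if f.init}
    filteredArgDict = {k: v for k, v in argDict.items() if k in fieldSet}
    return className(**filteredArgDict)

# "checksum" is dropped because its field has init=False;
# "data_a" is dropped because it is not a field at all.
item = classFromArgs(Item, {"id": 1, "description": "x", "checksum": 99, "data_a": ""})
print(item)  # Item(id=1, description='x', checksum=0)

# Filtering only removes extra keys. A missing required key still
# raises TypeError from the generated __init__.
try:
    classFromArgs(Item, {"id": 1})
    missing_key_raised = False
except TypeError as error:
    missing_key_raised = True
    print("TypeError:", error)
```

Without the f.init check, the "checksum" key would be passed through and __init__ would reject it.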
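UPDATE: Here is a variation of the strategy above. It creates a utility class that caches the result of dataclasses.fields for each dataclass that uses it (prompted by a comment from @rv.kvech raising a performance concern about processing dataclasses.fields again on every call for the same dataclass).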
from dataclasses import dataclass, fields

class DataClassUnpack:
    # Maps each dataclass to the set of field names its __init__ accepts.
    classFieldCache = {}

    @classmethod
    def instantiate(cls, classToInstantiate, argDict):
        if classToInstantiate not in cls.classFieldCache:
            cls.classFieldCache[classToInstantiate] = {f.name for f in fields(classToInstantiate) if f.init}

        fieldSet = cls.classFieldCache[classToInstantiate]
        filteredArgDict = {k : v for k, v in argDict.items() if k in fieldSet}
        return classToInstantiate(**filteredArgDict)

@dataclass
class Req:
    id: int
    description: str

req = DataClassUnpack.instantiate(Req, {"id": 123, "description": "hello", "data_a": ""})
print(req)
req = DataClassUnpack.instantiate(Req, {"id": 456, "description": "goodbye", "data_a": "my", "data_b": "friend"})
print(req)

@dataclass
class Req2:
    id: int
    description: str
    data_a: str

req2 = DataClassUnpack.instantiate(Req2, {"id": 123, "description": "hello", "data_a": "world"})
print(req2)
print("\nHere's a peek at the internals of DataClassUnpack:")
print(DataClassUnpack.classFieldCache)
Output:
Req(id=123, description='hello')
Req(id=456, description='goodbye')
Req2(id=123, description='hello', data_a='world')
Here's a peek at the internals of DataClassUnpack:
{<class '__main__.Req'>: {'description', 'id'}, <class '__main__.Req2'>: {'description', 'data_a', 'id'}}
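The same per-class caching can also be sketched with functools.lru_cache from the standard library instead of a hand-written dict. This is an alternative, not what the answer above uses; the helper names init_field_names and instantiate are made up for the illustration.

```python
from dataclasses import dataclass, fields
from functools import lru_cache

@lru_cache(maxsize=None)
def init_field_names(cls):
    """Return the names of the fields accepted by cls.__init__ (computed once per class)."""
    return frozenset(f.name for f in fields(cls) if f.init)

def instantiate(cls, arg_dict):
    """Build cls from arg_dict, silently ignoring keys that are not init fields."""
    names = init_field_names(cls)
    return cls(**{k: v for k, v in arg_dict.items() if k in names})

@dataclass
class Req:
    id: int
    description: str

first = instantiate(Req, {"id": 123, "description": "hello", "data_a": ""})
second = instantiate(Req, {"id": 456, "description": "goodbye", "data_b": "friend"})
print(first)
print(second)

# The second call is served from the cache: one miss, then one hit.
info = init_field_names.cache_info()
print(info.misses, info.hits)
```

One trade-off to keep in mind: an unbounded lru_cache holds a strong reference to every class it has seen, which matters only if dataclasses are created dynamically in large numbers.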
You can introduce a new function that performs the conversion from a dict to a dataclass:
import inspect
from dataclasses import dataclass

@dataclass
class Req:
    id: int
    description: str

def from_dict_to_dataclass(cls, data):
    return cls(
        **{
            key: (data[key] if val.default == val.empty else data.get(key, val.default))
            for key, val in inspect.signature(cls).parameters.items()
        }
    )

from_dict_to_dataclass(Req, {"id": 123, "description": "hello", "data_a": ""})
# Output: Req(id=123, description='hello')
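Note that the val.default == val.empty condition is needed to check whether the parameter has a default value. If the condition is true, there is no default, so the value must come from the dict; otherwise the value from the dict is used when present and the default serves as a fallback.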
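To illustrate both branches of that condition, here is a small sketch using a dataclass with one required field and one defaulted field. The ReqWithDefault class is hypothetical and exists only for this example.

```python
import inspect
from dataclasses import dataclass

@dataclass
class ReqWithDefault:  # hypothetical class for illustration
    id: int                    # no default: must be present in the dict
    description: str = "n/a"   # default: used when the key is absent

def from_dict_to_dataclass(cls, data):
    return cls(
        **{
            key: (data[key] if val.default == val.empty else data.get(key, val.default))
            for key, val in inspect.signature(cls).parameters.items()
        }
    )

# "description" is missing, so its default is used; "extra" is ignored.
with_default = from_dict_to_dataclass(ReqWithDefault, {"id": 1, "extra": True})
print(with_default)  # ReqWithDefault(id=1, description='n/a')

# A missing required key surfaces as KeyError from data[key],
# not as TypeError from __init__.
try:
    from_dict_to_dataclass(ReqWithDefault, {"description": "x"})
    key_error_raised = False
except KeyError as error:
    key_error_raised = True
    print("KeyError:", error)
```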
Python 3.10 or later:
from dataclasses import dataclass

@dataclass(kw_only=True)
class Req:
    id: int
    description: str

# invalid keys will cause failure
Req(**{"id": 123, "description": "hello"})
A workaround is to intercept the dataclass's __init__ and filter out the unrecognized fields.
from dataclasses import dataclass, fields

@dataclass
class Req1:
    id: int
    description: str

@dataclass
class Req2:
    id: int
    description: str

    # A hand-written __init__ is kept by @dataclass instead of being generated.
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            if key in REQ2_FIELD_NAMES:
                setattr(self, key, value)

# To not re-evaluate the field names for each and every creation of Req2, list them here.
REQ2_FIELD_NAMES = {field.name for field in fields(Req2)}

data = {
    "id": 1,
    "description": "some",
    "data_a": None,
}

try:
    print("Call for Req1:", Req1(**data))
except Exception as error:
    print("Call for Req1:", error)

try:
    print("Call for Req2:", Req2(**data))
except Exception as error:
    print("Call for Req2:", error)

Output:
Call for Req1: __init__() got an unexpected keyword argument 'data_a'
Call for Req2: Req2(id=1, description='some')
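One thing worth checking before adopting a hand-written __init__ like this is what happens to missing keys, since the generated validation is bypassed. The sketch below uses a hypothetical Req3 class with one defaulted field to show the behavior:

```python
from dataclasses import dataclass, fields

@dataclass
class Req3:  # hypothetical class for illustration
    id: int
    description: str = "none"

    def __init__(self, **kwargs):
        names = {f.name for f in fields(self)}
        for key, value in kwargs.items():
            if key in names:
                setattr(self, key, value)

# "description" is never set on the instance; attribute lookup falls back
# to the class attribute that @dataclass keeps for the default value.
partial = Req3(id=1, data_a=None)
print(partial)  # Req3(id=1, description='none')

# A missing required field is NOT reported at construction time...
incomplete = Req3(description="x")

# ...it only fails later, when the attribute is first read (here by __repr__).
try:
    print(incomplete)
    repr_failed = False
except AttributeError as error:
    repr_failed = True
    print("AttributeError:", error)
```

If early failure on missing fields matters, filtering the dict before calling the generated __init__ keeps that check in place.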
Related question:
- How to ignore extra arguments passed to a dataclass