Recursively iterating through a Pydantic model



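Suppose I have a model that I want to preprocess. (For the purposes of this question it does not matter whether it is a Pydantic model or some nested iterable of children; this is a general question.)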

from typing import List

from pydantic import BaseModel


def preprocess(string: str) -> str:
    # Applies some preprocessing and returns the resulting string
    ...


class OtherModel(BaseModel):
    other_id: int
    some_name: str


class DummyModel(BaseModel):
    location_id: int
    other_models: List[OtherModel]
    name: str
    surname: str
    one_other_model: OtherModel

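I want to write a recursive function that walks every attribute of the model and runs some preprocessing function on it. For example, the function could remove a certain letter from each string.

This is as far as I have got, and I do not know how to go on: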

from collections.abc import Iterable


def preprocess_item(request: BaseModel) -> BaseModel:
    for attribute_key, attribute_value in request:
        if isinstance(attribute_value, str):
            setattr(
                request,
                attribute_key,
                _remove_html_tag(getattr(request, attribute_key)),
            )
        elif isinstance(attribute_value, BaseModel):
            preprocess_item(attribute_value)
        elif isinstance(attribute_value, Iterable):
            for item in getattr(request, attribute_key):
                preprocess_item(item)

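This gives me the wrong result; it basically unpacks every value. I want the same request object returned, but with its string fields preprocessed.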

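If you are actually dealing with Pydantic models, I think this is a use case for validators.

There is no real need for recursion here: if you want the preprocessing applied to all of your models (everything that inherits from a common base), you can define the validator on your own base model: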

# Note: `validator` with `each_item` is the Pydantic v1 API.
from pydantic import BaseModel as PydanticBaseModel
from pydantic import validator


def process_string(string: str) -> str:
    return string.replace("a", "")


class BaseModel(PydanticBaseModel):
    # "*" targets every field, pre=True runs before the field's own validation,
    # and each_item=True applies the validator to the items of collection fields.
    @validator("*", pre=True, each_item=True)
    def preprocess(cls, v: object) -> object:
        if isinstance(v, str):
            return process_string(v)
        return v


class OtherModel(BaseModel):
    other_id: int
    some_name: str


class DummyModel(BaseModel):
    location_id: int
    other_models: list[OtherModel]
    name: str
    surname: str
    one_other_model: OtherModel

If you want to be more selective and apply the same validator only to specific models, validators can also be reused:

from pydantic import BaseModel, validator


def preprocess(v: object) -> object:
    if isinstance(v, str):
        return v.replace("a", "")
    return v


class OtherModel(BaseModel):
    other_id: int
    some_name: str

    _preprocess = validator("*", pre=True, allow_reuse=True)(preprocess)


class DummyModel(BaseModel):
    location_id: int
    other_models: list[OtherModel]
    name: str
    surname: str
    one_other_model: OtherModel

    _preprocess = validator(
        "*",
        pre=True,
        each_item=True,
        allow_reuse=True,
    )(preprocess)


class NotProcessed(BaseModel):
    field: str

We can test both versions like this:

if __name__ == "__main__":
    dummy = DummyModel.parse_obj({
        "location_id": 1,
        "other_models": [
            {"other_id": 1, "some_name": "foo"},
            {"other_id": 2, "some_name": "spam"},
        ],
        "name": "bar",
        "surname": "baz",
        "one_other_model": {"other_id": 2, "some_name": "eggs"},
    })
    print(dummy.json(indent=4))

The output is the same in both cases:

{
    "location_id": 1,
    "other_models": [
        {
            "other_id": 1,
            "some_name": "foo"
        },
        {
            "other_id": 2,
            "some_name": "spm"
        }
    ],
    "name": "br",
    "surname": "bz",
    "one_other_model": {
        "other_id": 2,
        "some_name": "eggs"
    }
}
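For the general, non-Pydantic case that the question also mentions (arbitrarily nested containers), a recursive walker is still a reasonable option. The sketch below is one possible approach, not part of the validator solution: it uses standard-library dataclasses as a stand-in for models, and the names `preprocess_strings`, `Other` and `Dummy` are hypothetical. It mutates lists, dicts and dataclass instances in place and returns the same object, which is the behaviour asked for in the question.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Callable, List


def preprocess_strings(obj: Any, func: Callable[[str], str]) -> Any:
    """Apply `func` to every string reachable from `obj`.

    Lists, dicts and dataclass instances are updated in place and returned.
    Strings and tuples are immutable, so a new value is returned for them.
    """
    if isinstance(obj, str):
        return func(obj)
    if isinstance(obj, list):
        for index, item in enumerate(obj):
            obj[index] = preprocess_strings(item, func)
        return obj
    if isinstance(obj, tuple):
        # Rebuilt as a plain tuple; named tuples would lose their type here.
        return tuple(preprocess_strings(item, func) for item in obj)
    if isinstance(obj, dict):
        # Only values of existing keys are reassigned, so the dict size is stable.
        for key, value in obj.items():
            obj[key] = preprocess_strings(value, func)
        return obj
    if is_dataclass(obj) and not isinstance(obj, type):
        for field in fields(obj):
            value = getattr(obj, field.name)
            setattr(obj, field.name, preprocess_strings(value, func))
        return obj
    return obj  # ints, None and other leaf values are left untouched


# Hypothetical dataclass stand-ins for the models in the question.
@dataclass
class Other:
    other_id: int
    some_name: str


@dataclass
class Dummy:
    location_id: int
    other_models: List[Other]
    name: str


dummy = Dummy(1, [Other(1, "foo"), Other(2, "spam")], "bar")
result = preprocess_strings(dummy, lambda s: s.replace("a", ""))
```

Checking `str` before the container types matters, because a string is itself iterable and would otherwise be walked character by character. Frozen dataclasses are not handled by this sketch, since `setattr` would fail on them.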
