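I searched a lot before asking this and I seem to be stuck, so I'm asking here. I know this kind of error can occur when the schema and the object don't match: a field may be missing, or a field may hold a value of a different type.
However, I believe my case is different. My application is simple: it only serializes objects to Avro and deserializes them back.
My dataclasses: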
from time import time
from faker import Faker
from dataclasses import dataclass, field
from dataclasses_avroschema import AvroModel

Faker.seed(0)
fake = Faker()

@dataclass
class Head(AvroModel):
    msgId: str = field()
    msgCode: str = field()

    @staticmethod
    def fakeMe():
        return Head(fake.md5(),
                    fake.pystr(min_chars=5, max_chars=5)
                    )

@dataclass
class Message(AvroModel):
    head: Head = field()
    status: bool = field()

    class Meta:
        namespace = "me.com.Message.v1"

    def fakeMe(self):
        self.head = Head.fakeMe()
        # As posted: this sets a new attribute `bool`; the `status` field keeps its constructor value.
        self.bool = fake.pybool()
Now the script that runs the serialization:
import json, io as mainio
from dto.temp_schema import Message
from avro import schema, datafile, io as avroio
obj = Message(None, True)
obj.fakeMe()
schema_obj = schema.parse(json.dumps(Message.avro_schema_to_python()))
buf = mainio.BytesIO()
writer = datafile.DataFileWriter(buf, avroio.DatumWriter(), schema_obj)
writer.append(obj)
writer.flush()
buf.seek(0)
data = buf.read()
print("serialized avro: ", data)
When I run this, I get the following error:
Traceback (most recent call last):
  File "/Users/office/Documents/projects/msg-bench/scrib.py", line 28, in <module>
    writer.append(obj)
  File "/Users/office/opt/anaconda3/envs/benchenv/lib/python3.9/site-packages/avro/datafile.py", line 329, in append
    self.datum_writer.write(datum, self.buffer_encoder)
  File "/Users/office/opt/anaconda3/envs/benchenv/lib/python3.9/site-packages/avro/io.py", line 771, in write
    raise AvroTypeException(self.writer_schema, datum)
avro.io.AvroTypeException: The datum Message(head=Head(msgId='f112d652ecf13dacd9c78c11e1e7f987', msgCode='cYzVR'), status=True) is not an example of the schema {
  "type": "record",
  "name": "Message",
  "namespace": "me.com.Message.v1",
  "fields": [
    {
      "type": {
        "type": "record",
        "name": "Head",
        "namespace": "me.com.Message.v1",
        "fields": [
          {
            "type": "string",
            "name": "msgId"
          },
          {
            "type": "string",
            "name": "msgCode"
          }
        ],
        "doc": "Head(msgId: str, msgCode: str)"
      },
      "name": "head"
    },
    {
      "type": "boolean",
      "name": "status"
    }
  ],
  "doc": "Message(head: dto.temp_schema.Head, status: bool)"
}
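Note that I generate the schema from the dataclass objects with the help of the Python library dataclasses-avroschema.
Even though I use that same schema, I still cannot serialize the data to Avro.
At this point I'm not sure where I went wrong; I'm new to Avro. Why does this fail to serialize?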
System and Library stats:
Python==3.9.7
avro==1.10.2
avro-python3==1.10.2
dataclasses-avroschema==0.25.1
Faker==9.3.1
fastavro==1.4.5
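The problem is that you are passing a Message object to the standard avro library, which does not expect that (it expects a dictionary instead). The library you are using has a section on serialization that you may want to look at: https://marcosschroh.github.io/dataclasses-avroschema/serialization/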
from dto.temp_schema import Message
obj = Message(None, True)
obj.fakeMe()
print("serialized avro: ", obj.serialize())
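If you would rather keep the standard avro writer from the question, the point about dictionaries can be sketched as follows. This is only an illustration under some assumptions: Head and Message here are plain stand-ins without AvroModel so the snippet runs with the standard library alone, and the conversion uses the standard dataclasses.asdict rather than anything specific to dataclasses-avroschema.

```python
from dataclasses import dataclass, asdict


# Plain stand-ins for the question's classes (AvroModel is left out so this
# sketch has no third-party dependencies).
@dataclass
class Head:
    msgId: str
    msgCode: str


@dataclass
class Message:
    head: Head
    status: bool


obj = Message(Head("f112d652ecf13dacd9c78c11e1e7f987", "cYzVR"), True)

# asdict() recurses into nested dataclasses, so each record in the schema
# becomes a plain dict, which is the shape the avro datum writer checks for.
datum = asdict(obj)
print(datum)

# In the question's script, the failing call would then become:
# writer.append(datum)
```

The nested Head instance is converted as well, so the resulting dictionary mirrors the nested record in the generated schema.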