SQLAlchemy辅助连接模型在奇怪的条件下失败



我有一个奇怪的问题,我根本无法解决。从本质上讲,我有一个完美的模型和系统——除了在非常具体(似乎是任意的)的情况下。

我马上粘贴模型,但这是我的想法。我希望对某些表进行版本控制。这意味着对于给定的表,我将其分成两个表,Master部分包含对象的自然键,Version表包含所有可能更改的相关数据。然后我的一些模型当然有关系,所以我创建了一个连接表来链接版本。

模型如下:

class Versioned(object):
    def __init__(self, **kwargs):
        super(Versioned, self).__init__(**kwargs)
        self.active = True
        self.created_on = datetime.datetime.now()
    active = Column(BOOLEAN)
    created_on = Column(TIMESTAMP, server_default=func.now())
    def __eq__(self, other):
        return self.__class__ == other.__class__ and 
            all([getattr(self, key) == getattr(other, key)
                for key in self.comparison_keys
                ])
    def __ne__(self, other):
        return not self.__eq__(other)
    comparison_keys = []
class Parent(Base):
    __tablename__ = 'parent'
    id = Column(INTEGER, primary_key=True)
    name = Column(TEXT)
    versions = relationship("ParentVersion", back_populates="master")
    children = relationship("Child", back_populates="parent")
    @property
    def current_version(self):
        active_versions = [v for v in self.versions if v.active==True]
        return active_versions[0] if active_versions else None
class ParentVersion(Versioned, Base):
    __tablename__ = 'parent_version'
    id = Column(INTEGER, primary_key=True)
    master_id = Column(INTEGER, ForeignKey(Parent.id))
    address = Column(TEXT)
    master = relationship("Parent", back_populates="versions")
    children = relationship("ChildVersion",
        secondary=lambda : Parent_Child.__table__
    )
class Child(Base):
    __tablename__ = 'child'
    id = Column(INTEGER, primary_key=True)
    parent_id = Column(INTEGER, ForeignKey(Parent.id))
    name = Column(TEXT)
    versions = relationship("ChildVersion", back_populates="master")
    parent = relationship("Parent", back_populates="children")
    @property
    def current_version(self):
        active_versions = [v for v in self.versions if v.active==True]
        return active_versions[0] if active_versions else None

class ChildVersion(Versioned, Base):
    __tablename__ = 'child_version'
    id = Column(INTEGER, primary_key=True)
    master_id = Column(INTEGER, ForeignKey(Child.id))
    age = Column(INTEGER)
    fav_toy = Column(TEXT)
    master = relationship("Child", back_populates="versions")
    parents = relationship("ParentVersion",
        secondary=lambda: Parent_Child.__table__,
    )
    comparison_keys = [
        'age',
        'fav_toy',
    ]
class Parent_Child(Base):
    __tablename__ = 'parent_child'
    id = Column(INTEGER, primary_key=True)
    parent_id = Column(INTEGER, ForeignKey(ParentVersion.id))
    child_id = Column(INTEGER, ForeignKey(ChildVersion.id))

好的,我知道最近的SQLAlchemy模型有一些版本控制的想法,我可能用错了方法。但这非常适合我的用例。所以迁就我,让我们假设模型是好的(在一般意义上-如果有一个小细节导致的错误,将是很好的修复)

现在假设我想插入数据。我从某些来源获得数据,然后将其纳入模型。例如,将内容分成Master/Version,分配子关系,分配版本关系。现在我想将它与数据库中的数据进行比较。对于每个主对象,如果我找到它,我会比较它的版本。如果版本不同,则创建一个新版本。棘手的部分是,如果Child版本不同,我想插入一个新的Parent版本,并更新它的所有关系。也许用代码来解释这部分更有意义。search_parent是我在预解析阶段创建的对象。它有一个版本和子对象,子对象也有版本。

parent_conds = [
    getattr(search_parent.__class__, name) == getattr(search_parent, name)
    for name, column in search_parent.__class__.__mapper__.columns.items()
    if not column.primary_key
]
parent_match = session.query(Parent).filter(*parent_conds).first()
# We are going to make a new version
parent_match.current_version.active=False
parent_match.versions.append(search_parent.current_version)
for search_child in search_parent.children[:]:
    search_child.parent_id = parent_match.id
    search_conds = [
        getattr(search_child.__class__, name) == getattr(search_child, name)
        for name, column in search_child.__class__.__mapper__.columns.items()
        if not column.primary_key
    ]
    child_match = session.query(Child).filter(*search_conds).first()
    if child_match.current_version != search_child.current_version:
        # create a new version: deactivate the old one, insert the new
        child_match.current_version.active=False
        child_match.versions.append(search_child.current_version)
    else:
        # copy the old version to point to the new parent version
        children = parent_match.current_version.children
        children.append(child_match.current_version)
        children.remove(search_child.current_version)
        session.expunge(search_child.current_version)
    session.expunge(search_child)
session.expunge(search_parent)
session.add(parent_match)
session.commit()

好了,再一次,这可能不是完美的,甚至不是最好的方法。但它确实有效。除了,这就是我不明白的。如果我将孩子的age属性更新为整数值0,它将不起作用。如果子对象从0岁开始,我把它改成别的,效果会很好。如果我从某个非零整数开始,并将年龄更新为0,我会得到以下警告:

SAWarning: Object of type <ChildVersion> not in session, add operation   along 'ParentVersion.children' won't proceed (mapperutil.state_class_str(child), operation, self.prop))

插入更新后的版本,但是没有插入到parent_child连接表中。这并不是说它失败了,而是SQLAlchemy确定子对象不存在,因此无法创建连接。但它确实存在,我知道它被插入了

同样,只有在插入age=0的新版本时才会发生这种情况。如果我插入一个其他版本的新版本,它就会像我想要的那样工作。

关于这个bug还有其他奇怪的事情——如果你没有插入足够的子元素(似乎大约12个会触发这个bug),它不会发生,有时取决于其他属性。我不认为我完全理解是什么引起了它的表面区域。

感谢您花时间阅读这篇文章。我有一个完整的工作演示和源数据,我很乐意分享,它只是需要一些设置,所以我不知道它是否适合这篇文章。我希望有人能告诉我该看什么,因为在这一点上我完全不知道。

edit:这是导致警告的完整堆栈跟踪。

  File "repro.py", line 313, in <module>
  load_data(session, second_run)
File "repro.py", line 293, in load_data
  session.commit()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 801, in commit
  self.transaction.commit()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 392, in commit
  self._prepare_impl()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 372, in _prepare_impl
  self.session.flush()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2019, in flush
  self._flush(objects)
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2101, in _flush
  flush_context.execute()
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 373, in execute
  rec.execute(self)
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 487, in execute
  self.dependency_processor.process_saves(uow, states)
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py", line 1053, in process_saves
  False, uowcommit, "add"):
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py", line 1154, in _synchronize
  (mapperutil.state_class_str(child), operation, self.prop))
File "/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 1297, in warn
  warnings.warn(msg, exc.SAWarning, stacklevel=2)
File "repro.py", line 10, in warn_with_traceback
  traceback.print_stack()
/Users/me/virtualenvs/dev/lib/python2.7/site-packages/sqlalchemy/orm/dependency.py:1154: SAWarning: Object of type <ChildVersion> not in session, add operation along 'ParentVersion.children' won't proceed
(mapperutil.state_class_str(child), operation, self.prop))

edit2:下面是一个python文件的要点,您可以运行以查看奇怪的行为。https://gist.github.com/jbouricius/2ede420fb1f7a2deec9f557c76ced7f9

得到此错误的原因是您无意中向会话中添加了对象。

MVCE:

engine = create_engine("sqlite://", echo=False)

def get_data():
    children = [
        Child(name="Carol", versions=[ChildVersion(age=0, fav_toy="med")]),
        Child(name="Timmy", versions=[ChildVersion(age=0, fav_toy="med")]),
    ]
    return Parent(
        name="Zane", children=children,
        versions=[
            ParentVersion(
                address="123 Fake St",
                children=[v for child in children for v in child.versions]
            )
        ]
    )

def main():
    Base.metadata.create_all(engine)
    session = Session(engine)
    parent_match = get_data()
    session.add(parent_match)
    session.commit()
    with session.no_autoflush:
        search_parent = get_data()
        parent_match.versions.append(search_parent.current_version)
        for search_child in search_parent.children[:]:
            child_match = next(c for c in parent_match.children if c.name == search_child.name)
            if child_match.current_version != search_child.current_version:
                child_match.versions.append(search_child.current_version)
            else:
                session.expunge(search_child.current_version)
            session.expunge(search_child)
        session.expunge(search_parent)
        session.commit()

旁白:这是你需要在问题本身中提供的内容。提供带有指令的tarball文件并不是获得答案的最佳方式。

parent_match.versions.append(search_parent.current_version)

不仅添加了search_parent.current_version,还添加了search_parent,这反过来又添加了所有相关对象,包括其他子对象的子版本。根据您稍后删除其他相关对象以防止它们被添加到会话的事实判断,我得出的结论是您只想添加search_parent.current_version而不添加其他相关对象。由于关系的循环性质,在添加对象之前,您需要注意只从search_parent中取出您想要的对象。下面是固定的MVCE:

with session.no_autoflush:
    search_parent = get_data()
    current_parent_version = search_parent.current_version
    search_parent.versions.remove(current_parent_version)
    current_parent_version.children = []  # <--- this is key
    for search_child in search_parent.children[:]:
        child_match = next(c for c in parent_match.children if c.name == search_child.name)
        if child_match.current_version != search_child.current_version:
            current_child_version = search_child.current_version
            search_child.versions.remove(current_child_version)
            child_match.versions.append(current_child_version)
            current_parent_version.children.append(current_child_version)
    parent_match.versions.append(current_parent_version)
    session.commit()

最新更新