Efficiently copying data between databases with SQLAlchemy



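I am trying to mirror a PostgreSQL + PostGIS database, which I defined with SQLAlchemy, to a SQLite (SpatiaLite) file database. The session.merge() method seems to work for adding instances queried from a first session to another session, but it does not scale to nearly one million rows. See the example below, which copies data from one in-memory SQLite database to another so that it is easy to reproduce. I am looking for a way (possibly completely different from what I am doing now) to efficiently move all the data from one database to the other.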

from sqlalchemy import create_engine
from sqlalchemy import Table, Column, Integer, ForeignKey, String
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.orm import relationship, joinedload


engine_0 = create_engine('sqlite:///:memory:', echo=True)
engine_1 = create_engine('sqlite:///:memory:', echo=True)
Base = declarative_base()
Session0 = sessionmaker(bind=engine_0)
Session1 = sessionmaker(bind=engine_1)

# Define ORM models
association_table = Table('association', Base.metadata,
    Column('parent_id', ForeignKey('parent.id'), primary_key=True),
    Column('child_id', ForeignKey('child.id'), primary_key=True)
)

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    children = relationship(
        "Child",
        secondary=association_table,
        back_populates="parents")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    parents = relationship(
        "Parent",
        secondary=association_table,
        back_populates="children")

# Create schema
Base.metadata.create_all(engine_0)
Base.metadata.create_all(engine_1)

# Create some example instances
# Children
bart = Child(name='Bart')
lisa = Child(name='Lisa')
maggie = Child(name='Maggie')
milhouse = Child(name='Milhouse')
# Parents
homer = Parent(name='Homer',
               children=[bart, lisa, maggie])
marge = Parent(name='Marge',
               children=[bart, lisa, maggie])
flanders = Parent(name='Ned')
kirk = Parent(name='Kirk', children=[milhouse])

# Insert data into first database
session_0 = Session0()
session_0.add_all([homer, marge, flanders, kirk])
session_0.commit()

# Query the data and insert it into the second database
all_obj = session_0.query(Parent).options(joinedload('*')).all()
session_0.expunge_all()
session_1 = Session1()
for obj in all_obj:
    session_1.merge(obj)
session_1.commit()

# Make sure that 4 instances of Child are present in the second database
print(session_1.query(Child).all())

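One alternative I tried (unsuccessfully) was to make the parent objects transient with sqlalchemy.orm.make_transient() and insert them into the second session with session.add_all() instead of session.merge(). However, this does not propagate through relationships: only the parent objects become transient.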

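I don't think this will work across different databases, because SQLAlchemy stores metadata for each database. Your toy example only works because both databases are the same.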
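For what it is worth, one possible direction that avoids session.merge() entirely is to skip the ORM and copy rows table by table with SQLAlchemy Core, in batches. The sketch below is only an illustration under some assumptions: copy_all_tables and memory_engine are hypothetical helper names (not part of SQLAlchemy), the schema is a Core-only mirror of the parent/child/association tables from the question, both engines are in-memory SQLite, and every column type is assumed to work on both dialects. PostGIS geometry columns would likely need extra handling that is not covered here.

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, String, Table,
                        create_engine, func, insert, select)
from sqlalchemy.pool import StaticPool

# Core-only mirror of the schema from the question (assumed for this sketch).
metadata = MetaData()
parent = Table("parent", metadata,
               Column("id", Integer, primary_key=True),
               Column("name", String))
child = Table("child", metadata,
              Column("id", Integer, primary_key=True),
              Column("name", String))
association = Table("association", metadata,
                    Column("parent_id", ForeignKey("parent.id"), primary_key=True),
                    Column("child_id", ForeignKey("child.id"), primary_key=True))


def copy_all_tables(metadata, source_engine, target_engine, batch_size=10_000):
    """Hypothetical helper: copy every table in `metadata` between engines.

    Tables are visited in foreign-key dependency order (sorted_tables), so
    referenced tables are filled before the tables that point at them.
    Returns a dict mapping table name to the number of rows copied.
    """
    copied = {}
    with source_engine.connect() as src, target_engine.begin() as dst:
        for table in metadata.sorted_tables:
            copied[table.name] = 0
            # On PostgreSQL, execution_options(stream_results=True) may be
            # needed so the driver does not buffer the whole result set;
            # that is not exercised in this SQLite-only sketch.
            result = src.execute(select(table)).mappings()
            for batch in result.partitions(batch_size):
                # A list of dicts makes this a single executemany() call.
                dst.execute(insert(table), [dict(row) for row in batch])
                copied[table.name] += len(batch)
    return copied


def memory_engine():
    # StaticPool keeps one connection, so the in-memory database persists.
    return create_engine("sqlite://", poolclass=StaticPool)


source = memory_engine()
target = memory_engine()
metadata.create_all(source)
metadata.create_all(target)

# Small sample data set in the source database.
with source.begin() as conn:
    conn.execute(insert(parent), [{"id": 1, "name": "Homer"},
                                  {"id": 2, "name": "Marge"}])
    conn.execute(insert(child), [{"id": 1, "name": "Bart"},
                                 {"id": 2, "name": "Lisa"}])
    conn.execute(insert(association),
                 [{"parent_id": p, "child_id": c}
                  for p in (1, 2) for c in (1, 2)])

copied = copy_all_tables(metadata, source, target, batch_size=2)

with target.connect() as conn:
    child_count = conn.execute(
        select(func.count()).select_from(child)).scalar_one()
```

With declarative models like the ones in the question, Base.metadata could presumably be passed in place of the standalone metadata object here. Because the association table is copied like any other table, the many-to-many links come along without loading any ORM instances, and primary keys are preserved as-is.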
