How can SQLAlchemy use an indexed JSON column interchangeably between SQLite and PostgreSQL?



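I am defining a schema with SQLAlchemy that should work with multiple engine backends, notably SQLite and PostgreSQL.

I ran into a problem because I have a JSON column with an index on it. This seems to work on SQLite, but PostgreSQL complains that the index type cannot be btree. I have seen documentation highlighting a JSONB type specific to the postgres dialect, but the issue is that my schema is declarative: I don't know in advance whether I will connect to a SQLite or a PostgreSQL database.

As an example, here is a toy declarative schema: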


# from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy import create_engine
from sqlalchemy import inspect
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy.sql.schema import Column, Index
from sqlalchemy.types import Integer, JSON
from sqlalchemy_utils import database_exists, create_database

CustomBase = declarative_base()


class User(CustomBase):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True, doc='unique internal id')
    name = Column(JSON)
    loose_identifer = Column(JSON, index=True, unique=False)
    # loose_identifer = Column(JSONB, index=True, unique=False)


uri = 'sqlite:///test_sqlite_v7.sqlite'
# uri = 'postgresql+psycopg2://admin:admin@localhost:5432/test_postgresql_v4.postgres'
engine = create_engine(uri)
DBSession = sessionmaker(bind=engine)
session = DBSession()

# PostgreSQL needs the database created up front; SQLite creates the file on demand.
if 'postgresql' in uri:
    if not database_exists(uri):
        create_database(uri)

# Create the tables only if the database is still empty.
inspector = inspect(engine)
table_names = inspector.get_table_names()
if len(table_names) == 0:
    CustomBase.metadata.create_all(engine)

user_infos = [
    {'name': 'user1', 'loose_identifer': "AA"},
    {'name': 'user2', 'loose_identifer': "33"},
    {'name': 'user3', 'loose_identifer': 33},
    {'name': 'user4', 'loose_identifer': 33},
    {'name': 'user5', 'loose_identifer': "AA"},
    {'name': 'user6', 'loose_identifer': None},
    {'name': 'user7', 'loose_identifer': [1, 'weird']},
]
for row in user_infos:
    user = User(**row)
    session.add(user)
session.commit()

import pandas as pd
import json

table_df = pd.read_sql_table('users', con=engine)
table_df['loose_identifer'] = table_df['loose_identifer'].apply(repr)
print(table_df)

# Look up the integer 33 and the string "33" separately.
query = session.query(User.name, User.loose_identifer).filter(User.loose_identifer == json.dumps(33))
results = list(query.all())
print(f'results={results}')

query = session.query(User.name, User.loose_identifer).filter(User.loose_identifer == json.dumps('33'))
results = list(query.all())
print(f'results={results}')

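The User table has a loose_identifer column that I want to be a fairly arbitrary JSON type, and I want to put an index on it; the "loose" identifier can be either an integer or a string.

With SQLite, Column(JSON, index=True, unique=False) seems to work fine, but when I switch to the PostgreSQL engine I get the following error: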

ProgrammingError: (psycopg2.errors.UndefinedObject) data type json has no default operator class for access method "btree"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.
[SQL: CREATE INDEX ix_users_loose_identifer ON users (loose_identifer)]
(Background on this error at: https://sqlalche.me/e/14/f405)
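
For context, the statement in the error can be reproduced without a running server by compiling the DDL against the PostgreSQL dialect. The sketch below is only an illustration: it assumes a cut-down copy of the users table (two columns, built with Table rather than the declarative class) and is not part of the original schema. PostgreSQL's json type has no btree operator class, whereas jsonb has one, which is why the server rejects the plain CREATE INDEX on a json column.

```python
from sqlalchemy import JSON, Column, Integer, MetaData, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.schema import CreateIndex, CreateTable

# Cut-down stand-in for the users table (an assumption for illustration).
metadata = MetaData()
users = Table(
    "users",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("loose_identifer", JSON, index=True),
)

# Compile the DDL as text for PostgreSQL; no connection is opened.
pg = postgresql.dialect()
table_ddl = str(CreateTable(users).compile(dialect=pg))
index_ddl = [str(CreateIndex(ix).compile(dialect=pg)) for ix in users.indexes]

print(table_ddl)   # the column is declared with type JSON
print(index_ddl)   # a plain CREATE INDEX, which defaults to btree on the server
```

The compiled index statement carries no USING clause, so PostgreSQL falls back to its default access method, btree.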

I tried adding the index explicitly via the following class attribute:

__table_args__ = (
    # https://stackoverflow.com/questions/30885846/how-to-create-jsonb-index-using-gin-on-sqlalchemy
    Index(
        "ix_users_loose_identifer", loose_identifer,
        postgresql_using="gin",
    ),
)

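But this doesn't seem to work. I may be doing something wrong in the declaration.

If I change JSON to JSONB in the schema above, it does work, but JSONB is not compatible with SQLite, so my question is: how do I declare my schema with JSON columns that are indexed, using syntax compatible with both the SQLite and PostgreSQL backends?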

I ran into a similar problem. Here is a more concise demonstration of the issue:

from sqlalchemy import JSON, Column, Integer, Unicode, create_engine
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base
Base = declarative_base()

class Node(Base):
    __tablename__ = "nodes"
    id = Column(Integer, primary_key=True, index=True, autoincrement=True)
    key = Column(Unicode(1023), index=True, nullable=False)
    ancestors = Column(JSON, index=True, nullable=True)

# This works:
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
print("SQLite works")
Base.metadata.drop_all(engine)
# One way to get a postgres database to test against:
# docker run --name test-postgres -e POSTGRES_PASSWORD=secret -d docker.io/postgres
# This fails:
engine = create_engine("postgresql://postgres:secret@localhost:5432")
Base.metadata.create_all(engine)
print("PostgreSQL works")
Base.metadata.drop_all(engine)

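A solution was provided to me in a SQLAlchemy discussion on GitHub. The feature that Erotemic and I were missing is TypeEngine.with_variant. Below, I apply it to my simple example to demonstrate the fix: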

from sqlalchemy import JSON, Column, Integer, Unicode, create_engine
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base
Base = declarative_base()
# Use JSON with SQLite and JSONB with PostgreSQL.
JSONVariant = JSON().with_variant(JSONB(), "postgresql")

class Node(Base):
    __tablename__ = "nodes"
    id = Column(Integer, primary_key=True, index=True, autoincrement=True)
    key = Column(Unicode(1023), index=True, nullable=False)
    ancestors = Column(JSONVariant, index=True, nullable=True)

# Both of these now work:
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
print("SQLite works")
Base.metadata.drop_all(engine)
engine = create_engine("postgresql://postgres:secret@localhost:5432")
Base.metadata.create_all(engine)
print("PostgreSQL works")
Base.metadata.drop_all(engine)
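
If no PostgreSQL server is at hand, the effect of the variant can be checked by compiling the DDL for each dialect. The following is a minimal sketch under a few assumptions: the table is a reduced copy of nodes built with Table, the helper ddl() and the index name ix_nodes_ancestors_gin are made up for illustration, and the extra GIN index is optional. As far as I know, postgresql_using is simply ignored by other dialects, and a GIN index on jsonb serves containment-style operators such as @> rather than plain equality.

```python
from sqlalchemy import JSON, Column, Index, Integer, MetaData, Table
from sqlalchemy.dialects import postgresql, sqlite
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.schema import CreateIndex, CreateTable

# Use JSON everywhere except PostgreSQL, where JSONB is substituted.
JSONVariant = JSON().with_variant(JSONB(), "postgresql")

metadata = MetaData()
nodes = Table(
    "nodes",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("ancestors", JSONVariant, nullable=True),
)

# Hypothetical index name; the USING clause only applies on PostgreSQL.
gin_index = Index(
    "ix_nodes_ancestors_gin", nodes.c.ancestors, postgresql_using="gin"
)


def ddl(element, dialect):
    """Render a DDL element as SQL text for the given dialect (no connection)."""
    return str(element.compile(dialect=dialect)).strip()


pg_table = ddl(CreateTable(nodes), postgresql.dialect())
lite_table = ddl(CreateTable(nodes), sqlite.dialect())
pg_index = ddl(CreateIndex(gin_index), postgresql.dialect())
lite_index = ddl(CreateIndex(gin_index), sqlite.dialect())

for statement in (pg_table, lite_table, pg_index, lite_index):
    print(statement)
```

The PostgreSQL output should declare ancestors as JSONB and the SQLite output as JSON, so the same model definition serves both backends.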

Thanks to @CaselIT on GitHub for the quick help!
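
The same idea should carry over to the User model from the question. This is a sketch rather than a tested drop-in: it assumes SQLAlchemy 1.4 or newer, runs only against an in-memory SQLite database, and inserts a shortened copy of the sample rows. Pointing create_engine at a PostgreSQL URL is expected to give a JSONB column with a btree index instead.

```python
from sqlalchemy import JSON, Column, Integer, create_engine, inspect
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# JSON on SQLite (and other backends), JSONB on PostgreSQL.
JSONVariant = JSON().with_variant(JSONB(), "postgresql")


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True, doc="unique internal id")
    name = Column(JSON)
    loose_identifer = Column(JSONVariant, index=True, unique=False)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all(
        [
            User(name="user1", loose_identifer="AA"),
            User(name="user3", loose_identifer=33),
            User(name="user7", loose_identifer=[1, "weird"]),
        ]
    )
    session.commit()
    # Read the rows back in insertion order.
    rows = [
        (user.name, user.loose_identifer)
        for user in session.query(User).order_by(User.id)
    ]

# Confirm that the index requested by index=True was created.
index_names = [ix["name"] for ix in inspect(engine).get_indexes("users")]
print(rows)
print(index_names)
```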
