如何在使用sqlalchemy查询和pandas数据帧时最小化代码



我正试图通过现代化来升级我的查询代码。

我的旧密码(如下(。第一个查询连接两个表,并选择每个歌曲标题的评级以及标题的艺术家,第二个查询获得每个标题的流派(使用关联表(:

items = []
query = db.session.query(Rating, Song).filter(Rating.id==Song.id).all()
for x in query:
dic = {
"rating": x[0],
"title": x[1].title,
"artist": x[1].artist,
"genre": Genre.query.filter(Genre.songs.any(title=x[1].title)).all(),
}
items.append(dic)

我的清洁码。我现在使用pandas数据帧而不是字典。这给了我错误ArgumentError: SQL expression element or literal value expected, got somethingsomething

query = db.session.query(Rating, Song).filter(Rating.id==Song.id).all()
df = pd.DataFrame(query, columns=["rating", "title"])
for item in df.title:
df['genre'] = (Genre.query.filter(Genre.songs.any(title=item)).all())
  • 我该如何让它发挥作用
  • 有没有更有效的编码方法

产生的完全错误

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~AppDataLocalProgramsPythonPython37libsite-packagessqlalchemyenginebase.py in execute(self, statement, *multiparams, **params)
1194         try:
-> 1195             meth = statement._execute_on_connection
1196         except AttributeError as err:
AttributeError: 'BaseQuery' object has no attribute '_execute_on_connection'
The above exception was the direct cause of the following exception:
ObjectNotExecutableError                  Traceback (most recent call last)
<ipython-input-4-99ffacdf2d91> in <module>
1 query = db.session.query(Rating, Song).filter(Rating.id==Song.id)
----> 2 df = pd.read_sql_query(query, db.engine)
3 df
~AppDataLocalProgramsPythonPython37libsite-packagespandasiosql.py in read_sql_query(sql, con, index_col, coerce_float, params, parse_dates, chunksize)
381         coerce_float=coerce_float,
382         parse_dates=parse_dates,
--> 383         chunksize=chunksize,
384     )
385 
~AppDataLocalProgramsPythonPython37libsite-packagespandasiosql.py in read_query(self, sql, index_col, coerce_float, parse_dates, params, chunksize)
1292         args = _convert_params(sql, params)
1293 
-> 1294         result = self.execute(*args)
1295         columns = result.keys()
1296 
~AppDataLocalProgramsPythonPython37libsite-packagespandasiosql.py in execute(self, *args, **kwargs)
1160     def execute(self, *args, **kwargs):
1161         """Simple passthrough to SQLAlchemy connectable"""
-> 1162         return self.connectable.execution_options().execute(*args, **kwargs)
1163 
1164     def read_table(
<string> in execute(self, statement, *multiparams, **params)
~AppDataLocalProgramsPythonPython37libsite-packagessqlalchemyutildeprecations.py in warned(fn, *args, **kwargs)
388         if not skip_warning:
389             _warn_with_version(message, version, wtype, stacklevel=3)
--> 390         return fn(*args, **kwargs)
391 
392     doc = func.__doc__ is not None and func.__doc__ or ""
~AppDataLocalProgramsPythonPython37libsite-packagessqlalchemyenginebase.py in execute(self, statement, *multiparams, **params)
3036         """
3037         connection = self.connect(close_with_result=True)
-> 3038         return connection.execute(statement, *multiparams, **params)
3039 
3040     @util.deprecated_20(
~AppDataLocalProgramsPythonPython37libsite-packagessqlalchemyenginebase.py in execute(self, statement, *multiparams, **params)
1196         except AttributeError as err:
1197             util.raise_(
-> 1198                 exc.ObjectNotExecutableError(statement), replace_context=err
1199             )
1200         else:
~AppDataLocalProgramsPythonPython37libsite-packagessqlalchemyutilcompat.py in raise_(***failed resolving arguments***)
209 
210         try:
--> 211             raise exception
212         finally:
213             # credit to
ObjectNotExecutableError: Not an executable object: <flask_sqlalchemy.BaseQuery object at 0x000001CBC5F14A48>

首先,只创建一个查询即可一次性返回所需的所有数据,其中使用GROUP_CONCAT函数对Genres进行分组:

query = (
db.session
.query(
Rating.rating,
Song.title,
Song.artist,
db.func.GROUP_CONCAT(Genre.category, ", ").label("genres")
)
.select_from(Song)
.where(Rating.id == Song.id)
.join(Genre, Song.genres)
.group_by(
Rating.rating,
Song.title,
Song.artist,
)
)

然后使用pandas方法将其转换为数据帧:

df = pd.read_sql_query(query.statement, db.engine)

print(df)应该产生这样的东西:

rating    title artist            genres
0     2.0  title 2  art-2               pop
1     3.0  title 3  art-3         rock, pop
2     4.0  title 4  art-3             other
3     5.0  title 5  art-4  rock, pop, other

最新更新