I am trying to use pd.read_sql(sql, uri) to load the top 10 prescriptions per practice from a database into a DataFrame, but it returns the following error:
~\AppData\Local\Continuum\anaconda3\envs\GISProjects\lib\site-packages\sqlalchemy\engine\result.py in _non_result(self, default)
1168 if self._metadata is None:
1169 raise exc.ResourceClosedError(
-> 1170 "This result object does not return rows. "
1171 "It has been closed automatically."
1172 )
ResourceClosedError: This result object does not return rows. It has been closed automatically.
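My query uses local (session) variables to track the rank, so that it returns the top 10 prescriptions per practice. It works if I run it in MySQL Workbench, but not when I use pd.read_sql().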
sql = """
SET @current_practice = 0;
SET @practice_rank = 0;
select practice, bnf_code_9, total_items, practice_rank
FROM (select a.practice,
a.bnf_code_9,
a.total_items,
@practice_rank := IF(@current_practice = a.practice, @practice_rank + 1, 1) AS practice_rank,
@current_practice := a.practice
FROM (select rp.practice, rp.bnf_code_9, sum(rp.items) as total_items
from rx_prescribed rp
where ignore_flag = '0'
group by practice, bnf_code_9) a
order by a.practice, a.total_items desc) ranked
where practice_rank <= 10;
"""
df = pd.read_sql(sql, uri)
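I expected it to return the data into a pandas DataFrame, but it returns the error instead. I assume the error comes from the first statements, which set the local variables. Those two statements are required so that the query returns the top 10.
Without the first two statements the query runs fine, but it returns "1" in every row of the practice_rank column instead of the expected values 1, 2, 3, and so on.
Is there a way to run multiple statements and get back the result of the last statement executed?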
Short answer
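The call stack behind a pandas.read_sql() call is: pandas > SQLAlchemy > MySQLdb or pymysql > MySQL database. The database drivers mysqlclient (MySQLdb) and pymysql do not handle multiple SQL statements in a single execute() call well. Split them into separate calls, and run them on the same connection so that the session variables set by the first statements are still visible to the final query.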
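The "split them into separate calls" idea can be sketched as a small helper. This is only an illustration under stated assumptions: the names split_statements and read_last_statement are hypothetical, the splitting is naive (it assumes no ';' appears inside a string literal or comment), and con is assumed to be an open SQLAlchemy connection.

```python
def split_statements(script):
    """Naively split a SQL script on ';' and drop empty fragments.

    Assumption: no statement contains ';' inside a string literal or comment.
    """
    return [part.strip() for part in script.split(";") if part.strip()]


def read_last_statement(script, con):
    """Run every statement except the last, then load the last into a DataFrame.

    `con` is assumed to be an open SQLAlchemy Connection, so session variables
    set by the earlier statements are still visible to the final query.
    """
    # Imported here so split_statements() can be used without these packages.
    import pandas as pd
    from sqlalchemy import text

    # Everything before the final statement is treated as setup (e.g. SET ...).
    *setup, query = split_statements(script)
    for statement in setup:
        # text() treats ":name" as a bind parameter; the SET statements
        # in the question contain none, so they pass through unchanged.
        con.execute(text(statement))
    return pd.read_sql(query, con)
```

With an engine created by create_engine(), it might be used as `with engine.connect() as con: df = read_last_statement(sql, con)`, where sql is the original three-statement script from the question.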
Solution
import pandas as pd
from sqlalchemy import create_engine, text

# mysqldb is the default, use mysql+pymysql to use the pymysql driver
# URI format: mysql<+driver>://<user:password@>localhost/database
engine = create_engine('mysql://localhost/test')

# First two lines starting with SET removed
sql = '''
SELECT practice, bnf_code_9, total_items, practice_rank
FROM (
    SELECT
        a.practice,
        a.bnf_code_9,
        a.total_items,
        @practice_rank := IF(@current_practice = a.practice, @practice_rank + 1, 1) AS practice_rank,
        @current_practice := a.practice
    FROM (
        SELECT
            rp.practice, rp.bnf_code_9, sum(rp.items) AS total_items
        FROM rx_prescribed rp
        WHERE ignore_flag = '0'
        GROUP BY practice, bnf_code_9
    ) a
    ORDER BY a.practice, a.total_items DESC
) ranked
WHERE practice_rank <= 10;
'''

# Run everything on one connection so the variables set here
# are still visible to the SELECT issued by pd.read_sql().
with engine.connect() as con:
    # text() wraps raw SQL; SQLAlchemy 2.x no longer accepts plain strings in execute()
    con.execute(text('SET @current_practice = 0;'))
    con.execute(text('SET @practice_rank = 0;'))
    df = pd.read_sql(sql, con)

print(df)
Result:
   practice  bnf_code_9  total_items  practice_rank
0         2           3          6.0              1
1         6           1          9.0              1
2         6           2          4.0              2
3         6           4          3.0              3
4        17           1          0.0              1
5        42          42         42.0              1
I created a test database for your question with the following code.
DROP TABLE IF EXISTS rx_prescribed;
CREATE TABLE rx_prescribed (
id INT AUTO_INCREMENT PRIMARY KEY,
practice INT,
bnf_code_9 INT,
items INT,
ignore_flag INT
);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (2, 3, 4, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (2, 3, 2, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (6, 1, 9, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (6, 2, 4, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (6, 4, 3, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (9, 11, 1, 1);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (17, 1, 0, 0);
INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) VALUES (42, 42, 42, 0);
Tested on MariaDB 10.3.
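As a possible alternative, servers with window functions (MySQL 8.0+ or MariaDB 10.2+) can compute the rank in a single statement with ROW_NUMBER(), which would avoid the SET statements altogether. The sketch below is not part of the tested solution above: it runs the query against an in-memory SQLite database purely as a stand-in so that it works without a MySQL server, and the names RANKED_SQL and demo_rows are illustrative. It compares ignore_flag with the integer 0, matching the INT column of the test table.

```python
import sqlite3

# Single-statement ranking: ROW_NUMBER() restarts at 1 for every practice,
# replacing the @practice_rank / @current_practice variables.
RANKED_SQL = """
SELECT practice, bnf_code_9, total_items, practice_rank
FROM (
    SELECT
        a.practice,
        a.bnf_code_9,
        a.total_items,
        ROW_NUMBER() OVER (
            PARTITION BY a.practice
            ORDER BY a.total_items DESC
        ) AS practice_rank
    FROM (
        SELECT practice, bnf_code_9, SUM(items) AS total_items
        FROM rx_prescribed
        WHERE ignore_flag = 0
        GROUP BY practice, bnf_code_9
    ) a
) ranked
WHERE practice_rank <= 10
ORDER BY practice, practice_rank
"""

# Same rows as the test database above: (practice, bnf_code_9, items, ignore_flag)
TEST_ROWS = [
    (2, 3, 4, 0), (2, 3, 2, 0), (6, 1, 9, 0), (6, 2, 4, 0),
    (6, 4, 3, 0), (9, 11, 1, 1), (17, 1, 0, 0), (42, 42, 42, 0),
]


def demo_rows():
    """Build the test table in SQLite (a stand-in for MySQL) and run the query."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE rx_prescribed (id INTEGER PRIMARY KEY, practice INT, "
        "bnf_code_9 INT, items INT, ignore_flag INT)"
    )
    con.executemany(
        "INSERT INTO rx_prescribed (practice, bnf_code_9, items, ignore_flag) "
        "VALUES (?, ?, ?, ?)",
        TEST_ROWS,
    )
    rows = con.execute(RANKED_SQL).fetchall()
    con.close()
    return rows
```

Against a real MySQL or MariaDB server, the same SQL string could presumably be passed straight to pd.read_sql(RANKED_SQL, engine), since it is a single statement that returns rows. Rows with equal total_items within a practice receive distinct but arbitrary ranks under ROW_NUMBER().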