我可以使用sqlalchemy对不同的服务器进行查询吗?



我有几个sql服务器,我想并行查询。为此,我试图将请求放入进程,因为它不是一个服务器,我尝试查询多次,但许多我只查询一次:

import pandas as pd
from sqlalchemy import create_engine
from multiprocessing import Pool, cpu_count
def get_df(engine):
sql_string = "select * from sys.all_columns"
df = pd.read_sql(sql=sql_string, con=engine)
return df

def create_odbc_engine(server):
db_odbc_string = "mssql+pyodbc://@{server}-db:9999/some_database?driver=ODBC+Driver+17+for+SQL+Server".format(
server=server)
return create_engine(db_odbc_string)

if __name__ == "__main__":
servers = ["server1", "server2", "server3",...]
args = [(create_odbc_engine(server),) for server in servers]
n_processes = cpu_count() - 1
with Pool(processes=n_processes) as pool:
results = pool.map(get_df, args)

但是我得到pickle错误:

AttributeError: Can't pickle local object 'create_engine.<locals>.connect'

有没有办法让我同时做这些?

Python不能pickle函数,所以你不能在args中发送函数create_odbc_engine。您可以在get_df中调用此函数。

import pandas as pd
from sqlalchemy import create_engine
from multiprocessing import Pool, cpu_count
def get_df(server):
engine = (create_odbc_engine(server),)
sql_string = "select * from sys.all_columns"
df = pd.read_sql(sql=sql_string, con=engine)
return df

def create_odbc_engine(server):
db_odbc_string = "mssql+pyodbc://@{server}-db:9999/some_database?driver=ODBC+Driver+17+for+SQL+Server".format(
server=server)
return create_engine(db_odbc_string)

if __name__ == "__main__":
servers = ["server1", "server2", "server3",...]
# args = [(create_odbc_engine(server),) for server in servers]
n_processes = cpu_count() - 1
with Pool(processes=n_processes) as pool:
results = pool.map(get_df, servers)

相关内容

  • 没有找到相关文章

最新更新