How to insert a pandas DataFrame into an existing PostgreSQL database



I have a DataFrame like this:

index  userID    OtherIDs
0   abcdef2035  [test650, test447, test968, test95]
1   abcdef3007  [test999, test992, test943, test834]
2   abcdef2006  [test175, test996, test986, test965]
3   abcdef2003  [test339, test968, test87, test678]
4   abcdef3000  [test129, test99, test921, test909]

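The code that generates this DataFrame will run daily. I need to upload it to a table named "results" in an existing database. I have to check whether the table "results" exists and, if it does, delete/overwrite its values with the current values from the DataFrame above.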

Credentials for the Postgres database:

PGHOST = 'localhost'
PGDATABASE = 'TestDB'
PGUSER = 'postgres'
PGPASSWORD = 'admin1234'

You can use SQLAlchemy: (https://docs.sqlalchemy.org/en/14/core/engines.html)

pandas df.to_sql:(https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.to_sql.html)

Assuming the DataFrame is named df:

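from sqlalchemy import create_engine
# The connection URL must be passed as a string that starts with the dialect name
engine = create_engine('postgresql://user:password@host_ip:port/postgres_database')
# if_exists='replace' drops the table if it already exists, then recreates and fills it
df.to_sql('results', schema='<schema_name>', con=engine, if_exists='replace')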

Just pass your credentials in the correct format, i.e. as the URL string postgresql://user:password@host_ip:port/postgres_database given to create_engine.
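To make the daily overwrite behaviour of if_exists='replace' concrete, here is a minimal sketch. It rests on several assumptions that are not part of the answer: an in-memory SQLite connection stands in for the PostgreSQL engine so the snippet runs without a server, the upload helper is a hypothetical name, and the OtherIDs lists are stored as JSON strings because a plain text column cannot hold Python lists directly.

```python
import json
import sqlite3

import pandas as pd

# Sample rows shaped like the question's DataFrame (values copied from it).
df = pd.DataFrame({
    "userID": ["abcdef2035", "abcdef3007"],
    "OtherIDs": [
        ["test650", "test447", "test968", "test95"],
        ["test999", "test992", "test943", "test834"],
    ],
})


def upload(frame, con, table="results"):
    """Overwrite `table` with the contents of `frame` (hypothetical helper)."""
    out = frame.copy()
    # Assumption: serialize each list to a JSON string before writing.
    out["OtherIDs"] = out["OtherIDs"].apply(json.dumps)
    # if_exists="replace" drops the table if present, then recreates and fills it.
    out.to_sql(table, con=con, if_exists="replace", index=False)


con = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL engine
upload(df, con)          # first daily run creates the table
upload(df.head(1), con)  # a later run replaces the previous rows

rows = con.execute("SELECT userID, OtherIDs FROM results").fetchall()
print(rows)  # only the single row from the second upload remains
```

With an engine returned by create_engine, the same upload(df, engine) call should apply unchanged, since to_sql accepts either kind of connection.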

Building the engine string: assume the following sign_in variable:

sign_in = {
"database": "TestDB",
"user": "postgres",
"password": "<your_password>",
"host": "localhost",
"port": "<your_port>"
}
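# 'postgresql+pygresql' selects the PyGreSQL driver; use 'postgresql+psycopg2' if psycopg2 is the driver you have installed
signin_info = 'postgresql+pygresql://'+sign_in['user']+':'+sign_in['password']+'@'+sign_in['host']+':'+sign_in['port']+'/'+sign_in['database']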
from sqlalchemy import create_engine
engine = create_engine(signin_info)
df.to_sql('results', schema='<schema_name>', con = engine, if_exists='replace')
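One caveat with plain string concatenation: a password containing characters such as @ or / would be misread as part of the URL structure. A possible way around this, sketched below, is to percent-encode the user and password with the standard library's urllib.parse.quote_plus before assembling the string. The build_postgres_url helper, the sample password, and the port value are illustrative assumptions, not part of the answer above.

```python
from urllib.parse import quote_plus


def build_postgres_url(sign_in, driver="postgresql"):
    """Return a SQLAlchemy-style URL with user and password percent-encoded.

    Hypothetical helper: `driver` is the part before '://', for example
    'postgresql' or 'postgresql+psycopg2'.
    """
    user = quote_plus(sign_in["user"])
    password = quote_plus(sign_in["password"])
    return (
        f"{driver}://{user}:{password}"
        f"@{sign_in['host']}:{sign_in['port']}/{sign_in['database']}"
    )


sign_in = {
    "database": "TestDB",
    "user": "postgres",
    "password": "p@ss/word",  # made-up password containing reserved characters
    "host": "localhost",
    "port": "5432",           # assumption: PostgreSQL's default port
}

signin_info = build_postgres_url(sign_in)
print(signin_info)  # postgresql://postgres:p%40ss%2Fword@localhost:5432/TestDB
```

The resulting string can be passed to create_engine in the same way as the concatenated one.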
