I'm new to Python and want to upload a pandas DataFrame to a table in a Snowflake database. The desired behavior is to replace the table if it already exists. Here is the code I use to do this:
#Import the required modules and packages
import pandas
from snowflake.connector.pandas_tools import pd_writer
from snowflake.sqlalchemy import URL
import sqlalchemy
from sqlalchemy import create_engine
import os
#import the data from a local csv
my_data_frame = pandas.read_csv("my_data_frame.csv", sep=',', low_memory=False)
#Connect to Snowflake
engine = create_engine(URL(
    account='my_snowflake_account',
    user='my_snowflake_id',
    password='my_snowflake_password',
    database='my_database',
    schema='my_schema',
    warehouse='my_warehouse',
    role='my_role',
))
connection = engine.connect()
#Push my local data frame into a new table
my_data_frame.to_sql('new_table_name_on_Snowflake', engine,
                     index=False, method=pd_writer, if_exists='replace')
The code runs. It creates a table and gives it the correct name. However, every row of every column is filled with NA. I suspect this has something to do with the data types. How can I fix this?
Note that I specified method=pd_writer as recommended by the Snowflake documentation: https://docs.snowflake.com/en/user-guide/python-connector-api.html#pd_writer
Checking my_data_frame with my_data_frame.dtypes returns:
column1 object
column2 int64
column3 object
column4 object
column5 object
column6 object
column7 int64
column8 int64
For no obvious reason, uppercasing the column names fixed the problem. (A likely explanation: Snowflake folds unquoted identifiers to uppercase, while pd_writer quotes the DataFrame's column names verbatim, so the lowercase names fail to match the columns of the table and every value loads as NULL.) So insert the following before calling to_sql:
my_data_frame.columns = map(str.upper, my_data_frame.columns)
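A minimal, self-contained sketch of the renaming step, run on a hypothetical stand-in DataFrame (the column names and values below are made up for illustration; no Snowflake connection is needed to test it):

```python
import pandas as pd

# Hypothetical DataFrame standing in for my_data_frame
df = pd.DataFrame({"column1": ["a", "b"], "column2": [1, 2]})

# Uppercase every column name so it matches Snowflake's
# uppercase-folded, unquoted identifiers
df.columns = map(str.upper, df.columns)

print(list(df.columns))  # ['COLUMN1', 'COLUMN2']
```

pandas consumes the map object into an Index, so this one-liner is enough; equivalently you could write df.rename(columns=str.upper) if you prefer not to mutate columns in place.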