我正在使用SQLAlchemy从SQL Server数据库读取数据,然后将表数据转换为csv文件,以便稍后移交。
然而,我注意到当SQL Server表字段中有1或0时,csv输出具有True或False。我知道在Python中,它仍然是一个数字因为True == 1
返回True
是否有可能从一开始就不发生这种情况,并使csv保持1或0?
这是我当前的代码:
import pandas as pd
from sqlalchemy import create_engine
pd.set_option('display.max_columns', None)
pd.set_option('max_colwidth', None)
df = pd.read_csv(r'''C:UsersusernameDownloadspy_tma_tables.csv''')
tma_table = df['table_name'].tolist()
servername = 'SERVER'
dbname = 'DATABASE'
sqlcon = create_engine('mssql+pyodbc://@' + servername + '/' + dbname + '?driver=ODBC+Driver+17+for+SQL+Server')
df_list = []
count = 0
while count < 1:
df1 = pd.read_sql_query("SELECT * FROM " + tma_table[count], sqlcon)
df_list.append(df1)
df_list[count].to_csv(tma_table[count] + ".csv", index=False, header=None, encoding='utf-8')
count += 1
我有大约450个表,这将被应用,所以单表的解决方案不会工作,因为我需要一个自动化的方式,让每个表遵循这个规则,我的目标。
我开始尝试通过每个列循环并改变列的dtype的路线,但似乎更容易不需要将True或False替换为1或0。
dtypes输出:
cst_pk int64
cst_code object
cst_name object
cst_clnt_fk int64
cst_active bool
cst_encumbered object
cst_purgeDate object
cst_splitBill bool
cst_subLabor_fk object
cst_subParts_fk float64
cst_subOther_fk object
cst_subContract bool
cst_subContractLabor_fk object
cst_subContractParts_fk object
cst_subContractOther_fk object
cst_balanceType object
cst_normalBalance object
cst_ay_fk int64
cst_header bool
cst_beginningBalance object
cst_alias object
cst_modifier_fk float64
cst_modifiedDate datetime64[ns]
cst_creator_fk float64
cst_createddate datetime64[ns]
cst_curr_fk object
cst_exch_fk object
cst_exch_date object
cst_ag_fk object
cst_dp_fk float64
cst_alternateAccount object
dtype: object
SQL CREATE TABLE查询
CREATE TABLE [dbo].[f_account](
[cst_pk] [int] NOT NULL,
[cst_code] [nvarchar](100) NOT NULL,
[cst_name] [nvarchar](35) NOT NULL,
[cst_clnt_fk] [int] NOT NULL,
[cst_active] [bit] NOT NULL,
[cst_encumbered] [decimal](10, 2) NULL,
[cst_purgeDate] [datetime] NULL,
[cst_splitBill] [bit] NOT NULL,
[cst_subLabor_fk] [int] NULL,
[cst_subParts_fk] [int] NULL,
[cst_subOther_fk] [int] NULL,
[cst_subContract] [bit] NOT NULL,
[cst_subContractLabor_fk] [int] NULL,
[cst_subContractParts_fk] [int] NULL,
[cst_subContractOther_fk] [int] NULL,
[cst_balanceType] [nvarchar](20) NULL,
[cst_normalBalance] [nvarchar](20) NULL,
[cst_ay_fk] [int] NULL,
[cst_header] [bit] NOT NULL,
[cst_beginningBalance] [decimal](10, 2) NULL,
[cst_alias] [nvarchar](32) NULL,
[cst_modifier_fk] [int] NULL,
[cst_modifiedDate] [datetime] NULL,
[cst_creator_fk] [int] NULL,
[cst_createddate] [datetime] NULL,
[cst_curr_fk] [int] NULL,
[cst_exch_fk] [int] NULL,
[cst_exch_date] [datetime] NULL,
[cst_ag_fk] [int] NULL,
[cst_dp_fk] [int] NULL,
[cst_alternateAccount] [nvarchar](100) NULL
如
所述https://github.com/mkleehammer/pyodbc/issues/383
pyodbc返回bit
列作为布尔值,因为这是大多数人使用bit
列的目的。
import pandas as pd
import sqlalchemy as sa
engine = sa.create_engine("mssql+pyodbc://scott:tiger^5HHH@mssql_199")
table_data = """
SELECT -1 AS id, CAST(NULL AS bit) AS bit_col
UNION ALL
SELECT 0 AS id, CAST(0 AS bit) AS bit_col
UNION ALL
SELECT 1 AS id, CAST(1 AS bit) AS bit_col
"""
df = pd.read_sql_query(table_data, engine)
print(df)
"""
id bit_col
0 -1 None
1 0 False
2 1 True
"""
如果您希望pyodbc以其他类型返回bit
列,您可以使用输出转换器函数,如GitHub问题所示。诀窍是让SQLAlchemy使用它。这是使用事件侦听器完成的。
import pandas as pd
import pyodbc
import sqlalchemy as sa
engine = sa.create_engine("mssql+pyodbc://scott:tiger^5HHH@mssql_199")
table_data = """
SELECT -1 AS id, CAST(NULL AS bit) AS bit_col
UNION ALL
SELECT 0 AS id, CAST(0 AS bit) AS bit_col
UNION ALL
SELECT 1 AS id, CAST(1 AS bit) AS bit_col
"""
def handle_bit_type(bit_value):
return bit_value
@sa.event.listens_for(engine, "connect")
def connect(conn, rec):
conn.add_output_converter(pyodbc.SQL_BIT, handle_bit_type)
df = pd.read_sql_query(table_data, engine)
print(df)
"""
id bit_col
0 -1 None
1 0 b'x00'
2 1 b'x01'
"""
编辑:或者,如果使用
def handle_bit_type(bit_value):
if bit_value is None:
rtn = None
elif bit_value == b"x00":
rtn = "0"
else:
rtn = "1"
return rtn
你会得到
df = pd.read_sql_query(table_data, engine)
print(df)
"""
id bit_col
0 -1 None
1 0 0
2 1 1
"""