Adjusting column values before inserting them into a DataFrame



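At my company we have a SQL database that we connect to through pyodbc. When we run a query, pyodbc returns a list of tuples, usually more than 10 of them. Here is an example of the output we might get: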

OUTPUT =
[(datetime.datetime(2003, 3, 26, 15, 12, 15), '490002_space'),
(datetime.datetime(2003, 3, 27, 16, 13, 14), '490002_space')] 

I would like to remove the '_space' suffix and the 'datetime.datetime(…)' wrapper from every tuple we receive in OUTPUT:

DESIRED_OUTPUT:
[('2003, 3, 26, 15, 12, 15', '490002'),
('2003, 3, 27, 16, 13, 14', '490002')] 

I look forward to hearing from you.

Regards, Jerome

Here is one possible solution:

result = [(str(dt.timetuple()[:6])[1:-1], s.split('_')[0]) for dt, s in OUTPUT]
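To make the one-liner above easier to follow, here is a minimal sketch that breaks it into steps and runs it against the sample rows from the question. The helper name `clean_row` is hypothetical and only used for illustration.

```python
import datetime

# Sample rows copied from the question; in practice they would come from the pyodbc query.
OUTPUT = [
    (datetime.datetime(2003, 3, 26, 15, 12, 15), "490002_space"),
    (datetime.datetime(2003, 3, 27, 16, 13, 14), "490002_space"),
]


def clean_row(dt, s):
    # timetuple()[:6] gives (year, month, day, hour, minute, second)
    parts = dt.timetuple()[:6]
    # str() of that tuple looks like "(2003, 3, ...)"; [1:-1] drops the parentheses
    stamp = str(parts)[1:-1]
    # keep only the text before the first underscore
    return stamp, s.split("_")[0]


result = [clean_row(dt, s) for dt, s in OUTPUT]
print(result)
```

Note that `s.split('_')[0]` cuts at the first underscore, so a value that contains an earlier underscore would be shortened more than intended. If that can happen in your data, `s.removesuffix('_space')` (Python 3.9+) is a safer choice.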

Ultimately, I would like to pass the new list of tuples to a pandas DataFrame.
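If the rows are headed for a DataFrame anyway, it may be simpler to keep the datetime objects as they are and only strip the suffix, since pandas stores them as a proper datetime column. This is a rough sketch using the sample rows from the question; the column names `timestamp` and `code` are assumptions, not names from the real table.

```python
import datetime

import pandas as pd

# Sample rows copied from the question.
OUTPUT = [
    (datetime.datetime(2003, 3, 26, 15, 12, 15), "490002_space"),
    (datetime.datetime(2003, 3, 27, 16, 13, 14), "490002_space"),
]

# Leave the datetime untouched and drop everything from the first underscore on.
rows = [(dt, s.split("_")[0]) for dt, s in OUTPUT]

# Column names are placeholders chosen for this example.
df = pd.DataFrame(rows, columns=["timestamp", "code"])
print(df)
```

Keeping the first column as a datetime means you can still sort, filter, or resample by time later, which is harder once the values have been turned into strings.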

You can use pandas' read_sql_query() to pull the information directly into a DataFrame and let the SQL query strip the suffix:

import pandas as pd
import sqlalchemy as sa

connection_url = sa.engine.URL.create(
    "mssql+pyodbc",
    username="scott",
    password="tiger^5HHH",
    host="192.168.0.199",
    database="test",
    query={
        "driver": "ODBC Driver 18 for SQL Server",
        "TrustServerCertificate": "Yes",
    }
)
engine = sa.create_engine(connection_url)
table_name = "so71297370"

# set up example environment
with engine.begin() as conn:
    conn.exec_driver_sql(f"DROP TABLE IF EXISTS {table_name}")
    conn.exec_driver_sql(f"CREATE TABLE {table_name} (col1 datetime2, col2 nvarchar(50))")
    conn.exec_driver_sql(f"""
        INSERT INTO {table_name} (col1, col2) VALUES
        ('2003-03-26 15:12:15', '490002_space'),
        ('2003-03-27 16:13:14', '490002_space')
    """)

# example
df = pd.read_sql_query(
    # suffix '_space' is 6 characters in length
    f"SELECT col1, LEFT(col2, LEN(col2) - 6) AS col2 FROM {table_name}",
    engine,
)
print(df)
"""
                 col1    col2
0 2003-03-26 15:12:15  490002
1 2003-03-27 16:13:14  490002
"""
