如何使用Python在Oracle数据库中进行批量插入

我有一些每月的天气数据，我想插入到Oracle数据库表中，但我想在一批中插入相应的记录，以提高效率。有人能建议我如何在Python中做到这一点吗？

例如，假设我的表有四个字段：一个站ID、一个日期和两个值字段。记录由工作站ID和日期字段（组合键）唯一标识。我必须为每个站插入的值将保存在一个列表中，其中包含X个全年的数据，因此，例如，如果有两年的值，那么值列表将包含24个值。

如果我想一次插入一条记录，我假设以下是我的方法：

connection_string = "scott/tiger@testdb"
connection = cx_Oracle.Connection(connection_string)
cursor = cx_Oracle.Cursor(connection)
station_id = 'STATION_1'
start_year = 2000
temps = [ 1, 3, 5, 7, 9, 1, 3, 5, 7, 9, 1, 3 ]
precips = [ 2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6, 8 ]
number_of_years = len(temps) / 12
for i in range(number_of_years):
    for j in range(12):
        # make a date for the first day of the month
        date_value = datetime.date(start_year + i, j + 1, 1)
        index = (i * 12) + j
        sql_insert = 'insert into my_table (id, date_column, temp, precip) values (%s, %s, %s, %s)', (station_id, date_value, temps[index], precips[index]))
        cursor.execute(sql_insert)
connection.commit()

有没有一种方法可以做到我上面所做的，但可以执行批量插入以提高效率？顺便说一句，我的经验是使用Java/JDBC/HHibernate，所以如果有人能给出一个与Java方法相比的解释/示例，那将特别有帮助。

编辑：也许我需要像这里描述的那样使用cursor.executemany（）？

提前感谢您的任何建议、意见等。

以下是我提出的似乎效果良好的方法（但如果有改进的方法，请评论）：

# build rows for each date and add to a list of rows we'll use to insert as a batch 
rows = [] 
numberOfYears = endYear - startYear + 1
for i in range(numberOfYears):
    for j in range(12):
        # make a date for the first day of the month
        dateValue = datetime.date(startYear + i, j + 1, 1)
        index = (i * 12) + j
        row = (stationId, dateValue, temps[index], precips[index])
        rows.append(row)
# insert all of the rows as a batch and commit
ip = '192.1.2.3' 
port = 1521
SID = 'my_sid'
dsn = cx_Oracle.makedsn(ip, port, SID)
connection = cx_Oracle.connect('username', 'password', dsn)
cursor = cx_Oracle.Cursor(connection)
cursor.prepare('insert into ' + database_table_name + ' (id, record_date, temp, precip) values (:1, :2, :3, :4)')
cursor.executemany(None, rows)
connection.commit()
cursor.close()
connection.close()

使用Cursor.prepare()和Cursor.executemany()。

来自cx_Oracle文档：

Cursor.prepare（语句[，标签]）

这可以在调用execute（）之前使用，以定义将要执行的语句。完成此操作后，当使用None或与语句相同的字符串对象调用execute（）时，将不会执行准备阶段。[…]

Cursor.executemany（语句，参数）

准备一个针对数据库执行的语句，然后针对序列参数中的所有参数映射或序列执行该语句。语句的管理方式与execute（）方法的管理方式相同

因此，使用以上两个函数，您的代码变成：

connection_string = "scott/tiger@testdb"
connection = cx_Oracle.Connection(connection_string)
cursor = cx_Oracle.Cursor(connection)
station_id = 'STATION_1'
start_year = 2000
temps = [ 1, 3, 5, 7, 9, 1, 3, 5, 7, 9, 1, 3 ]
precips = [ 2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6, 8 ]
number_of_years = len(temps) / 12
# list comprehension of dates for the first day of the month
date_values = [datetime.date(start_year + i, j + 1, 1) for i in range(number_of_years) for j in range(12)]
# second argument to executemany() should be of the form:
# [{'1': value_a1, '2': value_a2}, {'1': value_b1, '2': value_b2}]
dict_sequence = [{'1': date_values[i], '2': temps[i], '3': precips[i]} for i in range(1, len(temps))]
sql_insert = 'insert into my_table (id, date_column, temp, precip) values (%s, :1, :2, :3)', station_id)
cursor.prepare(sql_insert)
cursor.executemany(None, dict_sequence)
connection.commit()

另请参阅Oracle的精通Oracle+Python系列文章。

正如其中一条评论所说，考虑使用INSERT ALL。据推测，它将明显快于使用executemany()。

例如：

INSERT ALL
  INTO mytable (column1, column2, column_n) VALUES (expr1, expr2, expr_n)
  INTO mytable (column1, column2, column_n) VALUES (expr1, expr2, expr_n)
  INTO mytable (column1, column2, column_n) VALUES (expr1, expr2, expr_n)
SELECT * FROM dual;

http://www.techonthenet.com/oracle/questions/insert_rows.php

fyi我的测试结果：

我插入5000行。每行3列。

运行insert 5000次，花费1.24分钟
用executemany运行，花费0.125秒
使用insert-all代码运行：花费4.08分钟

python代码，设置类似sql的将所有插入t（a，b，c）从对偶并集中选择：1，：2，：3全部从daul中选择：4，：5:：6。。。

python代码设置这个长sql，花费了0.145329秒。

我在一台非常旧的sun机器上测试我的代码。cpu：1415 MH.

在第三种情况下，我检查了数据库端，等待事件是"SQL*Net来自客户端的更多数据"。这意味着服务器正在等待来自客户端的更多数据。

没有经过测试，第三种方法的结果对我来说是难以置信的。

所以我的简短建议就是使用刽子手。

我会使用union:创建一个大型SQL插入语句

insert into mytable(col1, col2, col3)
select a, b, c from dual union
select d, e, f from dual union
select g, h, i from dual

您可以在python中构建字符串，并将其作为一条语句交给oracle执行。

相关内容

最新更新

热门标签：