当itersize小于数据大小并且fetch number小于itersize时,psycopg2服务器端游标如何操作



我已经阅读了文档、几篇文章、帖子和线程,但我不确定我是否清楚地理解了这一点。让我们假设这个场景:

1. I have a server side cursor.
2. I set the itersize to 1000.
3. I execute a SELECT query which would normally return 10000 records.
4. I use fetchmany to fetch 100 records at a time.

我的问题是,这是如何在幕后完成的?我的理解是,查询是执行的,但服务器端游标读取了1000条记录。光标禁止读取下一个1000,除非它滚动经过当前读取的1000的最后一个记录。此外,服务器端光标将1000保存在服务器的内存中,并一次滚动100个,将它们发送给客户端。我也想知道公羊的消耗量会是什么样子?根据我的理解,如果执行完整查询需要10000kb的内存,那么服务器端游标在服务器上只消耗1000kb,因为它一次只读取1000条记录,而客户端游标将使用100kb。我的理解正确吗?

更新根据我们在回复中的文档和讨论,我希望此代码一次打印10个项目的列表:

from psycopg2 import connect, extras as psg_extras

with connect(host="db_url", port="db_port", dbname="db_name", user="db_user", password="db_password") as db_connection:
with db_connection.cursor(name="data_operator", 
cursor_factory=psg_extras.DictCursor) as db_cursor:
db_cursor.itersize = 10
db_cursor.execute("SELECT rec_pos FROM schm.test_data;")
for i in db_cursor:
print(i)
print(">>>>>>>>>>>>>>>>>>>")

然而,在每次迭代中,它只打印一条记录。我获得10条记录的唯一方法是使用fetchmany:

from psycopg2 import connect, extras as psg_extras

with connect(host="db_url", port="db_port", dbname="db_name", user="db_user", password="db_password") as db_connection:
with db_connection.cursor(name="data_operator", 
cursor_factory=psg_extras.DictCursor) as db_cursor:
db_cursor.execute("SELECT rec_pos FROM schm.test_data;")
records = db_cursor.fetchmany(10)
while len(records) > 0:
print(i)
print(">>>>>>>>>>>>>>>>>>>")
records = db_cursor.fetchmany(10)

基于这两个代码片段,我猜在前面提到的场景中发生的是,给定下面的代码。。。

from psycopg2 import connect, extras as psg_extras

with connect(host="db_url", port="db_port", dbname="db_name", user="db_user", password="db_password") as db_connection:
with db_connection.cursor(name="data_operator", 
cursor_factory=psg_extras.DictCursor) as db_cursor:
db_cursor.itersize = 1000
db_cursor.execute("SELECT rec_pos FROM schm.test_data;")
records = db_cursor.fetchmany(100)
while len(records) > 0:
print(i)
print(">>>>>>>>>>>>>>>>>>>")
records = db_cursor.fetchmany(100)

itersize是服务器端的事情。它所做的是,当查询运行时,它设置了从数据库中只加载1000条记录的限制。但是fetchmany是客户端的事情。它从服务器获得1000个中的100个。每次fetchmany运行时,都会从服务器中提取下一个100。当服务器端的所有1000个被滚动时,接下来的1000个被从服务器端的DB中提取。但我很困惑,因为这似乎不是文件所暗示的。但话说回来。。。代码似乎暗示了这一点。

我会花一些时间在服务器端光标上。

您会发现itersize仅在您迭代光标时适用:

for record in cur:
print record

由于您使用的是fetchmany(size=100),因此一次只能处理100行服务器将不会在内存中容纳1000行我错了。光标将把所有行返回到内存中的客户端,然后fetchmany()将按照指定的批处理大小(如果未使用命名光标(从那里提取行。若使用了命名游标,那个么它将以批量大小从服务器中获取。

更新。显示itersizefetchmany()的工作方式。

使用带命名光标的itersizefetchmany()

cur = con.cursor(name='cp')
cur.itersize = 10
cur.execute("select * from cell_per")
for rs in cur:     
print(rs) 
cur.close()
#Log
statement: DECLARE "cp" CURSOR WITHOUT HOLD FOR select * from cell_per
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: CLOSE "cp"
cur = con.cursor(name='cp') 
cur.execute("select * from cell_per")
cur.fetchmany(size=10) 
#Log
statement: DECLARE "cp" CURSOR WITHOUT HOLD FOR select * from cell_per
statement: FETCH FORWARD 10 FROM "cp"

使用带有未命名光标的fetchmany

cur = con.cursor()
cur.execute("select * from cell_per")
rs = cur.fetchmany(size=10)
len(rs)                                                                                                                                                                   
10
#Log
statement: select * from cell_per

因此,命名游标(从服务器(批量获取由itersize在迭代时设置的行,或由size在使用fetchmany(size=n)时设置的列。而非命名游标将所有行拉入内存,然后根据fetchmany(size=n)中设置的size从内存中取出它们。

进一步更新

itersize仅在迭代光标对象本身时适用:

cur = con.cursor(name="cp")
cur.itersize = 10 
cur.execute("select * from cell_per")
for r in cur: 
print(r) 
cur.close()
#Postgres log:
statement: DECLARE "cp" CURSOR WITHOUT HOLD FOR select * from cell_per
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: CLOSE "cp"

在上面,r将是从服务器端(命名(游标返回的每批10行中提取的一行。该批大小为=itersize。因此,当您在命名的游标对象本身上迭代时,查询指定的所有行都将在迭代器中返回,只是分批返回itersize

未在命名光标对象上迭代。使用fetchmany(size=n):

cur = con.cursor(name="cp") 
cur.itersize = 10
cur.execute("select * from cell_per") 
cur.fetchmany(size=20)
cur.fetchmany(size=20)
cur.close()
#Postgres log:
statement: DECLARE "cp" CURSOR WITHOUT HOLD FOR select * from cell_per
statement: FETCH FORWARD 20 FROM "cp"
statement: FETCH FORWARD 20 FROM "cp"
CLOSE "cp"

itersize已设置,但它作为命名光标对象没有任何作用没有被重复。相反,fetchmany(size=20)每次被调用时都让服务器端游标发送一批20条记录。