如何循环插入查询?



我有一个这样的问题:我需要优化应用程序,使用db (postgreSQL),表看起来像这样:

CREATE TABLE voter_count(
id SERIAL, 
name VARCHAR NOT NULL, 
birthDate DATE NOT NULL, 
count INT NOT NULL, 
PRIMARY KEY(id), 
UNIQUE (name, birthDate))

我有超过一千个这样的选民,我需要把他们都放在数据库中,但其中有几个副本可以投票多次(从2到无穷),我需要,当遇到这样的副本时,增加现有字段的count字段(对于具有相同姓名和出生日期的选民)。之前,我只是检查表中是否有这样一个选民,如果有,然后找到它并增加计数。

但是程序工作了太长时间,我尝试通过MULTI INSERT和使用ON CONFLICT do UPDATE来增加计数,但是我得到了一个错误,然后我问了一个关于stackoverflow的问题,我被提供了很多insert,通过循环,但是在PostgreSQL中。

INSERT INTO voter_count(name, birthdate, count) 
VALUES 
('Ivan', '1998-08-05', 1), 
('Sergey', '1998-08-29', 1), 
('Ivan', '1998-08-05', 1) 
ON CONFLICT (name, birthdate) DO UPDATE SET count = (voter_count.count + 1)

问题:如何通过PostgreSQL在循环中执行INSERT。

可能最好的选择是在没有主键的表中的所有数据之前插入,例如:

CREATE TABLE voter_count_with_duplicates(
name VARCHAR NOT NULL, 
birthDate DATE NOT NULL)

,然后用一条语句插入数据:

INSERT INTO voter_count (name, birthDate, count)
SELECT name, birthDate, COUNT(*)
FROM voter_count_with_duplicates
GROUP BY name, birthDate

注意,如果数据在结构化文本文件中(例如CSV文件),则可以使用一条COPY语句将所有数据插入voter_count_with_duplicates

如果您必须在已经填充的表中插入(大量)新数据,有几种可能性。一种是使用评论中的解决方案。另一个是执行一个更新和一个插入:

WITH present_tuples AS 
(SELECT name, birthDate, COUNT(*) AS num_of_new_votes
FROM voter_count_with_duplicates d JOIN voter_count c ON
v.name = d.name and v.birthDate = d.birthDate
GROUP BY name, birthDate)
UPDATE voter_count SET count = count + num_of_new_votes
FROM present_tuples
WHERE present_tuples.name = voter_count.name
AND present_tuples.birthDate = voter_count.birthDate;
WITH new_tuples AS 
(SELECT name, birthDate, COUNT(*) AS votes
FROM voter_count_with_duplicates d 
WHERE NOT EXISTS SELECT * 
FROM voter_count c
WHERE v.name = d.name and v.birthDate = d.birthDate
GROUP BY name, birthDate)
INSERT INTO voter_count (name, birthDate, count)
SELECT name, birthDate, votes
FROM new_tuples;

你想要达到的是通俗地称为向上;插入行,如果不存在,则更新。此操作使用MERGE

要合并到现有表中的数据集是按名称和出生日期分组的值的总和,以及要插入/添加的总和。

MERGE INTO voter_count vc
USING 
(
SELECT name, birthdate, SUM(cnt) as total
FROM 
(
VALUES 
('Ivan', DATE '1998-08-05', 1), 
('Sergey', DATE '1998-08-29', 1), 
('Ivan', DATE '1998-08-05', 1) 
) input_data (name, birthdate, cnt)
GROUP BY name, birthdate
) data ON (data.name = vc.name and data.birthdate = vc.birthdate)
when not matched 
insert (name, birthdate, count) values (data.name, data.birthdate, data.total)
when matched
update set count = count + data.total;

最新更新