我有一些Postgres表:
CREATE TABLE source_redshift.staticprompts (
id INT,
projectid BIGINT,
scriptid INT,
promptnum INT,
prompttype VARCHAR(20),
inputs VARCHAR(2000),
attributes VARCHAR(2000),
text VARCHAR(2000),
corpuscode VARCHAR(2000),
comment VARCHAR(2000),
created TIMESTAMP,
modified TIMESTAMP
);
and
CREATE TABLE target_redshift.user_input_conf (
collect_project_id BIGINT NOT NULL UNIQUE,
prompt_type VARCHAR(20),
prompt_input_desc VARCHAR(300),
prompt_input_name VARCHAR(100),
no_of_prompt_count BIGINT,
prompt_input_value VARCHAR(100) UNIQUE,
prompt_input_value_id BIGSERIAL PRIMARY KEY,
script_id BIGINT,
corpuscode VARCHAR(20),
min_recordings VARCHAR(2000),
max_recordings VARCHAR(2000),
recordings_count VARCHAR(2000),
lease_duration VARCHAR(2000),
date_created TIMESTAMP WITHOUT TIME ZONE NOT NULL DEFAULT NOW(),
date_updated TIMESTAMP WITHOUT TIME ZONE,
CONSTRAINT must_be_different UNIQUE (prompt_input_value,collect_project_id)
);
我需要复制数据从静态提示到user_input_conf与以下规则:
Primary Key : prompt_input_value_id
Unique Values : collect_project_id, prompt_input_value
Data Load Logic :
Insert only when new prompt input value is found for given collect project from source. Inputs column stores the values in JSON format in staticprompts table.
Insert :
Generate unique sequence number for each of the new prompt input value for a collect project id from source and store in prompt_input_value_id.
Update :
If prompt value already exists for a collect project and if there are any value changes on prompt_input_desc or prompt input name or prompt input value then update only those columns.
prompt_input_value_id - Generate unique sequence number for the combination of each prompt_input_value and collect_project_id
prompt_input_value - Inputs.value is stored in the inputs column as JSON text. Create a unique record for each inputs.value. Look at the example below this table.
我尝试使用这个查询:
INSERT INTO target_redshift.user_input_conf AS t (
collect_project_id,
prompt_type,
prompt_input_desc,
prompt_input_name,
prompt_input_value,
script_id,
corpuscode)
SELECT
s.projectid,
s.prompttype,
s.inputs::jsonb#>>'{inputs,0,desc}' AS desc,
s.inputs::jsonb#>>'{inputs,0,name}' AS name,
s.inputs::jsonb#>>'{inputs,0,values}' AS values,
s.scriptid,
s.corpuscode
FROM source_redshift.staticprompts AS s
ON CONFLICT (collect_project_id, prompt_input_value)
DO UPDATE SET
(prompt_input_desc, prompt_input_name, prompt_input_value, date_updated) =
(EXCLUDED.prompt_input_desc, EXCLUDED.prompt_input_name, EXCLUDED.prompt_input_value, NOW())
WHERE t.prompt_input_desc != EXCLUDED.prompt_input_desc
OR t.prompt_input_name != EXCLUDED.prompt_input_name
OR t.prompt_input_value != EXCLUDED.prompt_input_value;
""")
但是我得到一个错误:
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "user_input_conf_collect_project_id_key"
DETAIL: Key (collect_project_id)=(1) already exists.
我认为这是一个误解。两列上的唯一约束并不意味着每列都是唯一的,而是两列的组合是唯一的。
所以你的must_be_different
与prompt_input_value
和collect_project_id
的唯一约束不同(并且更弱)。例如,如果您有三行
collect_project_id | prompt_input_value
--------------------+--------------------
1 | a
1 | b
2 | b
它们将与单列唯一约束产生冲突,但不会与must_be_different
产生冲突。
我猜潜在的问题是你想使用INSERT ... ON CONFLICT
与多个唯一的约束。这是不可能的;参考这个问题的讨论和可能的解决方案。