How to generate unique random numbers in a column with PostgreSQL



I want to generate random numbers in PostgreSQL the same way I do in MySQL, as shown below. I want to do this inside a Postgres function.

MySQL:

DROP PROCEDURE IF EXISTS Generate_random;
DELIMITER $$
CREATE PROCEDURE Generate_random()
BEGIN
Drop table if exists aa_dev.`Agents`;
CREATE TABLE aa_dev.`Agents`(AgentID int PRIMARY KEY);
SET @first = 1;
SET @last = 1000;
WHILE(@first <= @last) Do
INSERT INTO aa_dev.`Agents` VALUES(FLOOR(RAND()*(2900000-2800000+1)+2800000))
ON DUPLICATE KEY UPDATE AgentID = FLOOR(RAND()*(2900000-2800000+1)+2800000);
IF ROW_COUNT() = 1 THEN
SET @first = @first + 1;
END IF;
END WHILE;
END$$

DELIMITER ;
CALL Generate_random();

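So far I have managed to generate random numbers in Postgres, but they keep repeating within the column. Please tell me how to implement the MySQL code above in PostgreSQL.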

drop function if exists aa_dev.rand_cust(low INT, high INT, total INT);
CREATE OR REPLACE FUNCTION aa_dev.rand_cust(low INT ,high INT, total INT)
RETURNS TABLE (Cust_id  int) AS
$$
declare
counter int := 0;
rand int := 0;

begin
------------------- Creating a customer table with Cust_id----------------------------
DROP TABLE IF EXISTS aa_dev.Customer;
CREATE TABLE IF NOT EXISTS aa_dev.Customer (
Cust_id INT
);
--------------------- Loop to insert random -----------------------
while counter < total loop
rand = floor(random()* (high-low + 1) + low);
Insert into aa_dev.Customer (Cust_id) values(rand);
counter := counter + 1;
end loop;
return query
select *
from aa_dev.customer;
end
$$
LANGUAGE plpgsql;
select * from aa_dev.rand_cust(1, 50, 100);

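In the Postgres version you are asking for 100 numbers between 1 and 50, so naturally there will be duplicates!

The MySQL code draws from a much larger range of possible values (about 100,000), of which only 1,000 are sampled. The MySQL code also keeps retrying until an insert goes through without a key collision (the counter only advances when ROW_COUNT() = 1), so the column ends up with no duplicates.

So, for Postgres, you could check for duplicates and retry whenever one is found. Making the column unique will prevent duplicate inserts, but you then have to handle the conflict yourself.

You also need a range that contains at least as many values as the number of rows you want. Be careful with the retry logic, and do not copy the MySQL example as it stands: if the range holds fewer values than the requested count, the loop will never terminate.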
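The retry approach described above could look roughly like the following. This is a minimal sketch, not the asker's code: the function name rand_unique is hypothetical, the table is created without a schema prefix, and ON CONFLICT DO NOTHING assumes PostgreSQL 9.5 or later.

```sql
-- Sketch only: rand_unique is a hypothetical name, not from the question.
CREATE OR REPLACE FUNCTION rand_unique(low INT, high INT, total INT)
RETURNS SETOF INT AS
$$
DECLARE
    inserted INT := 0;  -- rows successfully inserted so far
    n INT;              -- rows affected by the last INSERT (0 or 1)
BEGIN
    -- Guard: without this check the loop below could run forever.
    IF high - low + 1 < total THEN
        RAISE EXCEPTION 'range % to % holds fewer than % distinct values',
            low, high, total;
    END IF;

    DROP TABLE IF EXISTS Customer;
    CREATE TABLE Customer (Cust_id INT PRIMARY KEY);

    WHILE inserted < total LOOP
        INSERT INTO Customer (Cust_id)
        VALUES (floor(random() * (high - low + 1) + low)::INT)
        ON CONFLICT DO NOTHING;         -- a duplicate is skipped, not an error
        GET DIAGNOSTICS n = ROW_COUNT;  -- 1 if the row went in, 0 if skipped
        inserted := inserted + n;       -- count only real inserts
    END LOOP;

    RETURN QUERY SELECT c.Cust_id FROM Customer AS c;
END
$$
LANGUAGE plpgsql;

SELECT * FROM rand_unique(2800000, 2900000, 1000);
```

The loop is cheap when the range is much larger than the requested count, as in the MySQL example. When total approaches the size of the range, most attempts collide and the loop slows down considerably.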


Update

Here is a function that generates unique random numbers within a range and fills a table with them:

DROP FUNCTION IF EXISTS rand_cust (low INT, high INT, total INT);
CREATE OR REPLACE FUNCTION rand_cust (low INT, high INT, total INT) 
RETURNS TABLE (Cust_id INT) 
AS 
$$ 
BEGIN
------------------- Creating a customer table with Cust_id----------------------------
DROP TABLE IF EXISTS Customer;
CREATE TABLE IF NOT EXISTS Customer(Cust_id INT);
RETURN query
INSERT INTO Customer(Cust_id)
SELECT *
FROM generate_series(low, high)
ORDER BY random() LIMIT total
RETURNING -- returns the id's you generated
Customer.Cust_id;
END $$ 
LANGUAGE plpgsql;
SELECT *
FROM rand_cust(1000, 2000, 100);  -- 100 unique numbers between 1000 and 2000 inclusive

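Note that this cannot produce more numbers than the range contains. For example, you cannot generate 100 unique numbers between 1 and 50; the most you can get is 50. This is a consequence of the uniqueness requirement. The LIMIT clause will not raise an error in that case (you simply get fewer rows), but you could add code that checks (high - low + 1) >= total before running the query.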
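One way to add such a check is sketched below. The function name sample_checked is hypothetical and the error message is only illustrative. Because both ends of the range are inclusive, the range holds high - low + 1 values.

```sql
-- Sketch only: sample_checked is a hypothetical name.
CREATE OR REPLACE FUNCTION sample_checked(low INT, high INT, total INT)
RETURNS SETOF INT AS
$$
BEGIN
    -- Refuse requests that cannot be satisfied with unique values.
    IF high - low + 1 < total THEN
        RAISE EXCEPTION 'cannot draw % unique values from % to %',
            total, low, high;
    END IF;

    RETURN QUERY
    SELECT g
    FROM generate_series(low, high) AS g
    ORDER BY random()   -- shuffle the whole range
    LIMIT total;        -- keep the first "total" values
END
$$
LANGUAGE plpgsql;

SELECT * FROM sample_checked(1, 50, 50);   -- all 50 values, in random order
SELECT * FROM sample_checked(1, 50, 100);  -- raises an exception
```

Raising an exception is a design choice here; silently returning fewer rows, as the plain LIMIT does, may be preferable in some callers.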

If you would rather have a simple function that just returns n unique random numbers:

DROP FUNCTION IF EXISTS sample(low INT, high INT, total INT);
CREATE OR REPLACE FUNCTION sample(low INT, high INT, total INT) 
RETURNS TABLE (Cust_id INT) 
AS 
$$ 
BEGIN
RETURN query
SELECT *
FROM generate_series(low, high)
ORDER BY random() LIMIT total;  
END $$ 
LANGUAGE plpgsql;
-- create a table of unique random values (Customer must not already exist)
SELECT * INTO Customer FROM sample(100, 200, 10);

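As already mentioned, your range is 1 to 50 and you want to create 100 records. That can never be unique. On top of that, your query does not ask for unique values, so even with a million possible values you could still end up with duplicates.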

Your code can also be much simpler: no loop, just a single query:

DROP FUNCTION IF EXISTS aa_dev.rand_cust ( low INT, high INT, total INT );
CREATE OR REPLACE FUNCTION aa_dev.rand_cust ( low INT, high INT, total INT ) 
RETURNS TABLE ( Cust_id INT ) 
AS 
$$ 
BEGIN
------------------- Creating a customer table with Cust_id----------------------------
DROP TABLE IF EXISTS aa_dev.Customer;
CREATE TABLE IF NOT EXISTS aa_dev.Customer ( Cust_id INT );
--------------------- No Loop to insert random -----------------------
RETURN query
INSERT INTO aa_dev.Customer ( Cust_id )
SELECT FLOOR ( random( ) * ( high - low + 1 ) + low ) -- no uniqueness!
FROM    generate_series(1, total) -- no loop needed
RETURNING -- returns the id's you generated
Customer.Cust_id;

END $$ 
LANGUAGE plpgsql;
SELECT
* 
FROM
aa_dev.rand_cust ( 1, 50, 100 );
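To see how many duplicates a run like this actually produced, a quick check could compare the row count with the number of distinct values. This is only an illustrative query against the aa_dev.Customer table created above; the column aliases are made up for the example.

```sql
-- Compare total rows with distinct values; any gap means duplicates.
SELECT count(*)                            AS total_rows,
       count(DISTINCT Cust_id)             AS distinct_ids,
       count(*) - count(DISTINCT Cust_id)  AS duplicate_rows
FROM aa_dev.Customer;
```

With 100 rows drawn from the range 1 to 50, distinct_ids can be at most 50, so duplicate_rows will be at least 50.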
