Numeric precision error with a PostgreSQL window function



Here are my queries.

Example 1

SELECT sum(percent::numeric)::numeric
FROM (
SELECT (amount::numeric * 100::numeric) / sum(amount::numeric) OVER (PARTITION BY a, b, c)::numeric as percent
FROM (
values ('a', 'b', 'c', 1, '609639'::numeric)
, ('a', 'b', 'c', 2, '72296.66'::numeric)
, ('a', 'b', 'c', 10, '174027.6'::numeric)
, ('a', 'b', 'c', 12, '154179.16'::numeric)
, ('a', 'b', 'c', 14, '97530.2'::numeric)
, ('a', 'b', 'c', 15, '1335529.7'::numeric)
, ('a', 'b', 'c', 25, '201315.1'::numeric)
, ('a', 'b', 'c', 51, '1424121.5'::numeric)
, ('a', 'b', 'c', 61, '51045.5'::numeric)
, ('a', 'b', 'c', 63, '14664.4'::numeric)
, ('a', 'b', 'c', 66, '740921.9'::numeric)
, ('a', 'b', 'c', 70, '189266.9'::numeric)
) AS sample_values (a, b, c, d, amount)
) as table_1;
sum
-------------------------
99.99999999999999997676
(1 row)

Example 2

SELECT sum(percent::numeric)::numeric
FROM (
SELECT (amount::numeric * 100::numeric) / sum(amount::numeric) OVER (PARTITION BY a, b, c)::numeric as percent
FROM (
values  ('a', 'b', 'c', 2, '72296.66'::numeric)
, ('a', 'b', 'c', 10, '174027.6'::numeric)
, ('a', 'b', 'c', 12, '154179.16'::numeric)
, ('a', 'b', 'c', 15, '1335529.7'::numeric)
, ('a', 'b', 'c', 25, '201315.1'::numeric)
, ('a', 'b', 'c', 51, '1424121.5'::numeric)
, ('a', 'b', 'c', 61, '51045.5'::numeric)
, ('a', 'b', 'c', 63, '14664.4'::numeric)
, ('a', 'b', 'c', 66, '740921.9'::numeric)
, ('a', 'b', 'c', 70, '189266.9'::numeric)
) AS sample_values (a, b, c, d, amount)
) as table_1;
sum
-------------------------
100.0000000000000000583
(1 row)

I cast everything to numeric, but the query still does not give me exactly 100.

Details

psql --version
psql (PostgreSQL) 13.1
postgres=# select version();
version
------------------------------------------------------------
PostgreSQL 13.1, compiled by Visual C++ build 1914, 64-bit
(1 row)
Windows 10 x64

I expected a clean 100, since numeric should not suffer from floating-point errors. What went wrong?

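I read the source code. Division appears to do its best to choose an appropriate result scale (number of decimal places) for the numeric type, so that the result is no less accurate than floating point.

As a result, each quotient here is rounded to roughly 16 to 20 decimal places, and those per-row rounding errors add up when the percentages are summed. https://github.com/postgres/postgres/blob/472e518a44eacd9caac7d618f1b6451672ca4481/src/interfaces/ecpg/pgtypeslib/numeric.c#L1044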

SELECT (amount::numeric * 100::numeric)::numeric / sum(amount::numeric) OVER (PARTITION BY a, b, c) as percent
FROM (
values ('a', 'b', 'c', 15, '1335529.7'::numeric)
, ('a', 'b', 'c', 63, '14664.4'::numeric)
, ('a', 'b', 'c', 70, '189266.9'::numeric)
) AS sample_values (a, b, c, d, amount)
percent
------------------------
86.7530713671863074
0.95256716474142573277
12.2943614680722669
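
To make the chosen scale visible per row, the built-in scale() function can be applied to the same quotient. This is only a sketch: the partition columns are dropped for brevity, and the column alias percent_scale is made up for illustration.

```sql
-- Sketch: show each quotient next to the scale PostgreSQL selected for it.
-- percent_scale is an illustrative alias, not part of the original query.
SELECT amount,
       amount * 100 / sum(amount) OVER () AS percent,
       scale(amount * 100 / sum(amount) OVER ()) AS percent_scale
FROM (
values ('1335529.7'::numeric)
, ('14664.4'::numeric)
, ('189266.9'::numeric)
) AS sample_values (amount);
```

Rows whose quotient is smaller than 1 should report a larger scale than the others, which matches the uneven number of decimal places in the output above.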

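One way to override the scale is to cast one of the numbers to a higher scale by hand, which then takes precedence over the automatically selected scale: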

SELECT (amount::numeric * 100::numeric)::numeric / sum(amount::numeric) OVER (PARTITION BY a, b, c) as percent
FROM (
values ('a', 'b', 'c', 15, '1335529.7'::numeric)
, ('a', 'b', 'c', 63, '14664.4'::numeric)
, ('a', 'b', 'c', 70, '189266.9'::numeric(100,50))
) AS sample_values (a, b, c, d, amount)
percent
-------------------------------------------------------
86.75307136718630741538759344991526254968459740129825
0.95256716474142573277270421270821410870428026432628
12.29436146807226685183970233737652334161112233437547
(3 rows)

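With a higher scale, the rounding error is pushed far out into the decimal places, so it no longer shows up at the precision you care about, although it is not removed entirely.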
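
A variation on the same idea, as a sketch rather than a definitive fix: cast the divisor instead of one of the input literals, so the data itself stays untouched, and round the final sum to the number of decimal places actually needed. The numeric(100,50) type and the 10 decimal places in round() are arbitrary choices for illustration.

```sql
-- Sketch: raise the scale on the divisor, then round the total.
-- numeric(100,50) and round(..., 10) are illustrative values, pick what fits your data.
SELECT round(sum(percent), 10) AS total
FROM (
SELECT (amount * 100) / (sum(amount) OVER (PARTITION BY a, b, c))::numeric(100,50) AS percent
FROM (
values ('a', 'b', 'c', 15, '1335529.7'::numeric)
, ('a', 'b', 'c', 63, '14664.4'::numeric)
, ('a', 'b', 'c', 70, '189266.9'::numeric)
) AS sample_values (a, b, c, d, amount)
) AS table_1;
```

Because each quotient is now rounded at the 50th decimal place, the accumulated error should be far smaller than the 10 places kept by round().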

If anyone knows how to set the scale globally for the database, please comment. I would rather not set it by hand in every query. Thanks!
