如何使用SQL从查找字典的值中获取平均值



我的数据帧如下所示:

id            value
a       0:3,1:0,2:0,3:4
a       0:0,1:0,2:2,3:0
a       0:0,1:5,2:4,3:0

我想写一个查询来获得列值中键的平均值?

例如,对于0:3,1:0,2:0,3:4,它必须是(0+0+0+3+3+3+3)/7 = 1.71

对于0:0,1:0,2:2,3:0,它必须是(2+2)/2=2

对于0:0,1:5,2:4,3:0,它必须是(1+1+1+1+1+2+2+2+2)/9 = 1.44

所以期望的结果是:

id            value
a              1.71
a              2.00
a              1.44

如何做到这一点?有sql函数可以得到这个结果吗?

请参阅此DBFIDDLE

代码:

CREATE  PROCEDURE `avg_dict`(s varchar(100))
BEGIN
SET @result = CONCAT('SELECT (', replace(replace(s, ":","*"),",","+"), ')/(',regexp_replace(s,",?[0-9]:","+"),')');
PREPARE stmt FROM @result;
EXECUTE stmt  ;
DEALLOCATE PREPARE stmt;
END

结果:

stmt输出
CALL avg_dict("0:3,1:0,2:0,3:4");1.743
CALL avg_dict("0:0,1:0,2:2,3:0");2000
CALL avg_dict("0:0,1:5,2:4,3:0");1.4444

通过splittransformrepeat的组合,您可以实现您的目标:

WITH dataset(id, value) AS (
values ('a', '0:3,1:0,2:0,3:4'),
('a', '0:0,1:0,2:2,3:0'),
('a', '0:0,1:5,2:4,3:0')
)
SELECT id,
reduce(arr, 0.0, (s, x)->s + x, s->s) / cardinality(arr)
FROM(
SELECT *,
flatten(
transform(
transform(
split(value, ','),
s->split(s, ':')
),
arr->repeat(
cast(arr [ 1 ] as INTEGER),
cast(arr [ 2 ] as INTEGER)
)
)
) as arr
FROM dataset
)

输出:

id_col1
a1.7142857142857142
a2.0
a1.44444444444444

相关内容

  • 没有找到相关文章

最新更新