在简单的 postgres 和 timescaleDB 查询性能测试中没有看到任何提升？

我在postgres和timescaleDB之间尝试了一个简单的性能测试。以下是我的结果：-

共403204行

使用Postgres

聚合查询176行的提取时间：203ms-240ms

Join查询102行的获取时间：660ms-720ms

使用TimescaleDB

聚合查询176行的提取时间：175ms-200ms

加入查询102行的获取时间：614ms-650ms

CREATE TABLE public.sensors(
id SERIAL PRIMARY KEY,
type VARCHAR(50),
location VARCHAR(50)
);

-- Postgres table
CREATE TABLE sensor_data (
time TIMESTAMPTZ NOT NULL,
sensor_id INTEGER,
temperature DOUBLE PRECISION,
cpu DOUBLE PRECISION,
FOREIGN KEY (sensor_id) REFERENCES sensors (id)
);

--drop table public.sensor_data;

-- TimescaleDB table
CREATE TABLE sensor_data_ts (
time TIMESTAMPTZ NOT NULL,
sensor_id INTEGER,
temperature DOUBLE PRECISION,
cpu DOUBLE PRECISION,
FOREIGN KEY (sensor_id) REFERENCES sensors (id)
);
SELECT create_hypertable('sensor_data_ts', 'time');

-- Insert Data

INSERT INTO sensors (type, location) VALUES
('a','floor'),
('a', 'ceiling'),
('b','floor'),
('b', 'ceiling');


-- Postgres 

INSERT INTO sensor_data (time, sensor_id, cpu, temperature)
SELECT
time,
sensor_id,
random() AS cpu,
random()*100 AS temperature
FROM generate_series(now() - interval '50 week', now(), interval '5 minute') AS g1(time), generate_series(1,4,1) AS g2(sensor_id);

-- TimescaleDB
INSERT INTO sensor_data_ts (time, sensor_id, cpu, temperature)
SELECT
time,
sensor_id,
random() AS cpu,
random()*100 AS temperature
FROM generate_series(now() - interval '50 week', now(), interval '5 minute') AS g1(time), generate_series(1,4,1) AS g2(sensor_id);


--truncate table public.sensor_data;
--truncate table public.sensor_data_ts;

select count(*) from public.sensor_data sd ;
select count(*) from public.sensor_data_ts sd ;

--Postgres

--Aggregate queries
SELECT 
floor(extract(epoch from "time")/(60*60*24*2)) as period,
AVG(temperature) AS avg_temp, 
AVG(cpu) AS avg_cpu 
FROM sensor_data 
GROUP BY period;
--ORDER BY PERIOD;
--Join Queries
SELECT 
sensors.location,
floor(extract(epoch from "time")/(60*60*24*7)) as period,
AVG(temperature) AS avg_temp, 
last(temperature, time) AS last_temp, 
AVG(cpu) AS avg_cpu 
FROM sensor_data JOIN sensors on sensor_data.sensor_id = sensors.id
GROUP BY period, sensors.location;

--Timescale DB

--Aggregate Queries
SELECT 
time_bucket('2 day', time) AS period, 
AVG(temperature) AS avg_temp, 
AVG(cpu) AS avg_cpu 
FROM sensor_data_ts 
GROUP BY period;
--ORDER BY PERIOD;

--Join Queries
SELECT 
sensors.location,
time_bucket('1 week', time) AS period, 
AVG(temperature) AS avg_temp, 
last(temperature, time) AS last_temp, 
AVG(cpu) AS avg_cpu 
FROM sensor_data JOIN sensors on sensor_data.sensor_id = sensors.id
GROUP BY period, sensors.location;

我期待着查询性能得到一些实质性的提升。我还能做些什么来提高查询性能？

几件事：

我有点困惑。time_bucket是TimescaleDB函数，而不是Postgres函数，所以它可能正在运行我们的一些代码
您仍在对所有数据执行全表扫描。这里没有太多的优化方法。数据集很小(400K(，因此将全部放入缓冲缓存；如果您想查看一些插入/查询性能，可能需要(a(更多的数据，(b(更复杂的查询类型
但TimescaleDB也有其他功能。例如，打开压缩，你可能会发现这些"；全表扫描"；更快(尽管一旦进入磁盘绑定的工作负载(。或者打开连续的aggs，这样您就可以连续/增量地实现这些结果，例如面向用户的仪表板

相关内容

最新更新

热门标签：