如何根据时间使用flink过滤数据?



我有一些数据像这样的格式:

(id, time, value)

给出了以下模拟数据(可能有重复数据):

("a1", "2022-06-28 00:00:00", "0.23"), // The time interval is 15 minutes, and there is only 24-hour data of the day
("a1", "2022-06-28 00:15:00", "0.89"),
...
("a1", "2022-06-28 23:59:59", "0.11"),
("b1", "2022-06-28 00:00:00", "0.23"), 
("b1", "2022-06-28 00:15:00", "0.89"),
...
("b1", "2022-06-28 23:59:59", "0.11"),

("c1", "2022-06-28 00:00:00", "0.23"), 
("c1", "2022-06-28 00:15:00", "0.89"),
...
("c1", "2022-06-28 23:59:59", "0.11"),

假设现在是2022-06-28 16:00:00,我要计算1h,45min,30min,15min前和现在的数据。

输出应该是这样的:

("a1", "2022-06-28 15:00:00", "1"),
("a1", "2022-06-28 15:15:00", "1"),
("a1", "2022-06-28 15:30:00", "1"),
("a1", "2022-06-28 15:45:00", "1"),
("a1", "2022-06-28 16:00:00", "1"),
("b1", "2022-06-28 15:00:00", "1"),
("b1", "2022-06-28 15:15:00", "1"),
("b1", "2022-06-28 15:30:00", "1"),
("b1", "2022-06-28 15:45:00", "1"),
("b1", "2022-06-28 16:00:00", "1"),
("c1", "2022-06-28 15:00:00", "1"),
("c1", "2022-06-28 15:15:00", "1"),
("c1", "2022-06-28 15:30:00", "1"),
("c1", "2022-06-28 15:45:00", "1"),
("c1", "2022-06-28 16:00:00", "1"),

如何编写Flink程序?最好是用Java或Scala编写。如果你能给我看一些代码片段,我将不胜感激!

我推荐看看窗口从Flink表值函数。您可以在https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/window-tvf/#tumble

找到文档和示例。

最新更新