How to calculate an average time in BigQuery when working with the TIME type



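I'm trying to get the average time, but the AVG function doesn't support the TIME type. I tried CAST, as suggested in several posts, but it doesn't seem to work either. Thanks.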

WITH october_fall AS
(SELECT
start_station_name,
end_station_name,
start_station_id,
end_station_id,
EXTRACT (DATE FROM started_at) AS start_date,
EXTRACT(DAYOFWEEK FROM started_at) AS start_week_date,
EXTRACT (TIME FROM started_at) AS start_time,    
EXTRACT (DATE FROM ended_at) AS end_date,
EXTRACT(DAYOFWEEK FROM ended_at) AS end_week_date,    
EXTRACT (TIME FROM ended_at) AS end_time,
DATETIME_DIFF (ended_at,started_at, MINUTE) AS total_lenght,
member_casual
FROM 
`ciclystic.cyclistic_seasonal_analysis.fall_202010` AS fall_analysis
ORDER BY 
started_at DESC)
SELECT
COUNT (start_week_date) AS avg_start_1,
AVG (start_time) AS avg_start_time_1, -- this is where the problem starts
member_casual
FROM 
october_fall
WHERE 
start_week_date = 1
GROUP BY
member_casual

Try the following:

SELECT
COUNT (start_week_date) AS avg_start_1,
TIME(
EXTRACT(hour   FROM AVG(start_time - '0:0:0')), 
EXTRACT(minute FROM AVG(start_time - '0:0:0')), 
EXTRACT(second FROM AVG(start_time - '0:0:0'))
) as avg_start_time_1,
member_casual
FROM 
october_fall
WHERE 
start_week_date = 1
GROUP BY
member_casual     

Another option is:

SELECT
COUNT (start_week_date) AS avg_start_1,
PARSE_TIME('0-0 0 %H:%M:%E*S', '' || AVG(start_time - '0:0:0')) as avg_start_time_1,
member_casual
FROM 
october_fall
WHERE 
start_week_date = 1
GROUP BY
member_casual     

BigQuery can't compute AVG over the TIME type, so you get an error message if you try.

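Instead, you can compute the AVG over INT64 values.
time_ts is a TIMESTAMP column.
I used time_diff to count the seconds from "00:00:00" to each time, averaged them to get a FLOAT64 number of seconds, and cast that to INT64.
I then created a function, secondToTime, that splits the seconds into hours, minutes, and seconds and parses them back into a TIME value.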

For the DATE type, I think you can use the same approach.

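create temp function secondToTime (seconds INT64)
returns time 
as (
PARSE_TIME (
"%H:%M:%S",
concat(
-- div() truncates; casting seconds / 3600 to an integer would round to the nearest hour
cast(div(seconds, 3600) as string),
":",
cast(div(mod(seconds, 3600), 60) as string),
":",
cast(mod(seconds, 60) as string)
)
)
);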

with october_fall as (
select
extract (date from time_ts) as start_date,
extract (time from time_ts) as start_time
from `bigquery-public-data.hacker_news.comments`
limit 10
) SELECT 
avg(time_diff(start_time, time '00:00:00', second)),
secondToTime(
cast(avg(time_diff(start_time, time '00:00:00', second)) as INT64) 
),
secondToTime(0),
secondToTime(60),
secondToTime(3601),
secondToTime(7265)
FROM october_fall
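
The same idea carries over to DATE values. The following is a minimal sketch, not a tested production query: it turns each date into a day count with UNIX_DATE, averages the counts, and converts the rounded result back with DATE_FROM_UNIX_DATE. The `sample_dates` CTE and its three rows are made up for illustration.

```sql
-- Sketch: average a DATE column by averaging days since 1970-01-01.
-- `sample_dates` is hypothetical inline data, not a real table.
WITH sample_dates AS (
  SELECT DATE '2020-10-01' AS start_date UNION ALL
  SELECT DATE '2020-10-03' UNION ALL
  SELECT DATE '2020-10-08'
)
SELECT
  DATE_FROM_UNIX_DATE(
    -- AVG returns FLOAT64; CAST to INT64 rounds to the nearest whole day
    CAST(AVG(UNIX_DATE(start_date)) AS INT64)
  ) AS avg_start_date
FROM sample_dates;
```

With these three rows the day offsets from the first date are 0, 2, and 7, so the average lands 3 days in, on 2020-10-04.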

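I know several months have passed, but maybe someone else will run into the same problem. For the part where the problem occurs, a similar approach worked for me and gave the average ride length: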

FORMAT_TIMESTAMP(
'%T',
TIMESTAMP_SECONDS(CAST(AVG(TIME_DIFF(ride_length, '00:00:00', SECOND)) AS INT64))
) AS avg_ride_length
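
FORMAT_TIMESTAMP returns a STRING. If a TIME value is wanted instead, one possible variation is to wrap the timestamp in TIME(), which reads the time of day in UTC by default. The sketch below applies that to a grouped average of start times; the inline `october_fall` rows are invented sample data standing in for the real table, and `ride_count` is just an illustrative alias.

```sql
-- Sketch: average start time per rider type, returned as TIME.
-- The rows below are hypothetical sample data.
WITH october_fall AS (
  SELECT TIME '08:00:00' AS start_time, 'member' AS member_casual UNION ALL
  SELECT TIME '10:00:00', 'member' UNION ALL
  SELECT TIME '17:30:00', 'casual' UNION ALL
  SELECT TIME '18:30:00', 'casual'
)
SELECT
  member_casual,
  COUNT(*) AS ride_count,
  TIME(TIMESTAMP_SECONDS(
    -- seconds since midnight, averaged, then rounded to a whole second
    CAST(AVG(TIME_DIFF(start_time, TIME '00:00:00', SECOND)) AS INT64)
  )) AS avg_start_time
FROM october_fall
GROUP BY member_casual;
```

For this sample data the averages are 09:00:00 for member and 18:00:00 for casual. Note that averaging seconds since midnight treats the day as a straight line, so times on either side of midnight (23:50 and 00:10, say) average to around noon rather than midnight.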
