How to find the total number of sessions played in BigQuery



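How can I find the total number of sessions played by all users in a month? The user_engagement event has a session count parameter that increments with each session. The problem is that a user who played 10 sessions has session counts from 1 to 10. So how do I add only the maximum session count (10 in this example), and do the same for every user?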

SELECT
SUM(session_count) AS total_sessions,
COUNT(DISTINCT user_pseudo_id) AS users
FROM
`xyz.analytics_111.events_*`
WHERE
event_name = "user_engagement" AND (_TABLE_SUFFIX BETWEEN "20200201" AND "20200229")
AND platform = "ANDROID"

Try the following (BigQuery Standard SQL):

#standardSQL
SELECT
SUM(session_count) AS total_sessions,
COUNT(user_pseudo_id) AS users
FROM (
SELECT user_pseudo_id, MAX(session_count) session_count
FROM `xyz.analytics_111.events_*`
WHERE event_name = "user_engagement" 
AND _TABLE_SUFFIX BETWEEN "20200201" AND "20200229"
AND platform = "ANDROID"
GROUP BY user_pseudo_id
)
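
To see why the per-user MAX matters, here is a small self-contained sketch. The inline `events` rows are made up for illustration and stand in for the real `events_*` table; `session_count` is treated as a plain column, as in the queries above.

```sql
#standardSQL
-- Hypothetical sample data: u1 played 3 sessions, u2 played 2.
WITH events AS (
  SELECT * FROM UNNEST([
    STRUCT('u1' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u1' AS user_pseudo_id, 2 AS session_count),
    STRUCT('u1' AS user_pseudo_id, 3 AS session_count),
    STRUCT('u2' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u2' AS user_pseudo_id, 2 AS session_count)
  ])
)
SELECT
  SUM(session_count) AS total_sessions,  -- 3 + 2 = 5
  COUNT(user_pseudo_id) AS users         -- 2
FROM (
  -- Keep only the highest counter value seen for each user.
  SELECT user_pseudo_id, MAX(session_count) AS session_count
  FROM events
  GROUP BY user_pseudo_id
)
```

A plain SUM(session_count) over the same five rows would return 1 + 2 + 3 + 1 + 2 = 9, which is the overcount described in the question.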

I'm not clear on what your data looks like. If there is one row per session, then you can simply use:

SELECT COUNT(*) AS total_sessions,
COUNT(DISTINCT user_pseudo_id) AS users
. . .

If each session can have multiple events, you can use a neat trick:

SELECT COUNT(DISTINCT CONCAT(user_pseudo_id, ':', CAST(session_count as string)))

I offer this because sometimes, in a complex query, it is simpler to adjust just one line. Otherwise, Mikhail's solution is reasonable.
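
As a rough sketch of that one-line trick in a complete query, the example below uses made-up inline rows (the `events` CTE is hypothetical, not the real export table) in which a session can produce several events:

```sql
#standardSQL
-- Hypothetical sample data: u1 has sessions 1 and 2, u2 has session 1,
-- and some sessions appear on more than one event row.
WITH events AS (
  SELECT * FROM UNNEST([
    STRUCT('u1' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u1' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u1' AS user_pseudo_id, 2 AS session_count),
    STRUCT('u2' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u2' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u2' AS user_pseudo_id, 1 AS session_count)
  ])
)
SELECT
  -- Each distinct (user, session number) pair is counted once.
  COUNT(DISTINCT CONCAT(user_pseudo_id, ':', CAST(session_count AS STRING))) AS total_sessions,  -- 3
  COUNT(DISTINCT user_pseudo_id) AS users  -- 2
FROM events
```

Note that this counts the distinct sessions that actually appear in the selected rows, so it may differ from the MAX approach if a user's counter started before the date range.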

However, I would suggest using window functions:

SELECT SUM(CASE WHEN seqnum = 1 THEN session_count END) AS total_sessions,
       COUNT(DISTINCT user_pseudo_id) AS users
FROM (SELECT e.*,
             -- seqnum = 1 marks the row with the highest session_count per user
             ROW_NUMBER() OVER (PARTITION BY user_pseudo_id ORDER BY session_count DESC) AS seqnum
      -- the table is aliased as e here so that e.* and e.event_name resolve
      FROM `xyz.analytics_111.events_*` e
      WHERE e.event_name = 'user_engagement' AND
            _TABLE_SUFFIX BETWEEN '20200201' AND '20200229' AND
            platform = 'ANDROID'
     ) e;

The reason I suggest this is that you can keep the rest of the calculations without changing them. This is handy in complex queries.
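
A minimal sketch of that point, again on made-up inline rows (the `events` CTE and the extra `events` count are illustrative assumptions): because no rows are collapsed before the outer SELECT, other aggregates such as COUNT(*) still see every event.

```sql
#standardSQL
-- Hypothetical sample data: u1 reaches session 2, u2 stays on session 1.
WITH events AS (
  SELECT * FROM UNNEST([
    STRUCT('u1' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u1' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u1' AS user_pseudo_id, 2 AS session_count),
    STRUCT('u2' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u2' AS user_pseudo_id, 1 AS session_count),
    STRUCT('u2' AS user_pseudo_id, 1 AS session_count)
  ])
)
SELECT
  -- Only the top-ranked row per user contributes its session_count.
  SUM(CASE WHEN seqnum = 1 THEN session_count END) AS total_sessions,  -- 2 + 1 = 3
  COUNT(DISTINCT user_pseudo_id) AS users,                             -- 2
  COUNT(*) AS events                                                   -- 6, unaffected by the ranking
FROM (
  SELECT e.*,
         ROW_NUMBER() OVER (PARTITION BY user_pseudo_id ORDER BY session_count DESC) AS seqnum
  FROM events e
) e
```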
