在 BigQuery 上重新创建 GA 漏斗



我正在尝试使用 BigQuery 重新创建 GA 漏斗(Google360 上的自定义报告(。GA 上的漏斗使用每个页面上发生的事件的唯一计数。我在网上找到了这个查询,它在很大程度上有效:

SELECT
COUNT( s0.firstHit) AS Landing_Page,
COUNT( s1.firstHit) AS Model_Selection
from(
SELECT
s0.fullvisitorID,
s0.firstHit,
s1.firstHit,
FROM (
# Begin Subquery #1 aka s0
SELECT
fullvisitorID,
MIN(hits.hitNumber) AS firstHit
FROm [64269470.ga_sessions_20170720]
WHERE
hits.eventInfo.eventAction  in ('landing_page') 
AND totals.visits = 1
GROUP BY
fullvisitorID
) s0
# End Subquery #1 aka s0
left join (
# Begin Subquery #2 aka s1
SELECT
fullvisitorID,
MIN(hits.hitNumber) AS firstHit
FROM [64269470.ga_sessions_20170720]
WHERE
hits.eventInfo.eventAction  in ('model_selection_page')
AND totals.visits = 1
GROUP BY
fullvisitorID,
) s1
ON
s0.fullvisitorID = s1.fullvisitorID
)

查询工作正常,着陆页的值与我在 GA 上获得的值相同,但Model_Selection高出约 10%。这种差异也沿着漏斗增加(为了清楚起见,我只发布了 2 个步骤(。 知道我在这里错过了什么吗?

此查询执行您需要的操作,但在标准 SQL 版本中:

#standardSQL
SELECT 
SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection  
FROM `64269470.ga_sessions_20170720`

仅此而已。 4条线,更快,更便宜。

您还可以使用模拟数据,例如:

#standardSQL
WITH data AS(
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits
)
SELECT 
SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection  
FROM data

请注意,在 GA 中构建此类报表可能会有点困难,因为您需要选择至少在事件"landing_page"触发后触发事件"model_selection_page"的访问者。确保您在 GA 中也正确生成了此报告(一种方法可能是首先生成自定义报告,仅包含已触发"landing_page"的客户,然后应用第二个筛选器查找"model_selection_page"(。

[编辑]:

您在评论中询问了有关在会话和用户级别进行计数的问题。对于每个会话的计数,可以将每个子查询评估的结果限制为 1,如下所示:

SELECT 
SUM((SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
SUM((SELECT 1 FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection  
FROM data

对于计算不同的用户,想法是相同的,但您必须应用COUNT(DISTINCT)操作,如下所示:

SELECT 
COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection  
FROM data

最新更新