计算 BigQuery 中的谷歌分析唯一事件



我已经设法通过ISOweek计算了总事件,但没有使用BigQuery计算给定Google Analytics事件的唯一事件。检查 GA 时,total_events与点上的 GA 接口匹配,但unique_events处于关闭状态。你知道我该怎么解决这个问题吗?

查询:

SELECT INTEGER(STRFTIME_UTC_USEC(PARSE_UTC_USEC(date),"%V")) iso8601_week_number,
hits.eventInfo.eventCategory,
hits.eventInfo.eventAction,
COUNT(hits.eventInfo.eventCategory) AS total_events,
EXACT_COUNT_DISTINCT(fullVisitorId) AS unique_events
FROM
    TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2017-05-01'), TIMESTAMP('2017-05-07'))
WHERE
  hits.type = 'EVENT' AND hits.eventInfo.eventCategory = 'BIG_Transaction'
GROUP BY
iso8601_week_number, hits.eventInfo.eventCategory, hits.eventInfo.eventAction

根据范围,您需要count(distinct )不同的事情,但您始终需要满足以下条件:

  • 唯一事件是指类别、操作和标签的组合
  • 确保eventActionNULL
  • 确保eventLabelNULL
  • 允许eventCategory NULL

我正在使用COALESCE()来避免NULL

示例会话范围

SELECT
  SUM( (SELECT COUNT(h.eventInfo.eventCategory) FROM t.hits h) ) events,
  SUM( (SELECT COUNT(DISTINCT 
    CONCAT( h.eventInfo.eventCategory,
      COALESCE(h.eventinfo.eventaction,''),
      COALESCE(h.eventinfo.eventlabel, ''))
      )
    FROM
      t.hits h ) ) uniqueEvents
FROM
  `google.com:analytics-bigquery.LondonCycleHelmet.ga_sessions_20130910` t

命中范围示例

SELECT
  h.eventInfo.eventCategory,
  COUNT(1) events,
  -- we need to take sessions into account, so we add fullvisitorid and visitstarttime
  COUNT(DISTINCT CONCAT(fullvisitorid, CAST(visitstarttime AS string), 
    COALESCE(h.eventinfo.eventaction,''), 
    COALESCE(h.eventinfo.eventlabel, ''))) uniqueEvents
FROM
  `google.com:analytics-bigquery.LondonCycleHelmet.ga_sessions_20130910` t,
  t.hits h
WHERE
  h.type='EVENT'
GROUP BY
  1
ORDER BY
  2 DESC

嗨!

Google Analytics(分析(中唯一事件的定义是:

具有类别/操作/标签的事件的次数计数 值在会话中至少出现一次。

换句话说,发送特定事件(由类别、操作和标签定义(的会话数。在查询中,您需要计算拥有该事件的唯一身份访问者的数量,同时您需要计算会话数,并记住,具有不同标签的事件应计为不同的唯一身份事件(尽管我们只对类别和操作感兴趣(。

修复代码的一种可能方法是:

SELECT 
  INTEGER(STRFTIME_UTC_USEC(PARSE_UTC_USEC(date),"%V")) iso8601_week_number,
  hits.eventInfo.eventCategory,
  hits.eventInfo.eventAction,
  COUNT(hits.eventInfo.eventCategory) AS total_events,
  EXACT_COUNT_DISTINCT(CONCAT(fullVisitorId,'-',string(visitId),'-',date,'-',ifnull(hits.eventInfo.eventLabel,'null'))) AS unique_events    
FROM
  TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2017-05-01'), TIMESTAMP('2017-05-07'))
WHERE
  hits.type = 'EVENT' AND hits.eventInfo.eventCategory = 'BIG_Transaction'
GROUP BY
    iso8601_week_number, hits.eventInfo.eventCategory, hits.eventInfo.eventAction

此查询的结果应与 GA 接口中的数据匹配。

我认为

问题在于您只计算完成指定操作的唯一身份访问者的数量,而 GA 将唯一身份事件定义为"会话在某个日期范围内包含特定维度的次数"。

因此,我只是将您的代码更改为以下内容:

SELECT INTEGER(STRFTIME_UTC_USEC(PARSE_UTC_USEC(date),"%V")) iso8601_week_number,
hits.eventInfo.eventCategory,
hits.eventInfo.eventAction,
COUNT(hits.eventInfo.eventCategory) AS total_events,
EXACT_COUNT_DISTINCT(CONCAT(fullVisitorId, STRING(visitId))) AS unique_events
FROM
    TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2017-05-01'), TIMESTAMP('2017-05-07'))
WHERE
  hits.type = 'EVENT' AND hits.eventInfo.eventCategory = 'BIG_Transaction'
GROUP BY
iso8601_week_number, hits.eventInfo.eventCategory, hits.eventInfo.eventAction

这应该为您提供具有给定事件的会话的不同计数。

我们做了一些类似于@Martin建议的一些很酷的CTE的事情,我们能够从BigQuery获得100%匹配的Google Analytics

查看下面的代码片段,该代码段返回每天的会话总和 + 唯一的添加到购物车事件:

   #standardSQL
WITH AN_ATC AS 
(
  SELECT
    -- full date w/ hyphens (ie 2021-01-07)
      CAST(format_date('%Y-%m-%d', parse_date("%Y%m%d", date)) AS DATE) as DATE,
      -- COUNT OF SESSIONS
      COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitStartTime AS STRING))) AS Sessions,
      -- COUNT OF UNIQUE EVENTS PER SESSION
       COUNT(DISTINCT CONCAT(fullvisitorid, CAST(visitstarttime AS string), 
        COALESCE(hits.eventinfo.eventaction,''), 
        COALESCE(hits.eventinfo.eventlabel, ''))) AS EVENTS
  
  FROM `an-big-query.PROJECT_ID.ga_sessions_*` , 
  UNNEST(hits) as hits
  WHERE
  -- start date
  _table_suffix BETWEEN '20190101'
  -- yesterday
  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(),INTERVAL 1 DAY)) 
  AND hits.eventInfo.eventAction = 'add to cart'
  GROUP BY 
    date
)
SELECT
DATE,
SESSIONS,
EVENTS
FROM AN_ATC
ORDER BY date DESC

哪里

会话 = Google Analytics ga:会话

事件 = Google Analytics ga:uniqueEvents

事件操作 = @add 到购物车

希望能帮助所有正在搜索/谷歌搜索的人!

最新更新