Values increase when adding a JOIN statement



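I have been struggling with this problem for a few days and cannot solve it.

(I am running this on BigQuery.)

As a marketing client, I am trying to get the impressions and clicks for each advertiser.

p_impression has all the impression data, including ID, Campaign_ID, Advertiser_ID
p_click has all the click data, including ID, Campaign_ID, Advertiser_ID
match_table_campaigns has the campaign data: Campaign ID, Campaign name

(Essentially, my simple goal is to get impression and click counts and match them against the campaign table, so that I can see campaign names rather than just IDs.)

When I run the query with UNION ALL over the impression and click tables, using only the IDs, all the numbers match my platform and everything looks fine.

However, when I add a JOIN (I have tried all of them) to bring in the campaign name, all the impression and click metrics get inflated, and I can no longer get the correct numbers.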

with combinedDeliveryData AS (
SELECT
FORMAT_DATETIME("%F",DATETIME(TIMESTAMP_MILLIS( CAST(SUBSTR(CAST( Event_Time AS String),0,LENGTH(CAST( Event_Time AS String)) - 3) AS INT64 )),"America/Toronto")) AS Date,
Impression_ID,
DBM_Advertiser_ID,
Campaign_ID,
Ad_ID,
Rendering_ID,
Event_Type,
Advertiser_ID,
FROM
CampaignManager.p_impression
UNION ALL
SELECT
FORMAT_DATETIME("%F",DATETIME(TIMESTAMP_MILLIS( CAST(SUBSTR(CAST( Event_Time AS String),0,LENGTH(CAST( Event_Time AS String)) - 3) AS INT64 )),"America/Toronto")) AS Date,
Impression_ID,
DBM_Advertiser_ID,
Campaign_ID,
Ad_ID,
Rendering_ID,
Event_Type,
Advertiser_ID,
FROM
CampaignManager.p_click)

SELECT cpg.Campaign_ID, cpg.Campaign,data.Date,COUNT(case data.Event_Type when 'VIEW' then 1 else null end) AS Impressions
FROM match_table_campaigns AS cpg
LEFT JOIN combinedDeliveryData AS data ON cpg.Campaign_ID = data.Campaign_ID
WHERE cpg.Advertiser_ID = "4739279"
GROUP BY 3,1,2

Any help would be greatly appreciated. I am not sure what I am missing.

For reference, here is what works without the JOIN:

WITH combinedDeliveryData AS (
SELECT
Impression_ID,
Event_Time,
DBM_Advertiser_ID,
Campaign_ID,
Ad_ID,
Rendering_ID,
Event_Type,
Advertiser_ID
FROM
CampaignManager.p_impression
UNION ALL
SELECT
Impression_ID,
Event_Time,
DBM_Advertiser_ID,
Campaign_ID,
Ad_ID,
Rendering_ID,
Event_Type,
Advertiser_ID
FROM
CampaignManager.p_click)
SELECT FORMAT_DATETIME("%F",DATETIME(TIMESTAMP_MILLIS( CAST(SUBSTR(CAST( clicks.Event_Time AS String),0,LENGTH(CAST( clicks.Event_Time AS String)) - 3) AS INT64 )),"America/Toronto")) AS exposureDate,clicks.Campaign_ID,COUNT(case Event_Type when 'VIEW' then 1 else null end) AS Impressions,COUNT(case Event_Type when 'CLICK' then 1 else null end) AS Clicks
FROM 
combinedDeliveryData AS clicks
WHERE clicks.Campaign_ID = "4739279"
GROUP BY 1,2

New issue: I added the following code and it works great.

this_is_deduplicated AS (
SELECT x.*
FROM (
SELECT Campaign_ID, ARRAY_AGG(a LIMIT 1)[OFFSET(0)] x
FROM combinedDeliveryData a
GROUP BY 1
) 
)
LEFT JOIN this_is_deduplicated AS data ON cpg.Campaign_ID = data.Campaign_ID

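I got all my values and they are accurate. I have now run into another problem: I need to join one last table, but the solution above does not apply.

Columns in the new table: Date, Campaign_ID, Activity_ID

I now need to join this last table so that, for each date, I get the count of Activity_ID as Conversions.

My final table should be:

Advertiser, Campaign, Impressions, Clicks, Conversions (as described above)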

I am almost certain the cause is that the combinedDeliveryData table contains multiple rows with the same Campaign_ID.
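To illustrate how repeated keys inflate counts, here is a minimal sketch in BigQuery standard SQL. The table names `campaigns` and `events` and all rows are hypothetical sample data, not the asker's real tables, and the sketch assumes the lookup side also repeats a Campaign_ID (for example, one row per daily snapshot). A join emits one output row per matching pair, so every event of campaign '1' is counted twice.

```sql
-- Hypothetical sample data standing in for match_table_campaigns
-- and combinedDeliveryData; none of these rows come from the question.
WITH campaigns AS (
  SELECT * FROM UNNEST([
    STRUCT('1' AS Campaign_ID, 'Spring Sale' AS Campaign),
    STRUCT('1', 'Spring Sale'),   -- assumed duplicate row for campaign 1
    STRUCT('2', 'Summer Sale')
  ])
),
events AS (
  SELECT * FROM UNNEST([
    STRUCT('1' AS Campaign_ID, 'VIEW' AS Event_Type),
    STRUCT('1', 'VIEW'),
    STRUCT('1', 'CLICK'),
    STRUCT('2', 'VIEW')
  ])
)
SELECT
  cpg.Campaign_ID,
  cpg.Campaign,
  COUNTIF(data.Event_Type = 'VIEW')  AS Impressions,
  COUNTIF(data.Event_Type = 'CLICK') AS Clicks
FROM campaigns AS cpg
LEFT JOIN events AS data
  ON cpg.Campaign_ID = data.Campaign_ID
GROUP BY 1, 2
ORDER BY 1;
-- Campaign 1 really has 2 views and 1 click, but the duplicated
-- campaign row doubles them: the query reports 4 and 2.
-- Campaign 2 has no duplicate and reports 1 and 0.
```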

Before joining, make sure there is only one row per campaign ID. Something like:

WITH [...], 
this_is_deduplicated AS (
SELECT x.*
FROM (
SELECT Campaign_ID, ARRAY_AGG(a LIMIT 1)[OFFSET(0)] x
FROM combinedDeliveryData a
GROUP BY 1
) 
)

Then JOIN against it, like:

LEFT JOIN this_is_deduplicated AS data ON cpg.Campaign_ID = data.Campaign_ID

instead of

LEFT JOIN combinedDeliveryData AS data ON cpg.Campaign_ID = data.Campaign_ID

or, shorter:

LEFT JOIN this_is_deduplicated AS data USING(Campaign_ID)
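Before deduplicating, it may help to check which table actually repeats the join key. The following is only a sketch: `sample_campaigns` and its rows are hypothetical, so swap in the real table (for instance match_table_campaigns or combinedDeliveryData) to run the same check on real data.

```sql
-- Hypothetical inline data; replace this CTE with the table to inspect.
WITH sample_campaigns AS (
  SELECT * FROM UNNEST([
    STRUCT('1' AS Campaign_ID, 'Spring Sale' AS Campaign),
    STRUCT('1', 'Spring Sale'),
    STRUCT('2', 'Summer Sale')
  ])
)
-- List every Campaign_ID that occurs on more than one row.
SELECT
  Campaign_ID,
  COUNT(*) AS row_count
FROM sample_campaigns
GROUP BY Campaign_ID
HAVING COUNT(*) > 1
ORDER BY row_count DESC;
```

Any Campaign_ID returned here will multiply the rows it is joined to, so that is the side to deduplicate or aggregate first.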

You may need to join on both campaign and campaign_id. See the sample query below:

select
t1.date,
t1.campaign_id,
t1.campaign,
sum(case when t1.event = 'view' then 1 else 0 end) as impressions,
sum(case when t2.event = 'click' then 1 else 0 end) as clicks
from table1 as t1
left join table2 as t2
on t1.campaign = t2.campaign
and t1.campaign_id = t2.campaign_id
where advertiser_id = '12345'
group by
t1.campaign_id,
t1.campaign,
t1.date
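For the follow-up about adding conversions, one possible approach (a different technique from the ARRAY_AGG deduplication above) is to aggregate each source down to one row per date and campaign before joining, so that no join can multiply rows. This is a sketch under assumptions: the table `activity`, the column name `Event_Date` (standing in for the Date column), and all sample rows are hypothetical.

```sql
-- All tables below are hypothetical inline samples.
WITH campaigns AS (
  SELECT * FROM UNNEST([
    STRUCT('1' AS Campaign_ID, 'Spring Sale' AS Campaign),
    STRUCT('1', 'Spring Sale'),
    STRUCT('2', 'Summer Sale')
  ])
),
events AS (
  SELECT * FROM UNNEST([
    STRUCT('2023-01-01' AS Event_Date, '1' AS Campaign_ID, 'VIEW' AS Event_Type),
    STRUCT('2023-01-01', '1', 'VIEW'),
    STRUCT('2023-01-01', '1', 'CLICK'),
    STRUCT('2023-01-01', '2', 'VIEW')
  ])
),
activity AS (
  SELECT * FROM UNNEST([
    STRUCT('2023-01-01' AS Event_Date, '1' AS Campaign_ID, 'a1' AS Activity_ID),
    STRUCT('2023-01-01', '1', 'a2')
  ])
),
-- One row per campaign, whatever duplicates the lookup table holds.
campaign_names AS (
  SELECT Campaign_ID, ANY_VALUE(Campaign) AS Campaign
  FROM campaigns
  GROUP BY Campaign_ID
),
-- One row per date and campaign for delivery metrics.
delivery AS (
  SELECT
    Event_Date,
    Campaign_ID,
    COUNTIF(Event_Type = 'VIEW')  AS Impressions,
    COUNTIF(Event_Type = 'CLICK') AS Clicks
  FROM events
  GROUP BY Event_Date, Campaign_ID
),
-- One row per date and campaign for conversions.
conversions AS (
  SELECT Event_Date, Campaign_ID, COUNT(Activity_ID) AS Conversions
  FROM activity
  GROUP BY Event_Date, Campaign_ID
)
SELECT
  d.Event_Date,
  d.Campaign_ID,
  n.Campaign,
  d.Impressions,
  d.Clicks,
  IFNULL(c.Conversions, 0) AS Conversions
FROM delivery AS d
LEFT JOIN campaign_names AS n
  ON n.Campaign_ID = d.Campaign_ID
LEFT JOIN conversions AS c
  ON c.Campaign_ID = d.Campaign_ID
 AND c.Event_Date = d.Event_Date
ORDER BY d.Event_Date, d.Campaign_ID;
```

Because every joined input has at most one row per key, the counts stay the same as in the unjoined query. Advertiser could be carried through the same way if it is present in the delivery data.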
