SQL语法错误:SELECT DISTINCT、ORDER BY表达式必须出现在选择列表中



我最近刚开始使用SQL,我正在尝试获得所有"ua_keys";由一个月内10000名客户的随机样本生成。运行以下SQL查询时:

WITH
selected_users AS (
SELECT DISTINCT
user_id
FROM
oneapp_data.oa_events
WHERE
logtime BETWEEN '2020-10-01T21:59:57.940Z' AND '2020-11-01T22:59:58.027Z'
ORDER BY rand()
LIMIT 10000
)
SELECT
user_id,
ua_key
FROM
oa_events
WHERE
ua_key IN ( selected_users)

检索以下错误:

运行查询时出错:SYNTAX_Error:第2:4行:对于SELECT DISTINCT,ORDER BY表达式必须出现在选择列表中

关于如何解决此问题,有什么想法吗?非常感谢!!!

您需要在IN子句中添加一个子查询SELECT语句。试试下面的一个:

WITH
selected_users AS (
SELECT DISTINCT
user_id,
ua_key, 
rand() random_value
FROM
oneapp_data.oa_events
WHERE
logtime BETWEEN '2020-10-01T21:59:57.940Z' AND '2020-11-01T22:59:58.027Z'
),
sampling_example AS(
SELECT 
user_id,
ua_key, 
random_value
FROM 
selected_users
ORDER BY 
random_value
LIMIT 10000
)
SELECT
user_id,
ua_key
FROM
oa_events
WHERE
ua_key IN ( 

SELECT ua_key
FROM sampling_example
)

我认为这应该有效:

WITH selected_users AS (
SELECT TOP (10000) distinct user_id
FROM oneapp_data.oa_events
WHERE logtime BETWEEN '2020-10-01T21:59:57.940Z' AND '2020-11-01T22:59:58.027Z'
ORDER BY NEWID()
)
SELECT user_id, ua_key
FROM oa_events
WHERE ua_key IN (SELECT user_id FROM selected_users);

如果修复IN子句不起作用,可以执行以下操作:

WITH selected_users AS (
SELECT TOP (10000) user_id
FROM oneapp_data.oa_events
WHERE logtime BETWEEN '2020-10-01T21:59:57.940Z' AND '2020-11-01T22:59:58.027Z'
GROUP BY user_id
ORDER BY NEWID()
)

最新更新