Extract a random sample of a table where unique combinations of two columns are not associated with a value

I have a client table in the following format

client_firstname  client_lastname  fruit    purchase_date
Lionel            Messi            apples   11/11/2020
Lionel            Messi            bananas  11/13/2020
Federico          Chiesa           oranges  11/20/2020
.
.
.

I want to create a random subset of 10,000 first-name/last-name pairs who never bought apples, and extract all entries associated with them.

So far I have:

select t.*
from client_table t
where exists (select 1
              from client_table t2
              where t2.firstname = t.firstname and
                    t2.lastname = t.lastname and
                    t2.fruit <> "apples")

I know I could create a second table with

select distinct client_table.firstname, client_table.lastname
from client_table
where client_table.fruit <> "apples"
order by rand() limit 10000

But is it possible to include that table in the query itself and avoid creating the second table?

"who never bought apples"

This:

exists (...
t2.fruit <> "apples")

doesn't meet that spec. The code means "people who bought something that isn't apples", not "people who never bought apples".
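To see the difference on the three sample rows from the question, here is a small sketch. It assumes a database that supports common table expressions (for example MySQL 8+); `sample_rows` is a made-up name standing in for `client_table`, and the column names follow the queries rather than the table header.

```sql
-- Hypothetical inline copy of the three rows shown in the question
with sample_rows as (
  select 'Lionel' as firstname, 'Messi' as lastname, 'apples' as fruit
  union all select 'Lionel', 'Messi', 'bananas'
  union all select 'Federico', 'Chiesa', 'oranges'
)
select t.*
from sample_rows t
where exists (select 1
              from sample_rows t2
              where t2.firstname = t.firstname and
                    t2.lastname = t.lastname and
                    t2.fruit <> 'apples');
-- Returns all three rows: Messi's bananas purchase satisfies the
-- exists test, so both of his rows pass even though he bought apples.
```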

This gives you the non-apple buyers:

select t.*
from client_table t
where not exists (select 1
                  from client_table t2
                  where t2.firstname = t.firstname and
                        t2.lastname = t.lastname and
                        t2.fruit = 'apples')

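But I don't think I'd do it this way, because the list will contain duplicates: one row per purchase rather than one row per person.

Instead, let's get a unique list of people who didn't buy apples: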

select firstname, lastname
from client_table 
group by firstname, lastname
having sum(case when fruit = 'apples' then 1 else 0 end) = 0

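You can then add your order by rand() limit 10000 to it, and if you need more data from the table, turn it into a subquery and join it back to the main table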

select t.*
from 
client_table t 
inner join
(
select firstname, lastname
from client_table 
group by firstname, lastname
having sum(case when fruit = 'apples' then 1 else 0 end) = 0
) x on x.firstname = t.firstname and x.lastname = t.lastname
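Putting the two pieces together, a single statement along these lines should work without a second table. This is a sketch assuming MySQL (the question already uses `rand()` and `limit`); other databases spell the random ordering differently, for example `random()` in PostgreSQL.

```sql
select t.*
from
client_table t
inner join
(
  -- one row per person who never bought apples
  select firstname, lastname
  from client_table
  group by firstname, lastname
  having sum(case when fruit = 'apples' then 1 else 0 end) = 0
  -- keep a random 10000 of those people
  order by rand()
  limit 10000
) x on x.firstname = t.firstname and x.lastname = t.lastname
```

The random sampling happens inside the subquery, so it picks 10000 people rather than 10000 rows, and the join then brings back every purchase for each of them.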
