我有这样的Bigquery表name_fruits
。。。
| name | fruits |
|------|--------|
| aaaa | [apple, orange, grape] |
| bbbb | [apple, orange] |
| cccc | [orange, grape] |
| dddd | [orange] |
| eeee | [orange, peach] |
我试着选择那些同时拥有apple
和orange
果实的名字。这意味着我要选择aaaa
和bbbb
。
查询需要是动态的。无法在查询中对apple
和orange
进行硬编码。
有什么解决方案可以选择它们吗?
SELECT name
FROM name_fruits, UNNEST(fruits) fruit
WHERE fruit IN UNNEST(@find_fruits) # IN query can only "OR" search condition
AND ARRAY_LENGTH(fruits) >= ARRAY_LENGTH(@find_fruits)
# @find_fruits = ["apple", "orange"]
#
# => aaaa, bbbb, cccc, eeee
# I want => aaaa, bbbb
如果我理解正确,您希望使用变量来完成此操作。因此:
SELECT nf.*
FROM name_fruits nf
WHERE (SELECT COUNT(1)
FROM UNNEST(nf.fruits) fruit JOIN
UNNEST(@find_fruits) ff
ON ff = fruit
) >= (SELECT COUNT(*) FROM UNNEST(@find_fruits);
你可以试试这个:
WITH
`project.dataset.table` AS (
SELECT 'aaaa' AS name, ['apple', 'orange', 'grape'] AS fruits
UNION ALL
SELECT 'bbbb' AS name, ['apple', 'orange'] AS fruits
UNION ALL
SELECT 'cccc' AS name, ['orange', 'grape'] AS fruits
UNION ALL
SELECT 'dddd' AS name, ['orange'] AS fruits
UNION ALL
SELECT 'eeee' AS name, ['orange', 'peach'] AS fruits )
SELECT name
FROM `project.dataset.table`
WHERE (
SELECT COUNT(1)
FROM UNNEST(fruits) AS fruit
WHERE fruit IN ('apple', 'orange')
) >=2
以下是BigQuery标准SQL和使用BigQuery 脚本功能
#standardSQL
DECLARE search ARRAY<STRING>;
SET search = ['apple', 'orange'];
CREATE TEMP FUNCTION checkFruits(fruits ARRAY<STRING>, search ARRAY<STRING>) AS (
ARRAY_LENGTH(search) = (
SELECT COUNT(DISTINCT fruit)
FROM UNNEST(fruits) AS fruit
WHERE fruit IN UNNEST(search)
)
);
SELECT name
FROM `project.dataset.table`
WHERE checkFruits(fruits, search)
如果要应用于问题中的样本数据,则输出为
Row name
1 aaaa
2 bbbb
如果您希望将名称列表作为输出,您可以使用以下内容而不是SELECT name
SELECT STRING_AGG(name, ', ') names
在这种情况下,输出是
Row names
1 aaaa, bbbb