大查询按条件选择数组值中的数组列'And'记录



我有这样的Bigquery表name_fruits。。。

| name | fruits |
|------|--------|
| aaaa | [apple, orange, grape] |
| bbbb | [apple, orange] |
| cccc | [orange, grape] |
| dddd | [orange] |
| eeee | [orange, peach] |

我试着选择那些同时拥有appleorange果实的名字。这意味着我要选择aaaabbbb

查询需要是动态的。无法在查询中对appleorange进行硬编码。

有什么解决方案可以选择它们吗?

SELECT name
FROM name_fruits, UNNEST(fruits) fruit
WHERE fruit IN UNNEST(@find_fruits) # IN query can only "OR" search condition
AND ARRAY_LENGTH(fruits) >= ARRAY_LENGTH(@find_fruits)
# @find_fruits = ["apple", "orange"]
#
# => aaaa, bbbb, cccc, eeee
# I want => aaaa, bbbb

如果我理解正确,您希望使用变量来完成此操作。因此:

SELECT nf.*
FROM name_fruits nf
WHERE (SELECT COUNT(1) 
FROM UNNEST(nf.fruits) fruit JOIN
UNNEST(@find_fruits) ff
ON ff = fruit
) >= (SELECT COUNT(*) FROM UNNEST(@find_fruits);

你可以试试这个:

WITH
`project.dataset.table` AS (
SELECT 'aaaa' AS name, ['apple', 'orange', 'grape'] AS fruits
UNION ALL
SELECT 'bbbb' AS name, ['apple', 'orange'] AS fruits
UNION ALL
SELECT 'cccc' AS name, ['orange', 'grape'] AS fruits
UNION ALL
SELECT 'dddd' AS name, ['orange'] AS fruits
UNION ALL
SELECT 'eeee' AS name, ['orange', 'peach'] AS fruits )
SELECT name
FROM `project.dataset.table` 
WHERE (
SELECT COUNT(1) 
FROM UNNEST(fruits) AS fruit 
WHERE fruit IN ('apple', 'orange')
) >=2

以下是BigQuery标准SQL和使用BigQuery 脚本功能

#standardSQL 
DECLARE search ARRAY<STRING>;
SET search = ['apple', 'orange'];
CREATE TEMP FUNCTION checkFruits(fruits ARRAY<STRING>, search ARRAY<STRING>) AS (
ARRAY_LENGTH(search) = (
SELECT COUNT(DISTINCT fruit) 
FROM UNNEST(fruits) AS fruit 
WHERE fruit IN UNNEST(search)
)
);
SELECT name
FROM `project.dataset.table`
WHERE checkFruits(fruits, search)    

如果要应用于问题中的样本数据,则输出为

Row name     
1   aaaa     
2   bbbb    

如果您希望将名称列表作为输出,您可以使用以下内容而不是SELECT name

SELECT STRING_AGG(name, ', ') names    

在这种情况下,输出是

Row names    
1   aaaa, bbbb   

最新更新