我有下表:
row | cust_id | name
1 | 34 | word1 word2 someworkd
| | something word1
| | blabla
2 | 35 | word1 word2 word3
| | word4 word5
select*result JSON
[
{
"cust_id": "34",
"name": [
"word1 word2 someworkd",
"something word1",
"blabla"
]
},
{
"cust_id": "35",
"name": [
"word1 word2 word3",
"word4 word5",
"word1word5"
]
}
]
架构JSON:
[
{
"mode": "NULLABLE",
"name": "cust_id",
"type": "STRING"
},
{
"mode": "REPEATED",
"name": "name",
"type": "STRING"
}
]
我想根据cust_id和它们值上的重复字段进行筛选,所以使用以下查询:
SELECT * FROM `mytable` WHERE cust_id='34' and name like'%word1%'
查询的预期输出:
row | cust_id | name
1 | 34 | word1 word2 someworkd
| | something word1
JSON格式的查询预期输出:
{
"cust_id": "34",
"name": [
"word1 word2 someworkd",
"something word1"
]
}
]
受此问题启发,BigQuery:使用标准SQL过滤重复字段我尝试这个查询:
SELECT cust_id, name FROM `mytable`, UNNEST(name) AS name WITH OFFSET o
WHERE name like '%word1%' and cust_id='34'
哪个输出:
row | cust_id | name
1 | 34 | word1 word2 someworkd
| 34 | something word1
输出JSON:
[
{
"cust_id": "34",
"name": "word1 word2 someworkd"
},
{
"cust_id": "34",
"name": "something word1"
}
]
输出结果和我想要的不一样,请问我该如何实现?
我会发布我所做的解决方案,它可以根据我的需要工作,可能对其他有用
SELECT
cust_id,
ARRAY(
SELECT
name
FROM
`mytable`,
UNNEST(name) AS name
WHERE
name LIKE '%word1%'
AND cust_id='34') AS name
FROM
`mytable`
WHERE
cust_id="34"
结果:
row | cust_id | name
1 | 34 | word1 word2 someworkd
| | something word1
JSON:
[
{
"cust_id": "34",
"name": [
"word1 word2 someworkd",
"something word1"
]
}
]
下面是BigQuery标准SQL
#standardSQL
SELECT * REPLACE(
ARRAY(SELECT * FROM t.name WHERE name LIKE'%word1%')
AS name)
FROM `project.dataset.table` t
WHERE cust_id = 34