WITH
longest_used_bike AS (
SELECT
bikeid,
SUM(duration_minutes) AS trip_duration
FROM
`bigquery-public-data.austin_bikeshare.bikeshare_trips`
GROUP BY
bikeid
ORDER BY
trip_duration DESC
LIMIT 1
)
-- find station at which longest_used bike leaves most often
SELECT
trips.start_station_id,
COUNT(*) AS trip_ct
FROM
longest_used_bike AS longest
INNER JOIN
`bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips
ON longest.bikeid = trips.bikeid
GROUP BY
trips.start_station_id
ORDER BY
trip_ct DESC
LIMIT 1
这个查询将给你一个结果是2575,但为什么结果改变为3798当你使用全连接而不是内连接?我想弄清楚那是什么,但我不确定该怎么想
完全连接将包括trips表中的所有条目,无论它们是否可连接到longest_used_bike
ID (longest
中的列将具有NULL值)
有关连接类型的解释,请参阅此处。
提示:如果您遇到类似的情况,请尝试查看未聚合的查询(省略GROUP BY子句和COUNT函数)-然后您会注意到,在FULL JOIN查询中会突然出现更多(不需要的)行。
INNER JOIN将只返回满足JOIN条件的行。因此,只有在两个表中存在匹配的行。
FULL JOIN将返回左起所有行和右起所有行,在没有匹配的字段中为空值。