Date Range-Partition in Bigquery



我正在尝试对类型为整数或日期的表进行范围分区。在第一种情况下,当我对整数进行分区时,使用如下查询就可以了。

CREATE TABLE `project-id.dataset.parted_employee_int`
PARTITION BY RANGE_BUCKET(employee_id, GENERATE_ARRAY(1, 100000, 5000)) AS 
select employee_id, full_name, birth_date from `project-id.dataset.employee`;

我正在尝试扩展这个查询日期范围分区。一样。

CREATE TABLE `project-id.dataset.parted_employee_date`
PARTITION BY RANGE_BUCKET(birth_date, GENERATE_DATE_ARRAY(CAST('1912-07-09' AS DATE), CAST('1979-06-23' AS DATE), INTERVAL 2 MONTH)) AS
select employee_id, full_name, birth_date from `project-id.dataset.employee`;

但是我得到以下错误:

PARTITION BY expression must be DATE(<timestamp_column>), DATE(<datetime_column>), DATETIME_TRUNC(<datetime_column>, DAY/HOUR/MONTH/YEAR), a DATE column, TIMESTAMP_TRUNC(<timestamp_column>, DAY/HOUR/MONTH/YEAR), DATE_TRUNC(<date_column>, MONTH/YEAR), or RANGE_BUCKET(<int64_column>, GENERATE_ARRAY(<int64_value>, <int64_value>[, <int64_value>]))

如何在日期上进行区间分区?有没有办法?

这里列出了所有支持的分区方法。

最接近于开箱即用的特性是按月分区。

更精确地说,下面是如何使用PARTITION BY DATE(<date_column>, MONTH/YEAR)

按日期进行分区
CREATE TABLE `project-id.dataset.parted_employee_date`
PARTITION BY DATE_TRUNC(birth_date, MONTH) AS
select employee_id, full_name, birth_date from `project-id.dataset.employee`;

如果我想在DAY或HOUR进行分区,我必须使用PARTITION BY DATETIME_TRUNC(<datetime_column>, DAY/HOUR/MONTH/YEAR)

因此,首先,我必须在select子句中将birth_date转换为DATETIME

CREATE TABLE `project-id.dataset.parted_employee_date_time`
PARTITION BY DATETIME_TRUNC(birth_date_dt, MONTH) AS
select employee_id, full_name, CAST (birth_date AS DATETIME) as birth_date_dt
from `project-id.dataset.employee`;

最新更新