CREATE CONTINUOUS AGGREGATION
Function
Creates a new continuous aggregation in the current database.
Syntax Format
CREATE CONTINUOUS AGGREGATION [ IF NOT EXISTS ] conagg_name [ TABLESPACE tablespace_name ]
EVERY resample_interval
[ START_OFFSET start_offset ]
[ END_OFFSET end_offset ]
AS <select_query>;
Where select_query is:
SELECT time_bucket(<const_value>, <time_col>) AS bucket, [grouping_exprs], <aggregation_functions>
FROM source_timeseries_table
[WHERE ...]
GROUP BY bucket, [grouping_exprs], [optional_grouping_exprs]
[HAVING ...]
Parameter Description
-
conagg_name
Continuous aggregation name; it may include a schema_name. If no schema_name is given, the default schema is used. The continuous aggregation name is also used as the name of the continuous aggregation's target table; if that table does not exist, it is created automatically.
Note: The target table is also a time-series table and uses the default retention policy; the policy can be changed with the time-series table retention policy update syntax after the continuous aggregation has been created.
Note: If a table with the same name as the continuous aggregation already exists, creation fails.
-
tablespace_name
Optional parameter. Specifies the tablespace to which the continuous aggregation belongs; the continuous aggregation target table also uses this tablespace.
-
resample_interval
The time interval at which the continuous aggregation automatically resamples.
Note: The minimum value is 1s and the maximum is the maximum value of the INTERVAL type. If set to 0, the next start time of the continuous aggregation is infinity (the maximum value of the timestamp type), meaning it will never start.
-
start_offset
The offset of the aggregation window's start time from the resampling time: resample_time (the time at which the continuous aggregation resamples) minus start_offset, aligned forward to the time_bucket width (the alignment origin is 2001-01-01 00:00:00), is used as the start time of the sampling window. The default value is resample_interval.
Note: start_offset only takes effect on the first run after the continuous aggregation is created; after the first run, the end of the previous aggregation window is used as the start of the next aggregation window.
-
end_offset
The offset of the aggregation window's end time from the resampling time: resample_time minus end_offset, aligned forward to the time_bucket width, is used as the end time of the sampling window. The default value is 0s.
Note: resample_interval, start_offset, and end_offset accept only the 'number + time unit' format. If the numeric part is not an integer, it is automatically rounded up; for example, 1.5day is rounded up to 2day and -1.1day is rounded up to -1day (resample_interval does not support negative values). Supported time units: second, minute, hour, day, week, month, year.
Note: The constraint start_offset - end_offset >= resample_interval ensures that no data is missed between two consecutive aggregations. There is also the constraint start_offset - end_offset >= bucket_width; since bucket_width cannot be 0, this implies start_offset - end_offset > 0, which guarantees that the start time of the sampling window is earlier than its end time, so start_offset and end_offset cannot be equal. Both offsets may be negative, indicating aggregation of future data. A worked sketch of the window calculation follows this parameter list.
-
select_query
The aggregation query statement executed by the continuous aggregation. Its parameters are described below.
-
time_bucket
The select_query must contain exactly one time_bucket function. For the usage of time_bucket, see the time_bucket section.
-
grouping_exprs
Columns in the GROUP BY clause that also appear in the target list, other than the column generated by the time_bucket function. May be empty; when empty, the continuous aggregation target table contains an extra_tag column whose default value is null.
-
optional_grouping_exprs
Columns that appear only in the GROUP BY clause and not in the target list. May be empty.
Note: Apart from time_bucket, GROUP BY supports only TAG columns, or target columns generated by expressions over TAG columns that do not appear in the GROUP BY clause.
-
source_timeseries_table
The source table of the continuous aggregation. It must be a single time-series table; a SELECT statement is not allowed.
-
aggregation_functions
Aggregate functions such as avg, min, max, etc.
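The following worked sketch is not part of the reference example; the table metrics, its columns, and the resampling time are hypothetical and only illustrate how start_offset and end_offset determine the first sampling window. With EVERY '1 hour', START_OFFSET '2 day', END_OFFSET '1 day', and a first resample_time of 2024-05-10 12:00:00, the formulas above give a window start of 2024-05-10 12:00:00 - 2 day = 2024-05-08 12:00:00 and a window end of 2024-05-10 12:00:00 - 1 day = 2024-05-09 12:00:00 (both already aligned to the 1 hour bucket), and start_offset - end_offset = 1 day >= resample_interval = 1 hour, so the constraint is satisfied.
Uqbar=# CREATE TIMESERIES TABLE metrics(time timestamp tstime, host text tstag, load float);
CREATE TIMESERIES TABLE
Uqbar=# CREATE CONTINUOUS AGGREGATION IF NOT EXISTS metrics_ca
EVERY '1 hour'
START_OFFSET '2 day'
END_OFFSET '1 day'
AS
SELECT time_bucket('1 hour', time) as bucket, host, avg(load)
FROM metrics
GROUP BY bucket, host;
CREATE CONTINUOUS AGGREGATION
-- Hypothetical first run at resample_time 2024-05-10 12:00:00:
--   window start = resample_time - START_OFFSET = 2024-05-08 12:00:00
--   window end   = resample_time - END_OFFSET   = 2024-05-09 12:00:00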
Examples
Uqbar=# CREATE TIMESERIES TABLE weather(time timestamp tstime, province text tstag,
city text tstag, temperature float);
CREATE TIMESERIES TABLE
Uqbar=# CREATE CONTINUOUS AGGREGATION IF NOT EXISTS weather_ca
EVERY '1 hour'
START_OFFSET '2 day'
END_OFFSET '1 day'
AS
SELECT time_bucket('1 hour', time) as bucket, province, city,
avg(temperature)
FROM weather
GROUP BY bucket, province, city ORDER BY bucket;
CREATE CONTINUOUS AGGREGATION
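Once the background task has run, the aggregated results can be read from the target table, which has the same name as the continuous aggregation. This is a hedged illustration; the target table's column set follows the select_query output described above.
Uqbar=# SELECT * FROM weather_ca ORDER BY bucket;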
Suggestions
-
Continuous aggregation is not suitable for scenarios where data continues to change for some time after it has been written.
-
The data that the current version of continuous aggregation writes to the target table cannot be changed afterwards. If the data in an already-aggregated range changes and the continuous aggregation really needs to be refreshed, create a new continuous aggregation and set start_offset so that, on the first run of its background task, the aggregated range covers the time range of the changed data.
-
The target table of a continuous aggregation is itself a time-series table, so another continuous aggregation can be created on it.
-
Because the bucket_width of time_bucket is automatically rounded up, avoid writing values such as '1.5 day'; write '36 hour' instead.
-
Estimate how long it takes, for data in a given time range, from when writing starts until writing and modification end, and use that value as a reference when setting end_offset, to avoid a mismatch between the aggregation result and data written after the continuous aggregation has run.
-
Since the target table is also a time-series table, partition compression can prevent the continuous aggregation background task from writing data that falls into a compressed partition range of the target table. This can be avoided by updating the uqbar.timeseries_compression_delay parameter, using the resampling interval as a reference, before creating the continuous aggregation, so that the auto-compression background task on the target table starts later than the continuous aggregation background task (see the sketch below).
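As a minimal sketch of the last suggestion, assuming uqbar.timeseries_compression_delay can be set at the session level and accepts an interval-style value (both assumptions; the actual scope and value format are described in the GUC Parameters chapter), the compression delay could be raised above the resampling interval before the continuous aggregation is created:
-- Hypothetical sketch: the resampling interval in the Examples section is '1 hour',
-- so the compression delay is raised above it before creating the continuous
-- aggregation. How uqbar.timeseries_compression_delay is actually set (SET,
-- system-level configuration, or a tool) follows the GUC Parameters chapter.
Uqbar=# SET uqbar.timeseries_compression_delay = '2 hour';
SET
-- ... then create the continuous aggregation as shown in the Examples section.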