HomeMogDBMogDB StackUqbar

Documentation:v2.0

Supported Versions:

Other Versions:

time_bucket_gapfill()

Function Description

time_bucket_gapfill extends the functionality of time_bucket by adding the ability to fill in missing data for specified time intervals.

Syntax

time_bucket_gapfill(bucket_width, time, start_time , finish_time)

Parameter Description

Parameter Name Type Mandatory or Not Description
bucket_width TEXT Yes Time interval in the Positive integer + Time mark format. The time mark includes all units supported by the interval type, including microsecond, millisecond, second, minute, hour, day, week, month, year, decade, century, millennium, the plurality of these units, such as years, months, and days, and the abbreviated format of these units, such as y, m, and d.
time TIMESTAMP Yes Time to be aligned.
start_time TIMESTAMP No The starting time for gap filling, i.e., if a certain time is missing in the result set but should be present in the query results. If this parameter is not specified, it needs to be specified in the WHERE condition.
finish_time TIMESTAMP No The ending time for gap filling, i.e., if a certain time is missing in the result set but should be present in the query results. If this parameter is not specified, it needs to be specified in the WHERE condition.

For example, let's say there are 6 time-series data points in the system, and the timestamp format does not include year, month, or day information.

time 00:00:00 00:01:00 00:15:00 00:36:00 00:45:00 01:06:00
value 30 31 32 33 NULL 34

Calculate the average value within a 10-minute window for the given data points. The result is as follows:

time 00:00:00 00:10:00 00:30:00 00:40:00 01:00:00
avg 30.5 32 33 NULL 34

Due to the absence of time-series data reporting within some of the time windows, there are no aggregation results within those time windows. The purpose of "time_bucket_gapfill" is to fill in these time windows according to certain rules, thereby generating aggregation results for the time windows.

The system provides a set of filling functions that can be used in conjunction with "time_bucket_gapfill" to fill in missing values according to specified rules.

  • locf(value[, prev, treat_null_as_missing])

    Used in conjunction with "time_bucket_gapfill," it fills in missing values by using the value from the previous window.

    Parameter Name Type Mandatory or Not Description
    value ANY Yes The expression for the value that needs to be filled.
    prev EXPRESSION No If the previous value is not found, then use this expression to search for a value.
    treat_null_as_missing BOOLEAN No Do not carry forward null values. Default value is false.

    For example, if the event window at 00:20:00 is being filled and the value to fill is based on the aggregation result from the previous window at 00:10:00, which is 32.

    time 00:00:00 00:10:00 00:20:00 00:30:00 00:40:00 00:50:00 01:00:00
    avg 30.5 32 32 33 NULL NULL 34

    When we set the parameter treat_null_as_missing to true in the locf function, it treats null values as missing values for filling.

    time 00:00:00 00:10:00 00:20:00 00:30:00 00:40:00 00:50:00 01:00:00
    avg 30.5 32 32 33 33 33 34
  • interpolate(value[, prev[, next]])

    Linear interpolation is used to fill missing data, which means taking the average of the previous and next windows. If the previous and next windows do not exist, the value is left empty.

    Parameter Name Type Mandatory or Not Description
    value ANY Yes The expression for the value that needs to be filled.
    prev EXPRESSION No If the preceding value for interpolation is not calculated, then this expression is used to search for the preceding value instead.
    next EXPRESSION No If the succeeding value for interpolation is not calculated, then this expression is used to search for the succeeding value instead.

    For example, the filling value for the 00:20:00 window is the average of the aggregated values in the 00:10:00 window and the 00:30:00 window.

    time 00:00:00 00:10:00 00:20:00 00:30:00 00:40:00 00:50:00 01:00:00
    avg 30.5 32 32.5 33 NULL NULL 34

Constraints

  • Only one time_bucket_gapfill function can be used in the same-level query.

  • When using interpolation functions, you must specify start_time and finish_time either in the function parameters or in the WHERE clause.

  • start_time should be less than or equal to finish_time.

  • The result is automatically ordered in ascending order by the time column.

Examples

Creating a continuous aggregation with a target time-series table containing a time_bucket_gapfill() function column fails.

  1. Create the source time-series table.

    CREATE TIMESERIES TABLE t_uqbar_ca_create_case0008_1(time timestamp TSTIME, city text TSTAG, temperature float);
    insert into t_uqbar_ca_create_case0008_1 values ('2023-02-14 06:00:00','beijing',14.4);
    select * from timeseries_views.tstable;
  2. Create a continuous aggregation, but it failed because the target time-series table contains a column with the time_bucket_gapfill() function.

    CREATE CONTINUOUS AGGREGATION IF NOT EXISTS agg_ca_create_case0008_1 EVERY '1 year' START_OFFSET '10 year' END_OFFSET '1h' AS 
    SELECT time_bucket_gapfill('1 h',time) as bucket,city,avg(temperature)
    FROM t_uqbar_ca_create_case0027_1 GROUP BY bucket,city ORDER BY bucket;
  3. Unable to query continuous aggregation metadata.

    select ca_schema,ca_name,resample_time,resample_interval,start_offset,end_offset,query from timeseries_catalog.continuous_aggregation;
    select * from timeseries_views.continuous_aggregation;
  4. Unable to query destination time-series table information.

    • Unable to query time-series table metadata, the query fails.

      select timecol,tagcol from timeseries_catalog.tstable;
      select  tablename,timecolname,tagcol,tspolicy from timeseries_views.tstable;
      select * from agg_ca_create_case0027_1;
    • Unable to query column field information for the target table, the query fails.

      SELECT col_description(a.attrelid,a.attnum) as comment,format_type(a.atttypid,a.atttypmod) as type,a.attname as name, a.attnotnull as notnull
      FROM pg_class as c,pg_attribute as a
      where c.relname = 'agg_ca_create_case0027_1' and a.attrelid = c.oid and a.attnum>0;
    • Querying the background job view does not return any tasks for the target table.

      select * from timeseries_catalog.bgw_job;
      select * from timeseries_views.bgw_job;
      SELECT ca.ca_name, ca.resample_interval, job.interval ,as job_interval 
      FROM timeseries_catalog.continuous_aggregation AS ca LEFT JOIN 
      pg_job AS job ON ca.job_id = job.job_id WHERE ca.ca_name = 'agg_ca_create_case0027_1';
Copyright © 2011-2024 www.enmotech.com All rights reserved.