HomeMogDBMogDB StackUqbar
v3.0

Documentation:v3.0

Supported Versions:

Other Versions:

BLOOM Index

Availability

This feature is available since MogDB 3.0.0.

Introduction

Bloom provides a Bloom filter-based approach to index access. A Bloom filter is a spatially efficient data structure that is used to test whether an element is a member of a collection. In the case of the index access method, it can quickly exclude non-matching meta-ancestors by sizing the signature decided at index creation.

Benefits

Bloom indexes are suitable for scenarios where there are many columns in a table and the query can use any combination of columns. For traditional indexes, such as Btree in this scenario may need to build many indexes to cover possible query conditions, which may lead to a large index space and affect the performance of insertion and update, while Bloom indexes only need to build one to cope with it.

Description

Users can query through Bloom index in the applicable scenario of Bloom index, and also reduce the size of index space occupied compared with Btree index.

Scenarios

Blooms indexes are most useful when the table has many attributes and the query may test any combination of them. Traditional btree indexes will be faster than Bloom indexes, but many btree indexes are needed to support all possible queries, while for Bloom only one is needed.

Create Bloom Index

Constraints

  • Bloom index only supports being created on row-stored table.

  • The column type for index creation can only be 4-byte-length int type or variable-length string type. 4-byte-length int type in MogDB can be int, int4 and integer, and variable-length string type can be varchar, text and clob.

  • Bloom does not support creating unique indexes.

  • null values cannot be indexed and null values are skipped.

  • Only equal-value queries are supported.

  • Partitioned table Global indexes are not supported.

Example

CREATE INDEX ON <table_name> USING bloom(col1,col2...) with (length=80,col1=2,col2=4);

Parameters

length

Specifies how many bits are used to represent the signature generated in an index. The default value is 80, and the user-specified value is internally rounded up to an integer multiple of 16 (transparent to the user), with a minimum value of 1 and a maximum value of 4096. If the set value is not within the limit, the command execution reports an error and indicates the correct range of the value.

col1-col32

Specifies how many bits are used to represent each index column. The default value is 2, the minimum value is 1, and the maximum value is 4095. If the set value is not within the limit, the command execution reports an error and prompts for the correct range of values.

If Bloom index is created for non-row-stored tables, an error is reported, and it is prompted that it cannot be created for non-row-stored tables.

If a unique index is created for Bloom, an error is reported indicating that the unique index cannot be created.

If a Global index for a partitioned table with index type Bloom is created, an error is reported, indicating that the partitioned table Global index can only be of type Btree.

Delete Bloom Index

Example:

MogDB=# DROP INDEX bloom_idx;
DROP INDEX

Reindex Bloom Index

This feature supports rebuilding a Bloom index that already exists. The user issues the command to rebuild Bloom index through the client or database driver, the index is finally rebuilt successfully, the index metadata is correct, and the index can be used normally (e.g., if the original execution plan goes to Bloom index, the execution plan can still use the rebuilt Bloom index after the index is rebuilt).

Example:

MogDB=# REINDEX INDEX bloom_idx;
REINDEX

Alter Bloom Index

This feature supports modifying the attributes of an existing Bloom index. The constraints related to the index attributes are consistent with those at the time of index creation, e.g. the modified length of a Bloom index needs to be in the range of [1,4096].

Key points:

  • Modified index attributes need to conform to indexing rules and constraints, and if they do not, an error needs to be reported, returning the corresponding prompt message.
  • Renaming the index and verifying whether the index metadata is correct in pg_class.
  • Modify the length or col attribute of Bloom index. If you want to verify whether the attribute is effective, you can simply judge by the change of the value of the relpages field in pg_class, or by SELECT pg_size_pretty(pg_relation_size('blidx ')) and other commands to see the change in index size.
  • To modify tablespace, you can confirm whether it takes effect by viewing the data file on disk.

Example:

MogDB=# ALTER INDEX IF EXISTS bloom_idx RENAME TO newbloom_idx;
ALTER INDEX
MogDB=# SELECT oid,relname,relfilenode,relpages FROM pg_class WHERE relname = 'newbloom_idx';
  oid  |   relname    | relfilenode | relpages 
-------+--------------+-------------+----------
 41159 | newbloom_idx |       41160 |       30
(1 row)
MogDB=# ALTER INDEX IF EXISTS newbloom_idx SET (length=160);
ALTER INDEX
MogDB=# REINDEX INDEX newbloom_idx;
REINDEX
MogDB=# SELECT oid,relname,relfilenode,relpages FROM pg_class WHERE relname = 'newbloom_idx';
  oid  |   relname    | relfilenode | relpages 
-------+--------------+-------------+----------
 41159 | newbloom_idx |       41162 |       49
(1 row)

Querying with Bloom Index

When a Bloom index exists on the table and the query conditions match the applicable conditions of the Bloom index, the execution plan will show that the Bloom index has been used.

Example:

CREATE TABLE tbloom AS
   SELECT
     (random() * 1000000)::int as i1,
     (random() * 1000000)::int as i2,
     (random() * 1000000)::int as i3,
     (random() * 1000000)::int as i4,
     (random() * 1000000)::int as i5,
     (random() * 1000000)::int as i6
   FROM
  generate_series(1,10000000);

CREATE INDEX btreeidx ON tbloom (i1, i2, i3, i4, i5, i6);
CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3, i4, i5, i6);

MogDB=# EXPLAIN ANALYZE SELECT * FROM tbloom where i3 = 100 AND i5 = 1000;
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tbloom  (cost=178436.31..179393.23 rows=250 width=24) (actual time=138.209..138.209 rows=0 loops=1)
   Recheck Cond: ((i3 = 100) AND (i5 = 1000))
   Rows Removed by Index Recheck: 21936
   Heap Blocks: exact=18673
   ->  Bitmap Index Scan on bloomidx  (cost=0.00..178436.25 rows=250 width=0) (actual time=85.681..85.681 rows=21936 loops=1)
         Index Cond: ((i3 = 100) AND (i5 = 1000))
 Total runtime: 138.412 ms
(7 rows)

MogDB=# EXPLAIN ANALYZE SELECT * FROM tbloom where i1 = 100 AND i2 = 1000;
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 [Bypass]
 Index Only Scan using btreeidx on tbloom  (cost=0.00..8.27 rows=1 width=24) (actual time=0.084..0.084 rows=0 loops=1)
   Index Cond: ((i1 = 100) AND (i2 = 1000))
   Heap Fetches: 0
 Total runtime: 0.134 ms
(5 rows)

Update Bloom Index

Example:

MogDB=# select i2,i3,i4 from tbloom where rownum <=5;
   i2   |   i3   |   i4   
--------+--------+--------
 778090 | 624067 | 948170
 927435 | 800792 | 904419
 325217 | 726778 | 834407
 925272 | 221411 | 826500
  93906 | 992575 | 997677
  
UPDATE tbloom SET i1 = 10 
            WHERE i2 = 325217 AND 
                  i3 = 726778 AND 
                  i4 = 834407; 
                  
MogDB=# select * from tbloom where i2 = 325217 and i3 = 726778;
 i1 |   i2   |   i3   |   i4   |   i5   |   i6   
----+--------+--------+--------+--------+--------
 10 | 325217 | 726778 | 834407 | 702579 | 525581
(1 row)

MogDB=# explain select * from tbloom where i2 = 325217 and i3 = 726778;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Bitmap Heap Scan on tbloom  (cost=178439.48..178443.50 rows=1 width=24)
   Recheck Cond: ((i2 = 325217) AND (i3 = 726778))
   ->  Bitmap Index Scan on bloomidx  (cost=0.00..178439.48 rows=1 width=0)
         Index Cond: ((i2 = 325217) AND (i3 = 726778))
(4 rows)

DELETE FROM tbloom WHERE i2 = 1000 AND i3 = 789678 AND i4 = 311551; 
select * from tbloom where i2 = 1000 and i3 = 789678;
 i1 | i2 | i3 | i4 | i5 | i6 
----+----+----+----+----+----
(0 rows)
explain select * from tbloom where i2 = 1000 and i3 = 789678;
 Bitmap Heap Scan on tbloom  (cost=178440.26..178444.28 rows=1 width=24)
   Recheck Cond: ((i2 = 1000) AND (i3 = 789678))
   ->  Bitmap Index Scan on tbloomidx  (cost=0.00..178440.26 rows=1 width=0)
         Index Cond: ((i2 = 1000) AND (i3 = 789678))
(4 rows)

CREATE INDEX, DROP INDEX, ALTER INDEX

Copyright © 2011-2024 www.enmotech.com All rights reserved.