Parallel Index Scanning

Availability

This feature is available since MogDB 3.1.0.

Introduction

Parallel index scanning is a database query feature that can dramatically improve data access speed. By breaking the query task into multiple parallel subtasks and scanning multiple indexes at the same time, you can avoid scanning all data files, improve query efficiency, and speed up data access.

Benefits

Using the parallel index scanning feature can be expected to yield performance gains of around 25% or so, while for overall execution, performance gains of around 10% or so can be expected.

Description

This feature supports parallel index scanning for ordinary and partitioned tables, which mainly contains three ways: indexscan, indexonlyscan and bitmapscan.

indexscan: When performing range query on a data table, you can locate the start and end positions through the index, scan the indexed data first, and then perform parallel scanning on the data file according to the scanning result to improve the query efficiency.
indexonlyscan: in table scanning, when the target columns are contained in the index, only the index data can be scanned, reducing the number of files to be scanned, and at the same time, according to the constraints to determine the scope of the index data can be scanned in parallel.
bitmapscan: When performing a table scan, the index scan is divided into two phases. First, scan the index files in parallel to get all the datafile pages to be scanned; then scan the datafile pages in parallel, which can reduce the number of random accesses and reads of the datafiles.

Enhancements

None

Constraints

This feature requires enabling of the parallel switch (setting query_dop to a value greater than 1.
This feature supports BTree index and does not support Ustore and Cstore storage.

Example

Partitioned table indexes are divided into LOCAL indexes and GLOBAL indexes, a LOCAL index corresponds to a specific partition, while a GLOBAL index corresponds to the entire partitioned table. The following is an example of parallel index scanning for partitioned tables.

GLOBAL index

-- 1. Create a partitioned table and insert data.
CREATE TABLE parallel_partition_index_01
(
    c1 int,
    c2 int,
    c3 int
)
PARTITION BY RANGE(c1)
(
    PARTITION P1 VALUES LESS THAN(2000),
    PARTITION P2 VALUES LESS THAN(4000),
    PARTITION P3 VALUES LESS THAN(6000),
    PARTITION P4 VALUES LESS THAN(MAXVALUE)
)enable row movement;
INSERT INTO parallel_partition_index_01 VALUES (generate_series(1, 10000), generate_series(1,10000), generate_series(1, 10000));

-- 2. Create an index.
CREATE INDEX index_parallel_partition_index_01 on parallel_partition_index_01(c1) GLOBAL;

-- 3. Unable other scan types such as seqscan, bitmapscan, indexonlyscan, etc.
SET enable_seqscan = OFF;
SET enable_bitmapscan = OFF;
SET enable_indexonlyscan = OFF;

-- 4. Enable parallelism to set the synchronization cost (smp_thread_cost) for communication between threads.
-- Note: A low value for the smp_thread_cost parameter can prompt the optimizer to prefer parallelism.
SET query_dop = 2;
SET smp_thread_cost = 0;

-- 5. Performs a query operation.
SELECT * FROM parallel_partition_index_01 WHERE c1=100;
 c1  | c2  | c3  
-----+-----+-----
 100 | 100 | 100
(1 row)

-- 6. Execute the EXPLAIN statement to view execution plan information.
EXPLAIN (COSTS OFF) SELECT * FROM parallel_partition_index_01 WHERE c1<1000;
                                       QUERY PLAN                                        
-----------------------------------------------------------------------------------------
 Streaming(type: LOCAL GATHER dop: 1/2)
   ->  Index Scan using index_parallel_partition_index_01 on parallel_partition_index_01
         Index Cond: (c1 < 1000)
(3 rows)

LOCAL index

-- 1. Create a partitioned table and insert data.
CREATE TABLE parallel_partition_index_01
(
    c1 int,
    c2 int,
    c3 int
)
PARTITION BY RANGE(c1)
(
    PARTITION P1 VALUES LESS THAN(2000),
    PARTITION P2 VALUES LESS THAN(4000),
    PARTITION P3 VALUES LESS THAN(6000),
    PARTITION P4 VALUES LESS THAN(MAXVALUE)
)enable row movement;
INSERT INTO parallel_partition_index_01 VALUES (generate_series(1, 10000), generate_series(1,10000), generate_series(1, 10000));

-- 2. Create an index.
CREATE INDEX index_parallel_partition_index_01 on parallel_partition_index_01(c1) LOCAL;

-- 3. Unable other scan types such as seqscan, bitmapscan, indexonlyscan, etc.
SET enable_seqscan = OFF;
SET enable_bitmapscan = OFF;
SET enable_indexonlyscan = OFF;

-- 4. Enable parallelism to set the synchronization cost (smp_thread_cost) for communication between threads.
-- Note: A low value for the smp_thread_cost parameter can prompt the optimizer to prefer parallelism.
SET query_dop = 2;
SET smp_thread_cost = 0;

-- 5. Performs a query operation.
SELECT * FROM parallel_partition_index_01 WHERE c1=100;
 c1  | c2  | c3  
-----+-----+-----
 100 | 100 | 100
(1 row)

-- 6. Execute the EXPLAIN statement to view execution plan information.
EXPLAIN (COSTS OFF) SELECT * FROM parallel_partition_index_01 WHERE c1<1000;
                                        QUERY PLAN                                                 
-----------------------------------------------------------------------------------------
 Streaming(type: LOCAL GATHER dop: 1/2)
   -> Partition Iterator
        Iterations: 1
        Selected Partitions:  1
        ->  Partitioned Index Scan using index_parallel_partition_index_01 on parallel_partition_index_01
               Index Cond: (c1 < 1000)
(6 rows)

smp_thread_cost