v5.0

Sequential Scan Prefetch

Availability

This feature is available since MogDB 5.0.8.

Introduction

MogDB's sequential scan prefetching is optimized for scenarios involving large volumes of data in sequential scans of pure data tables (full table scan scenarios), enhancing the scanning performance. This feature supports both Astore and Ustore storage engines and is also available for parallel scan prefetching.

Benefits

Parallelizes CPU processing and I/O operations during sequential scans, reducing I/O blockage to CPU, improving CPU utilization, and enhancing the performance of sequential scans.

Description

Data in the database is organized and managed in pages, with the CPU processing data in page units, creating an alternating serial execution model between CPU processing and I/O. In this model, due to the I/O latency of a page being significantly greater than the time for CPU processing, the CPU processing is frequently interrupted by I/O operations, leading to low CPU utilization, which is the main reason for poor performance in scenarios such as full table scans.

The sequential scan prefetching mechanism changes this model by parallelizing the CPU processing and I/O operations of sequential scans, avoiding CPU blockage due to waiting for I/O as much as possible. The ideal state is when the CPU is about to process the next data page, the I/O service routine has already prepared this data page in memory. This model is defined as the data page prefetching mechanism (data prefetch).

This feature achieves a 20%-60% performance improvement for the SeqScan operator in full table scan query scenarios (such as TPCH), with a 10%-20% end-to-end performance improvement.

Note:

Not all SQL statements experience the above performance improvements in any testing scenario. The performance improvement of prefetching is mainly related to the complexity of the query statement (CPU calculation and I/O time) and disk bandwidth, with other influencing factors including whether it is a full cache scenario and whether it is a mixed query load.

SQL features with significant operator performance improvements: CPU calculation time is heavy, and I/O bandwidth has not reached the maximum disk bandwidth.
SQL features with significant end-to-end performance improvements: CPU calculation and I/O time account for about 50% each, and I/O bandwidth has not reached the maximum disk bandwidth.

This feature is turned off by default. Set the GUC parameters enable_ios = on and enable_heap_async_prefetch = on to enable Astore sequential scan prefetching. Set the GUC parameters enable_ios = on and enable_uheap_async_prefetch = on to enable Ustore sequential scan prefetching.

Performance Comparison

The performance improvement results of this feature under different degrees of parallelism for the master node executing TPCH tests, as well as the performance improvement results under mixed load (tpcc+tpch) and the impact on TPMC, are as follows.

Note: In the figures below, the vertical axis represents the execution time of the operator or SQL (unit: seconds), and the horizontal axis represents the executed SQL.

Astore performance data
- dop=1: TPCH sequential scan operator improvement is 52%, end-to-end improvement is 27%.
- dop=8: TPCH sequential scan operator improvement is 28%, end-to-end improvement is 13%.
- Performance improvement results and impact on TPMC under mixed load (tpcc+tpch):
  
  TPCH sequential scan operator improvement is 19%, end-to-end improvement is 10%, and tpmc is not affected by prefetching.
  
  Note: The tpmc without prefetching is 410204, and with prefetching enabled, tpmc is 414793.
Ustore performance data

Results of TPCH testing on the master node under different degrees of parallelism:
- dop=1: Overall operator improvement is 41%, end-to-end improvement is 19%.
- dop=4: Overall operator improvement is 43%, end-to-end improvement is 21%.
- dop=8: Overall operator improvement is 45%, end-to-end improvement is 23%.
- dop=16: Overall operator improvement is 37%, end-to-end improvement is 13%.
Performance improvement results and impact on TPMC under mixed load (tpcc+tpch):
- dop=1: Overall operator improvement is 32%, end-to-end improvement is 19%, tpmc effect improvement is 3%, tpmc is not affected by prefetching.
- dop=4: Overall operator improvement is 38%, end-to-end improvement is 22%, tpmc effect improvement is 2%, tpmc is not affected by prefetching.

Constraints

Supports prefetching for serial and parallel scan scenarios.
The current version supports Astore and Ustore engines, and does not support Cstore or segmented page engines.
It is recommended to enable prefetching for NVMe SSDs and not for disk prefetching.

Usage Guidance

Usage Restrictions

If using a regular mechanical hard disk, disk I/O bandwidth may be the system bottleneck, so the advantages of prefetching cannot be demonstrated.
The sequential prefetching mechanism is mainly suitable for tables with a large amount of data (at least GB level). For tables with a small amount of data, it is not recommended to enable prefetching. The current default triggers prefetching at 1G, and the minimum table size for triggering prefetching can be adjusted to 512MB by the user through the GUC parameters min_table_block_num_enable_ios and min_uheap_table_block_num_enable_ios.

Configuration Steps

Configure Astore prefetching

enable_ios = true  // System level, takes effect after restarting the database, default is false
enable_heap_async_prefetch=true  // Session level, supports online configuration, default is false

Configure Ustore prefetching

enable_ios = true  // System level, takes effect after restarting the database, default is false
enable_uheap_async_prefetch=true  // Session level, supports online configuration, default is false

GUC Parameters

Note: In addition to enable_ios and ios_worker_num which require a database restart to take effect, other GUC parameters support online configuration.

No.	Parameter Description
1	enable_ios: Controls whether to start the IOS service.
2	enable_heap_async_prefetch: Controls whether to enable prefetching for Astore full table scan scenarios.
3	enable_uheap_async_prefetch: Controls whether to enable prefetching for Ustore full table scan scenarios.
4	ios_worker_num: The number of ios_workers in the IOS thread pool.
5	parallel_scan_gap: The number of pages each worker thread handles at a time when parallel scanning is enabled (query_dop > 1).
6	ios_batch_read_size: The number of pre-read pages issued to the disk by ios_worker in each batch.
7	max_requests_per_worker: The maximum queue depth for each ios_worker.
8	min_table_block_num_enable_ios: The minimum table size threshold for triggering prefetching of Astore tables. prefetching can only be triggered when the total number of data pages in the table is greater than or equal to this threshold. The current data page size is 8kB.
9	min_uheap_table_block_num_enable_ios: The minimum table size threshold for triggering prefetching of Ustore tables. prefetching can only be triggered when the total number of data pages in the table is greater than or equal to this threshold. The current data page size is 8kB.
10	prefetch_protect_time: The maximum protection time for pre-read buffers.
11	ios_status_update_gap: The time interval for updating IOS performance status.

Operations and Monitoring Capabilities

Users can perceive the effect of enabling prefetching through the shared buffer hit metric in the execution plan, which can significantly show a very high buffer hit rate. Combined with the GUC parameter track_io_timing = on, observe I/O Timings: read, i.e., the IO read latency is very low.
Related performance view: IOS_STATUS

Usage method: select * from ios_status();

Used to view the performance status of the IO thread pool responsible for prefetching in the recent period, including IOSCtl dispatched requests, IO latency/bandwidth, queue congestion, and other metrics. When the main query thread has high IO latency or low cache hit rate issues, users or developers can help locate problems by intuitively viewing the performance of the prefetching thread pool.

In-place Update Storage Engine Ustore

Issue