v5.0

MogDB 5.0.0

1. Version Description

MogDB 5.0.0 was officially released on 07/14/2023. As an LTS release, it is further enhanced based on version 3.0/3.1 and introduces new features from openGauss 5.0.0.

2. New Features

2.1 Incorporated with New Features of openGauss 5.0.0

High performance: The performance of basic operators is improved.
High security: user-level audit.
High availability: CM supports VIP management.
High availability: CM can be deployed on two nodes.
High availability: The read-only detection capability of CM disks is enhanced.
High availability: CM usability is improved.
High intelligence: Exception detection supports self-configuration.
Enterprise-level feature: event trigger.
Application development interface: enhanced JDBC capabilities.
Compatibilities: Compatible with most common MySQL syntax, including data types, operators, system functions, DDL, DML, DCL, and PL/SQL, implementing application migration without modification.

2.2 Parallel Index Scanning

In scenarios where indexes are used for queries (IndexScan, IndexOnlyScan), end-to-end performance can be improved by more than 10% in typical scenarios by using multiple threads to scan the indexes in parallel.

Support for normal and partitioned tables
Automatically determines whether to enable the parallel index scanning mechanism based on cost evaluation
Optimizing the Stream operator to improve the data merge processing rate between parallel scanning threads.

Related pages: Parallel Index Scanning

2.3 Parallel Query Performance Optimization

The following operators are transformed for parallel querying to better utilize multi-core processors and memory resources to improve operator querying performance:

Parallel sort merge join: when the system configuration allows parallel execution and the execution plan is suitable for parallel execution, multiple threads can be used to scan and match the associated columns at the same time, thus speeding up the execution of join operations;
Adaptive two-phase hash agg.: when executing aggregation queries, according to the characteristics of the actual data and query conditions, in scenarios suitable for adaptive two-phase hash aggregation, unnecessary memory and disk overhead can be avoided and query performance can be improved by about 6% to 17%;
Runtime result caching: In some queries, if the query results can be reused, MogDB caches the results in memory so that they can be reused for subsequent queries, which reduces the overhead of repeated computations and disk accesses and improves query performance;

2.4 Sorting Operator Optimization

Targeted optimization of sorting operators for different scenarios to reduce resource consumption and improve sorting operator performance:

Rapid sorting: For sorting on fixed-length field types, a performance improvement of 2%~5% can usually be obtained by eliminating the overhead of a large number of comparison function calls;
Incremental sorting: when sorting a specified combination of fields, when the combination of prefix fields can be utilized on the index or intermediate query results of the order, the full sort is optimized for incremental sorting to reduce the amount of data to be processed by the sort, and thus improve performance, with the use of LIMIT scenarios, it can bring about a 10 to 100 times performance improvement effect;

Related pages: Sorting Operator Optimization

2.5 SQL Bypass Enhancement

SQL Bypass is a query optimization technique for simple SQL, and this version extends the applicable scenarios of SQL Bypass:

By enabling SQL Bypass, the query performance can be improved by more than 10% under large data volume, and the performance improvement is more obvious under PBE scenarios.

Related pages:SQL Bypass

2.6 Astore Row-Level Compression

When the data is written to the aStore row storage table, the data is compressed in the background, which can get 50% storage space saving in typical scenarios, and at the same time, the performance is almost lossless through the background flow control technology to reduce the occupation of system resources.

Compression and decompression in units of record rows, for OLTP point checking scenarios, only one row of data needs to be decompressed, there is no decompression and amplification, and better query performance can be obtained;
The page in memory is compressed, which can provide higher memory hit rate under the same Shared Buffer size, and thus improve query performance;
When MVCC (Multi-Version Concurrency Control) looks up the historical version chain, there is no need to decompress the intermediate version, only the data rows of the target version. This can greatly reduce the workload of decompression and improve query performance;
Adopting the scheme of sharing dictionary with multiple data pages, compared with the scheme of training dictionary with single table, the compressed dictionary can evolve automatically with the change of data, and there is no problem of gradual failure of dictionary.

Related pages: Enhancement of Astore Row-Level Compression

2.7 Unique Index Compression

When a non-unique index of type B+Tree is created in the data table, there are multiple duplicate key/value key-value pairs in the index. By combining and compressing multiple key/value key-value pairs with the same key into a key->value list, you can get 50% index space saving, and at the same time, you can also improve the memory hit rate of the index page, which will in turn Improve the performance of index scan.

Related pages: BTree Index Compression

2.8 gstrace Enhancement

gstrace is a tool for tracing kernel code execution paths, recording kernel data structures and analyzing code performance, which can help developers gain a deeper understanding of kernel code execution and improve the efficiency of problem location. gstrace is enhanced in the following aspects in this version:

Support for tracing observation in production environment: gstrace can work without stopping the library, allowing users to specify one or more modules and functions for tracing observation, so as to avoid interference with the normal operation of the system and ensure the stability of the production environment;
Export SQL execution details: gstrace can record the details of SQL execution, including query statements, execution plans, running time and so on, this information can be used for subsequent fault analysis to help quickly locate and solve the problem, thereby improving diagnostic efficiency;
Support for tracing and analysis of modules: gstrace currently supports tracing walwriter module, pagewriter module and checkpointer module, these modules have an important role in the database system, by tracing their execution paths and recording the relevant data structure, you can gain an in-depth understanding of the system's operation, so as to carry out performance optimization and Troubleshooting;
Export analysis of lightweight locks: gstrace also supports export analysis of lightweight locks. Lightweight locks are a locking mechanism used to improve the efficiency of concurrent multi-threaded access, and are widely used in database systems. By tracing and analyzing lightweight locks, it can help identify and resolve potential thread safety issues;
Tracing Sessions and Threads: gstrace supports tracing one or more sessions and threads so that it can be pinpointed to a specific session or thread for detailed analysis and debugging;
DCF module enhancements: MogDB version 5.0 adds the ability to turn on trace for the DCF module. By specifying gstrace to trace DCF module you can get DCF-related function call information and specific information of key data structures. This helps developers and debuggers to quickly locate problems and improve diagnostic efficiency.

2.9 CM (Cluster Manager) Cluster Supports Two-node Deployments

CM cluster supports two-node deployment and realizes the majority arbitration mechanism by introducing a third-party gateway as an arbitration node, which reduces the cluster starting cost from 3 nodes to 2 nodes:

When the cluster network is normal, when any party in the CM master and backup instances initiates an election, as long as it communicates successfully with the IP of the third-party gateway, it will increase its own votes, thus realizing majority decision-making;
When the cluster network state is abnormal, any party in the CM master and backup instances finds that the communication with the CM instance at the opposite end is abnormal, but the communication with the third-party gateway IP is normal, it will consider that the opposite end may have a fault. If it is the CM master node, it will execute the down-standby process, and if it is the CM standby node, it will determine whether to execute the up-standby to realize automatic failover based on the configuration items;
Provide optional configuration to decide whether to raise the master of the standby in network partitioning scenarios to avoid the occurrence of brain cracking problems.

Related pages: CM Supporting Two-Node Deployment

2.10 Oracle Compatibility Enhancement

【1】Support To Keep The Same Name After The End With Oracle

【2】Support PLPGSQL subtype

【3】Add %rowtype Attribute To The View

【4】Oracle DBLink Syntax Compatibility

【5】Support For Constants In Package As Default Values

【6】current_date/current_time Keywords As Field Name

【7】MogDB Supports Insert All

【8】For Update Support Outer Join

【9】Support For dbms_utility.format_error_backtrace

【10】Support For Passing The Count Attribute Of An Array As A Parameter Of The Array Extend

【11】Custom Type Array

【12】Support Bypass Method When Merge Into Hit Index

【13】Support Where Current Of

【14】Aggregate Functions Support Keep Clause

【15】Aggregate Functions Support Scenario Extensions

【16】Support Subtracting Two Date Types To Return Numeric Type

【17】Support table()

【18】Remove Type Conversion Hint When Creating PACKAGE/FUNCTION/PROCEDURE

【19】Support Q Quote Escape Character

【20】Support Synonym Calls Without Parentheses For Function Without Parameters

【21】Support For Adding Nocopy Attributes To Procedure And Function Parameters

【22】Aggregate Functions Distinct Performance Optimization

3. Fixed Defects

3.1 Incorporate Fixed Defects in openGauss 5.0.0

I6NVAB: An error is reported during rollback and forcible rollback after gray upgrade from 3.0.3 to 5.0.0.
I6NK8U: When a publication/subscription name is changed to an existing name, the error message “ERROR: unsupported object class xxx” is displayed.
I6N2QD: When ESCAPE is used in SIMILAR TO, if the escape character is Chinese, an error is reported. If the LIKE escape character is Chinese, the operation is successful.
I6MOCS: In the B-compatible database, if the partition key of a level-2 partitioned table is the year function, an unknown error is reported when pg_get_tabledef is used after the ADD PARTITION operation is performed.
I6LMTU: In B-compatible mode, objects in backquotes are not supported.
I6KO5S: [Resource pooling] When the upgrade command is executed to upgrade shared storage, the upgrade fails.
I6KRZ6: [Resource pooling] The dsscmd du -p +data -f a command calculates only directories but not files.
I6IZH9: [Resource Pooling] The reform fails and does not enter the next round. The cluster is suspended.
I6O0K6: The versions 2.0.1, 2.0.0, and 2.0.5 without the CM tool fail to be committed after being gray upgraded to version 5.0.0 with the CM tool.
I61OX7: No error is reported when the value of the CM agent parameter log_dir is changed to null. After the CM agent is restarted for the change to take effect, no new CM agent log is generated.
I6K2BZ: Chameleon migration fails due to double quotation marks.
I6KIWQ: During Debezium+kafka incremental migration, the timestamp after the migration is incorrect.
I6I640: When the data check tool is used to check data in 150,000 tables, the check process is suspended.

3.2 Fixed Defects in MogDB 5.0.0

Fix the problem of sampling the same table in the same sql, specifying the same random number seedseed, and returning different results
Fix a problem with xlogdump exporting IDENTITY FULL's DELETE EVENT presence
Fix the probabilistic mutex deadlock problem in dolphin when thread pooling is turned on.
Fix the problem of abnormal exit when expanding segment page space on the standby machine
【TPCH】ERROR: Stream plan check failed
Fix the problem of slow query caused by selecting wrong query plan for Q20 under TPCH column storage
Fix switchover taking too long after 1000 warehouse high concurrency tpcc execution for half an hour
Fix gpi_interval test samples occasionally reporting errors on mogdb_function

4. Compatibility

This version supports the following operating system and CPU architecture combinations.

OS	CPU Architecture	Note
CentOS 7.x	X86_64 (Intel, AMD, Hygon, ZHAOXIN)
Redhat 7.x	X86_64 (Intel, AMD, Hygon, ZHAOXIN)
openEuler 20.03LTS	ARM (Kunpeng)、X86_64 (Intel, AMD, Hygon, ZHAOXIN)	The current version has not been verified for compatibility yet, 5.0.1 will support
Kylin V10	ARM (Kunpeng)、X86_64 (Intel, AMD, Hygon, ZHAOXIN)	OM is not compatible with openssh-8.2p1-9.p06 or above (including p06, p07, p08, p09) in the Kylin V10 sp2 upgrade patch for now.
UOS V20-D / V20-E	ARM (Kunpeng)、X86_64 (Intel, AMD, Hygon, ZHAOXIN)	The current version has not been verified for compatibility yet, 5.0.1 will support
UOS V20-A	X86_64 (Intel, AMD, Hygon, ZHAOXIN)	The current version has not been verified for compatibility yet, 5.0.1 will support

Issue