Pruning ORDER BY in Subqueries

Availability

This feature is available since MogDB 5.0.6.

Introduction

This feature allows subqueries to prune unnecessary ORDER BY columns as needed when the parent query involves aggregation, grouping, or sorting operations, reducing unnecessary performance waste.

Benefits

Eliminate useless ORDER BY statements within SQL subqueries to improve query performance, simplify query logic, and reduce unnecessary overhead, thereby enhancing the overall efficiency and performance of the database system.

Description

When the parent query has no need for order, useless ORDER BY statements in the SQL subquery can be pruned to enhance query performance and execution efficiency, generating a better execution plan.

For example:

SELECT a, b FROM (SELECT * FROM t1 ORDER BY a, b, c) s1 GROUP BY a, b;

In the above SQL statement, subquery s1 is sorted by a, b, and c, but the upper query only performs grouping operations on a and b. Column c will not be output, and the sorting of column c in the subquery is not used in the upper query, so it can be pruned.

Pruning the sort keys in the subquery must meet the following conditions:

The parent query is valid SQL and can generate an execution plan normally.
If the parent query does not change the output order of the subquery, the subquery is not pruned.
The subquery cannot be a non-inline CTE or expression subquery.
The subquery must not have order-sensitive operations (e.g., for update, limit, etc.).

Note that when the natural order of the parent query comes entirely from the data source subquery, the sorting in the subquery will not be pruned.

The feature is enabled by the GUC parameter sort_key_pruning_level, which takes effect when set to balanced/aggressive, with the default being on.

Parameter Description

A new GUC parameter sort_key_pruning_level is added to control the optimizer's pruning rules for useless sort keys in subqueries.

This parameter is of type USERSET, with a range of values from off, balanced, to aggressive. The values correspond to different optimization levels. The default is balanced.

off: Turns off the pruning feature for useless sort keys in subqueries.
balanced: A conservative pruning strategy. The database optimizer only attempts to prune completely useless sort keys and will not prune subqueries in set operations.
aggressive: A more aggressive pruning strategy. The database optimizer will attempt to prune all potentially unused sort keys, including those in set operations.

Constraints

When the rewrite_rule parameter has reduce_orderby enabled, it takes precedence.
For the beta feature of the old version, you need to turn off the canonical_pathkey option in sql_beta_feature. This feature only supports pruning after the Pathkey is regularized.
For set operations, in the balance mode, only UNION ALL will not be pruned, as other set operations require sorting or hashing for de-duplication, while UNION ALL does not; in aggressive mode, all set operations will be pruned.

Example

-- Create a test table
MogDB=# CREATE TABLE t1(a int, b int, c int, d int);
CREATE TABLE

-- Check for the presence of sorting in the plan
MogDB=# EXPLAIN (COSTS OFF) SELECT COUNT(*) FROM (SELECT * FROM t1 ORDER BY 1, 2) AS s1;
      QUERY PLAN
----------------------
 Aggregate
   ->  Seq Scan on t1
(2 rows)

-- Turn off the reduce_orderby option in rewrite_rule
MogDB=# set rewrite_rule = 'magicset';
SET

-- Turn off the canonical_pathkey option in sql_beta_feature
MogDB=# set sql_beta_feature = 'none';
SET

-- Conservative pruning of sort keys
MogDB=# SET sort_key_pruning_level TO balanced;
SET

-- Partial pruning
MogDB=# EXPLAIN (COSTS OFF) SELECT a FROM (SELECT * FROM t1 ORDER BY a, b) s1 GROUP BY a;
         QUERY PLAN
----------------------------
 Group
   Group By Key: t1.a
   ->  Sort
         Sort Key: t1.a
         ->  Seq Scan on t1
(5 rows)

-- Partial pruning, incremental sorting
MogDB=# EXPLAIN (COSTS OFF) SELECT * FROM (SELECT * FROM t1 ORDER BY a, b) ORDER BY a, c;
         QUERY PLAN
----------------------------
 Incremental Sort
   Sort Key: t1.a, t1.c
   Presorted Key: t1.a
   ->  Sort
         Sort Key: t1.a
         ->  Seq Scan on t1
(6 rows)

-- Complete pruning
MogDB=# EXPLAIN (COSTS OFF) SELECT * FROM (SELECT * FROM t1 ORDER BY a, b) ORDER BY b, c;
       QUERY PLAN
------------------------
 Sort
   Sort Key: t1.b, t1.c
   ->  Seq Scan on t1
(3 rows)

-- Retain merge keys
MogDB=# EXPLAIN (COSTS OFF) SELECT * FROM (SELECT * FROM t1 ORDER BY a, b) s1 LEFT JOIN t1 ON s1.a = t1.a;
                QUERY PLAN
------------------------------------------
 Hash Right Join
   Hash Cond: (public.t1.a = public.t1.a)
   ->  Seq Scan on t1
   ->  Hash
         ->  Sort
               Sort Key: public.t1.a
               ->  Seq Scan on t1
(7 rows)

-- Do not prune sorting in append
MogDB=# EXPLAIN (COSTS OFF) SELECT v.b FROM (SELECT * FROM (SELECT a, b FROM t1 ORDER BY b) UNION ALL (SELECT a, b FROM t1 ORDER BY a)) v GROUP BY 1;
                    QUERY PLAN
---------------------------------------------------
 HashAggregate
   Group By Key: __unnamed_subquery__.b
   ->  Append
         ->  Subquery Scan on __unnamed_subquery__
               ->  Sort
                     Sort Key: public.t1.b
                     ->  Seq Scan on t1
         ->  Subquery Scan on "*SELECT* 2"
               ->  Sort
                     Sort Key: public.t1.a
                     ->  Seq Scan on t1
(11 rows)

-- Aggressive pruning of sort keys
MogDB=# SET sort_key_pruning_level TO aggressive;
SET

-- Complete pruning
MogDB=# EXPLAIN (COSTS OFF) SELECT a FROM (SELECT * FROM t1 ORDER BY a, b) s1 GROUP BY a;
         QUERY PLAN
----------------------------
 HashAggregate
   Group By Key: s1.a
   ->  Subquery Scan on s1
         ->  Seq Scan on t1
(4 rows)

-- Complete pruning
MogDB=# EXPLAIN (COSTS OFF) SELECT * FROM (SELECT * FROM t1 ORDER BY a, b) ORDER BY a, c;
       QUERY PLAN
------------------------
 Sort
   Sort Key: t1.a, t1.c
   ->  Seq Scan on t1
(3 rows)

-- Complete pruning
MogDB=# EXPLAIN (COSTS OFF) SELECT * FROM (SELECT * FROM t1 ORDER BY a, b) ORDER BY b, c;
       QUERY PLAN
------------------------
 Sort
   Sort Key: t1.b, t1.c
   ->  Seq Scan on t1
(3 rows)

-- Do not retain merge keys
MogDB=# EXPLAIN (COSTS OFF) SELECT * FROM (SELECT * FROM t1 ORDER BY a, b) s1 LEFT JOIN t1 ON s1.a = t1.a;
                QUERY PLAN
------------------------------------------
 Hash Left Join
   Hash Cond: (public.t1.a = public.t1.a)
   ->  Seq Scan on t1
   ->  Hash
         ->  Seq Scan on t1
(5 rows)

-- Pruning sorting in append
MogDB=# EXPLAIN (COSTS OFF) SELECT v.b FROM (SELECT * FROM (SELECT a, b FROM t1 ORDER BY b) UNION ALL (SELECT a, b FROM t1 ORDER BY a)) v GROUP BY 1;
                    QUERY PLAN
---------------------------------------------------
 HashAggregate
   Group By Key: __unnamed_subquery__.b
   ->  Append
         ->  Subquery Scan on __unnamed_subquery__
               ->  Seq Scan on t1
         ->  Subquery Scan on "*SELECT* 2"
               ->  Seq Scan on t1
(7 rows)

sort_key_pruning_level, sql_beta_feature, rewrite_rule