MogDB
Ecological Tools
Doc Menu

Hint-based Tuning

Plan Hint Optimization

In plan hints, you can specify a join order, join and scan operations, and the number of rows in a result to tune an execution plan, improving query performance.

Function

The hint syntax must follow immediately after a SELECT keyword and is written in the following format:

/*+ <plan hint>*/

You can specify multiple hints for a query plan and separate them by spaces. A hint specified for a query plan does not apply to its subquery plans. To specify a hint for a subquery, add the hint following the SELECT of this subquery.

For example:

select /*+ <plan_hint1> <plan_hint2> */ * from t1, (select /*+ <plan_hint3> */ from t2) where 1=1;

In the preceding command, <plan_hint1> and <plan_hint2> are the hints of a query, and <plan_hint3> is the hint of its subquery.

img NOTICE: If a hint is specified in the CREATE VIEW statement, the hint will be applied each time this view is used. If the random plan function is enabled (plan_mode_seed is set to a value other than 0), the specified hint will not be used.

Scope

Currently, the following hints are supported:

  • Join order hints (leading)
  • Join operation hints, excluding the semi join, anti join, and unique plan hints
  • Rows hints
  • Scan operation hints, supporting only tablescan, indexscan, and indexonlyscan
  • Sublink name hints

Precautions

Hints do not support Agg, Sort, Setop, or Subplan.

Example

The following is the original plan and is used for comparing with the optimized ones:

create table store
(
    s_store_sk                integer               not null,
    s_store_id                char(16)              not null,
    s_rec_start_date          date                          ,
    s_rec_end_date            date                          ,
    s_closed_date_sk          integer                       ,
    s_store_name              varchar(50)                   ,
    s_number_employees        integer                       ,
    s_floor_space             integer                       ,
    s_hours                   char(20)                      ,
    s_manager                 varchar(40)                   ,
    s_market_id               integer                       ,
    s_geography_class         varchar(100)                  ,
    s_market_desc             varchar(100)                  ,
    s_market_manager          varchar(40)                   ,
    s_division_id             integer                       ,
    s_division_name           varchar(50)                   ,
    s_company_id              integer                       ,
    s_company_name            varchar(50)                   ,
    s_street_number           varchar(10)                   ,
    s_street_name             varchar(60)                   ,
    s_street_type             char(15)                      ,
    s_suite_number            char(10)                      ,
    s_city                    varchar(60)                   ,
    s_county                  varchar(30)                   ,
    s_state                   char(2)                       ,
    s_zip                     char(10)                      ,
    s_country                 varchar(20)                   ,
    s_gmt_offset              decimal(5,2)                  ,
    s_tax_precentage          decimal(5,2)                  ,
    primary key (s_store_sk)
);
create table store_sales
(
    ss_sold_date_sk           integer                       ,
    ss_sold_time_sk           integer                       ,
    ss_item_sk                integer               not null,
    ss_customer_sk            integer                       ,
    ss_cdemo_sk               integer                       ,
    ss_hdemo_sk               integer                       ,
    ss_addr_sk                integer                       ,
    ss_store_sk               integer                       ,
    ss_promo_sk               integer                       ,
    ss_ticket_number          integer               not null,
    ss_quantity               integer                       ,
    ss_wholesale_cost         decimal(7,2)                  ,
    ss_list_price             decimal(7,2)                  ,
    ss_sales_price            decimal(7,2)                  ,
    ss_ext_discount_amt       decimal(7,2)                  ,
    ss_ext_sales_price        decimal(7,2)                  ,
    ss_ext_wholesale_cost     decimal(7,2)                  ,
    ss_ext_list_price         decimal(7,2)                  ,
    ss_ext_tax                decimal(7,2)                  ,
    ss_coupon_amt             decimal(7,2)                  ,
    ss_net_paid               decimal(7,2)                  ,
    ss_net_paid_inc_tax       decimal(7,2)                  ,
    ss_net_profit             decimal(7,2)                  ,
    primary key (ss_item_sk, ss_ticket_number)
);
create table store_returns
(
    sr_returned_date_sk       integer                       ,
    sr_return_time_sk         integer                       ,
    sr_item_sk                integer               not null,
    sr_customer_sk            integer                       ,
    sr_cdemo_sk               integer                       ,
    sr_hdemo_sk               integer                       ,
    sr_addr_sk                integer                       ,
    sr_store_sk               integer                       ,
    sr_reason_sk              integer                       ,
    sr_ticket_number          integer               not null,
    sr_return_quantity        integer                       ,
    sr_return_amt             decimal(7,2)                  ,
    sr_return_tax             decimal(7,2)                  ,
    sr_return_amt_inc_tax     decimal(7,2)                  ,
    sr_fee                    decimal(7,2)                  ,
    sr_return_ship_cost       decimal(7,2)                  ,
    sr_refunded_cash          decimal(7,2)                  ,
    sr_reversed_charge        decimal(7,2)                  ,
    sr_store_credit           decimal(7,2)                  ,
    sr_net_loss               decimal(7,2)                  ,
    primary key (sr_item_sk, sr_ticket_number)
);
create table customer
(
    c_customer_sk             integer               not null,
    c_customer_id             char(16)              not null,
    c_current_cdemo_sk        integer                       ,
    c_current_hdemo_sk        integer                       ,
    c_current_addr_sk         integer                       ,
    c_first_shipto_date_sk    integer                       ,
    c_first_sales_date_sk     integer                       ,
    c_salutation              char(10)                      ,
    c_first_name              char(20)                      ,
    c_last_name               char(30)                      ,
    c_preferred_cust_flag     char(1)                       ,
    c_birth_day               integer                       ,
    c_birth_month             integer                       ,
    c_birth_year              integer                       ,
    c_birth_country           varchar(20)                   ,
    c_login                   char(13)                      ,
    c_email_address           char(50)                      ,
    c_last_review_date        char(10)                      ,
    primary key (c_customer_sk)
);
create table promotion
(
    p_promo_sk                integer               not null,
    p_promo_id                char(16)              not null,
    p_start_date_sk           integer                       ,
    p_end_date_sk             integer                       ,
    p_item_sk                 integer                       ,
    p_cost                    decimal(15,2)                 ,
    p_response_target         integer                       ,
    p_promo_name              char(50)                      ,
    p_channel_dmail           char(1)                       ,
    p_channel_email           char(1)                       ,
    p_channel_catalog         char(1)                       ,
    p_channel_tv              char(1)                       ,
    p_channel_radio           char(1)                       ,
    p_channel_press           char(1)                       ,
    p_channel_event           char(1)                       ,
    p_channel_demo            char(1)                       ,
    p_channel_details         varchar(100)                  ,
    p_purpose                 char(15)                      ,
    p_discount_active         char(1)                       ,
    primary key (p_promo_sk)
);
create table customer_address
(
    ca_address_sk             integer               not null,
    ca_address_id             char(16)              not null,
    ca_street_number          char(10)                      ,
    ca_street_name            varchar(60)                   ,
    ca_street_type            char(15)                      ,
    ca_suite_number           char(10)                      ,
    ca_city                   varchar(60)                   ,
    ca_county                 varchar(30)                   ,
    ca_state                  char(2)                       ,
    ca_zip                    char(10)                      ,
    ca_country                varchar(20)                   ,
    ca_gmt_offset             decimal(5,2)                  ,
    ca_location_type          char(20)                      ,
    primary key (ca_address_sk)
);
create table item
(
    i_item_sk                 integer               not null,
    i_item_id                 char(16)              not null,
    i_rec_start_date          date                          ,
    i_rec_end_date            date                          ,
    i_item_desc               varchar(200)                  ,
    i_current_price           decimal(7,2)                  ,
    i_wholesale_cost          decimal(7,2)                  ,
    i_brand_id                integer                       ,
    i_brand                   char(50)                      ,
    i_class_id                integer                       ,
    i_class                   char(50)                      ,
    i_category_id             integer                       ,
    i_category                char(50)                      ,
    i_manufact_id             integer                       ,
    i_manufact                char(50)                      ,
    i_size                    char(20)                      ,
    i_formulation             char(20)                      ,
    i_color                   char(20)                      ,
    i_units                   char(10)                      ,
    i_container               char(10)                      ,
    i_manager_id              integer                       ,
    i_product_name            char(50)                      ,
    primary key (i_item_sk)
);
explain
select i_product_name product_name
,i_item_sk item_sk
,s_store_name store_name
,s_zip store_zip
,ad2.ca_street_number c_street_number
,ad2.ca_street_name c_street_name
,ad2.ca_city c_city
,ad2.ca_zip c_zip
,count(*) cnt
,sum(ss_wholesale_cost) s1
,sum(ss_list_price) s2
,sum(ss_coupon_amt) s3
FROM   store_sales
,store_returns
,store
,customer
,promotion
,customer_address ad2
,item
WHERE  ss_store_sk = s_store_sk AND
ss_customer_sk = c_customer_sk AND
ss_item_sk = i_item_sk and
ss_item_sk = sr_item_sk and
ss_ticket_number = sr_ticket_number and
c_current_addr_sk = ad2.ca_address_sk and
ss_promo_sk = p_promo_sk and
i_color in ('maroon','burnished','dim','steel','navajo','chocolate') and
i_current_price between 35 and 35 + 10 and
i_current_price between 35 + 1 and 35 + 15
group by i_product_name
,i_item_sk
,s_store_name
,s_zip
,ad2.ca_street_number
,ad2.ca_street_name
,ad2.ca_city
,ad2.ca_zip
;

img

Join Order Hints

Function

Theses hints specify the join order and outer/inner tables.

Syntax

  • Specify only the join order.

    leading(join_table_list) 
  • Specify the join order and outer/inner tables. The outer/inner tables are specified by the outermost parentheses.

    leading((join_table_list)) 

Parameter Description

join_table_list specifies the tables to be joined. The values can be table names or table aliases. If a subquery is pulled up, the value can also be the subquery alias. Separate the values with spaces. You can add parentheses to specify the join priorities of tables.

img NOTICE: A table name or alias can only be a string without a schema name. An alias (if any) is used to represent a table.

To prevent semantic errors, tables in the list must meet the following requirements:

  • The tables must exist in the query or its subquery to be pulled up.
  • The table names must be unique in the query or subquery to be pulled up. If they are not, their aliases must be unique.
  • A table appears only once in the list.
  • An alias (if any) is used to represent a table.

For example:

leading(t1 t2 t3 t4 t5):t1, t2, t3, t4, and t5 are joined. The join order and outer/inner tables are not specified.

leading((t1 t2 t3 t4 t5)):t1, t2, t3, t4, and t5 are joined in sequence. The table on the right is used as the inner table in each join.

leading(t1 (t2 t3 t4) t5): First, t2, t3, and t4 are joined and the outer/inner tables are not specified. Then, the result is joined with t1 and t5, and the outer/inner tables are not specified.

leading((t1 (t2 t3 t4) t5)): First, t2, t3, and t4 are joined and the outer/inner tables are not specified. Then, the result is joined with t1, and (t2 t3 t4) is used as the inner table. Finally, the result is joined with t5, and t5 is used as the inner table.

leading((t1 (t2 t3) t4 t5)) leading((t3 t2)): First, t2 and t3 are joined and t2 is used as the inner table. Then, the result is joined with t1, and (t2 t3) is used as the inner table. Finally, the result is joined with t4 and then t5, and the table on the right in each join is used as the inner table.

Example

Hint the query plan in Example as follows:

explain
select /*+ leading((((((store_sales store) promotion) item) customer) ad2) store_returns) leading((store store_sales))*/ i_product_name product_name ...

First, store_sales and store are joined and store_sales is the inner table. Then, the result is joined with promotion, item, customer, ad2, and store_returns in sequence. The optimized plan is as follows:

img

Join Operation Hints

Function

These hints specify the join method, which can be nested loop join, hash join, or merge join.

Syntax

[no] nestloop|hashjoin|mergejoin(table_list)

Parameter Description

  • no indicates that the specified hint will not be used for a join.
  • table_list specifies the tables to be joined. The values are the same as those of join_table_list but contain no parentheses.

For example:

no nestloop(t1 t2 t3):nestloop is not used for joining t1, t2, and t3. The three tables may be joined in either of the two ways: Join t2 and t3, and then t1; join t1 and t2, and then t3. This hint takes effect only for the last join. If necessary, you can hint other joins. For example, you can add no nestloop(t2 t3) to join t2 and t3 first and to forbid the use of nestloop.

Example

Hint the query plan in Example as follows:

explain
select /*+ nestloop(store_sales store_returns item) */ i_product_name product_name ...

nestloop is used for the last join between store_sales, store_returns, and item. The optimized plan is as follows:

img

Rows Hints

Function

These hints specify the number of rows in an intermediate result set. Both absolute values and relative values are supported.

Syntax

rows(table_list #|+|-|* const)

Parameter Description

  • #, +, -, and * are operators used for hinting the estimation. # indicates that the original estimation is used without any calculation. +, -, and * indicate that the original estimation is calculated using these operators. The minimum calculation result is 1. table_list specifies the tables to be joined. The values are the same as those of table_list in Join Operation Hints.
  • const can be any non-negative number and supports scientific notation.

For example:

rows(t1 #5): The result set of t1 is five rows.

rows(t1 t2 t3 *1000): Multiply the result set of joined t1, t2, and t3 by 1000.

Suggestion

  • The hint using * for two tables is recommended. This hint will be triggered if the two tables appear on two sides of a join. For example, if the hint is rows(t1 t2 * 3), the join result of (t1 t3 t4) and (t2 t5 t6) will be multiplied by 3 because t1 and t2 appear on both sides of the join.
  • rows hints can be specified for the result sets of a single table, multiple tables, function tables, and subquery scan tables.

Example

Hint the query plan in Example as follows:

explain
select /*+ rows(store_sales store_returns *50) */ i_product_name product_name ...

Multiply the result set of joined store_sales and store_returns by 50. The optimized plan is as follows:

img

Scan Operation Hints

Function

These hints specify a scan operation, which can be tablescan, indexscan, or indexonlyscan.

Syntax

[no] tablescan|indexscan|indexonlyscan(table [index])

Parameter Description

  • no indicates that the specified hint will not be used for a join.
  • table specifies the table to be scanned. You can specify only one table. Use a table alias (if any) instead of a table name.
  • index indicates the index for indexscan or indexonlyscan. You can specify only one index.

img NOTE: indexscan and indexonlyscan hints can be used only when the specified index belongs to the table. Scan operation hints can be used for row-store tables, column-store tables, OBS tables, and subquery tables.

Example

To specify an index-based hint for a scan, create an index named i on the i_item_sk column of the item table.

create index i on item(i_item_sk);

Hint the query plan in Example as follows:

explain
select /*+ indexscan(item i) */ i_product_name product_name ...

item is scanned based on an index. The optimized plan is as follows:

img

Function

These hints specify the name of a sublink block.

Syntax

blockname (table)

Parameter Description

  • table specifies the name you have specified for a sublink block.

img NOTE:

  • The blockname hint is used by an outer query only when a sublink is pulled up. Currently, only the Agg equivalent join, IN, and EXISTS sublinks can be pulled up. This hint is usually used together with the hints described in the previous sections.
  • The subquery after the FROM keyword is hinted by using the subquery alias. In this case, blockname becomes invalid.
  • If a sublink contains multiple tables, the tables will be joined with the outer-query tables in a random sequence after the sublink is pulled up. In this case, blockname also becomes invalid.

Example

explain select /*+nestloop(store_sales tt) */ * from store_sales where ss_item_sk in (select /*+blockname(tt)*/ i_item_sk from item group by 1);

tt indicates the sublink block name. After being pulled up, the sublink is joined with the outer-query table store_sales by using nestloop. The optimized plan is as follows:

img

Hint Errors, Conflicts, and Other Warnings

Plan hints change an execution plan. You can run EXPLAIN to view the changes.

Hints containing errors are invalid and do not affect statement execution. The errors will be displayed in different ways based on statement types. Hint errors in an EXPLAIN statement are displayed as a warning on the interface. Hint errors in other statements will be recorded in debug1-level logs containing the PLANHINT keyword.

Hint error types are as follows:

  • Syntax errors

    An error will be reported if the syntax tree fails to be reduced. The No. of the row generating an error is displayed in the error details.

    For example, the hint keyword is incorrect, no table or only one table is specified in the leading or join hint, or no tables are specified in other hints. The parsing of a hint is terminated immediately after a syntax error is detected. Only the hints that have been parsed successfully are valid.

    For example:

    leading((t1 t2)) nestloop(t1) rows(t1 t2 #10)

    The syntax of nestloop(t1) is wrong and its parsing is terminated. Only leading(t1 t2) that has been successfully parsed before nestloop(t1) is valid.

  • Semantic errors

    • An error will be reported if the specified tables do not exist, multiple tables are found based on the hint setting, or a table is used more than once in the leading or join hint.
    • An error will be reported if the index specified in a scan hint does not exist.
    • If multiple tables with the same name exist after a subquery is pulled up and some of them need to be hinted, add aliases for them to avoid name duplication.
  • Duplicated or conflicted hints

    If hint duplication or conflicts occur, only the first hint takes effect. A message will be displayed to describe the situation.

    • Hint duplication indicates that a hint is used more than once in the same query, for example, nestloop(t1 t2) nestloop(t1 t2).
    • A hint conflict indicates that the functions of two hints with the same table list conflict with each other.

      For example, if nestloop (t1 t2) hashjoin (t1 t2) is used, hashjoin (t1 t2) becomes invalid. nestloop(t1 t2) does not conflict with no mergejoin(t1 t2).

      img NOTICE: The table list in the leading hint is disassembled. For example, leading ((t1 t2 t3)) will be disassembled as leading((t1 t2)) leading(((t1 t2) t3)), which will conflict with leading((t2 t1)) (if any). In this case, the latter leading(t2 t1) becomes invalid. If two hints use duplicated table lists and only one of them has the specified outer/inner table, the one without a specified outer/inner table becomes invalid.

  • A hint becomes invalid after a sublink is pulled up.

    In this case, a message will be displayed. Generally, such invalidation occurs when a sublink contains multiple tables to be joined. After the sublink is pulled up, the tables will not be join members.

  • Unsupported column types

    • Skew hints are specified to optimize redistribution. They will be invalid if their corresponding columns do not support redistribution.
  • Hints are not used.

    • If hashjoin or mergejoin is specified for non-equivalent joins, it will not be used.
    • If indexscan or indexonlyscan is specified for a table that does not have an index, it will not be used.
    • If indexscan hint or indexonlyscan is specified for a full-table scan or for a scan whose filtering conditions are not set on index columns, it will not be used.
    • The specified indexonlyscan hint is used only when the output column contains only indexes.
    • In equivalent joins, only the joins containing equivalence conditions are valid. Therefore, the leading, join, and rows hints specified for the joins without an equivalence condition will not be used. For example, t1, t2, and t3 are to be joined, and the join between t1 and t3 does not contain an equivalence condition. In this case, leading(t1 t3) will not be used.
    • To generate a streaming plan, if the distribution key of a table is the same as its join key, redistribute specified for this table will not be used. If the distribution key and join key are different for this table but the same for the other table in the join, redistribute specified for this table will be used but broadcast will not.
    • If no sublink is pulled up, the specified blockname hint will not be used.
    • Skew hints are not used possibly because:

      • The plan does not require redistribution.
      • The columns specified by hints contain distribution keys.
      • Skew information specified in hints is incorrect or incomplete, for example, no value is specified for join optimization.
      • Skew optimization is disabled by GUC parameters.