
TPCC Test Optimization Guide

This chapter describes how to run the TPCC benchmark against MogDB and the key system-level tuning needed to reach the best tpmC result.


Hardware Environment

  • Servers

    • The best TPCC results were obtained with one 4-socket Kunpeng server (256 cores, 512 GB-1024 GB memory) plus one 2-socket Kunpeng server as the client
    • A common setup is two 2-socket Kunpeng servers, each with 128 cores and 512 GB-1024 GB memory
    • Two x86 servers also work, but this guide does not cover NUMA tuning for x86
  • Disks

    • On the database side, use two NVMe flash cards if possible
    • Otherwise use 3-4 SSDs
  • NIC

    • The Hi1822 NIC that ships with Kunpeng servers
    • On x86, use a 10 GbE NIC if possible

Software Environment

  • Database: MogDB 2.0.1

  • TPCC client: BenchmarkSQL 5.0 with the TiDB optimizations (https://github.com/pingcap/benchmarksql)

  • Dependencies

    Required software    Recommended version
    numactl              -
    jdk                  1.8.0-242
    ant                  1.10.5
    htop                 -
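
    A quick way to confirm the dependencies are in place before building BenchmarkSQL (the versions in the table are recommendations rather than hard requirements):

    java -version        # expect a 1.8.x JDK
    ant -version         # expect 1.10.x
    numactl --hardware   # prints the NUMA topology and confirms numactl is installed
    htop --version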

Test Steps

  1. Install MogDB. Refer to Installing MogDB; a single-node deployment is sufficient.

  2. Apply the initial parameter settings and restart the database so they take effect. Refer to the recommended parameter settings and create the test database.

  3. Download the standard TPCC test tool BenchmarkSQL 5.0.

    [root@node151 ~]# git clone -b 5.0-mysql-support-opt-2.1 https://github.com/pingcap/benchmarksql.git
    Cloning into 'tpcc-mysql'...
    remote: Enumerating objects: 106, done.
    remote: Total 106 (delta 0), reused 0 (delta 0), pack-reused 106
    Receiving objects: 100% (106/106), 64.46 KiB | 225.00 KiB/s, done.
    Resolving deltas: 100% (30/30), done.
  4. Download and install the JDK and ant dependency packages.

    [root@node151 ~]# rpm -ivh ant-1.10.5-6.oe1.noarch.rpm jdk-8u281-linux-aarch64.rpm --force --nodeps
    warning: ant-1.10.5-6.oe1.noarch.rpm: Header V3 RSA/SHA1 Signature, key ID b25e7f66: NOKEY
    warning: jdk-8u281-linux-aarch64.rpm: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
    Verifying...                          ################################# [100%]
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:jdk1.8-2000:1.8.0_281-fcs        ################################# [ 50%]
    Unpacking JAR files...
            tools.jar...
            rt.jar...
            jsse.jar...
            charsets.jar...
            localedata.jar...
       2:ant-0:1.10.5-6.oe1               ################################# [100%]
  5. Configure the JAVA environment variables.

    [root@node151 ~]# tail -3  /root/.bashrc
    export JAVA_HOME=/usr/java/jdk1.8.0_281-aarch64
    export PATH=$JAVA_HOME/bin:$PATH
    export CLASSPATH=.:$JAVA_HOME/lib:$BENCHMARKSQLPATH/run/ojdbc7.jar
  6. Run the ant command in the BenchmarkSQL directory to compile it. A successful build produces the build and dist directories.

    [root@node151 benchmarksql-5.0-mysql-support-opt-2.1]# pwd
    /tmp/benchmarksql-5.0-mysql-support-opt-2.1
    [root@node151 benchmarksql-5.0-mysql-support-opt-2.1]# ant
    Buildfile: /tmp/benchmarksql-5.0-mysql-support-opt-2.1/build.xml
    
    init:
        [mkdir] Created dir: /tmp/benchmarksql-5.0-mysql-support-opt-2.1/build
    
    compile:
        [javac] Compiling 12 source files to /tmp/benchmarksql-5.0-mysql-support-opt-2.1/build
    
    dist:
        [mkdir] Created dir: /tmp/benchmarksql-5.0-mysql-support-opt-2.1/dist
          [jar] Building jar: /tmp/benchmarksql-5.0-mysql-support-opt-2.1/dist/BenchmarkSQL-5.0.jar
    
    BUILD SUCCESSFUL
    Total time: 1 second
  7. Download the JDBC driver matching your system architecture to the lib/postgres folder under the BenchmarkSQL directory, extract it, and delete the bundled JDBC driver.

    [root@node151 postgres]# pwd
    /tmp/benchmarksql-5.0-mysql-support-opt-2.1/lib/postgres/
    [root@node151 postgres]# ls
    openGauss-2.0.0-JDBC.tar.gz  postgresql-9.3-1102.jdbc41.jar
    [root@node151 postgres]# rm -f postgresql-9.3-1102.jdbc41.jar
    [root@node151 postgres]# tar -xf openGauss-2.0.0-JDBC.tar.gz
    [root@node151 postgres]# ls
    openGauss-2.0.0-JDBC.tar.gz  postgresql.jar
  8. Prepare the database side: create the database tpcc_db and the user tpcc.

    [omm@node151 ~]$ gsql -d postgres -p 26000 -r
    postgres=# create database tpcc_db;
    CREATE DATABASE
    postgres=# \q
    [omm@node151 ~]$ gsql -d tpcc_db -p 26000 -r
    tpcc_db=# CREATE USER tpcc WITH PASSWORD "tpcc@123";
    CREATE ROLE
    tpcc_db=# GRANT ALL ON schema public TO tpcc;
    GRANT
    tpcc_db=# ALTER User tpcc sysadmin;
    ALTER ROLE
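
    If the BenchmarkSQL client runs on a separate host, remote access for the tpcc user must also be allowed. A minimal sketch, assuming the standard gs_guc tooling and an illustrative client subnet (adjust to your network):

    gs_guc set -D $PGDATA -c "listen_addresses = '*'"              # let the server listen on all addresses
    gs_guc set -D $PGDATA -h "host all tpcc 172.16.0.0/16 sha256"  # pg_hba entry for the client subnet (illustrative)
    gs_ctl restart -D $PGDATA                                      # restart so listen_addresses takes effect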
  9. Prepare the client: go to the run folder under the BenchmarkSQL directory and edit the benchmarksql properties file, setting the test parameters, including the database user name, password, IP, port, and database.

    [root@node151 db1]# cd /tmp/benchmarksql-5.0-mysql-support-opt-2.1/run
    [root@node151 run]# vim props.mogdb
    db=postgres
    driver=org.postgresql.Driver
    # connection string: IP address, port and database name
    conn=jdbc:postgresql://172.16.0.176:26000/tpcc_db?prepareThreshold=1&batchMode=on&fetchsize=10&loggerLevel=off
    # database user
    user=tpcc
    # password
    password=tpcc@123
    # number of warehouses
    warehouses=100
    # number of concurrent terminals
    terminals=300
    # run time in minutes
    runMins=5
    runTxnsPerTerminal=0
    loadWorkers=100
    limitTxnsPerMin=0
    terminalWarehouseFixed=false
    newOrderWeight=45
    paymentWeight=43
    orderStatusWeight=4
    deliveryWeight=4
    stockLevelWeight=4
  10. Load the initial data.

    [root@node151 run]# sh runDatabaseBuild.sh props.mogdb
    
    # ------------------------------------------------------------
    # Loading SQL file ./sql.common/tableCreates.sql
    # ------------------------------------------------------------
    create table bmsql_config (
    cfg_name    varchar(30) primary key,
    cfg_value   varchar(50)
    );
    
    
    ......
    
    # ------------------------------------------------------------
    # Loading SQL file ./sql.postgres/buildFinish.sql
    # ------------------------------------------------------------
    -- ----
    -- Extra commands to run after the tables are created, loaded,
    -- indexes built and extra's created.
    -- PostgreSQL version.
    -- ----
    vacuum analyze;
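
    After the load completes, the row counts can be sanity-checked against the configured warehouse count; for example (gsql's -c option runs a single command):

    gsql -d tpcc_db -p 26000 -c "select count(*) from bmsql_warehouse;"

    The result should match the warehouses value in props.mogdb (100 in this guide).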
  11. In runBenchmark.sh, change the funcs.sh path to its actual location.

    [root@node151 run]# vim runBenchmark.sh
    #!/usr/bin/env bash
    
    if [ $# -ne 1 ] ; then
        echo "usage: $(basename $0) PROPS_FILE" >&2
        exit 2
    fi
    
    SEQ_FILE="./.jTPCC_run_seq.dat"
    if [ ! -f "${SEQ_FILE}" ] ; then
        echo "0" > "${SEQ_FILE}"
    fi
    SEQ=$(expr $(cat "${SEQ_FILE}") + 1) || exit 1
    echo "${SEQ}" > "${SEQ_FILE}"
    
    source /tmp/benchmarksql-5.0-mysql-support-opt-2.1/run/funcs.sh $1  # change this to the actual path of funcs.sh
    
    setCP || exit 1
    
    myOPTS="-Dprop=$1 -DrunID=${SEQ}"
    
    java -cp "$myCP" $myOPTS jTPCC
  12. Start the test and run the TPCC benchmark. The tpmC figure is the test result; the output is also saved to runLog_mmdd-hh24miss.log.

    [root@node151 run]# sh runBenchmark.sh props.mogdb| tee runLog_`date +%m%d-%H%M%S`.log
    
    ...
    
    15:08:26,663 [Thread-16] INFO   jTPCC : Term-00, Measured tpmC (NewOrders) = 106140.46
    15:08:26,663 [Thread-16] INFO   jTPCC : Term-00, Measured tpmTOTAL = 235800.39
    15:08:26,664 [Thread-16] INFO   jTPCC : Term-00, Session Start     = 2021-08-04 15:03:26
    15:08:26,664 [Thread-16] INFO   jTPCC : Term-00, Session End       = 2021-08-04 15:08:26
    15:08:26,664 [Thread-16] INFO   jTPCC : Term-00, Transaction Count = 1179449
    15:08:26,664 [Thread-16] INFO   jTPCC : executeTime[Payment]=29893614
    15:08:26,664 [Thread-16] INFO   jTPCC : executeTime[Order-Status]=2564424
    15:08:26,664 [Thread-16] INFO   jTPCC : executeTime[Delivery]=4438389
    15:08:26,664 [Thread-16] INFO   jTPCC : executeTime[Stock-Level]=4259325
    15:08:26,664 [Thread-16] INFO   jTPCC : executeTime[New-Order]=48509926

    Adjust props.mogdb, or use several props files, and repeat the test as many times as needed.
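
    As a rough sanity check on the figures above: tpmTOTAL ≈ Transaction Count / runMins = 1179449 / 5 ≈ 235890, close to the measured 235800.39, and with newOrderWeight=45 the expected tpmC is about 0.45 × 235800 ≈ 106110, consistent with the measured 106140.46 (small differences come from timing edges and the realized transaction mix).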

  13. To keep repeated runs from accumulating too much data and hurting performance, you can clear the data and start over.

    [root@node151 run]# sh runDatabaseDestroy.sh props.mogdb
    # ------------------------------------------------------------
    # Loading SQL file ./sql.common/tableDrops.sql
    # ------------------------------------------------------------
    drop table bmsql_config;
    drop table bmsql_new_order;
    drop table bmsql_order_line;
    drop table bmsql_oorder;
    drop table bmsql_history;
    drop table bmsql_customer;
    drop table bmsql_stock;
    drop table bmsql_item;
    drop table bmsql_district;
    drop table bmsql_warehouse;
    drop sequence bmsql_hist_id_seq;

    During debugging you can Build once and Run many times, but for a formal test it is recommended to do Build / Run / Destroy every time.


Tuning

1. Host Optimization (Kunpeng only)

Adjust the BIOS:

  • BIOS > Advanced > MISC Config: set Support Smmu to Disabled
  • BIOS > Advanced > MISC Config: set CPU Prefetching Configuration to Disabled
  • BIOS > Advanced > Memory Config: set Die Interleaving to Disable

2. Operating System Optimization (Kunpeng only)

  • Set the kernel PAGESIZE to 64 KB (usually already the default)

  • Stop irqbalance

    systemctl stop irqbalance
  • Disable NUMA balancing

    echo 0 > /proc/sys/kernel/numa_balancing
  • Disable transparent huge pages

    echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
    echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag
  • Set the I/O queue scheduler of the NVMe disks to none

    echo none > /sys/block/nvme*n*/queue/scheduler
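
A minimal sketch for verifying that these OS-level settings are in effect (device names follow the examples in this guide):

    getconf PAGESIZE                                    # expect 65536 on a 64 KB-page kernel
    systemctl is-active irqbalance                      # expect "inactive"
    cat /proc/sys/kernel/numa_balancing                 # expect 0
    cat /sys/kernel/mm/transparent_hugepage/enabled     # expect [never]
    cat /sys/block/nvme0n1/queue/scheduler              # expect [none]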

3. File System Configuration

  • Format as xfs with an 8 KB block size, matching the database block size

    mkfs.xfs -b size=8192 /dev/nvme0n1 -f
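
    After formatting, the device is typically mounted at the database data or tablespace directory before installation; the mount point below is only an example:

    mount /dev/nvme0n1 /opt/data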

4. Network Configuration

  • Configure multiple NIC interrupt queues

    Download IN500_solution_5.1.0.SPC401.zip and install hinicadm.

    [root@node151 fc]# pwd
    /root/IN500_solution_5/tools/linux_arm/fc
    [root@node151 fc]# rpm -ivh hifcadm-2.4.1.0-1.aarch64.rpm
    Verifying...                          ################################# [100%]
    Preparing...                          ################################# [100%]
            package hifcadm-2.4.1.0-1.aarch64 is already installed
    [root@node151 fc]#
  • Change the maximum number of interrupt queues supported by the system

    [root@node151 config]# pwd
    /root/IN500_solution_5/tools/linux_arm/nic/config
    [root@node151 config]# ./hinicconfig hinic0 -f std_sh_4x25ge_dpdk_cfg_template0.ini
    [root@node151 config]# reboot
    [root@node151 config]# ethtool -L enp3s0 combined 48

    The optimal values may differ across platforms and workloads. On the current 128-core platform, the tuned value is 12 on the server side and 48 on the client side.

  • Interrupt tuning: enable the tso, lro, gro and gso offload features.

    ethtool -K enp3s0 tso on
    ethtool -K enp3s0 lro on
    ethtool -K enp3s0 gro on
    ethtool -K enp3s0 gso on
  • Confirm and update the NIC firmware

    [root@node151 ~]# ethtool -i enp3s0
    driver: hinic
    version: 2.3.2.11
    firmware-version: 2.4.1.0
    expansion-rom-version:
    bus-info: 0000:03:00.0
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: no
    supports-register-dump: no
    supports-priv-flags: no

    The NIC firmware version should be 2.4.1.0.

  • Update the NIC firmware.

    [root@node151 cfg_data_nic_prd_1h_4x25G]# pwd
    /root/IN500_solution_5/firmware/update_bin/cfg_data_nic_prd_1h_4x25G
    [root@node151 cfg_data_nic_prd_1h_4x25G]# hinicadm updatefw -i enp3s0 -f /root/IN500_solution_5/firmware/update_bin/cfg_data_nic_prd_1h_4x25G/Hi1822_nic_prd_1h_4x25G.bin

    Reboot the server and confirm that the NIC firmware has been updated to 2.4.1.0.
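
The queue count and offload settings can be double-checked with ethtool (the interface name enp3s0 follows the examples above):

    ethtool -l enp3s0    # pre-set and current combined queue counts
    ethtool -k enp3s0 | grep -E 'segmentation-offload|large-receive-offload|generic-receive-offload'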

5. CPU Binding for the Database Server and Client

  • NUMA binding optimization on Kunpeng (128 cores)

    • Install numactl on both the database host and the client host

      yum install numa* -y

      Install the NIC driver on both the database host and the client host; see the Network Configuration section above.

    • Database host

      cp `find /opt -name "bind*.sh"|head -1 ` /root
      sh /root/bind_net_irq.sh 12

      Set the database parameters:

      thread_pool_attr = '345,4,(cpubind:1-28,32-60,64-92,96-124)'
      enable_thread_pool = on

      Stop the database and replace the startup command with:

      numactl -C 1-28,32-60,64-92,96-124 mogdb --single_node -D /opt/data/db2/ -p 26000 &
    • Client: copy /root/bind_net_irq.sh to the client host

      sh /root/bind_net_irq.sh 48

      Change the benchmark start command to:

      numactl -C 0-19,32-51,64-83,96-115 sh runBenchmark.sh props.mogdb
  • NUMA binding optimization on Kunpeng (256 cores)

    • Install numactl on both the database host and the client host

      yum install numa* -y

      Install the NIC driver on both the database host and the client host; see the Network Configuration section above.

    • Database host

      cp `find /opt -name "bind*.sh"|head -1 ` /root
      sh /root/bind_net_irq.sh 24

      Set the database parameters:

      thread_pool_attr = '696,4,(cpubind:1-28,32-60,64-92,96-124,128-156,160-188,192-220,224-252)'
      enable_thread_pool = on

      Stop the database and replace the startup command with:

      numactl -C 1-28,32-60,64-92,96-124,128-156,160-188,192-220,224-252 mogdb --single_node -D /opt/data/db2/ -p 26000 &
    • Client: copy /root/bind_net_irq.sh to the client host

      sh /root/bind_net_irq.sh 48

      Change the benchmark start command to:

      numactl -C 0-19,32-51,64-83,96-115 sh runBenchmark.sh props.mogdb
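
The thread_pool_attr value above follows the 'thread count, group count, (cpubind: core list)' layout, so the cpubind ranges should stay consistent with the cores passed to numactl. Whether the binding took effect can be checked on the database host; the pgrep pattern below is illustrative and assumes the single-node startup command shown above:

    numactl --hardware                                # NUMA node and CPU layout of the host
    taskset -cp $(pgrep -f 'mogdb --single_node')     # CPU affinity of the running server process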

6. Database Parameter Optimization (general)

Modify postgresql.conf under PGDATA and restart the database.

max_connections = 4096
allow_concurrent_tuple_update = true
audit_enabled = off
cstore_buffers =16MB
enable_alarm = off
enable_codegen = false
enable_data_replicate = off
full_page_writes = off
max_files_per_process = 100000
max_prepared_transactions = 2048
shared_buffers = 350GB
use_workload_manager = off
wal_buffers = 1GB
work_mem = 1MB
transaction_isolation = 'read committed'
default_transaction_isolation = 'read committed'
synchronous_commit = on
fsync = on
maintenance_work_mem = 2GB
vacuum_cost_limit = 2000
autovacuum = on
autovacuum_mode = vacuum
autovacuum_vacuum_cost_delay =10
xloginsert_locks = 48
update_lockwait_timeout =20min
enable_mergejoin = off
enable_nestloop = off
enable_hashjoin = off
enable_bitmapscan = on
enable_material = off
wal_log_hints = off
log_duration = off
checkpoint_timeout = 15min
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_scale_factor = 0.02
enable_save_datachanged_timestamp =FALSE
log_timezone = 'PRC'
timezone = 'PRC'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
enable_double_write = on
enable_incremental_checkpoint = on
enable_opfusion = on
numa_distribute_mode = 'all'
track_activities = off
enable_instr_track_wait = off
enable_instr_rt_percentile = off
track_counts =on
track_sql_count = off
enable_instr_cpu_timer = off
plog_merge_age = 0
session_timeout = 0
enable_instance_metric_persistent = off
enable_logical_io_statistics = off
enable_user_metric_persistent =off
enable_xlog_prune = off
enable_resource_track = off
instr_unique_sql_count = 0
enable_beta_opfusion = on
enable_thread_pool = on
enable_partition_opfusion=off
# core 0 is reserved for binding the walwriter thread
wal_writer_cpu=0
xlog_idle_flushes_before_sleep = 500000000
max_io_capacity = 2GB
dirty_page_percent_max = 0.1
candidate_buf_percent_target = 0.7
bgwriter_delay = 500
pagewriter_sleep = 30
checkpoint_segments =10240
advance_xlog_file_num = 100
autovacuum_max_workers = 20
autovacuum_naptime = 5s
bgwriter_flush_after = 256kB
data_replicate_buffer_size = 16MB
enable_stmt_track = off
remote_read_mode=non_authentication
wal_level = archive
hot_standby = off
hot_standby_feedback = off
client_min_messages = ERROR
log_min_messages = FATAL
enable_asp = off
enable_bbox_dump = off
enable_ffic_log = off
enable_twophase_commit = off
minimum_pool_size = 200
wal_keep_segments = 1025
incremental_checkpoint_timeout = 5min
max_process_memory = 12GB
vacuum_cost_limit = 10000
xloginsert_locks = 8
wal_writer_delay = 100
wal_file_init_num = 30
wal_level=minimal
max_wal_senders=0
fsync=off
synchronous_commit = off
enable_indexonlyscan=on
thread_pool_attr = '345,4,(cpubind:1-28,32-60,64-92,96-124)'
enable_page_lsn_check = off
enable_double_write = off
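
Note that several parameters appear more than once in this list (for example fsync, synchronous_commit, wal_level, xloginsert_locks, vacuum_cost_limit and enable_double_write). As in PostgreSQL, the last occurrence in postgresql.conf normally takes effect, so trim the list to the values you actually intend, and size memory-related settings such as shared_buffers and max_process_memory to the actual host.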

7. BenchmarkSQL Tuning

  • Connection string

    conn=jdbc:postgresql://10.10.10.40:26000/tpcc?prepareThreshold=1&batchMode=on&fetchsize=10&loggerLevel=off
  • Modify the file contents to spread the data: adjust FILLFACTOR and distribute the data across tablespaces (data partitioning).

    [root@node151 ~]# ls benchmarksql-5.0-mysql-support-opt-2.1/run/sql.common/tableCreates.sql
    benchmarksql-5.0-mysql-support-opt-2.1/run/sql.common/tableCreates.sql
    [root@node151 sql.common]# cat tableCreates.sql
    CREATE TABLESPACE example2 relative location 'tablespace2';
    CREATE TABLESPACE example3 relative location 'tablespace3';
    
    create table bmsql_config (
      cfg_name    varchar(30),
      cfg_value   varchar(50)
    );
    
    create table bmsql_warehouse (
      w_id        integer   not null,
      w_ytd       decimal(12,2),
      w_tax       decimal(4,4),
      w_name      varchar(10),
      w_street_1  varchar(20),
      w_street_2  varchar(20),
      w_city      varchar(20),
      w_state     char(2),
      w_zip       char(9)
    ) WITH (FILLFACTOR=80);
    
    create table bmsql_district (
      d_w_id       integer       not null,
      d_id         integer       not null,
      d_ytd        decimal(12,2),
      d_tax        decimal(4,4),
      d_next_o_id  integer,
      d_name       varchar(10),
      d_street_1   varchar(20),
      d_street_2   varchar(20),
      d_city       varchar(20),
      d_state      char(2),
      d_zip        char(9)
     ) WITH (FILLFACTOR=80);
    
    create table bmsql_customer (
      c_w_id         integer        not null,
      c_d_id         integer        not null,
      c_id           integer        not null,
      c_discount     decimal(4,4),
      c_credit       char(2),
      c_last         varchar(16),
      c_first        varchar(16),
      c_credit_lim   decimal(12,2),
      c_balance      decimal(12,2),
      c_ytd_payment  decimal(12,2),
      c_payment_cnt  integer,
      c_delivery_cnt integer,
      c_street_1     varchar(20),
      c_street_2     varchar(20),
      c_city         varchar(20),
      c_state        char(2),
      c_zip          char(9),
      c_phone        char(16),
      c_since        timestamp,
      c_middle       char(2),
      c_data         varchar(500)
    ) WITH (FILLFACTOR=80) tablespace example2;
    
    create sequence bmsql_hist_id_seq;
    
    create table bmsql_history (
      hist_id  integer,
      h_c_id   integer,
      h_c_d_id integer,
      h_c_w_id integer,
      h_d_id   integer,
      h_w_id   integer,
      h_date   timestamp,
      h_amount decimal(6,2),
      h_data   varchar(24)
    ) WITH (FILLFACTOR=80);
    
    create table bmsql_new_order (
      no_w_id  integer   not null,
      no_d_id  integer   not null,
      no_o_id  integer   not null
    ) WITH (FILLFACTOR=80);
    
    create table bmsql_oorder (
      o_w_id       integer      not null,
      o_d_id       integer      not null,
      o_id         integer      not null,
      o_c_id       integer,
      o_carrier_id integer,
      o_ol_cnt     integer,
      o_all_local  integer,
      o_entry_d    timestamp
    ) WITH (FILLFACTOR=80);
    
    create table bmsql_order_line (
      ol_w_id         integer   not null,
      ol_d_id         integer   not null,
      ol_o_id         integer   not null,
      ol_number       integer   not null,
      ol_i_id         integer   not null,
      ol_delivery_d   timestamp,
      ol_amount       decimal(6,2),
      ol_supply_w_id  integer,
      ol_quantity     integer,
      ol_dist_info    char(24)
    ) WITH (FILLFACTOR=80);
    
    create table bmsql_item (
      i_id     integer      not null,
      i_name   varchar(24),
      i_price  decimal(5,2),
      i_data   varchar(50),
      i_im_id  integer
    );
    
    create table bmsql_stock (
      s_w_id       integer       not null,
      s_i_id       integer       not null,
      s_quantity   integer,
      s_ytd        integer,
      s_order_cnt  integer,
      s_remote_cnt integer,
      s_data       varchar(50),
      s_dist_01    char(24),
      s_dist_02    char(24),
      s_dist_03    char(24),
      s_dist_04    char(24),
      s_dist_05    char(24),
      s_dist_06    char(24),
      s_dist_07    char(24),
      s_dist_08    char(24),
      s_dist_09    char(24),
      s_dist_10    char(24)
    ) WITH (FILLFACTOR=80) tablespace example3;

8. Database File Location Optimization (general)

Avoid I/O bottlenecks by placing the default data directory, xlog, example2 and example3 on separate underlying disks. If only two fast disks are available, move xlog first; with three, move xlog and example2. An example of splitting them:

PGDATA=/opt/data/mogdb
cd $PGDATA
mv pg_xlog /tpccdir1
ln -s /tpccdir1/pg_xlog .
cd pg_location
mv tablespace2 /tpccdir2
ln -s /tpccdir2/tablespace2 .
mv tablespace3 /tpccdir3
ln -s /tpccdir3/tablespace3 .
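
These moves assume the instance is stopped and that the target directories are writable by the database user (omm in the examples above); a typical wrapper, sketched with the standard control tool, would be:

    gs_ctl stop -D $PGDATA      # stop the instance before relocating files
    # ... perform the mv / ln -s steps above ...
    gs_ctl start -D $PGDATA     # restart and confirm the instance comes up cleanly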

9. System Resource Monitoring Tools

  • htop to watch CPU usage; on the ARM platform it must be built from source.

    Use htop to monitor CPU utilization on both the database server and the TPCC client. In a best-performance run, every busy core stays above 90% utilization. If some cores fall short, the core binding may be wrong or there is another problem; locate the root cause and adjust.

  • iostat to check system I/O usage.

  • sar to check network usage.

  • nmon for overall system resource monitoring.
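
Typical invocations during a run (the sampling intervals are only examples):

    htop                 # per-core CPU utilization
    iostat -xm 2         # extended per-device I/O statistics every 2 seconds
    sar -n DEV 2         # per-interface network throughput every 2 seconds
    nmon                 # interactive overall resource monitor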

Sample Data Screenshots

  • Database server htop (htop.png)
  • Client htop (客户端_htop.png)
  • iostat system I/O usage (iostat.png)
  • sar network usage (sar.png)

Expected Results

4-socket Kunpeng, 256 cores, 1000 warehouses, 500 terminals: 2.5 million tpmC
2-socket Kunpeng servers, 100 warehouses, 100 terminals: 0.9 million tpmC
2-socket Kunpeng servers, 100 warehouses, 300 terminals: 1.4 million tpmC
