SCA Command Options

SCA is a command line tool. All functions can be realized through single executable program.

General Options

General options may be used in all types of tasks.

--help (Help Information)

Before using the analysis program, you are advised to read help information carefully. (Command: ./sca_linux_x86_64 --help):

Introduction:
    SCA is a tool used to do SQL life-cycle inspection and simulation
    when migration from heterogeneous database to MogDB.
    Source databases supported as below:
        1. Oracle     : SQL Inspection, Simulation
        2. DB2        : SQL Inspection
        3. MySQL      : SQL Inspection
        4. PostgreSQL : SQL Inspection
        5. Informix   : SQL Inspection
        5. SQL Server : SQL Inspection

Options:
                          --[ Overall ]--
        --help            : Show help message
    -v, --version         : Show SCA version
    -T, --type            : Run type:
                          :        I = Init SCA repository
                          :        L = Apply for license
                          :        D = Delete repository data
                          :     ----------------[Analysis: Target required]---------------
                          :       OI = Oracle Inspection, this is default type
                          :       MI = MySQL Inspection
                          :       DI = DB2 Inspection
                          :       PI = PostgreSQL Inspection
                          :       II = Informix Inspection
                          :       SI = SQL Server Inspection
                          :       OS = Oracle Simulation, just like Oracle SPA
                          :      OIS = Oracle Inspection & Simulation
                          :     ------------[Collection: No need of target DB]------------
                          :       OC = Oracle Collection
                          :       MC = MySQL Collection
                          :       DC = DB2 Collection
                          :       PC = PostgreSQL Collection
                          :       IC = Informix Collection
                          :       SC = SQL Server Collection
    -d, --data            : Unzipped data directory for analyzer, or directory for collection
    -D, --data-id         : Use data with data id existed in the repository
    -w, --workers         : Parallel workers for tasks, default: 10
    -x, --debug           : Enable debug mode
    -l, --logfile         : Output to both logfile (without progress-bar) and screen (with progress-bar)
    -L, --log-only        : Output to only logfile (without progress-bar)
    -F, --force           : Force mode in REPO Creation, drop old objects before create it
    -r, --report          : Final report file location, default in data directory with name 'report'
        --report-lang     : Report language, you can use this option to create multiple report
        --license         : License file, default is [./license.json]
        --sql-modified    : Modified SQL list used in simulation
        --sql-config      : SQL configuration file (for internal use)
        --sql-transformer : Regular rules for SQL transformation (for internal use)
                          : Format: [{"name": "xxx"
                          :           "source": "xxx",
                          :           "target": "xxx",
                          :           "comment": "xxx"}, ...]
        --steps           : Run given steps (with step id or name) in analysis tasks
                          : Valid in type:
                          :     OI/MI/DI/PI/OS/OIS
                          : Step valid in order:
                          :     1.load            : Load data from file to repository database
                          :     2.make_object     : Create source objects in target database
                          :     3.inspect         : Do SQL compatible analysis
                          :     4.rollback_object : Rollback source objects created by make_object
                          :     5.simulate        : Do SQL performance analysis
                          :     6.html_report     : Genarate summary report in HTML format
                          :     7.excel_report    : Genarate detail report in Excel format
                          : Option value for example:
                          :     --steps 'load'      : Just load data to repository database
                          :     --steps 'inspect-'  : Run steps after 'inspect' (Step 3-7)
                          :     --steps '-5'        : Run steps before 'simulate' (Step 1-5)
                          :     --steps '2-5'       : Run steps from 'make_object' to 'simulate' (Step 2-5)
        --upgrade         : Upgrade current binary SCA command
        --passwd_encrypt  : DB password is encrypted

                          --[ Repository Connection ]--
    -H, --repo-host       : Repository DB Server host address, default: 127.0.0.1
    -P, --repo-port       : Repository DB server port, default: 5432
    -N, --repo-name       : Repository database, default: sca_db
    -U, --repo-user       : Repository user, default: sca_repo
    -E, --repo-password   : Repository password, default: SCA@password
        --user            : Administrator used to create repository DB and user, default: mogdb
        --password        : Password for Administrator, default: mogdb

                          --[ Source & Target Connection ]--
    -h, --db-host         : Source & Target DB Server host address, default same as -H
    -p, --db-port         : Source & Target DB server port, default same as -P
    -n, --db-name         : Source & Target database, default same as -N
    -u, --db-user         : Source & Target user, default same as -U
    -e, --db-password     : Source & Target password, default same as -E
        --target-type     : Target database type in analysis tasks, default: MOGDB
                          : Valid type: ORACLE, MOGDB, OPENGAUSS, POSTGRESQL, MYSQL, DB2

                          --[ Collection Options ]--
    -q, --sql-days        : How many days for session sql data, default: 7
    -Q, --sql-interval    : SQL collect interval in seconds, default: 600
    -s, --schema-include  : Users/Schemas included in data collection, default: ''
    -S, --schema-exclude  : Users/Schemas excluded in data collection
                          : Default: <<depends on DB type>>
    -m, --enable-monitor  : Starting background monitor process in SQL Collection
                          : Valid values: 1/on/true/t = ENABLE, default: on
                          :               0/off/false/f = DISABLE
        --slow-log        : MySQL slow-log for client data collection
        --ignore-bind-plan: Ingore SQL binds and plans in Oracle data collection
        --callback        : Using callback to get PID/progress
        --disable-rowid   : Disable rowid rules and type convertor in SQL inspector
        --sql-csv         : SQL file in csv format for SQL Inspection (@todo)

Usage:
    0. Apply for license
       ./sca_linux_x86_64 -T L
    1. Init repository (used for first running)
       ./sca_linux_x86_64 -T i -H <host> -P <port> -N <repo-database> -U <repo-user> -E <repo-password> --user <super_user> --password <super_password>
    2. Oracle data collection
       # Notice: "-q 0.001 -Q 60" means gather Session SQL only once
       #         "-m off" means do not monitor system status (CPU Idle and Disk Free)
       ./sca_linux_x86_64 -T OC -s SCOTT -h <host> -p <port> -n '<target-db>' -u <oracle-user> -e <oracle-password> -q 0.001 -Q 60 -m off
       ./sca_linux_x86_64 -T OC -s SCOTT -h <host> -p <port> -n '<target-db>' -u <oracle-user> -e <oracle-password>
    3. MySQL data collection using slow-log file
       ./sca_linux_x86_64 -T MC -d <report-directory> --slow-log=<slow-log-file>
    4. Oracle SQL compatible analysis (Required: Repository, Target DB)
       Note: use [H/P/N/U/E] options to assign the repository
             use [h/p/n/u/e] options to assign the target database
       ./sca_linux_x86_64 -T OI -d <unzipped data directory> -n <target_db>
    5. Oracle SQL performance simulation (Required: Repository, Target DB)
       Note: use [H/P/N/U/E] options to assign the repository
             use [h/p/n/u/e] options to assign the target database
       ./sca_linux_x86_64 -T OS -d <unzipped data directory> -n <target_db>
    6. MySQL SQL compatible analysis (Required: Repository, Target DB)
       Not e: use [H/P/N/U/E] options to assign the repository
             use [h/p/n/u/e] options to assign the target database
       ./sca_linux_x86_64 -T MI -d <unzipped data directory> -h <host> -p <port> -n <target-db>

-v, --version (Version)

View the current version of SCA:

hongyedba@localhost ~ % ./sca_linux_x86_64 -v
SCA version: 5.1.0

-T, --type (Task Type)

Default value: OI

Specifies the task type. Currently, SCA supports the following types of tasks:

Runtime tasks
1. I [Initialize]: Initialize, used to initialize the SCA repository
2. L [Apply License]: Apply license, task for license application
3. D [Deletion]: Deletion，used to delete some data_id related data in the database
Collection tasks
1. OC [Oracle Collection]: Oracle collection, used to collect the business SQL executed in Oracle database, need long time to collect
2. MC [MySQL Collection]: MySQL collection, used to collect the business SQL executed in MySQL database, need to configure the slow log in advance, and then collect at once
3. DC [DB2 Collection]: DB2 collection, used to collect the business SQL executed in DB2 database, need long time to collect
4. PC [PostgreSQL Collection]: PostgreSQL collection, used to collect the business SQL executed in PostgreSQL database (using plugin pg_stat_statements), need long time to collect
5. IC [Informix Collection]: Informix collection, used to collect the business SQL executed in Informix database (need enable global sql tracing manually), need long time to collect
6. SC [SQL Server Collection]: SQL Server collection, used to collect the business SQL executed in SQL Server database, need long time to collect
Analysis tasks
1. OI [Oracle Inspection]: Oracle compatibility evaluation, used to evaluate the actual compatibility of business SQL collected in Oracle on the source side in MogDB on the target side
2. MI [MySQL Inspection]: MySQL compatibility evaluation, used to evaluate the actual compatibility of business SQL collected in MySQL on the source side in MogDB on the target side
3. DI [DB2 Inspection]: DB2 compatibility evaluation, used to evaluate the actual compatibility of business SQL collected in DB2 on the source side in MogDB on the target side
4. PI [PostgreSQL Inspection]: PostgreSQL compatibility evaluation, used to evaluate the actual compatibility of business SQL collected in PostgreSQL on the source side in MogDB on the target side
5. II [Informix Inspection]: Informix compatibility evaluation, used to evaluate the actual compatibility of business SQL collected in Informix on the source side in MogDB on the target side
6. SI [SQL Server Inspection]: SQL Server compatibility evaluation, used to evaluate the actual compatibility of business SQL collected in SQL Server on the source side in MogDB on the target side
7. OS [Oracle Simulation]: Oracle performance evaluation, used to evaluate the execution performance of business SQL collected in Oracle on the source side in MogDB on the target side
8. OIS [Oracle Inspection & Simulation]: Oracle compatibility and performance evaluation, equivalent to OI + OS two tasks at the same time

-d, --data (Data Directory)

The -d data directory can be specified for all task types except repository initialization (-T I).

The collection task writes the collected data to the data directory specified by -d.

The analysis task reads data from the data directory, inserts the data into the repository table, and the final generated report results are also written to the data directory by default.

-D, --data-id (Data ID)

Specify the data ID and then read the data with the specified ID directly from the repository instead of re-reading and loading the data from the data directory.

Specifying the -D option skips the step of loading data and performs the relevant analysis task directly.

-w, --workers (Degree of Parallelism)

The default degree of parallelism is 10.

Specifies the degree of parallelism for running tasks. Appropriate degree of parallelism can improve the running speed of each task.

Applicable to: file data loaded to a data repository, SQL compatibility assessment, SQL complexity assessment, SQL performance simulation, and other operations

Note:

In SQL simulation tasks, a greater degree of parallelism may lead to degradation of the execution efficiency of single SQL. Therefore, an appropriate degree of parallelism needs to be chosen according to the loading of the actual production environment and is generally set to the average number of active sessions of a production database.

-x, --debug (Debug Mode)

Enabling the debug mode does not affect normal analysis logic but will output a large number of logs. The debug mode is usually used for assisting the analysis program in running exceptions itself.

-l, --logfile (Log File)

Specifies the log file of a program. The program will output data to both the terminal command line and log file.

-L, --log-only (Log File)

Specifies the log file of a program. The program will output data only to the log file.

-F, --force (Forcible Mode)

Specifies whether to enable the forcible mode. This mode takes effect in the data repository initialization scenarios.

During the initialization of a data repository, if the forcible mode is enabled, the data repository will be deleted first and then re-created.

-r, --report (Report Directory)

Default value: <DATA_DIR>/report

Specifies the report directory. The report here includes the compatibility evaluation and SQL simulation reports in the HTML format, which can be viewed offline.

Also, for the compatibility evaluation task, a sql_detail_list.csv file is generated to record the evaluation results of all SQL and possible rewriting options.

--report-lang (Report Language)

Default value: zh_CN

Specifies the report language. SCA was released with two built-in languages: 'zh_CN' (Chinese) and 'en_US' (English).

Customized language files are also supported, and generate reports with different languages are alse supported.

If you want to use a custom language, then please refer to the following steps

Download the sample localization language file (each version of the localization language file may be slightly different, so be sure to use the language file for the same version of the program)
Customize the item values of each variable in the language file
Upload the edited language file to the same directory of the sca program, change its name to 'Localization_.toml', where is the custom language name, case independent
Run the sca analysis task, specifying the command option --report-lang=''

--license (License File)

Default value: ./license.json

The location of the License file.

--sql-modified (custom SQL rewriting)

In SQL performance simulation, a list of unsupported SQL file (stored in a custom report directory) is generated each time the performance simulation is run, named: Simulator_modified_list.csv.

This option is used to manually modify some of the unsupported SQL in the SQL performance simulation task to achieve more SQL support and compare as many SQL performance changes as possible.

The process of custom SQL modification is as follows.

Complete the SQL performance simulation task (first performance simulation) using the standard process
Open the Simulator_modified_list.csv file in the report directory and manually modify the syntax of the SQL statements in this file so that they can be executed correctly in the target database
Add the -D <data-id> --sql-modified <repor-dir>/Simulator_modified_list.csv option to the original options to resimulate the performance of the failed SQL
Continue to check step 2 and step 3 until the final performance simulated SQL ranges as expected and retrieve the final analysis report

--sql-config (自定义SQL)

[Advanced Usage] Specify SQL command used in each steps, for internal emergency troubleshooting use only。

--sql-transformer (SQL transform rules)

[Advanced Usage] Specify the SQL transform rules from the source database to the target database, and use regular matching for transform.

--steps

Run the specified steps during the analysis, either with the step name or number.

This option is available for the following task types: OI, MI, DI, PI, OS, OIS.

Valid step numbers and names are described below:

load : Load data from file to repository database
make_object : Create source objects in target database
inspect : Do SQL compatible analysis
rollback_object : Rollback source objects created by make_object
simulate : Do SQL performance analysis
html_report : Genarate summary report in HTML format
excel_report : Genarate detail report in Excel format

Valid usage examples as follows:

--steps 'load' : Just load data to repository database
--steps 'inspect-' : Run steps after 'inspect' (Step 3-7)
--steps '-5' : Run steps before 'simulate' (Step 1-5)
--steps '2-5' : Run steps from 'make_object' to 'simulate' (Step 2-5)

--upgrade

Automatic upgrade current SCA program when the environment is online.

passwd_encrypt

Adapts to MDB and supports decrypting encrypted passwords in MDB.

Repository Options

-H, --repo-host (Repository IP Addresses)

Default value: 127.0.0.1

Repository database IP Addresses.

-P, --repo-port (Repository Port)

Default value: 5432

Repository database port.

-N, --repo-name (Repository Name)

Default value: sca_db

Repository database name.

-U, --repo-user (Repository User)

Default value: sca_repo

Repository database login user.

-E, --repo-password (Repository Password)

Default value: SCA@password

Repository database login password.

-U, --repo-schema (Repository Schema)

Default value: sca_repo

Repository database schema。

--user (Administrator User)

Repository database administrator user, used to create repository user and repository database during the initialization of the repository.

--password (Administrator Password)

Repository administrator user login password, used to create repository user and repository database during the initialization of the repository.

Source or Target Database Connection

-h, --db-host (Source or Target Database IP Addresses)

Source or target database IP Addresses, inherits the -H option value by default.

-p, --db-port (Source or Target Database Port)

Source or target database port, inherits the -P option value by default.

-n, --db-name (Source or Target Database Name)

Source or target database Name, inherits the -N option value by default.

In the SQL Analysis task, specify the target database name, which is usually the target database created on the MogDB side when using MTK or other data migration tools for database structure and data migration.

Note that the repository user needs to have full operational privileges to the target database by default, and the default repository is administrator privileges.

In the compatibility assessment, only the target database needs to have an object structure
In SQL simulation, the target database needs to have both object structure and real peer-to-peer full production data, otherwise the performance simulation results are not informative

-u, --db-user (Source or Target Database User)

Source or target database login user, inherits the -U option value by default.

-e, --db-password (Source or Target Database Password)

Source or target database login password, inherits the -E option value by default.

--target-type (Target Database Type)

Specify the target database type in the analysis task, the default is MOGDB.

Currently this parameter is not perfect, only MOGDB/POSTGRESQL is supported.

Data Collection Options

-q, --sql-days (Number of Days for SQL Collection)

Specifies the total number of days for SQL-related data collection. Data miss may occur in collecting the executed SQL data from session cache GV$SQLAREA. You can prolong the number of collection days to reduce the data collection miss possibility.

By default, SQL data of a week will be collected.

-Q, --sql-interval (SQL Collection Interval)

Specifies the SQL-related data collection interval. By default, the SQL data is collected every 10 minutes.

The SQL data collected every time will be compared with the collected data to filter out the repeated data. This can prevent the data file from being too large.

-s, schema-include (Schema Whitelist)

Specifies the schema whitelist for data collection. Only schema-related data listed in the whitelist is collected.

-S, schema-exclude (Schema Blacklist)

Specifies the schema blacklist for data collection. Schema-related data listed in the blacklist is not collected.

By default, Oracle system users are listed in the schema blacklist.

-m, enable-monitor (Resource Monitoring)

Specifies whether to enable the resource monitoring process in the background. This process is enabled by default.

The resource monitoring process queries the CPU usage of the current server every 3s and the remaining space of the file system where the data directory is located.

When the CPU usage is greater than 90% or the remaining space of the file system is lower than 100 MB, the monitoring sub-process will stop collecting data from the primary process so that the server will not become faulty due to resource problems.

--slow-log (Slow Query Log)

Specifies the MySQL slow query log file.

When the collection program cannot access the target MySQL database and the collection program is not running on the MySQL server, you can manually retrieve the MySQL slow log from the target database and use the current option to specify the slow log to parse and generate the corresponding collection data, which can be used for subsequent MySQL SQL compatibility evaluation tasks.

--ignore-bind-plan (Ignore SQL Binds and Plans)

During Oracle data collection, ignore SQL bind variables and execution plan collection. Ignoring collection can avoid Oracle dictionary query bugs due to bind variables or execution plan parsing, and speed up data collection time, but may lead to accuracy degradation of data analysis later.

--callback (callback)

Using callback to get PID/progress and some error message.

Callback request method: POST Callback request message: { "progressBar": 80, "pid": 12345, "dataId": 2, "error": "error message occured" }

--disable-rowid (disable rowid convert)

Disable rowid rules and type convertor in SQL inspector. Avoid errors related to implicit conversion of rowid in SQL when rewriting rowid to ctid