- About MogDB
- MogDB Introduction
- Comparison Between MogDB and openGauss
- MogDB Release Notes
- High Availability and Performance
- Open Source Components
- Usage Limitations
- Terms of Use
- Quick Start
- Installation Guide
- Container Installation
- Simplified Installation Process
- Standard Installation
- Manual Installation
- Administrator Guide
- Routine Maintenance
- Starting and Stopping MogDB
- Using the gsql Client for Connection
- Routine Maintenance
- Checking OS Parameters
- Checking MogDB Health Status
- Checking Database Performance
- Checking and Deleting Logs
- Checking Time Consistency
- Checking The Number of Application Connections
- Routinely Maintaining Tables
- Routinely Recreating an Index
- Data Security Maintenance Suggestions
- Log Reference
- Primary and Standby Management
- MOT Engine
- Introducing MOT
- Using MOT
- Concepts of MOT
- Appendix
- Column-store Tables Management
- Backup and Restoration
- Importing and Exporting Data
- Importing Data
- Exporting Data
- Upgrade Guide
- Common Fault Locating Cases
- Core Fault Locating
- When the TPC-C is running and a disk to be injected is full, the TPC-C stops responding
- Standby Node in the Need Repair (WAL) State
- Insufficient Memory
- Service Startup Failure
- "Error:No space left on device" Is Displayed
- After You Run the du Command to Query Data File Size In the XFS File System, the Query Result Is Greater than the Actual File Size
- File Is Damaged in the XFS File System
- Primary Node Is Hung in Demoting During a Switchover
- Disk Space Usage Reaches the Threshold and the Database Becomes Read-only
- Slow Response to a Query Statement
- Analyzing the Status of a Query Statement
- Forcibly Terminating a Session
- Analyzing Whether a Query Statement Is Blocked
- Low Query Efficiency
- "Lock wait timeout" Is Displayed When a User Executes an SQL Statement
- Table Size Does not Change After VACUUM FULL Is Executed on the Table
- An Error Is Reported When the Table Partition Is Modified
- Different Data Is Displayed for the Same Table Queried By Multiple Users
- When a User Specifies Only an Index Name to Modify the Index, A Message Indicating That the Index Does Not Exist Is Displayed
- Reindexing Fails
- An Error Occurs During Integer Conversion
- "too many clients already" Is Reported or Threads Failed To Be Created in High Concurrency Scenarios
- B-tree Index Faults
- Routine Maintenance
- Security Guide
- Database Security Management
- Performance Tuning
- System Optimization
- SQL Optimization
- WDR Snapshot Schema
- TPCC Performance Tuning Guide
- Developer Guide
- Application Development Guide
- Development Specifications
- Development Based on JDBC
- Overview
- JDBC Package, Driver Class, and Environment Class
- Development Process
- Loading the Driver
- Connecting to a Database
- Connecting to the Database (Using SSL)
- Running SQL Statements
- Processing Data in a Result Set
- Closing a Connection
- Example: Common Operations
- Example: Retrying SQL Queries for Applications
- Example: Importing and Exporting Data Through Local Files
- Example 2: Migrating Data from a MY Database to MogDB
- Example: Logic Replication Code
- JDBC API Reference
- java.sql.Connection
- java.sql.CallableStatement
- java.sql.DatabaseMetaData
- java.sql.Driver
- java.sql.PreparedStatement
- java.sql.ResultSet
- java.sql.ResultSetMetaData
- java.sql.Statement
- javax.sql.ConnectionPoolDataSource
- javax.sql.DataSource
- javax.sql.PooledConnection
- javax.naming.Context
- javax.naming.spi.InitialContextFactory
- CopyManager
- Development Based on ODBC
- Development Based on libpq
- Development Based on libpq
- libpq API Reference
- Database Connection Control Functions
- Database Statement Execution Functions
- Functions for Asynchronous Command Processing
- Functions for Canceling Queries in Progress
- Example
- Connection Characters
- Commissioning
- Appendices
- Stored Procedure
- User Defined Functions
- Autonomous Transaction
- Logical Replication
- Logical Decoding
- Foreign Data Wrapper
- Materialized View
- Materialized View Overview
- Full Materialized View
- Incremental Materialized View
- AI Features
- Overview
- Predictor: AI Query Time Forecasting
- X-Tuner: Parameter Optimization and Diagnosis
- SQLdiag: Slow SQL Discovery
- A-Detection: Status Monitoring
- Index-advisor: Index Recommendation
- DeepSQL
- Application Development Guide
- Reference Guide
- System Catalogs and System Views
- Overview of System Catalogs and System Views
- System Catalogs
- GS_AUDITING_POLICY
- GS_AUDITING_POLICY_ACCESS
- GS_AUDITING_POLICY_FILTERS
- GS_AUDITING_POLICY_PRIVILEGES
- GS_CLIENT_GLOBAL_KEYS
- GS_CLIENT_GLOBAL_KEYS_ARGS
- GS_COLUMN_KEYS
- GS_COLUMN_KEYS_ARGS
- GS_ENCRYPTED_COLUMNS
- GS_MASKING_POLICY
- GS_MASKING_POLICY_ACTIONS
- GS_MASKING_POLICY_FILTERS
- GS_MATVIEW
- GS_MATVIEW_DEPENDENCY
- GS_OPT_MODEL
- GS_POLICY_LABEL
- GS_WLM_INSTANCE_HISTORY
- GS_WLM_OPERATOR_INFO
- GS_WLM_PLAN_ENCODING_TABLE
- GS_WLM_PLAN_OPERATOR_INFO
- GS_WLM_USER_RESOURCE_HISTORY
- PG_AGGREGATE
- PG_AM
- PG_AMOP
- PG_AMPROC
- PG_APP_WORKLOADGROUP_MAPPING
- PG_ATTRDEF
- PG_ATTRIBUTE
- PG_AUTHID
- PG_AUTH_HISTORY
- PG_AUTH_MEMBERS
- PG_CAST
- PG_CLASS
- PG_COLLATION
- PG_CONSTRAINT
- PG_CONVERSION
- PG_DATABASE
- PG_DB_ROLE_SETTING
- PG_DEFAULT_ACL
- PG_DEPEND
- PG_DESCRIPTION
- PG_DIRECTORY
- PG_ENUM
- PG_EXTENSION
- PG_EXTENSION_DATA_SOURCE
- PG_FOREIGN_DATA_WRAPPER
- PG_FOREIGN_SERVER
- PG_FOREIGN_TABLE
- PG_INDEX
- PG_INHERITS
- PG_JOB
- PG_JOB_PROC
- PG_LANGUAGE
- PG_LARGEOBJECT
- PG_LARGEOBJECT_METADATA
- PG_NAMESPACE
- PG_OBJECT
- PG_OPCLASS
- PG_OPERATOR
- PG_OPFAMILY
- PG_PARTITION
- PG_PLTEMPLATE
- PG_PROC
- PG_RANGE
- PG_RESOURCE_POOL
- PG_REWRITE
- PG_RLSPOLICY
- PG_SECLABEL
- PG_SHDEPEND
- PG_SHDESCRIPTION
- PG_SHSECLABEL
- PG_STATISTIC
- PG_STATISTIC_EXT
- PG_SYNONYM
- PG_TABLESPACE
- PG_TRIGGER
- PG_TS_CONFIG
- PG_TS_CONFIG_MAP
- PG_TS_DICT
- PG_TS_PARSER
- PG_TS_TEMPLATE
- PG_TYPE
- PG_USER_MAPPING
- PG_USER_STATUS
- PG_WORKLOAD_GROUP
- PLAN_TABLE_DATA
- STATEMENT_HISTORY
- System Views
- GS_AUDITING
- GS_AUDITING_ACCESS
- GS_AUDITING_PRIVILEGE
- GS_CLUSTER_RESOURCE_INFO
- GS_INSTANCE_TIME
- GS_LABELS
- GS_MASKING
- GS_MATVIEWS
- GS_SESSION_MEMORY
- GS_SESSION_CPU_STATISTICS
- GS_SESSION_MEMORY_CONTEXT
- GS_SESSION_MEMORY_DETAIL
- GS_SESSION_MEMORY_STATISTICS
- GS_SQL_COUNT
- GS_WLM_CGROUP_INFO
- GS_WLM_PLAN_OPERATOR_HISTORY
- GS_WLM_REBUILD_USER_RESOURCE_POOL
- GS_WLM_RESOURCE_POOL
- GS_WLM_USER_INFO
- GS_STAT_SESSION_CU
- GS_TOTAL_MEMORY_DETAIL
- MPP_TABLES
- PG_AVAILABLE_EXTENSION_VERSIONS
- PG_AVAILABLE_EXTENSIONS
- PG_COMM_DELAY
- PG_COMM_RECV_STREAM
- PG_COMM_SEND_STREAM
- PG_COMM_STATUS
- PG_CONTROL_GROUP_CONFIG
- PG_CURSORS
- PG_EXT_STATS
- PG_GET_INVALID_BACKENDS
- PG_GET_SENDERS_CATCHUP_TIME
- PG_GROUP
- PG_GTT_RELSTATS
- PG_GTT_STATS
- PG_GTT_ATTACHED_PIDS
- PG_INDEXES
- PG_LOCKS
- PG_NODE_ENV
- PG_OS_THREADS
- PG_PREPARED_STATEMENTS
- PG_PREPARED_XACTS
- PG_REPLICATION_SLOTS
- PG_RLSPOLICIES
- PG_ROLES
- PG_RULES
- PG_SECLABELS
- PG_SETTINGS
- PG_SHADOW
- PG_STATS
- PG_STAT_ACTIVITY
- PG_STAT_ALL_INDEXES
- PG_STAT_ALL_TABLES
- PG_STAT_BAD_BLOCK
- PG_STAT_BGWRITER
- PG_STAT_DATABASE
- PG_STAT_DATABASE_CONFLICTS
- PG_STAT_USER_FUNCTIONS
- PG_STAT_USER_INDEXES
- PG_STAT_USER_TABLES
- PG_STAT_REPLICATION
- PG_STAT_SYS_INDEXES
- PG_STAT_SYS_TABLES
- PG_STAT_XACT_ALL_TABLES
- PG_STAT_XACT_SYS_TABLES
- PG_STAT_XACT_USER_FUNCTIONS
- PG_STAT_XACT_USER_TABLES
- PG_STATIO_ALL_INDEXES
- PG_STATIO_ALL_SEQUENCES
- PG_STATIO_ALL_TABLES
- PG_STATIO_SYS_INDEXES
- PG_STATIO_SYS_SEQUENCES
- PG_STATIO_SYS_TABLES
- PG_STATIO_USER_INDEXES
- PG_STATIO_USER_SEQUENCES
- PG_STATIO_USER_TABLES
- PG_TABLES
- PG_TDE_INFO
- PG_THREAD_WAIT_STATUS
- PG_TIMEZONE_ABBREVS
- PG_TIMEZONE_NAMES
- PG_TOTAL_MEMORY_DETAIL
- PG_TOTAL_USER_RESOURCE_INFO
- PG_TOTAL_USER_RESOURCE_INFO_OID
- PG_USER
- PG_USER_MAPPINGS
- PG_VARIABLE_INFO
- PG_VIEWS
- PLAN_TABLE
- GS_FILE_STAT
- GS_OS_RUN_INFO
- GS_REDO_STAT
- GS_SESSION_STAT
- GS_SESSION_TIME
- GS_THREAD_MEMORY_CONTEXT
- Functions and Operators
- Logical Operators
- Comparison Operators
- Character Processing Functions and Operators
- Binary String Functions and Operators
- Bit String Functions and Operators
- Mode Matching Operators
- Mathematical Functions and Operators
- Date and Time Processing Functions and Operators
- Type Conversion Functions
- Geometric Functions and Operators
- Network Address Functions and Operators
- Text Search Functions and Operators
- JSON Functions
- HLL Functions and Operators
- SEQUENCE Functions
- Array Functions and Operators
- Range Functions and Operators
- Aggregate Functions
- Window Functions
- Security Functions
- Encrypted Equality Functions
- Set Returning Functions
- Conditional Expression Functions
- System Information Functions
- System Administration Functions
- Statistics Information Functions
- Trigger Functions
- Global Temporary Table Functions
- AI Feature Functions
- Other System Functions
- Internal Functions
- Obsolete Functions
- Supported Data Types
- SQL Syntax
- ABORT
- ALTER AGGREGATE
- ALTER AUDIT POLICY
- ALTER DATABASE
- ALTER DATA SOURCE
- ALTER DEFAULT PRIVILEGES
- ALTER DIRECTORY
- ALTER EXTENSION
- ALTER FOREIGN TABLE
- ALTER FUNCTION
- ALTER GROUP
- ALTER INDEX
- ALTER LANGUAGE
- ALTER LARGE OBJECT
- ALTER MASKING POLICY
- ALTER MATERIALIZED VIEW
- ALTER OPERATOR
- ALTER RESOURCE LABEL
- ALTER ROLE
- ALTER ROW LEVEL SECURITY POLICY
- ALTER RULE
- ALTER SCHEMA
- ALTER SEQUENCE
- ALTER SERVER
- ALTER SESSION
- ALTER SYNONYM
- ALTER SYSTEM KILL SESSION
- ALTER SYSTEM SET
- ALTER TABLE
- ALTER TABLE PARTITION
- ALTER TABLESPACE
- ALTER TEXT SEARCH CONFIGURATION
- ALTER TEXT SEARCH DICTIONARY
- ALTER TRIGGER
- ALTER TYPE
- ALTER USER
- ALTER USER MAPPING
- ALTER VIEW
- ANALYZE | ANALYSE
- BEGIN
- CALL
- CHECKPOINT
- CLOSE
- CLUSTER
- COMMENT
- COMMIT | END
- COMMIT PREPARED
- COPY
- CREATE AGGREGATE
- CREATE AUDIT POLICY
- CREATE CAST
- CREATE CLIENT MASTER KEY
- CREATE COLUMN ENCRYPTION KEY
- CREATE DATABASE
- CREATE DATA SOURCE
- CREATE DIRECTORY
- CREATE EXTENSION
- CREATE FOREIGN TABLE
- CREATE FUNCTION
- CREATE GROUP
- CREATE INCREMENTAL MATERIALIZED VIEW
- CREATE INDEX
- CREATE LANGUAGE
- CREATE MASKING POLICY
- CREATE MATERIALIZED VIEW
- CREATE OPERATOR
- CREATE ROW LEVEL SECURITY POLICY
- CREATE PROCEDURE
- CREATE RESOURCE LABEL
- CREATE ROLE
- CREATE RULE
- CREATE SCHEMA
- CREATE SEQUENCE
- CREATE SERVER
- CREATE SYNONYM
- CREATE TABLE
- CREATE TABLE AS
- CREATE TABLE PARTITION
- CREATE TABLESPACE
- CREATE TEXT SEARCH CONFIGURATION
- CREATE TEXT SEARCH DICTIONARY
- CREATE TRIGGER
- CREATE TYPE
- CREATE USER
- CREATE USER MAPPING
- CREATE VIEW
- CURSOR
- DEALLOCATE
- DECLARE
- DELETE
- DO
- DROP AGGREGATE
- DROP AUDIT POLICY
- DROP CAST
- DROP CLIENT MASTER KEY
- DROP COLUMN ENCRYPTION KEY
- DROP DATABASE
- DROP DATA SOURCE
- DROP DIRECTORY
- DROP EXTENSION
- DROP FOREIGN TABLE
- DROP FUNCTION
- DROP GROUP
- DROP INDEX
- DROP LANGUAGE
- DROP MASKING POLICY
- DROP MATERIALIZED VIEW
- DROP OPERATOR
- DROP OWNED
- DROP RESOURCE LABEL
- DROP ROW LEVEL SECURITY POLICY
- DROP PROCEDURE
- DROP ROLE
- DROP RULE
- DROP SCHEMA
- DROP SEQUENCE
- DROP SERVER
- DROP SYNONYM
- DROP TABLE
- DROP TABLESPACE
- DROP TEXT SEARCH CONFIGURATION
- DROP TEXT SEARCH DICTIONARY
- DROP TRIGGER
- DROP TYPE
- DROP USER
- DROP USER MAPPING
- DROP VIEW
- EXECUTE
- EXPLAIN
- EXPLAIN PLAN
- FETCH
- GRANT
- INSERT
- LOCK
- MOVE
- MERGE INTO
- PREPARE
- PREPARE TRANSACTION
- REASSIGN OWNED
- REFRESH INCREMENTAL MATERIALIZED VIEW
- REFRESH MATERIALIZED VIEW
- REINDEX
- RELEASE SAVEPOINT
- RESET
- REVOKE
- ROLLBACK
- ROLLBACK PREPARED
- ROLLBACK TO SAVEPOINT
- SAVEPOINT
- SELECT
- SELECT INTO
- SET
- SET CONSTRAINTS
- SET ROLE
- SET SESSION AUTHORIZATION
- SET TRANSACTION
- SHOW
- SHUTDOWN
- START TRANSACTION
- TRUNCATE
- UPDATE
- VACUUM
- VALUES
- SQL Reference
- MogDB SQL
- Keywords
- Constant and Macro
- Expressions
- Type Conversion
- Full Text Search
- Introduction
- Tables and Indexes
- Controlling Text Search
- Additional Features
- Parser
- Dictionaries
- Configuration Examples
- Testing and Debugging Text Search
- Limitations
- System Operation
- Controlling Transactions
- DDL Syntax Overview
- DML Syntax Overview
- DCL Syntax Overview
- Appendix
- GUC Parameters
- GUC Parameter Usage
- File Location
- Connection and Authentication
- Resource Consumption
- Parallel Import
- Write Ahead Log
- HA Replication
- Memory Table
- Query Planning
- Error Reporting and Logging
- Alarm Detection
- Statistics During the Database Running
- Load Management
- Automatic Vacuuming
- Default Settings of Client Connection
- Lock Management
- Version and Platform Compatibility
- Faut Tolerance
- Connection Pool Parameters
- MogDB Transaction
- Developer Options
- Auditing
- Upgrade Parameters
- Miscellaneous Parameters
- Wait Events
- Query
- System Performance Snapshot
- Equality Query in a Fully-encrypted Database
- Global Temporary Table
- Scheduled Task
- Thread Pool
- Appendix
- Information Schema
- DBE_PERF
- DBE_PERF Overview
- OS
- Instance
- Memory
- File
- Object
- STAT_USER_TABLES
- SUMMARY_STAT_USER_TABLES
- GLOBAL_STAT_USER_TABLES
- STAT_USER_INDEXES
- SUMMARY_STAT_USER_INDEXES
- GLOBAL_STAT_USER_INDEXES
- STAT_SYS_TABLES
- SUMMARY_STAT_SYS_TABLES
- GLOBAL_STAT_SYS_TABLES
- STAT_SYS_INDEXES
- SUMMARY_STAT_SYS_INDEXES
- GLOBAL_STAT_SYS_INDEXES
- STAT_ALL_TABLES
- SUMMARY_STAT_ALL_TABLES
- GLOBAL_STAT_ALL_TABLES
- STAT_ALL_INDEXES
- SUMMARY_STAT_ALL_INDEXES
- GLOBAL_STAT_ALL_INDEXES
- STAT_DATABASE
- SUMMARY_STAT_DATABASE
- GLOBAL_STAT_DATABASE
- STAT_DATABASE_CONFLICTS
- SUMMARY_STAT_DATABASE_CONFLICTS
- GLOBAL_STAT_DATABASE_CONFLICTS
- STAT_XACT_ALL_TABLES
- SUMMARY_STAT_XACT_ALL_TABLES
- GLOBAL_STAT_XACT_ALL_TABLES
- STAT_XACT_SYS_TABLES
- SUMMARY_STAT_XACT_SYS_TABLES
- GLOBAL_STAT_XACT_SYS_TABLES
- STAT_XACT_USER_TABLES
- SUMMARY_STAT_XACT_USER_TABLES
- GLOBAL_STAT_XACT_USER_TABLES
- STAT_XACT_USER_FUNCTIONS
- SUMMARY_STAT_XACT_USER_FUNCTIONS
- GLOBAL_STAT_XACT_USER_FUNCTIONS
- STAT_BAD_BLOCK
- SUMMARY_STAT_BAD_BLOCK
- GLOBAL_STAT_BAD_BLOCK
- STAT_USER_FUNCTIONS
- SUMMARY_STAT_USER_FUNCTIONS
- GLOBAL_STAT_USER_FUNCTIONS
- Workload
- Session/Thread
- SESSION_STAT
- GLOBAL_SESSION_STAT
- SESSION_TIME
- GLOBAL_SESSION_TIME
- SESSION_MEMORY
- GLOBAL_SESSION_MEMORY
- SESSION_MEMORY_DETAIL
- GLOBAL_SESSION_MEMORY_DETAIL
- SESSION_STAT_ACTIVITY
- GLOBAL_SESSION_STAT_ACTIVITY
- THREAD_WAIT_STATUS
- GLOBAL_THREAD_WAIT_STATUS
- LOCAL_THREADPOOL_STATUS
- GLOBAL_THREADPOOL_STATUS
- SESSION_CPU_RUNTIME
- SESSION_MEMORY_RUNTIME
- STATEMENT_IOSTAT_COMPLEX_RUNTIME
- Transaction
- Query
- STATEMENT
- SUMMARY_STATEMENT
- STATEMENT_COUNT
- GLOBAL_STATEMENT_COUNT
- SUMMARY_STATEMENT_COUNT
- GLOBAL_STATEMENT_COMPLEX_HISTORY
- GLOBAL_STATEMENT_COMPLEX_HISTORY_TABLE
- GLOBAL_STATEMENT_COMPLEX_RUNTIME
- STATEMENT_RESPONSETIME_PERCENTILE
- STATEMENT_USER_COMPLEX_HISTORY
- STATEMENT_COMPLEX_RUNTIME
- STATEMENT_COMPLEX_HISTORY_TABLE
- STATEMENT_COMPLEX_HISTORY
- STATEMENT_WLMSTAT_COMPLEX_RUNTIME
- STATEMENT_HISTORY
- Cache/IO
- STATIO_USER_TABLES
- SUMMARY_STATIO_USER_TABLES
- GLOBAL_STATIO_USER_TABLES
- STATIO_USER_INDEXES
- SUMMARY_STATIO_USER_INDEXES
- GLOBAL_STATIO_USER_INDEXES
- STATIO_USER_SEQUENCES
- SUMMARY_STATIO_USER_SEQUENCES
- GLOBAL_STATIO_USER_SEQUENCES
- STATIO_SYS_TABLES
- SUMMARY_STATIO_SYS_TABLES
- GLOBAL_STATIO_SYS_TABLES
- STATIO_SYS_INDEXES
- SUMMARY_STATIO_SYS_INDEXES
- GLOBAL_STATIO_SYS_INDEXES
- STATIO_SYS_SEQUENCES
- SUMMARY_STATIO_SYS_SEQUENCES
- GLOBAL_STATIO_SYS_SEQUENCES
- STATIO_ALL_TABLES
- SUMMARY_STATIO_ALL_TABLES
- GLOBAL_STATIO_ALL_TABLES
- STATIO_ALL_INDEXES
- SUMMARY_STATIO_ALL_INDEXES
- GLOBAL_STATIO_ALL_INDEXES
- STATIO_ALL_SEQUENCES
- SUMMARY_STATIO_ALL_SEQUENCES
- GLOBAL_STATIO_ALL_SEQUENCES
- GLOBAL_STAT_DB_CU
- GLOBAL_STAT_SESSION_CU
- Utility
- REPLICATION_STAT
- GLOBAL_REPLICATION_STAT
- REPLICATION_SLOTS
- GLOBAL_REPLICATION_SLOTS
- BGWRITER_STAT
- GLOBAL_BGWRITER_STAT
- GLOBAL_CKPT_STATUS
- GLOBAL_DOUBLE_WRITE_STATUS
- GLOBAL_PAGEWRITER_STATUS
- GLOBAL_RECORD_RESET_TIME
- GLOBAL_REDO_STATUS
- GLOBAL_RECOVERY_STATUS
- CLASS_VITAL_INFO
- USER_LOGIN
- SUMMARY_USER_LOGIN
- GLOBAL_GET_BGWRITER_STATUS
- Lock
- Wait Events
- Configuration
- Operator
- Workload Manager
- Global Plancache
- Appendix
- Tool Reference
- Tool Overview
- Client Tool
- Server Tools
- Tools Used in the Internal System
- Error Code Reference
- Description of SQL Error Codes
- Third-Party Library Error Codes
- GAUSS-00001 - GAUSS-00100
- GAUSS-00101 - GAUSS-00200
- GAUSS 00201 - GAUSS 00300
- GAUSS 00301 - GAUSS 00400
- GAUSS 00401 - GAUSS 00500
- GAUSS 00501 - GAUSS 00600
- GAUSS 00601 - GAUSS 00700
- GAUSS 00701 - GAUSS 00800
- GAUSS 00801 - GAUSS 00900
- GAUSS 00901 - GAUSS 01000
- GAUSS 01001 - GAUSS 01100
- GAUSS 01101 - GAUSS 01200
- GAUSS 01201 - GAUSS 01300
- GAUSS 01301 - GAUSS 01400
- GAUSS 01401 - GAUSS 01500
- GAUSS 01501 - GAUSS 01600
- GAUSS 01601 - GAUSS 01700
- GAUSS 01701 - GAUSS 01800
- GAUSS 01801 - GAUSS 01900
- GAUSS 01901 - GAUSS 02000
- GAUSS 02001 - GAUSS 02100
- GAUSS 02101 - GAUSS 02200
- GAUSS 02201 - GAUSS 02300
- GAUSS 02301 - GAUSS 02400
- GAUSS 02401 - GAUSS 02500
- GAUSS 02501 - GAUSS 02600
- GAUSS 02601 - GAUSS 02700
- GAUSS 02701 - GAUSS 02800
- GAUSS 02801 - GAUSS 02900
- GAUSS 02901 - GAUSS 03000
- GAUSS 03001 - GAUSS 03100
- GAUSS 03101 - GAUSS 03200
- GAUSS 03201 - GAUSS 03300
- GAUSS 03301 - GAUSS 03400
- GAUSS 03401 - GAUSS 03500
- GAUSS 03501 - GAUSS 03600
- GAUSS 03601 - GAUSS 03700
- GAUSS 03701 - GAUSS 03800
- GAUSS 03801 - GAUSS 03900
- GAUSS 03901 - GAUSS 04000
- GAUSS 04001 - GAUSS 04100
- GAUSS 04101 - GAUSS 04200
- GAUSS 04201 - GAUSS 04300
- GAUSS 04301 - GAUSS 04400
- GAUSS 04401 - GAUSS 04500
- GAUSS 04501 - GAUSS 04600
- GAUSS 04601 - GAUSS 04700
- GAUSS 04701 - GAUSS 04800
- GAUSS 04801 - GAUSS 04900
- GAUSS 04901 - GAUSS 05000
- GAUSS 05001 - GAUSS 05100
- GAUSS 05101 - GAUSS 05200
- GAUSS 05201 - GAUSS 05300
- GAUSS 05301 - GAUSS 05400
- GAUSS 05401 - GAUSS 05500
- GAUSS 05501 - GAUSS 05600
- GAUSS 05601 - GAUSS 05700
- GAUSS 05701 - GAUSS 05800
- GAUSS 05801 - GAUSS 05900
- GAUSS 05901 - GAUSS 06000
- GAUSS 06001 - GAUSS 06100
- GAUSS 06101 - GAUSS 06200
- GAUSS 06201 - GAUSS 06300
- GAUSS 06301 - GAUSS 06400
- GAUSS 06401 - GAUSS 06500
- GAUSS 06501 - GAUSS 06600
- GAUSS 06601 - GAUSS 06700
- GAUSS 06701 - GAUSS 06800
- GAUSS 06801 - GAUSS 06900
- GAUSS 06901 - GAUSS 07000
- GAUSS 07001 - GAUSS 07100
- GAUSS 07101 - GAUSS 07200
- GAUSS 07201 - GAUSS 07300
- GAUSS 07301 - GAUSS 07400
- GAUSS 07401 - GAUSS 07480
- GAUSS 50000 - GAUSS 50999
- GAUSS 51000 - GAUSS 51999
- GAUSS 52000 - GAUSS 52999
- GAUSS 53000 - GAUSS 53699
- System Catalogs and System Views
- FAQs
- Glossary
Overview
DeepSQL is an enhancement to MogDB DB4AI. DeepSQL encapsulates common machine learning algorithms into SQL statements and supports more than 60 general algorithms, including regression algorithms (such as linear regression, logistic regression, and random forest), classification algorithms (such as KNN), and clustering algorithms (such as K-means). In addition to basic machine learning algorithms, graph-related algorithms are also included, such as algorithms about the shortest path and graph diameter. Also, it supports data processing (such as PCA), sparse vectors, common statistical algorithms (such as covariance and Pearson coefficient calculation), training set and test set segmentation, and cross validation.
Table 1 Supported machine learning algorithms: regression algorithms
Algorithm Name | Abbreviation | Application Scenario |
---|---|---|
Logistic regression | - | For example, find the risk factors of a disease, or evaluate enterprises for financial and commercial institutions. Prediction: Use a model to predict the occurrence probabilities of a disease or situation under different independent variables. Judgment: Use a model to determine the probability that a person has certain diseases or be in certain situations. |
Cox proportional hazards regression | - | The model takes the survival result and the survival time as dependent variables, can analyze the influence of many factors on the survival time simultaneously, and can analyze the data with the truncated survival time, without the need of estimating the distribution type of the data. Because of the preceding excellent properties, this model has been widely used in medical research since its inception, and is the most widely used multi-factor analysis method in survival analysis. |
Elastic net regularization | - | Elastic regression is a hybrid technique of ridge regression and lasso regression, which uses L2 and L1 regularization at the same time. When there are multiple related features, the lasso regression is likely to randomly select one of them, while the elastic regression is likely to select all of them. |
Generalized linear model | - | In some practical problems, the relationship between variables is not always linear. In this case, curves should be used for fitting. |
Marginal effect | - | Calculation of marginal effects. |
Multinomial regression | - | If there are more than two target categories, multinomial regression is required. For example, evaluate the curative effect with "ineffective", "effective", and "cured". |
Ordinal regression | - | In statistics, ordinal regression is a regression analysis used to predict ordinal variables. That is, the values of variables are within any range, and the metric distances between different values are different. It can be considered as an issue between regression and classification. Examples include the severity of illness (levels 1, 2, 3, and 4), the pain scale (no pain, mild, moderate, and severe), and the drug dose-response effects (ineffective, less effective, effective, and very effective). The difference between levels is not necessarily equal, for example, the difference between no pain and mild is not necessarily equal to the difference between moderate and severe. |
Clustered variance | - | The clustered variance module adjusts the standard error of clustering. For example, when a dataset is copied 100 times, precision of parameter estimation should not be increased, but execution of this process in compliance with an independent identically distributed (IID) assumption actually improves precision. |
Robust variance | - | The functions in the robust variance module are used to compute the robust variance (Huber-White estimator) of linear regression, logistic regression, multinomial logistic regression, and Cox proportional hazard regression (Huber-White estimation). They can be used to compute differences of data in datasets with potential anomalous noises. |
Support vector machine | SVM | Compared with traditional query optimization schemes, SVM can obtain higher query accuracy for text and hypertext classification and image classification. This also applies to image segmentation systems. |
Linear regression | - | This is widely used in economics and finance. |
Table 2 Supported machine learning algorithms: other supervised learning
Algorithm Name | Abbreviation | Application Scenario |
---|---|---|
Decision tree | - | It is one of the most widely used inductive inference algorithms. It handles the classification and prediction problems of category or continuous variables. The model can be represented by graphs and if-then rules, which is highly readable. |
Random forest | RF | Random forest is a kind of combinatorial method specially designed for decision tree classifier. It combines multiple decision trees to make predictions. |
Conditional random field | CRF | CRF is a discriminant graphic model with undirected probability. A linear chain CRF is a special type of CRF that assumes that the current state depends only on the previous state. Good results have been obtained in sequence annotation tasks such as word segmentation, part-of-speech tagging, and named entity recognition. |
Naive Bayes | - | Classification by calculating probabilities can be used to deal with multi-classification issues, such as spam filters. |
Neural network | - | It has a wide range of application scenarios, such as speech recognition, image recognition, and machine translation. It is a standard supervised learning algorithm in the domain of pattern recognition, and continues to be a research subject in the domain of computational neurology. MLP has been proved to be a general function approximation method, which can be used to fit complex functions or solve classification problems. |
k-nearest neighbors | - | In the k-nearest neighbor classification method, a distance between each training sample and a to-be-classified sample is computed, and K training samples that are closest to the to-be-classified sample are selected. If training samples of a category in the K samples accounts for a majority, the to-be-classified tuple belongs to the category. It can be used for text recognition, facial recognition, gene pattern recognition, customer churn prediction, and fraud detection. |
Table 3 Supported machine learning algorithms: data processing algorithms
Algorithm Name | Abbreviation | Application Scenario |
---|---|---|
Array operation | - | Array and vector operations, including basic addition, subtraction, multiplication, and division, exponentiation, root extraction, cos, sin, absolute value, and variance. |
Principal component analysis for dimensionality reduction | PCA | This is used to reduce dimensions and compute the principal component. |
Encoding categorical variable | - | Currently, the one-hot and dummy encoding technologies are supported. When a specific group of prediction variables need to be compared with another group of prediction variables, dummy coding is usually used, and a group of variables compared with the group of prediction variables is referred to as a reference group. One-hot encoding is similar to dummy encoding, and a difference lies in that the one-hot encoding establishes a numeric type 0/1 indication column for each classification value. In each row of data (corresponding to one data point), a value of only one classification code column can be 1. |
Matrix operation | - | Using matrix decomposition to decompose a large matrix into the product form of a simple matrix can greatly reduce the difficulty and volume of computation. Matrix addition, subtraction, multiplication, and division, extremum, mean, rank calculation, inversion, matrix decomposition (QR, LU, Cholesky), and feature extraction. |
Norms and distance functions | - | This is used to compute the norm, cosine similarity, and distance between vectors. |
Sparse vector | - | This is used to implement the sparse vector type. If there are a large number of repeated values in the vector, the vector can be compressed to save space. |
Pivot | - | Pivot tables are used to meet common row and column transposition requirements in OLAP or report systems. The pivot function can perform basic row-to-column conversion on data stored in a table and output the aggregation result to another table. It makes row and column conversion easier and more flexible. |
Path | - | It performs regular pattern matching on a series of rows and extracts useful information about pattern matching. The useful information can be a simple match count or something more involved, such as an aggregate or window function. |
Sessionize | - | The sessionize function performs time-oriented session rebuilding on a dataset that includes an event sequence. The defined inactive period indicates the end of a session and the start of the next session. It can be used for network analysis, network security, manufacturing, finance, and operation analysis. |
Conjugate gradient | - | A method for solving numerical solutions of linear equations whose coefficient matrices are symmetric positive definite matrices. |
Stemming | - | Stemming is simply to find the stem of a word. It can be used to, for example, establish a topic-focused search engine. The optimization effect is obvious on English websites, which can be a reference for websites in other languages. |
Train-Test Split | - | It is used to split a dataset into a training set and a test set. The train set is used for training, and the test set is used for verification. |
Cross validation | - | It is used to perform cross validation. |
Prediction metric | - | It is used to evaluate the quality of model prediction, including the mean square error, AUC value, confusion matrix, and adjusted R-square. |
Mini-batch preprocessor | - | It is used to pack the data into small parts for training. The advantage is that the performance is better than that of the stochastic gradient descent (the default MADlib optimizer), and the convergence is faster and smoother. |
Table 4 Supported machine learning algorithms: graph
Algorithm Name | Abbreviation | Application Scenario |
---|---|---|
All pairs shortest path | APSP | APSP finds the length (summed weight) of the shortest path between all pairs of vertices to minimize the sum of the path edge weights. |
Breadth-first search | - | This algorithm traverses paths. |
Hyperlink-induced topic search | HITS | HITS outputs the authority score and hub score of each vertex, where authority estimates the value of the content of the page and hub estimates the value of its links to other pages. |
Average path length | - | This function computes the average value of the shortest paths between each pair of vertices. The average path length is based on the "reachable target vertexes", so it ignores infinite-length paths between unconnected vertices. |
Closeness centrality | - | The closeness measures are the inverse of the sum, the inverse of the average, and the sum of inverses of the shortest distances to all reachable target vertices (excluding the source vertex). |
Chart diameter | - | The diameter is defined as the longest of all the shortest paths in the graph. |
In-Out degree | - | This algorithm computes the in-degree and out-degree of each node. The node in-degree is the number of edges pointing in to the node and node out-degree is the number of edges pointing out of the node. |
PageRank | - | Given a graph, the PageRank algorithm outputs a probability distribution representing the likelihood that a person randomly traversing the graph will arrive at any particular vertex. |
Single source shortest path | SSSP | Given a graph and a source vertex, the SSSP algorithm finds a path from the source vertex to every other vertex in the graph to minimize the sum of the weights of the path edges. |
Weakly connected component | - | Given a directed graph, the WCC is a subgraph of the original graph, where all vertices are connected to each other through a path, ignoring the direction of the edges. In the case of an undirected graph, the WCC is also a strongly connected component. This module also includes many auxiliary functions that run on the WCC output. |
Table 5 Supported machine learning algorithms: time series
Algorithm Name | Abbreviation | Application Scenario |
---|---|---|
Autoregressive integrated moving average model | ARIMA | Time series forecasting, which is used to understand and forecast future values in the series. For example, international air traveler data can be used to forecast the number of passengers. |
Table 6 Supported machine learning algorithms - sampling
Algorithm Name | Abbreviation | Application Scenario |
---|---|---|
Sample | - | Sampling. |
Stratified sampling | - | Stratified random sampling, also known as type random sampling, is used to divide the overall units into various types (or layers) according to a certain standard. Then, according to the ratio of the number of units of each type to the total number of units, the number of units extracted from each type is determined. Finally, samples are extracted from each type according to the random principle. |
Balanced sampling | - | Some classification algorithms only perform optimally when the number of samples in each class is roughly the same. Highly skewed datasets are common in many domains (such as fraud detection), so resampling to offset this imbalance can produce better decision boundaries. |
Table 7 Supported machine learning algorithms: statistics
Algorithm Name | Abbreviation | Application Scenario |
---|---|---|
Summary | - | This algorithm generates summary statistics for any data table. |
Correlation and covariance | - | Descriptive statistics, one of which computes the Pearson coefficient and correlation coefficient, and the other outputs covariance. This help us understand the characteristics of the data amount that is statistically reflected so that we can better understand the data to be mined. |
CountMin (Cormode-Muthukrishnan) | - | This algorithm counts the occurrence frequency of an element in a real-time data stream, and is ready to response to the occurrence frequency of an element at any time. No accurate counting is required. |
Flajolet-Martin | FM | This algorithm is used to obtain the number of different values in a specified column, and find the number of unique numbers in the number set. |
Most frequent values | MFV | This algorithm is used to computes frequent values. |
Hypothesis test | - | This algorithm includes F-test and chi2-test. |
Probability functions | - | The probability functions module provides cumulative distribution, density/mass, and quantile functions for various probability distributions. |
Table 8 Supported machine learning algorithms: other algorithms
Algorithm Name | Abbreviation | Application Scenario |
---|---|---|
k-means clustering | - | This algorithm is used in the clustering scenario. |
Latent Dirichlet allocation | LDA | LDA plays an important role in the topic model and is often used for text classification. |
Apriori algorithm | - | Apriori algorithm is used to discover the association between data item sets, such as the typical "beer and diaper" association. |