- About MogDB
- Quick Start
- Characteristic Description
- Overview
- High Performance
- CBO Optimizer
- LLVM
- Vectorized Engine
- Hybrid Row-Column Store
- Adaptive Compression
- Adaptive Two-phase Hash Aggregation
- SQL Bypass
- Kunpeng NUMA Architecture Optimization
- High Concurrency of Thread Pools
- SMP for Parallel Execution
- Xlog no Lock Flush
- Parallel Page-based Redo For Ustore
- Row-Store Execution to Vectorized Execution
- Astore Row Level Compression
- BTree Index Compression
- Tracing SQL Function
- Parallel Index Scan
- Parallel Query Optimization
- Enhancement of Tracing Backend Key Thread
- Ordering Operator Optimization
- OCK-accelerated Data Transmission
- OCK SCRLock Accelerate Distributed Lock
- Enhancement of WAL Redo Performance
- Enhancement of Dirty Pages Flushing Performance
- Sequential Scan Prefetch
- Ustore SMP Parallel Scanning
- Statement Level PLSQL Function Cache Support
- High Availability (HA)
- Primary/Standby
- Logical Replication
- Logical Backup
- Physical Backup
- Automatic Job Retry upon Failure
- Ultimate RTO
- High Availability Based on the Paxos Protocol
- Cascaded Standby Server
- Delayed Replay
- Adding or Deleting a Standby Server
- Delaying Entering the Maximum Availability Mode
- Parallel Logical Decoding
- DCF
- CM(Cluster Manager)
- Global SysCache
- Using a Standby Node to Build a Standby Node
- Two City and Three Center DR
- CM Cluster Management Component Supporting Two Node Deployment
- Query of the Original DDL Statement for a View
- MogDB/CM/PTK Dual Network Segment Support
- Enhanced Efficiency of Logical Backup and Restore
- Maintainability
- Workload Diagnosis Report (WDR)
- Slow SQL Diagnosis
- Session Performance Diagnosis
- System KPI-aided Diagnosis
- Fault Diagnosis
- Extension Splitting
- Built-in Stack Tool
- SQL PATCH
- Lightweight Lock Export and Analysis
- DCF Module Tracing
- Error When Writing Illegal Characters
- Support For Pageinspect & Pagehack
- Autonomous Transaction Management View and Termination
- Corrupt Files Handling
- Compatibility
- Add %rowtype Attribute To The View
- Aggregate Functions Distinct Performance Optimization
- Aggregate Functions Support Keep Clause
- Aggregate Functions Support Scenario Extensions
- Compatible With MySQL Alias Support For Single Quotes
- current_date/current_time Keywords As Field Name
- Custom Type Array
- For Update Support Outer Join
- MogDB Supports Insert All
- Oracle DBLink Syntax Compatibility
- Remove Type Conversion Hint When Creating PACKAGE/FUNCTION/PROCEDURE
- Support Bypass Method When Merge Into Hit Index
- Support For Adding Nocopy Attributes To Procedure And Function Parameters
- Support For Passing The Count Attribute Of An Array As A Parameter Of The Array Extend
- Support Q Quote Escape Character
- Support Subtracting Two Date Types To Return Numeric Type
- Support table()
- Support To Keep The Same Name After The End With Oracle
- Support Where Current Of
- Support For Constants In Package As Default Values
- Support PLPGSQL subtype
- Support Synonym Calls Without Parentheses For Function Without Parameters
- Support For dbms_utility.format_error_backtrace
- Support for PIVOT and UNPIVOT Syntax
- Mod Function Compatibility
- Support for Nesting of Aggregate Functions
- ORDER BY/GROUP BY Scenario Expansion
- Support for Modifying Table Log Properties After Table Creation
- Support for INSERT ON CONFLICT Clause
- Support for AUTHID CURRENT_USER
- Support for Stored Procedure OUT Parameters in PBE Mode
- Database Security
- Access Control Model
- Separation of Control and Access Permissions
- Database Encryption Authentication
- Data Encryption and Storage
- Database Audit
- Network Communication Security
- Resource Label
- Unified Audit
- Dynamic Data Anonymization
- Row-Level Access Control
- Password Strength Verification
- Equality Query in a Fully-encrypted Database
- Ledger Database Mechanism
- Transparent Data Encryption
- Enterprise-Level Features
- Support for Functions and Stored Procedures
- SQL Hints
- Full-Text Indexing
- Copy Interface for Error Tolerance
- Partitioning
- Support for Advanced Analysis Functions
- Materialized View
- HyperLogLog
- Creating an Index Online
- Autonomous Transaction
- Global Temporary Table
- Pseudocolumn ROWNUM
- Stored Procedure Debugging
- JDBC Client Load Balancing and Read/Write Isolation
- In-place Update Storage Engine
- Publication-Subscription
- Foreign Key Lock Enhancement
- Data Compression in OLTP Scenarios
- Transaction Async Submit
- Index Creation Parallel Control
- Dynamic Partition Pruning
- COPY Import Optimization
- SQL Running Status Observation
- BRIN Index
- BLOOM Index
- Event Trigger
- Scrollable Cursor Support for Reverse Retrieval
- Support for Pruning Subquery Projection Columns
- Pruning ORDER BY in Subqueries
- Automatic Creation of Indexes Supporting Fuzzy Matching
- Support for Importing and Exporting Specific Objects
- Application Development Interfaces
- AI Capabilities
- Middleware
- Workload Management
- Installation Guide
- Upgrade Guide
- Administrator Guide
- Localization
- Routine Maintenance
- Starting and Stopping MogDB
- Using the gsql Client for Connection
- Routine Maintenance
- Checking OS Parameters
- Checking MogDB Health Status
- Checking Database Performance
- Checking and Deleting Logs
- Checking Time Consistency
- Checking The Number of Application Connections
- Routinely Maintaining Tables
- Routinely Recreating an Index
- Exporting and Viewing the WDR
- Data Security Maintenance Suggestions
- Slow SQL Diagnosis
- Log Reference
- Primary and Standby Management
- Column-store Tables Management
- Backup and Restoration
- Database Deployment Solutions
- Importing and Exporting Data
- High Available Guide
- AI Features Guide
- AI4DB: Autonomous Database O&M
- DBMind Mode
- Components that Support DBMind
- AI Sub-functions of the DBMind
- ABO Optimizer
- DB4AI: Database-driven AI
- AI4DB: Autonomous Database O&M
- Security Guide
- Developer Guide
- Application Development Guide
- Development Specifications
- Development Based on JDBC
- JDBC Package, Driver Class, and Environment Class
- Development Process
- Loading the Driver
- Connecting to a Database
- Connecting to the Database (Using SSL)
- Connecting to the Database (Using UDS)
- Running SQL Statements
- Processing Data in a Result Set
- Closing a Connection
- Managing Logs
- Example: Common Operations
- Example: Retrying SQL Queries for Applications
- Example: Importing and Exporting Data Through Local Files
- Example 2: Migrating Data from a MY Database to MogDB
- Example: Logic Replication Code
- Example: Parameters for Connecting to the Database in Different Scenarios
- Example: JDBC Primary/Standby Cluster Load Balancing
- JDBC API Reference
- java.sql.Connection
- java.sql.CallableStatement
- java.sql.DatabaseMetaData
- java.sql.Driver
- java.sql.PreparedStatement
- java.sql.ResultSet
- java.sql.ResultSetMetaData
- java.sql.Statement
- javax.sql.ConnectionPoolDataSource
- javax.sql.DataSource
- javax.sql.PooledConnection
- javax.naming.Context
- javax.naming.spi.InitialContextFactory
- CopyManager
- JDBC-based Common Parameter Reference
- JDBC Release Notes
- Development Based on ODBC
- Development Based on libpq
- Psycopg2-Based Development
- Commissioning
- Stored Procedure
- User Defined Functions
- PL/pgSQL-SQL Procedural Language
- Scheduled Jobs
- Autonomous Transaction
- Logical Replication
- Extension
- MySQL Compatibility Description
- Dolphin Extension
- Dolphin Overview
- Dolphin Installation
- Dolphin Restrictions
- Dolphin Syntax
- SQL Reference
- Keywords
- Data Types
- Functions and Operators
- Assignment Operators
- Character Processing Functions and Operators
- Arithmetic Functions and Operators
- Dolphin Lock
- Date and Time Processing Functions and Operators
- Advisory Lock Functions
- Network Address Functions and Operators
- Conditional Expression Functions
- Aggregate Functions
- System Information Functions
- Logical Operators
- Bit String Functions and Operators
- JSON-JSONB Functions and Operators
- Type Conversion Functions
- Compatible Operators and Operations
- Comment Operators
- Expressions
- DDL Syntax
- DML Syntax
- DCL Syntax
- SQL Syntax
- ALTER DATABASE
- ALTER FUNCTION
- ALTER PROCEDURE
- ALTER SERVER
- ALTER TABLE
- ALTER TABLE PARTITION
- ALTER TABLESPACE
- ALTER VIEW
- ANALYZE | ANALYSE
- AST
- CHECKSUM TABLE
- CREATE DATABASE
- CREATE FUNCTION
- CREATE INDEX
- CREATE PROCEDURE
- CREATE SERVER
- CREATE TABLE
- CREATE TABLE AS
- CREATE TABLE PARTITION
- CREATE TABLESPACE
- CREATE TRIGGER
- CREATE VIEW
- DESCRIBE TABLE
- DO
- DROP DATABASE
- DROP INDEX
- DROP TABLESPACE
- EXECUTE
- EXPLAIN
- FLUSH BINARY LOGS
- GRANT
- GRANT/REVOKE PROXY
- INSERT
- KILL
- LOAD DATA
- OPTIMIZE TABLE
- PREPARE
- RENAME TABLE
- RENAME USER
- REVOKE
- SELECT
- SELECT HINT
- SET CHARSET
- SET PASSWORD
- SHOW CHARACTER SET
- SHOW COLLATION
- SHOW COLUMNS
- SHOW CREATE DATABASE
- SHOW CREATE FUNCTION
- SHOW CREATE PROCEDURE
- SHOW CREATE TABLE
- SHOW CREATE TRIGGER
- SHOW CREATE VIEW
- SHOW DATABASES
- SHOW FUNCTION STATUS
- SHOW GRANTS
- SHOW INDEX
- SHOW MASTER STATUS
- SHOW PLUGINS
- SHOW PRIVILEGES
- SHOW PROCEDURE STATUS
- SHOW PROCESSLIST
- SHOW SLAVE HOSTS
- SHOW STATUS
- SHOW TABLES
- SHOW TABLE STATUS
- SHOW TRIGGERS
- SHOW VARIABLES
- SHOW WARNINGS/ERRORS
- UPDATE
- USE db_name
- System Views
- GUC Parameters
- Resetting Parameters
- Stored Procedures
- Identifiers
- SQL Reference
- MySQL Syntax Compatibility Assessment Tool
- Dolphin Extension
- Materialized View
- Partition Management
- Application Development Guide
- Performance Tuning Guide
- Reference Guide
- System Catalogs and System Views
- Overview
- Querying a System Catalog
- System Catalogs
- GS_ASP
- GS_AUDITING_POLICY
- GS_AUDITING_POLICY_ACCESS
- GS_AUDITING_POLICY_FILTERS
- GS_AUDITING_POLICY_PRIVILEGES
- GS_CLIENT_GLOBAL_KEYS
- GS_CLIENT_GLOBAL_KEYS_ARGS
- GS_COLUMN_KEYS
- GS_COLUMN_KEYS_ARGS
- GS_DB_PRIVILEGE
- GS_ENCRYPTED_COLUMNS
- GS_ENCRYPTED_PROC
- GS_GLOBAL_CHAIN
- GS_GLOBAL_CONFIG
- GS_MASKING_POLICY
- GS_MASKING_POLICY_ACTIONS
- GS_MASKING_POLICY_FILTERS
- GS_MATVIEW
- GS_MATVIEW_DEPENDENCY
- GS_MODEL_WAREHOUSE
- GS_OPT_MODEL
- GS_PACKAGE
- GS_POLICY_LABEL
- GS_RECYCLEBIN
- GS_TXN_SNAPSHOT
- GS_UID
- GS_WLM_EC_OPERATOR_INFO
- GS_WLM_INSTANCE_HISTORY
- GS_WLM_OPERATOR_INFO
- GS_WLM_PLAN_ENCODING_TABLE
- GS_WLM_PLAN_OPERATOR_INFO
- GS_WLM_SESSION_QUERY_INFO_ALL
- GS_WLM_USER_RESOURCE_HISTORY
- PG_AGGREGATE
- PG_AM
- PG_AMOP
- PG_AMPROC
- PG_APP_WORKLOADGROUP_MAPPING
- PG_ATTRDEF
- PG_ATTRIBUTE
- PG_AUTH_HISTORY
- PG_AUTH_MEMBERS
- PG_AUTHID
- PG_CAST
- PG_CLASS
- PG_COLLATION
- PG_CONSTRAINT
- PG_CONVERSION
- PG_DATABASE
- PG_DB_ROLE_SETTING
- PG_DEFAULT_ACL
- PG_DEPEND
- PG_DESCRIPTION
- PG_DIRECTORY
- PG_ENUM
- PG_EVENT_TRIGGER
- PG_EXTENSION
- PG_EXTENSION_DATA_SOURCE
- PG_FOREIGN_DATA_WRAPPER
- PG_FOREIGN_SERVER
- PG_FOREIGN_TABLE
- PG_HASHBUCKET
- PG_INDEX
- PG_INHERITS
- PG_JOB
- PG_JOB_PROC
- PG_LANGUAGE
- PG_LARGEOBJECT
- PG_LARGEOBJECT_METADATA
- PG_NAMESPACE
- PG_OBJECT
- PG_OPCLASS
- PG_OPERATOR
- PG_OPFAMILY
- PG_PARTITION
- PG_PLTEMPLATE
- PG_PROC
- PG_PUBLICATION
- PG_PUBLICATION_REL
- PG_RANGE
- PG_REPLICATION_ORIGIN
- PG_RESOURCE_POOL
- PG_REWRITE
- PG_RLSPOLICY
- PG_SECLABEL
- PG_SET
- PG_SHDEPEND
- PG_SHDESCRIPTION
- PG_SHSECLABEL
- PG_STATISTIC
- PG_STATISTIC_EXT
- PG_SUBSCRIPTION
- PG_SUBSCRIPTION_REL
- PG_SYNONYM
- PG_TABLESPACE
- PG_TRIGGER
- PG_TS_CONFIG
- PG_TS_CONFIG_MAP
- PG_TS_DICT
- PG_TS_PARSER
- PG_TS_TEMPLATE
- PG_TYPE
- PG_USER_MAPPING
- PG_USER_STATUS
- PG_WORKLOAD_GROUP
- PGXC_CLASS
- PGXC_GROUP
- PGXC_NODE
- PGXC_SLICE
- PLAN_TABLE_DATA
- STATEMENT_HISTORY
- System Views
- GET_GLOBAL_PREPARED_XACTS(Discarded)
- GS_ASYNC_SUBMIT_SESSIONS_STATUS
- GS_AUDITING
- GS_AUDITING_ACCESS
- GS_AUDITING_PRIVILEGE
- GS_CLUSTER_RESOURCE_INFO
- GS_COMPRESSION
- GS_DB_PRIVILEGES
- GS_FILE_STAT
- GS_GSC_MEMORY_DETAIL
- GS_INSTANCE_TIME
- GS_LABELS
- GS_LSC_MEMORY_DETAIL
- GS_MASKING
- GS_MATVIEWS
- GS_OS_RUN_INFO
- GS_REDO_STAT
- GS_SESSION_CPU_STATISTICS
- GS_SESSION_MEMORY
- GS_SESSION_MEMORY_CONTEXT
- GS_SESSION_MEMORY_DETAIL
- GS_SESSION_MEMORY_STATISTICS
- GS_SESSION_STAT
- GS_SESSION_TIME
- GS_SHARED_MEMORY_DETAIL
- GS_SQL_COUNT
- GS_STAT_SESSION_CU
- GS_THREAD_MEMORY_CONTEXT
- GS_TOTAL_MEMORY_DETAIL
- GS_WLM_CGROUP_INFO
- GS_WLM_EC_OPERATOR_STATISTICS
- GS_WLM_OPERATOR_HISTORY
- GS_WLM_OPERATOR_STATISTICS
- GS_WLM_PLAN_OPERATOR_HISTORY
- GS_WLM_REBUILD_USER_RESOURCE_POOL
- GS_WLM_RESOURCE_POOL
- GS_WLM_SESSION_HISTORY
- GS_WLM_SESSION_INFO
- GS_WLM_SESSION_INFO_ALL
- GS_WLM_SESSION_STATISTICS
- GS_WLM_USER_INFO
- IOS_STATUS
- MPP_TABLES
- PG_AVAILABLE_EXTENSION_VERSIONS
- PG_AVAILABLE_EXTENSIONS
- PG_COMM_DELAY
- PG_COMM_RECV_STREAM
- PG_COMM_SEND_STREAM
- PG_COMM_STATUS
- PG_CONTROL_GROUP_CONFIG
- PG_CURSORS
- PG_EXT_STATS
- PG_GET_INVALID_BACKENDS
- PG_GET_SENDERS_CATCHUP_TIME
- PG_GROUP
- PG_GTT_ATTACHED_PIDS
- PG_GTT_RELSTATS
- PG_GTT_STATS
- PG_INDEXES
- PG_LOCKS
- PG_NODE_ENV
- PG_OS_THREADS
- PG_PREPARED_STATEMENTS
- PG_PREPARED_XACTS
- PG_PUBLICATION_TABLES
- PG_REPLICATION_ORIGIN_STATUS
- PG_REPLICATION_SLOTS
- PG_RLSPOLICIES
- PG_ROLES
- PG_RULES
- PG_RUNNING_XACTS
- PG_SECLABELS
- PG_SESSION_IOSTAT
- PG_SESSION_WLMSTAT
- PG_SETTINGS
- PG_SHADOW
- PG_STAT_ACTIVITY
- PG_STAT_ACTIVITY_NG
- PG_STAT_ALL_INDEXES
- PG_STAT_ALL_TABLES
- PG_STAT_BAD_BLOCK
- PG_STAT_BGWRITER
- PG_STAT_DATABASE
- PG_STAT_DATABASE_CONFLICTS
- PG_STAT_REPLICATION
- PG_STAT_SUBSCRIPTION
- PG_STAT_SYS_INDEXES
- PG_STAT_SYS_TABLES
- PG_STAT_USER_FUNCTIONS
- PG_STAT_USER_INDEXES
- PG_STAT_USER_TABLES
- PG_STAT_XACT_ALL_TABLES
- PG_STAT_XACT_SYS_TABLES
- PG_STAT_XACT_USER_FUNCTIONS
- PG_STAT_XACT_USER_TABLES
- PG_STATIO_ALL_INDEXES
- PG_STATIO_ALL_SEQUENCES
- PG_STATIO_ALL_TABLES
- PG_STATIO_SYS_INDEXES
- PG_STATIO_SYS_SEQUENCES
- PG_STATIO_SYS_TABLES
- PG_STATIO_USER_INDEXES
- PG_STATIO_USER_SEQUENCES
- PG_STATIO_USER_TABLES
- PG_STATS
- PG_TABLES
- PG_TDE_INFO
- PG_THREAD_WAIT_STATUS
- PG_TIMEZONE_ABBREVS
- PG_TIMEZONE_NAMES
- PG_TOTAL_MEMORY_DETAIL
- PG_TOTAL_USER_RESOURCE_INFO
- PG_TOTAL_USER_RESOURCE_INFO_OID
- PG_USER
- PG_USER_MAPPINGS
- PG_VARIABLE_INFO
- PG_VIEWS
- PG_WLM_STATISTICS
- PGXC_PREPARED_XACTS
- PLAN_TABLE
- PATCH_INFORMATION_TABLE
- Functions and Operators
- Logical Operators
- Comparison Operators
- Character Processing Functions and Operators
- Binary String Functions and Operators
- Bit String Functions and Operators
- Mode Matching Operators
- Mathematical Functions and Operators
- Date and Time Processing Functions and Operators
- Type Conversion Functions
- Geometric Functions and Operators
- Network Address Functions and Operators
- Text Search Functions and Operators
- JSON/JSONB Functions and Operators
- HLL Functions and Operators
- SEQUENCE Functions
- Array Functions and Operators
- Range Functions and Operators
- Aggregate Functions
- Window Functions(Analysis Functions)
- Security Functions
- Ledger Database Functions
- Encrypted Equality Functions
- Set Returning Functions
- Conditional Expression Functions
- System Information Functions
- System Administration Functions
- Configuration Settings Functions
- Universal File Access Functions
- Server Signal Functions
- Backup and Restoration Control Functions
- Snapshot Synchronization Functions
- Database Object Functions
- Advisory Lock Functions
- Logical Replication Functions
- Segment-Page Storage Functions
- Other Functions
- Undo System Functions
- Row-store Compression System Functions
- Statistics Information Functions
- Trigger Functions
- Event Trigger Functions
- Hash Function
- Prompt Message Function
- Global Temporary Table Functions
- Fault Injection System Function
- AI Feature Functions
- Dynamic Data Masking Functions
- Other System Functions
- Internal Functions
- Global SysCache Feature Functions
- Data Damage Detection and Repair Functions
- XML Functions
- Obsolete Functions
- Supported Data Types
- SQL Syntax
- ABORT
- ALTER AGGREGATE
- ALTER AUDIT POLICY
- ALTER DATABASE
- ALTER DATA SOURCE
- ALTER DEFAULT PRIVILEGES
- ALTER DIRECTORY
- ALTER EVENT
- ALTER EVENT TRIGGER
- ALTER EXTENSION
- ALTER FOREIGN DATA WRAPPER
- ALTER FOREIGN TABLE
- ALTER FUNCTION
- ALTER GLOBAL CONFIGURATION
- ALTER GROUP
- ALTER INDEX
- ALTER LANGUAGE
- ALTER LARGE OBJECT
- ALTER MASKING POLICY
- ALTER MATERIALIZED VIEW
- ALTER OPERATOR
- ALTER PACKAGE
- ALTER PROCEDURE
- ALTER PUBLICATION
- ALTER RESOURCE LABEL
- ALTER RESOURCE POOL
- ALTER ROLE
- ALTER ROW LEVEL SECURITY POLICY
- ALTER RULE
- ALTER SCHEMA
- ALTER SEQUENCE
- ALTER SERVER
- ALTER SESSION
- ALTER SUBSCRIPTION
- ALTER SYNONYM
- ALTER SYSTEM KILL SESSION
- ALTER SYSTEM SET
- ALTER TABLE
- ALTER TABLE PARTITION
- ALTER TABLE SUBPARTITION
- ALTER TABLESPACE
- ALTER TEXT SEARCH CONFIGURATION
- ALTER TEXT SEARCH DICTIONARY
- ALTER TRIGGER
- ALTER TYPE
- ALTER USER
- ALTER USER MAPPING
- ALTER VIEW
- ANALYZE | ANALYSE
- BEGIN
- CALL
- CHECKPOINT
- CLEAN CONNECTION
- CLOSE
- CLUSTER
- COMMENT
- COMMIT | END
- COMMIT PREPARED
- CONNECT BY
- COPY
- CREATE AGGREGATE
- CREATE AUDIT POLICY
- CREATE CAST
- CREATE CLIENT MASTER KEY
- CREATE COLUMN ENCRYPTION KEY
- CREATE DATABASE
- CREATE DATA SOURCE
- CREATE DIRECTORY
- CREATE EVENT
- CREATE EVENT TRIGGER
- CREATE EXTENSION
- CREATE FOREIGN DATA WRAPPER
- CREATE FOREIGN TABLE
- CREATE FUNCTION
- CREATE GROUP
- CREATE INCREMENTAL MATERIALIZED VIEW
- CREATE INDEX
- CREATE LANGUAGE
- CREATE MASKING POLICY
- CREATE MATERIALIZED VIEW
- CREATE MODEL
- CREATE OPERATOR
- CREATE PACKAGE
- CREATE PROCEDURE
- CREATE PUBLICATION
- CREATE RESOURCE LABEL
- CREATE RESOURCE POOL
- CREATE ROLE
- CREATE ROW LEVEL SECURITY POLICY
- CREATE RULE
- CREATE SCHEMA
- CREATE SEQUENCE
- CREATE SERVER
- CREATE SUBSCRIPTION
- CREATE SYNONYM
- CREATE TABLE
- CREATE TABLE AS
- CREATE TABLE PARTITION
- CREATE TABLESPACE
- CREATE TABLE SUBPARTITION
- CREATE TEXT SEARCH CONFIGURATION
- CREATE TEXT SEARCH DICTIONARY
- CREATE TRIGGER
- CREATE TYPE
- CREATE USER
- CREATE USER MAPPING
- CREATE VIEW
- CREATE WEAK PASSWORD DICTIONARY
- CURSOR
- DEALLOCATE
- DECLARE
- DELETE
- DELIMITER
- DO
- DROP AGGREGATE
- DROP AUDIT POLICY
- DROP CAST
- DROP CLIENT MASTER KEY
- DROP COLUMN ENCRYPTION KEY
- DROP DATABASE
- DROP DATA SOURCE
- DROP DIRECTORY
- DROP EVENT
- DROP EVENT TRIGGER
- DROP EXTENSION
- DROP FOREIGN DATA WRAPPER
- DROP FOREIGN TABLE
- DROP FUNCTION
- DROP GLOBAL CONFIGURATION
- DROP GROUP
- DROP INDEX
- DROP LANGUAGE
- DROP MASKING POLICY
- DROP MATERIALIZED VIEW
- DROP MODEL
- DROP OPERATOR
- DROP OWNED
- DROP PACKAGE
- DROP PROCEDURE
- DROP PUBLICATION
- DROP RESOURCE LABEL
- DROP RESOURCE POOL
- DROP ROLE
- DROP ROW LEVEL SECURITY POLICY
- DROP RULE
- DROP SCHEMA
- DROP SEQUENCE
- DROP SERVER
- DROP SUBSCRIPTION
- DROP SYNONYM
- DROP TABLE
- DROP TABLESPACE
- DROP TEXT SEARCH CONFIGURATION
- DROP TEXT SEARCH DICTIONARY
- DROP TRIGGER
- DROP TYPE
- DROP USER
- DROP USER MAPPING
- DROP VIEW
- DROP WEAK PASSWORD DICTIONARY
- EXECUTE
- EXECUTE DIRECT
- EXPLAIN
- EXPLAIN PLAN
- FETCH
- GRANT
- INSERT
- LOCK
- MERGE INTO
- MOVE
- PREDICT BY
- PREPARE
- PREPARE TRANSACTION
- PURGE
- REASSIGN OWNED
- REFRESH INCREMENTAL MATERIALIZED VIEW
- REFRESH MATERIALIZED VIEW
- REINDEX
- RELEASE SAVEPOINT
- RESET
- REVOKE
- ROLLBACK
- ROLLBACK PREPARED
- ROLLBACK TO SAVEPOINT
- SAVEPOINT
- SELECT
- SELECT INTO
- SET
- SET CONSTRAINTS
- SET ROLE
- SET SESSION AUTHORIZATION
- SET TRANSACTION
- SHOW
- SHOW EVENTS
- SHRINK
- SHUTDOWN
- SNAPSHOT
- START TRANSACTION
- TIMECAPSULE TABLE
- TRUNCATE
- UPDATE
- VACUUM
- VALUES
- SQL Reference
- MogDB SQL
- Keywords
- Constant and Macro
- Expressions
- Type Conversion
- Full Text Search
- System Operation
- DDL Syntax Overview
- DML Syntax Overview
- DCL Syntax Overview
- Subquery
- LLVM
- Alias
- Lock
- Transaction
- Ordinary Table
- Partitioned Table
- Index
- Constraints
- Cursors
- Anonymous Block
- Trigger
- INSERT_RIGHT_REF_DEFAULT_VALUE
- Appendix
- GUC Parameters
- GUC Parameter Usage
- GUC Parameter List
- File Location
- Connection and Authentication
- Resource Consumption
- Write Ahead Log
- HA Replication
- Query Planning
- Error Reporting and Logging
- Alarm Detection
- Statistics During the Database Running
- Load Management
- Automatic Vacuuming
- Default Settings of Client Connection
- Lock Management
- Version and Platform Compatibility
- Faut Tolerance
- Connection Pool Parameters
- MogDB Transaction
- Replication Parameters of Two Database Instances
- Developer Options
- Auditing
- CM Parameters
- Backend Compression
- Upgrade Parameters
- Miscellaneous Parameters
- Wait Events
- Query
- System Performance Snapshot
- Security Configuration
- Global Temporary Table
- HyperLogLog
- Scheduled Task
- Thread Pool
- User-defined Functions
- Backup and Restoration
- DCF Parameters Settings
- Flashback
- Rollback Parameters
- Reserved Parameters
- AI Features
- Global SysCache Parameters
- Multi-Level Cache Management Parameters
- Resource Pooling Parameters
- Parameters Related to Efficient Data Compression Algorithms
- Writer Statement Parameters Supported by Standby Servers
- Data Import and Export
- Delimiter
- Appendix
- Schema
- Information Schema
- DBE_PERF
- OS
- Instance
- Memory
- File
- Object
- STAT_USER_TABLES
- SUMMARY_STAT_USER_TABLES
- GLOBAL_STAT_USER_TABLES
- STAT_USER_INDEXES
- SUMMARY_STAT_USER_INDEXES
- GLOBAL_STAT_USER_INDEXES
- STAT_SYS_TABLES
- SUMMARY_STAT_SYS_TABLES
- GLOBAL_STAT_SYS_TABLES
- STAT_SYS_INDEXES
- SUMMARY_STAT_SYS_INDEXES
- GLOBAL_STAT_SYS_INDEXES
- STAT_ALL_TABLES
- SUMMARY_STAT_ALL_TABLES
- GLOBAL_STAT_ALL_TABLES
- STAT_ALL_INDEXES
- SUMMARY_STAT_ALL_INDEXES
- GLOBAL_STAT_ALL_INDEXES
- STAT_DATABASE
- SUMMARY_STAT_DATABASE
- GLOBAL_STAT_DATABASE
- STAT_DATABASE_CONFLICTS
- SUMMARY_STAT_DATABASE_CONFLICTS
- GLOBAL_STAT_DATABASE_CONFLICTS
- STAT_XACT_ALL_TABLES
- SUMMARY_STAT_XACT_ALL_TABLES
- GLOBAL_STAT_XACT_ALL_TABLES
- STAT_XACT_SYS_TABLES
- SUMMARY_STAT_XACT_SYS_TABLES
- GLOBAL_STAT_XACT_SYS_TABLES
- STAT_XACT_USER_TABLES
- SUMMARY_STAT_XACT_USER_TABLES
- GLOBAL_STAT_XACT_USER_TABLES
- STAT_XACT_USER_FUNCTIONS
- SUMMARY_STAT_XACT_USER_FUNCTIONS
- GLOBAL_STAT_XACT_USER_FUNCTIONS
- STAT_BAD_BLOCK
- SUMMARY_STAT_BAD_BLOCK
- GLOBAL_STAT_BAD_BLOCK
- STAT_USER_FUNCTIONS
- SUMMARY_STAT_USER_FUNCTIONS
- GLOBAL_STAT_USER_FUNCTIONS
- Workload
- Session/Thread
- SESSION_STAT
- GLOBAL_SESSION_STAT
- SESSION_TIME
- GLOBAL_SESSION_TIME
- SESSION_MEMORY
- GLOBAL_SESSION_MEMORY
- SESSION_MEMORY_DETAIL
- GLOBAL_SESSION_MEMORY_DETAIL
- SESSION_STAT_ACTIVITY
- GLOBAL_SESSION_STAT_ACTIVITY
- THREAD_WAIT_STATUS
- GLOBAL_THREAD_WAIT_STATUS
- LOCAL_THREADPOOL_STATUS
- GLOBAL_THREADPOOL_STATUS
- SESSION_CPU_RUNTIME
- SESSION_MEMORY_RUNTIME
- STATEMENT_IOSTAT_COMPLEX_RUNTIME
- LOCAL_ACTIVE_SESSION
- Transaction
- Query
- STATEMENT
- SUMMARY_STATEMENT
- STATEMENT_COUNT
- GLOBAL_STATEMENT_COUNT
- SUMMARY_STATEMENT_COUNT
- GLOBAL_STATEMENT_COMPLEX_HISTORY
- GLOBAL_STATEMENT_COMPLEX_HISTORY_TABLE
- GLOBAL_STATEMENT_COMPLEX_RUNTIME
- STATEMENT_RESPONSETIME_PERCENTILE
- STATEMENT_COMPLEX_RUNTIME
- STATEMENT_COMPLEX_HISTORY_TABLE
- STATEMENT_COMPLEX_HISTORY
- STATEMENT_WLMSTAT_COMPLEX_RUNTIME
- STATEMENT_HISTORY
- Cache/IO
- STATIO_USER_TABLES
- SUMMARY_STATIO_USER_TABLES
- GLOBAL_STATIO_USER_TABLES
- STATIO_USER_INDEXES
- SUMMARY_STATIO_USER_INDEXES
- GLOBAL_STATIO_USER_INDEXES
- STATIO_USER_SEQUENCES
- SUMMARY_STATIO_USER_SEQUENCES
- GLOBAL_STATIO_USER_SEQUENCES
- STATIO_SYS_TABLES
- SUMMARY_STATIO_SYS_TABLES
- GLOBAL_STATIO_SYS_TABLES
- STATIO_SYS_INDEXES
- SUMMARY_STATIO_SYS_INDEXES
- GLOBAL_STATIO_SYS_INDEXES
- STATIO_SYS_SEQUENCES
- SUMMARY_STATIO_SYS_SEQUENCES
- GLOBAL_STATIO_SYS_SEQUENCES
- STATIO_ALL_TABLES
- SUMMARY_STATIO_ALL_TABLES
- GLOBAL_STATIO_ALL_TABLES
- STATIO_ALL_INDEXES
- SUMMARY_STATIO_ALL_INDEXES
- GLOBAL_STATIO_ALL_INDEXES
- STATIO_ALL_SEQUENCES
- SUMMARY_STATIO_ALL_SEQUENCES
- GLOBAL_STATIO_ALL_SEQUENCES
- GLOBAL_STAT_DB_CU
- GLOBAL_STAT_SESSION_CU
- Utility
- REPLICATION_STAT
- GLOBAL_REPLICATION_STAT
- REPLICATION_SLOTS
- GLOBAL_REPLICATION_SLOTS
- BGWRITER_STAT
- GLOBAL_BGWRITER_STAT
- GLOBAL_CKPT_STATUS
- GLOBAL_DOUBLE_WRITE_STATUS
- GLOBAL_PAGEWRITER_STATUS
- GLOBAL_RECORD_RESET_TIME
- GLOBAL_REDO_STATUS
- GLOBAL_RECOVERY_STATUS
- CLASS_VITAL_INFO
- USER_LOGIN
- SUMMARY_USER_LOGIN
- GLOBAL_GET_BGWRITER_STATUS
- GLOBAL_SINGLE_FLUSH_DW_STATUS
- GLOBAL_CANDIDATE_STATUS
- Lock
- Wait Events
- Configuration
- Operator
- Workload Manager
- Global Plancache
- RTO
- DBE_PLDEBUGGER Schema
- DBE_PLDEBUGGER.turn_on
- DBE_PLDEBUGGER.turn_off
- DBE_PLDEBUGGER.local_debug_server_info
- DBE_PLDEBUGGER.attach
- DBE_PLDEBUGGER.info_locals
- DBE_PLDEBUGGER.next
- DBE_PLDEBUGGER.continue
- DBE_PLDEBUGGER.abort
- DBE_PLDEBUGGER.print_var
- DBE_PLDEBUGGER.info_code
- DBE_PLDEBUGGER.step
- DBE_PLDEBUGGER.add_breakpoint
- DBE_PLDEBUGGER.delete_breakpoint
- DBE_PLDEBUGGER.info_breakpoints
- DBE_PLDEBUGGER.backtrace
- DBE_PLDEBUGGER.disable_breakpoint
- DBE_PLDEBUGGER.enable_breakpoint
- DBE_PLDEBUGGER.finish
- DBE_PLDEBUGGER.set_var
- DB4AI Schema
- DBE_PLDEVELOPER
- DBE_SQL_UTIL Schema
- Tool Reference
- Tool Overview
- Client Tool
- Server Tools
- Tools Used in the Internal System
- dsscmd
- dssserver
- mogdb
- gs_backup
- gs_basebackup
- gs_ctl
- gs_initdb
- gs_install
- gs_postuninstall
- gs_preinstall
- gs_sshexkey
- gs_tar
- gs_uninstall
- gs_upgradectl
- gs_expansion
- gs_dropnode
- gs_probackup
- gstrace
- kdb5_util
- kadmin.local
- kinit
- klist
- krb5kdc
- kdestroy
- pg_config
- pg_controldata
- pg_recvlogical
- pg_resetxlog
- pg_archivecleanup
- pssh
- pscp
- transfer.py
- FAQ
- Functions of MogDB Executable Scripts
- System Catalogs and Views Supported by gs_collector
- Error Code Reference
- Description of SQL Error Codes
- Third-Party Library Error Codes
- GAUSS-00001 - GAUSS-00100
- GAUSS-00101 - GAUSS-00200
- GAUSS 00201 - GAUSS 00300
- GAUSS 00301 - GAUSS 00400
- GAUSS 00401 - GAUSS 00500
- GAUSS 00501 - GAUSS 00600
- GAUSS 00601 - GAUSS 00700
- GAUSS 00701 - GAUSS 00800
- GAUSS 00801 - GAUSS 00900
- GAUSS 00901 - GAUSS 01000
- GAUSS 01001 - GAUSS 01100
- GAUSS 01101 - GAUSS 01200
- GAUSS 01201 - GAUSS 01300
- GAUSS 01301 - GAUSS 01400
- GAUSS 01401 - GAUSS 01500
- GAUSS 01501 - GAUSS 01600
- GAUSS 01601 - GAUSS 01700
- GAUSS 01701 - GAUSS 01800
- GAUSS 01801 - GAUSS 01900
- GAUSS 01901 - GAUSS 02000
- GAUSS 02001 - GAUSS 02100
- GAUSS 02101 - GAUSS 02200
- GAUSS 02201 - GAUSS 02300
- GAUSS 02301 - GAUSS 02400
- GAUSS 02401 - GAUSS 02500
- GAUSS 02501 - GAUSS 02600
- GAUSS 02601 - GAUSS 02700
- GAUSS 02701 - GAUSS 02800
- GAUSS 02801 - GAUSS 02900
- GAUSS 02901 - GAUSS 03000
- GAUSS 03001 - GAUSS 03100
- GAUSS 03101 - GAUSS 03200
- GAUSS 03201 - GAUSS 03300
- GAUSS 03301 - GAUSS 03400
- GAUSS 03401 - GAUSS 03500
- GAUSS 03501 - GAUSS 03600
- GAUSS 03601 - GAUSS 03700
- GAUSS 03701 - GAUSS 03800
- GAUSS 03801 - GAUSS 03900
- GAUSS 03901 - GAUSS 04000
- GAUSS 04001 - GAUSS 04100
- GAUSS 04101 - GAUSS 04200
- GAUSS 04201 - GAUSS 04300
- GAUSS 04301 - GAUSS 04400
- GAUSS 04401 - GAUSS 04500
- GAUSS 04501 - GAUSS 04600
- GAUSS 04601 - GAUSS 04700
- GAUSS 04701 - GAUSS 04800
- GAUSS 04801 - GAUSS 04900
- GAUSS 04901 - GAUSS 05000
- GAUSS 05001 - GAUSS 05100
- GAUSS 05101 - GAUSS 05200
- GAUSS 05201 - GAUSS 05300
- GAUSS 05301 - GAUSS 05400
- GAUSS 05401 - GAUSS 05500
- GAUSS 05501 - GAUSS 05600
- GAUSS 05601 - GAUSS 05700
- GAUSS 05701 - GAUSS 05800
- GAUSS 05801 - GAUSS 05900
- GAUSS 05901 - GAUSS 06000
- GAUSS 06001 - GAUSS 06100
- GAUSS 06101 - GAUSS 06200
- GAUSS 06201 - GAUSS 06300
- GAUSS 06301 - GAUSS 06400
- GAUSS 06401 - GAUSS 06500
- GAUSS 06501 - GAUSS 06600
- GAUSS 06601 - GAUSS 06700
- GAUSS 06701 - GAUSS 06800
- GAUSS 06801 - GAUSS 06900
- GAUSS 06901 - GAUSS 07000
- GAUSS 07001 - GAUSS 07100
- GAUSS 07101 - GAUSS 07200
- GAUSS 07201 - GAUSS 07300
- GAUSS 07301 - GAUSS 07400
- GAUSS 07401 - GAUSS 07500
- GAUSS 50000 - GAUSS 50999
- GAUSS 51000 - GAUSS 51999
- GAUSS 52000 - GAUSS 52999
- GAUSS 53000 - GAUSS 53699
- Error Log Reference
- System Catalogs and System Views
- Common Faults and Identification
- Common Fault Locating Methods
- Common Fault Locating Cases
- Core Fault Locating
- Permission/Session/Data Type Fault Location
- Service/High Availability/Concurrency Fault Location
- Standby Node in the Need Repair (WAL) State
- Service Startup Failure
- Primary Node Is Hung in Demoting During a Switchover
- "too many clients already" Is Reported or Threads Failed To Be Created in High Concurrency Scenarios
- Performance Deterioration Caused by Dirty Page Flushing Efficiency During TPCC High Concurrentcy Long Term Stable Running
- A Query Error Is Reported Due to Predicate Pushdown
- Table/Partition Table Fault Location
- File System/Disk/Memory Fault Location
- After You Run the du Command to Query Data File Size In the XFS File System, the Query Result Is Greater than the Actual File Size
- File Is Damaged in the XFS File System
- Insufficient Memory
- "Error:No space left on device" Is Displayed
- When the TPC-C is running and a disk to be injected is full, the TPC-C stops responding
- Disk Space Usage Reaches the Threshold and the Database Becomes Read-only
- Shared Memory Leakage
- SQL Fault Location
- Index Fault Location
- CM Two-Node Fault Location
- Source Code Parsing
- FAQs
- Glossary
- Communication Matrix
- Mogeaver
DB4AI-Query for Model Training and Prediction
The current version of MogDB supports the native DB4AI capability. By introducing native AI operators, MogDB simplifies the operation process and fully utilizes the optimization and execution capabilities of the database optimizer and executor to obtain the high-performance model training capability in the database. With a simpler model training and prediction process and higher performance, developers can focus on model tuning and data analysis in a shorter period of time, avoiding fragmented technology stacks and redundant code implementation.
Keyword Parsing
Table 1 DB4AI syntax and keywords
Name | Description |
---|---|
CREATE MODEL | Creates a model, trains it, and saves the model. |
PREDICT BY | Uses an existing model for prediction. |
TARGET | Target column name of a training or prediction task. |
FEATURES | Data feature column name of a training or prediction task. |
MODEL | Model name of a training task. |
Developer Guide
-
Introduce the algorithms supported in this version.
DB4AI of the current version supports logistic regression (binary classification tasks), linear regression, and vector machine algorithms (classification tasks) based on the SGD operator, as well as the K-Means clustering algorithm based on the K-Means operator.
-
Learn about the model training syntax.
-
CREATE MODEL
You can run the CREATE MODEL statement to create and train a model. Taking dataset kmeans_2d as an example, the data content of the table is as follows:
MogDB=# select * from kmeans_2d; id | position ----+------------------------------------- 1 | {74.5268815685995,88.2141939294524} 2 | {70.9565760521218,98.8114827475511} 3 | {76.2756086327136,23.8387574302033} 4 | {17.8495847294107,81.8449544720352} 5 | {81.2175785354339,57.1677675866522} 6 | {53.97752255667,49.3158342130482} 7 | {93.2475341879763,86.934042100329} 8 | {72.7659293473698,19.7020415100269} 9 | {16.5800288529135,75.7475957670249} 10 | {81.8520747194998,40.3476078575477} 11 | {76.796671198681,86.3827232690528} 12 | {59.9231450678781,90.9907738864422} 13 | {70.161884885747,19.7427458665334} 14 | {11.1269539105706,70.9988166182302} 15 | {80.5005071521737,65.2822235273197} 16 | {54.7030725912191,52.151339428965} 17 | {103.059707058128,80.8419883321039} 18 | {85.3574452036992,14.9910179991275} 19 | {28.6501615960151,76.6922890325077} 20 | {69.7285806713626,49.5416352967732} (20 rows)
The data type of the position field in this table is double precision[].
-
The following uses K-Means as an example to describe how to train a model. Specify position as a feature column in the kmeans_2d training set, and use the K-Means algorithm to create and save the point_kmeans model.
MogDB=# CREATE MODEL point_kmeans USING kmeans FEATURES position FROM kmeans_2d WITH num_centroids=3; NOTICE: Hyperparameter max_iterations takes value DEFAULT (10) NOTICE: Hyperparameter num_centroids takes value 3 NOTICE: Hyperparameter tolerance takes value DEFAULT (0.000010) NOTICE: Hyperparameter batch_size takes value DEFAULT (10) NOTICE: Hyperparameter num_features takes value DEFAULT (2) NOTICE: Hyperparameter distance_function takes value DEFAULT (L2_Squared) NOTICE: Hyperparameter seeding_function takes value DEFAULT (Random++) NOTICE: Hyperparameter verbose takes value DEFAULT (0) NOTICE: Hyperparameter seed takes value DEFAULT (0) MODEL CREATED. PROCESSED 1
In the preceding command:
-
The CREATE MODEL statement is used to train and save a model.
-
USING specifies the algorithm name.
-
FEATURES specifies the features of the training model and needs to be added based on the column name of the training data table.
-
TARGET specifies the training target of the model. It can be the column name of the data table required for training or an expression, for example, price > 10000.
-
WITH specifies the hyperparameters used for model training. When the hyperparameter is not set by the user, the framework uses the default value.
The framework supports various hyperparameter combinations for different operators.
Table 2 Hyperparameters supported by operators
Operator Hyperparameter GD(logistic_regression, linear_regression, and svm_classification) optimizer(char*), verbose(bool), max_iterations(int), max_seconds(double), batch_size(int), learning_rate(double), decay(double), and tolerance(double)SVM limits the hyperparameter lambda(double). K-Means max_iterations(int), num_centroids(int), tolerance(double), batch_size(int), num_features(int), distance_function(char*), seeding_function(char*), verbose(int), and seed(int) The default value and value range of each hyperparameter are as follows:
Table 3 Default values and value ranges of hyperparameters
Operator Default Hyperparameter Value Value Range Hyperparameter Description GD (logistic_regression, linear_regression, and svm_classification) optimizer = gd (gradient descent) gd or ngd (natural gradient descent) Optimizer verbose = false T or F Log display max_iterations = 100 (0,INT_MAX_VALUE] Maximum iterations max_seconds = 0 (The running duration is not limited.) [0,INT_MAX_VALUE] Running duration batch_size = 1000 (0,MAX_MEMORY_LIMIT] Number of data records selected per training learning_rate = 0.8 (0,DOUBLE_MAX_VALUE] Learning rate decay = 0.95 (0,DOUBLE_MAX_VALUE] Weight decay rate tolerance = 0.0005 (0,DOUBLE_MAX_VALUE] Tolerance seed = 0 (random value of seed) [0,INT_MAX_VALUE] Seed just for SVM:lambda = 0.01 (0,DOUBLE_MAX_VALUE) Regularization parameter K-Means max_iterations = 10 [1,INT_MAX_VALUE] Maximum iterations num_centroids = 10 [1,MAX_MEMORY_LIMIT] Number of clusters tolerance = 0.00001 (0,1) Central point error batch_size = 10 [1,MAX_MEMORY_LIMIT] Number of data records selected per training num_features = 2 [1,GS_MAX_COLS] Number of sample features distance_function = "L2_Squared" L1, L2, L2_Squared, or Linf Regularization method seeding_function = "Random++" "Random++" or "KMeans||" Method for initializing seed points verbose = 0U {0,1,2} Verbose mode seed = 0U [0,INT_MAX_VALUE] Seed MAX_MEMORY_LIMIT = Maximum number of tuples loaded in memory GS_MAX_COLS = Maximum number of attributes in a database table
-
-
If the model is saved successfully, the following information is returned:
MODEL CREATED. PROCESSED x
-
-
View the model information.
After the training is complete, the model is stored in the gs_model_warehouse system catalog. You can view information about the model and training process in the gs_model_warehouse system catalog.
You can view a model by viewing the system catalog. For example, run the following SQL statement to view the model named point_kmeans:
MogDB=# select * from gs_model_warehouse where modelname='point_kmeans'; -[ RECORD 1 ]---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- modelname | point_kmeans modelowner | 10 createtime | 2021-04-30 17:30:39.59044 processedtuples | 20 discardedtuples | 0 pre_process_time | 6.2001e-05 exec_time | .000185272 iterations | 5 outputtype | 23 modeltype | kmeans query | CREATE MODEL point_kmeans USING kmeans FEATURES position FROM kmeans_2d WITH num_centroids=3; modeldata | weight | hyperparametersnames | {max_iterations,num_centroids,tolerance,batch_size,num_features,distance_function,seeding_function,verbose,seed} hyperparametersvalues | {10,3,1e-05,10,2,L2_Squared,Random++,0,0} hyperparametersoids | {23,23,701,23,23,1043,1043,23,23} coefnames | {original_num_centroids,actual_num_centroids,dimension,distance_function_id,seed,coordinates} coefvalues | {3,3,2,2,572368998,"(77.282589,23.724434)(74.421616,73.239455)(18.551682,76.320914)"} coefoids | trainingscoresname | trainingscoresvalue | modeldescribe | {"id:1,objective_function:542.851169,avg_distance_to_centroid:108.570234,min_distance_to_centroid:1.027078,max_distance_to_centroid:297.210108,std_dev_distance_to_centroid:105.053257,cluster_size:5","id:2,objective_function:5825.982139,avg_distance_to_centroid:529.634740,min_distance_to_centroid:100.270449,max_distance_to_centroid:990.300588,std_dev_distance_to_centroid:285.915094,cluster_size:11","id:3,objective_function:220.792591,avg_distance_to_centroid:55.198148,min_distance_to_centroid:4.216111,max_distance_to_centroid:102.117204,std_dev_distance_to_centroid:39.319118,cluster_size:4"}
-
Use an existing model to perform a prediction task.
Use the SELECT and PREDICT BY keywords to complete the prediction task based on the existing model.
Query syntax: SELECT… PREDICT BY… (FEATURES…)… FROM…;
MogDB=# SELECT id, PREDICT BY point_kmeans (FEATURES position) as pos FROM (select * from kmeans_2d limit 10); id | pos ----+----- 1 | 2 2 | 2 3 | 1 4 | 3 5 | 2 6 | 2 7 | 2 8 | 1 9 | 3 10 | 1 (10 rows)
For the same prediction task, the results of the same model are stable. In addition, models trained based on the same hyperparameter and training set are stable. AI model training is random (random gradient descent of data distribution each batch). Therefore, the computing performance and results of different models can vary slightly.
-
View the execution plan.
You can use the EXPLAIN statement to analyze the execution plan in the model training or prediction process of CREATE MODEL and PREDICT BY. The keyword EXPLAIN can be followed by a CREATE MODEL or PREDICT BY clause or an optional parameter. The supported parameters are as follows:
Table 4 Parameters supported by EXPLAIN
Parameter Description ANALYZE Boolean variable, which is used to add description information such as the running time and number of loop times VERBOSE Boolean variable, which determines whether to output the training running information to the client COSTS Boolean variable CPU Boolean variable DETAIL Boolean variable, which is available only in distributed mode NODES Boolean variable, which is available only in distributed mode NUM_NODES Boolean variable, which is available only in distributed mode BUFFERS Boolean variable TIMING Boolean variable PLAN Boolean variable FORMAT Optional format type: TEXT, XML, JSON, and YAML Example:
MogDB=# Explain CREATE MODEL patient_logisitic_regression USING logistic_regression FEATURES second_attack, treatment TARGET trait_anxiety > 50 FROM patients WITH batch_size=10, learning_rate = 0.05; NOTICE: Hyperparameter batch_size takes value 10 NOTICE: Hyperparameter decay takes value DEFAULT (0.950000) NOTICE: Hyperparameter learning_rate takes value 0.050000 NOTICE: Hyperparameter max_iterations takes value DEFAULT (100) NOTICE: Hyperparameter max_seconds takes value DEFAULT (0) NOTICE: Hyperparameter optimizer takes value DEFAULT (gd) NOTICE: Hyperparameter tolerance takes value DEFAULT (0.000500) NOTICE: Hyperparameter seed takes value DEFAULT (0) NOTICE: Hyperparameter verbose takes value DEFAULT (FALSE) NOTICE: GD shuffle cache size 212369 QUERY PLAN ------------------------------------------------------------------- Gradient Descent (cost=0.00..0.00 rows=0 width=0) -> Seq Scan on patients (cost=0.00..32.20 rows=1776 width=12) (2 rows)
-
Perform troubleshooting in case of exceptions.
-
Training phase
-
Scenario 1: When the value of the hyperparameter exceeds the value range, the model training fails and an error message is returned. For example:
MogDB=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment TARGET trait_anxiety FROM patients WITH optimizer='aa'; NOTICE: Hyperparameter batch_size takes value DEFAULT (1000) NOTICE: Hyperparameter decay takes value DEFAULT (0.950000) NOTICE: Hyperparameter learning_rate takes value DEFAULT (0.800000) NOTICE: Hyperparameter max_iterations takes value DEFAULT (100) NOTICE: Hyperparameter max_seconds takes value DEFAULT (0) NOTICE: Hyperparameter optimizer takes value aa ERROR: Invalid hyperparameter value for optimizer. Valid values are: gd, ngd. (default is gd)
-
Scenario 2: If the model name already exists, the model fails to be saved, and an error message with the cause is displayed:
MogDB=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment TARGET trait_anxiety FROM patients; NOTICE: Hyperparameter batch_size takes value DEFAULT (1000) NOTICE: Hyperparameter decay takes value DEFAULT (0.950000) NOTICE: Hyperparameter learning_rate takes value DEFAULT (0.800000) NOTICE: Hyperparameter max_iterations takes value DEFAULT (100) NOTICE: Hyperparameter max_seconds takes value DEFAULT (0) NOTICE: Hyperparameter optimizer takes value DEFAULT (gd) NOTICE: Hyperparameter tolerance takes value DEFAULT (0.000500) NOTICE: Hyperparameter seed takes value DEFAULT (0) NOTICE: Hyperparameter verbose takes value DEFAULT (FALSE) NOTICE: GD shuffle cache size 5502 ERROR: The model name "patient_linear_regression" already exists in gs_model_warehouse.
-
Scenario 3: If the value in the FEATURE or TARGETS column is *, ERROR is returned with the error cause:
MogDB=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES * TARGET trait_anxiety FROM patients; ERROR: FEATURES clause cannot be * ----------------------------------------------------------------------------------------------------------------------- MogDB=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment TARGET * FROM patients; ERROR: TARGET clause cannot be *
-
Scenario 4: If the keyword TARGET is used in the unsupervised learning method or is not applicable to the supervised learning method, ERROR is returned with the error cause:
MogDB=# CREATE MODEL patient_linear_regression USING linear_regression FEATURES second_attack,treatment FROM patients; ERROR: Supervised ML algorithms require TARGET clause ----------------------------------------------------------------------------------------------------------------------------- CREATE MODEL patient_linear_regression USING linear_regression TARGET trait_anxiety FROM patients; ERROR: Supervised ML algorithms require FEATURES clause
-
Scenario 5: If the GUC parameter statement_timeout is set, the statement that is executed due to training timeout will be terminated. In this case, execute the CREATE MODEL statement. Parameters such as the size of the training set, number of training rounds (iteration), early termination conditions (tolerance and max_seconds), and number of parallel threads (nthread) affect the training duration. When the duration exceeds the database limit, the statement execution is terminated and model training fails.
-
-
Prediction phase
-
Scenario 6: If the model name cannot be found in the system catalog, the database reports ERROR:
MogDB=# select id, PREDICT BY patient_logistic_regression (FEATURES second_attack,treatment) FROM patients; ERROR: There is no model called "patient_logistic_regression".
-
Scenario 7: If the data dimension and data type of the FEATURES task are inconsistent with those of the training set, ERROR is reported and the error cause is displayed. For example:
MogDB=# select id, PREDICT BY patient_linear_regression (FEATURES second_attack) FROM patients; ERROR: Invalid number of features for prediction, provided 1, expected 2 CONTEXT: referenced column: patient_linear_regression_pred ------------------------------------------------------------------------------------------------------------------------------------- MogDB=# select id, PREDICT BY patient_linear_regression (FEATURES 1,second_attack,treatment) FROM patients; ERROR: Invalid number of features for prediction, provided 3, expected 2 CONTEXT: referenced column: patient_linear_regression_pre
-
-