- About MogDB
- Quick Start
- MogDB Playground
- Container-based MogDB Installation
- Installation on a Single Node
- MogDB Access
- Use CLI to Access MogDB
- Use GUI to Access MogDB
- Use Middleware to Access MogDB
- Use Programming Language to Access MogDB
- Using Sample Dataset Mogila
- Characteristic Description
- High Performance
- High Availability (HA)
- Maintainability
- Database Security
- Access Control Model
- Separation of Control and Access Permissions
- Database Encryption Authentication
- Data Encryption and Storage
- Database Audit
- Network Communication Security
- Resource Label
- Unified Audit
- Dynamic Data Anonymization
- Row-Level Access Control
- Password Strength Verification
- Equality Query in a Fully-encrypted Database
- Ledger Database Mechanism
- Enterprise-Level Features
- Support for Functions and Stored Procedures
- SQL Hints
- Full-Text Indexing
- Copy Interface for Error Tolerance
- Partitioning
- Support for Advanced Analysis Functions
- Materialized View
- HyperLogLog
- Creating an Index Online
- Autonomous Transaction
- Global Temporary Table
- Pseudocolumn ROWNUM
- Stored Procedure Debugging
- JDBC Client Load Balancing and Read/Write Isolation
- In-place Update Storage Engine
- Application Development Interfaces
- AI Capabilities
- Installation Guide
- Container Installation
- Simplified Installation Process
- Standard Installation
- Manual Installation
- Administrator Guide
- Routine Maintenance
- Starting and Stopping MogDB
- Using the gsql Client for Connection
- Routine Maintenance
- Checking OS Parameters
- Checking MogDB Health Status
- Checking Database Performance
- Checking and Deleting Logs
- Checking Time Consistency
- Checking The Number of Application Connections
- Routinely Maintaining Tables
- Routinely Recreating an Index
- Data Security Maintenance Suggestions
- Log Reference
- Primary and Standby Management
- MOT Engine
- Introducing MOT
- Using MOT
- Concepts of MOT
- Appendix
- Column-store Tables Management
- Backup and Restoration
- Importing and Exporting Data
- Importing Data
- Exporting Data
- Upgrade Guide
- Routine Maintenance
- AI Features Guide
- Overview
- Predictor: AI Query Time Forecasting
- X-Tuner: Parameter Optimization and Diagnosis
- SQLdiag: Slow SQL Discovery
- A-Detection: Status Monitoring
- Index-advisor: Index Recommendation
- DeepSQL
- AI-Native Database (DB4AI)
- Security Guide
- Developer Guide
- Application Development Guide
- Development Specifications
- Development Based on JDBC
- Overview
- JDBC Package, Driver Class, and Environment Class
- Development Process
- Loading the Driver
- Connecting to a Database
- Connecting to the Database (Using SSL)
- Running SQL Statements
- Processing Data in a Result Set
- Closing a Connection
- Managing Logs
- Example: Common Operations
- Example: Retrying SQL Queries for Applications
- Example: Importing and Exporting Data Through Local Files
- Example 2: Migrating Data from a MY Database to MogDB
- Example: Logic Replication Code
- Example: Parameters for Connecting to the Database in Different Scenarios
- JDBC API Reference
- java.sql.Connection
- java.sql.CallableStatement
- java.sql.DatabaseMetaData
- java.sql.Driver
- java.sql.PreparedStatement
- java.sql.ResultSet
- java.sql.ResultSetMetaData
- java.sql.Statement
- javax.sql.ConnectionPoolDataSource
- javax.sql.DataSource
- javax.sql.PooledConnection
- javax.naming.Context
- javax.naming.spi.InitialContextFactory
- CopyManager
- Development Based on ODBC
- Development Based on libpq
- Development Based on libpq
- libpq API Reference
- Database Connection Control Functions
- Database Statement Execution Functions
- Functions for Asynchronous Command Processing
- Functions for Canceling Queries in Progress
- Example
- Connection Characters
- Psycopg-Based Development
- Commissioning
- Appendices
- Stored Procedure
- User Defined Functions
- PL/pgSQL-SQL Procedural Language
- Scheduled Jobs
- Autonomous Transaction
- Logical Replication
- Logical Decoding
- Foreign Data Wrapper
- Materialized View
- Materialized View Overview
- Full Materialized View
- Incremental Materialized View
- Resource Load Management
- Overview
- Resource Management Preparation
- Application Development Guide
- Performance Tuning Guide
- System Optimization
- SQL Optimization
- WDR Snapshot Schema
- TPCC Performance Tuning Guide
- Reference Guide
- System Catalogs and System Views
- Overview of System Catalogs and System Views
- System Catalogs
- GS_AUDITING_POLICY
- GS_AUDITING_POLICY_ACCESS
- GS_AUDITING_POLICY_FILTERS
- GS_AUDITING_POLICY_PRIVILEGES
- GS_CLIENT_GLOBAL_KEYS
- GS_CLIENT_GLOBAL_KEYS_ARGS
- GS_COLUMN_KEYS
- GS_COLUMN_KEYS_ARGS
- GS_ENCRYPTED_COLUMNS
- GS_ENCRYPTED_PROC
- GS_GLOBAL_CHAIN
- GS_MASKING_POLICY
- GS_MASKING_POLICY_ACTIONS
- GS_MASKING_POLICY_FILTERS
- GS_MATVIEW
- GS_MATVIEW_DEPENDENCY
- GS_OPT_MODEL
- GS_POLICY_LABEL
- GS_RECYCLEBIN
- GS_TXN_SNAPSHOT
- GS_WLM_INSTANCE_HISTORY
- GS_WLM_OPERATOR_INFO
- GS_WLM_PLAN_ENCODING_TABLE
- GS_WLM_PLAN_OPERATOR_INFO
- GS_WLM_EC_OPERATOR_INFO
- PG_AGGREGATE
- PG_AM
- PG_AMOP
- PG_AMPROC
- PG_APP_WORKLOADGROUP_MAPPING
- PG_ATTRDEF
- PG_ATTRIBUTE
- PG_AUTHID
- PG_AUTH_HISTORY
- PG_AUTH_MEMBERS
- PG_CAST
- PG_CLASS
- PG_COLLATION
- PG_CONSTRAINT
- PG_CONVERSION
- PG_DATABASE
- PG_DB_ROLE_SETTING
- PG_DEFAULT_ACL
- PG_DEPEND
- PG_DESCRIPTION
- PG_DIRECTORY
- PG_ENUM
- PG_EXTENSION
- PG_EXTENSION_DATA_SOURCE
- PG_FOREIGN_DATA_WRAPPER
- PG_FOREIGN_SERVER
- PG_FOREIGN_TABLE
- PG_INDEX
- PG_INHERITS
- PG_JOB
- PG_JOB_PROC
- PG_LANGUAGE
- PG_LARGEOBJECT
- PG_LARGEOBJECT_METADATA
- PG_NAMESPACE
- PG_OBJECT
- PG_OPCLASS
- PG_OPERATOR
- PG_OPFAMILY
- PG_PARTITION
- PG_PLTEMPLATE
- PG_PROC
- PG_RANGE
- PG_RESOURCE_POOL
- PG_REWRITE
- PG_RLSPOLICY
- PG_SECLABEL
- PG_SHDEPEND
- PG_SHDESCRIPTION
- PG_SHSECLABEL
- PG_STATISTIC
- PG_STATISTIC_EXT
- PG_SYNONYM
- PG_TABLESPACE
- PG_TRIGGER
- PG_TS_CONFIG
- PG_TS_CONFIG_MAP
- PG_TS_DICT
- PG_TS_PARSER
- PG_TS_TEMPLATE
- PG_TYPE
- PG_USER_MAPPING
- PG_USER_STATUS
- PG_WORKLOAD_GROUP
- PLAN_TABLE_DATA
- STATEMENT_HISTORY
- System Views
- GET_GLOBAL_PREPARED_XACTS
- GS_AUDITING
- GS_AUDITING_ACCESS
- GS_AUDITING_PRIVILEGE
- GS_CLUSTER_RESOURCE_INFO
- GS_INSTANCE_TIME
- GS_LABELS
- GS_MASKING
- GS_MATVIEWS
- GS_SESSION_MEMORY
- GS_SESSION_CPU_STATISTICS
- GS_SESSION_MEMORY_CONTEXT
- GS_SESSION_MEMORY_DETAIL
- GS_SESSION_MEMORY_STATISTICS
- GS_SQL_COUNT
- GS_WLM_CGROUP_INFO
- GS_WLM_PLAN_OPERATOR_HISTORY
- GS_WLM_REBUILD_USER_RESOURCE_POOL
- GS_WLM_RESOURCE_POOL
- GS_WLM_USER_INFO
- GS_STAT_SESSION_CU
- GS_TOTAL_MEMORY_DETAIL
- MPP_TABLES
- PG_AVAILABLE_EXTENSION_VERSIONS
- PG_AVAILABLE_EXTENSIONS
- PG_COMM_DELAY
- PG_COMM_RECV_STREAM
- PG_COMM_SEND_STREAM
- PG_COMM_STATUS
- PG_CONTROL_GROUP_CONFIG
- PG_CURSORS
- PG_EXT_STATS
- PG_GET_INVALID_BACKENDS
- PG_GET_SENDERS_CATCHUP_TIME
- PG_GROUP
- PG_GTT_RELSTATS
- PG_GTT_STATS
- PG_GTT_ATTACHED_PIDS
- PG_INDEXES
- PG_LOCKS
- PG_NODE_ENV
- PG_OS_THREADS
- PG_PREPARED_STATEMENTS
- PG_PREPARED_XACTS
- PG_REPLICATION_SLOTS
- PG_RLSPOLICIES
- PG_ROLES
- PG_RULES
- PG_SECLABELS
- PG_SETTINGS
- PG_SHADOW
- PG_STATS
- PG_STAT_ACTIVITY
- PG_STAT_ALL_INDEXES
- PG_STAT_ALL_TABLES
- PG_STAT_BAD_BLOCK
- PG_STAT_BGWRITER
- PG_STAT_DATABASE
- PG_STAT_DATABASE_CONFLICTS
- PG_STAT_USER_FUNCTIONS
- PG_STAT_USER_INDEXES
- PG_STAT_USER_TABLES
- PG_STAT_REPLICATION
- PG_STAT_SYS_INDEXES
- PG_STAT_SYS_TABLES
- PG_STAT_XACT_ALL_TABLES
- PG_STAT_XACT_SYS_TABLES
- PG_STAT_XACT_USER_FUNCTIONS
- PG_STAT_XACT_USER_TABLES
- PG_STATIO_ALL_INDEXES
- PG_STATIO_ALL_SEQUENCES
- PG_STATIO_ALL_TABLES
- PG_STATIO_SYS_INDEXES
- PG_STATIO_SYS_SEQUENCES
- PG_STATIO_SYS_TABLES
- PG_STATIO_USER_INDEXES
- PG_STATIO_USER_SEQUENCES
- PG_STATIO_USER_TABLES
- PG_TABLES
- PG_TDE_INFO
- PG_THREAD_WAIT_STATUS
- PG_TIMEZONE_ABBREVS
- PG_TIMEZONE_NAMES
- PG_TOTAL_MEMORY_DETAIL
- PG_TOTAL_USER_RESOURCE_INFO
- PG_TOTAL_USER_RESOURCE_INFO_OID
- PG_USER
- PG_USER_MAPPINGS
- PG_VARIABLE_INFO
- PG_VIEWS
- PLAN_TABLE
- GS_FILE_STAT
- GS_OS_RUN_INFO
- GS_REDO_STAT
- GS_SESSION_STAT
- GS_SESSION_TIME
- GS_THREAD_MEMORY_CONTEXT
- Functions and Operators
- Logical Operators
- Comparison Operators
- Character Processing Functions and Operators
- Binary String Functions and Operators
- Bit String Functions and Operators
- Mode Matching Operators
- Mathematical Functions and Operators
- Date and Time Processing Functions and Operators
- Type Conversion Functions
- Geometric Functions and Operators
- Network Address Functions and Operators
- Text Search Functions and Operators
- JSON/JSONB Functions and Operators
- HLL Functions and Operators
- SEQUENCE Functions
- Array Functions and Operators
- Range Functions and Operators
- Aggregate Functions
- Window Functions
- Security Functions
- Ledger Database Functions
- Encrypted Equality Functions
- Set Returning Functions
- Conditional Expression Functions
- System Information Functions
- System Administration Functions
- Configuration Settings Functions
- Universal File Access Functions
- Server Signal Functions
- Backup and Restoration Control Functions
- Snapshot Synchronization Functions
- Database Object Functions
- Advisory Lock Functions
- Logical Replication Functions
- Segment-Page Storage Functions
- Other Functions
- Undo System Functions
- Statistics Information Functions
- Trigger Functions
- Hash Function
- Prompt Message Function
- Global Temporary Table Functions
- Fault Injection System Function
- AI Feature Functions
- Dynamic Data Masking Functions
- Other System Functions
- Internal Functions
- Obsolete Functions
- Supported Data Types
- Numeric Types
- Monetary Types
- Boolean Types
- Enumerated Types
- Character Types
- Binary Types
- Date/Time Types
- Geometric
- Network Address Types
- Bit String Types
- Text Search Types
- UUID
- JSON/JSONB Types
- HLL
- Array Types
- Range
- OID Types
- Pseudo-Types
- Data Types Supported by Column-store Tables
- XML Types
- Data Type Used by the Ledger Database
- SQL Syntax
- ABORT
- ALTER AGGREGATE
- ALTER AUDIT POLICY
- ALTER DATABASE
- ALTER DATA SOURCE
- ALTER DEFAULT PRIVILEGES
- ALTER DIRECTORY
- ALTER EXTENSION
- ALTER FOREIGN TABLE
- ALTER FUNCTION
- ALTER GROUP
- ALTER INDEX
- ALTER LANGUAGE
- ALTER LARGE OBJECT
- ALTER MASKING POLICY
- ALTER MATERIALIZED VIEW
- ALTER OPERATOR
- ALTER RESOURCE LABEL
- ALTER RESOURCE POOL
- ALTER ROLE
- ALTER ROW LEVEL SECURITY POLICY
- ALTER RULE
- ALTER SCHEMA
- ALTER SEQUENCE
- ALTER SERVER
- ALTER SESSION
- ALTER SYNONYM
- ALTER SYSTEM KILL SESSION
- ALTER SYSTEM SET
- ALTER TABLE
- ALTER TABLE PARTITION
- ALTER TABLE SUBPARTITION
- ALTER TABLESPACE
- ALTER TEXT SEARCH CONFIGURATION
- ALTER TEXT SEARCH DICTIONARY
- ALTER TRIGGER
- ALTER TYPE
- ALTER USER
- ALTER USER MAPPING
- ALTER VIEW
- ANALYZE | ANALYSE
- BEGIN
- CALL
- CHECKPOINT
- CLEAN CONNECTION
- CLOSE
- CLUSTER
- COMMENT
- COMMIT | END
- COMMIT PREPARED
- CONNECT BY
- COPY
- CREATE AGGREGATE
- CREATE AUDIT POLICY
- CREATE CAST
- CREATE CLIENT MASTER KEY
- CREATE COLUMN ENCRYPTION KEY
- CREATE DATABASE
- CREATE DATA SOURCE
- CREATE DIRECTORY
- CREATE EXTENSION
- CREATE FOREIGN TABLE
- CREATE FUNCTION
- CREATE GROUP
- CREATE INCREMENTAL MATERIALIZED VIEW
- CREATE INDEX
- CREATE LANGUAGE
- CREATE MASKING POLICY
- CREATE MATERIALIZED VIEW
- CREATE MODEL
- CREATE OPERATOR
- CREATE PACKAGE
- CREATE ROW LEVEL SECURITY POLICY
- CREATE PROCEDURE
- CREATE RESOURCE LABEL
- CREATE RESOURCE POOL
- CREATE ROLE
- CREATE RULE
- CREATE SCHEMA
- CREATE SEQUENCE
- CREATE SERVER
- CREATE SYNONYM
- CREATE TABLE
- CREATE TABLE AS
- CREATE TABLE PARTITION
- CREATE TABLE SUBPARTITION
- CREATE TABLESPACE
- CREATE TEXT SEARCH CONFIGURATION
- CREATE TEXT SEARCH DICTIONARY
- CREATE TRIGGER
- CREATE TYPE
- CREATE USER
- CREATE USER MAPPING
- CREATE VIEW
- CREATE WEAK PASSWORD DICTIONARY
- CURSOR
- DEALLOCATE
- DECLARE
- DELETE
- DO
- DROP AGGREGATE
- DROP AUDIT POLICY
- DROP CAST
- DROP CLIENT MASTER KEY
- DROP COLUMN ENCRYPTION KEY
- DROP DATABASE
- DROP DATA SOURCE
- DROP DIRECTORY
- DROP EXTENSION
- DROP FOREIGN TABLE
- DROP FUNCTION
- DROP GROUP
- DROP INDEX
- DROP LANGUAGE
- DROP MASKING POLICY
- DROP MATERIALIZED VIEW
- DROP MODEL
- DROP OPERATOR
- DROP OWNED
- DROP PACKAGE
- DROP PROCEDURE
- DROP RESOURCE LABEL
- DROP RESOURCE POOL
- DROP ROW LEVEL SECURITY POLICY
- DROP ROLE
- DROP RULE
- DROP SCHEMA
- DROP SEQUENCE
- DROP SERVER
- DROP SYNONYM
- DROP TABLE
- DROP TABLESPACE
- DROP TEXT SEARCH CONFIGURATION
- DROP TEXT SEARCH DICTIONARY
- DROP TRIGGER
- DROP TYPE
- DROP USER
- DROP USER MAPPING
- DROP VIEW
- DROP WEAK PASSWORD DICTIONARY
- EXECUTE
- EXECUTE DIRECT
- EXPLAIN
- EXPLAIN PLAN
- FETCH
- GRANT
- INSERT
- LOCK
- MOVE
- MERGE INTO
- PREDICT BY
- PREPARE
- PREPARE TRANSACTION
- PURGE
- REASSIGN OWNED
- REFRESH INCREMENTAL MATERIALIZED VIEW
- REFRESH MATERIALIZED VIEW
- REINDEX
- RELEASE SAVEPOINT
- RESET
- REVOKE
- ROLLBACK
- ROLLBACK PREPARED
- ROLLBACK TO SAVEPOINT
- SAVEPOINT
- SELECT
- SELECT INTO
- SET
- SET CONSTRAINTS
- SET ROLE
- SET SESSION AUTHORIZATION
- SET TRANSACTION
- SHOW
- SHUTDOWN
- SNAPSHOT
- START TRANSACTION
- TIMECAPSULE TABLE
- TRUNCATE
- UPDATE
- VACUUM
- VALUES
- SQL Reference
- MogDB SQL
- Keywords
- Constant and Macro
- Expressions
- Type Conversion
- Full Text Search
- Introduction
- Tables and Indexes
- Controlling Text Search
- Additional Features
- Parser
- Dictionaries
- Configuration Examples
- Testing and Debugging Text Search
- Limitations
- System Operation
- Controlling Transactions
- DDL Syntax Overview
- DML Syntax Overview
- DCL Syntax Overview
- Appendix
- GUC Parameters
- GUC Parameter Usage
- File Location
- Connection and Authentication
- Resource Consumption
- Parallel Import
- Write Ahead Log
- HA Replication
- Memory Table
- Query Planning
- Error Reporting and Logging
- Alarm Detection
- Statistics During the Database Running
- Load Management
- Automatic Vacuuming
- Default Settings of Client Connection
- Lock Management
- Version and Platform Compatibility
- Faut Tolerance
- Connection Pool Parameters
- MogDB Transaction
- Developer Options
- Auditing
- Upgrade Parameters
- Miscellaneous Parameters
- Wait Events
- Query
- System Performance Snapshot
- Security Configuration
- Global Temporary Table
- HyperLogLog
- Scheduled Task
- Thread Pool
- User-defined Functions
- Backup and Restoration
- Undo
- DCF Parameters Settings
- Flashback
- Rollback Parameters
- Reserved Parameters
- AI Features
- Appendix
- Schema
- Information Schema
- DBE_PERF
- Overview
- OS
- Instance
- Memory
- File
- Object
- STAT_USER_TABLES
- SUMMARY_STAT_USER_TABLES
- GLOBAL_STAT_USER_TABLES
- STAT_USER_INDEXES
- SUMMARY_STAT_USER_INDEXES
- GLOBAL_STAT_USER_INDEXES
- STAT_SYS_TABLES
- SUMMARY_STAT_SYS_TABLES
- GLOBAL_STAT_SYS_TABLES
- STAT_SYS_INDEXES
- SUMMARY_STAT_SYS_INDEXES
- GLOBAL_STAT_SYS_INDEXES
- STAT_ALL_TABLES
- SUMMARY_STAT_ALL_TABLES
- GLOBAL_STAT_ALL_TABLES
- STAT_ALL_INDEXES
- SUMMARY_STAT_ALL_INDEXES
- GLOBAL_STAT_ALL_INDEXES
- STAT_DATABASE
- SUMMARY_STAT_DATABASE
- GLOBAL_STAT_DATABASE
- STAT_DATABASE_CONFLICTS
- SUMMARY_STAT_DATABASE_CONFLICTS
- GLOBAL_STAT_DATABASE_CONFLICTS
- STAT_XACT_ALL_TABLES
- SUMMARY_STAT_XACT_ALL_TABLES
- GLOBAL_STAT_XACT_ALL_TABLES
- STAT_XACT_SYS_TABLES
- SUMMARY_STAT_XACT_SYS_TABLES
- GLOBAL_STAT_XACT_SYS_TABLES
- STAT_XACT_USER_TABLES
- SUMMARY_STAT_XACT_USER_TABLES
- GLOBAL_STAT_XACT_USER_TABLES
- STAT_XACT_USER_FUNCTIONS
- SUMMARY_STAT_XACT_USER_FUNCTIONS
- GLOBAL_STAT_XACT_USER_FUNCTIONS
- STAT_BAD_BLOCK
- SUMMARY_STAT_BAD_BLOCK
- GLOBAL_STAT_BAD_BLOCK
- STAT_USER_FUNCTIONS
- SUMMARY_STAT_USER_FUNCTIONS
- GLOBAL_STAT_USER_FUNCTIONS
- Workload
- Session/Thread
- SESSION_STAT
- GLOBAL_SESSION_STAT
- SESSION_TIME
- GLOBAL_SESSION_TIME
- SESSION_MEMORY
- GLOBAL_SESSION_MEMORY
- SESSION_MEMORY_DETAIL
- GLOBAL_SESSION_MEMORY_DETAIL
- SESSION_STAT_ACTIVITY
- GLOBAL_SESSION_STAT_ACTIVITY
- THREAD_WAIT_STATUS
- GLOBAL_THREAD_WAIT_STATUS
- LOCAL_THREADPOOL_STATUS
- GLOBAL_THREADPOOL_STATUS
- SESSION_CPU_RUNTIME
- SESSION_MEMORY_RUNTIME
- STATEMENT_IOSTAT_COMPLEX_RUNTIME
- LOCAL_ACTIVE_SESSION
- Transaction
- Query
- STATEMENT
- SUMMARY_STATEMENT
- STATEMENT_COUNT
- GLOBAL_STATEMENT_COUNT
- SUMMARY_STATEMENT_COUNT
- GLOBAL_STATEMENT_COMPLEX_HISTORY
- GLOBAL_STATEMENT_COMPLEX_HISTORY_TABLE
- GLOBAL_STATEMENT_COMPLEX_RUNTIME
- STATEMENT_RESPONSETIME_PERCENTILE
- STATEMENT_USER_COMPLEX_HISTORY
- STATEMENT_COMPLEX_RUNTIME
- STATEMENT_COMPLEX_HISTORY_TABLE
- STATEMENT_COMPLEX_HISTORY
- STATEMENT_WLMSTAT_COMPLEX_RUNTIME
- STATEMENT_HISTORY
- Cache/IO
- STATIO_USER_TABLES
- SUMMARY_STATIO_USER_TABLES
- GLOBAL_STATIO_USER_TABLES
- STATIO_USER_INDEXES
- SUMMARY_STATIO_USER_INDEXES
- GLOBAL_STATIO_USER_INDEXES
- STATIO_USER_SEQUENCES
- SUMMARY_STATIO_USER_SEQUENCES
- GLOBAL_STATIO_USER_SEQUENCES
- STATIO_SYS_TABLES
- SUMMARY_STATIO_SYS_TABLES
- GLOBAL_STATIO_SYS_TABLES
- STATIO_SYS_INDEXES
- SUMMARY_STATIO_SYS_INDEXES
- GLOBAL_STATIO_SYS_INDEXES
- STATIO_SYS_SEQUENCES
- SUMMARY_STATIO_SYS_SEQUENCES
- GLOBAL_STATIO_SYS_SEQUENCES
- STATIO_ALL_TABLES
- SUMMARY_STATIO_ALL_TABLES
- GLOBAL_STATIO_ALL_TABLES
- STATIO_ALL_INDEXES
- SUMMARY_STATIO_ALL_INDEXES
- GLOBAL_STATIO_ALL_INDEXES
- STATIO_ALL_SEQUENCES
- SUMMARY_STATIO_ALL_SEQUENCES
- GLOBAL_STATIO_ALL_SEQUENCES
- GLOBAL_STAT_DB_CU
- GLOBAL_STAT_SESSION_CU
- Utility
- REPLICATION_STAT
- GLOBAL_REPLICATION_STAT
- REPLICATION_SLOTS
- GLOBAL_REPLICATION_SLOTS
- BGWRITER_STAT
- GLOBAL_BGWRITER_STAT
- GLOBAL_CKPT_STATUS
- GLOBAL_DOUBLE_WRITE_STATUS
- GLOBAL_PAGEWRITER_STATUS
- GLOBAL_RECORD_RESET_TIME
- GLOBAL_REDO_STATUS
- GLOBAL_RECOVERY_STATUS
- CLASS_VITAL_INFO
- USER_LOGIN
- SUMMARY_USER_LOGIN
- GLOBAL_GET_BGWRITER_STATUS
- GLOBAL_SINGLE_FLUSH_DW_STATUS
- GLOBAL_CANDIDATE_STATUS
- Lock
- Wait Events
- Configuration
- Operator
- Workload Manager
- Global Plancache
- RTO
- Appendix
- DBE_PLDEBUGGER Schema
- Overview
- DBE_PLDEBUGGER.turn_on
- DBE_PLDEBUGGER.turn_off
- DBE_PLDEBUGGER.local_debug_server_info
- DBE_PLDEBUGGER.attach
- DBE_PLDEBUGGER.next
- DBE_PLDEBUGGER.continue
- DBE_PLDEBUGGER.abort
- DBE_PLDEBUGGER.print_var
- DBE_PLDEBUGGER.info_code
- DBE_PLDEBUGGER.step
- DBE_PLDEBUGGER.add_breakpoint
- DBE_PLDEBUGGER.delete_breakpoint
- DBE_PLDEBUGGER.info_breakpoints
- DBE_PLDEBUGGER.backtrace
- DBE_PLDEBUGGER.finish
- DBE_PLDEBUGGER.set_var
- DB4AI Schema
- Tool Reference
- Tool Overview
- Client Tool
- Server Tools
- Tools Used in the Internal System
- gaussdb
- gs_backup
- gs_basebackup
- gs_ctl
- gs_initdb
- gs_install
- gs_install_plugin
- gs_install_plugin_local
- gs_postuninstall
- gs_preinstall
- gs_sshexkey
- gs_tar
- gs_uninstall
- gs_upgradectl
- gs_expansion
- gs_dropnode
- gs_probackup
- gstrace
- kdb5_util
- kadmin.local
- kinit
- klist
- krb5kdc
- kdestroy
- pg_config
- pg_controldata
- pg_recvlogical
- pg_resetxlog
- pg_archivecleanup
- pssh
- pscp
- transfer.py
- FAQ
- System Catalogs and Views Supported by gs_collector
- Extension Reference
- Error Code Reference
- Description of SQL Error Codes
- Third-Party Library Error Codes
- GAUSS-00001 - GAUSS-00100
- GAUSS-00101 - GAUSS-00200
- GAUSS 00201 - GAUSS 00300
- GAUSS 00301 - GAUSS 00400
- GAUSS 00401 - GAUSS 00500
- GAUSS 00501 - GAUSS 00600
- GAUSS 00601 - GAUSS 00700
- GAUSS 00701 - GAUSS 00800
- GAUSS 00801 - GAUSS 00900
- GAUSS 00901 - GAUSS 01000
- GAUSS 01001 - GAUSS 01100
- GAUSS 01101 - GAUSS 01200
- GAUSS 01201 - GAUSS 01300
- GAUSS 01301 - GAUSS 01400
- GAUSS 01401 - GAUSS 01500
- GAUSS 01501 - GAUSS 01600
- GAUSS 01601 - GAUSS 01700
- GAUSS 01701 - GAUSS 01800
- GAUSS 01801 - GAUSS 01900
- GAUSS 01901 - GAUSS 02000
- GAUSS 02001 - GAUSS 02100
- GAUSS 02101 - GAUSS 02200
- GAUSS 02201 - GAUSS 02300
- GAUSS 02301 - GAUSS 02400
- GAUSS 02401 - GAUSS 02500
- GAUSS 02501 - GAUSS 02600
- GAUSS 02601 - GAUSS 02700
- GAUSS 02701 - GAUSS 02800
- GAUSS 02801 - GAUSS 02900
- GAUSS 02901 - GAUSS 03000
- GAUSS 03001 - GAUSS 03100
- GAUSS 03101 - GAUSS 03200
- GAUSS 03201 - GAUSS 03300
- GAUSS 03301 - GAUSS 03400
- GAUSS 03401 - GAUSS 03500
- GAUSS 03501 - GAUSS 03600
- GAUSS 03601 - GAUSS 03700
- GAUSS 03701 - GAUSS 03800
- GAUSS 03801 - GAUSS 03900
- GAUSS 03901 - GAUSS 04000
- GAUSS 04001 - GAUSS 04100
- GAUSS 04101 - GAUSS 04200
- GAUSS 04201 - GAUSS 04300
- GAUSS 04301 - GAUSS 04400
- GAUSS 04401 - GAUSS 04500
- GAUSS 04501 - GAUSS 04600
- GAUSS 04601 - GAUSS 04700
- GAUSS 04701 - GAUSS 04800
- GAUSS 04801 - GAUSS 04900
- GAUSS 04901 - GAUSS 05000
- GAUSS 05001 - GAUSS 05100
- GAUSS 05101 - GAUSS 05200
- GAUSS 05201 - GAUSS 05300
- GAUSS 05301 - GAUSS 05400
- GAUSS 05401 - GAUSS 05500
- GAUSS 05501 - GAUSS 05600
- GAUSS 05601 - GAUSS 05700
- GAUSS 05701 - GAUSS 05800
- GAUSS 05801 - GAUSS 05900
- GAUSS 05901 - GAUSS 06000
- GAUSS 06001 - GAUSS 06100
- GAUSS 06101 - GAUSS 06200
- GAUSS 06201 - GAUSS 06300
- GAUSS 06301 - GAUSS 06400
- GAUSS 06401 - GAUSS 06500
- GAUSS 06501 - GAUSS 06600
- GAUSS 06601 - GAUSS 06700
- GAUSS 06701 - GAUSS 06800
- GAUSS 06801 - GAUSS 06900
- GAUSS 06901 - GAUSS 07000
- GAUSS 07001 - GAUSS 07100
- GAUSS 07101 - GAUSS 07200
- GAUSS 07201 - GAUSS 07300
- GAUSS 07301 - GAUSS 07400
- GAUSS 07401 - GAUSS 07480
- GAUSS 50000 - GAUSS 50999
- GAUSS 51000 - GAUSS 51999
- GAUSS 52000 - GAUSS 52999
- GAUSS 53000 - GAUSS 53699
- Error Log Reference
- System Catalogs and System Views
- Common Faults and Identification Guide
- Common Fault Locating Methods
- Common Fault Locating Cases
- Core Fault Locating
- Permission/Session/Data Type Fault Location
- Service/High Availability/Concurrency Fault Location
- Table/Partition Table Fault Location
- File System/Disk/Memory Fault Location
- After You Run the du Command to Query Data File Size In the XFS File System, the Query Result Is Greater than the Actual File Size
- File Is Damaged in the XFS File System
- Insufficient Memory
- "Error:No space left on device" Is Displayed
- When the TPC-C is running and a disk to be injected is full, the TPC-C stops responding
- Disk Space Usage Reaches the Threshold and the Database Becomes Read-only
- SQL Fault Location
- Index Fault Location
- Source Code Parsing
- FAQs
- Glossary
Best Practices
This section describes how to use certain algorithms, including the classification, regression, clustering, gbdt, xgboost, and prohpet algorithms.
Create a database and install an algorithm.
create database test1 dbcompatibility='B';
./madpack -s madlib -p mogdb -c opg@127.0.0.1:7651/test1 install
Classification Algorithm
Take the housing price prediction with the support vector machine (SVM) as an example:
-
Prepare a dataset.
DROP TABLE IF EXISTS houses; CREATE TABLE houses (id INT, tax INT, bedroom INT, bath FLOAT, price INT, size INT, lot INT); INSERT INTO houses VALUES (1 , 590 , 2 , 1 , 50000 , 770 , 22100), (2 , 1050 , 3 , 2 , 85000 , 1410 , 12000), (3 , 20 , 3 , 1 , 22500 , 1060 , 3500), (4 , 870 , 2 , 2 , 90000 , 1300 , 17500), (5 , 1320 , 3 , 2 , 133000 , 1500 , 30000), (6 , 1350 , 2 , 1 , 90500 , 820 , 25700), (7 , 2790 , 3 , 2.5 , 260000 , 2130 , 25000), (8 , 680 , 2 , 1 , 142500 , 1170 , 22000), (9 , 1840 , 3 , 2 , 160000 , 1500 , 19000), (10 , 3680 , 4 , 2 , 240000 , 2790 , 20000), (11 , 1660 , 3 , 1 , 87000 , 1030 , 17500), (12 , 1620 , 3 , 2 , 118600 , 1250 , 20000), (13 , 3100 , 3 , 2 , 140000 , 1760 , 38000), (14 , 2070 , 2 , 3 , 148000 , 1550 , 14000), (15 , 650 , 3 , 1.5 , 65000 , 1450 , 12000);
-
Train the model.
Configure the corresponding schema and compatibility parameters before training.
SET search_path="$user",public,madlib; SET behavior_compat_options = 'bind_procedure_searchpath';
Use default parameters for training. The classification condition is 'price < 100000'. The SQL statement is as follows:
DROP TABLE IF EXISTS houses_svm, houses_svm_summary; SELECT madlib.svm_classification('public.houses','public.houses_svm','price < 100000','ARRAY[1, tax, bath, size]');
-
View the model.
\x on SELECT * FROM houses_svm; \x off
The results are displayed as follows:
-[ RECORD 1 ]------+----------------------------------------------------------------- coef | {.113989576847,-.00226133300602,-.0676303607996,.00179440841072} loss | .614496714256667 norm_of_gradient | 108.171180769224 num_iterations | 100 num_rows_processed | 15 num_rows_skipped | 0 dep_var_mapping | {f,t}
-
Perform prediction.
DROP TABLE IF EXISTS houses_pred; SELECT madlib.svm_predict('public.houses_svm','public.houses','id','public.houses_pred');
-
View the prediction result.
SELECT *, price < 100000 AS actual FROM houses JOIN houses_pred USING (id) ORDER BY id;
The results are displayed as follows:
id | tax | bedroom | bath | price | size | lot | prediction | decision_function | actual ----+------+---------+------+--------+------+-------+------------+-------------------+-------- 1 | 590 | 2 | 1 | 50000 | 770 | 22100 | t | .09386721875 | t 2 | 1050 | 3 | 2 | 85000 | 1410 | 12000 | t | .134445058042 | t 3 | 20 | 3 | 1 | 22500 | 1060 | 3500 | t | 1.9032054712902 | t 4 | 870 | 2 | 2 | 90000 | 1300 | 17500 | t | .3441000739464 | t 5 | 1320 | 3 | 2 | 133000 | 1500 | 30000 | f | -.3146180966186 | f 6 | 1350 | 2 | 1 | 90500 | 820 | 25700 | f | -1.5350254452892 | t 7 | 2790 | 3 | 2.5 | 260000 | 2130 | 25000 | f | -2.5421154971142 | f 8 | 680 | 2 | 1 | 142500 | 1170 | 22000 | t | .6081106124962 | f 9 | 1840 | 3 | 2 | 160000 | 1500 | 19000 | f | -1.490511259749 | f 10 | 3680 | 4 | 2 | 240000 | 2790 | 20000 | f | -3.336577140997 | f 11 | 1660 | 3 | 1 | 87000 | 1030 | 17500 | f | -1.8592129109042 | t 12 | 1620 | 3 | 2 | 118600 | 1250 | 20000 | f | -1.4416201011046 | f 13 | 3100 | 3 | 2 | 140000 | 1760 | 38000 | f | -3.873244660547 | f 14 | 2070 | 2 | 3 | 148000 | 1550 | 14000 | f | -1.9885277913972 | f 15 | 650 | 3 | 1.5 | 65000 | 1450 | 12000 | t | 1.1445697772786 | t (15 rows)
-
View the false positive rate.
SELECT COUNT(*) FROM houses_pred JOIN houses USING (id) WHERE houses_pred.prediction != (houses.price < 100000);
The results are displayed as follows:
count ------- 3 (1 row)
-
-
Use other SVM cores for training.
DROP TABLE IF EXISTS houses_svm_gaussian, houses_svm_gaussian_summary, houses_svm_gaussian_random; SELECT madlib.svm_classification( 'public.houses','public.houses_svm_gaussian','price < 100000','ARRAY[1, tax, bath, size]','gaussian','n_components=10', '', 'init_stepsize=1, max_iter=200' );
Perform prediction and view the training result.
DROP TABLE IF EXISTS houses_pred_gaussian; SELECT madlib.svm_predict('public.houses_svm_gaussian','public.houses','id', 'public.houses_pred_gaussian'); SELECT COUNT(*) FROM houses_pred_gaussian JOIN houses USING (id) WHERE houses_pred_gaussian.prediction != (houses.price < 100000);
The result is as follows:
count -------+ 0 (1 row)
-
Use other parameters.
In addition to specifying different kernel methods, you can specify the number of training steps and initial parameters, such as init_stepsize, max_iter, and class_weight.
NOTICE:
SET search_path="$user",public,madlib; SET behavior_compat_options = 'bind_procedure_searchpath';
Before executing the algorithm, you need to set the schema in search_path and set the bind_procedure_searchpath. Otherwise, the table cannot be found. All machine learning methods are installed in a schema, and user tables are installed in user's schemas. In this example, the algorithm is installed in madlib, and user tables are stored in public. If no schema is set, the table may not be found when the algorithm is executed. When executing the algorithm, you are advised to add the scheme of the input table.
Regression Algorithm
Use the Boston housing price prediction with the linear regression as an example:
-
Prepare a dataset.
The dataset is the same as that of the SVM.
-
Train a model.
SET search_path="$user",public,madlib; SET behavior_compat_options = 'bind_procedure_searchpath'; DROP TABLE IF EXISTS houses_linregr, houses_linregr_summary; SELECT madlib.linregr_train( 'public.houses', 'public.houses_linregr', 'price', 'ARRAY[1, tax, bath, size]');
-
View the model content.
\x ON SELECT * FROM houses_linregr; \x OFF
The returned result is as follows:
-[ RECORD 1 ]------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- coef | {-12849.4168959872,28.9613922651775,10181.6290712649,50.516894915353} r2 | .768577580597462 std_err | {33453.0344331377,15.8992104963991,19437.7710925915,32.9280231740856} t_stats | {-.384103179688204,1.82156166004197,.523806408809163,1.53416118083608} p_values | {.708223134615411,.0958005827189556,.610804093526516,.153235085548177} condition_no | 9002.50457069858 num_rows_processed | 15 num_missing_rows_skipped | 0 variance_covariance | {{1119105512.7847,217782.067878005,-283344228.394538,-616679.693190829},{217782.067878005,252.784894408806,-46373.1796964038,-369.864520095145},{-283344228.394538,-46373.1796964038,377826945.047986,-209088.217319699},{-616679.693190829,-369.864520095145,-209088.217319699,1084.25471015312}}
-
Predict and compare the results.
SELECT houses.*, madlib.linregr_predict( m.coef, ARRAY[1,tax,bath,size]) as predict, price - madlib.linregr_predict( m.coef, ARRAY[1,tax,bath,size]) as residual FROM public.houses, public.houses_linregr AS m ORDER BY id;
The returned result is as follows:
id | tax | bedroom | bath | price | size | lot | predict | residual ----+------+---------+------+--------+------+-------+------------------+------------------- 1 | 590 | 2 | 1 | 50000 | 770 | 22100 | 53317.4426965543 | -3317.44269655428 2 | 1050 | 3 | 2 | 85000 | 1410 | 12000 | 109152.124955627 | -24152.1249556268 3 | 20 | 3 | 1 | 22500 | 1060 | 3500 | 51459.3486308555 | -28959.3486308555 4 | 870 | 2 | 2 | 90000 | 1300 | 17500 | 98382.215907206 | -8382.21590720599 5 | 1320 | 3 | 2 | 133000 | 1500 | 30000 | 121518.221409606 | 11481.7785903935 6 | 1350 | 2 | 1 | 90500 | 820 | 25700 | 77853.9455638568 | 12646.0544361432 7 | 2790 | 3 | 2.5 | 260000 | 2130 | 25000 | 201007.926371722 | 58992.0736282778 8 | 680 | 2 | 1 | 142500 | 1170 | 22000 | 76130.7259665615 | 66369.2740334385 9 | 1840 | 3 | 2 | 160000 | 1500 | 19000 | 136578.145387499 | 23421.8546125013 10 | 3680 | 4 | 2 | 240000 | 2790 | 20000 | 255033.901596231 | -15033.9015962306 11 | 1660 | 3 | 1 | 87000 | 1030 | 17500 | 97440.5250982859 | -10440.5250982859 12 | 1620 | 3 | 2 | 118600 | 1250 | 20000 | 117577.415360321 | 1022.58463967856 13 | 3100 | 3 | 2 | 140000 | 1760 | 38000 | 186203.892319614 | -46203.8923196141 14 | 2070 | 2 | 3 | 148000 | 1550 | 14000 | 155946.739425522 | -7946.73942552213 15 | 650 | 3 | 1.5 | 65000 | 1450 | 12000 | 94497.4293105374 | -29497.4293105374 (15 rows)
Clustering Algorithm
Use kmeans as an example:
-
Prepare the data.
DROP TABLE IF EXISTS km_sample; CREATE TABLE km_sample(pid int, points double precision[]); INSERT INTO km_sample VALUES (1, '{14.23, 1.71, 2.43, 15.6, 127, 2.8, 3.0600, 0.2800, 2.29, 5.64, 1.04, 3.92, 1065}'), (2, '{13.2, 1.78, 2.14, 11.2, 1, 2.65, 2.76, 0.26, 1.28, 4.38, 1.05, 3.49, 1050}'), (3, '{13.16, 2.36, 2.67, 18.6, 101, 2.8, 3.24, 0.3, 2.81, 5.6799, 1.03, 3.17, 1185}'), (4, '{14.37, 1.95, 2.5, 16.8, 113, 3.85, 3.49, 0.24, 2.18, 7.8, 0.86, 3.45, 1480}'), (5, '{13.24, 2.59, 2.87, 21, 118, 2.8, 2.69, 0.39, 1.82, 4.32, 1.04, 2.93, 735}'), (6, '{14.2, 1.76, 2.45, 15.2, 112, 3.27, 3.39, 0.34, 1.97, 6.75, 1.05, 2.85, 1450}'), (7, '{14.39, 1.87, 2.45, 14.6, 96, 2.5, 2.52, 0.3, 1.98, 5.25, 1.02, 3.58, 1290}'), (8, '{14.06, 2.15, 2.61, 17.6, 121, 2.6, 2.51, 0.31, 1.25, 5.05, 1.06, 3.58, 1295}'), (9, '{14.83, 1.64, 2.17, 14, 97, 2.8, 2.98, 0.29, 1.98, 5.2, 1.08, 2.85, 1045}'), (10, '{13.86, 1.35, 2.27, 16, 98, 2.98, 3.15, 0.22, 1.8500, 7.2199, 1.01, 3.55, 1045}');
-
Run the kmeans algorithm.
Use kmeans++ and Euclidean distance function for calculation.
SET search_path="$user",public,madlib; SET behavior_compat_options = 'bind_procedure_searchpath'; DROP TABLE IF EXISTS km_result; CREATE TABLE km_result AS SELECT * FROM madlib.kmeanspp( 'public.km_sample', -- Table of source data 'points', -- Column containing point co-ordinates 2, -- Number of centroids to calculate 'madlib.squared_dist_norm2', -- Distance function 'madlib.avg', -- Aggregate function 20, -- Number of iterations 0.001 -- Fraction of centroids reassigned to keep iterating );
After kmeans is executed, no table is automatically created to save the content. Therefore, you need to create a table.
\x on select * from km_result; \x off
The returned result is as follows:
-[ RECORD 1 ]----+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- centroids | {{14.0333333333333,1.84111111111111,2.41,15.5111111111111,96.2222222222222,2.91666666666667,3.01111111111111,.282222222222222,1.95444444444444,5.88553333333333,1.02222222222222,3.38222222222222,1211.66666666667},{13.24,2.59,2.87,21,118,2.8,2.69,.39,1.82,4.32,1.04,2.93,735}} cluster_variance | {257041.999707571,0} objective_fn | 257041.999707571 frac_reassigned | 0 num_iterations | 2
-
Apply the clustering result.
Execute the following function to calculate the adjacent node and the corresponding distance of each node:
DROP TABLE IF EXISTS km_points_silh; SELECT * FROM madlib.simple_silhouette_points('public.km_sample', -- Input points table 'public.km_points_silh', -- Output table 'pid', -- Point ID column in input table 'points', -- Points column in input table 'public.km_result', -- Centroids table 'centroids', -- Column in centroids table containing centroids 'madlib.squared_dist_norm2' -- Distance function ); SELECT * FROM km_points_silh ORDER BY pid;
The returned result is as follows:
pid | centroid_id | neighbor_centroid_id | silh -----+-------------+----------------------+------------------ 1 | 0 | 1 | .793983543638996 2 | 0 | 1 | .688301735667703 3 | 0 | 1 | .996324103148159 4 | 0 | 1 | .869765755931474 5 | 1 | 0 | 1 6 | 0 | 1 | .888416176253661 7 | 0 | 1 | .980107240092519 8 | 0 | 1 | .975880363039906 9 | 0 | 1 | .712384473959954 10 | 0 | 1 | .712198411442872 (10 rows)
gbdt Algorithm
Although the gdbt base learner is a regression tree, the algorithm itself supports both classification and regression operations. The following describes the implementation of the two types of tasks. It is worth noting that this method does not support NULL values in the label column.
Classification
-
Prepare the data.
DROPTABLEIFEXISTS dt_golf CASCADE; DROPTABLEIFEXISTS train_output,train_output_summary; CREATETABLE dt_golf ( id integerNOTNULL, "OUTLOOK"text, temperature double precision, humidity double precision, "Cont_features"double precision[], cat_features text[], windy boolean, class integer ) ; INSERTINTO dt_golf (id,"OUTLOOK",temperature,humidity,"Cont_features",cat_features, windy,class) VALUES (1, 'sunny', 85, 85,ARRAY[85, 85], ARRAY['a', 'b'], false, 0), (2, 'sunny', 80, 90, ARRAY[80, 90], ARRAY['a', 'b'], true, 0), (3, 'overcast', 83, 78, ARRAY[83, 78], ARRAY['a', 'b'], false, 1), (4, 'rain', 70, NULL, ARRAY[70, 96], ARRAY['a', 'b'], false, 1), (5, 'rain', 68, 80, ARRAY[68, 80], ARRAY['a', 'b'], false, 1), (6, 'rain', NULL, 70, ARRAY[65, 70], ARRAY['a', 'b'], true, 0), (7, 'overcast', 64, 65, ARRAY[64, 65], ARRAY['c', 'b'], NULL , 1), (8, 'sunny', 72, 95, ARRAY[72, 95], ARRAY['a', 'b'], false, 0), (9, 'sunny', 69, 70, ARRAY[69, 70], ARRAY['a', 'b'], false, 1), (10, 'rain', 75, 80, ARRAY[75, 80], ARRAY['a', 'b'], false, 1), (11, 'sunny', 75, 70, ARRAY[75, 70], ARRAY['a', 'd'], true, 1), (12, 'overcast', 72, 90, ARRAY[72, 90], ARRAY['c', 'b'], NULL, 1), (13, 'overcast', 81, 75, ARRAY[81, 75], ARRAY['a', 'b'], false, 1), (15, NULL, 81, 75, ARRAY[81, 75], ARRAY['a', 'b'], false, 1), (16, 'overcast', NULL, 75, ARRAY[81, 75], ARRAY['a', 'd'], false,1), (14, 'rain', 71, 80, ARRAY[71, 80], ARRAY['c', 'b'], true, 0);
-
Train a model.
select madlib.gbdt_train('dt_golf', -- source table 'train_output', -- output model table 'id' , -- id column 'class', -- response '"OUTLOOK", temperature', -- features NULL, -- exclude columns 1, --weight 10, -- num of trees NULL, -- num of random features 10, -- max depth 1, -- min split 1, -- min bucket 8, -- number of bins per continuous variable 'max_surrogates=0', TRUE );
After the model training is complete, two tables are generated. The train_output table stores the regression tree model in gdbt, including the parameter records of each base learner.
Column | Type | Modifiers | Storage | Stats target | Description --------------------+---------------+-----------+----------+--------------+------------- iteration | integer | | plain | | tree | madlib.bytea8 | | external | | cat_levels_in_text | text[] | | extended | | cat_n_levels | integer[] | | extended | | tree_depth | integer | | plain | |
iteration | cat_levels_in_text | cat_n_levels | tree_depth -----------+-----------------------+--------------+------------ 0 | {sunny,rain,overcast} | {3} | 4 1 | {sunny,rain,overcast} | {3} | 5 2 | {sunny,rain,overcast} | {3} | 6 3 | {sunny,rain,overcast} | {3} | 4 4 | {sunny,rain,overcast} | {3} | 5 5 | {sunny,rain,overcast} | {3} | 5 6 | {sunny,rain,overcast} | {3} | 5 7 | {sunny,rain,overcast} | {3} | 5 8 | {sunny,rain,overcast} | {3} | 4 9 | {sunny,rain,overcast} | {3} | 3 (10 rows)
The train_output_summary table describes the gdbt training.
select * from train_output_summary; method | cat_features | con_features | source_table | model_table | null_proxy | learning_rate | is_classification | predict_dt_prob | num_trees --------+--------------+--------------+--------------+--------------+------------+---------------+-------------------+-----------------+----------- GBDT | "OUTLOOK" | temperature | dt_golf | train_output | | .01 | t | response | 10 (1 row)
-
Perform prediction.
select madlib.gbdt_predict('dt_golf2','train_output','test_output','id');
View the prediction result.
select test_output.id, test_prediction,class from test_output join dt_golf using (id); id | test_prediction | class ----+-----------------+------- 1 | 1.0 | 0 2 | 1.0 | 0 3 | 1.0 | 1 4 | 1.0 | 1 5 | 1.0 | 1 6 | 1.0 | 0 7 | 1.0 | 1 8 | 0.0 | 0 9 | 1.0 | 1 10 | 1.0 | 1 11 | 1.0 | 1 12 | 1.0 | 1 13 | 1.0 | 1 15 | 0.0 | 1 16 | 1.0 | 1 14 | 0.0 | 0 (16 rows)
Regression
-
Prepare the data.
DROP TABLE IF EXISTS crime; CREATE TABLE crime ( id SERIAL NOT NULL, CrimeRat DOUBLE PRECISION, MaleTeen INTEGER, South SMALLINT, Educ DOUBLE PRECISION, Police60 INTEGER, Police59 INTEGER, Labor INTEGER, Males INTEGER, Pop INTEGER, NonWhite INTEGER, Unemp1 INTEGER, Unemp2 INTEGER, Median INTEGER, BelowMed INTEGER ); INSERT INTO crime( CrimeRat, MaleTeen, South, Educ, Police60, Police59, Labor, Males, Pop, NonWhite, Unemp1, Unemp2, Median, BelowMed ) VALUES (79.1, 151, 1, 9.1, 58, 56, 510, 950, 33, 301, 108, 41, 394, 261), (163.5, 143, 0, 11.3, 103, 95, 583, 1012, 13, 102, 96, 36, 557, 194), (57.8, 142, 1, 8.9, 45, 44, 533, 969, 18, 219, 94, 33, 318, 250), (196.9, 136, 0, 12.1, 149, 141, 577, 994, 157, 80, 102, 39, 673, 167), (123.4, 141, 0, 12.1, 109, 101, 591, 985, 18, 30, 91, 20, 578, 174), (68.2, 121, 0, 11.0, 118, 115, 547, 964, 25, 44, 84, 29, 689, 126), (96.3, 127, 1, 11.1, 82, 79, 519, 982, 4, 139, 97, 38, 620, 168), (155.5, 131, 1, 10.9, 115, 109, 542, 969, 50, 179, 79, 35, 472, 206), (85.6, 157, 1, 9.0, 65, 62, 553, 955, 39, 286, 81, 28, 421, 239), (70.5, 140, 0, 11.8, 71, 68, 632, 1029, 7, 15, 100, 24, 526, 174), (167.4, 124, 0, 10.5, 121, 116, 580, 966, 101, 106, 77, 35, 657, 170), (84.9, 134, 0, 10.8, 75, 71, 595, 972, 47, 59, 83, 31, 580, 172), (51.1, 128, 0, 11.3, 67, 60, 624, 972, 28, 10, 77, 25, 507, 206), (66.4, 135, 0, 11.7, 62, 61, 595, 986, 22, 46, 77, 27, 529, 190), (79.8, 152, 1, 8.7, 57, 53, 530, 986, 30, 72, 92, 43, 405, 264), (94.6, 142, 1, 8.8, 81, 77, 497, 956, 33, 321, 116, 47, 427, 247), (53.9, 143, 0, 11.0, 66, 63, 537, 977, 10, 6, 114, 35, 487, 166), (92.9, 135, 1, 10.4, 123, 115, 537, 978, 31, 170, 89, 34, 631, 165), (75.0, 130, 0, 11.6, 128, 128, 536, 934, 51, 24, 78, 34, 627, 135), (122.5, 125, 0, 10.8, 113, 105, 567, 985, 78, 94, 130, 58, 626, 166), (74.2, 126, 0, 10.8, 74, 67, 602, 984, 34, 12, 102, 33, 557, 195), (43.9, 157, 1, 8.9, 47, 44, 512, 962, 22, 423, 97, 34, 288, 276), (121.6, 132, 0, 9.6, 87, 83, 564, 953, 43, 92, 83, 32, 513, 227), (96.8, 131, 0, 11.6, 78, 73, 574, 1038, 7, 36, 142, 42, 540, 176), (52.3, 130, 0, 11.6, 63, 57, 641, 984, 14, 26, 70, 21, 486, 196), (199.3, 131, 0, 12.1, 160, 143, 631, 1071, 3, 77, 102, 41, 674, 152), (34.2, 135, 0, 10.9, 69, 71, 540, 965, 6, 4, 80, 22, 564, 139), (121.6, 152, 0, 11.2, 82, 76, 571, 1018, 10, 79, 103, 28, 537, 215), (104.3, 119, 0, 10.7, 166, 157, 521, 938, 168, 89, 92, 36, 637, 154), (69.6, 166, 1, 8.9, 58, 54, 521, 973, 46, 254, 72, 26, 396, 237), (37.3, 140, 0, 9.3, 55, 54, 535, 1045, 6, 20, 135, 40, 453, 200), (75.4, 125, 0, 10.9, 90, 81, 586, 964, 97, 82, 105, 43, 617, 163), (107.2, 147, 1, 10.4, 63, 64, 560, 972, 23, 95, 76, 24, 462, 233), (92.3, 126, 0, 11.8, 97, 97, 542, 990, 18, 21, 102, 35, 589, 166), (65.3, 123, 0, 10.2, 97, 87, 526, 948, 113, 76, 124, 50, 572, 158), (127.2, 150, 0, 10.0, 109, 98, 531, 964, 9, 24, 87, 38, 559, 153), (83.1, 177, 1, 8.7, 58, 56, 638, 974, 24, 349, 76, 28, 382, 254), (56.6, 133, 0, 10.4, 51, 47, 599, 1024, 7, 40, 99, 27, 425, 225), (82.6, 149, 1, 8.8, 61, 54, 515, 953, 36, 165, 86, 35, 395, 251), (115.1, 145, 1, 10.4, 82, 74, 560, 981, 96, 126, 88, 31, 488, 228), (88.0, 148, 0, 12.2, 72, 66, 601, 998, 9, 19, 84, 20, 590, 144), (54.2, 141, 0, 10.9, 56, 54, 523, 968, 4, 2, 107, 37, 489, 170), (82.3, 162, 1, 9.9, 75, 70, 522, 996, 40, 208, 73, 27, 496, 224), (103.0, 136, 0, 12.1, 95, 96, 574, 1012, 29, 36, 111, 37, 622, 162), (45.5, 139, 1, 8.8, 46, 41, 480, 968, 19, 49, 135, 53, 457, 249), (50.8, 126, 0, 10.4, 106, 97, 599, 989, 40, 24, 78, 25, 593, 171), (84.9, 130, 0, 12.1, 90, 91, 623, 1049, 3, 22, 113, 40, 588, 160);
-
Train a model.
select madlib.gbdt_train('crime', -- source table 'train_output', -- output model table 'id' , -- id column 'CrimeRat', -- response '*', -- features NULL, -- exclude columns 1, --weight 20, -- num of trees 4, -- num of random features 10, -- max depth 1, -- min split 1, -- min bucket 8, -- number of bins per continuous variable 'max_surrogates=0', FALSE );
When is_classification is set to FALSE, the model is a regression task. By default, gbdt provides the regression calculation function. The method generates two tables. One table records the collective information of each tree and the binary of the model. The other table records the parameter information of the method.
-
Perform prediction.
select madlib.gbdt_predict('crime','train_output','test_output','id');
View the prediction result.
select test_output.id, test_prediction,CrimeRat from test_output join crime using (id); id | test_prediction | crimerat ----+--------------------+---------- 1 | 79.1 | 79.1 2 | 163.5 | 163.5 3 | 57.8 | 57.8 4 | 196.9 | 196.9 5 | 123.4 | 123.4 6 | 68.2 | 68.2 7 | 96.2999999992251 | 96.3 8 | 155.49842087765936 | 155.5 9 | 84.35 | 85.6 10 | 70.50157912234037 | 70.5 11 | 167.4000000007749 | 167.4 12 | 84.9 | 84.9 13 | 51.1 | 51.1 14 | 66.4 | 66.4 15 | 79.8 | 79.8 16 | 94.6 | 94.6 17 | 53.9 | 53.9 18 | 92.9 | 92.9 19 | 75.0 | 75 20 | 122.5 | 122.5 21 | 74.2 | 74.2 22 | 43.9 | 43.9 23 | 121.6 | 121.6 24 | 96.8 | 96.8 25 | 52.3 | 52.3 26 | 199.3 | 199.3 27 | 34.2 | 34.2 28 | 121.6 | 121.6 29 | 104.3 | 104.3 30 | 69.6 | 69.6 31 | 37.3 | 37.3 32 | 75.4 | 75.4 33 | 107.2 | 107.2 34 | 92.3 | 92.3 35 | 65.2999999992251 | 65.3 36 | 127.19842087765963 | 127.2 37 | 84.35000000002215 | 83.1 38 | 56.60155638297881 | 56.6 39 | 82.45000000075257 | 82.6 40 | 115.10002273936168 | 115.1 41 | 88.0 | 88 42 | 54.19997726063828 | 54.2 43 | 82.44999999999999 | 82.3 44 | 103.00002273936173 | 103 45 | 45.500000000000156 | 45.5 46 | 50.8 | 50.8 47 | 84.9 | 84.9 (47 rows)
xgboost Algorithm
The new xgboost supports classification and regression. The following uses the iris flower classification as an example to describe the xgboost algorithm.
The xgboost supports grid search mode and can train multiple groups of parameters at the same time.
-
Prepare the data.
DROP TABLE IF EXISTS iris; create table iris (id serial, a float, d float, c float, b float, label int); INSERT into iris (a, b, c, d, label) values (5.1, 3.5, 1.4, 0.2, 0),(4.9, 3.0, 1.4, 0.2, 0),(4.7, 3.2, 1.3, 0.2, 0),(4.6, 3.1, 1.5, 0.2, 0), (5.0, 3.6, 1.4, 0.2, 0),(5.4, 3.9, 1.7, 0.4, 0),(4.6, 3.4, 1.4, 0.3, 0),(5.0, 3.4, 1.5, 0.2, 0), (4.4, 2.9, 1.4, 0.2, 0),(4.9, 3.1, 1.5, 0.1, 0),(5.4, 3.7, 1.5, 0.2, 0),(4.8, 3.4, 1.6, 0.2, 0), (4.8, 3.0, 1.4, 0.1, 0),(4.3, 3.0, 1.1, 0.1, 0),(5.8, 4.0, 1.2, 0.2, 0),(5.7, 4.4, 1.5, 0.4, 0), (5.4, 3.9, 1.3, 0.4, 0),(5.1, 3.5, 1.4, 0.3, 0),(5.7, 3.8, 1.7, 0.3, 0),(5.1, 3.8, 1.5, 0.3, 0), (5.4, 3.4, 1.7, 0.2, 0),(5.1, 3.7, 1.5, 0.4, 0),(4.6, 3.6, 1.0, 0.2, 0),(5.1, 3.3, 1.7, 0.5, 0), (4.8, 3.4, 1.9, 0.2, 0),(5.0, 3.0, 1.6, 0.2, 0),(5.0, 3.4, 1.6, 0.4, 0),(5.2, 3.5, 1.5, 0.2, 0), (5.2, 3.4, 1.4, 0.2, 0),(4.7, 3.2, 1.6, 0.2, 0),(4.8, 3.1, 1.6, 0.2, 0),(5.4, 3.4, 1.5, 0.4, 0), (5.2, 4.1, 1.5, 0.1, 0),(5.5, 4.2, 1.4, 0.2, 0),(4.9, 3.1, 1.5, 0.2, 0),(5.0, 3.2, 1.2, 0.2, 0), (5.5, 3.5, 1.3, 0.2, 0),(4.9, 3.6, 1.4, 0.1, 0),(4.4, 3.0, 1.3, 0.2, 0),(5.1, 3.4, 1.5, 0.2, 0), (5.0, 3.5, 1.3, 0.3, 0),(4.5, 2.3, 1.3, 0.3, 0),(4.4, 3.2, 1.3, 0.2, 0),(5.0, 3.5, 1.6, 0.6, 0), (5.1, 3.8, 1.9, 0.4, 0),(4.8, 3.0, 1.4, 0.3, 0),(5.1, 3.8, 1.6, 0.2, 0),(4.6, 3.2, 1.4, 0.2, 0), (5.3, 3.7, 1.5, 0.2, 0),(5.0, 3.3, 1.4, 0.2, 0),(7.0, 3.2, 4.7, 1.4, 1),(6.4, 3.2, 4.5, 1.5, 1), (6.9, 3.1, 4.9, 1.5, 1),(5.5, 2.3, 4.0, 1.3, 1),(6.5, 2.8, 4.6, 1.5, 1),(5.7, 2.8, 4.5, 1.3, 1), (6.3, 3.3, 4.7, 1.6, 1),(4.9, 2.4, 3.3, 1.0, 1),(6.6, 2.9, 4.6, 1.3, 1),(5.2, 2.7, 3.9, 1.4, 1), (5.0, 2.0, 3.5, 1.0, 1),(5.9, 3.0, 4.2, 1.5, 1),(6.0, 2.2, 4.0, 1.0, 1),(6.1, 2.9, 4.7, 1.4, 1), (5.6, 2.9, 3.6, 1.3, 1),(6.7, 3.1, 4.4, 1.4, 1),(5.6, 3.0, 4.5, 1.5, 1),(5.8, 2.7, 4.1, 1.0, 1), (6.2, 2.2, 4.5, 1.5, 1),(5.6, 2.5, 3.9, 1.1, 1),(5.9, 3.2, 4.8, 1.8, 1),(6.1, 2.8, 4.0, 1.3, 1), (6.3, 2.5, 4.9, 1.5, 1),(6.1, 2.8, 4.7, 1.2, 1),(6.4, 2.9, 4.3, 1.3, 1),(6.6, 3.0, 4.4, 1.4, 1), (6.8, 2.8, 4.8, 1.4, 1),(6.7, 3.0, 5.0, 1.7, 1),(6.0, 2.9, 4.5, 1.5, 1),(5.7, 2.6, 3.5, 1.0, 1), (5.5, 2.4, 3.8, 1.1, 1),(5.5, 2.4, 3.7, 1.0, 1),(5.8, 2.7, 3.9, 1.2, 1),(6.0, 2.7, 5.1, 1.6, 1), (5.4, 3.0, 4.5, 1.5, 1),(6.0, 3.4, 4.5, 1.6, 1),(6.7, 3.1, 4.7, 1.5, 1),(6.3, 2.3, 4.4, 1.3, 1), (5.6, 3.0, 4.1, 1.3, 1),(5.5, 2.5, 4.0, 1.3, 1),(5.5, 2.6, 4.4, 1.2, 1),(6.1, 3.0, 4.6, 1.4, 1), (5.8, 2.6, 4.0, 1.2, 1),(5.0, 2.3, 3.3, 1.0, 1),(5.6, 2.7, 4.2, 1.3, 1),(5.7, 3.0, 4.2, 1.2, 1), (5.7, 2.9, 4.2, 1.3, 1),(6.2, 2.9, 4.3, 1.3, 1),(5.1, 2.5, 3.0, 1.1, 1),(5.7, 2.8, 4.1, 1.3, 1), (6.3, 3.3, 6.0, 2.5, 2),(5.8, 2.7, 5.1, 1.9, 2),(7.1, 3.0, 5.9, 2.1, 2),(6.3, 2.9, 5.6, 1.8, 2), (6.5, 3.0, 5.8, 2.2, 2),(7.6, 3.0, 6.6, 2.1, 2),(4.9, 2.5, 4.5, 1.7, 2),(7.3, 2.9, 6.3, 1.8, 2), (6.7, 2.5, 5.8, 1.8, 2),(7.2, 3.6, 6.1, 2.5, 2),(6.5, 3.2, 5.1, 2.0, 2),(6.4, 2.7, 5.3, 1.9, 2), (6.8, 3.0, 5.5, 2.1, 2),(5.7, 2.5, 5.0, 2.0, 2),(5.8, 2.8, 5.1, 2.4, 2),(6.4, 3.2, 5.3, 2.3, 2), (6.5, 3.0, 5.5, 1.8, 2),(7.7, 3.8, 6.7, 2.2, 2),(7.7, 2.6, 6.9, 2.3, 2),(6.0, 2.2, 5.0, 1.5, 2), (6.9, 3.2, 5.7, 2.3, 2),(5.6, 2.8, 4.9, 2.0, 2),(7.7, 2.8, 6.7, 2.0, 2),(6.3, 2.7, 4.9, 1.8, 2), (6.7, 3.3, 5.7, 2.1, 2),(7.2, 3.2, 6.0, 1.8, 2),(6.2, 2.8, 4.8, 1.8, 2),(6.1, 3.0, 4.9, 1.8, 2), (6.4, 2.8, 5.6, 2.1, 2),(7.2, 3.0, 5.8, 1.6, 2),(7.4, 2.8, 6.1, 1.9, 2),(7.9, 3.8, 6.4, 2.0, 2), (6.4, 2.8, 5.6, 2.2, 2),(6.3, 2.8, 5.1, 1.5, 2),(6.1, 2.6, 5.6, 1.4, 2),(7.7, 3.0, 6.1, 2.3, 2), (6.3, 3.4, 5.6, 2.4, 2),(6.4, 3.1, 5.5, 1.8, 2),(6.0, 3.0, 4.8, 1.8, 2),(6.9, 3.1, 5.4, 2.1, 2), (6.7, 3.1, 5.6, 2.4, 2),(6.9, 3.1, 5.1, 2.3, 2),(5.8, 2.7, 5.1, 1.9, 2),(6.8, 3.2, 5.9, 2.3, 2), (6.7, 3.3, 5.7, 2.5, 2),(6.7, 3.0, 5.2, 2.3, 2),(6.3, 2.5, 5.0, 1.9, 2),(6.5, 3.0, 5.2, 2.0, 2), (6.2, 3.4, 5.4, 2.3, 2),(5.9, 3.0, 5.1, 1.8, 2);
-
Perform the classification training.
SET search_path="$user",public,madlib; SET behavior_compat_options = 'bind_procedure_searchpath'; select madlib.xgboost_sk_Classifier('public.iris', 'public.iris_model_xgbc', 'id', 'label', 'a,b,c,d', NULL, $${'booster': ['gbtree'], 'eta': (0.1, 0.9), 'max_depth': (5,1), 'objective': ('multi:softmax',)}$$, -- Training parameter combination. If there are multiple parameters, input them in tuple or list mode. TRUE); -- Whether to evaluate the model. The multi-classification evaluation metrics are the accuracy and kappa value. The binary classification evaluation metrics are precision, recall, fscore, and support. The regression evaluation metrics are mae, mse, R2squared, and rmse.
The xgboost supports parallel training of multiple groups of parameters. For example, the eta values in the case are 0.1 and 0.9, and the maximum depth is 5 or 1.
select id, train_timestamp, source_table, y_type, metrics, features, params from iris_model_xgbc;
The following shows the model result.
id | train_timestamp | source_table | y_type | metrics | features | params ----+-------------------------------+--------------+---------+----------------------------+-----------+--------------------------------------------------------------------------------- 1 | 2020-12-14 20:15:05.904184+08 | public.iris | integer | {'acc': 1.0, 'kappa': 1.0} | {a,b,c,d} | ('objective = multi:softmax', 'eta = 0.1', 'max_depth = 5', 'booster = gbtree') 2 | 2020-12-14 20:15:05.904184+08 | public.iris | integer | {'acc': 1.0, 'kappa': 1.0} | {a,b,c,d} | ('objective = multi:softmax', 'eta = 0.1', 'max_depth = 1', 'booster = gbtree') 3 | 2020-12-14 20:15:05.904184+08 | public.iris | integer | {'acc': 1.0, 'kappa': 1.0} | {a,b,c,d} | ('objective = multi:softmax', 'eta = 0.9', 'max_depth = 5', 'booster = gbtree') 4 | 2020-12-14 20:15:05.904184+08 | public.iris | integer | {'acc': 1.0, 'kappa': 1.0} | {a,b,c,d} | ('objective = multi:softmax', 'eta = 0.9', 'max_depth = 1', 'booster = gbtree') (4 rows)
The result table records the training time, features, result types, and used parameters.
In this example, two types of eta and two types of max_depth are selected. Therefore, there are a total of four parameter combinations displayed on four rows in the result. In the metrics column, the training evaluation results of the four parameter combinations are recorded. You can enter multiple parameter combinations and select a proper model after training.
-
View the prediction result.
select madlib.xgboost_sk_predict('public.iris', 'public.iris_model_xgbc', 'public.iris_xgbc_out', 'id'); select t1.id, prediction, label from iris as t1, iris_xgbc_out as t2 where t1.id = t2.id and prediction <> label;
The comparison between the prediction result and the training result shows that the number of unmatched rows is 0, indicating that the classification accuracy is high.
id | prediction | label ----+------------+------- (0 rows)
prophet Algorithm
The prophet time series forecasting algorithm for Facebook is added. The following uses time series data as an example to describe how to use the prophet algorithm.
-
Prepare the data.
DROP TABLE IF EXISTS ts_data; CREATE TABLE ts_data(date date, value float); INSERT into ts_data (date, value) values ('2016-11-29 21:20:00', 5.6),('2016-11-29 21:30:00', 5.2),('2016-11-29 21:40:00', 5.3),('2016-11-29 21:50:00', 5.3), ('2016-11-29 22:00:00', 5.1),('2016-11-29 22:10:00', 5.8),('2016-11-29 22:20:00', 5.6),('2016-11-29 22:30:00', 5.4), ('2016-11-29 22:40:00', 5.4),('2016-11-29 22:50:00', 5.1),('2016-11-29 23:00:00', 5.2),('2016-11-29 23:10:00', 5.9), ('2016-11-29 23:20:00', 5.9),('2016-11-29 23:30:00', 5.1),('2016-11-29 23:40:00', 5.8),('2016-11-29 23:50:00', 6.0), ('2016-11-30 00:00:00', 5.9),('2016-11-30 00:10:00', 5.3),('2016-11-30 00:20:00', 5.4),('2016-11-30 00:30:00', 5.1), ('2016-11-30 00:40:00', 5.6),('2016-11-30 00:50:00', 5.7),('2016-11-30 01:00:00', 5.8),('2016-11-30 01:10:00', 5.4), ('2016-11-30 01:20:00', 5.8),('2016-11-30 01:30:00', 5.1),('2016-11-30 01:40:00', 5.6),('2016-11-30 01:50:00', 5.6), ('2016-11-30 02:00:00', 5.6),('2016-11-30 02:10:00', 5.9),('2016-11-30 02:20:00', 5.7),('2016-11-30 02:30:00', 5.4), ('2016-11-30 02:40:00', 5.6),('2016-11-30 02:50:00', 5.4),('2016-11-30 03:00:00', 5.1),('2016-11-30 03:10:00', 5.0), ('2016-11-30 03:20:00', 5.9),('2016-11-30 03:30:00', 5.8),('2016-11-30 03:40:00', 5.4),('2016-11-30 03:50:00', 5.7), ('2016-11-30 04:00:00', 5.6),('2016-11-30 04:10:00', 5.9),('2016-11-30 04:20:00', 5.1),('2016-11-30 04:30:00', 5.8), ('2016-11-30 04:40:00', 5.5),('2016-11-30 04:50:00', 5.1),('2016-11-30 05:00:00', 5.8),('2016-11-30 05:10:00', 5.5), ('2016-11-30 05:20:00', 5.7),('2016-11-30 05:30:00', 5.2),('2016-11-30 05:40:00', 5.7),('2016-11-30 05:50:00', 6.0), ('2016-11-30 06:00:00', 5.8),('2016-11-30 06:10:00', 5.6),('2016-11-30 06:20:00', 5.2),('2016-11-30 06:30:00', 5.8), ('2016-11-30 06:40:00', 5.3),('2016-11-30 06:50:00', 5.4),('2016-11-30 07:00:00', 5.8),('2016-11-30 07:10:00', 5.2), ('2016-11-30 07:20:00', 5.3),('2016-11-30 07:30:00', 5.3),('2016-11-30 07:40:00', 5.8),('2016-11-30 07:50:00', 5.9), ('2016-11-30 08:00:00', 5.6),('2016-11-30 08:10:00', 5.2),('2016-11-30 08:20:00', 5.4),('2016-11-30 08:30:00', 5.6), ('2016-11-30 08:40:00', 6.0),('2016-11-30 08:50:00', 5.4),('2016-11-30 09:00:00', 6.0),('2016-11-30 09:10:00', 5.1), ('2016-11-30 09:20:00', 5.1),('2016-11-30 09:30:00', 5.5),('2016-11-30 09:40:00', 5.6),('2016-11-30 09:50:00', 5.0), ('2016-11-30 10:00:00', 5.1),('2016-11-30 10:10:00', 5.7),('2016-11-30 10:20:00', 5.4),('2016-11-30 10:30:00', 5.4), ('2016-11-30 10:40:00', 5.7),('2016-11-30 10:50:00', 5.2),('2016-11-30 11:00:00', 5.4),('2016-11-30 11:10:00', 5.3), ('2016-11-30 11:20:00', 5.6),('2016-11-30 11:30:00', 5.0),('2016-11-30 11:40:00', 5.2),('2016-11-30 11:50:00', 5.2), ('2016-11-30 12:00:00', 5.5),('2016-11-30 12:10:00', 5.1),('2016-11-30 12:20:00', 5.7),('2016-11-30 12:30:00', 5.4), ('2016-11-30 12:40:00', 5.2),('2016-11-30 12:50:00', 5.5),('2016-11-30 13:00:00', 5.0),('2016-11-30 13:10:00', 5.5), ('2016-11-30 13:20:00', 5.6),('2016-11-30 13:30:00', 5.3),('2016-11-30 13:40:00', 5.5),('2016-11-30 13:50:00', 5.9), ('2016-11-30 14:00:00', 10.9),('2016-11-30 14:10:00', 10.6),('2016-11-30 14:20:00', 10.3),('2016-11-30 14:30:00', 11.0), ('2016-11-30 14:40:00', 10.0),('2016-11-30 14:50:00', 10.1),('2016-11-30 15:00:00', 10.2),('2016-11-30 15:10:00', 10.2), ('2016-11-30 15:20:00', 10.3),('2016-11-30 15:30:00', 10.1),('2016-11-30 15:40:00', 10.9),('2016-11-30 15:50:00', 10.1), ('2016-11-30 16:00:00', 11.0),('2016-11-30 16:10:00', 10.2),('2016-11-30 16:20:00', 10.7),('2016-11-30 16:30:00', 10.2), ('2016-11-30 16:40:00', 10.2),('2016-11-30 16:50:00', 10.2),('2016-11-30 17:00:00', 10.8),('2016-11-30 17:10:00', 10.6), ('2016-11-30 17:20:00', 10.5),('2016-11-30 17:30:00', 10.7),('2016-11-30 17:40:00', 10.9),('2016-11-30 17:50:00', 10.9), ('2016-11-30 18:00:00', 10.1),('2016-11-30 18:10:00', 10.3),('2016-11-30 18:20:00', 10.1),('2016-11-30 18:30:00', 10.6), ('2016-11-30 18:40:00', 10.3),('2016-11-30 18:50:00', 10.8),('2016-11-30 19:00:00', 10.9),('2016-11-30 19:10:00', 10.8), ('2016-11-30 19:20:00', 10.6),('2016-11-30 19:30:00', 11.0),('2016-11-30 19:40:00', 10.3),('2016-11-30 19:50:00', 10.9), ('2016-11-30 20:00:00', 10.6),('2016-11-30 20:10:00', 10.6),('2016-11-30 20:20:00', 10.5),('2016-11-30 20:30:00', 10.4), ('2016-11-30 20:40:00', 10.9),('2016-11-30 20:50:00', 10.9),('2016-11-30 21:00:00', 10.7),('2016-11-30 21:10:00', 10.6), ('2016-11-30 21:20:00', 10.5),('2016-11-30 21:30:00', 10.8),('2016-11-30 21:40:00', 10.4),('2016-11-30 21:50:00', 10.0), ('2016-11-30 22:00:00', 10.6),('2016-11-30 22:10:00', 10.6),('2016-11-30 22:20:00', 10.6),('2016-11-30 22:30:00', 10.1), ('2016-11-30 22:40:00', 10.4),('2016-11-30 22:50:00', 10.8),('2016-11-30 23:00:00', 10.4),('2016-11-30 23:10:00', 10.6), ('2016-11-30 23:20:00', 10.1),('2016-11-30 23:30:00', 10.2),('2016-11-30 23:40:00', 10.6),('2016-11-30 23:50:00', 10.8), ('2016-12-01 00:00:00', 10.6),('2016-12-01 00:10:00', 10.2),('2016-12-01 00:20:00', 10.9),('2016-12-01 00:30:00', 10.3), ('2016-12-01 00:40:00', 10.3),('2016-12-01 00:50:00', 10.1),('2016-12-01 01:00:00', 10.7),('2016-12-01 01:10:00', 10.5), ('2016-12-01 01:20:00', 10.4),('2016-12-01 01:30:00', 10.7),('2016-12-01 01:40:00', 10.5),('2016-12-01 01:50:00', 10.7), ('2016-12-01 02:00:00', 10.8),('2016-12-01 02:10:00', 10.9),('2016-12-01 02:20:00', 10.9),('2016-12-01 02:30:00', 10.1), ('2016-12-01 02:40:00', 10.4),('2016-12-01 02:50:00', 10.7),('2016-12-01 03:00:00', 10.7),('2016-12-01 03:10:00', 10.5), ('2016-12-01 03:20:00', 10.2),('2016-12-01 03:30:00', 10.2),('2016-12-01 03:40:00', 10.8),('2016-12-01 03:50:00', 10.2), ('2016-12-01 04:00:00', 10.9),('2016-12-01 04:10:00', 10.4),('2016-12-01 04:20:00', 10.6),('2016-12-01 04:30:00', 11.0), ('2016-12-01 04:40:00', 10.4),('2016-12-01 04:50:00', 10.3),('2016-12-01 05:00:00', 10.7),('2016-12-01 05:10:00', 10.6), ('2016-12-01 05:20:00', 10.9),('2016-12-01 05:30:00', 11.0),('2016-12-01 05:40:00', 10.9),('2016-12-01 05:50:00', 10.0), ('2016-12-01 06:00:00', 10.8),('2016-12-01 06:10:00', 10.0),('2016-12-01 06:20:00', 10.1),('2016-12-01 06:30:00', 10.5), ('2016-12-01 06:40:00', 15.5),('2016-12-01 06:50:00', 15.7),('2016-12-01 07:00:00', 15.1),('2016-12-01 07:10:00', 15.6), ('2016-12-01 07:20:00', 15.5),('2016-12-01 07:30:00', 15.4),('2016-12-01 07:40:00', 15.7),('2016-12-01 07:50:00', 15.6), ('2016-12-01 08:00:00', 15.3),('2016-12-01 08:10:00', 15.6),('2016-12-01 08:20:00', 15.1),('2016-12-01 08:30:00', 15.6), ('2016-12-01 08:40:00', 15.9),('2016-12-01 08:50:00', 16.0),('2016-12-01 09:00:00', 15.4),('2016-12-01 09:10:00', 15.0), ('2016-12-01 09:20:00', 15.0),('2016-12-01 09:30:00', 15.4),('2016-12-01 09:40:00', 15.9),('2016-12-01 09:50:00', 15.6), ('2016-12-01 10:00:00', 15.7),('2016-12-01 10:10:00', 15.4),('2016-12-01 10:20:00', 15.2),('2016-12-01 10:30:00', 15.2), ('2016-12-01 10:40:00', 15.8),('2016-12-01 10:50:00', 15.4),('2016-12-01 11:00:00', 16.0),('2016-12-01 11:10:00', 15.9), ('2016-12-01 11:20:00', 15.1),('2016-12-01 11:30:00', 15.0),('2016-12-01 11:40:00', 15.0),('2016-12-01 11:50:00', 15.4), ('2016-12-01 12:00:00', 15.5),('2016-12-01 12:10:00', 15.3),('2016-12-01 12:20:00', 16.0),('2016-12-01 12:30:00', 15.1), ('2016-12-01 12:40:00', 15.5),('2016-12-01 12:50:00', 16.0),('2016-12-01 13:00:00', 15.7),('2016-12-01 13:10:00', 15.9), ('2016-12-01 13:20:00', 15.4),('2016-12-01 13:30:00', 15.3),('2016-12-01 13:40:00', 15.9),('2016-12-01 13:50:00', 15.8), ('2016-12-01 14:00:00', 15.4),('2016-12-01 14:10:00', 15.9),('2016-12-01 14:20:00', 15.3),('2016-12-01 14:30:00', 16.0), ('2016-12-01 14:40:00', 15.5),('2016-12-01 14:50:00', 15.0),('2016-12-01 15:00:00', 15.1),('2016-12-01 15:10:00', 16.0), ('2016-12-01 15:20:00', 15.8),('2016-12-01 15:30:00', 15.9),('2016-12-01 15:40:00', 15.4),('2016-12-01 15:50:00', 15.1), ('2016-12-01 16:00:00', 15.8),('2016-12-01 16:10:00', 15.2),('2016-12-01 16:20:00', 15.4),('2016-12-01 16:30:00', 15.8), ('2016-12-01 16:40:00', 15.8),('2016-12-01 16:50:00', 15.1),('2016-12-01 17:00:00', 15.3),('2016-12-01 17:10:00', 15.6), ('2016-12-01 17:20:00', 15.3),('2016-12-01 17:30:00', 15.8),('2016-12-01 17:40:00', 15.0),('2016-12-01 17:50:00', 15.3), ('2016-12-01 18:00:00', 15.5),('2016-12-01 18:10:00', 15.4),('2016-12-01 18:20:00', 15.3),('2016-12-01 18:30:00', 15.8), ('2016-12-01 18:40:00', 15.2),('2016-12-01 18:50:00', 15.9),('2016-12-01 19:00:00', 15.4),('2016-12-01 19:10:00', 15.3), ('2016-12-01 19:20:00', 15.1),('2016-12-01 19:30:00', 15.3),('2016-12-01 19:40:00', 15.9),('2016-12-01 19:50:00', 15.3), ('2016-12-01 20:00:00', 15.3),('2016-12-01 20:10:00', 15.2),('2016-12-01 20:20:00', 15.0),('2016-12-01 20:30:00', 15.3), ('2016-12-01 20:40:00', 15.1),('2016-12-01 20:50:00', 15.1),('2016-12-01 21:00:00', 15.6),('2016-12-01 21:10:00', 15.8), ('2016-12-01 21:20:00', 15.4),('2016-12-01 21:30:00', 15.2),('2016-12-01 21:40:00', 16.0),('2016-12-01 21:50:00', 15.5), ('2016-12-01 22:00:00', 15.4),('2016-12-01 22:10:00', 15.7),('2016-12-01 22:20:00', 15.3),('2016-12-01 22:30:00', 15.9), ('2016-12-01 22:40:00', 15.9),('2016-12-01 22:50:00', 15.2),('2016-12-01 23:00:00', 15.8),('2016-12-01 23:10:00', 15.9), ('2016-12-01 23:20:00', 20.9),('2016-12-01 23:30:00', 20.4),('2016-12-01 23:40:00', 20.3),('2016-12-01 23:50:00', 20.1), ('2016-12-02 00:00:00', 20.7),('2016-12-02 00:10:00', 20.7),('2016-12-02 00:20:00', 20.5),('2016-12-02 00:30:00', 20.4), ('2016-12-02 00:40:00', 20.4),('2016-12-02 00:50:00', 20.1),('2016-12-02 01:00:00', 20.2),('2016-12-02 01:10:00', 20.9), ('2016-12-02 01:20:00', 20.6),('2016-12-02 01:30:00', 20.0),('2016-12-02 01:40:00', 20.4),('2016-12-02 01:50:00', 20.2), ('2016-12-02 02:00:00', 20.6),('2016-12-02 02:10:00', 20.4),('2016-12-02 02:20:00', 20.5),('2016-12-02 02:30:00', 20.4), ('2016-12-02 02:40:00', 20.5),('2016-12-02 02:50:00', 20.7),('2016-12-02 03:00:00', 20.2),('2016-12-02 03:10:00', 20.2), ('2016-12-02 03:20:00', 20.1),('2016-12-02 03:30:00', 20.5),('2016-12-02 03:40:00', 20.5),('2016-12-02 03:50:00', 20.0), ('2016-12-02 04:00:00', 20.7),('2016-12-02 04:10:00', 20.8),('2016-12-02 04:20:00', 20.6),('2016-12-02 04:30:00', 20.4), ('2016-12-02 04:40:00', 20.5),('2016-12-02 04:50:00', 20.8),('2016-12-02 05:00:00', 20.1),('2016-12-02 05:10:00', 20.9), ('2016-12-02 05:20:00', 20.5),('2016-12-02 05:30:00', 20.4),('2016-12-02 05:40:00', 20.2),('2016-12-02 05:50:00', 20.4), ('2016-12-02 06:00:00', 20.8),('2016-12-02 06:10:00', 20.7),('2016-12-02 06:20:00', 20.9),('2016-12-02 06:30:00', 20.1), ('2016-12-02 06:40:00', 20.3),('2016-12-02 06:50:00', 20.2),('2016-12-02 07:00:00', 20.4),('2016-12-02 07:10:00', 20.7), ('2016-12-02 07:20:00', 20.4),('2016-12-02 07:30:00', 20.8),('2016-12-02 07:40:00', 20.8),('2016-12-02 07:50:00', 20.1), ('2016-12-02 08:00:00', 20.3),('2016-12-02 08:10:00', 20.7),('2016-12-02 08:20:00', 20.9),('2016-12-02 08:30:00', 21.0), ('2016-12-02 08:40:00', 20.2),('2016-12-02 08:50:00', 20.5),('2016-12-02 09:00:00', 20.2),('2016-12-02 09:10:00', 20.8), ('2016-12-02 09:20:00', 20.9),('2016-12-02 09:30:00', 20.5),('2016-12-02 09:40:00', 20.9),('2016-12-02 09:50:00', 20.7), ('2016-12-02 10:00:00', 20.3),('2016-12-02 10:10:00', 21.0),('2016-12-02 10:20:00', 20.5),('2016-12-02 10:30:00', 20.3), ('2016-12-02 10:40:00', 20.2),('2016-12-02 10:50:00', 20.3),('2016-12-02 11:00:00', 20.4),('2016-12-02 11:10:00', 20.4), ('2016-12-02 11:20:00', 21.0),('2016-12-02 11:30:00', 20.3),('2016-12-02 11:40:00', 20.3),('2016-12-02 11:50:00', 20.9), ('2016-12-02 12:00:00', 20.8),('2016-12-02 12:10:00', 20.9),('2016-12-02 12:20:00', 20.7),('2016-12-02 12:30:00', 20.7);
-
Perform training.
SET search_path="$user",public,madlib; SET behavior_compat_options = 'bind_procedure_searchpath'; select madlib.prophet_fit('public.ts_data', 'public.prophet_model', $${'ds': 'date', 'y': 'value'}$$, -- Column name mapping. The prophet requires that the time column name be 'ds' and the time series value column name be 'y'. $${'growth': 'linear', 'changepoints': ['2016-11-30 05:40:00']}$$ -- Training parameter combination. If there are multiple parameters, input them in tuple mode. );
Query the model table.
select id, y_type, params from public.prophet_model; id | y_type | params ----+------------------+--------------------------------------------------------------- 1 | double precision | {'changepoints': ['2016-11-30 05:40:00'], 'growth': 'linear'}
The model table records the training time, result types, and used parameters.
-
Perform the prediction.
select madlib.prophet_predict('public.prophet_model','public.prophet_output', 10, '10T');
View the prediction result.
select ds, yhat, yhat_lower, yhat_upper from public.prophet_output; ds | yhat | yhat_lower | yhat_upper ------------+---------------+---------------+--------------- 2016-12-02 | 20.6943848045 | 17.7671496048 | 23.4160694837 2016-12-02 | 20.7408355633 | 17.9264413164 | 23.6426403933 2016-12-02 | 20.7872863221 | 17.9298207895 | 23.4548814727 2016-12-02 | 20.833737081 | 18.234443228 | 23.5317342873 2016-12-02 | 20.8801878398 | 18.2471709649 | 23.8345735574 2016-12-02 | 20.9266385986 | 18.1780101465 | 23.696087927 2016-12-02 | 20.9730893575 | 18.4292088648 | 23.7209823631 2016-12-02 | 21.0195401163 | 18.2623494126 | 23.7341427068 2016-12-02 | 21.0659908751 | 18.1173966769 | 23.7919478206 2016-12-02 | 21.112441634 | 18.5018042056 | 23.9508963879 (10 rows)