Overview

For database security purposes, MogDB provides two backup types, multiple backup and restoration solutions, and data reliability assurance mechanisms.

Backup and restoration can be performed logically or physically.

  • Logical backup and restoration: backs up data by exporting it logically. A logical backup captures the data as of a specific point in time, and data can be restored only to that point; changes made between the last backup and a failure are not included. It suits scenarios where data rarely changes, and such data can be restored quickly from a logical backup if it is damaged by misoperation. To restore an entire database from a logical backup, rebuild the database and then import the backup data. Because this restoration takes a long time, logical backup is not recommended for databases that require high data availability. Because it can be performed on any platform, logical backup is a major approach for migrating and transferring data.
  • Physical backup and restoration: backs up a database by copying its physical files in units of disk blocks. The database can be restored from the backup files, such as data files and archive log files. Physical backup is usually used for full backup. When properly planned, it backs up and restores data quickly and at low cost.

    Table 1 compares the two backup and restoration solutions supported by MogDB. The method for restoring data after an exception depends on the solution used.

    Table 1 Comparison between logical and physical backup and restoration

    | Backup Type | Application Scenario | Media | Tool | Recovery Time | Advantage and Disadvantage |
    | --- | --- | --- | --- | --- | --- |
    | Logical backup and restoration | A small volume of data needs to be processed. Currently, it is used for the backup and restoration of one or more tables. The backup data must be restored using the gsql or gs_restore tool. For a large volume of data, backup takes a long time. | Disk, SSD | gs_dump | Plain-text format: long. Archive format: short. | A tool for exporting database-related information. You can choose what to export, such as a whole database or objects in a database, including schemas, tables, and views. The database to be exported can be the default database postgres or a user-defined database. The export format can be plain text or archive. Plain-text data can be restored only with the gsql tool, and restoration takes a relatively long time. Archive data can be restored only with the gs_restore tool, and restoration takes a relatively short time. |
    | | | | gs_dumpall | Long | A tool for exporting database-related information. It exports all MogDB data, including the default database postgres, user-defined databases, and the global objects shared by all MogDB databases. Data can be exported only in plain-text format and restored only with the gsql tool, and restoration takes a relatively long time. |
    | Physical backup and restoration | A huge volume of data needs to be processed. It is mainly used for full backup and restoration, and for the backup and restoration of all WAL archive logs and run logs in the database. | | gs_backup | Backing up a small amount of data is efficient and flexible. | An OM tool for exporting database-related information, such as database parameter files and binary files. It supports MogDB backup, restoration of important data, and display of help and version information. During backup, you can choose the backup object type. During restoration, make sure that the backup directory on each node contains the backup files. A cluster can be restored using the cluster information in the static configuration file. If only parameter files are restored, restoration is fast. |
    | | | | gs_basebackup | During restoration, you can directly copy the backup files over the original files, or start the standby database. Restoration can be quite efficient. | Backs up all database binary files, capturing the data only as of a specific point in time. Combined with PITR, it can restore data to a point in time later than the backup point. |
    | | | | gs_probackup | Restores data to a backup point in time. You can start the standby database to restore data, which is quite efficient. | A tool for managing backup and restoration of MogDB databases. It can back up MogDB instances periodically, and can back up a single database or the database on the primary node of a cluster. The backup is a physical backup. It supports backing up external directories that hold, for example, script files, configuration files, log files, and dump files. It supports incremental backup, periodic backup, and remote backup. An incremental backup takes less time than a full backup because only modified files are backed up. Currently, the data directory is backed up by default. If a tablespace is not located in the data directory, you must specify the tablespace directory to back up. Currently, backup can be performed only on hosts. |
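
The pairing in Table 1 between export format and restore tool can be sketched as follows. This is a minimal illustration that only assembles command lines and does not run them. The helper names, the file paths, and port 26000 are hypothetical; the options shown (-p, -F, -f, -d) are the commonly documented ones for gs_dump, gs_restore, and gsql, and should be checked against each tool's --help output for your MogDB version.

```python
import shlex


def dump_command(dbname, outfile, port, archive=True):
    """Build a gs_dump command line (hypothetical helper; nothing is executed)."""
    # Assumed format codes: "c" = custom archive format, "p" = plain text.
    fmt = "c" if archive else "p"
    return ["gs_dump", "-p", str(port), "-F", fmt, "-f", outfile, dbname]


def restore_command(dbname, infile, port, archive=True):
    """Pick the restore tool that matches the export format."""
    if archive:
        # Archive-format dumps are restored with gs_restore.
        return ["gs_restore", "-p", str(port), "-d", dbname, infile]
    # Plain-text dumps are SQL scripts and are replayed with gsql.
    return ["gsql", "-p", str(port), "-d", dbname, "-f", infile]


if __name__ == "__main__":
    # Hypothetical database name, paths, and port, for illustration only.
    for archive, path in ((True, "/tmp/postgres.dmp"), (False, "/tmp/postgres.sql")):
        print(shlex.join(dump_command("postgres", path, 26000, archive)))
        print(shlex.join(restore_command("postgres", path, 26000, archive)))
```

Keeping the format choice in a single flag makes it harder to pair a plain-text dump with gs_restore, or an archive dump with gsql, by mistake.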

While backing up and restoring data, take the following aspects into consideration:

  • Whether the impact of data backup on services is acceptable
  • Database restoration efficiency

    To limit the impact of a database fault, keep the restoration time as short as possible.

  • Data restorability

    Minimize data loss after the database fails.

  • Database restoration cost

    Many factors need to be considered when you select a backup policy for a production environment, such as the backup objects, data volume, and network configuration. Table 2 lists the available backup policies and the scenarios each one suits.

    Table 2 Backup policies and scenarios

    | Backup Policy | Key Performance Factor | Typical Data Volume | Performance Specifications |
    | --- | --- | --- | --- |
    | Cluster backup | Data volume; network configuration | Data volume: PB level. Object quantity: about 1 million | Backup: data transfer rate on each host: 80 Mbit/s (NBU/EISOO+Disk); disk I/O rate (SSD/HDD): about 90% |
    | Table backup | Schema where the table to be backed up resides; network configuration (NBU) | Data volume: 10 TB level | Backup: depends on query performance and I/O rate (see the NOTE after this table) |

    NOTE:
    For multi-table backup, the backup time is calculated as follows:
    Total time = Number of tables × Starting time + Total data volume / Data backup speed
    In this formula:
    - The starting time for a disk is about 5s. The starting time for an NBU is longer than that for a disk (depending on the NBU deployment).
    - The data backup speed is about 50 MB/s on a single node. (The speed is evaluated based on the backup of a 1 GB table from a physical host to a local disk.)
    Because the starting time is a fixed cost paid once per table, the smaller the tables are, the lower the backup performance will be.
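
The multi-table formula in the NOTE can be worked through with a short calculation. The sketch below assumes the disk figures quoted there (about 5 s starting time per table and about 50 MB/s on a single node) and treats 1 GB as 1024 MB; the function name and the table counts are hypothetical and serve only to show how the per-table starting time affects the total.

```python
def estimate_backup_seconds(num_tables, total_mb, start_s=5.0, speed_mb_s=50.0):
    """Total time = number of tables x starting time + total volume / backup speed."""
    return num_tables * start_s + total_mb / speed_mb_s


if __name__ == "__main__":
    total_mb = 100 * 1024  # 100 GB in total, expressed in MB

    # The same total volume, split into a few large or many small tables.
    for num_tables in (100, 1000):
        seconds = estimate_backup_seconds(num_tables, total_mb)
        throughput = total_mb / seconds  # effective MB/s including start-up cost
        print(f"{num_tables} tables: {seconds:.0f} s, {throughput:.1f} MB/s effective")
```

With 100 tables the estimate is 2548 s, and with 1000 tables it is 7048 s for the same amount of data, which illustrates why backups of many small tables perform worse.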