Overview

For database security purposes, MogDB provides three backup types, multiple backup and restoration solutions, and data reliability assurance mechanisms.

Backup and restoration can be classified into logical backup and restoration, physical backup and restoration, and flashback.

Logical backup and restoration: backs up data by logically exporting it. This method dumps the data as it exists at a certain time point, and data can be restored only to that backup point. A logical backup does not capture changes made between the last backup and a failure. It suits scenarios where data rarely changes; if such data is damaged by misoperation, it can be quickly restored from a logical backup. To restore all the data in a database through logical backup, rebuild the database and import the backup data. Logical backup is not recommended for databases requiring high data availability because restoration takes a long time. Because it can be performed on any platform, logical backup is a major approach for migrating and transferring data.

Physical backup and restoration: copies physical files in the unit of disk blocks from the primary node to the standby node to back up a database. A database can be restored using backup files, such as data files and archive log files. Physical backup is usually used for full backup. If properly planned, it backs up and restores data quickly and at a low cost.

Flashback: restores dropped tables from the recycle bin. As in Windows, information about dropped tables is stored in the database recycle bin. Flashback can also use the MVCC mechanism to restore data to a specified point in time or change sequence number (CSN).

Table 1 compares the three backup and restoration types supported by MogDB. The method for restoring data after an exception differs from one type to another.

Table 1 Comparison of three backup and restoration types

Backup Type	Application Scenario	Media	Tool Name	Recovery Time	Advantage and Disadvantage
Logical backup and restoration	Small volumes of data. You can back up a single table, multiple tables, a single database, or all databases. The backup data must be restored using gsql or gs_restore. When the data volume is large, the restoration takes a long time.	- Disk - SSD	gs_dump	Restoring data in plain-text format takes a long time. Restoring data in archive format takes less time than plain-text format.	This tool exports database information. Users can export a database or its objects (such as schemas, tables, and views). The database can be the default postgres database or a user-specified database. The exported file can be in plain-text format or archive format. Data in plain-text format can be restored only by using gsql, which takes a long time. Data in archive format can be restored only by using gs_restore, which takes less time than restoring the plain-text format.
Logical backup and restoration			gs_dumpall	Long data recovery time	This tool is used to export all information of the openGauss database, including the data of the default postgres database, data of user-specified databases, and global objects of all openGauss databases. Only data in plain-text format can be exported. The exported data can be restored only by using gsql, which takes a long time.
Physical backup and restoration	Huge volumes of data. It is mainly used for full backup and restoration as well as the backup of all WAL archive and run logs in the database.		gs_backup	Small data volume and fast data recovery	This OM tool exports database information, including database parameter files and binary files. It helps openGauss back up and restore important data, and displays help and version information. During the backup, you can select the type of the backup content. During the restoration, ensure that the backup file exists in the backup directory of each node. During cluster restoration, the cluster information in the static configuration file is used. Restoring only parameter files takes a short time.
			gs_basebackup	During the restoration, you can directly copy and replace the original files, or directly start the database on the backup database. The restoration takes a short time.	This tool is used to fully copy the binary files of the server database. It can back up the database only as of a certain time point. With PITR, you can restore data to a time point after the full backup time point.
			gs_probackup	Data can be directly restored to a backup point and the database can be started on the backup database. The restoration takes a short time.	gs_probackup is a tool used to manage openGauss database backup and restoration. It periodically backs up openGauss instances. It supports the physical backup of a standalone database or a primary database node in a cluster. It supports the backup of contents in external directories, such as script files, configuration files, log files, and dump files. It supports incremental backup, periodic backup, and remote backup. The time required for incremental backup is shorter than that for full backup. You only need to back up the modified files. Currently, the data directory is backed up by default. If the tablespace is not in the data directory, you need to manually specify the tablespace directory to be backed up. Currently, data can be backed up only on the primary node.
Flashback	Applicable when: 1) A table is deleted by mistake. 2) Data in tables needs to be restored to a specified time point or CSN.		None	Within a short period of time, you can restore a table to its status at a specified time point or to its status before the table structure was deleted.	Flashback can selectively and efficiently undo the impact of a committed transaction and recover from a human error. Without flashback, a committed database modification can be undone only by restoring a backup or using PITR, which takes several minutes or even hours. With flashback, restoring data to its state before a committed modification takes only seconds, and the restoration time is independent of the database size. Flashback supports two recovery modes: - Multi-version data restoration based on MVCC: applicable to the query and restoration of data that is deleted, updated, or inserted by mistake. You can configure the retention period of old versions and run the corresponding query or restoration command to query or restore data to a specified time point or CSN. - Recovery based on the recycle bin (similar to that on Windows OS): applicable to the recovery of tables that are dropped or truncated by mistake. You can configure the recycle bin switch and run the corresponding restoration command to restore such tables.
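
The gs_dump row in Table 1 pairs each export format with exactly one restoration tool: plain-text exports are restored with gsql, and archive exports with gs_restore. The following Python sketch illustrates that pairing by building the restore command line without running it. The function name `restore_command` and the format labels `"plain"` and `"archive"` are hypothetical, and the options used (`-d` for the database, `-p` for the port, `-f` for the script file) are assumed from the tools' usual command-line conventions, so check each tool's help output before relying on them.

```python
def restore_command(dump_format, dump_path, dbname, port):
    """Build the argument list for restoring a gs_dump export (illustrative only).

    dump_format is a hypothetical label used by this sketch:
      "plain"   -> SQL script, replayed with gsql
      "archive" -> archive file, restored with gs_restore
    """
    if dump_format == "plain":
        # Assumed options: -d database, -p port, -f script file.
        return ["gsql", "-d", dbname, "-p", str(port), "-f", dump_path]
    if dump_format == "archive":
        # Assumed options: -d database, -p port; the archive file is positional.
        return ["gs_restore", "-d", dbname, "-p", str(port), dump_path]
    raise ValueError("unknown dump format: " + repr(dump_format))


# Example with made-up values for the path, database name, and port.
print(" ".join(restore_command("archive", "/backup/mydb.dmp", "mydb", 26000)))
```

Returning an argument list rather than executing the command keeps the sketch safe to run anywhere; the list could be passed to `subprocess.run` on a host where the tools are installed.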

While backing up and restoring data, take the following aspects into consideration:

- Whether the impact of data backup on services is acceptable
- Database restoration efficiency: to minimize the impact of database faults, keep the restoration duration as short as possible.
- Data restorability: minimize data loss after the database fails.
- Database restoration cost

Selecting a backup policy for a production environment involves many factors, such as backup objects, data volume, and network configuration. Table 2 lists the available backup policies and the scenarios to which each applies.

Table 2 Backup policies and scenarios

Backup Policy	Key Performance Factor	Typical Data Volume	Performance Specifications
Database instance backup	- Data amount - Network configuration	Data volume: PB level Object quantity: about 1 million	Backup: - Data transfer rate on each host: 80 Mbit/s (NBU/EISOO+Disk) - Disk I/O rate (SSD/HDD): about 90%
Table backup	- Schema where the table to be backed up resides - Network configuration (NBU)	Data volume: 10 TB level	Backup: depends on query performance rate and I/O rate NOTE: For multi-table backup, the backup time is calculated as follows: `Total time = Number of tables x Starting time + Total data volume/Data backup speed` In the preceding information: - The starting time of a disk is about 5s. The starting time of an NBU is longer than that of a disk (depending on the NBU deployment). - The data backup speed is about 50 MB/s on a single node. (The speed is evaluated based on the backup of a 1 GB table from a physical host to a local disk.) The smaller the table is, the lower the backup performance will be.
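
The multi-table estimate in the NOTE of Table 2 can be worked through numerically. The following is a minimal Python sketch; the function name `estimate_backup_seconds` and its parameters are illustrative and not part of MogDB, and the defaults (5 s starting time, 50 MB/s backup speed) are the approximate single-node disk figures quoted in the table.

```python
def estimate_backup_seconds(num_tables, total_data_mb,
                            start_seconds=5.0, speed_mb_per_s=50.0):
    """Total time = Number of tables x Starting time
                    + Total data volume / Data backup speed.

    Defaults are the approximate disk figures from Table 2; an NBU
    target would need a longer starting time.
    """
    if num_tables < 0 or total_data_mb < 0:
        raise ValueError("table count and data volume must be non-negative")
    if speed_mb_per_s <= 0:
        raise ValueError("backup speed must be positive")
    return num_tables * start_seconds + total_data_mb / speed_mb_per_s


# Example: 100 tables holding 10,000 MB in total.
# 100 * 5 s of start-up overhead + 10,000 MB / 50 MB/s of transfer.
print(estimate_backup_seconds(100, 10_000))  # 700.0
```

In this example the start-up overhead (500 s) exceeds the transfer time (200 s), which is consistent with the table's remark that smaller tables yield lower backup performance.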
