Parallel Data Import

MogDB supports parallel data import, which allows large volumes of data to be loaded quickly and efficiently. This section describes the parameters that control parallel data import.

raise_errors_if_no_files

Parameter description: Specifies whether MogDB distinguishes between two conditions during import: "the number of imported file records is empty" and "the imported file does not exist". If this parameter is set to on and the imported file does not exist, MogDB reports the error message "file does not exist".

This parameter is a SUSET parameter. Set it based on instructions provided in Table 1 GUC parameters.

Value range: Boolean

  • on indicates that "the number of imported file records is empty" and "the imported file does not exist" are reported with different messages when files are imported.
  • off indicates that "the number of imported file records is empty" and "the imported file does not exist" are reported with the same message when files are imported.

Default value: off
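The effect of the two settings can be pictured with a small model. This is an illustrative Python sketch, not MogDB code: the function name `report_import`, the success message, and the `SHARED_MESSAGE` placeholder are all hypothetical. Only the "file does not exist" text comes from the description above.

```python
# Illustrative model only; not MogDB source code.
# SHARED_MESSAGE is a hypothetical placeholder: the documentation does not
# state the exact text shown when the two conditions are not distinguished.
SHARED_MESSAGE = "no records imported"


def report_import(file_exists: bool, record_count: int,
                  raise_errors_if_no_files: bool) -> str:
    """Return the message a client would see for one imported file."""
    if file_exists and record_count > 0:
        # Hypothetical success message, used only to complete the model.
        return f"imported {record_count} records"
    if not file_exists and raise_errors_if_no_files:
        # on: a missing file is reported as its own error.
        return "ERROR: file does not exist"
    # off, or an existing but empty file: the same message in both cases.
    return SHARED_MESSAGE


if __name__ == "__main__":
    for setting in (True, False):
        print(setting, "| missing:", report_import(False, 0, setting),
              "| empty:", report_import(True, 0, setting))
```

With the parameter on, the missing-file case and the empty-file case produce different messages; with it off, the caller cannot tell them apart.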

partition_mem_batch

Parameter description: To optimize batch inserts into column-store partitioned tables, data is buffered during the insert and then written to disk. partition_mem_batch specifies the number of caches used for this buffering. If the value is too large, memory consumption is high. If it is too small, batch insert performance for column-store partitioned tables degrades.

This parameter is a USERSET parameter. Set it based on instructions provided in Table 1 GUC parameters.

Value range: 1 to 65535

Default value: 256
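Because partition_mem_batch is a USERSET parameter, it can typically be changed for the current session. The Python sketch below checks a candidate value against the documented range before building the statement. The helper name `set_partition_mem_batch_sql` is hypothetical, and the use of the standard `SET name = value;` form is an assumption; the authoritative procedure is the one in Table 1 GUC parameters.

```python
# Range taken from the parameter description above.
PARTITION_MEM_BATCH_MIN = 1
PARTITION_MEM_BATCH_MAX = 65535


def set_partition_mem_batch_sql(value: int) -> str:
    """Validate the value and return a session-level SET statement.

    Assumes the standard "SET name = value;" syntax applies.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("partition_mem_batch must be an integer")
    if not PARTITION_MEM_BATCH_MIN <= value <= PARTITION_MEM_BATCH_MAX:
        raise ValueError(
            f"partition_mem_batch must be between {PARTITION_MEM_BATCH_MIN} "
            f"and {PARTITION_MEM_BATCH_MAX}, got {value}"
        )
    return f"SET partition_mem_batch = {value};"


if __name__ == "__main__":
    print(set_partition_mem_batch_sql(512))
```

Validating on the client side keeps an out-of-range value from reaching the server, where it would be rejected anyway.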

partition_max_cache_size

Parameter description: To optimize batch inserts into column-store partitioned tables, data is buffered during the insert and then written to disk. partition_max_cache_size specifies the size of the data buffer cache. If the value is too large, memory consumption is high. If it is too small, batch insert performance for column-store partitioned tables degrades.

This parameter is a USERSET parameter. Set it based on instructions provided in Table 1 GUC parameters.

Value range: 4096 to INT_MAX/2. The unit is KB.

Default value: 2GB
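Since the range is given in KB while the default is given in GB, it can help to work the conversion through. The Python sketch below assumes INT_MAX is the 32-bit signed limit (2147483647) and that units are binary multiples of 1024. The helper names `to_kb` and `check_partition_max_cache_size` are hypothetical.

```python
# Assumption: INT_MAX is the 32-bit signed integer limit.
INT_MAX = 2**31 - 1

MIN_KB = 4096            # lower bound from the value range (4 MB)
MAX_KB = INT_MAX // 2    # upper bound, INT_MAX/2 with integer division

# Assumption: binary units (1 MB = 1024 KB, 1 GB = 1024 MB).
UNITS_IN_KB = {"KB": 1, "MB": 1024, "GB": 1024 * 1024}


def to_kb(amount: int, unit: str) -> int:
    """Convert an amount in KB, MB, or GB to KB."""
    return amount * UNITS_IN_KB[unit.upper()]


def check_partition_max_cache_size(amount: int, unit: str) -> int:
    """Return the size in KB, or raise ValueError if it is out of range."""
    size_kb = to_kb(amount, unit)
    if not MIN_KB <= size_kb <= MAX_KB:
        raise ValueError(
            f"{amount}{unit} = {size_kb} KB is outside {MIN_KB}..{MAX_KB} KB"
        )
    return size_kb


if __name__ == "__main__":
    print("default 2GB in KB:", check_partition_max_cache_size(2, "GB"))
    print("upper bound in KB:", MAX_KB)
```

Under these assumptions the default of 2GB corresponds to 2097152 KB, and the upper bound of 1073741823 KB is just under 1024 GB.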

enable_delta_store

Parameter description: Specifies whether to enable delta tables for column-store tables. Delta tables improve the performance of importing a single piece of data into a column-store table and prevent table bloat. If this parameter is set to on, data imported into a column-store table is stored in the delta table when the data volume is less than the DELTAROW_THRESHOLD specified in the table definition; otherwise, it is stored in CUs of the main table. This parameter affects all operations that move data in column-store tables, including INSERT, COPY, VACUUM, VACUUM FULL, VACUUM DELTAMERGE, and data redistribution.

This parameter is a POSTMASTER parameter. Set it based on instructions provided in Table 1 GUC parameters.

Value range:

  • on indicates that delta tables are enabled.
  • off indicates that delta tables are disabled.

Default value: off
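The routing rule in the parameter description can be summarized as a small decision function. This is a simplified Python model, not MogDB code: the function name `choose_storage` is hypothetical, and the sketch assumes the data volume is counted in rows per import operation.

```python
def choose_storage(row_count: int, deltarow_threshold: int,
                   enable_delta_store: bool) -> str:
    """Model where one import into a column-store table is stored.

    Assumption: "data volume" is the number of rows in the import.
    """
    if enable_delta_store and row_count < deltarow_threshold:
        # Small imports go to the delta table when the feature is on.
        return "delta table"
    # Feature off, or volume at or above the threshold: CUs of the main table.
    return "main table CU"


if __name__ == "__main__":
    # The threshold of 100 is an example value chosen for illustration.
    for rows in (1, 99, 100, 5000):
        print(rows, "->", choose_storage(rows, 100, True))
```

Note that the comparison is strict: an import whose volume equals DELTAROW_THRESHOLD is written to the main table, as is every import when enable_delta_store is off.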