COPY Import Optimization

Availability

This feature is available since MogDB 3.0.0.

Introduction

COPY is the most commonly used way to import user table data. This feature uses the SIMD capabilities of modern CPUs to speed up the parsing stage of COPY, which in turn increases import speed.

When COPY imports data from a file, the parsing phase is essentially a series of string comparisons: it searches for separators and checks whether the CSV/TEXT data being parsed is valid. SIMD instructions compare multiple characters at a time, which reduces the number of branches and improves performance.

Benefits

During COPY parsing, the lookup of row and column separators is optimized with SIMD instructions. The feature is aimed at general users, such as DBAs and software developers. COPY performance improves by 10% to 30%.

Number of data records             100,000,000
Total data size                    24 GB
Average performance improvement    12.29%

The test results are as follows.

Test sequence         Time without SIMD (seconds)    Time with SIMD (seconds)
1                     761.01                         671.05
2                     747.06                         662.60
3                     770.22                         663.03
4                     747.94                         674.03
5                     787.22                         674.13
Average time spent    762.69                         668.97
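
The averages and the 12.29% figure can be re-derived from the five runs. The following is a minimal sketch using awk; the helper name copy_improvement is hypothetical, and the improvement is assumed to be the relative reduction in elapsed time, (without - with) / without.

```shell
# Per-run timings from the test results: "without SIMD" and "with SIMD", in seconds.
timings="761.01 671.05
747.06 662.60
770.22 663.03
747.94 674.03
787.22 674.13"

# Hypothetical helper: read "without with" pairs on stdin and print
# both averages followed by the relative reduction in elapsed time.
copy_improvement() {
  awk '{ base += $1; simd += $2; n++ }
       END { printf "%.2f %.2f %.2f%%\n", base / n, simd / n, (base - simd) / base * 100 }'
}

printf '%s\n' "$timings" | copy_improvement
```

With the timings listed, this prints 762.69 668.97 12.29%, matching the reported averages and improvement.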

Constraints

  • Only x86 machines are supported, and only text and csv files. The following are not supported: escape characters, escape and quote, null value substitution, and custom column separators.

  • The string comparison instructions used by this optimization were introduced in SSE4.2, so only x86 CPUs that support SSE4.2 can use it.

The following command shows whether the machine supports the SSE4.2 instruction set (log in as either the root or the omm user).

[omm3@hostname ~]$ grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
SSE 4.2 supported

[xxxx@hostname ~]$ grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
SSE 4.2 not supported
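
Both constraints (x86 architecture and the sse4_2 CPU flag) can be checked together. The following is a minimal sketch, not part of MogDB: the function name sse42_usable and its two optional arguments (machine name and cpuinfo path) are hypothetical and exist mainly to make the check easy to test.

```shell
# Hypothetical helper: succeeds only on x86 machines whose CPU flags include sse4_2.
# $1: machine name (defaults to `uname -m`), $2: cpuinfo path (defaults to /proc/cpuinfo).
sse42_usable() {
  arch="${1:-$(uname -m)}"
  cpuinfo="${2:-/proc/cpuinfo}"
  case "$arch" in
    x86_64|i?86) ;;        # x86 family only
    *) return 1 ;;         # any other architecture cannot use the optimization
  esac
  # -w matches sse4_2 as a whole word, so sse4_1 or longer names do not match.
  grep -qw sse4_2 "$cpuinfo" 2>/dev/null
}

if sse42_usable; then
  echo "SSE 4.2 supported"
else
  echo "SSE 4.2 not supported"
fi
```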

The enable_sse42 parameter can be turned on or off with the following commands.

Log in to the database.

[omm3@hostname ~]$ gsql -d postgres -p18000
gsql ((MogDB 3.0.0 build 945141ad) compiled at 2022-05-28 16:14:02 commit 0 last mr  )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.

Enable the enable_sse42 feature.

MogDB=# set enable_sse42 to on;
SET
MogDB=# show enable_sse42;
enable_sse42 
--------------
 on
(1 row)

Disable the enable_sse42 feature.

MogDB=# set enable_sse42 to off;
SET
MogDB=# show enable_sse42;
enable_sse42 
--------------
 off
(1 row)
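
To compare import times with the setting on and off, the SET and the COPY must run in the same session. The following is a rough sketch that only generates the gsql input. The helper name copy_timing_sql, the table copy_test, and the path /tmp/data.csv are hypothetical, and the COPY option syntax shown (with (format 'csv')) and gsql's \timing meta-command are assumptions to verify against the COPY and gsql references for your version.

```shell
# Hypothetical helper: print a gsql script that times one COPY import
# with enable_sse42 set to the given value (on or off).
# $1: on|off, $2: target table (assumed to exist), $3: server-side data file.
copy_timing_sql() {
  setting="$1"
  table="$2"
  file="$3"
  cat <<EOF
\\timing on
set enable_sse42 to ${setting};
show enable_sse42;
copy ${table} from '${file}' with (format 'csv');
EOF
}

# Print the script for a run with the optimization enabled.
copy_timing_sql on copy_test /tmp/data.csv

# Assumed usage: pipe the script into a gsql session, once per setting, e.g.
#   copy_timing_sql off copy_test /tmp/data.csv | gsql -d postgres -p18000
```

Empty the target table between runs so that both imports load the same amount of data into the same starting state.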
