Documentation:v2.1

FAQs

An Error Is Reported Displaying "Failed to obtain the GPHOME" When a Command Is Executed

Symptom

The following error is displayed when user root runs a command.

Failed to obtain the GPHOME.

Possible Cause

The GPHOME environment variable is not correctly configured. Check whether the value of GPHOME matches the gaussdbToolPath setting in the cluster XML configuration file.

Procedure

Check the $GPHOME path.

echo $GPHOME

If $GPHOME is not the default installation path, modify it in the /etc/profile configuration file.

vim /etc/profile
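The check described above can be sketched as a small shell helper. This is an illustrative sketch, not part of the product tooling: the function names get_tool_path and check_gphome are hypothetical, and the sketch assumes the cluster XML file declares the path on a single line in the form <PARAM name="gaussdbToolPath" value="..."/>.

```shell
# Print the gaussdbToolPath value found in the cluster XML file given as $1.
# Assumption: the parameter is written on one line as
#   <PARAM name="gaussdbToolPath" value="/some/path"/>
get_tool_path() {
    sed -n 's/.*name="gaussdbToolPath"[[:space:]]*value="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Return 0 if GPHOME equals the gaussdbToolPath in the XML file, 1 otherwise.
check_gphome() {
    xml_file=$1
    expected=$(get_tool_path "$xml_file")
    if [ -z "${GPHOME:-}" ]; then
        echo "GPHOME is not set"
        return 1
    fi
    if [ "$GPHOME" != "$expected" ]; then
        echo "GPHOME=$GPHOME does not match gaussdbToolPath=$expected"
        return 1
    fi
    echo "GPHOME matches gaussdbToolPath: $GPHOME"
}
```

Call it with the path to your own cluster XML file, for example check_gphome /path/to/cluster_config.xml (a placeholder path). A non-zero exit status indicates that GPHOME needs to be corrected in /etc/profile.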

Restoration Method for Incomplete Key Files Caused by Interruption During Standby Instance Rebuilding Using gs_ctl

Symptom

The standby instance fails to be rebuilt because the rebuilding process is interrupted. The following error information is displayed:

CRC checksum does not match value stored in file, maybe the cipher file is corrupt
non obs cipher file or random parameter file is invalid.
read cipher file or random parameter file failed.
2020-06-18 20:58:12.080 5eeb64e3.1 [unknown] 140697304617088 [unknown] 0 dn_6001_6002 F0000 0 [BACKEND] FATAL:  could not load server certificate file "server.crt": no start line
[2020-06-18 20:58:12.086][24066][dn_6001_6002][gs_ctl]:  waitpid 24446 failed, exitstatus is 256, ret is 2

Possible Cause

The certificate files are left incomplete when rebuilding is interrupted. Subsequent rebuild attempts then fail because of the incomplete certificate files.

Procedure

  1. Check the sizes of the certificate and key files in the data directory.

    ll
    -rw------- 1 omm omm       0 Jun 18 20:58 server.crt
    -rw------- 1 omm omm       0 Jun 18 20:58 server.key
    -rw------- 1 omm omm       0 Jun 18 20:58 server.key.cipher
    -rw------- 1 omm omm       0 Jun 18 20:58 server.key.rand
  2. If the file sizes are 0, delete the certificate and key files.

    rm -rf server.crt server.key server.key.cipher server.key.rand
  3. Rebuild the standby instance.

    gs_ctl build -D data_dir

NOTE: If the database on the standby node is stopped, regenerate the certificate files or copy them from $GAUSSHOME/share to the data directory, start the standby node, and then rebuild the standby instance. For details about how to generate a certificate file, see the Developer Guide.
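Steps 1 and 2 can be combined into a small helper. The following is a minimal sketch under stated assumptions: the function name remove_empty_cert_files is hypothetical, and unlike the rm command in step 2 it removes only those of the four files that are actually zero bytes long, leaving any intact file in place.

```shell
# Remove zero-length certificate and key files from the data directory in $1.
# Files that still have content are left untouched.
remove_empty_cert_files() {
    data_dir=$1
    for name in server.crt server.key server.key.cipher server.key.rand; do
        file="$data_dir/$name"
        # -f: regular file exists; ! -s: file size is 0
        if [ -f "$file" ] && [ ! -s "$file" ]; then
            rm -f "$file"
            echo "removed empty file: $file"
        fi
    done
}
```

Pass the data directory of the standby instance as the argument, then rebuild the standby instance with gs_ctl build as described in step 3.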

No Response Is Returned for a Long Time When gs_om -t status --all Is Used to Query Database Status

Symptom

The system does not respond for a long time after the gs_om -t status --all command is executed.

Cause Analysis

A likely cause is that the GaussDB process is hung. The status query calls the gsql or gs_ctl tool to query the database status, so when the process is hung, no response is returned until the query times out.

Procedure

  1. Check whether gsql can access the database. If the following information is displayed, the GaussDB process is hung and the database is abnormal.

    gsql -d postgres -p 29776        
    gsql: wait (null):29776 timeout expired, errno: Success
  2. Check whether the postgresql-*.log files contain error information. If they do, rectify the fault based on the error information.

    cd $GAUSSLOG/pg_log/dn_6001;grep "ERROR\|FATAL" postgresql-*.log   
  3. If the database is hung and the gs_om command does not take effect, find the process ID on each node and kill the process.

    ps -ef|grep $GAUSSHOME/bin/gaussdb|grep -v grep       
    kill -9 $pid
  4. After the processes on all nodes are killed, restart the database. In a test environment, run the following command on one node to restart it directly. In a production environment, contact Enmo technical support.

    gs_om -t start
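To avoid waiting indefinitely on a hung process during step 1, the connectivity check can be wrapped in a time limit. The sketch below assumes the GNU coreutils timeout command is available. The function name probe_with_timeout is hypothetical, and the 124 exit status is the value timeout returns when the limit is reached.

```shell
# Run a command with a time limit (seconds in $1, command in the remaining
# arguments) and print "ok", "hung", or "failed".
# Assumption: GNU coreutils `timeout` is installed; it exits with 124
# when the command is stopped because the limit was reached.
probe_with_timeout() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -eq 124 ]; then
        echo "hung"
    else
        echo "failed"
    fi
    return "$status"
}
```

For example, probe_with_timeout 10 gsql -d postgres -p 29776 -c "SELECT 1" would print hung if gsql gets no answer within 10 seconds. The port is the one used in step 1, and the 10-second limit is an arbitrary choice.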

gs_sshexkey Reports an Error When the Same User Has Different Passwords

Symptom

In the openEuler environment, gs_sshexkey supports establishing mutual trust when the same user has different passwords on different nodes. However, authentication fails even after the correct password is entered.

Cause Analysis

Open the system log file /var/log/secure and check whether it contains the log entry "pam_faillock(sshd:auth): Consecutive login failures for user". If it does, the user account is locked because the number of incorrect password attempts exceeded the upper limit.
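The log check can be scripted as follows. This is a minimal sketch: the function name has_lockout_entry is hypothetical, and it assumes the user name directly follows the message text quoted above.

```shell
# Return 0 if the log file in $1 contains a pam_faillock lockout entry
# for the user named in $2, and 1 otherwise.
# Note: this is a plain substring match, so a user name that is a prefix
# of another user name (for example "omm" and "omm2") also matches.
has_lockout_entry() {
    log_file=$1
    user=$2
    grep -F -q "pam_faillock(sshd:auth): Consecutive login failures for user $user" "$log_file"
}
```

For example, has_lockout_entry /var/log/secure omm succeeds if user omm has been locked out. Reading /var/log/secure normally requires root privileges.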

Procedure

In the /etc/pam.d directory, modify the system-auth, password-auth, and password-auth-crond configuration files to increase the deny value (deny=3 in the files). Restore the original value after the mutual trust relationships are established.
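Before editing, it can help to record the current deny settings so that they can be restored afterwards. The following sketch lists them. The function name list_deny_settings is hypothetical, and the directory argument defaults to /etc/pam.d.

```shell
# Print every line that sets deny=<number> in the three PAM files named in
# the procedure, prefixed with the file name and line number.
list_deny_settings() {
    pam_dir=${1:-/etc/pam.d}
    for name in system-auth password-auth password-auth-crond; do
        file="$pam_dir/$name"
        # Skip files that do not exist on this system.
        [ -f "$file" ] || continue
        # grep exits with 1 when nothing matches; that is not an error here.
        grep -Hn 'deny=[0-9][0-9]*' "$file" || true
    done
}
```

Saving the output of list_deny_settings before making changes gives a reference for restoring the original values once mutual trust is established.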

Copyright © 2011-2024 www.enmotech.com All rights reserved.