Specifying a Position for Logical Decoding
Serial/Parallel logical decoding supports the decoding of online WALs from a specified position. This position can be an LSN. Logical decoding finds a consistency LSN from the specified LSN, decodes data from the consistency LSN, and outputs data.
Specification Restrictions
- A position can be specified for decoding online WALs only. You need to adjust the number of online WAL files that can be retained on GaussDB based on the number of WAL files generated by different services every day, so that a position can be specified for decoding the WALs.
- Logical decoding actually starts from the consistency point. The specific location of the consistency point is related to the actual situation of concurrent transactions, for example, long transactions. Only the data modifications generated by transactions since the consistency point can be decoded. Ensure that the transactions to be decoded start after the consistency point. To prevent missing decoding, you are advised to specify a position earlier than the consistency point for logical decoding.
- When a version that a position can be specified for logical decoding is installed, the dictionary table is automatically baselined during initialization. After the installation is complete, administrator can specify a position with no need to execute baselined functions. For upgrade from a version that no position can be specified for logical decoding to a version that a position can be specified for logical decoding, to specify a position, administrators must execute baselined system functions. In addition, the LSN of the specified position must be greater than that when the baseline is complete.
- If the data dictionary retention period is shorter than the specified period of generating WALs, decoding exceptions may occur due to dictionary data missing. To configure the retention period of logical decoding data dictionaries, set GUC parameter logical_replication_dictionary_retention_time.
System catalog tuples related to data dictionaries can be recycled only when they meet the following conditions:
- The period obtained by subtracting the creation time of system catalog data from the current time is longer than period specified by logical_replication_dictionary_retention_time.
- The value of csnmax in the system catalog is not 0 and less than the minimum CSN after the logical replication slot number is updated. That is, the value of slot_dictionary_type is the minimum value specified by dictionary_csn_min among the tuples of the dictionary table.
gaussdb=# select * from pg_get_replication_slots(); slot_name | plugin | slot_type | datoid | active | xmin | catalog_xmin | restart_lsn | dummy_standby | confirmed_flush | confirmed_csn | dictionary_csn_min | slot_dictionary_type -----------+----------------+-----------+--------+--------+------+--------------+-------------+---------------+-----------------+---------------+--------------------+---------------------- slot_lsn | mppdb_decoding | logical | 41313 | f | | | 0/E34CDC0 | f | 0/E34CE40 | | 3011 | dictionary table (1 row)
API Design
- New control parameters
- A logical decoding control parameter is added to specify the decoding start point. You can use the JDBC API or logical replication functions (such as pg_logical_slot_peek_changes, pg_logical_slot_get_changes, pg_logical_slot_peek_binary_changes, and pg_logical_slot_get_binary_changes) to enable this parameter.
- restart-lsn: If this parameter is not specified, the original consistency point of the logical replication slot is used for starting logical decoding. If this parameter is set to a valid value, the logical decoding starts from a consistency LSN after restart-lsn.
- The following GUC parameters are added:
- logical_replication_dictionary_retention_time: specifies the data retention period when system catalogs related to data dictionaries are recycled. The default value is 365, in days.
- enable_logical_replication_dictionary: The default value is ON. If this parameter is set to ON, logical replication slots of the multi-version dictionary table type are created for logical decoding. If this parameter is set to OFF, logical replication slots of the multi-version dictionary table type cannot be created.
- A logical decoding control parameter is added to specify the decoding start point. You can use the JDBC API or logical replication functions (such as pg_logical_slot_peek_changes, pg_logical_slot_get_changes, pg_logical_slot_peek_binary_changes, and pg_logical_slot_get_binary_changes) to enable this parameter.
- The following system functions are added:
- gs_logical_dictionary_baseline(): baselines the data dictionary data for logical decoding. If the operation is successful, the time required is returned. If the operation fails, the failure cause is returned.
Specifications: The execution duration of a function is positively correlated with the number of service tables on the instance. If an instance has 10,000 service tables, the execution of the baseline function takes about 25 seconds. If an instance has 100,000 service tables, the execution of the baseline function takes about 120 seconds. Operations of other SQL statements are not blocked during function execution.
You can run SELECT status FROM gs_logical_dictionary; to check whether the instance has been baselined. The return values and meanings are as follows:
- 0: The baselining is complete but not loaded.
- 1: The baselining is not complete.
- 2: The baselining is in progress.
- 3: The baselining has been completed.
If the function is successfully executed and the return result is 3 as expected, the baselining is complete. Then, enable synchronous write to system catalogs related to the data dictionary. For example, when pg_class is modified, gs_logical_class is modified accordingly.
- Gs_logical_dictionary_disabled(): disables the logical decoding data dictionary function and stops writing data to system catalogs related to the data dictionary. If the command is executed successfully, "OK" is returned. If the command fails to be executed, the failure cause is returned.
After the data dictionary function is disabled, the existing data dictionary cannot be used by the replication slot for decoding. To use the data dictionary function, you need to call the gs_logical_dictionary_baseline() function to baseline the logical data dictionary again.
- gs_logical_dictionary_baseline(): baselines the data dictionary data for logical decoding. If the operation is successful, the time required is returned. If the operation fails, the failure cause is returned.
Procedure
- For the logical decoding feature, set the wal_level GUC parameter to logical in advance. This parameter takes effect only after restart.
gs_guc set -Z datanode -D $node_dir -c "wal_level = logical"
In the preceding command, $node_dir indicates the database node path. Change it based on the actual situation.
- Log in to the primary node of GaussDB as a user with the REPLICATION permission and run the following command to connect to the database:
gsql -U user1 -W password -d db1 -p 16000 -r
In the preceding command, user1 indicates the username, password indicates the user password, db1 indicates the name of the database to be connected, and 16000 indicates the database port number. You can replace them as required.
- Create a logical replication slot named slot1.
1 2 3 4 5
gaussdb=# SELECT * FROM pg_create_logical_replication_slot('slot1', 'mppdb_decoding'); slotname | xlog_position ----------+--------------- slot1 | 0/3764C788 (1 row)
- Query create_lsn at a specified time point.
1 2 3 4 5 6 7 8 9 10 11 12 13 14
gaussdb=# SELECT * FROM gs_txn_lsn_time ORDER BY create_time LIMIT 10; create_csn | create_lsn | snp_xmin | create_time | snp_snapshot | barrier ------------+------------+----------+-------------------------------+--------------+--------- 15639 | 607906688 | 0 | 2024-05-11 11:13:23.909348+08 | | 15640 | 607908632 | 0 | 2024-05-11 11:14:23.9082+08 | | 15641 | 607912608 | 0 | 2024-05-11 11:15:23.907633+08 | | 15642 | 607914280 | 0 | 2024-05-11 11:16:23.907281+08 | | 15643 | 607917416 | 0 | 2024-05-11 11:17:23.906145+08 | | 15646 | 608049488 | 0 | 2024-05-11 11:18:23.905562+08 | | 15647 | 608062840 | 0 | 2024-05-11 11:19:23.904358+08 | | 15648 | 608077904 | 0 | 2024-05-11 11:20:23.903672+08 | | 15651 | 608106328 | 0 | 2024-05-11 11:21:23.902544+08 | | 15653 | 608173464 | 0 | 2024-05-11 11:22:23.902104+08 | | (10 rows)
- create_lsn is stored as a number in the system catalog. An LSN can be converted to the common LSN format (for reference only), that is, XXXXXXXX/XXXXXXXX, using the SQL syntax. The conversion syntax is as follows:
SELECT CONCAT(to_hex(({lsn_num}::bigint >> 32) & x'ffffffff'::bigint), '/', to_hex(({lsn_num}::bigint << 32 >> 32) & x'ffffffff'::bigint));
When using the preceding SQL statement, convert lsn_num to the required number. The following uses 607906688 as an example:
gaussdb=# SELECT CONCAT(to_hex((607906688::bigint >> 32) & x'ffffffff'::bigint), '/', to_hex((607906688::bigint << 32 >> 32) & x'ffffffff'::bigint)); concat ------------ 0/243beb80 (1 row)
- The M-compatible syntax is different from the GaussDB syntax. Therefore, in an M-compatible database, you need to perform the following conversion:
SELECT CONCAT(hex(({lsn_num} >> 32) & 4294967295), '/', hex((({lsn_num}<< 32) >> 32) & 4294967295));
When using the preceding SQL statement, convert lsn_num to the required number. The following uses 607906688 as an example:
gaussdb_m=# SELECT CONCAT(hex((607906688 >> 32) & 4294967295), '/', hex(((607906688 << 32) >> 32) & 4294967295)); concat ------------ 0/243BEB80 (1 row)
- create_lsn is stored as a number in the system catalog. An LSN can be converted to the common LSN format (for reference only), that is, XXXXXXXX/XXXXXXXX, using the SQL syntax. The conversion syntax is as follows:
- Specify the position restart-lsn for starting logical decoding and read the decoding result of the replication slot slot1. You can use the JDBC API or a logical replication function to specify the position for starting logical decoding. For example, if the value of create_lsn is 607906688, convert the value to 0/243beb80 in hexadecimal format based on the conversion command provided in Step 4. Therefore, you can set the restart-lsn parameter in the JDBC API or logical replication function to 0/243beb80.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
gaussdb=# SELECT * FROM pg_logical_slot_get_changes('slot_1', NULL, NULL, 'restart-lsn', '0/243beb80'); location | xid | data ------------+--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 0/243CBA90 | 98702 | BEGIN 98702 0/243E1538 | 98702 | COMMIT 98702 (at 2024-05-11 11:18:24.824773+08) CSN 15645 0/243E1950 | 98703 | BEGIN 98703 0/243E1C70 | 98703 | COMMIT 98703 (at 2024-05-11 11:19:21.08274+08) CSN 15646 0/243E4D78 | 98704 | BEGIN 98704 0/243E5098 | 98704 | COMMIT 98704 (at 2024-05-11 11:20:21.083968+08) CSN 15647 0/243E8850 | 98705 | BEGIN 98705 0/243E8B70 | 98705 | COMMIT 98705 (at 2024-05-11 11:21:21.084388+08) CSN 15648 0/243E8B70 | 98706 | BEGIN 98706 0/243E8BB0 | 98706 | {"table_name":"public.tt","op_type":"INSERT","columns_name":["c1"],"columns_type":["integer"],"columns_val":["1"],"old_keys_name":[],"old_keys_type":[],"old_keys_val":[]} 0/243E8CD0 | 98706 | COMMIT 98706 (at 2024-05-11 11:21:23.894197+08) CSN 15649 0/243E8D80 | 98707 | BEGIN 98707 0/243E8DC0 | 98707 | {"table_name":"public.tt","op_type":"INSERT","columns_name":["c1"],"columns_type":["integer"],"columns_val":["2"],"old_keys_name":[],"old_keys_type":[],"old_keys_val":[]} 0/243E8EE0 | 98707 | COMMIT 98707 (at 2024-05-11 11:21:27.243517+08) CSN 15650 0/243EF758 | 98708 | BEGIN 98708 0/243EFA78 | 98708 | COMMIT 98708 (at 2024-05-11 11:22:21.08472+08) CSN 15651 0/243FFD98 | 98710 | BEGIN 98710 0/244000D0 | 98710 | COMMIT 98710 (at 2024-05-11 11:23:21.085542+08) CSN 15653 --More--
- Delete the logical replication slot slot1.
1 2 3 4 5
gaussdb=# SELECT * FROM pg_drop_replication_slot('slot1'); pg_drop_replication_slot -------------------------- (1 row)
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot