Merge Join
Description
A merge join is an efficient join method that depends on sorting operations. During a merge join, GaussDB sorts the join columns of the two tables and scans both tables at the same time to find matching rows. The time complexity of a merge join is O(n + m), where n and m indicate the numbers of rows in the two tables. However, if a sorting operation is required, the time complexity of the sorting operation may reach max(O(log n), O(log m)), which is usually more time-consuming than a direct merge join operation. In GaussDB, the optimizer prefers hash join, even if the two tables to be joined have been sorted.
Typical Scenarios
- The two tables are similar in size.
- The join columns of two tables have been sorted, for example, by index.
- In addition to the equi-join (for example, using the = operator), the join condition also includes the range query (for example, using the <, <=, >, and >= operators).
Examples
Example: Two tables are joined and the sizes of the two tables are close.
-- Prepare data. gaussdb=# CREATE TABLE t1(id int,info text); CREATE TABLE gaussdb=# CREATE TABLE t2(id int,info text); CREATE TABLE gaussdb=# INSERT INTO t2 SELECT generate_series(1,100000),'bill'||generate_series(1,100000); INSERT 0 100000 gaussdb=# INSERT INTO t1 SELECT generate_series(1,100000),'bill'||generate_series(1,100000); INSERT 0 100000 -- Execution result. gaussdb=# EXPLAIN SELECT /*+ MERGEJOIN(t2 t1) */ * FROM t2 JOIN t1 ON (t1.id=t2.id); QUERY PLAN ----------------------------------------------------------------------- Merge Join (cost=19421.64..21421.64 rows=100000 width=26) Merge Cond: (t2.id = t1.id) -> Sort (cost=9710.82..9960.82 rows=100000 width=13) Sort Key: t2.id -> Seq Scan on t2 (cost=0.00..1406.00 rows=100000 width=13) -> Sort (cost=9710.82..9960.82 rows=100000 width=13) Sort Key: t1.id -> Seq Scan on t1 (cost=0.00..1406.00 rows=100000 width=13) (8 rows) -- Drop. gaussdb=# DROP TABLE t1,t2;
In the preceding example, the output of the merge join operator is as follows.
Item |
Description |
---|---|
Merge Join |
Operator name. |
Merge Cond |
Join predicate of the operator merge join. In the example, the condition is that t2.id is equal to t1.id. During query execution, rows that meet these conditions are included in the final result set. |
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot