patternMinor
Improve query "JOIN ... WHERE NOT EXISTS (SELECT 1 FROM ... WHERE ... )"
Viewed 0 times
querywherejoinimproveexistsselectfromnot
Problem
I'm trying to optimize a query because the plan generates a full table scan for a subquery.
Data
The (simplified) table structure:
The data (in real life the tables have hundreds of thousands of rows):
Query
The (simplified) original query:
Results:
This has a query plan with a full table scan for the subquery:
```
+------+--------------------+---------+-------+-------------------------+-----------+---------+----------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+---------+-------+-------------------------+-----------+-
Data
The (simplified) table structure:
CREATE TABLE RECORD (
ID INTEGER PRIMARY KEY AUTO_INCREMENT
);
CREATE TABLE POINT (
ID INTEGER PRIMARY KEY AUTO_INCREMENT,
RECORD_ID INTEGER NOT NULL,
CONSTRAINT POINT_FK1 FOREIGN KEY (RECORD_ID) REFERENCES RECORD(ID)
);
CREATE TABLE SEGMENT (
ID INTEGER PRIMARY KEY AUTO_INCREMENT,
POINT_START INTEGER NOT NULL,
POINT_END INTEGER NOT NULL,
RECORD_ID INTEGER NOT NULL,
CONSTRAINT SEGMENT_FK1 FOREIGN KEY (POINT_START) REFERENCES POINT(ID),
CONSTRAINT SEGMENT_FK2 FOREIGN KEY (POINT_END) REFERENCES POINT(ID),
CONSTRAINT SEGMENT_FK3 FOREIGN KEY (RECORD_ID) REFERENCES RECORD(ID)
);The data (in real life the tables have hundreds of thousands of rows):
INSERT INTO RECORD(ID) VALUES (1), (2), (3);
INSERT INTO POINT(ID, RECORD_ID) VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 2), (6, 2);
INSERT INTO SEGMENT(ID, POINT_START, POINT_END, RECORD_ID) VALUES (1, 1, 2, 1), (2, 3, 4, 1);Query
The (simplified) original query:
SELECT
RECORD.ID AS RECORD_ID,
POINT.ID AS POINT_ID
FROM
RECORD
LEFT OUTER JOIN POINT
ON RECORD.ID = POINT.RECORD_ID
AND (NOT EXISTS (SELECT 1 FROM SEGMENT WHERE POINT_START = POINT.ID OR POINT_END = POINT.ID))Results:
+-----------+----------+
| RECORD_ID | POINT_ID |
+-----------+----------+
| 1 | NULL |
| 2 | 5 |
| 2 | 6 |
| 3 | NULL |
+-----------+----------+This has a query plan with a full table scan for the subquery:
```
+------+--------------------+---------+-------+-------------------------+-----------+---------+----------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+---------+-------+-------------------------+-----------+-
Solution
I tried the
http://sqlfiddle.com/#!9/f6e9ca/1
The dependent subqueries are now joins instead, using the same indexes but it should have less overhead - dependent subquery means firing a subquery per main (outer) select row which is not as well optimized as joins are.
LEFT JOIN .. IS NULL idea I noted in the comment:SELECT
RECORD.ID AS RECORD_ID,
POINT.ID AS POINT_ID
FROM
RECORD
LEFT OUTER JOIN (
POINT
LEFT JOIN SEGMENT s1 ON POINT.ID = s1.POINT_START
LEFT JOIN SEGMENT s2 ON POINT.ID = s2.POINT_END
) ON RECORD.ID = POINT.RECORD_ID
AND s1.POINT_START IS NULL AND s2.POINT_END IS NULL;http://sqlfiddle.com/#!9/f6e9ca/1
The dependent subqueries are now joins instead, using the same indexes but it should have less overhead - dependent subquery means firing a subquery per main (outer) select row which is not as well optimized as joins are.
Code Snippets
SELECT
RECORD.ID AS RECORD_ID,
POINT.ID AS POINT_ID
FROM
RECORD
LEFT OUTER JOIN (
POINT
LEFT JOIN SEGMENT s1 ON POINT.ID = s1.POINT_START
LEFT JOIN SEGMENT s2 ON POINT.ID = s2.POINT_END
) ON RECORD.ID = POINT.RECORD_ID
AND s1.POINT_START IS NULL AND s2.POINT_END IS NULL;Context
StackExchange Database Administrators Q#137384, answer score: 2
Revisions (0)
No revisions yet.