patternsqlMinor
Multi-expression indexes in PostgreSQL not using being used as expected
Viewed 0 times
expressionpostgresqlmultiusedexpectedindexesbeingusingnot
Problem
I have a PostgreSQL 9.3 table with multiple expression indexes defined. When I use
http://sqlfiddle.com/#!15/580e0/1
The basic multi-column index
However the similar multi-column expression index abplus1 isn't being used:
The strange thing is that the more-specific index does get used if the single-column expression indexes are removed. See this example http://sqlfiddle.com/#!15/f4fda/1:
```
explain select * from Table1 where (a+1) = 2 AND (b+1) = 2
| QUERY PLAN |
|-----------------------------------------------------------------------|
| Index Scan using abplus1 on table1 (cost=0.14..8.15 ro
explain to see which indexes are being used for my query, I am surprised to see that PostgreSQL is not using the index which most closely-matches the conditions in my WHERE clause. It appears to be a problem specific to expression indexes. Here is a sqlfiddle link which demonstrates the problem:http://sqlfiddle.com/#!15/580e0/1
CREATE TABLE Table1
("a" int, "b" int, "c" int, "d" int)
;
CREATE index a ON Table1 (a);
CREATE index ab ON Table1 (a,b);
CREATE index bplus1 ON Table1 ((b+1));
CREATE index abplus1 ON Table1 ((a+1),(b+1));The basic multi-column index
ab is being used properly:explain select * from Table1 where a = 1 AND b = 1
| QUERY PLAN |
|------------------------------------------------------------------|
| Index Scan using ab on table1 (cost=0.13..8.15 rows=1 width=16) |
| Index Cond: ((a = 1) AND (b = 1)) |However the similar multi-column expression index abplus1 isn't being used:
explain select * from Table1 where (a+1) = 2 AND (b+1) = 2
| QUERY PLAN |
|----------------------------------------------------------------------|
| Index Scan using bplus1 on table1 (cost=0.13..8.16 rows=1 width=16) |
| Index Cond: ((b + 1) = 2) |
| Filter: ((a + 1) = 2) |The strange thing is that the more-specific index does get used if the single-column expression indexes are removed. See this example http://sqlfiddle.com/#!15/f4fda/1:
```
explain select * from Table1 where (a+1) = 2 AND (b+1) = 2
| QUERY PLAN |
|-----------------------------------------------------------------------|
| Index Scan using abplus1 on table1 (cost=0.14..8.15 ro
Solution
It looks to me like there is a conflict between the single-expression and the multi-expression indexes.
All applicable indexes are candidates, if that's what you mean by "conflict". But that's the extent of it. Postgres estimates execution time based on table and index statistics in combination with cost settings, and the best estimate wins.
If Postgres expects one of your two conditions to be selective enough, it will likely take a matching single-column index and filter possible (few!) false positives in a
However, that is certainly not the case for the data distribution you mention in the comment:
20,000 rows. b=1 for 10,000, a=1 for 2. One row for a=1,b=1
@ypercube already demonstrated in his sqlfiddle that your test should use the multicolumn index. Even more so with the data distribution you describe.
If you provide the output of
Come to think of it, we (that is: you & me) solved this problem before:
Also related:
That aside, Postgres 9.3 is getting old. There have been 3 major release since. If you run into problems like this I suggest to test a current version and see if the problem is still there.
All applicable indexes are candidates, if that's what you mean by "conflict". But that's the extent of it. Postgres estimates execution time based on table and index statistics in combination with cost settings, and the best estimate wins.
If Postgres expects one of your two conditions to be selective enough, it will likely take a matching single-column index and filter possible (few!) false positives in a
FILTER step with the second condition.However, that is certainly not the case for the data distribution you mention in the comment:
20,000 rows. b=1 for 10,000, a=1 for 2. One row for a=1,b=1
@ypercube already demonstrated in his sqlfiddle that your test should use the multicolumn index. Even more so with the data distribution you describe.
If you provide the output of
EXPLAIN (ANALYZE, BUFFERS) instead of just EXPLAIN, we might be able to see more. I would guess you didn't run ANALYZE after creating the expression index, which is required to update statistics. autovacuum will kick in after some time, but not fast enough if you run your test immediately after creating the index (and never for temporary tables!).ANALYZE Table1;Come to think of it, we (that is: you & me) solved this problem before:
- PostgreSQL partial index unused when created on a table with existing data
Also related:
- Index that is not used, yet influences query
That aside, Postgres 9.3 is getting old. There have been 3 major release since. If you run into problems like this I suggest to test a current version and see if the problem is still there.
Code Snippets
ANALYZE Table1;Context
StackExchange Database Administrators Q#155928, answer score: 4
Revisions (0)
No revisions yet.