patternsqlMinor
Not searching GIN index when querying ANY(column)
Viewed 0 times
searchingcolumnanyginqueryingwhenindexnot
Problem
In a Postgres 9.4.11 database, I have a large table of books, which has a one-to-many relationship with authors. Theoretically to improve performance, I've cached the author names in the ebook record for quick searching. Properties of the column
Sample rows:
I've created a GIN index on this column:
It's not using the index when searching
Result:
10 seconds is not an acceptable amount of time. I'm open to design changes, as I'm not "married" to this column, yet.
authors_cache:Sample rows:
I've created a GIN index on this column:
CREATE INDEX "index_ebooks_on_authors_cache" ON "ebooks" USING gin ("authors_cache")It's not using the index when searching
authors_cache and takes a unacceptable amount of time:EXPLAIN ANALYZE
SELECT * FROM "ebooks"
WHERE ('Charles Bukowski' = ANY (authors_cache))
LIMIT 60 OFFSET 0;Result:
Limit (cost=0.00..172334.67 rows=30 width=124) (actual time=71.962..10067.070 rows=59 loops=1)
-> Seq Scan on ebooks (cost=0.00..172334.67 rows=30 width=124) (actual time=71.960..10067.015 rows=59 loops=1)
Filter: ('Charles Bukowski'::text = ANY ((authors_cache)::text[]))
Rows Removed by Filter: 894504
Planning time: 0.188 ms
Execution time: 10067.112 ms
10 seconds is not an acceptable amount of time. I'm open to design changes, as I'm not "married" to this column, yet.
Solution
Use array operators which are supported by your GIN index. The
Detailed explanation:
Not sure your de-normalization will buy you much. A standard 1:n implementation with simple btree indexes should perform nicely, too.
If you are going with this approach, I would consider storing an array of integer (
Oh, and consider upgrading. Performance with GIN indexes has been improved substantially since Postgres 9.4.
= ANY () construct is not.SELECT *
FROM ebooks
WHERE authors_cache @> '{Charles Bukowski}' -- array literal
LIMIT 60;Detailed explanation:
- Can PostgreSQL index array columns?
Not sure your de-normalization will buy you much. A standard 1:n implementation with simple btree indexes should perform nicely, too.
If you are going with this approach, I would consider storing an array of integer (
int[]) with author_id's. Substantially smaller array column and GIN index. Potential problems with referential integrity are about the same. (Array elements cannot currently have FK constraints. Attempts to implement this in Postgres were unsuccessful so far.)Oh, and consider upgrading. Performance with GIN indexes has been improved substantially since Postgres 9.4.
Code Snippets
SELECT *
FROM ebooks
WHERE authors_cache @> '{Charles Bukowski}' -- array literal
LIMIT 60;Context
StackExchange Database Administrators Q#187959, answer score: 8
Revisions (0)
No revisions yet.