patternsqlMinor
Index on column with data type citext not used
Viewed 0 times
columnwithcitextusedtypeindexdatanot
Problem
In PostgreSQL 9.4, with following schema:
if I search by name, the index is used:
But if I replace
The index is not used anymore:
According to
Otherwise, it behaves almost exactly like text.
How can I tell PostgreSQL to use the index?
CREATE TABLE people (
id INTEGER PRIMARY KEY,
name TEXT,
junk CHAR(1000)
);
INSERT INTO people(id, name)
SELECT generate_series(1,100000), md5(random()::text);
CREATE INDEX ON people (name text_pattern_ops);if I search by name, the index is used:
test=# explain analyze select id, name from people where name like 'a%';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on people (cost=248.59..1160.92 rows=6061 width=37) (actual time=2.412..8.340 rows=6271 loops=1)
Filter: (name ~~ 'a%'::text)
Heap Blocks: exact=834
-> Bitmap Index Scan on people_name_idx (cost=0.00..247.08 rows=6266 width=0) (actual time=2.123..2.123 rows=6271 loops=1)
Index Cond: ((name ~>=~ 'a'::text) AND (name ~<~ 'b'::text))
Planning time: 0.600 ms
Execution time: 8.991 msBut if I replace
TEXT with CITEXT:CREATE EXTENSION CIText;
CREATE TABLE people (
id INTEGER PRIMARY KEY,
name CITEXT,
junk CHAR(1000)
);The index is not used anymore:
test=# explain analyze select id, name from people where name like 'a%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Seq Scan on people (cost=0.00..2084.00 rows=500 width=36) (actual time=5.700..152.572 rows=6305 loops=1)
Filter: (name ~~ 'a%'::citext)
Rows Removed by Filter: 93695
Planning time: 0.764 ms
Execution time: 153.046 msAccording to
CITEXT PostgreSQL documentation, the behavior should be as with TEXT: Otherwise, it behaves almost exactly like text.
How can I tell PostgreSQL to use the index?
Solution
Index usage with
Like you commented, the actual reason is burried in collation support.
Either way,
And a corresponding query:
Note that
Alternatively, you could use a trigram index, which is more expensive to maintain, but also offers more capabilities:
Aside: your test case is suboptimal because your dummy values are all lower case to begin with and the pattern
text_pattern_ops (as well as with the default operator class when using the C locale) depends on the binary representation of character data. citext stores original values with the case preserved, so there must be a problem with that ...Like you commented, the actual reason is burried in collation support.
Either way,
citext or text, you can make it work with an expression index:CREATE INDEX people_name_idx ON people (lower(name) text_pattern_ops);And a corresponding query:
SELECT id, name FROM people WHERE lower(name) LIKE 'abc%';Note that
lower(name) returns data type text, even when feeding it citext.Alternatively, you could use a trigram index, which is more expensive to maintain, but also offers more capabilities:
- How is LIKE implemented?
- Pattern matching with LIKE, SIMILAR TO or regular expressions
Aside: your test case is suboptimal because your dummy values are all lower case to begin with and the pattern
'a%' is often not selective enough to use an index at all. And char(1000) doesn't make sense (even though irrelevant to the test).Code Snippets
CREATE INDEX people_name_idx ON people (lower(name) text_pattern_ops);SELECT id, name FROM people WHERE lower(name) LIKE 'abc%';Context
StackExchange Database Administrators Q#105244, answer score: 7
Revisions (0)
No revisions yet.