patternsqlMinor
Expression index on a citext column ignored, why?
Viewed 0 times
expressionwhyignoredcolumncitextindex
Problem
Running on RDS with about 32M rows.
Also testing locally on macOS with about 8M rows.
I've got a column named
Background
I've got a field change log table named record_changes_log_detail with 32M rows and growing that includes a citext field named old_value.
The data is very skeweed. Most values are less than a dozen characters, some are more than 5,000.
Postgres chokes on large values with an error about B-tree entries being limited to 2172 characters. So I believe that for a B-tree, I need to substring the source value.
My users primary interest is in an = search, a starts-with search, and, sometimes, a contains-this-substring search. So = string% and %string%
Goals
Create an index that supports those searches that the planner uses.
Tried and failed
A straight B-tree fails to build, in some cases, because of long values.
An expression B-tree like this builds, but is not used
Adding text_pattern_opts does not help.
Tried and works partially
A hash index works, but only for equality. (Like it says on the tin.)
This is the closest I've gotten to success:
This works for quality, but not
PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bitAlso testing locally on macOS with about 8M rows.
PostgreSQL 11.5 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bitI've got a column named
old_value that's of type citext. I asked about this already, but posted way to many of my discovery steps along the way. Here's a boiled down version that I'm hoping gets to the point.Background
I've got a field change log table named record_changes_log_detail with 32M rows and growing that includes a citext field named old_value.
The data is very skeweed. Most values are less than a dozen characters, some are more than 5,000.
Postgres chokes on large values with an error about B-tree entries being limited to 2172 characters. So I believe that for a B-tree, I need to substring the source value.
My users primary interest is in an = search, a starts-with search, and, sometimes, a contains-this-substring search. So = string% and %string%
Goals
Create an index that supports those searches that the planner uses.
Tried and failed
A straight B-tree fails to build, in some cases, because of long values.
An expression B-tree like this builds, but is not used
CREATE INDEX record_changes_log_detail_old_value_ix_btree
ON record_changes_log_detail
USING btree (substring(old_value,1,1024));Adding text_pattern_opts does not help.
CREATE INDEX record_changes_log_detail_old_value_ix_btree
ON record_changes_log_detail
USING btree (substring(old_value,1,1024) text_pattern_opts);Tried and works partially
A hash index works, but only for equality. (Like it says on the tin.)
This is the closest I've gotten to success:
CREATE INDEX record_changes_log_detail_old_value_ix_btree
ON record_changes_log_detail
USING btree (old_value citext_pattern_ops);This works for quality, but not
Solution
Please edit your question, rather than posting answers to it that don't answer it.
If you create an index on the expression
While it is theoretically possible to prove that
If you create an index on the expression
substring(old_value,1,1024), then that index can only get used if you query involves substring(old_value,1,1024).While it is theoretically possible to prove that
old_value='foo' implies that substring(old_value,1,1024)='foo' (and thus the contrapositive to that) if you have enough insight into the internals of substring, PostgreSQL makes no attempt to prove that. You need to write the query in a way that no such proof is needed.Context
StackExchange Database Administrators Q#249022, answer score: 3
Revisions (0)
No revisions yet.