HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Expression index on a citext column ignored, why?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
expressionwhyignoredcolumncitextindex

Problem

Running on RDS with about 32M rows.

PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit


Also testing locally on macOS with about 8M rows.

PostgreSQL 11.5 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit


I've got a column named old_value that's of type citext. I asked about this already, but posted way to many of my discovery steps along the way. Here's a boiled down version that I'm hoping gets to the point.

Background

I've got a field change log table named record_changes_log_detail with 32M rows and growing that includes a citext field named old_value.

The data is very skeweed. Most values are less than a dozen characters, some are more than 5,000.

Postgres chokes on large values with an error about B-tree entries being limited to 2172 characters. So I believe that for a B-tree, I need to substring the source value.

My users primary interest is in an = search, a starts-with search, and, sometimes, a contains-this-substring search. So = string% and %string%

Goals

Create an index that supports those searches that the planner uses.

Tried and failed

A straight B-tree fails to build, in some cases, because of long values.

An expression B-tree like this builds, but is not used

CREATE INDEX record_changes_log_detail_old_value_ix_btree
    ON  record_changes_log_detail 
    USING btree (substring(old_value,1,1024));


Adding text_pattern_opts does not help.

CREATE INDEX record_changes_log_detail_old_value_ix_btree
    ON  record_changes_log_detail 
    USING btree (substring(old_value,1,1024) text_pattern_opts);


Tried and works partially

A hash index works, but only for equality. (Like it says on the tin.)

This is the closest I've gotten to success:

CREATE INDEX record_changes_log_detail_old_value_ix_btree
    ON record_changes_log_detail 
    USING btree (old_value citext_pattern_ops);


This works for quality, but not

Solution

Please edit your question, rather than posting answers to it that don't answer it.

If you create an index on the expression substring(old_value,1,1024), then that index can only get used if you query involves substring(old_value,1,1024).

While it is theoretically possible to prove that old_value='foo' implies that substring(old_value,1,1024)='foo' (and thus the contrapositive to that) if you have enough insight into the internals of substring, PostgreSQL makes no attempt to prove that. You need to write the query in a way that no such proof is needed.

Context

StackExchange Database Administrators Q#249022, answer score: 3

Revisions (0)

No revisions yet.