HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Can PostgreSQL use nulls in its indexes?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlcanindexesitsusenulls

Problem

I've been reading this book which says that


The database assumes that Indexed_Col IS NOT NULL covers too large a range to be useful, so the database will not drive to an index from this condition.

I recognize that the book is more than 10 years old, but it has already proven quite useful--Using instructions gleaned from its pages, I've sped a query up by a factor of ten.

Further, in running EXPLAIN ANALYZE on a SELECT query, I've found that none of my indexes are being used, even when by all rights, they ought to be.

Thus, my question is:

Supposing there is a table which has a column, whose column definition includes "NOT NULL", and that an index exists which covers this column, would this index be used in a query of that table where the columns is part of the query?

Like:

CREATE TABLE my_table(
a varchar NOT NULL
);

CREATE INDEX ix_my_table ON my_table(a);

SELECT a from my_table;

Solution

PostgreSQL certainly can use an index for IS NOT NULL. I don't see any query planner assumptions about that condition, either.

If the null fraction for the column (pg_statistic.stanullfrac) is low enough to suggest that the index is usefully selective for the query, PostgreSQL will use an index.

I can't figure out what you're trying to say with:


If this is correct, is my understanding that an index on a column defined as "NOT NULL" not be used in a query which uses that column?

Certainly an index won't get used for an IS NOT NULL condition on a NOT NULL column. It'd always match 100% of rows, so a seqscan will almost always be much faster.

PostgreSQL won't use an index if the index doesn't filter out a large proportion of rows for a query. The only likely exception is when you're asking for a set of columns covered by a single index, in an order matching that of the index. PostgreSQL might do an index-only scan then. E.g. if there's an index on t(a, b, c) and you:

select a, b FROM t ORDER BY a, b, c;


PostgreSQL might use your index, even though no rows are filtered out, because it only has to read the index and can skip reading the heap, avoid doing a sort, etc.

Code Snippets

select a, b FROM t ORDER BY a, b, c;

Context

StackExchange Database Administrators Q#110607, answer score: 9

Revisions (0)

No revisions yet.