HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Creating index on TIMESTAMP column to use with range operators

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
columncreatingwithrangetimestampindexuseoperators

Problem

TLDR: Can I create an index that's used by the following WHERE clause:

WHERE foo_date <@ tsrange('2018-01-01', '2018-02-01')


Let's say I have a table like this:

CREATE TABLE foo
(
foo_id INTEGER GENERATED BY DEFAULT AS IDENTITY,
foo_date timestamp without time zone NOT NULL,
CONSTRAINT foo_pkey PRIMARY KEY (foo_id)
);

This table contains 100,000 records with dates from 2009-01-01 to 2018-12-29. I'd like to be able to query for rows in a given date range (e.g. for rows in January 2018).

Option 1

One approach is to use the BETWEEN operator:

SELECT * FROM foo WHERE foo_date BETWEEN '2018-01-01' AND '2018-01-31';

The problem of this approach is that if foo_date occurred on 2018-01-31 after midnight, they wouldn't be included in this query. So I could change the query to BETWEEN '2018-01-01' AND '2018-02-01'. The problem then, however, is records that occur on 2018-02-01 00:00:00. These would be included, which I don't want.

Option 2

Another option, put forth by Aaron Bertrand, is to use this construct:

foo_date >= '2018-01-01' AND foo_date < '2018-02-01'


(Yes, this blog is for SQL Server, but seems to be applicable here).

While this form unequivocally gives me the results I want, it's cumbersome: I have to repeat the column name twice.

Option 3

Since Postgres gives us the range data type, I thought a clearer form might be:

foo_date <@ tsrange('2018-01-01', '2018-02-01')


So my next question is, if I use this form, can I use an index to speed up operation?

With Options 1 and 2 above, a normal b-tree index can be used:

CREATE INDEX idx_foo ON foo(foo_date);


A query using Options 1 or 2 will use the index:

EXPLAIN SELECT * FROM foo 
WHERE 
    foo_date >= '2018-01-01' 
    AND foo_date < '2018-02-01';


gives me this query plan:

```
Bitmap Heap Scan on foo (cost=21.95..592.70 rows=942 width=12)
Recheck Cond: ((foo_date >= '2018-01-01 00:00:00'::timestamp without time zone) AND (f

Solution

The name "foo_date" indicates a date and is a bad choice for a timestamp column. Option 1 would work just fine with actual dates.

A plain btree index in combination with Option 2 is the unequivocally best solution. Look no further. Except maybe for the special case of a BRIN index for large tables with physically sorted data. See:

  • Speed up creation of Postgres partial index



Proof of concept

That said, to make a GiST or SP-GiST index work, you could create an expression index on fake ranges. You do not need the module btree_gist for this. Demonstrating with SP-GiST since that is typically a bit faster here. See:

  • Perform this hours of operation query in PostgreSQL



CREATE INDEX foo_date_spgist_idx ON foo USING spgist(tsrange(foo_date, foo_date, '[]'));

SELECT * FROM foo
WHERE  tsrange(foo_date, foo_date, '[]') <@ tsrange('2018-01-01', '2018-02-01')


Or with range literal:

...
WHERE  tsrange(foo_date, foo_date, '[]') <@ '[2018-01-01,2018-02-01)'


But: bigger, more expensive to maintain, slower than the btree index. Not even less cumbersome to write. Pointless for your case.

Aside: technically, you could:

... WHERE foo_date BETWEEN '2018-01-01' AND '2018-01-31 23:59.999999';


The Postgres timestamp type is (currently) implemented with µs resolution, i.e. max. 6 fractional digits. Hence the expression does exactly what you want. But I strongly advice against building on this implementation detail. Option 2 is the way to go. Related:

  • Interval (days) in PostgreSQL with two parameters



  • Ignoring time zones altogether in Rails and PostgreSQL

Code Snippets

CREATE INDEX foo_date_spgist_idx ON foo USING spgist(tsrange(foo_date, foo_date, '[]'));

SELECT * FROM foo
WHERE  tsrange(foo_date, foo_date, '[]') <@ tsrange('2018-01-01', '2018-02-01')
...
WHERE  tsrange(foo_date, foo_date, '[]') <@ '[2018-01-01,2018-02-01)'
... WHERE foo_date BETWEEN '2018-01-01' AND '2018-01-31 23:59.999999';

Context

StackExchange Database Administrators Q#236970, answer score: 7

Revisions (0)

No revisions yet.