patternsqlMinor
Creating index on TIMESTAMP column to use with range operators
Viewed 0 times
columncreatingwithrangetimestampindexuseoperators
Problem
TLDR: Can I create an index that's used by the following
Let's say I have a table like this:
CREATE TABLE foo
(
foo_id INTEGER GENERATED BY DEFAULT AS IDENTITY,
foo_date timestamp without time zone NOT NULL,
CONSTRAINT foo_pkey PRIMARY KEY (foo_id)
);
This table contains 100,000 records with dates from
Option 1
One approach is to use the
SELECT * FROM foo WHERE foo_date BETWEEN '2018-01-01' AND '2018-01-31';
The problem of this approach is that if
Option 2
Another option, put forth by Aaron Bertrand, is to use this construct:
(Yes, this blog is for SQL Server, but seems to be applicable here).
While this form unequivocally gives me the results I want, it's cumbersome: I have to repeat the column name twice.
Option 3
Since Postgres gives us the range data type, I thought a clearer form might be:
So my next question is, if I use this form, can I use an index to speed up operation?
With Options 1 and 2 above, a normal b-tree index can be used:
A query using Options 1 or 2 will use the index:
gives me this query plan:
```
Bitmap Heap Scan on foo (cost=21.95..592.70 rows=942 width=12)
Recheck Cond: ((foo_date >= '2018-01-01 00:00:00'::timestamp without time zone) AND (f
WHERE clause: WHERE foo_date <@ tsrange('2018-01-01', '2018-02-01')Let's say I have a table like this:
CREATE TABLE foo
(
foo_id INTEGER GENERATED BY DEFAULT AS IDENTITY,
foo_date timestamp without time zone NOT NULL,
CONSTRAINT foo_pkey PRIMARY KEY (foo_id)
);
This table contains 100,000 records with dates from
2009-01-01 to 2018-12-29. I'd like to be able to query for rows in a given date range (e.g. for rows in January 2018).Option 1
One approach is to use the
BETWEEN operator: SELECT * FROM foo WHERE foo_date BETWEEN '2018-01-01' AND '2018-01-31';
The problem of this approach is that if
foo_date occurred on 2018-01-31 after midnight, they wouldn't be included in this query. So I could change the query to BETWEEN '2018-01-01' AND '2018-02-01'. The problem then, however, is records that occur on 2018-02-01 00:00:00. These would be included, which I don't want.Option 2
Another option, put forth by Aaron Bertrand, is to use this construct:
foo_date >= '2018-01-01' AND foo_date < '2018-02-01'(Yes, this blog is for SQL Server, but seems to be applicable here).
While this form unequivocally gives me the results I want, it's cumbersome: I have to repeat the column name twice.
Option 3
Since Postgres gives us the range data type, I thought a clearer form might be:
foo_date <@ tsrange('2018-01-01', '2018-02-01')So my next question is, if I use this form, can I use an index to speed up operation?
With Options 1 and 2 above, a normal b-tree index can be used:
CREATE INDEX idx_foo ON foo(foo_date);A query using Options 1 or 2 will use the index:
EXPLAIN SELECT * FROM foo
WHERE
foo_date >= '2018-01-01'
AND foo_date < '2018-02-01';gives me this query plan:
```
Bitmap Heap Scan on foo (cost=21.95..592.70 rows=942 width=12)
Recheck Cond: ((foo_date >= '2018-01-01 00:00:00'::timestamp without time zone) AND (f
Solution
The name "foo_date" indicates a
A plain btree index in combination with Option 2 is the unequivocally best solution. Look no further. Except maybe for the special case of a BRIN index for large tables with physically sorted data. See:
Proof of concept
That said, to make a GiST or SP-GiST index work, you could create an expression index on fake ranges. You do not need the module
Or with range literal:
But: bigger, more expensive to maintain, slower than the btree index. Not even less cumbersome to write. Pointless for your case.
Aside: technically, you could:
The Postgres timestamp type is (currently) implemented with µs resolution, i.e. max. 6 fractional digits. Hence the expression does exactly what you want. But I strongly advice against building on this implementation detail. Option 2 is the way to go. Related:
date and is a bad choice for a timestamp column. Option 1 would work just fine with actual dates.A plain btree index in combination with Option 2 is the unequivocally best solution. Look no further. Except maybe for the special case of a BRIN index for large tables with physically sorted data. See:
- Speed up creation of Postgres partial index
Proof of concept
That said, to make a GiST or SP-GiST index work, you could create an expression index on fake ranges. You do not need the module
btree_gist for this. Demonstrating with SP-GiST since that is typically a bit faster here. See:- Perform this hours of operation query in PostgreSQL
CREATE INDEX foo_date_spgist_idx ON foo USING spgist(tsrange(foo_date, foo_date, '[]'));
SELECT * FROM foo
WHERE tsrange(foo_date, foo_date, '[]') <@ tsrange('2018-01-01', '2018-02-01')Or with range literal:
...
WHERE tsrange(foo_date, foo_date, '[]') <@ '[2018-01-01,2018-02-01)'But: bigger, more expensive to maintain, slower than the btree index. Not even less cumbersome to write. Pointless for your case.
Aside: technically, you could:
... WHERE foo_date BETWEEN '2018-01-01' AND '2018-01-31 23:59.999999';The Postgres timestamp type is (currently) implemented with µs resolution, i.e. max. 6 fractional digits. Hence the expression does exactly what you want. But I strongly advice against building on this implementation detail. Option 2 is the way to go. Related:
- Interval (days) in PostgreSQL with two parameters
- Ignoring time zones altogether in Rails and PostgreSQL
Code Snippets
CREATE INDEX foo_date_spgist_idx ON foo USING spgist(tsrange(foo_date, foo_date, '[]'));
SELECT * FROM foo
WHERE tsrange(foo_date, foo_date, '[]') <@ tsrange('2018-01-01', '2018-02-01')...
WHERE tsrange(foo_date, foo_date, '[]') <@ '[2018-01-01,2018-02-01)'... WHERE foo_date BETWEEN '2018-01-01' AND '2018-01-31 23:59.999999';Context
StackExchange Database Administrators Q#236970, answer score: 7
Revisions (0)
No revisions yet.