HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMajor

2 B-tree indices OR 1 GiST index on tsrange -- which will perform better?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
indicestsrangegistbetterwillperformwhichindextree

Problem

I have a table which is storing reservation data using the columns starts_at & ends_at Whenever I'm querying the table to find overlapping reservations, I have an option of using one of the following queries:

SELECT * FROM reservations
WHERE starts_at ='2014-01-01 00:00:00';


Or

SELECT * FROM reservations
WHERE tsrange(starts_at, ends_at) && ('2014-01-01 00:00:00', '2014-01-03 00:00:00')


I have regular B-Tree indices on the starts_at and ends_at columns, therefore the first query is always using them. However, unless I define a functional GiST index on the tsrange, the second query does a complete scan.

create index tsrange_idx on reservations using gist(tsrange(starts_at, ends_at));


My question is, as the table grows, which index is going to be faster? Probably the answer is obvious looking at the query execution plan, but I'm not well-versed with reading EXPLAIN ANALYZE output.

Solution

Timestamps with B-tree index

I suggest a third option: as long as your table holds two timestamp columns (which seem to be defined NOT NULL) I would use a single multicolumn B-tree index with opposed sort order (if no other considerations apply):

CREATE INDEX reservations_range_idx ON reservations (starts_at, ends_at DESC);


More in these related answer:

  • Optimizing queries on a range of timestamps (two columns)



  • PostgreSQL index not used for query on range



As for the query, have a look at the SQL-standard operator OVERLAPS:

SELECT * FROM reservations
WHERE (starts_at, ends_at) OVERLAPS ('2014-01-01 00:00:00', '2014-01-03 00:00:00');


More in this related question on SO:

  • Find overlapping date ranges in PostgreSQL



Should be faster than two B-tree indices. Less disk space and cheaper to maintain. The burden on write operations is small.
Range type with GiST index

With big tables, a GiST index on a range type is most probably faster, because it scales better. Storage on disk is considerably bigger and index maintenance a bit more expensive, though.

If you go that route, it would be more efficient to store your timestamps as range (tsrange or tstzrange) to begin with. A plain GiST index without the functional aspect is a bit faster.

CREATE TABLE reservation (
  reservation_id serial PRIMARY KEY
, span tsrange
, ...
);
 
CREATE INDEX reservation_span_gist_idx on reservations USING GiST (span);


With the && "overlap" operator you already displayed in the question:

SELECT *
FROM   reservation 
WHERE  span && ('2014-01-01 00:00:00', '2014-01-03 00:00:00');


You might be interested in an exclusion constraint to rule out overlaps by design, which implements a GiST index like the one above automatically. There is a code example in the manual. This related answer on SO has more details:

  • Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL

Code Snippets

CREATE INDEX reservations_range_idx ON reservations (starts_at, ends_at DESC);
SELECT * FROM reservations
WHERE (starts_at, ends_at) OVERLAPS ('2014-01-01 00:00:00', '2014-01-03 00:00:00');
CREATE TABLE reservation (
  reservation_id serial PRIMARY KEY
, span tsrange
, ...
);
 
CREATE INDEX reservation_span_gist_idx on reservations USING GiST (span);
SELECT *
FROM   reservation 
WHERE  span && ('2014-01-01 00:00:00', '2014-01-03 00:00:00');

Context

StackExchange Database Administrators Q#59074, answer score: 20

Revisions (0)

No revisions yet.