HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to optimise window queries in postgres

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqueriesoptimisehowwindow

Problem

I have the following table with approximately 175k records:

Column     |            Type             |              Modifiers
----------------+-----------------------------+-------------------------------------
 id             | uuid                        | not null default uuid_generate_v4()
 competition_id | uuid                        | not null
 user_id        | uuid                        | not null
 first_name     | character varying(255)      | not null
 last_name      | character varying(255)      | not null
 image          | character varying(255)      |
 country        | character varying(255)      |
 slug           | character varying(255)      | not null
 total_votes    | integer                     | not null default 0
 created_at     | timestamp without time zone |
 updated_at     | timestamp without time zone |
 featured_until | timestamp without time zone |
 image_src      | character varying(255)      |
 hidden         | boolean                     | not null default false
 photos_count   | integer                     | not null default 0
 photo_id       | uuid                        |
Indexes:
    "entries_pkey" PRIMARY KEY, btree (id)
    "index_entries_on_competition_id" btree (competition_id)
    "index_entries_on_featured_until" btree (featured_until)
    "index_entries_on_hidden" btree (hidden)
    "index_entries_on_photo_id" btree (photo_id)
    "index_entries_on_slug" btree (slug)
    "index_entries_on_total_votes" btree (total_votes)
    "index_entries_on_user_id" btree (user_id)


and I'm executing the following query to get the rank of the entry and the slug of the next and previous entry:

```
WITH entry_with_global_rank AS (
SELECT id
, rank() OVER w AS global_rank
, LAG(slug) OVER w AS previous_slug
, LEAD(slug) OVER w AS next_slug
FROM entries
WHERE competition_id = 'bdd94eee-25a4-481f-b7b5-37aaed953c6b'
WINDOW w AS (PARTITION BY competition_id ORDER BY total_votes DESC)
)
SELECT *
FROM entr

Solution

The CTE is not needed here and poses as optimization barrier. A plain subquery generally performs better:

SELECT * 
FROM  (
   SELECT id
         ,rank()     OVER w AS global_rank
         ,lag(slug)  OVER w AS previous_slug
         ,lead(slug) OVER w AS next_slug 
   FROM   entries 
   WHERE  competition_id = 'bdd94eee-25a4-481f-b7b5-37aaed953c6b' 
   WINDOW w AS (ORDER BY total_votes DESC) 
   ) entry_with_global_rank 
WHERE  id = 'f2df68b7-d720-459d-8c4d-d11e28e0f0c0' 
LIMIT  1;


As @Daniel commented, I removed the PARTITION BY clause from the window definition, since you are limiting to a single competition_id anyway.
Table layout

You could optimize your table layout to slightly reduce on-disk storage size, which makes everything a bit faster, yet:

Column     |            Type             |              Modifiers
----------------+-----------------------------+-------------------------------------
 id             | uuid                        | not null default uuid_generate_v4()
 competition_id | uuid                        | not null
 user_id        | uuid                        | not null
 total_votes    | integer                     | not null default 0
 photos_count   | integer                     | not null default 0
 hidden         | boolean                     | not null default false
 slug           | character varying(255)      | not null
 first_name     | character varying(255)      | not null
 last_name      | character varying(255)      | not null
 image          | character varying(255)      |
 country        | character varying(255)      |
 image_src      | character varying(255)      |
 photo_id       | uuid                        |
 created_at     | timestamp without time zone |
 updated_at     | timestamp without time zone |
 featured_until | timestamp without time zone |


More about that:

  • Configuring PostgreSQL for read performance



Also, do you actually need all those uuid columns? int or bigint won't work for you? Would make table and indexes a bit smaller and everything faster.

And I would just use text for the character data, but that is not going to help performance of the query.

Aside: character varying(255) is almost always pointless in Postgres. Some other RDBMS profit from the restriction of the length, for Postgres it's all the same (unless you actually need to enforce the unlikely max. length of 255 characters).
Special index

Finally, you could build a highly specialized index (only if index maintenance is worth the special casing):

CREATE INDEX entries_special_idx ON entries (competition_id, total_votes DESC, id, slug);


Adding (id, slug) to the index only makes sense if you can get index-only scans out of this. (Disabled autovacuum or lots of concurrent writes would negate that effort.) Else remove the last two columns.

While being at it, audit your indexes. Are they all in use? There might be some dead freight here.

Code Snippets

SELECT * 
FROM  (
   SELECT id
         ,rank()     OVER w AS global_rank
         ,lag(slug)  OVER w AS previous_slug
         ,lead(slug) OVER w AS next_slug 
   FROM   entries 
   WHERE  competition_id = 'bdd94eee-25a4-481f-b7b5-37aaed953c6b' 
   WINDOW w AS (ORDER BY total_votes DESC) 
   ) entry_with_global_rank 
WHERE  id = 'f2df68b7-d720-459d-8c4d-d11e28e0f0c0' 
LIMIT  1;
Column     |            Type             |              Modifiers
----------------+-----------------------------+-------------------------------------
 id             | uuid                        | not null default uuid_generate_v4()
 competition_id | uuid                        | not null
 user_id        | uuid                        | not null
 total_votes    | integer                     | not null default 0
 photos_count   | integer                     | not null default 0
 hidden         | boolean                     | not null default false
 slug           | character varying(255)      | not null
 first_name     | character varying(255)      | not null
 last_name      | character varying(255)      | not null
 image          | character varying(255)      |
 country        | character varying(255)      |
 image_src      | character varying(255)      |
 photo_id       | uuid                        |
 created_at     | timestamp without time zone |
 updated_at     | timestamp without time zone |
 featured_until | timestamp without time zone |
CREATE INDEX entries_special_idx ON entries (competition_id, total_votes DESC, id, slug);

Context

StackExchange Database Administrators Q#74530, answer score: 7

Revisions (0)

No revisions yet.