patternsqlMinor
Select N rows before/after specified row
Viewed 0 times
afterrowsbeforeselectrowspecified
Problem
Let's say I have a simple table in Postgres (11.3):
If a user requests id=5869, I need to be able to return the N rows before and the N rows after that row in a query that is ordered by the
However, I am unable to assume that the higher id is most recently created and I'm wondering what the best way to retrieve that data is in that case. This method works, but is remarkably slow on a 100k row dataset:
Running through that was taking upwards of 800 milliseconds which is far too slow on a dataset so small.
I have also tried a variation of the above using
Is there a better way to do this query? Is there perhaps a window function I am missing in Postgres that would handle it?
create table posts
(
id serial not null,
created_at timestamp(0)
constraint posts_pkey
primary key (id)
);If a user requests id=5869, I need to be able to return the N rows before and the N rows after that row in a query that is ordered by the
created_at column. If we're able to assume that the larger the id, the larger the created_at, we can do something relatively simple like this:(select * from posts where id = 5869 order by id limit 11);However, I am unable to assume that the higher id is most recently created and I'm wondering what the best way to retrieve that data is in that case. This method works, but is remarkably slow on a 100k row dataset:
WITH
boundaries AS (
SELECT *,
row_number() OVER (ORDER BY created_at DESC) AS rownum
FROM posts
),
target_boundary AS (
SELECT *
FROM boundaries
WHERE boundaries.id = 5869
)
SELECT posts.*, boundaries.rownum
FROM posts
LEFT JOIN boundaries ON posts.id = boundaries.id
JOIN target_boundary ON boundaries.rownum BETWEEN target_boundary.rownum - 10 AND target_boundary.rownum + 10Running through that was taking upwards of 800 milliseconds which is far too slow on a dataset so small.
I have also tried a variation of the above using
lead() and lag(), but that was even less efficient.Is there a better way to do this query? Is there perhaps a window function I am missing in Postgres that would handle it?
Solution
Use the power of
This query assumes that there are no duplicates in
For good performance, you need indexes on
If you need the result sorted, use my query as a subselect and add an
UNION ALL:WITH init AS (
SELECT created_at
FROM posts
WHERE id = 5869
)
(
(SELECT posts.*
FROM posts
CROSS JOIN init
WHERE posts.created_at init.created_at
ORDER BY posts.created_at
LIMIT 10)
);This query assumes that there are no duplicates in
created_at.For good performance, you need indexes on
id (you have that with the primary key) and created_at.If you need the result sorted, use my query as a subselect and add an
ORDER BY.Code Snippets
WITH init AS (
SELECT created_at
FROM posts
WHERE id = 5869
)
(
(SELECT posts.*
FROM posts
CROSS JOIN init
WHERE posts.created_at <= init.created_at
ORDER BY posts.created_at DESC
LIMIT 11)
UNION ALL
(SELECT posts.*
FROM posts
CROSS JOIN init
WHERE posts.created_at > init.created_at
ORDER BY posts.created_at
LIMIT 10)
);Context
StackExchange Database Administrators Q#252256, answer score: 3
Revisions (0)
No revisions yet.