HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to use ORDER BY and LIMIT inside JSON_AGG

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
json_aggorderlimithowanduseinside

Problem

I have a query which returns a desired output.

SELECT 
    shop,
    JSON_AGG(item_history.* ORDER BY created_date DESC) as data
FROM item_history
GROUP BY
    shop;


Results:

[
  {
    "shop": "shop1",
    "data": [
      {
        "id": 226,
        "price": "0",
        "shop": "shop1.com",
        "country": "UK",
        "item": "item1",
        "created_date": "2021-06-07T08:48:42.338201",
      },
      {
        "id": 224,
        "price": "0",
        "shop": "shop1.com",
        "country": "UK",
        "item": "item 1",
        "created_date": "2021-06-07T07:53:25.030621",
      },
      ...
  },
  {
    "shop": "shop2",
    "data": [
      {
        "id": 225,
        "price": "0",
        "shop": "shop2.com",
        "country": "DE",
        "item": "Item 2",
        "created_date": "2021-06-07T08:48:36.443849",
      },
      ...
]


Which is exactly the output I want but the issue is that it fetches all items under data array and it would be nice to limit that array. I tried adding LIMIT

SELECT 
    shop,
    JSON_AGG(item_history.* ORDER BY created_date DESC LIMIT 5) as data
FROM item_history
GROUP BY
    shop;


But I get syntax error at or near "limit". Is it possible to limit the output from JSON_AGG?

Solution

The syntax of the ORDER BY inside JSON_AGG does not accept a LIMIT option. I am not aware of a way to limit the scope of any aggregate function in this way. As an alternative, you can try the following ROW_NUMBER + FILTER approach:

  • assign row numbers within shop partitions



  • use the row numbers in conditional aggregation.



In SQL it would look like this:

SELECT 
    shop,
    JSON_AGG(derived.* ORDER BY created_date DESC) FILTER (WHERE rn <= 5) AS data
FROM
    (
        SELECT
            *,
            ROW_NUMBER() OVER (PARTITION BY shop ORDER BY created_date DESC) AS rn
        FROM
            item_history
    ) AS derived
GROUP BY
    shop;


Or you could use a CTE instead of a derived table:

WITH
    cte AS
    (
        SELECT
            *,
            ROW_NUMBER() OVER (PARTITION BY shop ORDER BY created_date DESC) AS rn
        FROM
            item_history
    )
SELECT 
    shop,
    JSON_AGG(cte.* ORDER BY created_date DESC) FILTER (WHERE rn <= 5) AS data
FROM
    cte
GROUP BY
    shop;


The results would be identical. The difference, if any, would only be in efficiency, because CTEs in PostgreSQL may be materialised while derived tables are typically not. You should test for yourself to see which option works better for you.

Note: In both cases the aggregate function still has the same ORDER BY, but you can now change it however you like, or remove it altogether: the FILTER clause makes sure you only have the last five entries per shop anyway.

Code Snippets

SELECT 
    shop,
    JSON_AGG(derived.* ORDER BY created_date DESC) FILTER (WHERE rn <= 5) AS data
FROM
    (
        SELECT
            *,
            ROW_NUMBER() OVER (PARTITION BY shop ORDER BY created_date DESC) AS rn
        FROM
            item_history
    ) AS derived
GROUP BY
    shop;
WITH
    cte AS
    (
        SELECT
            *,
            ROW_NUMBER() OVER (PARTITION BY shop ORDER BY created_date DESC) AS rn
        FROM
            item_history
    )
SELECT 
    shop,
    JSON_AGG(cte.* ORDER BY created_date DESC) FILTER (WHERE rn <= 5) AS data
FROM
    cte
GROUP BY
    shop;

Context

StackExchange Database Administrators Q#293868, answer score: 6

Revisions (0)

No revisions yet.