patternsqlMinor
Managing the order of aggregation in Postgres
Viewed 0 times
themanagingorderpostgresaggregation
Problem
Suppose I have the following merging function that merges two jsonb values overwriting duplicate keys with the values from the second
Since I want to use that function in aggregates I need to define an aggregate function like so
And suppose I have the following table
id (bigint)
username (text)
event (jsonb)
1
foo
{ "it": 1, "key": "bla" }
2
foo
{ "it": 2, "key" : "dah" }
3
bar
{}
4
zar
{}
When I want to aggregate on username column and merge the event column I would use the following query
And I expect to get the following results, where records with greater id overwrite keys value pairs of earlier records in the aggregate
username
event
foo
{ "it": 2, "key": "dah"}
bar
{}
zar
{}
The problem is that I am sometimes seeing that the foo aggregate contains combination
create or replace function jsonb_concat(a jsonb, b jsonb) returns jsonb
as
'select $1 || $2'
language sql
immutable
parallel safe
;
Since I want to use that function in aggregates I need to define an aggregate function like so
create or replace aggregate jsonb_merge_agg(jsonb)
(
sfunc = jsonb_concat,
stype = jsonb,
initcond = '{}'
);
And suppose I have the following table
id (bigint)
username (text)
event (jsonb)
1
foo
{ "it": 1, "key": "bla" }
2
foo
{ "it": 2, "key" : "dah" }
3
bar
{}
4
zar
{}
When I want to aggregate on username column and merge the event column I would use the following query
select username, jsonb_merge_agg(event) from table group by username
And I expect to get the following results, where records with greater id overwrite keys value pairs of earlier records in the aggregate
username
event
foo
{ "it": 2, "key": "dah"}
bar
{}
zar
{}
The problem is that I am sometimes seeing that the foo aggregate contains combination
{ "it": 1, "key": "bla" } instead. I am aware that Postgres does not have natural row order in the table, so it might be different between transactions. How do I manage the order of the aggregation merge?Solution
In the Aggregate functions page (chapter 9, section 20 in postgresql 12 documentation "Aggregate Functions" (1)) there's a paragraph specifically addressing this
The aggregate functions array_agg, json_agg, jsonb_agg, json_object_agg, jsonb_object_agg, string_agg, and xmlagg, as well as similar user-defined aggregate functions, produce meaningfully different result values depending on the order of the input values. This ordering is unspecified by default, but can be controlled by writing an ORDER BY clause within the aggregate call, as shown in Section 4.2.7.
And in the section 4.2.7 (2) Value expressions/Aggregate expressions the syntax is explained as following
as a result, to get consistent merge order the query must be modified to the following
1: https://www.postgresql.org/docs/12/functions-aggregate.html
2: https://www.postgresql.org/docs/12/sql-expressions.html#SYNTAX-AGGREGATES
The aggregate functions array_agg, json_agg, jsonb_agg, json_object_agg, jsonb_object_agg, string_agg, and xmlagg, as well as similar user-defined aggregate functions, produce meaningfully different result values depending on the order of the input values. This ordering is unspecified by default, but can be controlled by writing an ORDER BY clause within the aggregate call, as shown in Section 4.2.7.
And in the section 4.2.7 (2) Value expressions/Aggregate expressions the syntax is explained as following
aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]as a result, to get consistent merge order the query must be modified to the following
select username, jsonb_merge_agg(event order by id)1: https://www.postgresql.org/docs/12/functions-aggregate.html
2: https://www.postgresql.org/docs/12/sql-expressions.html#SYNTAX-AGGREGATES
Code Snippets
aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]select username, jsonb_merge_agg(event order by id)Context
StackExchange Database Administrators Q#334945, answer score: 2
Revisions (0)
No revisions yet.