snippetsqlModerate
How to remove duplicates from any array and preserve ordering in PostgreSQL?
Viewed 0 times
postgresqlarraypreserveanyorderingremovehowandfromduplicates
Problem
I'm looking for a way to eliminate duplicates from a PostgreSQL array
while preserving the ordering of its elements. What I currently
have are the following functions:
In the above,
with all duplicates removed; their relative ordering is arbitrary.
relative to each other; this is sometimes useful as all arrays with the
same set of of distinct elements will compare equal after being
normalized by this function.
(which in this exampe must be a
left to right; whenever it encounters a previously unseen element, it adds
that one to the result. Thus, only the first occurrence of each value
is kept.
However, the implementation has some drawbacks: First, there seems to be no
way to write it so it accepts the pseudo-type
Second, while
the first two functions are written in SQL, they may presumably be inlined,
Third, it bugs me that I couldn't come up with a solution in pure SQL...
while preserving the ordering of its elements. What I currently
have are the following functions:
create function array_unique( anyarray )
returns anyarray immutable strict language sql as $
select array( select distinct unnest( $1 ) ); $;
create function array_unique_sorted( anyarray )
returns anyarray immutable strict language sql as $
select array( select distinct unnest( $1 ) order by 1 ); $;
/* ### TAINT there ought to be a simpler, declarative solution */
create function array_unique_stable( text[] )
returns text[] immutable strict parallel safe language plpgsql as $
declare
R text[] = '{}';
¶element text;
begin
foreach ¶element in array $1 loop
if not array[ ¶element ] && R then
R := R || array[ ¶element ];
end if;
end loop;
return R; end; $;In the above,
array_unique takes an array of any type and returns a copywith all duplicates removed; their relative ordering is arbitrary.
array_unique_sorted is like array_unique, but the elements are sortedrelative to each other; this is sometimes useful as all arrays with the
same set of of distinct elements will compare equal after being
normalized by this function.
array_unique_stable already does what I'm looking for: given an array (which in this exampe must be a
text[] array), it scans elements fromleft to right; whenever it encounters a previously unseen element, it adds
that one to the result. Thus, only the first occurrence of each value
is kept.
However, the implementation has some drawbacks: First, there seems to be no
way to write it so it accepts the pseudo-type
anyarray. Second, while
the first two functions are written in SQL, they may presumably be inlined,
array_unique_stable is written in PL/pgSQL, and so it cannot be inlined.Third, it bugs me that I couldn't come up with a solution in pure SQL...
Solution
This can indeed be done using pure SQL:
The
returns
and
create function array_unique_stable(p_input anyarray)
returns anyarray immutable strict parallel safe
language sql
as
$
select array_agg(t order by x)
from (
select distinct on (t) t,x
from unnest(p_input) with ordinality as p(t,x)
order by t,x
) t2;
$The
unnest(p_input) with ordinality will return the original index of the element in the array which is then used to aggregate them back in the outer query.select array_unique_stable(array['a','x','x','b']) as text_array,
array_unique_stable(array[10,1,1,5,8,8]) as int_arrayreturns
text_array | int_array
-----------+-----------
{a,x,b} | {10,1,5,8}and
Code Snippets
create function array_unique_stable(p_input anyarray)
returns anyarray immutable strict parallel safe
language sql
as
$$
select array_agg(t order by x)
from (
select distinct on (t) t,x
from unnest(p_input) with ordinality as p(t,x)
order by t,x
) t2;
$$select array_unique_stable(array['a','x','x','b']) as text_array,
array_unique_stable(array[10,1,1,5,8,8]) as int_arraytext_array | int_array
-----------+-----------
{a,x,b} | {10,1,5,8}Context
StackExchange Database Administrators Q#211501, answer score: 12
Revisions (0)
No revisions yet.