How to select distinct for one column and any in another column?
Problem
I need to query an SQL database to find all distinct values of one column and I need an arbitrary value from another column. For example, consider the following table with two columns, key and value:

key    value
===    =====
one    test
one    another
one    value
two    goes
two    here
two    also
three  example

I wish to get back one sample row, chosen arbitrarily, from each distinct key, perhaps getting these three rows:

key    value
===    =====
one    test
two    goes
three  example

How can I formulate such a query in SQL?
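To experiment with this example, the sample table could be created along the following lines. This is only a sketch: the table name tableX is borrowed from the queries in the Solution, the VARCHAR lengths are assumptions, and it assumes a DBMS such as PostgreSQL or SQLite where key and value can be used as unquoted column names (MySQL and SQL Server treat KEY as a reserved word, so the column name would need quoting there).

```sql
-- Sample data from the question; tableX is the table name used in the Solution.
CREATE TABLE tableX (
    key   VARCHAR(10),   -- column lengths are assumptions, not part of the question
    value VARCHAR(20)
);

-- Multi-row VALUES lists are assumed to be supported; otherwise use one INSERT per row.
INSERT INTO tableX (key, value) VALUES
    ('one',   'test'),
    ('one',   'another'),
    ('one',   'value'),
    ('two',   'goes'),
    ('two',   'here'),
    ('two',   'also'),
    ('three', 'example');
```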
Solution
The easiest query to write is for MySQL (with non-strict ANSI settings). It uses a non-standard construction:
SELECT key, value
FROM tableX
GROUP BY key ;

In recent versions (5.7 and 8.0+), where the strict settings and ONLY_FULL_GROUP_BY are the default, you can use the ANY_VALUE() function, added in 5.7:

SELECT key, ANY_VALUE(value) AS value
FROM tableX
GROUP BY key ;

For other DBMSs that have window functions (like Postgres, SQL Server, Oracle, DB2), you can use them like this. The advantage is that you can select other columns in the result as well (besides key and value):

SELECT key, value
FROM
  ( SELECT key, value,
           ROW_NUMBER() OVER (PARTITION BY key
                              ORDER BY whatever)     -- ORDER BY NULL
             AS rn                                   -- for example
    FROM tableX
  ) tmp
WHERE rn = 1 ;

For older versions of the above and for any other DBMS, there is a general way that works almost everywhere. One disadvantage is that you cannot select other columns with this approach. Another is that aggregate functions like MIN() and MAX() do not work with some datatypes in some DBMSs (like bit, text, blobs):

SELECT key, MIN(value) AS value
FROM tableX
GROUP BY key ;

PostgreSQL has a special non-standard DISTINCT ON operator that can also be used. The optional ORDER BY determines which row from each group is selected:

SELECT DISTINCT ON (key) key, value
FROM tableX
-- ORDER BY key, <some_other_expressions> ;

Code Snippets
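As a concrete instance of the window-function query from the Solution, the sketch below replaces the "whatever" placeholder with value, so that the row kept for each key is the one whose value sorts first. That ordering is an assumption made purely for illustration; any other column or expression could be used instead.

```sql
-- Keep one row per key: the row whose value sorts first within that key.
SELECT key, value
FROM (
    SELECT key, value,
           ROW_NUMBER() OVER (PARTITION BY key
                              ORDER BY value) AS rn   -- ORDER BY value is an illustrative choice
    FROM tableX
) tmp
WHERE rn = 1;
```

With the sample data from the Problem section, this should return (one, another), (two, also) and (three, example); the order of the result rows is not guaranteed unless an outer ORDER BY is added.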
SELECT key, value
FROM tableX
GROUP BY key ;

SELECT key, ANY_VALUE(value) AS value
FROM tableX
GROUP BY key ;

SELECT key, value
FROM
  ( SELECT key, value,
           ROW_NUMBER() OVER (PARTITION BY key
                              ORDER BY whatever)     -- ORDER BY NULL
             AS rn                                   -- for example
    FROM tableX
  ) tmp
WHERE rn = 1 ;

SELECT key, MIN(value) AS value
FROM tableX
GROUP BY key ;

SELECT DISTINCT ON (key) key, value
FROM tableX
-- ORDER BY key, <some_other_expressions> ;

Context
StackExchange Database Administrators Q#24327, answer score: 52