HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

SELECT LIMIT 1 per column value?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
percolumnlimitvalueselect

Problem

Lets say I have the following table

user_id
comment

2
thats cool

2
awesome

3
i hate this

3
okay

6
this is weird

6
hello?

6
what is it

9
how are you

16
too slow

16
yes

17
alrighty

How can you select one row per user_id? So my results would be:

user_id
comment

2
thats cool

3
i hate this

6
this is weird

9
how are you

16
too slow

17
alrighty

Is this possible with a single efficient query? Or are sub-selected necessary? Is it possible to somehow use DISTINCT on a single column?

Solution

That's what GROUP BY is used for. Get one row (per group). In this case, it will show all distinct user_id values and for the rest of the columns, you can (have to) use aggregate functions like MIN(), MAX(), AVG(), SUM() as you will have more than one values per group and only one can be shown.

SELECT
    user_id
  , MIN(comment) AS comment  -- it will show the first in alphabetical order  
                             -- you could also use MAX()
FROM
    tableX
GROUP BY
    user_id ;


MySQL allows also the following unorthodox solution, that will return one (more or less random) comment per user:

SELECT
    user_id
  , comment
FROM
    tableX
GROUP BY
    user_id ;


This last query will not work but raise an error if the (stricter) ONLY_FULL_GROUP_BY mode is enabled. In the recently released 5.7 version, this mode is the default and a new function, ANY_VALUE(), is provided. For more details, see the MySQL Handling of GROUP BY page. The query can be written now:

SELECT
    user_id
  , ANY_VALUE(comment) AS comment
FROM
    tableX
GROUP BY
    user_id ;


Note that with either the "unorthodox" version or using the recent ANY_VALUE() function, if we add more columns in the SELECT list, their values is not guaranteed to be from the same row, just from a row in the same group. The way they are selected is not exactly random, depends on the execution plan and the indexes used.

Code Snippets

SELECT
    user_id
  , MIN(comment) AS comment  -- it will show the first in alphabetical order  
                             -- you could also use MAX()
FROM
    tableX
GROUP BY
    user_id ;
SELECT
    user_id
  , comment
FROM
    tableX
GROUP BY
    user_id ;
SELECT
    user_id
  , ANY_VALUE(comment) AS comment
FROM
    tableX
GROUP BY
    user_id ;

Context

StackExchange Database Administrators Q#24908, answer score: 11

Revisions (0)

No revisions yet.