HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

group rows when row count exceeds certain count

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
rowsgroupcertainexceedswhencountrow

Problem

Query:

select city.name as name, sum(count) as views from profpostview
inner join professional as prof on prof.pfid = profpostview.viewerpfid
left join cityinfo as city on city.cityid = prof.cityid
where pfpid = _pfpid group by name order by views desc;


output:

**name , views**

'Ahmedabad', '14'
'Gandhinagar', '7'
'Mumbai', '5'
'Aalborg', '5'
'Rajkot', '3'
'Bhavnagar', '2'
'Baroda', '1'


required:

**name, views**

'Ahmedabad', '14'
'Gandhinagar', '7'
'Aalborg', '5'
'others', '11


explanation: I want to combine rows by adding view count when row count exceeds 3. Profpostview table is related to profpostinfo by pfpid and profpostinfo table is in turn related to cityinfo by cityid.

The final query should never return more than 4 rows. All the (summed) rows after the 3rd should be grouped into a 4th row.

Solution

This would be fairly straight forward if MySQL had implemented window functions or CTEs (they are working on it but it hasn't been announced yet in which version it will appear.) With a CTE:

with grp as
  ( select city.name, sum(count) as views 
    from profpostview as ppview
      inner join professional as prof 
        on prof.pfid = ppview.viewerpfid
      left join cityinfo as city 
        on city.cityid = prof.cityid
    where pfpid = _pfpid 
    group by city.name
  ),
the_first_three as
  ( select name, views
    from grp
    order by views desc
    limit 3
   ),
the_rest as
  ( select name, views
    from grp
    order by views desc
    limit 900000000000000 offset 3
   )
select name, views
from the_first_three

union all 

select 'other', sum(views)
from the_rest 

order by (name = 'other'), views desc ;


With CTEs and window functions:

with grp as
  ( select city.name, sum(count) as views,
           row_number() over (order by sum(count) desc) as rn 
    from profpostview as ppview
      inner join professional as prof 
        on prof.pfid = ppview.viewerpfid
      left join cityinfo as city 
        on city.cityid = prof.cityid
    where pfpid = _pfpid 
    group by city.name
  )
select name, views, rn
from grp
where rn  3 

order by rn ;


And while we are waiting for CTEs and window functions to be implemented, in current MySQL versiosn, we'll have to either duplicate code or use some trickery to avoid duplication. In any case, I don't think we can avoid doing the group by and summation twice. Here is one way to get the wanted output. It's essentially a rewrite of the above code 1, using derived tables instead of CTEs:

( select city.name, sum(count) as views 
    from profpostview as ppview
      inner join professional as prof 
        on prof.pfid = ppview.viewerpfid
      left join cityinfo as city 
        on city.cityid = prof.cityid
    where pfpid = _pfpid 
    group by city.name
    order by views desc
    limit 3
  )
union all   
  ( select 'other', sum(views)
    from 
      ( select city.name, sum(count) as views 
        from profpostview as ppview
          inner join professional as prof 
            on prof.pfid = ppview.viewerpfid
          left join cityinfo as city 
            on city.cityid = prof.cityid
        where pfpid = _pfpid 
        group by city.name
        order by views desc
        limit 900000000000000 offset 3
      ) grp
  )  
order by (name = 'other'), views desc ;


If you want to resolve ties in some way, change the order by accordingly, eg. to order by views desc, name asc

Code Snippets

with grp as
  ( select city.name, sum(count) as views 
    from profpostview as ppview
      inner join professional as prof 
        on prof.pfid = ppview.viewerpfid
      left join cityinfo as city 
        on city.cityid = prof.cityid
    where pfpid = _pfpid 
    group by city.name
  ),
the_first_three as
  ( select name, views
    from grp
    order by views desc
    limit 3
   ),
the_rest as
  ( select name, views
    from grp
    order by views desc
    limit 900000000000000 offset 3
   )
select name, views
from the_first_three

union all 

select 'other', sum(views)
from the_rest 

order by (name = 'other'), views desc ;
with grp as
  ( select city.name, sum(count) as views,
           row_number() over (order by sum(count) desc) as rn 
    from profpostview as ppview
      inner join professional as prof 
        on prof.pfid = ppview.viewerpfid
      left join cityinfo as city 
        on city.cityid = prof.cityid
    where pfpid = _pfpid 
    group by city.name
  )
select name, views, rn
from grp
where rn <= 3

union all 

select 'other', sum(views), 4
from grp
where rn > 3 

order by rn ;
( select city.name, sum(count) as views 
    from profpostview as ppview
      inner join professional as prof 
        on prof.pfid = ppview.viewerpfid
      left join cityinfo as city 
        on city.cityid = prof.cityid
    where pfpid = _pfpid 
    group by city.name
    order by views desc
    limit 3
  )
union all   
  ( select 'other', sum(views)
    from 
      ( select city.name, sum(count) as views 
        from profpostview as ppview
          inner join professional as prof 
            on prof.pfid = ppview.viewerpfid
          left join cityinfo as city 
            on city.cityid = prof.cityid
        where pfpid = _pfpid 
        group by city.name
        order by views desc
        limit 900000000000000 offset 3
      ) grp
  )  
order by (name = 'other'), views desc ;

Context

StackExchange Database Administrators Q#150731, answer score: 5

Revisions (0)

No revisions yet.