HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Segmenting data by percentage of a column total

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
totalcolumnsegmentingdatapercentage

Problem

I need to separate a list of records by the percentage of a column when they are summed to all records larger than the current record until a MAX percentage.

Records are sorted DESC in column A, and I need all rows in A summed up until (from large to small) they equal 60% of the sum(A) (col B). Then, I need to perform the same for the rest of the row to 30% of the sum(A), and then the last 10% or the rest of the records.

The top 60% records need to be marked as A, middle 30% records marked B and last ~10% marked C.
I have tried NTILE but can't get the segments to match up correctly.

`╔══════════════════════╦══════╦═══╦══════╦═════╗
║ A ║ B ║ C ║ ║ ║
╠══════════════════════╬══════╬═══╬══════╬═════╣
║ 197 ║ 197 ║ A ║ ║ ║
║ 192 ║ 389 ║ A ║ ║ ║
║ 190 ║ 579 ║ A ║ ║ ║
║ 185 ║ 764 ║ A ║ ║ ║
║ 184 ║ 948 ║ A ║ ║ ║
║ 184 ║ 1132 ║ A ║ ║ ║
║ 181 ║ 1313 ║ A ║ ║ ║
║ 180 ║ 1493 ║ A ║ ║ ║
║ 175 ║ 1668 ║ A ║ ║ ║
║ 172 ║ 1840 ║ A ║ ║ ║
║ 171 ║ 2011 ║ A ║ ║ ║
║ 170 ║ 2181 ║ A ║ ║ ║
║ 169 ║ 2350 ║ A ║ ║ ║
║ 169 ║ 2519 ║ A ║ 2602 ║ 60% ║
║ ║ ║ ║ ║ ║
║ 167 ║ 167 ║ B ║ ║ ║
║ 167 ║ 334 ║ B ║ ║ ║
║ 167 ║ 501 ║ B ║ ║ ║
║ 166 ║ 667 ║ B ║ ║ ║
║ 166 ║ 833 ║ B ║ ║ ║
║ 166 ║ 999 ║ B ║ ║ ║
║ 165 ║ 1164 ║ B ║ 1301 ║ 30% ║
║ ║ ║ ║ ║ ║
║ 164 ║ 164 ║ C ║ ║ ║
║ 164 ║ 328 ║ C ║ ║ ║
║ 163 ║ 49

Solution

You could calculate the percentage of the sum of a you've reached up till the current row, then segment that based on your thresholds.

with pct as (
  select a
  , (sum(a) over (order by a desc))/sum(a) over() pct
  from foo
)
select a
, pct
, sum(a) over (partition by
    case
      when pct <= .6 then 'A'
      when pct <= .9 then 'B'
      else                'C'
    end
    order by a desc) partial_sum
from pct
order by a desc;


Expanded version that actually shows the category (second subquery isn't necessary, just avoids repeating the partial sum expression):

with pct as (
  select a
  , (sum(a) over (order by a desc))/sum(a) over () pct
  from foo
), category as (
  select a
  , pct
  , case when pct <= .6 then 'A'
         when pct <= .9 then 'B'
         else                'C'
    end cat
  from pct
)
select a
, cat
, to_char(pct*100, '999.99')||'%' pct_of_total_a
, sum(a) over (partition by cat order by a desc) sum_by_category
from category
order by a desc;

Code Snippets

with pct as (
  select a
  , (sum(a) over (order by a desc))/sum(a) over() pct
  from foo
)
select a
, pct
, sum(a) over (partition by
    case
      when pct <= .6 then 'A'
      when pct <= .9 then 'B'
      else                'C'
    end
    order by a desc) partial_sum
from pct
order by a desc;
with pct as (
  select a
  , (sum(a) over (order by a desc))/sum(a) over () pct
  from foo
), category as (
  select a
  , pct
  , case when pct <= .6 then 'A'
         when pct <= .9 then 'B'
         else                'C'
    end cat
  from pct
)
select a
, cat
, to_char(pct*100, '999.99')||'%' pct_of_total_a
, sum(a) over (partition by cat order by a desc) sum_by_category
from category
order by a desc;

Context

StackExchange Database Administrators Q#110726, answer score: 2

Revisions (0)

No revisions yet.