patternMinor
Segmenting data by percentage of a column total
Viewed 0 times
totalcolumnsegmentingdatapercentage
Problem
I need to separate a list of records by the percentage of a column when they are summed to all records larger than the current record until a
Records are sorted
The top 60% records need to be marked as A, middle 30% records marked B and last ~10% marked C.
I have tried
`╔══════════════════════╦══════╦═══╦══════╦═════╗
║ A ║ B ║ C ║ ║ ║
╠══════════════════════╬══════╬═══╬══════╬═════╣
║ 197 ║ 197 ║ A ║ ║ ║
║ 192 ║ 389 ║ A ║ ║ ║
║ 190 ║ 579 ║ A ║ ║ ║
║ 185 ║ 764 ║ A ║ ║ ║
║ 184 ║ 948 ║ A ║ ║ ║
║ 184 ║ 1132 ║ A ║ ║ ║
║ 181 ║ 1313 ║ A ║ ║ ║
║ 180 ║ 1493 ║ A ║ ║ ║
║ 175 ║ 1668 ║ A ║ ║ ║
║ 172 ║ 1840 ║ A ║ ║ ║
║ 171 ║ 2011 ║ A ║ ║ ║
║ 170 ║ 2181 ║ A ║ ║ ║
║ 169 ║ 2350 ║ A ║ ║ ║
║ 169 ║ 2519 ║ A ║ 2602 ║ 60% ║
║ ║ ║ ║ ║ ║
║ 167 ║ 167 ║ B ║ ║ ║
║ 167 ║ 334 ║ B ║ ║ ║
║ 167 ║ 501 ║ B ║ ║ ║
║ 166 ║ 667 ║ B ║ ║ ║
║ 166 ║ 833 ║ B ║ ║ ║
║ 166 ║ 999 ║ B ║ ║ ║
║ 165 ║ 1164 ║ B ║ 1301 ║ 30% ║
║ ║ ║ ║ ║ ║
║ 164 ║ 164 ║ C ║ ║ ║
║ 164 ║ 328 ║ C ║ ║ ║
║ 163 ║ 49
MAX percentage.Records are sorted
DESC in column A, and I need all rows in A summed up until (from large to small) they equal 60% of the sum(A) (col B). Then, I need to perform the same for the rest of the row to 30% of the sum(A), and then the last 10% or the rest of the records.The top 60% records need to be marked as A, middle 30% records marked B and last ~10% marked C.
I have tried
NTILE but can't get the segments to match up correctly.`╔══════════════════════╦══════╦═══╦══════╦═════╗
║ A ║ B ║ C ║ ║ ║
╠══════════════════════╬══════╬═══╬══════╬═════╣
║ 197 ║ 197 ║ A ║ ║ ║
║ 192 ║ 389 ║ A ║ ║ ║
║ 190 ║ 579 ║ A ║ ║ ║
║ 185 ║ 764 ║ A ║ ║ ║
║ 184 ║ 948 ║ A ║ ║ ║
║ 184 ║ 1132 ║ A ║ ║ ║
║ 181 ║ 1313 ║ A ║ ║ ║
║ 180 ║ 1493 ║ A ║ ║ ║
║ 175 ║ 1668 ║ A ║ ║ ║
║ 172 ║ 1840 ║ A ║ ║ ║
║ 171 ║ 2011 ║ A ║ ║ ║
║ 170 ║ 2181 ║ A ║ ║ ║
║ 169 ║ 2350 ║ A ║ ║ ║
║ 169 ║ 2519 ║ A ║ 2602 ║ 60% ║
║ ║ ║ ║ ║ ║
║ 167 ║ 167 ║ B ║ ║ ║
║ 167 ║ 334 ║ B ║ ║ ║
║ 167 ║ 501 ║ B ║ ║ ║
║ 166 ║ 667 ║ B ║ ║ ║
║ 166 ║ 833 ║ B ║ ║ ║
║ 166 ║ 999 ║ B ║ ║ ║
║ 165 ║ 1164 ║ B ║ 1301 ║ 30% ║
║ ║ ║ ║ ║ ║
║ 164 ║ 164 ║ C ║ ║ ║
║ 164 ║ 328 ║ C ║ ║ ║
║ 163 ║ 49
Solution
You could calculate the percentage of the sum of
Expanded version that actually shows the category (second subquery isn't necessary, just avoids repeating the partial sum expression):
a you've reached up till the current row, then segment that based on your thresholds.with pct as (
select a
, (sum(a) over (order by a desc))/sum(a) over() pct
from foo
)
select a
, pct
, sum(a) over (partition by
case
when pct <= .6 then 'A'
when pct <= .9 then 'B'
else 'C'
end
order by a desc) partial_sum
from pct
order by a desc;Expanded version that actually shows the category (second subquery isn't necessary, just avoids repeating the partial sum expression):
with pct as (
select a
, (sum(a) over (order by a desc))/sum(a) over () pct
from foo
), category as (
select a
, pct
, case when pct <= .6 then 'A'
when pct <= .9 then 'B'
else 'C'
end cat
from pct
)
select a
, cat
, to_char(pct*100, '999.99')||'%' pct_of_total_a
, sum(a) over (partition by cat order by a desc) sum_by_category
from category
order by a desc;Code Snippets
with pct as (
select a
, (sum(a) over (order by a desc))/sum(a) over() pct
from foo
)
select a
, pct
, sum(a) over (partition by
case
when pct <= .6 then 'A'
when pct <= .9 then 'B'
else 'C'
end
order by a desc) partial_sum
from pct
order by a desc;with pct as (
select a
, (sum(a) over (order by a desc))/sum(a) over () pct
from foo
), category as (
select a
, pct
, case when pct <= .6 then 'A'
when pct <= .9 then 'B'
else 'C'
end cat
from pct
)
select a
, cat
, to_char(pct*100, '999.99')||'%' pct_of_total_a
, sum(a) over (partition by cat order by a desc) sum_by_category
from category
order by a desc;Context
StackExchange Database Administrators Q#110726, answer score: 2
Revisions (0)
No revisions yet.