patternsqlMinor
Grouping values into buckets
Viewed 0 times
valuesintobucketsgrouping
Problem
I'm working a query and I wasn't sure if my current approach is the "best" approach.
I got a table filled with values and a table with buckets that I use to group those values.
There are 2 types of buckets:
I set up a simplified version that can be used to test.
the
The table definitions
Some test data
Right now I have this query which works and gives the results I want but I feel this could be done better and more efficiently.
dbfiddle.uk
I got a table filled with values and a table with buckets that I use to group those values.
There are 2 types of buckets:
- A priority bucket, if items apply to the conditions they should fall in this bucket and be excluded for any following bucket.
- A simple range bucket, the values should fall within a range.
I set up a simplified version that can be used to test.
the
[Filter] column in #tempBUcket here is used for the priority bucketsThe table definitions
create table #TempData([Id] int, [value] int);
create table #TempBucket([Id] int, [FromVal] int,[ToVal] int,[Filter] int);Some test data
INSERT INTO #TempData (Id,value)
VALUES (1,1),(2,2),(3,3),(4,4),(5,5),
(6,6),(7,7),(8,8),(9,9),(10,1),
(11,2),(12,3),(13,4),(14,5),(15,6),
(16,7),(17,8),(18,9),(19,10),(20,11);
INSERT INTO #TempBucket(Id,FromVal,ToVal,Filter)
VALUES (1,0,3,0),(2,3,6,0),
(3,6,9,0),(4,9,12,0),
(5,NULL,NULL,1),(6,NULL,NULL,2);Right now I have this query which works and gives the results I want but I feel this could be done better and more efficiently.
WITH filteredData as
(SELECT tv.*,COALESCE(tb1.Id,tb2.Id) as BucketId FROM #TempData tv
LEFT JOIN #TempBucket tb1 --prio bucket join
ON tb1.Filter IS NOT NULL AND tb1.Filter = tv.Value
LEFT JOIN #TempBucket tb2 --range bucket join
ON tb2.Fromval < tv.value AND tv.value <= tb2.ToVal
)
SELECT Count(fd.value) as count ,BucketId FROM filteredData fd GROUP BY BucketId;dbfiddle.uk
Solution
My solution does only 1
To give a priority to the row in
scan of #TempBucket but it adds an additional sort, I'm not sure it's more efficient, you should test it on volumes of data:;with cte as
(
select t.*, tb.id as BucketId,
row_number() over(partition by t.id order by tb.FromVal) as rn
from #TempData t
join #TempBucket tb
on (t.value = tb.filter)
or (tb.Fromval < t.value AND t.value <= tb.ToVal)
)
select count(id) as cnt, BucketId
from cte
where rn = 1
group by BucketId;To give a priority to the row in
priority bucket it sorts by tb.FromVal which is null in case of priority bucket, alternatively you can sort on tb.Filter desc, I just don't know the exact criteria of priority bucket whether the FromVal,ToVal are null or Filter = 0Code Snippets
;with cte as
(
select t.*, tb.id as BucketId,
row_number() over(partition by t.id order by tb.FromVal) as rn
from #TempData t
join #TempBucket tb
on (t.value = tb.filter)
or (tb.Fromval < t.value AND t.value <= tb.ToVal)
)
select count(id) as cnt, BucketId
from cte
where rn = 1
group by BucketId;Context
StackExchange Database Administrators Q#253392, answer score: 2
Revisions (0)
No revisions yet.