patternMinor
Efficiently finding distinct values
Viewed 0 times
valuesdistinctefficientlyfinding
Problem
I have a number of tables with the primary key (month, year, number) and differing cardinalities differ somewhat. For the tuple (month, year) the history doesn't go back very far, this will probably not grow beyond 50 in the very long term. For every (month, year) tuple there are not more than 2 million unique numbers. I want to know which combinations of month and years are available. I do this using this query:
This returns the correct result but does not seem to be very efficient. What is an efficient way to obtain this result (utilizing the unique index)?
The tuning advisor suggests to add an index on month-year for this query but this seems wasteful because a larger index is already available.
select month, year from table group by month, yearThis returns the correct result but does not seem to be very efficient. What is an efficient way to obtain this result (utilizing the unique index)?
The tuning advisor suggests to add an index on month-year for this query but this seems wasteful because a larger index is already available.
Solution
You may be able to use a variation of the following technique - which forces repeated 'MIN/MAX' range scans:
Assumptions
testbed:
--gets=11656
from (select add_months(sysdate,1-level) as d from dual connect by level
Some explanation in response to comments:
In each case (the testbed and the min/max query), the subquery factoring clause just generated a list of (year, month) tuples:
from foo
where month=m.month and year=m.year;
`
This is very quick because it makes use of the ordered nature of the PK - however it needs to be executed once for each month - if there are millions of rows for each month that makes sense, but not if there are few enough to fit in a small number of block.
Assumptions
- You can produce a list of all possible year/month combinations
numberis not null (which it can't be as it is in the PK, but I mention it as there is a way of working around if nulls are permitted)
testbed:
create table foo(month, year, num, primary key(month, year, num)) as
with m as ( select extract(month from d) as month, extract(year from d) as year
from (select add_months(sysdate,1-level) as d from dual connect by level
normal query:
select distinct month, year from foo;--gets=11656
min/max technique:
with m as ( select extract(month from d) as month, extract(year from d) as yearfrom (select add_months(sysdate,1-level) as d from dual connect by level
Some explanation in response to comments:
In each case (the testbed and the min/max query), the subquery factoring clause just generated a list of (year, month) tuples:
with m as ( select extract(month from d) as month, extract(year from d) as year
from (select add_months(sysdate,1-level) as d from dual connect by level
Then the technique uses a subquery in the select clause to check if any rows are present for the (month, year) - this subquery necessarily must only produce at most 1 row:
select min(num)from foo
where month=m.month and year=m.year;
`
This is very quick because it makes use of the ordered nature of the PK - however it needs to be executed once for each month - if there are millions of rows for each month that makes sense, but not if there are few enough to fit in a small number of block.
Context
StackExchange Database Administrators Q#11366, answer score: 5
Revisions (0)
No revisions yet.