patternMinor

Efficiently finding distinct values

Submitted by: @import:stackexchange-dba·Mar 10, 2026·

Viewed 0 times

valuesdistinctefficientlyfinding

Problem

I have a number of tables with the primary key (month, year, number) and differing cardinalities differ somewhat. For the tuple (month, year) the history doesn't go back very far, this will probably not grow beyond 50 in the very long term. For every (month, year) tuple there are not more than 2 million unique numbers. I want to know which combinations of month and years are available. I do this using this query:

select month, year from table group by month, year

This returns the correct result but does not seem to be very efficient. What is an efficient way to obtain this result (utilizing the unique index)?

The tuning advisor suggests to add an index on month-year for this query but this seems wasteful because a larger index is already available.

Solution

You may be able to use a variation of the following technique - which forces repeated 'MIN/MAX' range scans:

Assumptions

You can produce a list of all possible year/month combinations

number is not null (which it can't be as it is in the PK, but I mention it as there is a way of working around if nulls are permitted)

testbed:

create table foo(month, year, num, primary key(month, year, num)) as
with m as ( select extract(month from d) as month, extract(year from d) as year
            from (select add_months(sysdate,1-level) as d from dual connect by level

normal query:

select distinct month, year from foo;
--gets=11656



min/max technique:

with m as ( select extract(month from d) as month, extract(year from d) as year
from (select add_months(sysdate,1-level) as d from dual connect by level

Some explanation in response to comments:

In each case (the testbed and the min/max query), the subquery factoring clause just generated a list of (year, month) tuples:

with m as ( select extract(month from d) as month, extract(year from d) as year
            from (select add_months(sysdate,1-level) as d from dual connect by level

Then the technique uses a subquery in the

select

 clause to check if any rows are present for the (month, year) - this subquery necessarily must only produce at most 1 row:

select min(num)
from foo
where month=m.month and year=m.year;
`

This is very quick because it makes use of the ordered nature of the PK - however it needs to be executed once for each month - if there are millions of rows for each month that makes sense, but not if there are few enough to fit in a small number of block.

Context

StackExchange Database Administrators Q#11366, answer score: 5

Revisions (0)

No revisions yet.