principleMinor
Rank vs. Distinct
Viewed 0 times
distinctrankstackoverflow
Problem
Just for curiosity, if I have to use
vs.
When ordered by
rank to tag distinct values for some columns only, and ignore whatever different values occurred on other columns, why not use distinct and just remove the other columns? Any real life examples?SELECT * FROM (
SELECT
col1,
col2,
col3,
col4,
row_number () over (partition by col1, col2 ORDER BY col3) RN
FROM table) tmp_table
WHERE rn = 1vs.
SELECT DISTINCT
col1,
col2
FROM tableWhen ordered by
col3, there is a chance you can get different values of col3 compared to rank 2, right? Then if that col3 is not important because it is not consistent, why not use distinct instead for only col1 and col2?Solution
A few things:
- You are using ROW_NUMBER(), not RANK()
- For the example you cite, it does not really make much difference
- For other situations, ROW_NUMBER() provides additional flexibility. For example, suppose you are doing a data cleanse, and you want to read you input table and write all the duplicated rows into another table, so that you can process them further. Then you can do something like:
insert /*+ APPEND */ into table2
select * from (
select col1
, col2
, col3
, col4
row_number() over (partition by order by ) rn
from table )
where rn > 1Code Snippets
insert /*+ APPEND */ into table2
select * from (
select col1
, col2
, col3
, col4
row_number() over (partition by <key columns> order by <xxx> ) rn
from table )
where rn > 1Context
StackExchange Database Administrators Q#164364, answer score: 2
Revisions (0)
No revisions yet.