HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to use MAX() to select a single record but prevent GROUP BY from selecting all distinct values and multiple records

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
fromdistinctpreventgroupallbutandrecordsrecordmax

Problem

I have some Line items that have version numbers:

[Doc No]    [Version No]    [Line No]    [Data-a]    [Data-b]
1           1               1000         abc         abc
1           2               1000         xyz         abc
1           1               2000         null        lmnop
1           2               2000         ggg         lmnop


I want to only select the highest version number per [Line No]:

[Doc No]    [Version No]    [Line No]    [Data-a]    [Data-b]
1           2               1000         xyz         abc
1           2               2000         ggg        lmnop


I am doing this using this tsql:

SELECT [Doc No]
      ,max([Version No])
      ,[Line No]
      ,[Data-a]
      ,[Data-b]
FROM [table]
GROUP BY [Doc No], [Line No], [Data-a], [Data-b]


The problem obviously is ... if you notice in the first table the [Data-a] column has changing values (hence the need for version numbers. So I want to see the latest version and the value contained in [Data-a] for that record. But since I am using max() to get the latest version, I have to put [Data-a] in the GROUP BY clause - which causes each two distinct values to be pulled and thus two records showing both version numbers. How can I solve this problem?

Solution

Just incase you need another solution. Below uses row_number() over (partition by .. order by)

CREATE TABLE dbo.docStore
    ([Doc No] int, 
     [Version No] int,
     [Line No] int ,
     [Data-a] varchar(10),
     [Data-b] varchar(10))
;

INSERT INTO dbo.docStore
    ([Doc No],[Version No],[Line No],[Data-a],[Data-b])
VALUES
    (1,1,1000,'abc','abc'    ),
    (1,2,1000,'xyz','abc'    ),
    (1,1,2000,null,'lmnop'   ),
    (1,2,2000,'ggg','lmnop'  )
;

with cte as 
(
select [Doc No], [Version No], [Line No], [Data-a], [Data-b], 
        row_number() over (partition by [Doc No], [Line No] 
                           order by [Version No] desc) 
        as rn 
from dbo.docStore
)
select [Doc No], [Version No], [Line No], [Data-a], [Data-b] 
from cte
where rn = 1
;

-- clean up 
drop table dbo.docStore 
;

Code Snippets

CREATE TABLE dbo.docStore
    ([Doc No] int, 
     [Version No] int,
     [Line No] int ,
     [Data-a] varchar(10),
     [Data-b] varchar(10))
;

INSERT INTO dbo.docStore
    ([Doc No],[Version No],[Line No],[Data-a],[Data-b])
VALUES
    (1,1,1000,'abc','abc'    ),
    (1,2,1000,'xyz','abc'    ),
    (1,1,2000,null,'lmnop'   ),
    (1,2,2000,'ggg','lmnop'  )
;

with cte as 
(
select [Doc No], [Version No], [Line No], [Data-a], [Data-b], 
        row_number() over (partition by [Doc No], [Line No] 
                           order by [Version No] desc) 
        as rn 
from dbo.docStore
)
select [Doc No], [Version No], [Line No], [Data-a], [Data-b] 
from cte
where rn = 1
;

-- clean up 
drop table dbo.docStore 
;

Context

StackExchange Database Administrators Q#148601, answer score: 4

Revisions (0)

No revisions yet.