patternsqlModerate
Selecting only one duplicate
Viewed 0 times
duplicateoneonlyselecting
Problem
I have a database of purchase orders and stockroom checkouts, and it seems that the stockroom checkouts tends to get duplicates inserted (we source this data from an older system nightly). What would be the select to get all rows from this table, but only select 1 of the duplicates at a time? I have around 100k duplicates, and my DB guy is still on christmas vacation :|
the tables do have unique keys, but other than the keys the data is duplicated a few thousand times. I'd like to avoid deleting the data (just in case). Server is running MSSQL2012, and i'll edit this post with a table structure when I get back to my desk.
the tables do have unique keys, but other than the keys the data is duplicated a few thousand times. I'd like to avoid deleting the data (just in case). Server is running MSSQL2012, and i'll edit this post with a table structure when I get back to my desk.
Solution
You can use a CTE for this, if you want the row that's returned to be a complete, intact row rather than aggregates of any of the other columns. You can change the
ORDER BY to prefer rows by any of the columns (the grouping is by the ones you think should be unique).;WITH x AS
(
SELECT col1, col2, col3,
rn = ROW_NUMBER() OVER
(
PARTITION BY unique_columns
ORDER BY unique_columns, tie_breaker_if_you_care
)
FROM dbo.source_table
)
SELECT col1, col2, col3 FROM x WHERE rn = 1;Code Snippets
;WITH x AS
(
SELECT col1, col2, col3,
rn = ROW_NUMBER() OVER
(
PARTITION BY unique_columns
ORDER BY unique_columns, tie_breaker_if_you_care
)
FROM dbo.source_table
)
SELECT col1, col2, col3 FROM x WHERE rn = 1;Context
StackExchange Database Administrators Q#125133, answer score: 10
Revisions (0)
No revisions yet.