snippetsqlMinor
PostgreSQL - Create table as select with distinct on specific columns
Viewed 0 times
postgresqldistinctcreatecolumnswithspecificselecttable
Problem
I need a script to create a table . The problem is my select for creating the table contains equal values in some rows. I need to use distinct for these columns. The other columns' values can come from any matching row.
My current result table has data like this:
But I need data like this:
For columns city, street and street_num I need to apply distinct. val_x and val_y should be used anyone, for example first of that group with same city, street and street_num.
Can you give me advice how to edit this script?
My current result table has data like this:
| CITY | STREET | STREET_NUM | VAL_X | VAL_Y |
------------------------------------------------------------------
| CityA | Street abc | 5 | 11.5 | 0.5 |
| CityA | Street abc | 5 | 15.4 | 1.8 |
| CityA | Street abc | 5 | 12.4 | 2.8 |
| CityB | Street xyz | 18 | 5.4 | 1.9 |
| CityB | Street xyz | 18 | 8.4 | 1.1 |
| CityC | Street klm | 55 | 9.6 | 0.8 |But I need data like this:
| CITY | STREET | STREET_NUM | VAL_X | VAL_Y |
------------------------------------------------------------------
| CityA | Street abc | 5 | 11.5 | 0.5 |
| CityB | Street xyz | 18 | 5.4 | 1.9 |
| CityC | Street klm | 55 | 9.6 | 0.8 |For columns city, street and street_num I need to apply distinct. val_x and val_y should be used anyone, for example first of that group with same city, street and street_num.
Can you give me advice how to edit this script?
Solution
"VAL_X" and "VAL_Y" chosen through some aggregate function
You should consider using
If you need to put together values from several tables,
Using always "VAL_X" and "VAL_Y" from same row, using a WINDOW
If you need to make sure your values are always from the same row, the best way is to use a
You can do this with two steps:
1) Add the
2) At this point, choose only the rows
The result is:
You can see the example at SQLFiddle
You should consider using
GROUP BY for the columns whose values you consider that should be "distinct" (as a group), and, for the rest of columns, choose an appropriate aggregate function (for instance, MIN):CREATE TABLE my_result AS
SELECT
city, street, streetnum, min(val_x) AS val_x, min(val_y) AS val_y
FROM
tableA
WHERE
true /* your condition goes here */
GROUP BY
city, street, streetnumIf you need to put together values from several tables,
UNION ALL of them before you GROUP BY:CREATE TABLE my_result AS
SELECT
city, street, streetnum, min(val_x) AS val_x, min(val_y) AS val_y
FROM
(
SELECT city, street, streetnum, val_x, val_y FROM tableA
UNION ALL
SELECT city, street, streetnum, val_x, val_y FROM tableB
UNION ALL
SELECT city, street, streetnum, val_x, val_y FROM tableC
) AS s0
WHERE
true /* your condition goes here */
GROUP BY
city, street, streetnum ;Using always "VAL_X" and "VAL_Y" from same row, using a WINDOW
If you need to make sure your values are always from the same row, the best way is to use a
WINDOW in your query: PARTITION BY "CITY", "STREET", "STREET_NUM" and ORDER BY "VAL_X", "VAL_Y", and choose the first row of every partition.You can do this with two steps:
1) Add the
row_num() to every partition:SELECT
*,
(row_number() OVER (PARTITION BY "CITY", "STREET", "STREET_NUM" ORDER BY "VAL_X", "VAL_Y")) AS rn
FROM
table_a
| CITY | STREET | STREET_NUM | VAL_X | VAL_Y | rn |
|-------|------------|------------|-------|-------|----|
| CityA | Street abc | 5 | 11.5 | 0.5 | 1 |
| CityA | Street abc | 5 | 12.4 | 2.8 | 2 |
| CityA | Street abc | 5 | 15.4 | 1.8 | 3 |
| CityB | Street xyz | 18 | 5.4 | 1.9 | 1 |
| CityB | Street xyz | 18 | 8.4 | 1.1 | 2 |
| CityC | Street klm | 55 | 9.6 | 0.8 | 1 |2) At this point, choose only the rows
WHERE rn=1 (and ORDER them, if necessary):SELECT
"CITY", "STREET", "STREET_NUM", "VAL_X", "VAL_Y"
FROM
(
SELECT
*,
(row_number() OVER (PARTITION BY "CITY", "STREET", "STREET_NUM" ORDER BY "VAL_X", "VAL_Y")) AS rn
FROM
table_a
) AS table_a_grouped
WHERE
rn = 1
ORDER BY
"CITY", "STREET", "STREET_NUM"The result is:
| CITY | STREET | STREET_NUM | VAL_X | VAL_Y |
|-------|------------|------------|-------|-------|
| CityA | Street abc | 5 | 11.5 | 0.5 |
| CityB | Street xyz | 18 | 5.4 | 1.9 |
| CityC | Street klm | 55 | 9.6 | 0.8 |You can see the example at SQLFiddle
Code Snippets
CREATE TABLE my_result AS
SELECT
city, street, streetnum, min(val_x) AS val_x, min(val_y) AS val_y
FROM
tableA
WHERE
true /* your condition goes here */
GROUP BY
city, street, streetnumCREATE TABLE my_result AS
SELECT
city, street, streetnum, min(val_x) AS val_x, min(val_y) AS val_y
FROM
(
SELECT city, street, streetnum, val_x, val_y FROM tableA
UNION ALL
SELECT city, street, streetnum, val_x, val_y FROM tableB
UNION ALL
SELECT city, street, streetnum, val_x, val_y FROM tableC
) AS s0
WHERE
true /* your condition goes here */
GROUP BY
city, street, streetnum ;SELECT
*,
(row_number() OVER (PARTITION BY "CITY", "STREET", "STREET_NUM" ORDER BY "VAL_X", "VAL_Y")) AS rn
FROM
table_a
| CITY | STREET | STREET_NUM | VAL_X | VAL_Y | rn |
|-------|------------|------------|-------|-------|----|
| CityA | Street abc | 5 | 11.5 | 0.5 | 1 |
| CityA | Street abc | 5 | 12.4 | 2.8 | 2 |
| CityA | Street abc | 5 | 15.4 | 1.8 | 3 |
| CityB | Street xyz | 18 | 5.4 | 1.9 | 1 |
| CityB | Street xyz | 18 | 8.4 | 1.1 | 2 |
| CityC | Street klm | 55 | 9.6 | 0.8 | 1 |SELECT
"CITY", "STREET", "STREET_NUM", "VAL_X", "VAL_Y"
FROM
(
SELECT
*,
(row_number() OVER (PARTITION BY "CITY", "STREET", "STREET_NUM" ORDER BY "VAL_X", "VAL_Y")) AS rn
FROM
table_a
) AS table_a_grouped
WHERE
rn = 1
ORDER BY
"CITY", "STREET", "STREET_NUM"| CITY | STREET | STREET_NUM | VAL_X | VAL_Y |
|-------|------------|------------|-------|-------|
| CityA | Street abc | 5 | 11.5 | 0.5 |
| CityB | Street xyz | 18 | 5.4 | 1.9 |
| CityC | Street klm | 55 | 9.6 | 0.8 |Context
StackExchange Database Administrators Q#166955, answer score: 3
Revisions (0)
No revisions yet.