snippetsqlMinor
How to set a value if certain columns are the same across multiple rows
Viewed 0 times
acrosssametherowscolumnsarevaluemultiplehowcertain
Problem
I am doing a data migration and I end up with a temporary table as follows:
I need to split the data into two tables.
Based on that same data, it should look like this:
That's one row per distinct
The actual
curid cuid rtid cd dd rm
10 4 4 2016-01-02 2016-07-02
16 4 4 2016-06-12 2016-12-12 Remarks Jun 12
18 5 3 2016-07-18 2017-07-31
8 5 3 2015-06-21 2016-06-30 Add some test
11 6 4 2017-01-01 2017-07-01
9 7 3 2017-01-01 2018-01-31I need to split the data into two tables.
Based on that same data, it should look like this:
Table A
id curid cuid rtid
1 10 4 4
2 18 5 3
3 11 6 4
4 9 7 3That's one row per distinct
(cuid, rtid) plus a curid value picked from each set of duplicates. id is just a sequential number.Table B
id curid cd dd rm
1 10 2016-01-02 2016-07-02
2 10 2016-06-12 2016-12-12 Remarks Jun 12
3 18 2016-07-18 2017-07-31
4 18 2015-06-21 2016-06-30 Add some test
5 11 2017-01-01 2017-07-01
6 9 2017-01-01 2018-01-31The actual
curid is irrelevant as long as the records in Table B match the associated record in Table A (so we could even use a temp sequence or something to set the curid).Solution
Your test setup
(Best provided this way in your question - hint!)
Solution
Create target tables if they don't exist:
Use
id | curid | cuid | rtid
-: | ----: | ---: | ---:
1 | 10 | 4 | 4
2 | 8 | 5 | 3
3 | 11 | 6 | 4
4 | 9 | 7 | 3
Detailed explanation here:
Use a simple window function for table B:
id | curid | cd | dd | rm
-: | ----: | :--------- | :--------- | :-------------
1 | 10 | 2016-01-02 | 2016-07-02 | null
2 | 10 | 2016-06-12 | 2016-12-12 | Remarks Jun 12
3 | 8 | 2016-07-18 | 2017-07-31 | null
4 | 8 | 2015-06-21 | 2016-06-30 | Add some test
5 | 11 | 2017-01-01 | 2017-07-01 | null
6 | 9 | 2017-01-01 | 2018-01-31 | null
dbfiddle here
(Best provided this way in your question - hint!)
CREATE TEMP TABLE tmp (
curid int
, cuid int
, rtid int
, cd date
, dd date
, rm text);
INSERT INTO tmp VALUES
(10, 4, 4, '2016-01-02', '2016-07-02', NULL)
,(16, 4, 4, '2016-06-12', '2016-12-12', 'Remarks Jun 12')
,(18, 5, 3, '2016-07-18', '2017-07-31', NULL)
,(8 , 5, 3, '2015-06-21', '2016-06-30', 'Add some test')
,(11, 6, 4, '2017-01-01', '2017-07-01', NULL)
,(9 , 7, 3, '2017-01-01', '2018-01-31', NULL);Solution
Create target tables if they don't exist:
CREATE TEMP TABLE a (
id serial
, curid int -- UNIQUE?
, cuid int
, rtid int
);
CREATE TEMP TABLE b (
id serial
, curid int
, cd date
, dd date
, rm text
);Use
DISTINCT ON for table A:INSERT INTO a (curid, cuid, rtid)
SELECT DISTINCT ON (cuid, rtid)
curid, cuid, rtid
FROM tmp
ORDER BY cuid, rtid, curid -- pick smallest curid per group
RETURNING *;id | curid | cuid | rtid
-: | ----: | ---: | ---:
1 | 10 | 4 | 4
2 | 8 | 5 | 3
3 | 11 | 6 | 4
4 | 9 | 7 | 3
Detailed explanation here:
- Select first row in each GROUP BY group?
Use a simple window function for table B:
INSERT INTO b (curid, cd, dd, rm)
SELECT min(curid) OVER (PARTITION BY cuid, rtid), cd, dd, rm
FROM tmp
ORDER BY cuid, rtid -- optional
RETURNING *;id | curid | cd | dd | rm
-: | ----: | :--------- | :--------- | :-------------
1 | 10 | 2016-01-02 | 2016-07-02 | null
2 | 10 | 2016-06-12 | 2016-12-12 | Remarks Jun 12
3 | 8 | 2016-07-18 | 2017-07-31 | null
4 | 8 | 2015-06-21 | 2016-06-30 | Add some test
5 | 11 | 2017-01-01 | 2017-07-01 | null
6 | 9 | 2017-01-01 | 2018-01-31 | null
curid is guaranteed to match since we picked the smallest per group in both queries.dbfiddle here
Code Snippets
CREATE TEMP TABLE tmp (
curid int
, cuid int
, rtid int
, cd date
, dd date
, rm text);
INSERT INTO tmp VALUES
(10, 4, 4, '2016-01-02', '2016-07-02', NULL)
,(16, 4, 4, '2016-06-12', '2016-12-12', 'Remarks Jun 12')
,(18, 5, 3, '2016-07-18', '2017-07-31', NULL)
,(8 , 5, 3, '2015-06-21', '2016-06-30', 'Add some test')
,(11, 6, 4, '2017-01-01', '2017-07-01', NULL)
,(9 , 7, 3, '2017-01-01', '2018-01-31', NULL);CREATE TEMP TABLE a (
id serial
, curid int -- UNIQUE?
, cuid int
, rtid int
);
CREATE TEMP TABLE b (
id serial
, curid int
, cd date
, dd date
, rm text
);INSERT INTO a (curid, cuid, rtid)
SELECT DISTINCT ON (cuid, rtid)
curid, cuid, rtid
FROM tmp
ORDER BY cuid, rtid, curid -- pick smallest curid per group
RETURNING *;INSERT INTO b (curid, cd, dd, rm)
SELECT min(curid) OVER (PARTITION BY cuid, rtid), cd, dd, rm
FROM tmp
ORDER BY cuid, rtid -- optional
RETURNING *;Context
StackExchange Database Administrators Q#177957, answer score: 3
Revisions (0)
No revisions yet.