patternsqlMinor
Splitting data into two tables, in one go
Viewed 0 times
tablesintosplittingtwoonedata
Problem
I am using SQL Server 2008 and have a problem which I don't know how to solve without using many temporary tables and an unreliable join.
Table 1 contains 6 columns of data, which is then split into two tables. Col1 to Col3 goes into Table 2, and Col4 to Col6 goes into Table 3. Getting the data into Table 2 and Table 3 is the easy part. However T2ID in Table 3 is a foreign key to the ID in Table 2.
Performance is key so I don't want to use variables and/or iterate through the data row by row, ideally I just want one insert which does the lot.
I've tried using a Link Table but the data in Table 2 and Table 3 is not unique which makes joining unreliable.
Any suggestions?
I can't change T2 or T3. T1 is not a staging Table, and I can't alter that either. There are other packages also writing to T1, T2, and T3. Although it is possible to schedule them at different times should I need to.
Table 1 contains 6 columns of data, which is then split into two tables. Col1 to Col3 goes into Table 2, and Col4 to Col6 goes into Table 3. Getting the data into Table 2 and Table 3 is the easy part. However T2ID in Table 3 is a foreign key to the ID in Table 2.
Performance is key so I don't want to use variables and/or iterate through the data row by row, ideally I just want one insert which does the lot.
I've tried using a Link Table but the data in Table 2 and Table 3 is not unique which makes joining unreliable.
Any suggestions?
Create Table T1 (
ID INT IDENTITY(1,1),
Col1 VARCHAR(10),
Col2 VARCHAR(10),
Col3 VARCHAR(10),
Col4 VARCHAR(10),
Col5 VARCHAR(10),
Col6 VARCHAR(10)
)
Create Table T2 (
ID INT IDENTITY(1,1),
Col1 VARCHAR(10),
Col2 VARCHAR(10),
Col3 VARCHAR(10)
)
Create Table T3 (
ID INT IDENTITY(1,1),
T2ID INT,
Col4 VARCHAR(10),
Col5 VARCHAR(10),
Col6 VARCHAR(10)
)I can't change T2 or T3. T1 is not a staging Table, and I can't alter that either. There are other packages also writing to T1, T2, and T3. Although it is possible to schedule them at different times should I need to.
Solution
If you use MERGE to insert the data into
Unlike INSERT, which would let you reference the
Once you have the mapping table, you can just use it in a join when inserting into
Wrap both statements in one transaction to make your split operation atomic.
Discussion about the MERGE mapping method can be found in this thread:
T2, you can generate a mapping table between T1.ID and T2.ID:DECLARE @Mapping TABLE
(
T1ID int,
T2ID int
);
MERGE INTO
dbo.T2 AS tgt
USING
dbo.T1 AS src
ON
1 = 0
WHEN NOT MATCHED THEN
INSERT ( Col1, Col2, Col3)
VALUES (src.Col1, src.Col2, src.Col3)
OUTPUT
src.ID, inserted.ID INTO @Mapping (T1ID, T2ID)
;Unlike INSERT, which would let you reference the
inserted table's columns only in the OUTPUT clause, the MERGE statement allows you to reference the source table's columns as well. That is key to this solution, because that is how you associate the source IDs with the target IDs.Once you have the mapping table, you can just use it in a join when inserting into
T3. A plain INSERT ... SELECT will do this time:INSERT INTO
dbo.T3 (T2ID, Col4, Col5, Col6)
SELECT
m.T2ID,
t.Col4,
t.Col5,
t.Col6
FROM
dbo.T1 AS t
INNER JOIN @Mapping AS m ON t.ID = m.T1ID
;Wrap both statements in one transaction to make your split operation atomic.
Discussion about the MERGE mapping method can be found in this thread:
- Using merge..output to get mapping between source.id and target.id
Code Snippets
DECLARE @Mapping TABLE
(
T1ID int,
T2ID int
);
MERGE INTO
dbo.T2 AS tgt
USING
dbo.T1 AS src
ON
1 = 0
WHEN NOT MATCHED THEN
INSERT ( Col1, Col2, Col3)
VALUES (src.Col1, src.Col2, src.Col3)
OUTPUT
src.ID, inserted.ID INTO @Mapping (T1ID, T2ID)
;INSERT INTO
dbo.T3 (T2ID, Col4, Col5, Col6)
SELECT
m.T2ID,
t.Col4,
t.Col5,
t.Col6
FROM
dbo.T1 AS t
INNER JOIN @Mapping AS m ON t.ID = m.T1ID
;Context
StackExchange Database Administrators Q#160210, answer score: 7
Revisions (0)
No revisions yet.