HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

What's the best way to concatenate a multi-column foreign key?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
themultiwhatcolumnwayforeignconcatenatekeybest

Problem

We are expanding our data warehouse to include a new source system. In the new system, we want a single column for a primary key, but in the old system it's two columns.

For exmaple:

create table #new_case (case_id varchar(255))

create table #old_case (patient_id int, care_event_counter int)

insert into #old_case (patient_id, Care_event_counter)
values 
(55, 1),
(55, 2),
(55, 3),
(56, 1),
(57, 1),
(57, 2)

insert into #new_case (case_id)
select cast(patient_id as varchar(55)) + '_' + cast(care_event_counter as varchar(55))
from #old_case

select * from #new_case


problem: This has got to be super slow on a large scale, and I'd prefer to use int across the board.

Question: How do you merge two integer columns intelligently? is there a math trick you could use to derive the values that would be faster?

Solution

It's quite common in Data Warehouse schema designs to assign a new, integer, single-column key to every table, and retain the source-system key (sometimes called the "business key") as well.

Kimball says:


Natural keys created by operational source systems are subject to
business rules outside the control of the DW/BI system. For instance,
an employee number (natural key) may be changed if the employee
resigns and then is rehired. When the data warehouse wants to have a
single key for that employee, a new durable key must be created that
is persistent and does not change in this situation. This key is
sometimes referred to as a durable supernatural key. The best durable
keys have a format that is independent of the original business
process and thus should be simple integers assigned in sequence
beginning with 1. While multiple surrogate keys may be associated with
an employee over time as their profile changes, the durable key never
changes.

Natural, Durable, and Supernatural Keys

So something like:

create table new_case 
(
  case_id int identity primary key,
  patient_id int not null,
  care_event_counter int not null,
  constraint uk_case unique(patient_id,care_event_counter),
  ... 
)


You need to retain the other key and have it indexed because with new data comes in with the business key, you need to associate it with the new warehouse ID.

Code Snippets

create table new_case 
(
  case_id int identity primary key,
  patient_id int not null,
  care_event_counter int not null,
  constraint uk_case unique(patient_id,care_event_counter),
  ... 
)

Context

StackExchange Database Administrators Q#255494, answer score: 5

Revisions (0)

No revisions yet.