HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Should I use a compound or surrogate PK in a many-to-many table?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
compoundshouldsurrogatemanyusetable

Problem

I have a database:

DROP TABLE IF EXISTS `books`;

CREATE TABLE `books` (
  `isbn` VARCHAR(255) NOT NULL,
  `title` VARCHAR(255) NULL DEFAULT NULL,
  PRIMARY KEY (`isbn`)
) COMMENT 'Books used at this school';

DROP TABLE IF EXISTS `classes`;

CREATE TABLE `classes` (
  `class_id` INT(10) NOT NULL AUTO_INCREMENT,
  `teacher_id` SMALLINT(5) NULL DEFAULT NULL,
  PRIMARY KEY (`class_id`)
) COMMENT 'Classes at the school';

DROP TABLE IF EXISTS `b_c`;

CREATE TABLE `b_c` (
  `isbn` VARCHAR(255) NOT NULL,
  `class_id` INT(10) NOT NULL,
  PRIMARY KEY (`isbn`)
) COMMENT 'Books to classes';

ALTER TABLE `b_c` ADD FOREIGN KEY (isbn) REFERENCES `books` (`isbn`) 
    ON UPDATE CASCADE;
ALTER TABLE `b_c` ADD FOREIGN KEY (class_id) REFERENCES `classes` (`class_id`) 
    ON UPDATE CASCADE;


The issue I'm having is that I would like to normalize data as much as possible (I don't want multiple entries for the same relationship to be entered into the table b_c), but I would like to only store what data is absolutely pertinent.

My first idea to deal with this is to just create a compound primary key for the b_c table consisting of the fields isbn and class_id which would solve the issue of having duplicate relationships in the table, however, I have heard strong opinions on having a unique identifier for every row in a table like this. The justification for having a unique identifier for every row seems to be that it's useful to be able to specify a specific row, though I don't see a situation in which this would become useful. Can someone offer an example?

Another criticism I've heard is that using compound PKs in this way can make JOINs extremely taxing. Can someone comment on the performance of these two different methods?

The question boils down to "Is it worth it to add an id field to the b_c table or is the use of compound PKs enough to properly represent the relationship between the books and classes tables?

If you have any other comments about

Solution

FWIW ISBN is terrible as a primary key. For one, what happens if you get a book you want to put up for pre-order, but the ISBN hasn't been assigned yet? What happens when the ISBN changes (yes, this happens!)? What happens when they change the ISBN format yet again? I would say make that a candidate key but use a surrogate for the PK.

Adding to that the other issues with using strings as keys, and then on top of that a compound key that you require, I just don't understand why you want to fight surrogates when you're getting the suggestion from so many angles. You know that many of the people who are pushing you that way have a lot of experience, right?

Context

StackExchange Database Administrators Q#38708, answer score: 6

Revisions (0)

No revisions yet.