principleModerate
Best data modelling approach to handle redundant foreign keys in a database about Surveys, Questions and Responses
Viewed 0 times
handlequestionssurveysforeignresponsesredundantkeysdatabaseaboutand
Problem
I am looking for advice on the best relational modelling approach to store surveys, questions, and responses.
I am looking for which of the two approaches below looks best, or an alternative approach to either.
I have at least these entities:
And at least these relationships:
Here is where I run into trouble: how to model the responses to survey questions made by a person.
Here are two approaches I've considered, neither of which seems very good to me. The diagrams here are greatly simplified to illustrate the issue.
Approach 1:
What I don't like about this approach:
Approach 2:
Try to avoid two FKs from approach 1 that should refer to the same value...
What I don't like about this approach:
Any advice on:
Would be greatly appreciated!
I am looking for which of the two approaches below looks best, or an alternative approach to either.
I have at least these entities:
- question
- survey
- person
And at least these relationships:
- Each survey has 1 or more questions.
- Each question may be used in 0 or more surveys.
- Each person may take 0 or more surveys.
Here is where I run into trouble: how to model the responses to survey questions made by a person.
Here are two approaches I've considered, neither of which seems very good to me. The diagrams here are greatly simplified to illustrate the issue.
Approach 1:
What I don't like about this approach:
- The
survey_person_question_responsetable has two different columns that refer to a survey:survey_question_survey_idandsurvey_person_survey_id
- It would be an error to have different
survey_id's referenced in one row for these two columns. The survey_question must be from the same survey as the person took in survey_person. I can't see a good way to enforce this.
- It seems like what I am doing here is making a relationship between two relationships. That feels wrong to me for some reason.
Approach 2:
Try to avoid two FKs from approach 1 that should refer to the same value...
What I don't like about this approach:
- There is no enforcement that the
question_idandsurvey_idFKs are from a validsurvey_questionpair
- There is no enforcement that the
survey_idandperson_idFKs are from a validsurvey_personpair
Any advice on:
- Whether one of these approaches is a typical approach
- The pros and cons of one of these approaches over the other
- A better way to arrange this data entirely
Would be greatly appreciated!
Solution
As per my understanding of your specifications, your business environment involves a conceptual-level ternary relationship. In this regard, you need to define:
So, I consider that you are on the right track with your Approach 1, although it requires some small (yet important) refinements in order to make it more accurate. I will detail such refinements and other relevant considerations in the following sections.
Business rules
Let us expand the applicable business rules a bit and reformulate them in the following way:
Expository IDEF1X diagram
Then, I have created the IDEF1Xa diagram that is presented in Figure 1, which synthesizes the business rules formulated above:
a Integration Definition for Information Modeling (IDEF1X) is a highly recommendable modeling technique that was established as a standard in December 1993 by the United States National Institute of Standards and Technology (NIST). It is solidly based on theoretical work authored by the sole founder of the relational model, i.e., Dr. E. F. Codd and also on the entity-relationship view developed by Dr. P. P. Chen.
The PersonSurvey relationship
As I see it, the PersonSurvey relationship is required to provide a means of authorization so that a Person can take part in a given Survey. In this way, once a certain Person has been registered in a specific Survey, he or she is authorized to provide Responses to the Questions that integrate the respective Survey.
The SurveyQuestion relationship
I assume that the property (or attribute) called suvery_question.question_number in your diagram is used to represent the Order of presentation of a given Question instance with respect to a particular Survey. As you can see, I have denoted such property as SurveyQuestion.PresentationOrder, and I think that you should prevent that (i) two or more Question.QuestionNumber values share (ii) the same PresentationOrder value in (iii) the same SurveyQuestion occurrence.
To portray that need, I have included a composite ALTERNATE KEY (AK) in the box representing this entity type, which is comprised of the combination of properties (SurveyNumber, QuestionNumber, PresentationOrder). As you are well aware, a composite AK can be declared in a logical DDL design with the aid of a multi-column UNIQUE constraint (as I exemplified in the
The Response entity type
Yes, with the Response entity type I am depicting a relationship between two other relationships; it may seem awkward at first glance but there is nothing wrong with this approach, as long as it (a) represents the features of the business context of interest accurately and (b) is represented properly in a logical-level layout.
Yes, you are totally correct, it would be an error to portray that part of the scenario at the logical level of abstraction by means of two
Derived logical SQL-DDL layout
```
-- You should determine which are the most fitting
-- data types and sizes for all your table columns
-- depending on your business context characteristics.
-- As one would expect, you are free to make use of
-- your preferred (or required) naming conventions.
CREATE TABLE Person (
PersonId INT NOT NULL,
FirstName CHAR(30) NOT NULL,
LastName CHAR(30) NOT NULL,
GenderCode CHAR(3) NOT NULL,
BirthDate DATE NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT Person_PK PRIMARY KEY (PersonId),
CONSTRAINT Person_AK UNIQUE (
FirstName,
LastName,
GenderCode,
BirthDate
)
);
CREATE TABLE Survey (
SurveyNumber INT NOT NULL,
Description CHAR(255) NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT Survey_PK PRIMARY KEY (SurveyNumber),
CONSTRAINT Survey_AK UNIQUE (Description)
);
CREATE TABLE PersonSurvey (
PersonId INT NOT NULL,
SurveyNumber INT NOT NULL,
RegisteredDateTime DATETIME NOT NULL,
--
CONSTRAINT PersonSurvey_PK PRIMARY KEY (PersonId, Su
- the relationship (or association) type between the entity types Person and Survey;
- the relationship type between Survey and Question;
- the relationship type that establishes the connection between the two aforesaid relationship types and, as a consequence, between Person, Survey and Question, i.e., Response (a shorter name that simplifies interpretation, from my point of view).
So, I consider that you are on the right track with your Approach 1, although it requires some small (yet important) refinements in order to make it more accurate. I will detail such refinements and other relevant considerations in the following sections.
Business rules
Let us expand the applicable business rules a bit and reformulate them in the following way:
- A Person registers in zero-one-or-many Surveys
- A Survey gets the registration of zero-one-or-many Persons
- A Survey is integrated by one-to-many Questions
- A Question integrates zero-one-or-many Surveys
- A Question receives zero-one-or-many Responses
- A Response is provided by exactly-one Person in the context of exactly-one Survey
Expository IDEF1X diagram
Then, I have created the IDEF1Xa diagram that is presented in Figure 1, which synthesizes the business rules formulated above:
a Integration Definition for Information Modeling (IDEF1X) is a highly recommendable modeling technique that was established as a standard in December 1993 by the United States National Institute of Standards and Technology (NIST). It is solidly based on theoretical work authored by the sole founder of the relational model, i.e., Dr. E. F. Codd and also on the entity-relationship view developed by Dr. P. P. Chen.
The PersonSurvey relationship
As I see it, the PersonSurvey relationship is required to provide a means of authorization so that a Person can take part in a given Survey. In this way, once a certain Person has been registered in a specific Survey, he or she is authorized to provide Responses to the Questions that integrate the respective Survey.
The SurveyQuestion relationship
I assume that the property (or attribute) called suvery_question.question_number in your diagram is used to represent the Order of presentation of a given Question instance with respect to a particular Survey. As you can see, I have denoted such property as SurveyQuestion.PresentationOrder, and I think that you should prevent that (i) two or more Question.QuestionNumber values share (ii) the same PresentationOrder value in (iii) the same SurveyQuestion occurrence.
To portray that need, I have included a composite ALTERNATE KEY (AK) in the box representing this entity type, which is comprised of the combination of properties (SurveyNumber, QuestionNumber, PresentationOrder). As you are well aware, a composite AK can be declared in a logical DDL design with the aid of a multi-column UNIQUE constraint (as I exemplified in the
SurveyQuestion table that is part of the expository DDL layout expounded a few sections below).The Response entity type
Yes, with the Response entity type I am depicting a relationship between two other relationships; it may seem awkward at first glance but there is nothing wrong with this approach, as long as it (a) represents the features of the business context of interest accurately and (b) is represented properly in a logical-level layout.
Yes, you are totally correct, it would be an error to portray that part of the scenario at the logical level of abstraction by means of two
Response.SurveyNumber (or, say, Response.SurveyId) values referenced from two different columns in the same Response row.Derived logical SQL-DDL layout
```
-- You should determine which are the most fitting
-- data types and sizes for all your table columns
-- depending on your business context characteristics.
-- As one would expect, you are free to make use of
-- your preferred (or required) naming conventions.
CREATE TABLE Person (
PersonId INT NOT NULL,
FirstName CHAR(30) NOT NULL,
LastName CHAR(30) NOT NULL,
GenderCode CHAR(3) NOT NULL,
BirthDate DATE NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT Person_PK PRIMARY KEY (PersonId),
CONSTRAINT Person_AK UNIQUE (
FirstName,
LastName,
GenderCode,
BirthDate
)
);
CREATE TABLE Survey (
SurveyNumber INT NOT NULL,
Description CHAR(255) NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT Survey_PK PRIMARY KEY (SurveyNumber),
CONSTRAINT Survey_AK UNIQUE (Description)
);
CREATE TABLE PersonSurvey (
PersonId INT NOT NULL,
SurveyNumber INT NOT NULL,
RegisteredDateTime DATETIME NOT NULL,
--
CONSTRAINT PersonSurvey_PK PRIMARY KEY (PersonId, Su
Code Snippets
-- You should determine which are the most fitting
-- data types and sizes for all your table columns
-- depending on your business context characteristics.
-- As one would expect, you are free to make use of
-- your preferred (or required) naming conventions.
CREATE TABLE Person (
PersonId INT NOT NULL,
FirstName CHAR(30) NOT NULL,
LastName CHAR(30) NOT NULL,
GenderCode CHAR(3) NOT NULL,
BirthDate DATE NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT Person_PK PRIMARY KEY (PersonId),
CONSTRAINT Person_AK UNIQUE (
FirstName,
LastName,
GenderCode,
BirthDate
)
);
CREATE TABLE Survey (
SurveyNumber INT NOT NULL,
Description CHAR(255) NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT Survey_PK PRIMARY KEY (SurveyNumber),
CONSTRAINT Survey_AK UNIQUE (Description)
);
CREATE TABLE PersonSurvey (
PersonId INT NOT NULL,
SurveyNumber INT NOT NULL,
RegisteredDateTime DATETIME NOT NULL,
--
CONSTRAINT PersonSurvey_PK PRIMARY KEY (PersonId, SurveyNumber),
CONSTRAINT PersonSurveyToPerson_FK FOREIGN KEY (PersonId)
REFERENCES Person (PersonId),
CONSTRAINT PersonSurveyToSurvey_FK FOREIGN KEY (SurveyNumber)
REFERENCES Survey (SurveyNumber)
);
CREATE TABLE Question (
QuestionNumber INT NOT NULL,
Wording CHAR(255) NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT Question_PK PRIMARY KEY (QuestionNumber),
CONSTRAINT Question_AK UNIQUE (Wording)
);
CREATE TABLE SurveyQuestion (
SurveyNumber INT NOT NULL,
QuestionNumber INT NOT NULL,
PresentationOrder TINYINT NOT NULL,
IsMandatory BIT NOT NULL,
IntegratedDateTime DATETIME NOT NULL,
--
CONSTRAINT SurveyQuestion_PK PRIMARY KEY (SurveyNumber, QuestionNumber),
CONSTRAINT SurveyQuestion_AK UNIQUE (
QuestionNumber,
SurveyNumber,
PresentationOrder
),
CONSTRAINT SurveyQuestionToSurvey_FK FOREIGN KEY (SurveyNumber)
REFERENCES Survey (SurveyNumber),
CONSTRAINT SurveyQuestionToQuestion_FK FOREIGN KEY (QuestionNumber)
REFERENCES Question (QuestionNumber)
);
CREATE TABLE Response (
SurveyNumber INT NOT NULL,
QuestionNumber INT NOT NULL,
PersonId INT NOT NULL,
Content TEXT NOT NULL,
ProvidedDateTime DATETIME NOT NULL,
--
CONSTRAINT Response_PK PRIMARY KEY (SurveyNumber, QuestionNumber, PersonId),
CONSTRAINT ResponseToPersonSurvey_FK FOREIGN KEY (PersonId, SurveyNumber)
REFERENCES PersonSurvey (PersonId, SurveyNumber),
CONSTRAINT ResponseToSurveyQuestion_FK FOREIGN KEY (SurveyNumber, QuestionNumber)
REFERENCES SurveyQuestion (SurveyNumber, QuestionNumber)
);Context
StackExchange Database Administrators Q#120246, answer score: 13
Revisions (0)
No revisions yet.