snippetsqlModerate
Should I create multiple tables for different entity states, statuses or stages?
Viewed 0 times
tablescreatestatesstatusesstagesdifferentformultipleshouldentity
Problem
I have a
Despite having “open” and “begun” in the same table, identified using flags, I decided for some reason early on to have “in review” and “completed” tasks in their own separate tables. While this seemed like a good idea to start off with, making it easy to query for specific things, when I thought about implementing a comment feature for tasks I realised that the
Have I gone about this all wrong?
Current considerations
Some solutions I thought up would be to redo the organisation and put everything in one table, using flags to differentiate purely between the states (this would lead to a fair amount of reworking); to create some kind of
What's the done thing in this situation? I'm not the most versed in database design and I'm kind of making this up as I go along.
Any help would be greatly appreciated.
Responses to further information requests made by @MDCCL via comments
So, the state (or status) of a task can change over time, right? How many times can the same concrete task present the same state, let us say, “open”? Will it vary depending on the kind of state presented by the relevant task?
Yes, a task.state could theoretically change up to four times. I won't go into specifics of the logic but the longest chain would be
The exception is whether it's “begun” or not, a user may abandon and then re-undertake the task infinitely - in theory at l
tasks table in my database and, in the business domain of interest, tasks can have multiple states: “open”, “begun”, “in review” and “completed”.Despite having “open” and “begun” in the same table, identified using flags, I decided for some reason early on to have “in review” and “completed” tasks in their own separate tables. While this seemed like a good idea to start off with, making it easy to query for specific things, when I thought about implementing a comment feature for tasks I realised that the
task_id value would be changing across the three tables.Have I gone about this all wrong?
Current considerations
Some solutions I thought up would be to redo the organisation and put everything in one table, using flags to differentiate purely between the states (this would lead to a fair amount of reworking); to create some kind of
UUID that is transferable across the tables - though I think there might issues with performance if this is done; and finally, to set the IDs across the tables to be no-longer auto incremented and to simply “inherit” the original value from the first task table (which auto increments).What's the done thing in this situation? I'm not the most versed in database design and I'm kind of making this up as I go along.
Any help would be greatly appreciated.
Responses to further information requests made by @MDCCL via comments
So, the state (or status) of a task can change over time, right? How many times can the same concrete task present the same state, let us say, “open”? Will it vary depending on the kind of state presented by the relevant task?
Yes, a task.state could theoretically change up to four times. I won't go into specifics of the logic but the longest chain would be
“begun” → “completed” → “review” → “completed”. And each time, currently, they're being put into another table.The exception is whether it's “begun” or not, a user may abandon and then re-undertake the task infinitely - in theory at l
Solution
The way I see it, the fact that (a) a concrete Task can present distinct States at different points in time entails (b) the creation of a table that is going to contain a time series, as I will detail below.
Business rules
In order to define the structure and constraints of a relational database, it is paramount to first identify and formulate the corresponding business domain rules (i.e., delineate the conceptual schema) with precision, taking into account the relevant entity types, properties and the corresponding interconnections. In this way, some of the formulations that are especially significant in the scenario under consideration are:
StateSpecification is entered
Consequently, based on the aforementioned formulations, I created the IDEF1X† diagram shown in Figure 1:
As you can see,
The entity type denominated
Expository logical design and sample data
I shaped an expository logical design, based on the IDEF1X diagram above, by means of the DDL structure that follows:
The State table
The
+-—————————-+-—————————-+-———————————————-+
| StateCode | Name | Description |
+-—————————-+-—————————-+-———————————————-+
| O | Open | Indicates that… |
+-----------+-----------+-----------------+
| B | Begun | Indicates that… |
+-----------+-----------+-----------------+
| R | In review | Indicates that… |
+-----------+-----------+-----------------+
| C | Completed | Indicates that… |
+-----------+-----------+-----------------+
Note the definition of a PRIMARY KEY (PK) constraint on a column retaining values that are, at the same time, meaninfgul (regarding interpretation by end users and technicians) and small (in terms of bytes at the physical implementation level), which makes it both readable and fast regarding, e.g., data retrieval.
If new business domain States arise, you can of course INSERT the corresponding rows INTO the
The State Specification table
Now, suposse that Task no. 1750 presents the following State History in the form of (unordered) rows held in the
+-——————————-+-———————————————————————-+-—————————-+
| TaskNumber | SpecifiedDateTime | StateCode |
+-——————————-+-————
Business rules
In order to define the structure and constraints of a relational database, it is paramount to first identify and formulate the corresponding business domain rules (i.e., delineate the conceptual schema) with precision, taking into account the relevant entity types, properties and the corresponding interconnections. In this way, some of the formulations that are especially significant in the scenario under consideration are:
- A Task presents one-to-many StateSpecifications
- A Task cannot present more than one StateSpecification at the same Instant
- A StateSpecification is considered Current
- from the particular Instant in which it is entered
- until the exact Instant when the successive
StateSpecification is entered
- A State defines zero-one-or-many StateSpecifications
- A Task receives zero-one-or-many Comments
Consequently, based on the aforementioned formulations, I created the IDEF1X† diagram shown in Figure 1:
As you can see,
Task and State are depicted as individual entity types, each with its own group of properties (or attributes) and its own (direct) associations (or relationships), which are expressed via the corresponding verb phrases, lines, cardinalities and FOREIGN KEY (FK) marks.The entity type denominated
StateSpecification, associated with the two brought up before, is a central aspect in the solution I will explain below.Expository logical design and sample data
I shaped an expository logical design, based on the IDEF1X diagram above, by means of the DDL structure that follows:
-- You should determine which are the most fitting
-- data types and sizes for all your table columns
-- depending on your business context characteristics.
-- Also, you should make accurate tests to define the most
-- convenient physical implementation settings; e.g.,
-- a good indexing strategy based on query tendencies.
-- As one would expect, you are free to use your
-- preferred or required naming conventions.
CREATE TABLE Task (
TaskNumber INT NOT NULL,
Description CHAR(90) NOT NULL,
Etcetera CHAR(30) NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT Task_PK PRIMARY KEY (TaskNumber)
);
CREATE TABLE State (
StateCode CHAR(1) NOT NULL,
Name CHAR(30) NOT NULL,
Description CHAR(60) NOT NULL,
--
CONSTRAINT State_PK PRIMARY KEY (StateCode),
CONSTRAINT State_AK1 UNIQUE (Name), -- ALTERNATE KEY.
CONSTRAINT State_AK2 UNIQUE (Description) -- ALTERNATE KEY.
);
CREATE TABLE StateSpecification (
TaskNumber INT NOT NULL,
SpecifiedDateTime DATETIME NOT NULL,
StateCode CHAR(1) NOT NULL,
--
CONSTRAINT StateSpecification_PK PRIMARY KEY (TaskNumber, SpecifiedDateTime), -- Composite PRIMARY KEY.
CONSTRAINT StateSpecification_to_Task_FK FOREIGN KEY (TaskNumber)
REFERENCES Task (TaskNumber),
CONSTRAINT StateSpecification_to_State FOREIGN KEY (StateCode)
REFERENCES State (StateCode)
);
CREATE TABLE TaskComment (
TaskNumber INT NOT NULL,
CreatedDateTime DATETIME NOT NULL,
Body TEXT NOT NULL,
--
CONSTRAINT TaskComment_PK PRIMARY KEY (TaskNumber, CreatedDateTime),
CONSTRAINT TaskComment_to_Task FOREIGN KEY (TaskNumber)
REFERENCES Task (TaskNumber)
);
--
--The State table
The
State table previously illustrated fulfills a look-up role, and is meant to contain five rows (as per the five State samples you provided):+-—————————-+-—————————-+-———————————————-+
| StateCode | Name | Description |
+-—————————-+-—————————-+-———————————————-+
| O | Open | Indicates that… |
+-----------+-----------+-----------------+
| B | Begun | Indicates that… |
+-----------+-----------+-----------------+
| R | In review | Indicates that… |
+-----------+-----------+-----------------+
| C | Completed | Indicates that… |
+-----------+-----------+-----------------+
Note the definition of a PRIMARY KEY (PK) constraint on a column retaining values that are, at the same time, meaninfgul (regarding interpretation by end users and technicians) and small (in terms of bytes at the physical implementation level), which makes it both readable and fast regarding, e.g., data retrieval.
If new business domain States arise, you can of course INSERT the corresponding rows INTO the
State table.The State Specification table
Now, suposse that Task no. 1750 presents the following State History in the form of (unordered) rows held in the
StateSpecification table:+-——————————-+-———————————————————————-+-—————————-+
| TaskNumber | SpecifiedDateTime | StateCode |
+-——————————-+-————
Code Snippets
-- You should determine which are the most fitting
-- data types and sizes for all your table columns
-- depending on your business context characteristics.
-- Also, you should make accurate tests to define the most
-- convenient physical implementation settings; e.g.,
-- a good indexing strategy based on query tendencies.
-- As one would expect, you are free to use your
-- preferred or required naming conventions.
CREATE TABLE Task (
TaskNumber INT NOT NULL,
Description CHAR(90) NOT NULL,
Etcetera CHAR(30) NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT Task_PK PRIMARY KEY (TaskNumber)
);
CREATE TABLE State (
StateCode CHAR(1) NOT NULL,
Name CHAR(30) NOT NULL,
Description CHAR(60) NOT NULL,
--
CONSTRAINT State_PK PRIMARY KEY (StateCode),
CONSTRAINT State_AK1 UNIQUE (Name), -- ALTERNATE KEY.
CONSTRAINT State_AK2 UNIQUE (Description) -- ALTERNATE KEY.
);
CREATE TABLE StateSpecification (
TaskNumber INT NOT NULL,
SpecifiedDateTime DATETIME NOT NULL,
StateCode CHAR(1) NOT NULL,
--
CONSTRAINT StateSpecification_PK PRIMARY KEY (TaskNumber, SpecifiedDateTime), -- Composite PRIMARY KEY.
CONSTRAINT StateSpecification_to_Task_FK FOREIGN KEY (TaskNumber)
REFERENCES Task (TaskNumber),
CONSTRAINT StateSpecification_to_State FOREIGN KEY (StateCode)
REFERENCES State (StateCode)
);
CREATE TABLE TaskComment (
TaskNumber INT NOT NULL,
CreatedDateTime DATETIME NOT NULL,
Body TEXT NOT NULL,
--
CONSTRAINT TaskComment_PK PRIMARY KEY (TaskNumber, CreatedDateTime),
CONSTRAINT TaskComment_to_Task FOREIGN KEY (TaskNumber)
REFERENCES Task (TaskNumber)
);
--
--SELECT T.TaskNumber,
SS.StateCode AS CurrentStateCode,
SS.SpecifiedDateTime
FROM Task T
JOIN StateSpecification SS
ON T.TaskNumber = SS.TaskNumber
WHERE T.TaskNumber = 1750 -- You can provide a parameter instead of a fixed value.
AND SS.SpecifiedDateTime = (
SELECT MAX(SpecifiedDateTime)
FROM StateSpecification InnerSS
WHERE T.TaskNumber = InnerSS.TaskNumber
);SELECT T.TaskNumber,
SS.StateCode AS CurrentStateCode,
SS.SpecifiedDateTime
FROM Task T
JOIN StateSpecification SS
ON T.TaskNumber = SS.TaskNumber
WHERE SS.SpecifiedDateTime = (
SELECT MAX(SpecifiedDateTime)
FROM StateSpecification InnerSS
WHERE T.TaskNumber = InnerSS.TaskNumber
);SELECT T.TaskNumber,
T.Description,
SS.StateCode,
SS.SpecifiedDateTime AS StartDateTime,
(
SELECT MIN(SpecifiedDateTime)
FROM StateSpecification InnerSS
WHERE T.TaskNumber = InnerSS.TaskNumber
AND InnerSS.SpecifiedDateTime > SS.SpecifiedDateTime
) AS EndDateTime
FROM Task T
JOIN StateSpecification SS
ON T.TaskNumber = SS.TaskNumber
WHERE T.TaskNumber = 1750 -- You can provide a parameter instead of a fixed value.
ORDER BY StartDateTime DESC;SELECT T.TaskNumber,
T.Description,
SS.StateCode,
SS.SpecifiedDateTime AS StartDateTime,
(
SELECT MIN(SpecifiedDateTime)
FROM StateSpecification InnerSS
WHERE T.TaskNumber = InnerSS.TaskNumber
AND InnerSS.SpecifiedDateTime > SS.SpecifiedDateTime
) AS EndDateTime
FROM Task T
JOIN StateSpecification SS
ON T.TaskNumber = SS.TaskNumber
ORDER BY StartDateTime DESC;Context
StackExchange Database Administrators Q#158949, answer score: 11
Revisions (0)
No revisions yet.