HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlModerate

Should I create multiple tables for different entity states, statuses or stages?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
tablescreatestatesstatusesstagesdifferentformultipleshouldentity

Problem

I have a tasks table in my database and, in the business domain of interest, tasks can have multiple states: “open”, “begun”, “in review” and “completed”.

Despite having “open” and “begun” in the same table, identified using flags, I decided for some reason early on to have “in review” and “completed” tasks in their own separate tables. While this seemed like a good idea to start off with, making it easy to query for specific things, when I thought about implementing a comment feature for tasks I realised that the task_id value would be changing across the three tables.

Have I gone about this all wrong?

Current considerations

Some solutions I thought up would be to redo the organisation and put everything in one table, using flags to differentiate purely between the states (this would lead to a fair amount of reworking); to create some kind of UUID that is transferable across the tables - though I think there might issues with performance if this is done; and finally, to set the IDs across the tables to be no-longer auto incremented and to simply “inherit” the original value from the first task table (which auto increments).

What's the done thing in this situation? I'm not the most versed in database design and I'm kind of making this up as I go along.

Any help would be greatly appreciated.

Responses to further information requests made by @MDCCL via comments

So, the state (or status) of a task can change over time, right? How many times can the same concrete task present the same state, let us say, “open”? Will it vary depending on the kind of state presented by the relevant task?

Yes, a task.state could theoretically change up to four times. I won't go into specifics of the logic but the longest chain would be “begun” → “completed” → “review” → “completed”. And each time, currently, they're being put into another table.

The exception is whether it's “begun” or not, a user may abandon and then re-undertake the task infinitely - in theory at l

Solution

The way I see it, the fact that (a) a concrete Task can present distinct States at different points in time entails (b) the creation of a table that is going to contain a time series, as I will detail below.
Business rules

In order to define the structure and constraints of a relational database, it is paramount to first identify and formulate the corresponding business domain rules (i.e., delineate the conceptual schema) with precision, taking into account the relevant entity types, properties and the corresponding interconnections. In this way, some of the formulations that are especially significant in the scenario under consideration are:

  • A Task presents one-to-many StateSpecifications



  • A Task cannot present more than one StateSpecification at the same Instant



  • A StateSpecification is considered Current



  • from the particular Instant in which it is entered



  • until the exact Instant when the successive


StateSpecification is entered

  • A State defines zero-one-or-many StateSpecifications



  • A Task receives zero-one-or-many Comments



Consequently, based on the aforementioned formulations, I created the IDEF1X† diagram shown in Figure 1:

As you can see, Task and State are depicted as individual entity types, each with its own group of properties (or attributes) and its own (direct) associations (or relationships), which are expressed via the corresponding verb phrases, lines, cardinalities and FOREIGN KEY (FK) marks.

The entity type denominated StateSpecification, associated with the two brought up before, is a central aspect in the solution I will explain below.
Expository logical design and sample data

I shaped an expository logical design, based on the IDEF1X diagram above, by means of the DDL structure that follows:

-- You should determine which are the most fitting 
-- data types and sizes for all your table columns 
-- depending on your business context characteristics.

-- Also, you should make accurate tests to define the most
-- convenient physical implementation settings; e.g.,
-- a good indexing strategy based on query tendencies.

-- As one would expect, you are free to use your 
-- preferred or required naming conventions. 

CREATE TABLE Task (
    TaskNumber      INT      NOT NULL,
    Description     CHAR(90) NOT NULL,
    Etcetera        CHAR(30) NOT NULL,
    CreatedDateTime DATETIME NOT NULL,
    --
    CONSTRAINT Task_PK  PRIMARY KEY (TaskNumber)
);    

CREATE TABLE State (
    StateCode   CHAR(1)  NOT NULL,
    Name        CHAR(30) NOT NULL,
    Description CHAR(60) NOT NULL,
    --
    CONSTRAINT State_PK  PRIMARY KEY (StateCode),
    CONSTRAINT State_AK1 UNIQUE      (Name),       -- ALTERNATE KEY.
    CONSTRAINT State_AK2 UNIQUE      (Description) -- ALTERNATE KEY.
); 

CREATE TABLE StateSpecification (
    TaskNumber        INT      NOT NULL,
    SpecifiedDateTime DATETIME NOT NULL,
    StateCode         CHAR(1)  NOT NULL,
    --
    CONSTRAINT StateSpecification_PK         PRIMARY KEY (TaskNumber, SpecifiedDateTime), -- Composite PRIMARY KEY.
    CONSTRAINT StateSpecification_to_Task_FK FOREIGN KEY (TaskNumber)
        REFERENCES Task  (TaskNumber),
    CONSTRAINT StateSpecification_to_State   FOREIGN KEY (StateCode)
        REFERENCES State (StateCode)    
);

CREATE TABLE TaskComment (
    TaskNumber      INT      NOT NULL,
    CreatedDateTime DATETIME NOT NULL,
    Body            TEXT     NOT NULL,
    --
    CONSTRAINT TaskComment_PK      PRIMARY KEY (TaskNumber, CreatedDateTime),
    CONSTRAINT TaskComment_to_Task FOREIGN KEY (TaskNumber)
        REFERENCES Task (TaskNumber)  
);

--
--


The State table

The State table previously illustrated fulfills a look-up role, and is meant to contain five rows (as per the five State samples you provided):

+-—————————-+-—————————-+-———————————————-+
| StateCode | Name | Description |
+-—————————-+-—————————-+-———————————————-+
| O | Open | Indicates that… |
+-----------+-----------+-----------------+
| B | Begun | Indicates that… |
+-----------+-----------+-----------------+
| R | In review | Indicates that… |
+-----------+-----------+-----------------+
| C | Completed | Indicates that… |
+-----------+-----------+-----------------+

Note the definition of a PRIMARY KEY (PK) constraint on a column retaining values that are, at the same time, meaninfgul (regarding interpretation by end users and technicians) and small (in terms of bytes at the physical implementation level), which makes it both readable and fast regarding, e.g., data retrieval.

If new business domain States arise, you can of course INSERT the corresponding rows INTO the State table.

The State Specification table

Now, suposse that Task no. 1750 presents the following State History in the form of (unordered) rows held in the StateSpecification table:

+-——————————-+-———————————————————————-+-—————————-+
| TaskNumber | SpecifiedDateTime | StateCode |
+-——————————-+-————

Code Snippets

-- You should determine which are the most fitting 
-- data types and sizes for all your table columns 
-- depending on your business context characteristics.

-- Also, you should make accurate tests to define the most
-- convenient physical implementation settings; e.g.,
-- a good indexing strategy based on query tendencies.

-- As one would expect, you are free to use your 
-- preferred or required naming conventions. 

CREATE TABLE Task (
    TaskNumber      INT      NOT NULL,
    Description     CHAR(90) NOT NULL,
    Etcetera        CHAR(30) NOT NULL,
    CreatedDateTime DATETIME NOT NULL,
    --
    CONSTRAINT Task_PK  PRIMARY KEY (TaskNumber)
);    

CREATE TABLE State (
    StateCode   CHAR(1)  NOT NULL,
    Name        CHAR(30) NOT NULL,
    Description CHAR(60) NOT NULL,
    --
    CONSTRAINT State_PK  PRIMARY KEY (StateCode),
    CONSTRAINT State_AK1 UNIQUE      (Name),       -- ALTERNATE KEY.
    CONSTRAINT State_AK2 UNIQUE      (Description) -- ALTERNATE KEY.
); 

CREATE TABLE StateSpecification (
    TaskNumber        INT      NOT NULL,
    SpecifiedDateTime DATETIME NOT NULL,
    StateCode         CHAR(1)  NOT NULL,
    --
    CONSTRAINT StateSpecification_PK         PRIMARY KEY (TaskNumber, SpecifiedDateTime), -- Composite PRIMARY KEY.
    CONSTRAINT StateSpecification_to_Task_FK FOREIGN KEY (TaskNumber)
        REFERENCES Task  (TaskNumber),
    CONSTRAINT StateSpecification_to_State   FOREIGN KEY (StateCode)
        REFERENCES State (StateCode)    
);

CREATE TABLE TaskComment (
    TaskNumber      INT      NOT NULL,
    CreatedDateTime DATETIME NOT NULL,
    Body            TEXT     NOT NULL,
    --
    CONSTRAINT TaskComment_PK      PRIMARY KEY (TaskNumber, CreatedDateTime),
    CONSTRAINT TaskComment_to_Task FOREIGN KEY (TaskNumber)
        REFERENCES Task (TaskNumber)  
);

--
--
SELECT T.TaskNumber,
       SS.StateCode AS CurrentStateCode,
       SS.SpecifiedDateTime
     FROM Task T
     JOIN StateSpecification SS
       ON T.TaskNumber = SS.TaskNumber 
    WHERE T.TaskNumber = 1750 -- You can provide a parameter instead of a fixed value.
      AND SS.SpecifiedDateTime = (
                                    SELECT MAX(SpecifiedDateTime)
                                          FROM StateSpecification InnerSS
                                         WHERE T.TaskNumber = InnerSS.TaskNumber
                                 );
SELECT T.TaskNumber,
       SS.StateCode AS CurrentStateCode,
       SS.SpecifiedDateTime
     FROM Task T
     JOIN StateSpecification SS
       ON T.TaskNumber = SS.TaskNumber
    WHERE SS.SpecifiedDateTime = (
                                    SELECT MAX(SpecifiedDateTime)
                                          FROM StateSpecification InnerSS
                                         WHERE T.TaskNumber = InnerSS.TaskNumber
                                 );
SELECT T.TaskNumber,
       T.Description,
       SS.StateCode,
       SS.SpecifiedDateTime AS StartDateTime,
       (
          SELECT MIN(SpecifiedDateTime)
                FROM StateSpecification InnerSS
               WHERE T.TaskNumber = InnerSS.TaskNumber
                 AND InnerSS.SpecifiedDateTime > SS.SpecifiedDateTime
       ) AS EndDateTime
    FROM Task T
    JOIN StateSpecification SS
      ON T.TaskNumber = SS.TaskNumber
   WHERE T.TaskNumber = 1750 -- You can provide a parameter instead of a fixed value.
ORDER BY StartDateTime DESC;
SELECT T.TaskNumber,
       T.Description,
       SS.StateCode,
       SS.SpecifiedDateTime AS StartDateTime,
       (
          SELECT MIN(SpecifiedDateTime)
                FROM StateSpecification InnerSS
               WHERE T.TaskNumber = InnerSS.TaskNumber
                 AND InnerSS.SpecifiedDateTime > SS.SpecifiedDateTime
       ) AS EndDateTime
    FROM Task T
    JOIN StateSpecification SS
      ON T.TaskNumber = SS.TaskNumber
ORDER BY StartDateTime DESC;

Context

StackExchange Database Administrators Q#158949, answer score: 11

Revisions (0)

No revisions yet.