HiveBrain v1.2.0
Get Started
← Back to all entries
patternMajor

Designing a friendships database structure: Should I use a multivalued column?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
friendshipsmultivaluedcolumndesigningdatabasestructureshoulduse

Problem

Say I have a table called User_FriendList, which has the following characteristics:

CREATE TABLE User_FriendList (
    ID ...,
    User_ID...,
    FriendList_IDs...,
    CONSTRAINT User_Friendlist_PK PRIMARY KEY (ID)
);


And let us suppose that said table holds the following data:

+----+---------+---------------------------+
| ID | User_ID | Friendlist_IDs |
+----+---------+---------------------------+
| 1 | 102 | 2:15:66:35:26:17: |
+----+---------+---------------------------+
| 2 | 114 | 1:12:63:33:24:16:102 |
+----+---------+---------------------------+
| 3 | 117 | 6:24:52:61:23:90:97:118 |
+----+---------+---------------------------+

Note: The “:” (colon) is the delimiter when exploding in PHP into an array.

Questions

So:

-
Is this a convenient way to “store” the IDs of a FriendList?

-
Or, instead, should I have individual rows with only one single FriendId value in each of them and, when I need to retrieve all the rows of a given list, simply perform a query like SELECT * FROM UserFriendList WHERE UserId = 1?

Solution

Managing an individual piece of information

Assuming that, in your business domain,

  • a User can have zero-one-or-many Friends;



  • a Friend must first be registered as a User; and



  • you will search for, and/or add, and/or remove, and/or modify, single values of a Friend List;



then each specific datum gathered in the Friendlist_IDs multivalued column represents a separate piece of information that carries a very exact meaning. Therefore, said column

  • entails a proper group of explicit constraints, and



  • its values have the potential of being manipulated individually by means of several relational operations (or combinations thereof).



Short answer

Consequently, you should retain each of the Friendlist_IDs values in (a) a column that accepts exclusively one sole value per row in (b) a table that represents the conceptual-level association type that can take place between Users, i.e., a Friendship —as I will exemplify in the following sections—.

In this way, you will be able to handle (i) said table as a mathematical relation and (ii) said column as a mathematical relation attribute —as much as MySQL and its SQL dialect permit, of course—.

Why?

Because the relational model of data, created by Dr. E. F. Codd, demands having tables that are composed of columns that hold exactly one value of the applicable domain or type per row; hence, declaring a table with a column that can contain more than one value of the domain or type in question (1) does not represent a mathematical relation and (2) would not permit obtaining the advantages proposed in the aforementioned theoretical framework.
Modeling Friendships between Users: Defining the business environment rules first

I highly recommend starting to shape a database delimiting —before anything else— the corresponding conceptual schema by virtue of the definition of the relevant business rules that, among other factors, must describe the types of interrelationships that exist between the distinct aspects of interest, i.e., the applicable entity types and their properties; e.g.:

  • A User is primarily identified by his or her UserId



  • A User is alternately identified by the combination of his or her FirstName, LastName, Gender, and Birthdate



  • A User is alternately identified by his or her Username



  • A User is the Requester of zero-one-or-many Friendships



  • A User is the Addressee of zero-one-or-many Friendships



  • A Friendship is primarily identified by the combination of its RequesterId and its AddresseeId



Expository IDEF1X diagram

In this manner, I was able to derive the IDEF1X1 diagram shown in Figure 1, which integrates most of the rules previously formulated:

As depicted, Requester and Addressee are denotations that express the Roles carried out by the specific Users that take part in a given Friendship.

That being so, the Friendship entity type portrays an association type of many-to-many (M:N) cardinality ratio that can involve different ocurrences of the same entity type, i.e., User. As such, it is an example of the classic construct known as “Bill of Materials” or “Parts Explosion”.

1 Integration Definition for Information Modeling (IDEF1X) is a highly recommendable technique that was established as a standard in December 1993 by the U.S. National Institute of Standards and Technology (NIST). It is solidly based on (a) the early theoretical material authored by the sole originator of the relational model, i.e., Dr. E. F. Codd; on (b) the entity-relationship view of data, developed by Dr. P. P. Chen; and also on (c) the Logical Database Design Technique, created by Robert G. Brown.

Illustrative SQL-DDL logical design

Then, from the IDEF1X diagram presented above, declaring a DDL arrangement like the one that follows is much more “natural”:

```
-- You should determine which are the most fitting
-- data types and sizes for all the table columns
-- depending on your business context characteristics.

-- At the physical level, you should make accurate tests
-- to define the mostconvenient INDEX strategies based on
-- the pertinent query tendencies.

-- As one would expect, you are free to make use of
-- your preferred (or required) naming conventions.

CREATE TABLE UserProfile ( -- Represents an independent entity type.
UserId INT NOT NULL,
FirstName CHAR(30) NOT NULL,
LastName CHAR(30) NOT NULL,
BirthDate DATE NOT NULL,
GenderCode CHAR(3) NOT NULL,
Username CHAR(20) NOT NULL,
CreatedDateTime DATETIME NOT NULL,
--
CONSTRAINT UserProfile_PK PRIMARY KEY (UserId),
CONSTRAINT UserProfile_AK1 UNIQUE ( -- Composite ALTERNATE KEY.
FirstName,
LastName,
GenderCode,
BirthDate
),
CONSTRAINT UserProfile_AK2 UNIQUE (Username) -- Single-column ALTERNATE KEY.
);

CREATE TABLE Friendship ( -- Stands for an associative entity type.
RequesterId INT NOT NULL,
AddresseeId INT NOT NULL, -- Fixed

Code Snippets

-- You should determine which are the most fitting 
-- data types and sizes for all the table columns 
-- depending on your business context characteristics.

-- At the physical level, you should make accurate tests 
-- to define the mostconvenient INDEX strategies based on 
-- the pertinent query tendencies.

-- As one would expect, you are free to make use of 
-- your preferred (or required) naming conventions. 

CREATE TABLE UserProfile ( -- Represents an independent entity type.
    UserId          INT      NOT NULL,
    FirstName       CHAR(30) NOT NULL,
    LastName        CHAR(30) NOT NULL,
    BirthDate       DATE     NOT NULL,
    GenderCode      CHAR(3)  NOT NULL,
    Username        CHAR(20) NOT NULL,
    CreatedDateTime DATETIME NOT NULL,
    --
    CONSTRAINT UserProfile_PK  PRIMARY KEY (UserId),
    CONSTRAINT UserProfile_AK1 UNIQUE ( -- Composite ALTERNATE KEY.
        FirstName,
        LastName,
        GenderCode,
        BirthDate
    ),
    CONSTRAINT UserProfile_AK2 UNIQUE (Username) -- Single-column ALTERNATE KEY.
);

CREATE TABLE Friendship ( -- Stands for an associative entity type.
    RequesterId     INT      NOT NULL,
    AddresseeId     INT      NOT NULL, -- Fixed with a well-delimited data type.
    CreatedDateTime DATETIME NOT NULL,
    --
    CONSTRAINT Friendship_PK            PRIMARY KEY (RequesterId, AddresseeId), -- Composite PRIMARY KEY.
    CONSTRAINT FriendshipToRequester_FK FOREIGN KEY (RequesterId)
        REFERENCES UserProfile (UserId),
    CONSTRAINT FriendshipToAddressee_FK FOREIGN KEY (AddresseeId)
        REFERENCES UserProfile (UserId),
    CONSTRAINT FriendsAreDistinct_CK    CHECK       (RequesterId <> AddresseeId)
);
--
CREATE TABLE MyStatus ( -- Denotes an independent entity type.
    StatusCode CHAR(1)  NOT NULL,
    Name       CHAR(30) NOT NULL,
    --
    CONSTRAINT MyStatus_PK PRIMARY KEY (StatusCode),
    CONSTRAINT MyStatus_AK UNIQUE      (Name) -- ALTERNATE KEY.
); 

CREATE TABLE FriendshipStatus ( -- Represents an associative entity type.
    RequesterId       INT      NOT NULL,
    AddresseeId       INT      NOT NULL,
    SpecifiedDateTime DATETIME NOT NULL,
    StatusCode        CHAR(1)  NOT NULL,
    SpecifierId       INT      NOT NULL,
    --
    CONSTRAINT FriendshipStatus_PK             PRIMARY KEY (RequesterId, AddresseeId, SpecifiedDateTime), -- Composite PRIMARY KEY.
    CONSTRAINT FriendshipStatusToFriendship_FK FOREIGN KEY (RequesterId, AddresseeId)
        REFERENCES Friendship  (RequesterId, AddresseeId), -- Composite FOREIGN KEY.
    CONSTRAINT FriendshipStatusToMyStatus_FK   FOREIGN KEY (StatusCode)
        REFERENCES MyStatus    (StatusCode),
    CONSTRAINT FriendshipStatusToSpecifier_FK  FOREIGN KEY (SpecifierId)
        REFERENCES UserProfile (UserId),
    CONSTRAINT FriendsAreDifferent_CK          CHECK       (RequesterId <> AddresseeId)      
);
SELECT FS.StatusCode AS CurrentStatusCode
  FROM FriendshipStatus FS
 WHERE FS.RequesterId = 1750 --(a)
   AND FS.AddresseeId = 1748 --(b)
   AND FS.SpecifiedDateTime = (
                                  SELECT MAX(NestedFS.SpecifiedDateTime)
                                        FROM FriendshipStatus NestedFS
                                       WHERE NestedFS.RequesterId = FS.RequesterId
                                         AND NestedFS.AddresseeId = FS.AddresseeId
                              );
-- (a), (b) Those “fixed” values can of course be replaced by parameters.
SELECT FS.RequesterId,
       FS.AddresseeId,
       FS.SpecifiedDateTime,       
       FS.StatusCode AS CurrentStatusCode,
       FS.SpecifierId
  FROM FriendshipStatus FS
 WHERE FS.RequesterId = 1750
   AND FS.AddresseeId = 1748
   AND FS.SpecifiedDateTime = (
                                  SELECT MAX(NestedFS.SpecifiedDateTime)
                                        FROM FriendshipStatus NestedFS
                                       WHERE NestedFS.RequesterId = FS.RequesterId
                                         AND NestedFS.AddresseeId = FS.AddresseeId
                              );
CONSTRAINT FriendsAreDistinct_CK CHECK (RequesterId <> AddresseeId);

Context

StackExchange Database Administrators Q#135941, answer score: 32

Revisions (0)

No revisions yet.