HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Schema design for user profile management

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
userdesignprofileformanagementschema

Problem

Is this a good design? I would like to create a view on user table to get a list of users based on filtered conditions. Please suggest ways of improving this schema design.

A view example that I would like to add on this schema:

  • Select all users who belong to group and all groups created after that group.



  • Select all users with last message, status, location and media URLs included.



```
CREATE TABLE IF NOT EXISTS users
(
userid TEXT NOT NULL,
name TEXT NULL,
lmessage INTEGER NULL,
statusid INTEGER NULL, / statusid should refer to last status of the user in status table/
locationid INTEGER NULL, / locationid should refer to last status of the user in locations table /
registered INTEGER NOT NULL,
tinypic INTEGER NULL / this refers to media id in media table /,
largepic INTEGER NULL / this also refers to media id in media table /,
groupid INTEGER NULL / this refers to id in groups table / ,
PRIMARY KEY (userid)
);

CREATE TABLE IF NOT EXISTS locations
(
serial INTEGER,
locationid TEXT NOT NULL,
userid TEXT NOT NULL,
time INTEGER NULL,
PRIMARY KEY (serial)
);

CREATE TABLE IF NOT EXISTS status
(
serial INTEGER,
userid TEXT NULL,
message TEXT NOT NULL,
time INTEGER NULL,
PRIMARY KEY (serial)
);

CREATE TABLE IF NOT EXISTS messages
(
sno INTEGER,
messageid INTEGER NOT NULL,
sender TEXT NOT NULL,
receiver TEXT NOT NULL,
time INTEGER NULL,
message TEXT NULL,
image INTEGER NULL,
video INTEGER NULL,
audio INTEGER NULL,
PRIMARY KEY (sno)
);

CREATE TABLE IF NOT EXISTS media
(
mediaid TEXT NOT NULL UNIQUE,
url TEXT NULL,
downloaded INTEGER NULL,
thumbnail TEXT NULL,
PRIMARY KEY (mediaid)
);

CREATE TABLE IF NOT EXISTS groups
(
serial INTEGER,
na

Solution

Is this a good design?

No, not really.
Some of the tables are not well designed.
Take a look at the responsibilities of the users table and the media table.

-
The media table has one clear responsibility:
store the attributes of media items.
Every record is self-contained, complete.
You will probably never update the records in this table:
you will only insert or delete. That's nice and simple

-
The users table has two responsibilities:

  • Store the attributes of users



  • Store the relationship of users and other objects



Every record is only half-complete (user attributes),
to get the full picture you need to follow the references to other tables.
You will probably never update the records very frequently,
for setting the last status and location ids

It would be better to make the users table in charge of user attributes only.
You could add additional tables to track the relations with other objects.

The circular dependencies are especially ugly:

  • users depends on status for the last status, and status depends on users to associate all statuses with a user



  • Same for locations



I recommend this kind of structure:

  • user - in charge of user attributes (name, registered date)



  • status - in charge of status attributes (message, date)



  • user_status - in charge of linking user and status



  • user_last_status - in charge of linking user and last status, with unique key on user



And similarly for locations.
It's more work, but the end result will be more robust and flexible.

A reasonable compromise can be to not create user_status,
but include user_id in the status table (with a similar treatment for locations too).


Select all users who belong to group and all groups created after that group.

To find users by a given group id fast,
you need an index on the group id column in the users table.
Below I explain how to create foreign keys.
When you make user.group_id a foreign key referencing group.group_id,
creating such index is one of the steps,
and will ensure fast lookups.

In addition,
you might also find a group fast by name.
In that case you might want to create an index on group.name column.
However, if you won't have many groups than that would be overkill,
and not recommended by the documentation.


Select all users with last message, status, location and media URLs included.

Generally speaking, For good performance of queries like this,
you want to make sure that all fields used in the WHERE and JOIN clauses are indexed.
If you follow my suggestions,
this will be automatically the case for most of these elements you mention,
and for the rest you can probably figure out yourself.

Another issue with the design is that some of the fields are not clear enough, for example:

  • users



  • userid: it would be good to know if this will be a technical id or a something like a UNIX username. If the latter, I'd rename to username



  • name: NULL is allowed, but why?



  • lmessage: an integer field, but not clear which other table it will reference



  • registered: I suppose the date when the user was created. I recommend using the type datetime, even if internally it might be just an integer



  • locations



  • serial: technical id I suppose?



  • locationid: text? That's a bit surprising. How is it different from serial?



  • time: perhaps there can be a better name for this. And again, I recommend using the type datetime



  • status: similar objections as with locations



  • messages



  • sno: technical id I suppose? I recommend to standardize the name of technical ids in all tables



  • sender, receiver: I'm wondering if these have any relation to other tables



  • media: this is good and clear. The UNIQUE keyword on mediaid is pointless when you add a PRIMARY KEY constraint on the same column



The answers to the questions I raised would be good in comments.
Some example values for the columns would be useful too,
especially for the ones where the meaning and motivation is not obvious.

Declare PRIMARY KEY on the field's line

Instead of declaring PRIMARY KEY at the end,
it's more compact to declare it on the field itself, like this:

CREATE TABLE IF NOT EXISTS locations 
  ( 
     serial INTEGER NOT NULL PRIMARY KEY,
     -- ... 
  );


I also added NOT NULL for the field, though it's not strictly necessary.
When a field is primary key in sqlite,
it's treated as auto-incrementing,
so even if you try to explicitly insert NULL into it,
it will actually use max(col) + 1 instead.
But I like to write it this way anyway to make it explicit:
looking at this schema there will be no doubt that the field will never contain NULL.

Pointless unique indexes

Many of the indexes you're adding seem completely pointless:

```
CREATE UNIQUE INDEX IF NOT EXISTS id_unique ON users (userid ASC);

CREATE UNIQUE INDEX IF NOT EXISTS serial_unique ON status (serial ASC);

CREATE UNIQUE INDEX IF NOT EXISTS id_unique ON messages (sn

Code Snippets

CREATE TABLE IF NOT EXISTS locations 
  ( 
     serial INTEGER NOT NULL PRIMARY KEY,
     -- ... 
  );
CREATE UNIQUE INDEX IF NOT EXISTS id_unique ON users (userid ASC); 

CREATE UNIQUE INDEX IF NOT EXISTS serial_unique ON status (serial ASC); 

CREATE UNIQUE INDEX IF NOT EXISTS id_unique ON messages (sno ASC); 

CREATE UNIQUE INDEX IF NOT EXISTS mediaid_unique ON media (mediaid ASC);
CREATE TABLE t1(a, b UNIQUE);

CREATE TABLE t1(a, b PRIMARY KEY);

CREATE TABLE t1(a, b);
CREATE UNIQUE INDEX t1b ON t1(b);
CREATE UNIQUE INDEX IF NOT EXISTS serial_unique ON patterns (serial DESC);
CREATE TABLE IF NOT EXISTS status 
  (
     serial INTEGER NOT NULL PRIMARY KEY,
     userid TEXT NULL UNIQUE,
     -- ...
  );

Context

StackExchange Code Review Q#68700, answer score: 9

Revisions (0)

No revisions yet.