patternsqlMajor
Likes or votes for posts
Viewed 0 times
votespostsforlikes
Problem
I am making a small program where users makes posts or write blogs. On those posts, other users can like or dislike the post as in facebook or upvote or downvote the post as in stackoverflow. I would like to know a good database structure which is commonly used & the program works efficiently with that structure. I have two options
First
Post:
In the above way,
Second
Post:
Likes:
Dislikes:
In this way, I have to create two separate tables for likes & dislikes to get post's likes. In this way, the tables i.e.
So, I would like to know which is the better & standard way to achieve this task?
First
Post:
id head message datepost likes dislikes
1 ab anchdg DATE 1,2,3 7,55,44,3In the above way,
id is the postid. In the likes column, 1,2,3 is the user's id who liked or upvoted the post or blog. 7,55,44,3 is the id of the users who disliked or downvoted the post or blog.Second
Post:
id head message datepost
1 ab anchdg DATELikes:
id postid userid
1 1 1
2 2 2Dislikes:
id postid userid
1 1 7
2 1 55In this way, I have to create two separate tables for likes & dislikes to get post's likes. In this way, the tables i.e.
Likes & Dislikes will get heavily filled. This might make table heavy & processing slow.So, I would like to know which is the better & standard way to achieve this task?
Solution
The problem you face is known as "Normal forms" of databases, especially the first normal form. https://en.wikipedia.org/wiki/First_normal_form.
Your databse with the concatenated user IDs (first version) is not in first normal form.
See https://en.wikipedia.org/wiki/Database_normalization for why and how normalisation is generally considered good.
In your first example, the query for "user 4 does not like the post anymore" becomes complicated. It will have to do string operations, which will have to consider side effects and corner cases (user is the only "liking" user, user is the last liking user, user is in the middle of the liking user string). I would find this bad. Don't do it. Use a normalized design.
re: database gets heavy
If you have a post that has 4 million likes, in database design 1 you would have one row with a "likes" column that is at least 4 million characters wide (because you'll need the comma as seperator chars). You will then have to perform string operations on four million digit wide strings. This is very unperformant and slow.
On the other hand, databases are designed to handle millions of rows. We have databases with several hundred million rows, and count()-operations are fast. Extremely fast. So no, this will not be a performance bottleneck.
The next issue would be readability and maintainability.
For example, tell me what these 2 statements do:
Your databse with the concatenated user IDs (first version) is not in first normal form.
See https://en.wikipedia.org/wiki/Database_normalization for why and how normalisation is generally considered good.
In your first example, the query for "user 4 does not like the post anymore" becomes complicated. It will have to do string operations, which will have to consider side effects and corner cases (user is the only "liking" user, user is the last liking user, user is in the middle of the liking user string). I would find this bad. Don't do it. Use a normalized design.
re: database gets heavy
If you have a post that has 4 million likes, in database design 1 you would have one row with a "likes" column that is at least 4 million characters wide (because you'll need the comma as seperator chars). You will then have to perform string operations on four million digit wide strings. This is very unperformant and slow.
On the other hand, databases are designed to handle millions of rows. We have databases with several hundred million rows, and count()-operations are fast. Extremely fast. So no, this will not be a performance bottleneck.
The next issue would be readability and maintainability.
For example, tell me what these 2 statements do:
select count(*)
from posts
inner join likes on posts.postid = likes.postid
where postid = 7
select len(likes) - len(replace(likes, ',', ''))
from posts
where postid = 7Code Snippets
select count(*)
from posts
inner join likes on posts.postid = likes.postid
where postid = 7
select len(likes) - len(replace(likes, ',', ''))
from posts
where postid = 7Context
StackExchange Database Administrators Q#129785, answer score: 22
Revisions (0)
No revisions yet.