patternsqlMinor
ORDER BY optimization for a large table
Viewed 0 times
orderoptimizationlargefortable
Problem
I am currently aggregating posts from different social networks. Currently I have 50-100 million posts from Facebook, Twitter, Youtube, Instagram , Pinterest.
Consider a table
Using
Indexes :
Space usage
Using Amazon RDS with 7.5 GB RAM.
The queries I am executing are as follows
For users with > 500k posts sometimes my queries take a really long time.
I have run explain and you can see the results below
Can you please suggest me any optimizations?
Consider a table
postsposts
{
id int(11),
user_id int(11),
url varchar(256),
image varchar(256),
source int(11), // Social Network Source
created bigint(20), // Publish time of the post
visible int(1) // Public or Private Posts
}Using
InnoDB Indexes :
primary key on id
user_id,source
user_id,createdSpace usage
Type Usage
Data 45,876.0 MiB
Index 4,959.0 MiB
Total 50,835.0 MiBUsing Amazon RDS with 7.5 GB RAM.
The queries I am executing are as follows
select id
from posts
where user_id={user_id}
and visible=1
order by created desc
LIMIT 20 ;
select id
from posts
where user_id={user_id}
and source={your_network}
and visible=1
order by created desc
LIMIT 20 ;For users with > 500k posts sometimes my queries take a really long time.
I have run explain and you can see the results below
EXPLAIN SELECT id
FROM `posts`
WHERE user_id =123529745 and source=1 and visible=1
ORDER BY created DESC
LIMIT 20
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE posts ref user_id_2,user_id_4,user_id_3,created created 4 const 954174 Using where; Using index
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE posts ref user_id_2,user_id_4,user_id_3,created created 4 const 742308 Using where;Can you please suggest me any optimizations?
Solution
This is your query:
You are doing only equality comparisons in the
A single index on:
SELECT id
FROM `posts`
WHERE user_id =123529745 and source=1 and visible=1
ORDER BY created DESCYou are doing only equality comparisons in the
where and then sorting by created.A single index on:
posts(user_id, source, visible, created, id) should allow the entire query to be satisfied only by the index -- first finding the right rows, then sorting by them, and then fetching the id.Code Snippets
SELECT id
FROM `posts`
WHERE user_id =123529745 and source=1 and visible=1
ORDER BY created DESCContext
StackExchange Database Administrators Q#44920, answer score: 5
Revisions (0)
No revisions yet.