HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

MySql - How can I speed up this query

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thiscanquerymysqlhowspeed

Problem

I have the following tables:

CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `first_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `last_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `account_data` text COLLATE utf8_unicode_ci,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `twitter_username` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `email` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `crypted_password` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `password_salt` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `persistence_token` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `single_access_token` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `perishable_token` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `login_count` int(11) NOT NULL DEFAULT '0',
  `failed_login_count` int(11) NOT NULL DEFAULT '0',
  `last_request_at` datetime DEFAULT NULL,
  `current_login_at` datetime DEFAULT NULL,
  `last_login_at` datetime DEFAULT NULL,
  `current_login_ip` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `last_login_ip` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `is_admin` tinyint(1) DEFAULT '0',
  `referrer_id` int(11) DEFAULT NULL,
  `partner` tinyint(1) DEFAULT '0',
  `subscription_type` varchar(255) COLLATE utf8_unicode_ci DEFAULT 'free',
  `workflow_state` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `persona_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `persona_index` (`persona_id`)
) ENGINE=InnoDB


and the table:

``
CREATE TABLE
user_actions (
id int(11) NOT NULL AUTO_INCREMENT,
user_id int(11) DEFAULT NULL,
action_type varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
module varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
data text COLLATE utf8_unicode_ci,
timestamp datetime DEFAULT NULL,
created_at datetime DEFAULT NULL,
updated_at datetime DEFAULT NULL,
PRIMARY KEY (
id),
KEY
user

Solution

Here is your original query:

SELECT
    user_actions.*,
    users.twitter_username,
    users.email
FROM
    `user_actions`  
    INNER JOIN users
    ON (user_actions.user_id=users.id)
    ORDER BY timestamp
    DESC LIMIT 0, 30
;


First thing I notice is that you are joining two whole tables. Since you only need twitter_username and email from the users table, you should only join from users using three columns : id, twitter_username and email.

Second thing is the LIMIT clause. It is being executed after the join. You should execute it before the join. In your case, you are requesting for the 30 most recent user actions. If you can guarantee that only 30 rows are retreived from user_actions, the join should operate much faster.

If you read the answer from @DTest, his first two bulletpoints already tell you what's wrong the query because of the actions mysql will take in gathering data from each table. The key is to understand what the temp tables will look like while the query is being processed and where the data will reside (memory or disk).

What you need to do is refactor the query to fool the MySQL Query Optimizer. Force the query to produce smaller temp tables. In most cases, config changes in my.cnf should make a dramamtic difference. In other cases, such as this one, refactoring the query may be sufficient.

Here is my proposed change to your query that should work faster:

SELECT
    ua.*,
    u.twitter_username,
    u.email
FROM
    (SELECT * FROM `user_actions`
    ORDER BY timestamp DESC LIMIT 30) ua
    LEFT JOIN
    (SELECT id,twitter_username,email FROM `users`) u
    ON (ua.user_id=u.id)
;


Here are reasons for refactoring the query:
REASON #1

If you look at inline table ua, I retrieve only 30 rows using LIMIT. This will happen no matter how big the user_actions table gets. It is already ordered because the ORDER BY timestamp DESC happens before the LIMIT.
REASON #2

If you look inline table u, it has id,twitter_username,email. The id is needed to implement the join.
REASON #3

I use LEFT JOIN instead of INNER JOIN for two(2) reasons:

  • Preserve the order of the query based on ua



  • Display all user actions in case the user_id in the ua no longer exists in the users tables.



Doing these things will force the temp tables to be smaller. Nevertheless, you will still need implement bulletpoint #3 from @DTest's answer to preempt having temp tables land on disk.

Code Snippets

SELECT
    user_actions.*,
    users.twitter_username,
    users.email
FROM
    `user_actions`  
    INNER JOIN users
    ON (user_actions.user_id=users.id)
    ORDER BY timestamp
    DESC LIMIT 0, 30
;
SELECT
    ua.*,
    u.twitter_username,
    u.email
FROM
    (SELECT * FROM `user_actions`
    ORDER BY timestamp DESC LIMIT 30) ua
    LEFT JOIN
    (SELECT id,twitter_username,email FROM `users`) u
    ON (ua.user_id=u.id)
;

Context

StackExchange Database Administrators Q#10257, answer score: 5

Revisions (0)

No revisions yet.