Can someone explain why left joining two views in mysql is so slow?
Problem
Here is a question I asked yesterday - https://stackoverflow.com/questions/22180727/left-joining-two-views-is-slow.
I got a good answer that helped me, but I don't understand why the LEFT JOIN is so much slower than the lookup. The LEFT JOIN took 16 seconds, and I am pretty sure my tables are at least 90% optimized, while the lookup takes just 0.14 seconds. When I LEFT JOIN tables it is not this slow, so why views?
Solution
According to the MySQL Documentation on Views:
Views (including updatable views) are available in MySQL Server 5.6. Views are stored queries that when invoked produce a result set. A view acts as a virtual table.
The first thing that must be realized about a view is that it produces a result set. The result set emerging from the query invoked by the view is a virtual table because it is created on demand. There is no DDL you can summon afterwards to immediately index the result set. For all intents and purposes, the result set is a table without any indexes. In effect, the LEFT JOIN you were executing is basically a Cartesian product with some filtering.
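A minimal sketch of how one might check whether a given view is materialized into an indexless result set. The table `logins` and the view `v_logins` are hypothetical names, not taken from the original question. MySQL can merge some simple views into the outer query, but a view that uses DISTINCT, GROUP BY, aggregates, or UNION has to be materialized as a temporary table. The exact EXPLAIN output varies by MySQL version.

```sql
-- Hypothetical base table, for illustration only.
CREATE TABLE logins (
    TRID       INT NOT NULL,
    logincount INT NOT NULL,
    PRIMARY KEY (TRID)
);

-- DISTINCT prevents the view from being merged into the outer query,
-- so MySQL builds its result set as a temporary table on demand.
CREATE VIEW v_logins AS
SELECT DISTINCT TRID, logincount
FROM logins;

-- In the EXPLAIN output, a row with select_type = DERIVED indicates
-- that the view was materialized rather than merged.
EXPLAIN
SELECT *
FROM v_logins
WHERE TRID = 42;
```

Running the same EXPLAIN against a view defined without DISTINCT is a quick way to compare the two cases.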
To give you a more granular look at the JOIN of two views, I will refer to a post I made last year explaining the internal mechanisms MySQL uses to evaluate JOINs and WHEREs (Is there an execution difference between a JOIN condition and a WHERE condition?). I will show you the mechanism as published in Understanding MySQL Internals (Page 172):
- Determine which keys can be used to retrieve the records from tables, and choose the best one for each table.
- For each table, decide whether a table scan is better than reading on a key. If there are a lot of records that match the key value, the advantages of the key are reduced and the table scan becomes faster.
- Determine the order in which tables should be joined when more than one table is present in the query.
- Rewrite the WHERE clauses to eliminate dead code, reducing the unnecessary computations and changing the constraints wherever possible to open the way for using keys.
- Eliminate unused tables from the join.
- Determine whether keys can be used for ORDER BY and GROUP BY.
- Attempt to simplify subqueries, as well as determine to what extent their results can be cached.
- Merge views (expand the view reference as a macro)
OK, it seems like indexes should be used. However, look closer. If you substitute the word View for Table, look what happens to the mechanism's execution:
MECHANISM MODIFIED
- Determine which keys can be used to retrieve the records from views, and choose the best one for each view.
- For each view, decide whether a view scan is better than reading on a key. If there are a lot of records that match the key value, the advantages of the key are reduced and the view scan becomes faster.
- Determine the order in which views should be joined when more than one view is present in the query.
- Rewrite the WHERE clauses to eliminate dead code, reducing the unnecessary computations and changing the constraints wherever possible to open the way for using keys.
- Eliminate unused views from the join.
- Determine whether keys can be used for ORDER BY and GROUP BY.
- Attempt to simplify subqueries, as well as determine to what extent their results can be cached.
- Merge views (expand the view reference as a macro)
Every table (view) in this modified mechanism has no index. Thus, virtual tables, temp tables, and tables with no indexes become indistinguishable when doing a JOIN. The keys used are just for JOIN operations, not so much for looking things up faster.
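One way to see whether the join really runs without usable indexes is to ask the optimizer for its plan. The sketch below uses the `viewA` and `viewB` names from the question and only the join column, so it assumes nothing else about the view definitions. The comments describe what to look for rather than a guaranteed output, since the plan depends on the MySQL version and on how the views are defined.

```sql
-- Ask MySQL how it intends to execute the anti-join on the two views.
EXPLAIN
SELECT viewA.TRID
FROM viewA
LEFT JOIN viewB ON viewA.TRID = viewB.TRID
WHERE viewB.TRID IS NULL;

-- Things to look for in the output:
--   type = ALL   : the row source is read with a full scan
--   key  = NULL  : no index was chosen for that row source
--   select_type = DERIVED : a view was materialized as a temporary table
```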
Think of your query as picking up two phone books, the 2014 Yellow Pages and the 2013 Yellow Pages. Each Yellow Pages book contains the White Pages for Residential Phone Numbers.
- In late 2012, a database table was used to generate the 2013 Yellow Pages.
- During 2013:
  - People changed phone numbers
  - People received new phone numbers
  - People dropped phone numbers, switching to cell phones
- In late 2013, a database table was used to generate the 2014 Yellow Pages.
Obviously, there are differences between the two phone books. Doing a JOIN of database tables to figure out the differences between 2013 and 2014 should pose no problem.
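As a rough illustration of the indexed case, the phone book comparison might look like the sketch below. The tables `yp2013` and `yp2014` and their columns are hypothetical, invented only to mirror the analogy. Because both tables carry a primary key on `phone`, the join can look up each number through an index instead of scanning the other book.

```sql
-- Hypothetical tables standing in for the two phone books.
CREATE TABLE yp2013 (
    phone VARCHAR(20)  NOT NULL,
    name  VARCHAR(100) NOT NULL,
    PRIMARY KEY (phone)
);

-- LIKE copies the column definitions and the indexes of yp2013.
CREATE TABLE yp2014 LIKE yp2013;

-- Numbers that were listed in 2013 but no longer appear in 2014.
SELECT a.phone, a.name
FROM yp2013 AS a
LEFT JOIN yp2014 AS b ON b.phone = a.phone
WHERE b.phone IS NULL;
```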
Imagine merging the two phone books by hand to locate differences. Sounds insane, doesn't it? Notwithstanding, that is exactly what you are asking mysqld to do when you join two views. Remember, you are not joining real tables and there are no indexes to piggyback from.
Now, let's look back at the actual query.
SELECT DISTINCT
viewA.TRID,
viewA.hits,
viewA.department,
viewA.admin,
viewA.publisher,
viewA.employee,
viewA.logincount,
viewA.registrationdate,
viewA.firstlogin,
viewA.lastlogin,
viewA.`month`,
viewA.`year`,
viewA.businesscategory,
viewA.mail,
viewA.givenname,
viewA.sn,
viewA.departmentnumber,
viewA.sa_title,
viewA.title,
viewA.supemail,
viewA.regionname
FROM
viewA
LEFT JOIN viewB ON viewA.TRID = viewB.TRID
WHERE viewB.TRID IS NULL

You are using a virtual table (a table with no indexes), viewA, and joining it to another virtual table, viewB. The intermediate temp table being generated would be as large as viewA. Then, you are running an internal sort on that large temp table to make it distinct.
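One way to test the indexing argument is to copy each view's result set into a temporary table, index the join column, and run the same anti-join there. This is a sketch, not something the original answer prescribes, and it is not the lookup query the asker mentions. The names `tmp_viewA` and `tmp_viewB` are hypothetical.

```sql
-- Materialize each view once into a temporary table.
CREATE TEMPORARY TABLE tmp_viewA AS
SELECT * FROM viewA;

CREATE TEMPORARY TABLE tmp_viewB AS
SELECT TRID FROM viewB;

-- Unlike a view's result set, a temporary table can be indexed.
ALTER TABLE tmp_viewA ADD INDEX (TRID);
ALTER TABLE tmp_viewB ADD INDEX (TRID);

-- Same anti-join as the original query, now with an index to use.
SELECT DISTINCT a.*
FROM tmp_viewA AS a
LEFT JOIN tmp_viewB AS b ON a.TRID = b.TRID
WHERE b.TRID IS NULL;

-- Clean up once the comparison is done.
DROP TEMPORARY TABLE tmp_viewA, tmp_viewB;
```

The cost of filling the temporary tables counts toward the total, so this is mainly useful as a diagnostic for whether missing indexes explain the slow join.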
EPILOGUE
Given the internal mechanisms of evaluating JOINs, along with the transient and indexless nature of the result set of a view, your original query
Context
StackExchange Database Administrators Q#60203, answer score: 12