HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMajor

Why is this MySQL query taking so long?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thiswhyquerylongmysqltaking

Problem

I'm looking to do a relatively simply filter from one table called prices.

prices is a very large table (~2G records). Sample data, query and indexes included below. I realize the table being queried is very large, but for a query this simple I would have expected better performance (currently ~5 mins and running). I did notice InnoDB Buffer Usage appears to be at 100% per MySQL Workbench, with ~6K InnoDB reads per second. Guidance on adjustments to make to correct this.

prices

```
dataDate ticker optionSymbol expDate type price strike last bid ask volume OI
2002-02-08 AAPL AAQ020216C00005000 2002-02-16 call 24.03 5 0 18.8 19.1 0 0
2002-02-08 AAPL AAQ020216P00005000 2002-02-16 put 24.03 5 0 0 0.05 0 0
2002-02-08 AAPL AAQ020216C00007500 2002-02-16 call 24.03 7.5 0 16.3 16.6 0 0
2002-02-08 AAPL AAQ020216P00007500 2002-02-16 put 24.03 7.5 0 0 0.05 0 0
2002-02-08 AAPL AAQ020216C00010000 2002-02-16 call 24.03 10 12.2 13.9 14.2 0 1
2002-02-08 AAPL AAQ020216P00010000 2002-02-16 put 24.03 10 0 0 0.05 0 0
2002-02-08 AAPL AAQ020216C00012500 2002-02-16 call 24.03 12.5 13.5 11.4 11.7 0 8
2002-02-08 AAPL AAQ020216P00012500 2002-02-16 put 24.03 12.5 0.05 0 0.05 0 50
2002-02-08 AAPL AAQ020216C00015000 2002-02-16 call 24.03 15 7.1 8.9 9.1 0 10
2002-02-08 AAPL AAQ020216P00015000 2002-02-16 put 24.03 15 0.1 0 0.05 0 30
2002-02-08 AAPL AAQ020216C00017500 2002-02-16 call 24.03 17.5 5.5 6.4 6.7 0 371
2002-02-08 AAPL AAQ020216P00017500 2002-02-16 put 24.03 17.5 0.05 0 0.05 0 147
2002-02-08 AAPL AAQ020216C00020000 2002-02-16 call 24.03 20 3.9 3.9 4.1 7 1064
2002-02-08 AAPL AAQ020216P00020000 2002-02-16 put 24.03 20 0.1 0 0.1 5 1448
2002-02-08 AAPL AAQ020216C000225

Solution

Instead of

'2020-04-30' = date_add(op.dataDate, INTERVAL 14 DAY);


Use

op.dataDate = date_sub('2020-04-30', INTERVAL 14 DAY);


Your first statement will be interpreted as "add 14 days to all dataDate and return when that is 2020-04-30." This will require a full scan of the table.

The second statement will evaluate to: "return records where the dataDate is 2020-04-16." This allows the engine to perform a seek on your index that begins with dataDate.

Do whatever weird stuff you want to do to expDate since that won't factor much into how the query engine will optimize.

Code Snippets

'2020-04-30' = date_add(op.dataDate, INTERVAL 14 DAY);
op.dataDate = date_sub('2020-04-30', INTERVAL 14 DAY);

Context

StackExchange Database Administrators Q#271614, answer score: 33

Revisions (0)

No revisions yet.