HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlModerate

How to optimize very slow SELECT with LEFT JOINs over big tables

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
lefttableshowwithslowbigoptimizeveryselectover

Problem

I was googling, self-educating & looking for solution for hours but with no luck. I found a few similar questions here but not this case.

My tables:

  • persons (~10M rows)



  • attributes (location, age,...)



  • links (M:M) between persons and attributes (~40M rows)



Full dump ~280MB

Situation:
I try to select all person ids (person_id) from some locations (location.attribute_value BETWEEN 3000 AND 7000), being some gender (gender.attribute_value = 1), born in some years (bornyear.attribute_value BETWEEN 1980 AND 2000) and having some eyes' color (eyecolor.attribute_value IN (2,3)).

This is my query witch tooks 3~4 min. and I'd like to optimize:

SELECT person_id
FROM person
    LEFT JOIN attribute location ON location.attribute_type_id = 1 AND location.person_id = person.person_id
    LEFT JOIN attribute gender ON gender.attribute_type_id = 2 AND gender.person_id = person.person_id
    LEFT JOIN attribute bornyear ON bornyear.attribute_type_id = 3 AND bornyear.person_id = person.person_id
    LEFT JOIN attribute eyecolor ON eyecolor.attribute_type_id = 4 AND eyecolor.person_id = person.person_id
WHERE 1
    AND location.attribute_value BETWEEN 3000 AND 7000
    AND gender.attribute_value = 1
    AND bornyear.attribute_value BETWEEN 1980 AND 2000
    AND eyecolor.attribute_value IN (2,3)
LIMIT 100000;


Result:

+-----------+
| person_id |
+-----------+
|       233 |
|       605 |
|       ... |
|   8702599 |
|   8703617 |
+-----------+
100000 rows in set (3 min 42.77 sec)


Explain extended:

```
+----+-------------+----------+--------+---------------------------------------------+-----------------+---------+--------------------------+---------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+--------+----------------------------------

Solution

Pick a few attributes to include in person. Index them in a few combinations -- use composite indexes, not single-column indexes.

That is essentially the only way out of EAV-sucks-at-performance, which is where you are.

Here is more discussion: http://mysql.rjweb.org/doc.php/eav including a suggestion of using JSON instead of the key-value table.

Context

StackExchange Database Administrators Q#118291, answer score: 13

Revisions (0)

No revisions yet.