patternsqlMinor
Improving `GROUP BY` query performance in sqlite3
Viewed 0 times
groupquerysqlite3performanceimproving
Problem
I have a little web-application that is using sqlite3 as it's DB (the db is fairly small).
Right now, I am generating some content to display using the following query:
Where
Anyways, right now, this one query is completely killing my performance. It takes approximately 800 milliseconds to execute on a table with ~67K rows.
I have indexes on both
However,
The index on
If I drop the
Basically, my naive assumption would be that the best way to perform this query would be
Right now, I am generating some content to display using the following query:
SELECT dbId,
dlState,
retreivalTime,
seriesName,
FROM DataItems
GROUP BY seriesName
ORDER BY retreivalTime DESC
LIMIT ?
OFFSET ?;Where
limit is typically ~200, and offset is 0 (they drive a pagination mechanism).Anyways, right now, this one query is completely killing my performance. It takes approximately 800 milliseconds to execute on a table with ~67K rows.
I have indexes on both
seriesName and retreivalTime.sqlite> SELECT name FROM sqlite_master WHERE type='index' ORDER BY name;
DataItems_seriesName_index
DataItems_time_index // This is the index on retreivalTime. Yeah, it's poorly namedHowever,
EXPLAIN QUERY PLAN seems to indicate they're not being used:sqlite> EXPLAIN QUERY PLAN SELECT dbId,
dlState,
retreivalTime,
seriesName
FROM
DataItems
GROUP BY
seriesName
ORDER BY
retreivalTime
DESC LIMIT 200 OFFSET 0;
0|0|0|SCAN TABLE DataItems
0|0|0|USE TEMP B-TREE FOR GROUP BY
0|0|0|USE TEMP B-TREE FOR ORDER BYThe index on
seriesName is COLLATE NOCASE, if that's relevant.If I drop the
GROUP BY, it behaves as expected:sqlite> EXPLAIN QUERY PLAN SELECT dbId, dlState, retreivalTime, seriesName FROM DataItems ORDER BY retreivalTime DESC LIMIT 200 OFFSET 0;
0|0|0|SCAN TABLE DataItems USING INDEX DataItems_time_indexBasically, my naive assumption would be that the best way to perform this query would be
Solution
An index can be used to optimize the GROUP BY, but if the ORDER BY uses different columns, the sorting cannot use an index (because an index would help only if the database would be able to read the rows from the table in the sort order).
A COLLATE NOCASE index does not help if you use a different collation in the query.
Add a 'normal' index, or use
Using the OFFSET clause for pagination is not very efficient, because the database still has to group and sort all rows before it can begin stepping over them.
Better use a scrolling cursor.
Note: there is no guarantee that the
A COLLATE NOCASE index does not help if you use a different collation in the query.
Add a 'normal' index, or use
GROUP BY seriesName COLLATE NOCASE, if that is allowed.Using the OFFSET clause for pagination is not very efficient, because the database still has to group and sort all rows before it can begin stepping over them.
Better use a scrolling cursor.
Note: there is no guarantee that the
dbId and dlState values come from any specific row; SQLite allows non-aggregated columns in an aggregated query only for bug compatibility with MySQL.Context
StackExchange Database Administrators Q#72358, answer score: 5
Revisions (0)
No revisions yet.