patternsqlMinor
Why is Query Store missing details?
Viewed 0 times
whyquerystoredetailsmissing
Problem
This is slightly complex to explain, but very easy to reproduce.
Short version:
For about 10%-30% queries (by count) significant amounts of Query Store data is not available. I have looked at SQL 2016 and SQL 2017 where Query store had been running for weeks+, where there was activity on the database(s).
The query below will return two sets of data, the top set does not have a value for
Problem Solving:
-
I Initially discovered this using exported Query Store data My data is in an Excel sheet, I sorted and compared and did not find any common factors separating the 'haves' from the 'have nots'
-
To rule out data export issues, I used the code above to get data directly from the system dataview. Similar results to findings in exported data.
-
To rule out the system dataview, I used "Tracked Queries" to report from the root data. (Query Store > Tracked queries > Configure)
3.A. For the 'haves', the query plan(s) and such are displayed, just as you would expect
3.B. For the 'have nots', there is no query plan and no metrics
-
The only scope limiting factor I have found is that on very active databases, the 'have nots' are limited to the last few hours. But on slow databases, the 'have nots' can have initial_compile_start_time dating back several weeks. (I suspect the 'have nots' are being purged then recreated the next time the now "Never seen before, new" query is run)
-
The 'have nots' can have any range of compile counts, number of plans, etc.
Short version:
For about 10%-30% queries (by count) significant amounts of Query Store data is not available. I have looked at SQL 2016 and SQL 2017 where Query store had been running for weeks+, where there was activity on the database(s).
The query below will return two sets of data, the top set does not have a value for
query_store_query.last_execution_time, while the bottom set does have data in the field. The 'have nots' are also missing most the run time stats data. EXEC sp_query_store_flush_db; --Same results without this, but just to rule it out in the examples.
go
Select * from sys.query_store_query
where query_store_query.last_execution_time is null -- there are bunch of these, also missing other data, why?
Select * from sys.query_store_query
where query_store_query.last_execution_time is not nullProblem Solving:
-
I Initially discovered this using exported Query Store data My data is in an Excel sheet, I sorted and compared and did not find any common factors separating the 'haves' from the 'have nots'
-
To rule out data export issues, I used the code above to get data directly from the system dataview. Similar results to findings in exported data.
-
To rule out the system dataview, I used "Tracked Queries" to report from the root data. (Query Store > Tracked queries > Configure)
3.A. For the 'haves', the query plan(s) and such are displayed, just as you would expect
3.B. For the 'have nots', there is no query plan and no metrics
-
The only scope limiting factor I have found is that on very active databases, the 'have nots' are limited to the last few hours. But on slow databases, the 'have nots' can have initial_compile_start_time dating back several weeks. (I suspect the 'have nots' are being purged then recreated the next time the now "Never seen before, new" query is run)
-
The 'have nots' can have any range of compile counts, number of plans, etc.
Solution
I see this on my SQL Server 2016 instance as well (same settings as yours: 15 min data flush, and 1 hour stats collection). I noticed that the plans with missing information correlated with AG failovers and maintenance-related reboots (which I found by looking in the SQL Server error log).
If you are following the best practices related to "mission critical servers" like I am, you may have taken these steps:
Use trace flags on mission critical servers
The global trace flags 7745 and 7752 can be used to improve availability of databases using Query Store. For more information, refer to Trace Flags.
-
Trace flag 7745 will prevent the default behavior where Query Store writes data to disk before SQL Server can be shut down. This means that Query Store data that has been collected but not been yet persisted to disk will be lost.
-
Trace flag 7752 enables asynchronous load of Query Store. This allows a database to become online and queries to be executed before the Query Store has been fully recovered. The default behavior is to do synchoronous load of Query Store. The default behavior prevents queries from executing before the Query Store has been recovered but also prevents any queries from being missed in the data collection.
This explains all of the missing runtime stats in my setup. While it's not explicitly called out in this article, it appears that the queries and plans might get into the query store without aggregated runtime stats. This might vary based on your "Data Flush Interval" and "Statistics Collection Interval."
If you are following the best practices related to "mission critical servers" like I am, you may have taken these steps:
Use trace flags on mission critical servers
The global trace flags 7745 and 7752 can be used to improve availability of databases using Query Store. For more information, refer to Trace Flags.
-
Trace flag 7745 will prevent the default behavior where Query Store writes data to disk before SQL Server can be shut down. This means that Query Store data that has been collected but not been yet persisted to disk will be lost.
-
Trace flag 7752 enables asynchronous load of Query Store. This allows a database to become online and queries to be executed before the Query Store has been fully recovered. The default behavior is to do synchoronous load of Query Store. The default behavior prevents queries from executing before the Query Store has been recovered but also prevents any queries from being missed in the data collection.
This explains all of the missing runtime stats in my setup. While it's not explicitly called out in this article, it appears that the queries and plans might get into the query store without aggregated runtime stats. This might vary based on your "Data Flush Interval" and "Statistics Collection Interval."
Context
StackExchange Database Administrators Q#234483, answer score: 4
Revisions (0)
No revisions yet.