patternsqlMinor
CPU cores and sys.dm_os_workers
Viewed 0 times
coressysdm_os_workersandcpu
Problem
We recently failed over our AOAG cluster to a secondary region. In the first region, our boxes had 32 cores while in the secondary region, we have 64 cores. Traffic is similar, however on the larger boxes we're running much higher numbers of workers in sys.dm_os_workers (and also threads in sys.dm_os_threads). Is this expected behavior when raising CPU cores or should we be concerned about all these idle workers?
We're running SQL Server 2017 CU 24.
In the current server, we're seeing the following counts:
session_count
request_count
worker_count
thread_count
2366
389
1172
1265
We're running SQL Server 2017 CU 24.
max worker threads is configured to 0 (the default value).max degree of parallelism is configured to 2 in both regions.In the current server, we're seeing the following counts:
session_count
request_count
worker_count
thread_count
2366
389
1172
1265
Solution
What I'm really trying to answer is should the number of workers increase dramatically when we increase the number of CPUs?
Traffic is similar, however on the larger boxes we're running much higher numbers of workers in sys.dm_os_workers (and also threads in sys.dm_os_threads).
Without going into the minutia, the number of cores has little bearing on the number of workers or threads in the process for a server under normal load.
The use of threads falls into three different buckets, the first being threads that are needed to run long lived permanent tasks in a preemptive mode such as the lease in an AG setup, CLR hosting, etc., or 3rd party module items such as driver providers. The second bucket are long lived permanent tasks that are SQL related, such as scheduler manager, lazywriter, and other SQL-bootstrapped items which tend to be partitioned by numa node and not by number of cores. The third bucket are SQL query execution related items such as satisfying queries or logins.
Workers fall into the same buckets except for the first as there aren't worker constructs associated with those types of threads (Examples below).
While you'll have more threads and workers due to the first two buckets assuming there are more numa nodes, it shouldn't be dramatic unless the number of threads and workers was already low. The main difference will be in where the work is assigned. Since each scheduler will have certain local queues and work being spread across more schedulers, there will be more workers and threads as even having one extra scheduler will require one more thread and worker.
Normal server activity will spin up and down workers and threads as needed. Periodic spikes in workload or scheduling changes can make this effect seem larger than otherwise expected but is normal.
Overall, though, this is nothing to worry about unless you're hitting max worker threads - in which case changing the value to something else would not be the appropriate course of action as a solution.
Example - Bucket 1: Long lived Non-SQL permanent
Example - Bucket 2: Long lived SQL Bootstrapped
Example - Bucket 3: Regular SQL User Workerload
Traffic is similar, however on the larger boxes we're running much higher numbers of workers in sys.dm_os_workers (and also threads in sys.dm_os_threads).
Without going into the minutia, the number of cores has little bearing on the number of workers or threads in the process for a server under normal load.
The use of threads falls into three different buckets, the first being threads that are needed to run long lived permanent tasks in a preemptive mode such as the lease in an AG setup, CLR hosting, etc., or 3rd party module items such as driver providers. The second bucket are long lived permanent tasks that are SQL related, such as scheduler manager, lazywriter, and other SQL-bootstrapped items which tend to be partitioned by numa node and not by number of cores. The third bucket are SQL query execution related items such as satisfying queries or logins.
Workers fall into the same buckets except for the first as there aren't worker constructs associated with those types of threads (Examples below).
While you'll have more threads and workers due to the first two buckets assuming there are more numa nodes, it shouldn't be dramatic unless the number of threads and workers was already low. The main difference will be in where the work is assigned. Since each scheduler will have certain local queues and work being spread across more schedulers, there will be more workers and threads as even having one extra scheduler will require one more thread and worker.
Normal server activity will spin up and down workers and threads as needed. Periodic spikes in workload or scheduling changes can make this effect seem larger than otherwise expected but is normal.
Overall, though, this is nothing to worry about unless you're hitting max worker threads - in which case changing the value to something else would not be the appropriate course of action as a solution.
Example - Bucket 1: Long lived Non-SQL permanent
sqlservr!DCOMInitThread
kernel32!BaseThreadInitThunk
ntdll!RtlUserThreadStartExample - Bucket 2: Long lived SQL Bootstrapped
sqlmin!lazywriter
sqldk!SOS_Task::Param::Execute
sqldk!SOS_Scheduler::RunTask
sqldk!SOS_Scheduler::ProcessTasks
sqldk!SchedulerManager::WorkerEntryPoint
sqldk!SystemThread::RunWorker
sqldk!SystemThreadDispatcher::ProcessWorker
sqldk!SchedulerManager::ThreadEntryPoint
kernel32!BaseThreadInitThunk
ntdll!RtlUserThreadStartExample - Bucket 3: Regular SQL User Workerload
sqllang!CXStmtSelect::XretExecute
sqllang!CExecStmtLoopVars::ExecuteXStmtAndSetXretReturn
sqllang!CMsqlExecContext::ExecuteStmts
sqllang!CMsqlExecContext::FExecute
sqllang!CSQLSource::Execute
sqllang!process_request
sqllang!process_commands_internal
sqllang!process_messages
sqldk!SOS_Task::Param::Execute
sqldk!SOS_Scheduler::RunTask
sqldk!SOS_Scheduler::ProcessTasks
sqldk!SchedulerManager::WorkerEntryPoint
sqldk!SystemThread::RunWorker
sqldk!SystemThreadDispatcher::ProcessWorker
sqldk!SchedulerManager::ThreadEntryPoint
kernel32!BaseThreadInitThunk
ntdll!RtlUserThreadStartCode Snippets
sqlservr!DCOMInitThread
kernel32!BaseThreadInitThunk
ntdll!RtlUserThreadStartsqlmin!lazywriter
sqldk!SOS_Task::Param::Execute
sqldk!SOS_Scheduler::RunTask
sqldk!SOS_Scheduler::ProcessTasks
sqldk!SchedulerManager::WorkerEntryPoint
sqldk!SystemThread::RunWorker
sqldk!SystemThreadDispatcher::ProcessWorker
sqldk!SchedulerManager::ThreadEntryPoint
kernel32!BaseThreadInitThunk
ntdll!RtlUserThreadStartsqllang!CXStmtSelect::XretExecute
sqllang!CExecStmtLoopVars::ExecuteXStmtAndSetXretReturn
sqllang!CMsqlExecContext::ExecuteStmts<1,0>
sqllang!CMsqlExecContext::FExecute
sqllang!CSQLSource::Execute
sqllang!process_request
sqllang!process_commands_internal
sqllang!process_messages
sqldk!SOS_Task::Param::Execute
sqldk!SOS_Scheduler::RunTask
sqldk!SOS_Scheduler::ProcessTasks
sqldk!SchedulerManager::WorkerEntryPoint
sqldk!SystemThread::RunWorker
sqldk!SystemThreadDispatcher::ProcessWorker
sqldk!SchedulerManager::ThreadEntryPoint
kernel32!BaseThreadInitThunk
ntdll!RtlUserThreadStartContext
StackExchange Database Administrators Q#305059, answer score: 7
Revisions (0)
No revisions yet.