patternsqlMinor
SQL Parallel Stored Procedure execution using CLR - Performance
Viewed 0 times
storedsqlclrprocedureparallelusingperformanceexecution
Problem
We have a main analytics SP that calls 100 other sub SPs, all of which act upon the same set of data (Claims), check it for some business rules, and output discrepancies.
The SPs need not all be run sequentially, they can be broken out into sections, which are dependent on previous sections (sequential), but independent within the section (parallel).
After looking at many options like Service Broker, Agent Jobs, Batch Files, SSIS etc., I used this CLR code to parallelize the sections, and it gave great performance improvement.
However, when I run multiple (5, 10, 15) main SPs concurrently (each of which analyzes different claims) , performance starts to taper as concurrency increases. I guess this is because of the overhead of creating multiple parallel threads through the CLR. I also see lot of XTP_THREAD_POOL sessions idle in sp_who2.
Has anyone used CLR for parallelizing Stored Procedures in Critical OLTP Production workloads ?
Are there any best practices for performance tuning SQL CLR ?
Is there a threshold for number of parallel threads that can be opened before the overhead makes things worse ?
If my system has 20 cores, does it mean creating > 20 parallel threads does not help ?
The SPs need not all be run sequentially, they can be broken out into sections, which are dependent on previous sections (sequential), but independent within the section (parallel).
After looking at many options like Service Broker, Agent Jobs, Batch Files, SSIS etc., I used this CLR code to parallelize the sections, and it gave great performance improvement.
However, when I run multiple (5, 10, 15) main SPs concurrently (each of which analyzes different claims) , performance starts to taper as concurrency increases. I guess this is because of the overhead of creating multiple parallel threads through the CLR. I also see lot of XTP_THREAD_POOL sessions idle in sp_who2.
Has anyone used CLR for parallelizing Stored Procedures in Critical OLTP Production workloads ?
Are there any best practices for performance tuning SQL CLR ?
Is there a threshold for number of parallel threads that can be opened before the overhead makes things worse ?
If my system has 20 cores, does it mean creating > 20 parallel threads does not help ?
Solution
Has anyone used CLR for parallelizing Stored Procedures in Critical OLTP Production workloads ?
Parallelizing? Yes. For "critical OLTP Production" systems, not really. This is not to say that you can't do it, but you are entering rather dangerous territory here given that this goes beyond the typical caution of accessing external resources (usually web services) that might not return quickly and hence might "hijack" a thread until SQL Server forces that thread to yield.
Is there a threshold for number of parallel threads that can be opened before the overhead makes things worse ?
I do not believe so. Also, I am not sure that the thread count by itself is the overriding issue here. Have you also checked for blocking as you increase the execution count? More thoughts below..
However, when I run multiple (5, 10, 15) main SPs concurrently (each of which analyzes different claims) , performance starts to taper as concurrency increases.
So, here are some points to consider:
-
SQLCLR aside, isn't what you are seeing, to a degree, merely the nature of resource management / concurrency? As you start to increase concurrency you see a gain, then at some point the amount of each additional gain decreases until it becomes 0, and then increasing concurrency further actually has the opposite effect? (i.e. economies of scale / diseconomies of scale)
Yes, the additional operations are being done on different sets of rows, but each thread still requires its own connection / session. Have you checked the connection count in
Execute the following query to get the details, and review the documentation for the DMV, linked above, to better understand the values being returned by the various fields:
As you exec that query while your process is running, how do the values change for the various "task" and "worker" count fields, especially as you increase the number of concurrent executions of the SQLCLR stored procedure?
-
Since you cannot use the Context Connection with multithreading, you are making regular / external connections. That would imply that the SQLCLR stored procedure does not need to be local (well, depending on how it is being used in this process). It might be worth exploring moving this SQLCLR stored procedure to a different server where it wouldn't be competing for threads. Adding to the point made above about only having so many threads to use, one needs to consider that each thread you request is really taking 2 from the CPU, right? One thread in .NET and one thread in SQL Server. Of course, looking at that Code Project page, I don't see any way to pass in a Connection String, so maybe it is just connecting to the
-
One thing I noticed while looking at the usage description on that Code Project page was that it seems to store properties in static variables, given that separate execution of stored procedures are able to maintain state between those executions. While it is possible that the
While it might be coded properly to handle this, you might want to consider creating separate databases just to hold copies of this Assembly to run each separate set of executions. Who knows, you might also get better thread management out of it as well. This is probably necessary whether you run it locally or on a remote instance.
-
Finally, given the limited description of the operation, have you considered implementing this setup in your own .NET app? You could write a console application that handles the multi-threading and handles the workflow as you need it to be: multiple, synchronous sections of asynchronous calls. This would remove the various risks you currently face trying to do this within SQLCLR, while ensuring that you can run it remotely. Then, you can run multiple instances of the console app, each to work on different sets of claims, without worrying about shared memory (as each instance of the console app would have its own App Domain) and without taking any threads away from SQL Server to manag
Parallelizing? Yes. For "critical OLTP Production" systems, not really. This is not to say that you can't do it, but you are entering rather dangerous territory here given that this goes beyond the typical caution of accessing external resources (usually web services) that might not return quickly and hence might "hijack" a thread until SQL Server forces that thread to yield.
Is there a threshold for number of parallel threads that can be opened before the overhead makes things worse ?
I do not believe so. Also, I am not sure that the thread count by itself is the overriding issue here. Have you also checked for blocking as you increase the execution count? More thoughts below..
However, when I run multiple (5, 10, 15) main SPs concurrently (each of which analyzes different claims) , performance starts to taper as concurrency increases.
So, here are some points to consider:
-
SQLCLR aside, isn't what you are seeing, to a degree, merely the nature of resource management / concurrency? As you start to increase concurrency you see a gain, then at some point the amount of each additional gain decreases until it becomes 0, and then increasing concurrency further actually has the opposite effect? (i.e. economies of scale / diseconomies of scale)
Yes, the additional operations are being done on different sets of rows, but each thread still requires its own connection / session. Have you checked the connection count in
sys.dm_exec_connections? I'm not sure how many cores you have, but there is a DMV, sys.dm_os_schedulers, that shows the visible scheduler count and that is how many active query threads you can have at any given point in time. If, let's say, you have 32 schedulers, then it won't process 100 threads faster than 50.Execute the following query to get the details, and review the documentation for the DMV, linked above, to better understand the values being returned by the various fields:
SELECT * FROM sys.dm_os_schedulers WHERE [status] = N'VISIBLE ONLINE';As you exec that query while your process is running, how do the values change for the various "task" and "worker" count fields, especially as you increase the number of concurrent executions of the SQLCLR stored procedure?
-
Since you cannot use the Context Connection with multithreading, you are making regular / external connections. That would imply that the SQLCLR stored procedure does not need to be local (well, depending on how it is being used in this process). It might be worth exploring moving this SQLCLR stored procedure to a different server where it wouldn't be competing for threads. Adding to the point made above about only having so many threads to use, one needs to consider that each thread you request is really taking 2 from the CPU, right? One thread in .NET and one thread in SQL Server. Of course, looking at that Code Project page, I don't see any way to pass in a Connection String, so maybe it is just connecting to the
(local) instance? (I don't have time at the moment to sign up just to look at the source code).-
One thing I noticed while looking at the usage description on that Code Project page was that it seems to store properties in static variables, given that separate execution of stored procedures are able to maintain state between those executions. While it is possible that the
Parallel_Declare proc could be used to manage the separation callers, SQLCLR uses a shared App Domain model: App Domains are created per each DB / Assembly owner combination. If you are using the same stored proc for all of this, then that is running in the same App Domain and ALL sessions are sharing the same memory and hence the same static variables. Are you sure that multiple executions of the same stored procedure with different SQL isn't overwriting concurrent executions that have already been started?While it might be coded properly to handle this, you might want to consider creating separate databases just to hold copies of this Assembly to run each separate set of executions. Who knows, you might also get better thread management out of it as well. This is probably necessary whether you run it locally or on a remote instance.
-
Finally, given the limited description of the operation, have you considered implementing this setup in your own .NET app? You could write a console application that handles the multi-threading and handles the workflow as you need it to be: multiple, synchronous sections of asynchronous calls. This would remove the various risks you currently face trying to do this within SQLCLR, while ensuring that you can run it remotely. Then, you can run multiple instances of the console app, each to work on different sets of claims, without worrying about shared memory (as each instance of the console app would have its own App Domain) and without taking any threads away from SQL Server to manag
Code Snippets
SELECT * FROM sys.dm_os_schedulers WHERE [status] = N'VISIBLE ONLINE';Context
StackExchange Database Administrators Q#192129, answer score: 9
Revisions (0)
No revisions yet.