HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

Will creating partitions reduce locking and how do we implement this in sql-server?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thisimplementservercreatingsqlreducepartitionswilllockinghow

Problem

A client wants our application to process more data faster so arranged a meeting with their dba to discuss options.

This application generates quite a lot of data that is used for reporting. Before each run the old data for that item is deleted, the calculations are performed and then the new data inserted. In busy periods the users queue up hundreds of these generation tasks and we run upto 30 of them concurrently. Each run might create 60K rows.

The dba has suggested we could change the application to use 30 partitions (eg. one per thread) to reduce locking between threads during insert and delete. They suggested that in standard sql we could do something like

INSERT INTO schema.table.partition (...) VALUES (...)


I do not see this syntax in the msdn docs and this will mean changing this application which is a pain but is it even possible to do this? As I understand we would instead partition based on columns of the tables using partition functions?

I've read the create partition function docs but am not completely sure how to create a function to meet our needs. To make matters worse I don't yet have enterprise edition to try this out on so my apologies for incorrect syntax.

I am thinking that for example if we have an items table and an itemdata table with data for that item we might partition itemdata table by splitting the data based on a function like itemid mod 30. This would put item 1 in partition 1, item 2 in partition 2, etc. I'm not sure if we could do this in the partition function, in the scheme, table declaration or would we need to create a calculated column and use a values clause? Also not sure if we are going to see any performance improvement?

This is how I think we could implement this:

```
CREATE PARTITION FUNCTION SplittingItemIds_PFunc(decimal(18,0)) AS
RANGE LEFT FOR VALUES
(0,1,2,3, ... ,29)

CREATE PARTITION SCHEME SplittingItemIds_Scheme
AS PARTITION SplittingItemIds_PFunc
ALL TO ([PRIMARY]);

CREATE TABLE ItemData
(

Solution

Software can't fix this if the underlying IO/disks are opaque to you.

  • If you add partitions, you won't spawn 30 threads



  • If all your partitions are on the same volume you'll decrease throughput



I've worked on similar systems where we had

  • a staging DB



  • SIMPLE recovery on staging



  • deferred deletes (eg a new or updated run is an INSERT) that run out of hours



  • each run has a header row to track this status



  • flush to real DB when client is ready



We've also

  • removed FKs and other tweaks to the main table



  • put the staging DB onto separate volumes.

Context

StackExchange Database Administrators Q#5295, answer score: 8

Revisions (0)

No revisions yet.