
Huge data and performance in SQL Server

Submitted by: @import:stackexchange-dba

Problem

I've written an application with a SQL Server backend that collects and stores an extremely large number of records. I've calculated that, at the peak, the average volume is somewhere in the neighborhood of 3-4 billion records per day (20 hours of operation).

My original solution (before I'd done the actual calculation of the data) was to have my application insert records into the same table that is queried by my clients. That crashed and burned fairly quickly, obviously, because it's impossible to query a table while that many records are being inserted into it.

My second solution was to use 2 databases, one for data received by the application and one for client-ready data.

My application would receive data, chunk it into batches of ~100k records, and bulk-insert into the staging table. After ~100k records the application would, on the fly, create another staging table with the same schema as before and begin inserting into that one instead. It would then create a record in a jobs table with the name of the full table, and a stored procedure on the SQL Server side would move the data from the staging table(s) to the client-ready production table and drop the temporary table created by my application.
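
A rough sketch of that rotation step, as the T-SQL the application would effectively execute; the actual code isn't shown in this entry, so all table and column names here are assumptions:

    -- Clone the staging schema into the next target (heap, no constraints),
    -- then queue the table that just filled up for the server-side move job.
    -- Table names are hypothetical placeholders.
    DECLARE @fullTable sysname = N'SignalValues_staging_001';  -- just hit ~100k rows
    DECLARE @nextTable sysname = N'SignalValues_staging_002';  -- next insert target

    DECLARE @sql nvarchar(max) =
        N'SELECT TOP (0) * INTO dbo.' + QUOTENAME(@nextTable)
      + N' FROM dbo.SignalValues_staging;';
    EXEC sys.sp_executesql @sql;

    INSERT INTO dbo.Jobs (TableName, CreatedAtUtc)
    VALUES (@fullTable, SYSUTCDATETIME());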

Both databases have the same set of 5 tables with the same schema, except that the staging database also has the jobs table. The staging database has no integrity constraints, keys, indexes, etc. on the table where the bulk of the records will reside; that table is SignalValues_staging. The goal was to have my application slam the data into SQL Server as quickly as possible. The workflow of creating tables on the fly so they can easily be migrated works pretty well.
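
For illustration only, since the actual table definitions did not survive the import: a staging heap along these lines would match the description above, with no keys, constraints, or indexes to slow down bulk inserts. The columns are assumptions:

    -- Hypothetical staging heap; column names and types are assumed,
    -- not the author's actual schema.
    CREATE TABLE dbo.SignalValues_staging (
        SignalId    int          NOT NULL,
        ReadingTime datetime2(7) NOT NULL,
        Value       float        NULL
    );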

The following are the 5 relevant tables from my staging database, plus my jobs table:

The stored procedure I have written handles moving the data from all of the staging tables and inserting it into production. Below is the part of my stored procedure that inserts into production:
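
The procedure itself was not preserved in this entry. A minimal sketch of such a move step, assuming the jobs table holds the staging table names (dynamic SQL is needed because the table name arrives as data):

    -- Move one finished staging table into production, then drop it.
    -- dbo.Jobs, dbo.SignalValues, and the column names are assumptions.
    DECLARE @tableName sysname, @sql nvarchar(max);

    SELECT TOP (1) @tableName = TableName
    FROM dbo.Jobs
    ORDER BY CreatedAtUtc;

    SET @sql = N'INSERT INTO dbo.SignalValues '
             + N'SELECT * FROM dbo.' + QUOTENAME(@tableName) + N'; '
             + N'DROP TABLE dbo.' + QUOTENAME(@tableName) + N';';
    EXEC sys.sp_executesql @sql;

    DELETE FROM dbo.Jobs WHERE TableName = @tableName;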

Solution

"I've calculated that, at the peak, the average volume is somewhere in the neighborhood of 3-4 billion records per day (20 hours of operation)."

From your screenshot, you have ONLY 8 GB of total RAM, with 6 GB allocated to SQL Server. This is far too low for what you are trying to achieve.

I suggest you upgrade the memory to a much higher value - 256 GB - and bump up your VM CPUs as well.

You need to invest in hardware at this point for your workload.
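
If you do add RAM, remember to raise SQL Server's memory cap to match. A minimal sketch; the exact value is a placeholder to size for your environment:

    -- Raise max server memory after adding RAM. 229376 MB (~224 GB) is an
    -- example for a 256 GB box, leaving headroom for the OS.
    EXEC sys.sp_configure N'show advanced options', 1;
    RECONFIGURE;
    EXEC sys.sp_configure N'max server memory (MB)', 229376;
    RECONFIGURE;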

Also refer to the Data Loading Performance Guide - it describes smart ways of loading data efficiently.
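
One technique from that guide, shown here as a sketch against the assumed table names above: inserts into a heap with a TABLOCK hint can be minimally logged (under the SIMPLE or BULK_LOGGED recovery model), which matters a lot at billions of rows per day.

    -- Minimally logged insert into a heap via TABLOCK; table and column
    -- names reuse the assumed schema above.
    INSERT INTO dbo.SignalValues WITH (TABLOCK)
    SELECT SignalId, ReadingTime, Value
    FROM dbo.SignalValues_staging;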


"My tempdb is 8 MB."

Based on your edit, you should have a sensibly sized tempdb - preferably multiple, equally sized tempdb data files - along with trace flags 1117 and 1118 enabled instance-wide.
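
For example (file paths, counts, and sizes are placeholders, not recommendations for this specific box):

    -- Equally sized tempdb data files; adjust count and size to your
    -- cores and disks.
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev,  SIZE = 8192MB, FILEGROWTH = 1024MB);
    ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = N'T:\tempdb2.ndf', SIZE = 8192MB, FILEGROWTH = 1024MB);
    ALTER DATABASE tempdb ADD FILE (NAME = tempdev3, FILENAME = N'T:\tempdb3.ndf', SIZE = 8192MB, FILEGROWTH = 1024MB);
    ALTER DATABASE tempdb ADD FILE (NAME = tempdev4, FILENAME = N'T:\tempdb4.ndf', SIZE = 8192MB, FILEGROWTH = 1024MB);

    -- TF 1117 (grow all files together) and TF 1118 (uniform extents),
    -- enabled instance-wide; add -T1117 -T1118 as startup parameters too.
    DBCC TRACEON (1117, 1118, -1);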

I would suggest you get a professional health check and start from there.

Highly recommended:

a. Bump up your server spec.

b. Get a professional* to do a health check of your database server instance and follow the recommendations.

c. Once a. and b. are done, immerse yourself in query tuning and other optimizations: look at wait stats, query plans, etc. (a starting point is sketched below).
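
A quick starting point for step c., querying the cumulative wait statistics since the last restart:

    -- Top waits by total wait time; filter out a few benign sleep waits.
    SELECT TOP (10)
           wait_type,
           wait_time_ms,
           waiting_tasks_count
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT LIKE N'SLEEP%'
    ORDER BY wait_time_ms DESC;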


Note: I am a professional SQL Server expert at hackhands.com - a Pluralsight company - but I am by no means suggesting that you hire me. I am merely suggesting, based on your edits, that you seek professional help.

HTH.

Context

StackExchange Database Administrators Q#123414, answer score: 8
