patternsqlMinor
Storing terabytes of data in SQL Server
Viewed 0 times
sqldataserverstoringterabytes
Problem
I need to develop a production system capable of storing a 100 kb textfile and a serial number for every piece produced. I know the approximate output over 2 years, the time the data has to be stored, and have calculated a need of around 25-30 TB of storage.
I have some experience working with databases, but handling that amount of data is new to me.
My first guess would be to simply store the files as a blob inside a table in the database. What are the problems of storing that amount of data in a single table? What are possible backup scenarios?
Another approach would be to store the data in several tables (e.g. a table per month).
A third approach would be to store references to the data in SQL Server, zip the files, and store them somewhere else.
The purpose of this storage system is to query the text file by using the serial number.
What would you do?
I have some experience working with databases, but handling that amount of data is new to me.
My first guess would be to simply store the files as a blob inside a table in the database. What are the problems of storing that amount of data in a single table? What are possible backup scenarios?
Another approach would be to store the data in several tables (e.g. a table per month).
A third approach would be to store references to the data in SQL Server, zip the files, and store them somewhere else.
The purpose of this storage system is to query the text file by using the serial number.
What would you do?
Solution
Historically, I'd shy away from storing files in the database and instead include pointers like in your alternative 3. With SQL Server 2012, however, you might look into the FileTable feature. The files are still stored within SQL Server (meaning that you'd have to be aware of the implications on your backup strategy), but also are easily accessible through the standard Windows API so you can manipulate them in Windows Explorer or PowerShell. If you actually need to search within the text files as well, FileTable can do that.
On the other hand, storing 25-30 "extra" TB of data within a SQL Server database will be a challenge to maintain. If this is just for compliance/auditing purposes (i.e., the probability of somebody actually looking at a particular file is low) and files aren't inserted very often, I'd probably stick with the pointers-to-files approach. But if you need access frequently or there are a lot of changes, the advantage goes to FileTable because it takes care of the pointer-handling part of the process.
On the other hand, storing 25-30 "extra" TB of data within a SQL Server database will be a challenge to maintain. If this is just for compliance/auditing purposes (i.e., the probability of somebody actually looking at a particular file is low) and files aren't inserted very often, I'd probably stick with the pointers-to-files approach. But if you need access frequently or there are a lot of changes, the advantage goes to FileTable because it takes care of the pointer-handling part of the process.
Context
StackExchange Database Administrators Q#27964, answer score: 4
Revisions (0)
No revisions yet.