patternsqlMinor
Copy database and clear down data (including linked tables) older than N days
Viewed 0 times
tablesthanlinkeddatabasedownincludingclearanddatadays
Problem
I have a sizable production database (140GB) on MS SQL Server 2012, which I'm trying to automatically download to our development environments on a periodical basis (every week or on demand). I'm not a DBA, but I have some understanding of the basic SQL commands I could use. However, I don't really know the best, most efficient way to implement this and I'd love some feedback and advice.
I came up with the following plan:
There are a couple of things I noted:
My (single) script looks like this:
```
/*
This script is designed to delete all orders older than 6 months old.
It uses a view, which limits the number of claims it brings back
every time it runs. Running the script in this way should reduce the load
on the transaction log, which otherwise would get out of hand. We also
run this in simple recovery mode for a similar reason. For more information
on how this script is structured, please see the following resources:
- http://web.archive.org/web/20100212155407/http://blogs.msdn.com/sqlcat/archive/2009/05/21/fast-ordered-delete.aspx
- http://stackoverflow.com/a/18292136/119624
- http://dba.stackexchange.com/a/63773/45757
- http://www.sqlservercentral.com/blogs/sqldbauk/2011/07/11/can-you-delete-from-a-view_3F00_/
- http://www.sqlbadpractices.com/should-you-shrink-your-database-in-your-maintenance
I came up with the following plan:
- backup production
- restore a copy of production to a copy
- strip down (old data) from copy
- reduce space usage of the copy
- backup copy so it can be downloaded
There are a couple of things I noted:
- I really cannot risk the clean down script running against the production database.
- The production database is in use 24/7/365
- I need to delete a lot of data (out of 10 years worth of data, keep the last 6 months)
- I'm cleaning out millions of rows in multiple tables.
- To prevent transaction log growth getting out of hand, I should use an explicit transaction to avoid log flushes
- Similarly, since I don't require recovery of the database copy, I can switch to simple recovery mode.
My (single) script looks like this:
```
/*
This script is designed to delete all orders older than 6 months old.
It uses a view, which limits the number of claims it brings back
every time it runs. Running the script in this way should reduce the load
on the transaction log, which otherwise would get out of hand. We also
run this in simple recovery mode for a similar reason. For more information
on how this script is structured, please see the following resources:
- http://web.archive.org/web/20100212155407/http://blogs.msdn.com/sqlcat/archive/2009/05/21/fast-ordered-delete.aspx
- http://stackoverflow.com/a/18292136/119624
- http://dba.stackexchange.com/a/63773/45757
- http://www.sqlservercentral.com/blogs/sqldbauk/2011/07/11/can-you-delete-from-a-view_3F00_/
- http://www.sqlbadpractices.com/should-you-shrink-your-database-in-your-maintenance
Solution
One to do the backup and restore (running as an sa) and then creating a separate user that only has access to the PRODCOPY database, that then runs the clean up section of this script. Any thoughts on this?
This is a good idea. You can have 2 jobs
Some things to consider in your script :
-
If the users are connected to the
-
Backing up PROD database, if you are on sql server 2008R2 and up, then standard edition supports BACKUP
-
Make sure you have Instant File Initialization to cut down the restore time.
-
do not drop the database, instead restore
-
Your subsequent script will fail as you are restoring the database, but still putting it in NORECOVERY. You should change it to
-
A good coding practice, avoid using INFORMATION_SCHEMA. Instead use
Also, always use schema.tableName instead of just tableName.
The references are to Aaron Bertrand's blog that explains in more detail
-
Change the sequence of shrink and rebuild indexes. The Shrink should be first followed by a rebuild or reorg and update stats.
-
Use the new syntax of
This is a good idea. You can have 2 jobs
- Perform backup and restore of PROD database to dev environments
- Once above is completed, kick off the scrubbing script to purge old data.
Some things to consider in your script :
-
If the users are connected to the
PRODCOPY, make sure to disconnect them using alter database db_name
set single_user with rollback immediate
waitfor delay '00:00:05'
alter database db_name
set multi_user-
Backing up PROD database, if you are on sql server 2008R2 and up, then standard edition supports BACKUP
compression. You should use that as well.-- we need to delete the existing PRODCOPY-
Make sure you have Instant File Initialization to cut down the restore time.
-
do not drop the database, instead restore
with replace. With Replace will NOT require pages with identical data to be rewritten there by cutting down the restore time. -
Your subsequent script will fail as you are restoring the database, but still putting it in NORECOVERY. You should change it to
RECOVERY.-- We just want to remove everything from this table-
A good coding practice, avoid using INFORMATION_SCHEMA. Instead use
sys.tables.Also, always use schema.tableName instead of just tableName.
The references are to Aaron Bertrand's blog that explains in more detail
-
Change the sequence of shrink and rebuild indexes. The Shrink should be first followed by a rebuild or reorg and update stats.
-
Use the new syntax of
alter index rebuild | reorganize. To automate the above point, you can use Ola Hallengren's - SQL Server Index and Statistics MaintenanceCode Snippets
alter database db_name
set single_user with rollback immediate
waitfor delay '00:00:05'
alter database db_name
set multi_user-- we need to delete the existing PRODCOPY-- We just want to remove everything from this tableContext
StackExchange Database Administrators Q#74318, answer score: 2
Revisions (0)
No revisions yet.