HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Copy large table to archive table SQL Server 2012

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
2012sqlcopylargeservertablearchive

Problem

I have a table with logs of my application that have 161.626.718 rows (and growing) and it's very slow to make a insert on it.
My idea is to keep only 6 months of log in the table and all older records I want to copy on a archive table that is not accessible by the application and if anyone need some info, just do a select direct on the table.
So, the idea is to create a daily job like this:

SELECT * INTO audlog_backup FROM audlog WHERE XDATE <= DATEADD(day,-185,GETDATE())

DELETE FROM audlog WHERE XDATE <= DATEADD(day,-185,GETDATE())


But because of the size of the table, this will take ages and will slow down or even lock the database for some time on the first run, after all most of the rows will be older then 6 months.

Whats the recommendation for doing it safely and faster as possible for the first time?

More information:

SQL Server 2012 Standard

Script of the create table:

CREATE TABLE [dbo].[audlog_backup](
    [PVKEY] [float] NULL,
    [DKEY] [varchar](14) NULL,
    [XDATE] [datetime] NULL,
    [XTIME] [varchar](8) NULL,
    [XFROM] [varchar](50) NULL,
    [XTO] [varchar](50) NULL,
    [DETAILS] [text] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
SET ANSI_PADDING ON
ALTER TABLE [dbo].[audlog_backup] ADD [XUSER] [varchar](50) NULL
ALTER TABLE [dbo].[audlog_backup] ADD [PEOPLEKEY] [int] NULL
ALTER TABLE [dbo].[audlog_backup] ADD [ADDRESSKEY] [int] NULL
ALTER TABLE [dbo].[audlog_backup] ADD [XFILEKEY] [varchar](50) NULL
ALTER TABLE [dbo].[audlog_backup] ADD [XLOGNO] [int] IDENTITY(1,1) NOT NULL
ALTER TABLE [dbo].[audlog_backup] ADD [XTABLE] [varchar](50) NULL
ALTER TABLE [dbo].[audlog_backup] ADD [XFIELD] [varchar](50) NULL
 CONSTRAINT [PK_AUDLOG_backup] PRIMARY KEY NONCLUSTERED 
(
    [XLOGNO] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

Solution

You can use the following Stored Procedure, scheduled via a SQL Server Agent job, to archive records on a continual basis. You don't need to worry about the initial move of "everything older than 6 months" as it will be taken care of naturally by this process since it will be able to archive records faster than you are inserting them. Of course, if you want to get a large chunk of those rows archived immediately, you can run this Stored Procedure manually (while the job is not running), and just pass in parameter values that are slightly higher for @BatchSize and @SecondsToRun.

Important Notes:

  • Scheduled this to run via SQL Server Agent, probably once per hour



  • Set defaults for @BatchSize and @SecondsToRun to match how fast data is coming into the table



  • Create a NonClustered Index on dbo.AudLog ([XDATE] ASC)



  • Assuming that XLOGNO is in the main AudLog table, remove the IDENTITY specification from the XLOGNO column here (in AudLog_Backup) as that value comes from the main AudLog table



  • Change the datatype of the DETAILS column to be VARCHAR(MAX): text has been deprecated since the release of SQL Server 2005 and is more difficult to work with.



Minor notes:

  • Running this Stored Procedure, at least via the SQL Server Agent job if not also manually at first to get a large chuck of rows moved over, should be less work than swapping the table out, and there is no period of time when the AudLog table is empty.



  • The Stored Procedure will simply exit if there is nothing to archive, so it doesn't hurt to run once per hour, or maybe every 2 - 3 hours (depending on how fast new rows come in).



  • The Stored Procedure is throttled to exit after a certain number of seconds so that if it is causing contention on the main AudLog table, it will only be for short periods of time.



  • The OUTPUT clause binds the DELETE from the main AudLog table and the INSERT into the AudLog_Backup table in a Transaction, so you don't need to worry about the process failing, or system shutting down / crashing, between those two operations.



`CREATE PROCEDURE dbo.AudLog_Archive
(
@BatchSize INT = 2000, -- don't go above 3000 to avoid lock escalation
@SecondsToRun INT = 180, -- run for 3 minutes
@DaysToKeep INT = 185 -- archive rows older than 6 months
)
AS
SET NOCOUNT ON;

DECLARE @EndTime DATETIME = DATEADD(SECOND, @SecondsToRun, GETDATE()),
@ArchiveDate DATETIME = DATEADD(DAY, @DaysToKeep * -1, GETDATE()),
@RowsArchived INT = @BatchSize; -- initialize to be able to enter the loop

WHILE (@EndTime > GETDATE() AND @RowsArchived = @BatchSize)
BEGIN

IF (EXISTS(
SELECT 1
FROM dbo.AudLog al
WHERE al.[XDATE]

Context

StackExchange Database Administrators Q#139003, answer score: 9

Revisions (0)

No revisions yet.