snippetsqlModerate

sql server: updating fields on huge table in small chunks: how to get progress/status?

Submitted by: @import:stackexchange-dba·Mar 10, 2026·

Viewed 0 times

sqlfieldsupdatingchunksstatushugeprogressgetsmallhow

Problem

We have a very large (100million row) table, and we need to update a couple of fields on it.

For log shipping, etc, we also, obviously, want to keep it to bite-size transactions.

Will the below do the trick?

And how can we get it to print some output, so we can see progress? (we tried adding a PRINT statement in there, but nothing was output during while loop)

The code is:

DECLARE @CHUNK_SIZE int
SET @CHUNK_SIZE = 10000

UPDATE TOP(@CHUNK_SIZE) [huge-table] set deleted = 0, deletedDate = '2000-01-01'
where deleted is null or deletedDate is null

WHILE @@ROWCOUNT > 0
BEGIN
    UPDATE TOP(@CHUNK_SIZE) [huge-table] set deleted = 0, deletedDate = '2000-01-01'
    where deleted is null or deletedDate is null
END

Solution

I was not aware of this question when I answered the related question ( Are explicit transactions needed in this while loop? ), but for the sake of completeness, I will address this issue here as it was not part of my suggestion in that linked answer.

Since I am suggesting to schedule this via a SQL Agent job (it is 100 million rows, after all), I don't think that any form of sending status messages to the client (i.e. SSMS) will be ideal (though if that is ever a need for other projects, then I agree with Vladimir that using RAISERROR('', 10, 1) WITH NOWAIT; is the way to go).

In this particular case, I would create a status table that can be updated per each loop with the number of rows updated thus far. And it doesn't hurt to throw in the current time to have a heart-beat on the process.

Given that you want to be able to cancel and restart the process, I am weary of wrapping the the UPDATE of the main table with the UPDATE of the status table in an explicit transaction. However, if you feel that the status table is ever out of sync due to cancelling, it is easy to refresh with the current value by simply updating it manually with the COUNT(*) FROM [huge-table] WHERE deleted IS NOT NULL AND deletedDate IS NOT NULL. and there are two tables to UPDATE (i.e. the main table and the status table), we should use an explicit transaction to keep those two tables in sync, yet we do not want to risk having an orphaned transaction if you cancel the process at a point after it has started the transaction but has not committed it. This should be safe to do as long as you don't stop the SQL Agent job.

How can you stop the process without, um, well, stopping it? By asking it to stop :-). Yep. By sending the process a "signal" (similar to kill -3 in Unix), you can request that it stop at the next convenient moment (i.e. when there is no active transaction!) and have it clean itself up all nice and tidy-like.

How can you communicate with the running process in another session? By using the same mechanism that we created for it to communicate its current status back to you: the status table. We just need to add a column that the process will check at the beginning of each loop so that it knows whether to proceed or abort. And since the intent is to schedule this as a SQL Agent job (run every 10 or 20 minutes), we should also check at the very beginning since there is no point in filling a temp table with 1 million rows if the process is just going to exit a moment later and not use any of that data.

DECLARE @BatchRows INT = 1000000,
        @UpdateRows INT = 4995;

IF (OBJECT_ID(N'dbo.HugeTable_TempStatus') IS NULL)
BEGIN
  CREATE TABLE dbo.HugeTable_TempStatus
  (
    RowsUpdated INT NOT NULL, -- updated by the process
    LastUpdatedOn DATETIME NOT NULL, -- updated by the process
    PauseProcess BIT NOT NULL -- read by the process
  );

  INSERT INTO dbo.HugeTable_TempStatus (RowsUpdated, LastUpdatedOn, PauseProcess)
  VALUES (0, GETDATE(), 0);
END;

-- First check to see if we should run. If no, don't waste time filling temp table
IF (EXISTS(SELECT * FROM dbo.HugeTable_TempStatus WHERE PauseProcess = 1))
BEGIN
  PRINT 'Process is paused. No need to start.';
  RETURN;
END;

CREATE TABLE #FullSet (KeyField1 DataType1, KeyField2 DataType2);
CREATE TABLE #CurrentSet (KeyField1 DataType1, KeyField2 DataType2);

INSERT INTO #FullSet (KeyField1, KeyField2)
  SELECT TOP (@BatchRows) ht.KeyField1, ht.KeyField2
  FROM   dbo.HugeTable ht
  WHERE  ht.deleted IS NULL
  OR     ht.deletedDate IS NULL

WHILE (1 = 1)
BEGIN
  -- Check if process is paused. If yes, just exit cleanly.
  IF (EXISTS(SELECT * FROM dbo.HugeTable_TempStatus WHERE PauseProcess = 1))
  BEGIN
    PRINT 'Process is paused. Exiting.';
    BREAK;
  END;

  -- grab a set of rows to update
  DELETE TOP (@UpdateRows)
  FROM   #FullSet
  OUTPUT Deleted.KeyField1, Deleted.KeyField2
  INTO   #CurrentSet (KeyField1, KeyField2);

  IF (@@ROWCOUNT = 0)
  BEGIN
    RAISERROR(N'All rows have been updated!!', 16, 1);
    BREAK;
  END;

  BEGIN TRY
    BEGIN TRAN;

    -- do the update of the main table
    UPDATE ht
    SET    ht.deleted = 0,
           ht.deletedDate = '2000-01-01'
    FROM   dbo.HugeTable ht
    INNER JOIN #CurrentSet cs
            ON cs.KeyField1 = ht.KeyField1
           AND cs.KeyField2 = ht.KeyField2;

    -- update the current status
    UPDATE ts
    SET    ts.RowsUpdated += @@ROWCOUNT,
           ts.LastUpdatedOn = GETDATE()
    FROM   dbo.HugeTable_TempStatus ts;

    COMMIT TRAN;
  END TRY
  BEGIN CATCH
    IF (@@TRANCOUNT > 0)
    BEGIN
      ROLLBACK TRAN;
    END;

    THROW; -- raise the error and terminate the process
  END CATCH;

  -- clear out rows to update for next iteration
  TRUNCATE TABLE #CurrentSet;

  WAITFOR DELAY '00:00:01'; -- 1 second delay for some breathing room
END;

-- clean up temp tables when testing
-- DROP TABLE #FullSet; 
-- DROP TABLE #CurrentSet;

You can then check the status at any time using the foll

Context

StackExchange Database Administrators Q#115175, answer score: 12

Revisions (0)

No revisions yet.