HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Reducing Table Size after Clearing TEXT Column Precautions

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
aftercolumnsizetextreducingclearingprecautionstable

Problem

I have 3rd part app with a table that's 450GB which contains file attachments in a TEXT column (yeah, I know). With a document retention angle, I've exported about 35% of the older attachments to disk using a portable drive. The TEXT column now has a note that attachment was archived (beneficial to the app's UI) - from multiple MB/GB down to less than 1KB.

I know the argument against shrinking the database. Options I've gathered from this forum include (1) copy table, drop original table, rename the copy, (2) export table using xp_cmdshell 'bcp...', truncate table, import data with BULK INSERT from 'c:...' WITH ...; and (3) run ALTER INDEX ALL ON dbo.Table REORGANIZE WITH (LOB_COMPACTION = ON)

My biggest concerns are the resources (disk space, CPU, memory...) SQL Server will use for any option, especially given the size of the table. As for time, I can block off weekend time if needed but what about log files, tempdb files, etc. What settings should I set/change before embarking down any one of these options?

The server has 160GB of ram with 128GB for SQL Server (2016), 2 nodes w/ 20 processors each rated at 2.2 GHz on MS Windows Server 2012 R2 Standard.

Solution

The downsides of your options seem to be:

  • Copy and Renaming tables can mess up stored procedures and views, requiring recompilation. It also requires double the disk space.



  • Exporting using bcp and reimporting it is messy, and you still need all that space, albeit not in SQL Server itself. If the tables are small there is no benefit over option 1 anyway.



  • REORGANIZE WITH (LOB_COMPACTION = ON) can be pretty slow, and may also require extra space. It will also bloat the transaction log due to heavy page and row movement.



A better option than any of the ones you have presented is a table switch.

Similar to the drop/rename idea, you create a new temporary table that is exactly the same as the original. Insert the data to keep into the new table. Then truncate the old one and switch the new one in. Switching is a fast metadata operation, that requires just swapping the table headers.

So let's say you have an existing table
CREATE TABLE MainTable (
id int PRIMARY KEY IDENTITY,
document text
);


You need to copy the schema exactly, this includes primary/unique/foreign/check constraints and indexes. IDENTITY does not need to be there, and is probably easier if you leave it out. You can create a copy of this schema quickly by right-clicking the table in SSMS and selecting Script Table As -> CREATE.
CREATE TABLE CopyTable (
id int PRIMARY KEY,
document text
);


Then you can copy all the data in
BEGIN TRAN;

INSERT CopyTable WITH (TABLOCKX)
(id, document)
SELECT id, document
FROM MainTable WITH (TABLOCK);


And truncate the old table and switch
TRUNCATE TABLE MainTable;

ALTER TABLE CopyTable
SWITCH TO MainTable;

-- if you have an IDENTITY column then:
DBCC CHECKIDENT ('MainTable');

COMMIT;


Finally drop the copy table, which is now empty:
DROP TABLE CopyTable;


You can move around or remove the BEGIN TRAN and COMMIT depending on what else is running concurrently.

I recommend while you are at it, to change the column data type from text (which is deprecated) to varchar(max). As far as I am aware they implicitly convert to each other, and therefore this should not affect the application at all. varchar(max) has the benefit of storing <8kb values within the row, rather than on LOB pages.

I suggest you do this before copying the table.

Context

StackExchange Database Administrators Q#314886, answer score: 5

Revisions (0)

No revisions yet.