patternsqlMinor
SQL Server: Identifying object closest to the end of a datafile
Viewed 0 times
thedatafilesqlclosestserverendobjectidentifying
Problem
TL\DR
I'm looking for a way to efficiently identify the object located closest to the end of a SQL Server data file. This approach needs to remain performant against large data files.
What I have so far
The following query utilizes an undocumented Dynamic Management Function that shipped with SQL 2012:
The following query identifies the last object in a given data file (Warning: Don't run this against a database larger than 25 GB unless you want to cancel it at some point):
What's wrong with this approach
As the Warning above implies, this query will run slower as the size of the database increases because the function is really designed for a pointed approach to look at a specific object and not look at a specific data file in question. When passing in the
What I'm asking for
I'm hoping there's a way to iterate through the GAM, SGAM, and/or IAM chains to quickly identify the object at the end of a given data file. I'm assuming I have to push this approach outside of TSQL to something like PowerShell and go back to using DBCC PAGE calls, or something of tha
I'm looking for a way to efficiently identify the object located closest to the end of a SQL Server data file. This approach needs to remain performant against large data files.
What I have so far
The following query utilizes an undocumented Dynamic Management Function that shipped with SQL 2012:
sys.dm_db_database_page_allocations; this DMF provides a rough equivalent to the DBCC IND command.The following query identifies the last object in a given data file (Warning: Don't run this against a database larger than 25 GB unless you want to cancel it at some point):
-- Return object with highest Extent Page ID
SELECT files.name as logical_file_name
, files.physical_name as physical_file_name
, OBJECT_SCHEMA_NAME(object_id) + N'.' + OBJECT_NAME(object_id) AS object_name
, alloc.*
FROM sys.dm_db_database_page_allocations(DB_ID(), NULL, NULL, NULL, NULL) alloc
INNER JOIN sys.database_files files
ON alloc.extent_file_id = files.file_id
WHERE is_allocated = 1
AND files.name = 'Logical_FileName'
ORDER BY files.name , files.physical_name, extent_page_id DESCWhat's wrong with this approach
As the Warning above implies, this query will run slower as the size of the database increases because the function is really designed for a pointed approach to look at a specific object and not look at a specific data file in question. When passing in the
NULL parameters as I did, this function likely iterates through all objects within the database behind the scenes and spits out the combined output. This accomplishes what I need, but it does so in a very brute-force way that doesn't lend itself to optimizations.What I'm asking for
I'm hoping there's a way to iterate through the GAM, SGAM, and/or IAM chains to quickly identify the object at the end of a given data file. I'm assuming I have to push this approach outside of TSQL to something like PowerShell and go back to using DBCC PAGE calls, or something of tha
Solution
I think this will do it. This code basically does the following:
This is nested in a cursor that iterates through Page IDs of the data files within the database and executes pretty quickly based on my localized testing. I've also added functionality to identify if the end of a data file is occupied by pages that are not reserved by tables or indexes, such as an IAM or PFS page.
```
SET NOCOUNT ON;
-- Create Temp Table to push DBCC PAGE results into
CREATE TABLE #dbccPage_output(
ID INT IDENTITY(1,1)
, [ParentObject] VARCHAR(255)
, [Object] VARCHAR(255)
, [Field] VARCHAR(255)
, [Value] VARCHAR(255)
)
GO
-- Variables to hold pointer information for traversing GAM and SGAM pages
DECLARE @GAM_maxPageID INT, @SGAM_maxPageID INT, @maxPageID INT,
@GAM_page INT, @SGAM_page INT
DECLARE @stmt VARCHAR(2000)
-- Final Output Table
DECLARE @myOutputTable TABLE
(
LogicalFileName VARCHAR(255)
, ObjectID BIGINT
, IndexID BIGINT
, PartitionID BIGINT
, MaxPageID BIGINT
)
-- Cursor to iterate through each file
DECLARE cursorFileIds CURSOR
FOR
SELECT file_id, size
FROM sys.database_files
WHERE type = 0
-- Variable to hold fileID
DECLARE @fileID INT, @size INT, @interval INT
-- Inject the data into the cursor
OPEN cursorFileIds
FETCH NEXT FROM cursorFileIds
INTO @fileID, @size
-- Enter the While Loop. This loop will end when the
-- end of the data injected into the cursor is reached.
WHILE @@FETCH_STATUS = 0
BEGIN
-- Truncate table (mainly used for 2nd pass and forward)
TRUNCATE TABLE #dbccPage_output
-- Referenced if we need to step back a GAM/SGAM interval
STEPBACK:
-- # of pages in a GAM interval
SET @interval = @size / 511232
-- Set GAM Page to read
SET @GAM_page = CASE @interval WHEN 0 THEN 2 ELSE @interval * 511232 END
-- Set SGAM page to read (always the next page after the GAM)
SET @SGAM_page = CASE @interval WHEN 0 THEN 3 ELSE (@interval * 511232) + 1 END
-- Search Last GAM Interval page
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@GAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- GAM on Primary Datafile
PRINT @stmt
INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)
-- Get Last Allocated Page Number
SELECT TOP 1
@GAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
FROM #dbccPage_output
WHERE [Value] = ' ALLOCATED'
ORDER BY ID DESC
-- Truncate Table
TRUNCATE TABLE #dbccPage_output
-- Search Last SGAM Interval page
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@SGAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- SGAM on Primary Datafile
PRINT @stmt
INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)
-- Get Last Allocated Page Number
SELECT TOP 1
@SGAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
FROM #dbccPage_output
WHERE [Value] = ' ALLOCATED'
ORDER BY ID DESC
-- Get highest page value between SGAM and GAM
SELECT @maxPageID = MAX(t.value)
FROM (VALUES (@GAM_maxPageID), (@SGAM_maxPageID)) t(value)
TRUNCATE TABLE #dbccPage_output
-- Check if GAM or SGAM is last allocated page in the chain, if so, step back one interval
IF(@maxPageID IN (@GAM_page, @SGAM_page))
BEGIN
SET @size = ABS(@size - 511232)
GOTO STEPBACK
END
-- Search Highest Page Number of Data File
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(CASE WHEN @maxPageID = @SGAM_maxPageID THEN @maxPageID + 7 ELSE @maxPageID END AS VARCHAR(50)) + ', 1) WITH TABLERESULTS, NO_INFOMSGS' -- Page ID of Last Allocated Object
PRINT @stmt
INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)
-- Captu
- Retrieve highest allocated page ID from the last GAM interval in the file
- Retrieve highest allocated page ID from the last SGAM interval in the file
- Compare two values to find highest page
- Identify the last ObjectId (table) from the Last allocated page
- Identify the Index defined on the Object as well as it's partition
- Provide a
DBCC SHRINKFILEcommand that will release only the remaining white-space at the end of the file back to the OS (which should be immediate) and is effectively equivalent toDBCC SHRINKFILEusingTRUNCATEONLY
This is nested in a cursor that iterates through Page IDs of the data files within the database and executes pretty quickly based on my localized testing. I've also added functionality to identify if the end of a data file is occupied by pages that are not reserved by tables or indexes, such as an IAM or PFS page.
```
SET NOCOUNT ON;
-- Create Temp Table to push DBCC PAGE results into
CREATE TABLE #dbccPage_output(
ID INT IDENTITY(1,1)
, [ParentObject] VARCHAR(255)
, [Object] VARCHAR(255)
, [Field] VARCHAR(255)
, [Value] VARCHAR(255)
)
GO
-- Variables to hold pointer information for traversing GAM and SGAM pages
DECLARE @GAM_maxPageID INT, @SGAM_maxPageID INT, @maxPageID INT,
@GAM_page INT, @SGAM_page INT
DECLARE @stmt VARCHAR(2000)
-- Final Output Table
DECLARE @myOutputTable TABLE
(
LogicalFileName VARCHAR(255)
, ObjectID BIGINT
, IndexID BIGINT
, PartitionID BIGINT
, MaxPageID BIGINT
)
-- Cursor to iterate through each file
DECLARE cursorFileIds CURSOR
FOR
SELECT file_id, size
FROM sys.database_files
WHERE type = 0
-- Variable to hold fileID
DECLARE @fileID INT, @size INT, @interval INT
-- Inject the data into the cursor
OPEN cursorFileIds
FETCH NEXT FROM cursorFileIds
INTO @fileID, @size
-- Enter the While Loop. This loop will end when the
-- end of the data injected into the cursor is reached.
WHILE @@FETCH_STATUS = 0
BEGIN
-- Truncate table (mainly used for 2nd pass and forward)
TRUNCATE TABLE #dbccPage_output
-- Referenced if we need to step back a GAM/SGAM interval
STEPBACK:
-- # of pages in a GAM interval
SET @interval = @size / 511232
-- Set GAM Page to read
SET @GAM_page = CASE @interval WHEN 0 THEN 2 ELSE @interval * 511232 END
-- Set SGAM page to read (always the next page after the GAM)
SET @SGAM_page = CASE @interval WHEN 0 THEN 3 ELSE (@interval * 511232) + 1 END
-- Search Last GAM Interval page
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@GAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- GAM on Primary Datafile
PRINT @stmt
INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)
-- Get Last Allocated Page Number
SELECT TOP 1
@GAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
FROM #dbccPage_output
WHERE [Value] = ' ALLOCATED'
ORDER BY ID DESC
-- Truncate Table
TRUNCATE TABLE #dbccPage_output
-- Search Last SGAM Interval page
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@SGAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- SGAM on Primary Datafile
PRINT @stmt
INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)
-- Get Last Allocated Page Number
SELECT TOP 1
@SGAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
FROM #dbccPage_output
WHERE [Value] = ' ALLOCATED'
ORDER BY ID DESC
-- Get highest page value between SGAM and GAM
SELECT @maxPageID = MAX(t.value)
FROM (VALUES (@GAM_maxPageID), (@SGAM_maxPageID)) t(value)
TRUNCATE TABLE #dbccPage_output
-- Check if GAM or SGAM is last allocated page in the chain, if so, step back one interval
IF(@maxPageID IN (@GAM_page, @SGAM_page))
BEGIN
SET @size = ABS(@size - 511232)
GOTO STEPBACK
END
-- Search Highest Page Number of Data File
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(CASE WHEN @maxPageID = @SGAM_maxPageID THEN @maxPageID + 7 ELSE @maxPageID END AS VARCHAR(50)) + ', 1) WITH TABLERESULTS, NO_INFOMSGS' -- Page ID of Last Allocated Object
PRINT @stmt
INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)
-- Captu
Code Snippets
SET NOCOUNT ON;
-- Create Temp Table to push DBCC PAGE results into
CREATE TABLE #dbccPage_output(
ID INT IDENTITY(1,1)
, [ParentObject] VARCHAR(255)
, [Object] VARCHAR(255)
, [Field] VARCHAR(255)
, [Value] VARCHAR(255)
)
GO
-- Variables to hold pointer information for traversing GAM and SGAM pages
DECLARE @GAM_maxPageID INT, @SGAM_maxPageID INT, @maxPageID INT,
@GAM_page INT, @SGAM_page INT
DECLARE @stmt VARCHAR(2000)
-- Final Output Table
DECLARE @myOutputTable TABLE
(
LogicalFileName VARCHAR(255)
, ObjectID BIGINT
, IndexID BIGINT
, PartitionID BIGINT
, MaxPageID BIGINT
)
-- Cursor to iterate through each file
DECLARE cursorFileIds CURSOR
FOR
SELECT file_id, size
FROM sys.database_files
WHERE type = 0
-- Variable to hold fileID
DECLARE @fileID INT, @size INT, @interval INT
-- Inject the data into the cursor
OPEN cursorFileIds
FETCH NEXT FROM cursorFileIds
INTO @fileID, @size
-- Enter the While Loop. This loop will end when the
-- end of the data injected into the cursor is reached.
WHILE @@FETCH_STATUS = 0
BEGIN
-- Truncate table (mainly used for 2nd pass and forward)
TRUNCATE TABLE #dbccPage_output
-- Referenced if we need to step back a GAM/SGAM interval
STEPBACK:
-- # of pages in a GAM interval
SET @interval = @size / 511232
-- Set GAM Page to read
SET @GAM_page = CASE @interval WHEN 0 THEN 2 ELSE @interval * 511232 END
-- Set SGAM page to read (always the next page after the GAM)
SET @SGAM_page = CASE @interval WHEN 0 THEN 3 ELSE (@interval * 511232) + 1 END
-- Search Last GAM Interval page
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@GAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- GAM on Primary Datafile
PRINT @stmt
INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)
-- Get Last Allocated Page Number
SELECT TOP 1
@GAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
FROM #dbccPage_output
WHERE [Value] = ' ALLOCATED'
ORDER BY ID DESC
-- Truncate Table
TRUNCATE TABLE #dbccPage_output
-- Search Last SGAM Interval page
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@SGAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- SGAM on Primary Datafile
PRINT @stmt
INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)
-- Get Last Allocated Page Number
SELECT TOP 1
@SGAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX('Context
StackExchange Database Administrators Q#187446, answer score: 4
Revisions (0)
No revisions yet.