HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

SQL Server: Identifying object closest to the end of a datafile

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thedatafilesqlclosestserverendobjectidentifying

Problem

TL\DR

I'm looking for a way to efficiently identify the object located closest to the end of a SQL Server data file. This approach needs to remain performant against large data files.

What I have so far

The following query utilizes an undocumented Dynamic Management Function that shipped with SQL 2012: sys.dm_db_database_page_allocations; this DMF provides a rough equivalent to the DBCC IND command.

The following query identifies the last object in a given data file (Warning: Don't run this against a database larger than 25 GB unless you want to cancel it at some point):

-- Return object with highest Extent Page ID
SELECT   files.name as logical_file_name
        , files.physical_name as physical_file_name
        , OBJECT_SCHEMA_NAME(object_id) + N'.' + OBJECT_NAME(object_id) AS object_name
        , alloc.*
FROM sys.dm_db_database_page_allocations(DB_ID(), NULL, NULL, NULL, NULL) alloc
    INNER JOIN sys.database_files files
        ON alloc.extent_file_id = files.file_id
WHERE is_allocated = 1
    AND files.name = 'Logical_FileName'
ORDER BY files.name , files.physical_name, extent_page_id DESC


What's wrong with this approach

As the Warning above implies, this query will run slower as the size of the database increases because the function is really designed for a pointed approach to look at a specific object and not look at a specific data file in question. When passing in the NULL parameters as I did, this function likely iterates through all objects within the database behind the scenes and spits out the combined output. This accomplishes what I need, but it does so in a very brute-force way that doesn't lend itself to optimizations.

What I'm asking for

I'm hoping there's a way to iterate through the GAM, SGAM, and/or IAM chains to quickly identify the object at the end of a given data file. I'm assuming I have to push this approach outside of TSQL to something like PowerShell and go back to using DBCC PAGE calls, or something of tha

Solution

I think this will do it. This code basically does the following:

  • Retrieve highest allocated page ID from the last GAM interval in the file



  • Retrieve highest allocated page ID from the last SGAM interval in the file



  • Compare two values to find highest page



  • Identify the last ObjectId (table) from the Last allocated page



  • Identify the Index defined on the Object as well as it's partition



  • Provide a DBCC SHRINKFILE command that will release only the remaining white-space at the end of the file back to the OS (which should be immediate) and is effectively equivalent to DBCC SHRINKFILE using TRUNCATEONLY



This is nested in a cursor that iterates through Page IDs of the data files within the database and executes pretty quickly based on my localized testing. I've also added functionality to identify if the end of a data file is occupied by pages that are not reserved by tables or indexes, such as an IAM or PFS page.

```
SET NOCOUNT ON;

-- Create Temp Table to push DBCC PAGE results into
CREATE TABLE #dbccPage_output(
ID INT IDENTITY(1,1)
, [ParentObject] VARCHAR(255)
, [Object] VARCHAR(255)
, [Field] VARCHAR(255)
, [Value] VARCHAR(255)
)
GO

-- Variables to hold pointer information for traversing GAM and SGAM pages
DECLARE @GAM_maxPageID INT, @SGAM_maxPageID INT, @maxPageID INT,
@GAM_page INT, @SGAM_page INT
DECLARE @stmt VARCHAR(2000)

-- Final Output Table
DECLARE @myOutputTable TABLE
(
LogicalFileName VARCHAR(255)
, ObjectID BIGINT
, IndexID BIGINT
, PartitionID BIGINT
, MaxPageID BIGINT
)

-- Cursor to iterate through each file
DECLARE cursorFileIds CURSOR
FOR
SELECT file_id, size
FROM sys.database_files
WHERE type = 0

-- Variable to hold fileID
DECLARE @fileID INT, @size INT, @interval INT

-- Inject the data into the cursor
OPEN cursorFileIds
FETCH NEXT FROM cursorFileIds
INTO @fileID, @size

-- Enter the While Loop. This loop will end when the
-- end of the data injected into the cursor is reached.
WHILE @@FETCH_STATUS = 0
BEGIN
-- Truncate table (mainly used for 2nd pass and forward)
TRUNCATE TABLE #dbccPage_output

-- Referenced if we need to step back a GAM/SGAM interval
STEPBACK:

-- # of pages in a GAM interval
SET @interval = @size / 511232
-- Set GAM Page to read
SET @GAM_page = CASE @interval WHEN 0 THEN 2 ELSE @interval * 511232 END
-- Set SGAM page to read (always the next page after the GAM)
SET @SGAM_page = CASE @interval WHEN 0 THEN 3 ELSE (@interval * 511232) + 1 END

-- Search Last GAM Interval page
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@GAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- GAM on Primary Datafile
PRINT @stmt

INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)

-- Get Last Allocated Page Number
SELECT TOP 1
@GAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
FROM #dbccPage_output
WHERE [Value] = ' ALLOCATED'
ORDER BY ID DESC

-- Truncate Table
TRUNCATE TABLE #dbccPage_output

-- Search Last SGAM Interval page
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@SGAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- SGAM on Primary Datafile
PRINT @stmt

INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)

-- Get Last Allocated Page Number
SELECT TOP 1
@SGAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
FROM #dbccPage_output
WHERE [Value] = ' ALLOCATED'
ORDER BY ID DESC

-- Get highest page value between SGAM and GAM
SELECT @maxPageID = MAX(t.value)
FROM (VALUES (@GAM_maxPageID), (@SGAM_maxPageID)) t(value)

TRUNCATE TABLE #dbccPage_output

-- Check if GAM or SGAM is last allocated page in the chain, if so, step back one interval
IF(@maxPageID IN (@GAM_page, @SGAM_page))
BEGIN
SET @size = ABS(@size - 511232)
GOTO STEPBACK
END

-- Search Highest Page Number of Data File
SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(CASE WHEN @maxPageID = @SGAM_maxPageID THEN @maxPageID + 7 ELSE @maxPageID END AS VARCHAR(50)) + ', 1) WITH TABLERESULTS, NO_INFOMSGS' -- Page ID of Last Allocated Object
PRINT @stmt

INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
EXEC (@stmt)

-- Captu

Code Snippets

SET NOCOUNT ON;

-- Create Temp Table to push DBCC PAGE results into
CREATE TABLE #dbccPage_output(
      ID                INT IDENTITY(1,1)
    , [ParentObject]    VARCHAR(255)
    , [Object]          VARCHAR(255)
    , [Field]           VARCHAR(255)
    , [Value]           VARCHAR(255)
)
GO

-- Variables to hold pointer information for traversing GAM and SGAM pages
DECLARE @GAM_maxPageID INT, @SGAM_maxPageID INT, @maxPageID INT,
        @GAM_page INT, @SGAM_page INT
DECLARE @stmt VARCHAR(2000)

-- Final Output Table
DECLARE @myOutputTable TABLE
(
      LogicalFileName   VARCHAR(255)
    , ObjectID          BIGINT
    , IndexID           BIGINT
    , PartitionID       BIGINT
    , MaxPageID         BIGINT
)

-- Cursor to iterate through each file
DECLARE cursorFileIds CURSOR
FOR
        SELECT file_id, size
        FROM sys.database_files
        WHERE type = 0

-- Variable to hold fileID
DECLARE @fileID INT, @size INT, @interval INT

-- Inject the data into the cursor
OPEN cursorFileIds
FETCH NEXT FROM cursorFileIds
INTO @fileID, @size

-- Enter the While Loop.  This loop will end when the
--  end of the data injected into the cursor is reached.
WHILE @@FETCH_STATUS = 0
BEGIN
        -- Truncate table (mainly used for 2nd pass and forward)
        TRUNCATE TABLE #dbccPage_output

        -- Referenced if we need to step back a GAM/SGAM interval
        STEPBACK:

        -- # of pages in a GAM interval
        SET @interval = @size / 511232
        -- Set GAM Page to read
        SET @GAM_page = CASE @interval WHEN 0 THEN 2 ELSE @interval * 511232 END
        -- Set SGAM page to read (always the next page after the GAM)
        SET @SGAM_page = CASE @interval WHEN 0 THEN 3 ELSE (@interval * 511232) + 1 END

        -- Search Last GAM Interval page
        SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@GAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- GAM on Primary Datafile
        PRINT @stmt

        INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
        EXEC (@stmt)

        -- Get Last Allocated Page Number
        SELECT TOP 1
                @GAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX(':', REVERSE(Field)) - CHARINDEX(')', REVERSE(Field)) - 1))
        FROM #dbccPage_output
        WHERE [Value] = '    ALLOCATED'
        ORDER BY ID DESC

        -- Truncate Table
        TRUNCATE TABLE #dbccPage_output

        -- Search Last SGAM Interval page
        SET @stmt = 'DBCC PAGE(0, ' + CAST(@fileID AS VARCHAR(10)) + ', ' + CAST(@SGAM_page AS VARCHAR(20)) + ', 3) WITH TABLERESULTS, NO_INFOMSGS' -- SGAM on Primary Datafile
        PRINT @stmt

        INSERT INTO #dbccPage_output ([ParentObject], [Object], [Field], [Value])
        EXEC (@stmt)

        -- Get Last Allocated Page Number
        SELECT TOP 1
                @SGAM_maxPageID = REVERSE(SUBSTRING(REVERSE(Field), CHARINDEX(')', REVERSE(Field)) + 1, CHARINDEX('

Context

StackExchange Database Administrators Q#187446, answer score: 4

Revisions (0)

No revisions yet.