patternsqlMinor
Progress of SELECT INTO statement
Viewed 0 times
progressintoselectstatement
Problem
Our ETL flow has a long-running SELECT INTO-statement, that's creating a table on the fly, and populating it with several hundred million records.
The statement looks something like
For monitoring purposes, we would like to get a rough idea of the progress of this statement, while it is executing (approx. rowcount, written number of bytes, or similar).
We tried the following to no avail:
Furthermore, we can see the transaction in
I understand that on SQL Server the SELECT INTO-statement is both a DDL and a DML statement in one, and as such, the implicit table creation will be a locking operation. I still think there must be some clever way to obtain some kind of progress information while the statement is running.
The statement looks something like
SELECT ... INTO DestTable FROM SrcTableFor monitoring purposes, we would like to get a rough idea of the progress of this statement, while it is executing (approx. rowcount, written number of bytes, or similar).
We tried the following to no avail:
-- Is blocked by the SELECT INTO statement:
select count(*) from DestTable with (nolock)
-- Returns 0, 0:
select rows, rowmodctr
from sysindexes with (nolock)
where id = object_id('DestTable')
-- Returns 0:
select rows
from sys.partitions
where object_id = object_id('DestTable')Furthermore, we can see the transaction in
sys.dm_tran_active_transactions, but I was not able to find a way to get the count of affected rows on a given transaction_id (something similar to @@ROWCOUNT perhaps, but with the transaction_id as argument).I understand that on SQL Server the SELECT INTO-statement is both a DDL and a DML statement in one, and as such, the implicit table creation will be a locking operation. I still think there must be some clever way to obtain some kind of progress information while the statement is running.
Solution
I suspect that
If you want to see the details, uncomment the first row of items in the
I tested by running the following in one Session and then repeatedly running the query above in another.
rows in sys.partitions is 0 due to not being committed yet. But this does not mean that SQL Server is unaware of what will go there if the Transaction does commit. The key is in remembering that all operations go through the Buffer Pool (i.e. memory) first, regardless of COMMIT or ROLLBACK of the operation. Hence, we can look in sys.dm_os_buffer_descriptors for that info:SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT --OBJECT_NAME(sp.[object_id]) AS [TableName], sdobd., '---', sp., '---', sau.*
SUM(sdobd.[row_count]) AS [BufferPoolRows],
SUM(sp.[rows]) AS [AllocatedRows],
COUNT(*) AS [DataPages]
FROM sys.dm_os_buffer_descriptors sdobd
INNER JOIN sys.allocation_units sau
ON sau.[allocation_unit_id] = sdobd.[allocation_unit_id]
INNER JOIN sys.partitions sp
ON ( sau.[type] = 1
AND sau.[container_id] = sp.[partition_id]) -- IN_ROW_DATA
OR ( sau.[type] = 2
AND sau.[container_id] = sp.[hobt_id]) -- LOB_DATA
OR ( sau.[type] = 3
AND sau.[container_id] = sp.[partition_id]) -- ROW_OVERFLOW_DATA
WHERE sdobd.[database_id] = DB_ID()
AND sdobd.[page_type] = N'DATA_PAGE'
AND sp.[object_id] = (SELECT so.[object_id]
FROM sys.objects so
WHERE so.[name] = 'TestDump')
If you want to see the details, uncomment the first row of items in the
SELECT list, comment out the remaining 3 lines.I tested by running the following in one Session and then repeatedly running the query above in another.
SELECT so1.*
INTO dbo.TestDump
FROM sys.objects so1
CROSS JOIN sys.objects so2
CROSS JOIN sys.objects so3;
Context
StackExchange Database Administrators Q#129090, answer score: 8
Revisions (0)
No revisions yet.