snippetsqlMinor
sys.partitions row count is badly wrong - how to correct this?
Viewed 0 times
thisbadlypartitionssyscorrectwronghowcountrow
Problem
Querying sys.partitions can return an approximate row count for a table.
I've noticed that this is returning the same row count for all partitions, regardless of the actual content (even for empty partitions).
The table has a clustered columnstore index and statistics have been created on almost all of the columns. Statistics are updated daily after each data load. The table is partitioned by date.
sys.partitions query:
table query:
Sample results from both queries:
I've noticed that this is returning the same row count for all partitions, regardless of the actual content (even for empty partitions).
The table has a clustered columnstore index and statistics have been created on almost all of the columns. Statistics are updated daily after each data load. The table is partitioned by date.
sys.partitions query:
SELECT convert(date, convert(varchar,rv.[value])) as partitionDate, p.rows as syspartitions_RowCount
FROM sys.tables t
join sys.schemas sc on sc.schema_id = t.schema_id
JOIN sys.partitions p ON p.[object_id] = t.[object_id]
JOIN sys.indexes i ON i.[object_id] = p.[object_id]
AND i.[index_id] = p.[index_id]
JOIN sys.data_spaces ds ON ds.[data_space_id] = i.[data_space_id]
LEFT JOIN sys.partition_schemes ps ON ps.[data_space_id] = ds.[data_space_id]
LEFT JOIN sys.partition_functions pf ON pf.[function_id] = ps.[function_id]
LEFT JOIN sys.partition_range_values rv ON rv.[function_id] = pf.[function_id]
AND rv.[boundary_id]+1 = p.[partition_number]
WHERE p.[index_id] '2016-05-31'
order by convert(date, convert(varchar,rv.[value])),
t.[name]table query:
select date, count_big(*) as real_count
from temp.tbl
where date > '2016-05-31'
group by date
order by dateSample results from both queries:
Solution
Try using
For Azure SQL Data Warehouse, you'll need to use
Note, I removed the
In the on-prem version of SQL Server,
The on-prem version of
While both
sys.dm_db_partition_stats instead of sys.partitions, as in:SELECT ObjectName = QUOTENAME(sc.name) + '.' + QUOTENAME(t.name)
, RangeValue = rv.value
, sys_partitions_RowCount = p.rows
, sys_dm_db_partition_stats_row_count = ddps.row_count
FROM sys.tables t
INNER JOIN sys.schemas sc ON t.schema_id = sc.schema_id
INNER JOIN sys.partitions p ON t.object_id = p.object_id
INNER JOIN sys.indexes i ON t.object_id = i.object_id
AND p.index_id = i.index_id
INNER JOIN sys.data_spaces ds ON ds.data_space_id = i.data_space_id
INNER JOIN sys.partition_schemes ps ON ps.data_space_id = ds.data_space_id
INNER JOIN sys.partition_functions pf ON pf.function_id = ps.function_id
INNER JOIN sys.partition_range_values rv ON rv.function_id = pf.function_id
AND (rv.boundary_id + 1) = p.partition_number
INNER JOIN sys.dm_db_partition_stats ddps ON t.object_id = ddps.object_id
AND p.partition_id = ddps.partition_id
WHERE p.index_id <= 1
and t.name ='tbl'
and sc.name = 'temp'
ORDER BY sc.name
, t.name
, rv.value;For Azure SQL Data Warehouse, you'll need to use
sys.dm_pdw_nodes_db_partition_stats instead of sys.dm_db_partition_stats, even though they contain the same details.Note, I removed the
CONVERT(date,...) functionality so this code is compatible with all partition schemes, not just those with date range values.In the on-prem version of SQL Server,
sys.partitions gets its row counts from the internal table ALUCOUNT or sys.sysrowsets, if ALUCOUNT.rows is NULL. The definition of sys.partitions is:CREATE VIEW sys.partitions AS
SELECT rs.rowsetid AS partition_id
, rs.idmajor AS object_id
, rs.idminor AS index_id
, rs.numpart AS partition_number
, rs.rowsetid AS hobt_id
, isnull(ct.rows, rs.rcrows) AS rows
, rs.fgidfs AS filestream_filegroup_id
, cmprlevel AS data_compression
, cl.name AS data_compression_desc
FROM sys.sysrowsets rs OUTER APPLY OpenRowset(TABLE ALUCOUNT, rs.rowsetid, 0, 0) ct
LEFT JOIN sys.syspalvalues cl ON cl.class = 'CMPL' AND cl.value = cmprlevelThe on-prem version of
sys.dm_db_partition_stats gets its row counts differently, from the internal table, PARTITIONCOUNTS:CREATE VIEW sys.dm_db_partition_stats AS
SELECT c.partition_id
, i.object_id
, i.index_id
, c.partition_number
, c.in_row_data_page_count
, c.in_row_used_page_count
, c.in_row_reserved_page_count
, c.lob_used_page_count
, c.lob_reserved_page_count
, c.row_overflow_used_page_count
, c.row_overflow_reserved_page_count
, c.used_page_count
, c.reserved_page_count
, c.row_count
FROM sys.indexes$ i
CROSS APPLY OpenRowSet(TABLE PARTITIONCOUNTS, i.object_id, i.index_id, i.rowset) cWhile both
sys.partitions and sys.dm_db_partition_stats should both have correct row counts, I'd put more trust in the PARTITIONCOUNTS internal table.Code Snippets
SELECT ObjectName = QUOTENAME(sc.name) + '.' + QUOTENAME(t.name)
, RangeValue = rv.value
, sys_partitions_RowCount = p.rows
, sys_dm_db_partition_stats_row_count = ddps.row_count
FROM sys.tables t
INNER JOIN sys.schemas sc ON t.schema_id = sc.schema_id
INNER JOIN sys.partitions p ON t.object_id = p.object_id
INNER JOIN sys.indexes i ON t.object_id = i.object_id
AND p.index_id = i.index_id
INNER JOIN sys.data_spaces ds ON ds.data_space_id = i.data_space_id
INNER JOIN sys.partition_schemes ps ON ps.data_space_id = ds.data_space_id
INNER JOIN sys.partition_functions pf ON pf.function_id = ps.function_id
INNER JOIN sys.partition_range_values rv ON rv.function_id = pf.function_id
AND (rv.boundary_id + 1) = p.partition_number
INNER JOIN sys.dm_db_partition_stats ddps ON t.object_id = ddps.object_id
AND p.partition_id = ddps.partition_id
WHERE p.index_id <= 1
and t.name ='tbl'
and sc.name = 'temp'
ORDER BY sc.name
, t.name
, rv.value;CREATE VIEW sys.partitions AS
SELECT rs.rowsetid AS partition_id
, rs.idmajor AS object_id
, rs.idminor AS index_id
, rs.numpart AS partition_number
, rs.rowsetid AS hobt_id
, isnull(ct.rows, rs.rcrows) AS rows
, rs.fgidfs AS filestream_filegroup_id
, cmprlevel AS data_compression
, cl.name AS data_compression_desc
FROM sys.sysrowsets rs OUTER APPLY OpenRowset(TABLE ALUCOUNT, rs.rowsetid, 0, 0) ct
LEFT JOIN sys.syspalvalues cl ON cl.class = 'CMPL' AND cl.value = cmprlevelCREATE VIEW sys.dm_db_partition_stats AS
SELECT c.partition_id
, i.object_id
, i.index_id
, c.partition_number
, c.in_row_data_page_count
, c.in_row_used_page_count
, c.in_row_reserved_page_count
, c.lob_used_page_count
, c.lob_reserved_page_count
, c.row_overflow_used_page_count
, c.row_overflow_reserved_page_count
, c.used_page_count
, c.reserved_page_count
, c.row_count
FROM sys.indexes$ i
CROSS APPLY OpenRowSet(TABLE PARTITIONCOUNTS, i.object_id, i.index_id, i.rowset) cContext
StackExchange Database Administrators Q#181740, answer score: 6
Revisions (0)
No revisions yet.