patternsqlModerate
IO_STALL question and understanding
Viewed 0 times
andunderstandingquestionio_stall
Problem
I am collecting IO_STALLS from sys.dm_io_virtual_file_stats every 5minutes and then doing a delta to see which files are being affected most by IO.
In one 5min period I get a delta of 5826331 ms which is 97minutes.
I am a little confused by this, is this saying that an operation started 97mins ago only just finished at that point and hence recorded that wait time?
Thanks
Added code as requested:
```
/*
USE [SysDBA]
GO
*/
/ Object: Table [dbo].[DISKIOPS] Script Date: 04/07/2013 11:40:15 /
/*
DROP TABLE [dbo].[DISKIOPS]
GO
*/
--Create the table
/ Object: Table [dbo].[DISKIOPS] Script Date: 04/07/2013 11:40:15 /
/*
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].DISKIOPS NULL,
[NAME] varchar NULL,
[FILE_ID] [int] NULL,
[DB_FILE_TYPE] varchar NULL,
[DISK] varchar NULL,
[FILE_LOCATION] varchar NULL,
[TIMESTAMP] [datetime] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
*/
--Capture IO information from DMV and query to find deltas over time.
/*
USE [SysDBA]
GO
INSERT INTO [dbo].[DISKIOPS]
([IO_STALL]
,[IO_STALL_READ_MS]
,[IO_STALL_WRITE_MS]
,[NUM_OF_READS]
,[NUM_OF_WRITES]
,[SIZE_ON_DISK_MB]
,[DBNAME]
,[NAME]
,[FILE_ID]
,[DB_FILE_TYPE]
,[DISK]
,[FILE_LOCATION]
,[TIMESTAMP])
SELECT a.io_stall, a.io_stall_read_ms, a.io_stall_write_ms, a.num_of_reads,
a.num_of_writes,
--a.sample_ms, a.num_of_bytes_read, a.num_of_bytes_written,
( ( a.size_on_disk_bytes / 1024 ) / 1024.0 ) AS size_on_disk_mb,
db_name(a.database_id) AS dbname,
b.nam
In one 5min period I get a delta of 5826331 ms which is 97minutes.
I am a little confused by this, is this saying that an operation started 97mins ago only just finished at that point and hence recorded that wait time?
Thanks
Added code as requested:
```
/*
USE [SysDBA]
GO
*/
/ Object: Table [dbo].[DISKIOPS] Script Date: 04/07/2013 11:40:15 /
/*
DROP TABLE [dbo].[DISKIOPS]
GO
*/
--Create the table
/ Object: Table [dbo].[DISKIOPS] Script Date: 04/07/2013 11:40:15 /
/*
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].DISKIOPS NULL,
[NAME] varchar NULL,
[FILE_ID] [int] NULL,
[DB_FILE_TYPE] varchar NULL,
[DISK] varchar NULL,
[FILE_LOCATION] varchar NULL,
[TIMESTAMP] [datetime] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
*/
--Capture IO information from DMV and query to find deltas over time.
/*
USE [SysDBA]
GO
INSERT INTO [dbo].[DISKIOPS]
([IO_STALL]
,[IO_STALL_READ_MS]
,[IO_STALL_WRITE_MS]
,[NUM_OF_READS]
,[NUM_OF_WRITES]
,[SIZE_ON_DISK_MB]
,[DBNAME]
,[NAME]
,[FILE_ID]
,[DB_FILE_TYPE]
,[DISK]
,[FILE_LOCATION]
,[TIMESTAMP])
SELECT a.io_stall, a.io_stall_read_ms, a.io_stall_write_ms, a.num_of_reads,
a.num_of_writes,
--a.sample_ms, a.num_of_bytes_read, a.num_of_bytes_written,
( ( a.size_on_disk_bytes / 1024 ) / 1024.0 ) AS size_on_disk_mb,
db_name(a.database_id) AS dbname,
b.nam
Solution
Question comment pasted below:
The above is a pretty good example of how you can see monumental and seemingly exaggerated numbers. By itself,
Instead of having this:
You need to divide the stall by I/O operations to get the average stall per I/O (or per read, or write, or whatever granularity you're looking for). In other words, my recommendation would be to modify your query to look something like this:
And then you need to have an extra predicate clause to ensure you're not dividing by zero:
What this basically does is calculate the average
io_stall by itself doesn't mean too much. If in 10 seconds you had 1000 operations stalling for 1 second each, you'll have 1000 seconds of stalls. That would be 16+ minutes of stalls in 10 seconds. You need to correlate this with IO operations...The above is a pretty good example of how you can see monumental and seemingly exaggerated numbers. By itself,
io_stall doesn't really mean anything. You need to know the scale of I/O operations for that cumulative stall.Instead of having this:
SELECT MAX([IO2].[IO_STALL] - [IO1].[IO_STALL])
FROM IOPS IO1 JOIN IOPS IO2 ON IO1.ROW = (IO2.ROW+1)
...You need to divide the stall by I/O operations to get the average stall per I/O (or per read, or write, or whatever granularity you're looking for). In other words, my recommendation would be to modify your query to look something like this:
SELECT
MAX(([IO2].[IO_STALL] - [IO1].[IO_STALL]) / (IO2.NUM_OF_READS + IO2.NUM_OF_WRITES - IO1.NUM_OF_READS - IO1.NUM_OF_WRITES))
FROM IOPS IO1 JOIN IOPS IO2 ON IO1.ROW = (IO2.ROW+1)And then you need to have an extra predicate clause to ensure you're not dividing by zero:
...
WHERE IO1.NAME = IO2.NAME
and (IO2.NUM_OF_READS + IO2.NUM_OF_WRITES - IO1.NUM_OF_READS - IO1.NUM_OF_WRITES) > 0
AND IO1.Disk = @DiskWhat this basically does is calculate the average
io_stall per I/O operation. By itself, a high io_stall could simply mean a higher workload and not necessarily the sign of a problem.Code Snippets
SELECT MAX([IO2].[IO_STALL] - [IO1].[IO_STALL])
FROM IOPS IO1 JOIN IOPS IO2 ON IO1.ROW = (IO2.ROW+1)
...SELECT
MAX(([IO2].[IO_STALL] - [IO1].[IO_STALL]) / (IO2.NUM_OF_READS + IO2.NUM_OF_WRITES - IO1.NUM_OF_READS - IO1.NUM_OF_WRITES))
FROM IOPS IO1 JOIN IOPS IO2 ON IO1.ROW = (IO2.ROW+1)...
WHERE IO1.NAME = IO2.NAME
and (IO2.NUM_OF_READS + IO2.NUM_OF_WRITES - IO1.NUM_OF_READS - IO1.NUM_OF_WRITES) > 0
AND IO1.Disk = @DiskContext
StackExchange Database Administrators Q#45999, answer score: 13
Revisions (0)
No revisions yet.