HiveBrain v1.2.0
Tags: pattern · sql · Minor

Are there any possibilities to mitigate data loss risk on delayed durability?

Submitted by: @import:stackexchange-dba

Problem

Due to poor disk performance I need to evaluate using the delayed durability setting in SQL Server 2016.

Microsoft clearly states that this comes with a data loss risk, because transactions are committed in memory and not yet hardened to disk.

This risk is not limited to SQL Server crashing; it can also materialize from other causes, for instance if the server is rebooted.

Are there any methods of mitigating the risk by pushing SQL Server to flush data to disk if someone tries to restart the server? I've searched for something like an event trigger, but found nothing.

Do you see other risks aside from the ones I mentioned above that I should take into consideration? This is a standalone SQL Server, not involved in a Failover Cluster and not part of an Availability Group.

The poor performance I mentioned is log latency: the highest waits in this SQL Server instance are CX*, followed by WRITELOG. Right now it takes 3-4 ms to write a transaction to disk. The instance runs in the cloud, and even after activating trace flag 1800 (T1800) the system still suffers from log latency. I suspect the cause lies in virtualization / vSAN firmware and deeper volume settings; we already have a vendor service ticket open, but in case the vendor can't help I have to investigate every possibility that might improve performance.

Solution

Are there any possibilities to mitigate the risk by pushing SQL Server to
flush data to disk if someone tries to restart a server?

SQL Server uses write-ahead logging to guarantee durability, so the log buffers containing committed delayed durability transactions must be flushed to disk to mitigate data loss. The stored procedure sys.sp_flush_log can be used to flush the log buffers of the current database, including any delayed durability transactions. Below is the relevant guidance from the documentation:

If you choose to use delayed transaction durability because of the
performance benefits, but you also want to have a guaranteed limit on
the amount of data that is lost on server crash or failover, then
execute sys.sp_flush_log on a regular schedule. For example, if you
want to make sure you don't lose more than n seconds worth of data,
you would execute sp_flush_log every n seconds.

Example script, executed via a SQL Server Agent job:

USE YourDatabase; -- sp_flush_log flushes the log of the current database
DECLARE @RPO datetime = '00:00:05'; -- no more than 5 seconds of data loss
WHILE 1 = 1
BEGIN
    EXEC sys.sp_flush_log; -- harden committed delayed durability transactions
    WAITFOR DELAY @RPO;
END;
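
For context, delayed durability itself is controlled at the database level and, optionally, per transaction. A minimal sketch of enabling it (YourDatabase, dbo.SomeTable, and the column are placeholders):

ALTER DATABASE YourDatabase SET DELAYED_DURABILITY = ALLOWED; -- or FORCED for all transactions

-- With ALLOWED, individual transactions opt in at commit time:
BEGIN TRANSACTION;
    UPDATE dbo.SomeTable SET SomeColumn = 1 WHERE Id = 42; -- hypothetical table
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);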


I don't believe there is a way to automatically run the proc at shutdown, and even if there were, data loss would still be possible with delayed durability. The feature is not intended to address "bad log disk performance" but to reduce log latency in scenarios where some data loss is acceptable.

As an aside, slow log writes are typically an infrastructure issue. Even 3 ms to 4 ms is tolerable for many applications; however, I can't say whether that is acceptable for your workload. If it is not, your cloud provider may have options other than delayed durability, which is a heavy hammer to wield.
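
If you want to verify the log-write latency figures before and after any infrastructure change, sys.dm_io_virtual_file_stats exposes cumulative write stalls per file. A rough sketch (the numbers are cumulative since instance startup, so sample twice and take the difference for current behavior):

SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_writes,
       vfs.io_stall_write_ms,
       1.0 * vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id
   AND mf.file_id = vfs.file_id
WHERE mf.type_desc = N'LOG' -- restrict to transaction log files
ORDER BY avg_write_stall_ms DESC;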

Code Snippets

USE YourDatabase; -- sp_flush_log flushes the log of the current database
DECLARE @RPO datetime = '00:00:05'; -- no more than 5 seconds of data loss
WHILE 1 = 1
BEGIN
    EXEC sys.sp_flush_log; -- harden committed delayed durability transactions
    WAITFOR DELAY @RPO;
END;

Context

StackExchange Database Administrators Q#332662, answer score: 8

Revisions (0)

No revisions yet.