What type of operations are seen the most at the physical disk level — reads or writes? Why?

Submitted by: @import:stackexchange-cs

Problem

This question came up in my operating systems class in a section about file system cache and RAID.

I'm speculating that the answer is that writes are seen more at the physical disk level because an efficient cache reduces the need to read from the disk.

Can someone confirm or correct this?

Solution

The ratio of reads to writes depends on the workload and the system. Before any filtering by caches, reads are typically more common, if only because code is read-only and data writes generally depend on at least as many reads.
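To make the filtering effect concrete, here is a minimal sketch of how a block cache changes the read/write mix that reaches the disk. The function name `disk_traffic`, the write-through policy, and the synthetic trace are all assumptions chosen for illustration; a real OS cache usually buffers writes as well.

```python
from collections import OrderedDict

def disk_traffic(trace, cache_blocks):
    """Count disk reads and writes behind a write-through LRU block cache.

    trace: iterable of (op, block) pairs, where op is "r" or "w".
    Assumption: write-through, so every write reaches the disk, while a
    read reaches the disk only on a cache miss.
    """
    cache = OrderedDict()  # block -> True, least recently used first
    disk_reads = disk_writes = 0
    for op, block in trace:
        if op == "r":
            if block in cache:
                cache.move_to_end(block)  # hit: no disk access
                continue
            disk_reads += 1               # miss: fetch from disk
        else:
            disk_writes += 1              # write-through: always hits disk
        cache[block] = True
        cache.move_to_end(block)
        if len(cache) > cache_blocks:
            cache.popitem(last=False)     # evict the least recently used block
    return disk_reads, disk_writes

# Hypothetical read-heavy workload: 4 logical reads per write over 8 hot blocks.
trace = [("w" if i % 5 == 4 else "r", i % 8) for i in range(1000)]
print(disk_traffic(trace, cache_blocks=0))   # (800, 200): no cache, reads dominate
print(disk_traffic(trace, cache_blocks=16))  # (7, 200): cache absorbs the reads
```

With this particular trace the disk sees mostly writes once the working set fits in the cache, which matches the questioner's intuition. A trace whose working set exceeds the cache would shift the balance back toward reads.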

Under paging, evicting a clean page (one whose contents already match the copy on storage) requires no write-back, so the OS has an incentive to evict such pages first. This biases storage-level traffic toward reads (page-ins). (Letting dirty pages accumulate endangers responsiveness, however: eventually the best page to replace is dirty, and a demand for a non-resident page is then delayed by the required write-back. Maintaining a modest free list would ameliorate this issue.)
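The effect of preferring clean victims can be sketched with a toy pager. Everything here is an assumption for illustration: the name `simulate_paging`, LRU as the baseline policy, and the four-access trace. Dirty pages still resident at the end are not counted as write-backs.

```python
from collections import OrderedDict

def simulate_paging(trace, frames, prefer_clean):
    """Return (page_ins, write_backs) for a trace of (op, page) accesses.

    op is "r" or "w". The baseline victim is the least recently used page;
    with prefer_clean=True the least recently used *clean* page is evicted
    instead, whenever one exists.
    """
    resident = OrderedDict()  # page -> dirty flag, least recently used first
    page_ins = write_backs = 0
    for op, page in trace:
        if page in resident:
            dirty = resident.pop(page)      # hit: re-inserted below as most recent
        else:
            page_ins += 1                   # miss: page must be read from storage
            dirty = False
            if len(resident) >= frames:
                victim = next(iter(resident))  # plain LRU choice
                if prefer_clean:
                    victim = next(
                        (p for p, d in resident.items() if not d), victim
                    )
                if resident.pop(victim):    # dirty victim needs a write-back
                    write_backs += 1
        resident[page] = dirty or op == "w"
    return page_ins, write_backs

trace = [("w", 0), ("r", 1), ("r", 2), ("r", 0)]
print(simulate_paging(trace, frames=2, prefer_clean=False))  # (4, 1)
print(simulate_paging(trace, frames=2, prefer_clean=True))   # (3, 0)
```

On this trace the clean-first policy avoids the write-back entirely. On other traces it may evict a clean page that is needed again soon and so add page-ins, which is the bias toward reads described above.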

For an in-memory database (with persistence) or a log (or backup) server, writes would presumably dominate. In the former case reads are served from memory by design; in the latter case reads are simply less common.

For many content-consumption tasks, such as web serving and gaming, writes are much less common, so storage reads can easily dominate even with caching.

Prefetching can also increase the number of read requests. If the cost of a read miss is high and the cost of an unused read is moderate, even somewhat aggressive prefetching can be beneficial.
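One way to read that trade-off is as a simple expected-cost comparison. The model below is an assumption, not something the answer states: a prefetched block is used with probability `p_used`, a used prefetch saves `miss_cost`, and an unused one wastes `wasted_cost`. All names and units are hypothetical.

```python
def prefetch_net_benefit(p_used, miss_cost, wasted_cost):
    """Expected saving per prefetched block under the simple model above."""
    return p_used * miss_cost - (1.0 - p_used) * wasted_cost

def break_even_probability(miss_cost, wasted_cost):
    """Smallest usage probability at which prefetching stops losing."""
    return wasted_cost / (miss_cost + wasted_cost)

# Hypothetical costs: a miss costs 10 units, an unused prefetch costs 2.
print(break_even_probability(10.0, 2.0))      # 1/6, roughly 0.167
print(prefetch_net_benefit(0.25, 10.0, 2.0))  # 1.0
```

Under these assumed costs, prefetching pays off even when only about one prefetched block in six is ever used, which is why fairly aggressive prefetching can add many read requests and still be worthwhile.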

The relative costs of reads and writes can also bias OS caching and prefetching policies. SSDs generally provide much higher read bandwidth than write bandwidth and do not penalize random accesses, which reduces the cost of prefetching when memory utilization is low.

Overprovisioning memory (in terms of cost per unit of work) can reduce the frequency of storage reads, moving the system toward the in-memory database model. It may be a reasonable design choice when sharing memory across more work is impractical (and cost per bit is non-linear), when responsiveness is preferred over average throughput, or when load tolerance is desired (i.e., the extra memory may not be worthwhile for average throughput or responsiveness but may avoid issues under high load).

Persistent (battery-backed or inherently persistent) memory can greatly reduce the number of writes to actual storage.

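RAID will generally amplify the number of writes seen by the disks but not the number of reads: a logical write goes to more than one disk, whereas RAID systems typically rely on the disks' internal error detection to detect read errors, so a read need not touch the redundant copy. (Parity levels such as RAID 5 may additionally read old data and old parity in order to update parity on a small write.)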
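The amplification can be put into rough numbers with the usual textbook small-write model. This is a sketch under stated assumptions: a healthy array, no controller cache, no full-stripe writes, and hypothetical function names. Note that in this model the parity levels also issue reads, but only in service of writes.

```python
def raid_small_write_ios(level, mirrors=2):
    """Return (disk_reads, disk_writes) caused by one small logical write."""
    if level == 0:
        return (0, 1)        # striping only: one block written
    if level == 1:
        return (0, mirrors)  # one copy written per mirror
    if level == 5:
        return (2, 2)        # read old data and parity, write new data and parity
    if level == 6:
        return (3, 3)        # same as RAID 5, with two parity blocks
    raise ValueError(f"unsupported RAID level: {level}")

def disk_ops(logical_reads, logical_writes, level, mirrors=2):
    """Total (reads, writes) seen by the disks; each logical read costs one disk read."""
    r, w = raid_small_write_ios(level, mirrors)
    return logical_reads + logical_writes * r, logical_writes * w

# Hypothetical workload of 800 logical reads and 200 logical writes.
print(disk_ops(800, 200, level=0))  # (800, 200)
print(disk_ops(800, 200, level=1))  # (800, 400)
print(disk_ops(800, 200, level=5))  # (1200, 400)
```

Logical reads pass through one-for-one at every level here, while each logical write fans out into two or more disk operations.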

Context

StackExchange Computer Science Q#35292, answer score: 4
