patternMinor
Why do we need the valid-invalid bit in a page table?
Viewed 0 times
whythebitneedvalidpageinvalidtable
Problem
I'm reading about memory management in Operating System Concepts, by A. Silberschatz. The author says:
[...] the user process by definition is unable to access memory it does not own.
It has no way of addressing memory outside of its page table, and the table
includes only those pages that the process owns.
Then, the author explains the need of having a valid-invalid bit:
One additional bit is generally attached to each entry in the page table: a valid–invalid bit. When this bit is set to valid, the associated page is in the process’s logical address space and is thus a legal (or valid) page. When the bit is set to invalid, the page is not in the process’s logical address space.
Why do we also need a valid-invalid bit, if a process is by definition able to access only memory addresses stored in its own page table?
The two quoted sentences appear to be contradictory, in my opinion. There must be something I am missing here.
This exact same question has been already asked a few times;
however, none of the answers clarified my doubt. Source:
The answers that other users have given don’t explain why the validity bit is needed to protect address spaces. The page table of a process contains only those pages that are valid for that process (i.e., that the process actually owns), so why do we need to replicate a security feature that is already present thanks to the way the page table is implemented?
[...] the user process by definition is unable to access memory it does not own.
It has no way of addressing memory outside of its page table, and the table
includes only those pages that the process owns.
Then, the author explains the need of having a valid-invalid bit:
One additional bit is generally attached to each entry in the page table: a valid–invalid bit. When this bit is set to valid, the associated page is in the process’s logical address space and is thus a legal (or valid) page. When the bit is set to invalid, the page is not in the process’s logical address space.
Why do we also need a valid-invalid bit, if a process is by definition able to access only memory addresses stored in its own page table?
The two quoted sentences appear to be contradictory, in my opinion. There must be something I am missing here.
This exact same question has been already asked a few times;
however, none of the answers clarified my doubt. Source:
- What’s the purpose of having valid-invalid bits in page table?
- Valid-invalid bit in a process page table
- What is the need for valid/invalid bit in paged memory technique?
The answers that other users have given don’t explain why the validity bit is needed to protect address spaces. The page table of a process contains only those pages that are valid for that process (i.e., that the process actually owns), so why do we need to replicate a security feature that is already present thanks to the way the page table is implemented?
Solution
As David Richerby says, it’s not clear what you don't understand.
Most of my answer here has already been presented in answers (or comments)
to the questions you linked to.
I’ll admit, though, that the Operating System Concepts book
is probably not as clear as it should be.
The book (going back to Figure 8.6
(Hardware support for relocation and limit registers) on page 361)
suggests that the processor knows the maximum user virtual address
for a context (process),
so if the process tries to access an address above this limit,
the access fails with an addressing error.
This concept seems to erode over the following pages;
for example, Section 8.4.2 (Segmentation Hardware) (pages 365 and 366)
introduces a limit field in each segment table entry,
but waves its hands over the question
of specifying the maximum virtual address
(and, correspondingly, the length of the entire segment table).
I would argue that Figure 8.9 (Example of segmentation), on page 367:
[images copied from this 944-page PDF of the book]
adds to the confusion:
It says that Segment 0 is 1000* bytes long — both explicitly,
in the segment table, and in the map of physical memory $(2400-1400=1000)$.
Likewise, it says that Segment 4 is 1000 bytes long — both explicitly,
in the segment table, and in the memory map $(5700-4700=1000)$.
But Segment 4 is clearly larger than Segment 0 in the memory map,
while Segment 0 is clearly larger than Segment 4
in the nebulous logical address space on the left.
Segments 2 and 3 are gray on the left and white on the right.
Does this mean something?
Wandering Logic addresses the fourth bullet with their statement,
“Typically the page table is a tree of fixed size pages.”
(emphasis added).
So, looking at Figure 8.15 (“Valid (v) or invalid (i) bit in a page table”),
excerpted here,
it might be that the system has no way of knowing
which pages are valid (0-5) and which ones are invalid (6 and 7)
other than marking the entries for the invalid pages with an invalid flag.
But also, while Figure 8.9 is somewhat confusing,
it also contains the kernel of part of the answer to your question:
The user virtual address space may be
segmented / fragmented / sparse / discontiguous.
In other words, just because the maximum virtual address is $N$,
that doesn’t mean that every $M$ such that $0 \le M \le N$
is a valid address.
Just as the segments are explicitly shown as being discontiguous
(and out of order) in the physical memory (on the right),
and are implicitly shown as disassociated in the logical address space
on the left, with order being irrelevant**,
they may actually be discontiguous in the logical address space.
In fact, the book (somewhat) explicitly (but not clearly) tells us
that they are discontiguous,
in the last paragraph of Section 8.4.2:
A reference to byte 1222 of segment 0
would result in a trap to the operating system,
as this segment is only 1,000 bytes long.
So the 1222nd byte of the logical address space
is not the 222nd byte of segment 1,
but rather falls in the gap between segment 0 and segment 1.
If we assume that the segments occur in numeric order
in the logical address space with addresses at multiples of 2000,
starting at 0,
then the complete (logical and physical) segment map/table
might look something like this:
(I’m sorry, but I just don’t see any reason
to display the limit to the left of the base, as the authors do.)
So, pretending that the page size is 100 bytes,
using a format inspired by Figure 8.15,
we would get a page table that looks like this:
```
virtual physical ┃ virtual physical ┃ virtual physical ┃ virtual physical ┃ virtual physical
page # | page # ┃ page # | page # ┃ page # | page # ┃ page # | page # ┃ page # | page #
0 | 14 ┃ 18 | invalid ┃ 36 | invalid ┃ 54 | invalid ┃ 72 | invalid
1 | 15 ┃ 19 | invalid ┃ 37 | invalid ┃ 55 | invalid ┃ 73 | invalid
2 | 16 ┃ 20 | 63 ┃ 38 | invalid ┃ 56 | invalid ┃ 74 | invalid
3 | 17 ┃ 21 | 64 ┃ 39 | invalid ┃ 57 | invalid ┃ 75 | invalid
4 | 18 ┃
Most of my answer here has already been presented in answers (or comments)
to the questions you linked to.
I’ll admit, though, that the Operating System Concepts book
is probably not as clear as it should be.
The book (going back to Figure 8.6
(Hardware support for relocation and limit registers) on page 361)
suggests that the processor knows the maximum user virtual address
for a context (process),
so if the process tries to access an address above this limit,
the access fails with an addressing error.
This concept seems to erode over the following pages;
for example, Section 8.4.2 (Segmentation Hardware) (pages 365 and 366)
introduces a limit field in each segment table entry,
but waves its hands over the question
of specifying the maximum virtual address
(and, correspondingly, the length of the entire segment table).
I would argue that Figure 8.9 (Example of segmentation), on page 367:
[images copied from this 944-page PDF of the book]
adds to the confusion:
- It’s clearly not to scale.
It says that Segment 0 is 1000* bytes long — both explicitly,
in the segment table, and in the map of physical memory $(2400-1400=1000)$.
Likewise, it says that Segment 4 is 1000 bytes long — both explicitly,
in the segment table, and in the memory map $(5700-4700=1000)$.
But Segment 4 is clearly larger than Segment 0 in the memory map,
while Segment 0 is clearly larger than Segment 4
in the nebulous logical address space on the left.
- Why are the segments colored inconsistently?
Segments 2 and 3 are gray on the left and white on the right.
Does this mean something?
- How do we know which “segment” a given address is in?
- How do we know the length of the segment table?
Wandering Logic addresses the fourth bullet with their statement,
“Typically the page table is a tree of fixed size pages.”
(emphasis added).
So, looking at Figure 8.15 (“Valid (v) or invalid (i) bit in a page table”),
excerpted here,
it might be that the system has no way of knowing
which pages are valid (0-5) and which ones are invalid (6 and 7)
other than marking the entries for the invalid pages with an invalid flag.
But also, while Figure 8.9 is somewhat confusing,
it also contains the kernel of part of the answer to your question:
The user virtual address space may be
segmented / fragmented / sparse / discontiguous.
In other words, just because the maximum virtual address is $N$,
that doesn’t mean that every $M$ such that $0 \le M \le N$
is a valid address.
Just as the segments are explicitly shown as being discontiguous
(and out of order) in the physical memory (on the right),
and are implicitly shown as disassociated in the logical address space
on the left, with order being irrelevant**,
they may actually be discontiguous in the logical address space.
In fact, the book (somewhat) explicitly (but not clearly) tells us
that they are discontiguous,
in the last paragraph of Section 8.4.2:
A reference to byte 1222 of segment 0
would result in a trap to the operating system,
as this segment is only 1,000 bytes long.
So the 1222nd byte of the logical address space
is not the 222nd byte of segment 1,
but rather falls in the gap between segment 0 and segment 1.
If we assume that the segments occur in numeric order
in the logical address space with addresses at multiples of 2000,
starting at 0,
then the complete (logical and physical) segment map/table
might look something like this:
segment logical physical
# | base limit range ... base limit range
-------------------------------------- ... ----------------------------
0 | 0 1000 0-999 1400 1000 1400-2399
1 | 2000 400 2000-2399 6300 400 6300-6699
2 | 4000 400 4000-4399 4300 400 4300-4699
3 | 6000 1100 6000-7099 3200 1100 3200-4299
4 | 8000 1000 8000-8999 4700 1000 4700-5699(I’m sorry, but I just don’t see any reason
to display the limit to the left of the base, as the authors do.)
So, pretending that the page size is 100 bytes,
using a format inspired by Figure 8.15,
we would get a page table that looks like this:
```
virtual physical ┃ virtual physical ┃ virtual physical ┃ virtual physical ┃ virtual physical
page # | page # ┃ page # | page # ┃ page # | page # ┃ page # | page # ┃ page # | page #
0 | 14 ┃ 18 | invalid ┃ 36 | invalid ┃ 54 | invalid ┃ 72 | invalid
1 | 15 ┃ 19 | invalid ┃ 37 | invalid ┃ 55 | invalid ┃ 73 | invalid
2 | 16 ┃ 20 | 63 ┃ 38 | invalid ┃ 56 | invalid ┃ 74 | invalid
3 | 17 ┃ 21 | 64 ┃ 39 | invalid ┃ 57 | invalid ┃ 75 | invalid
4 | 18 ┃
Code Snippets
segment logical physical
# | base limit range ... base limit range
-------------------------------------- ... ----------------------------
0 | 0 1000 0-999 1400 1000 1400-2399
1 | 2000 400 2000-2399 6300 400 6300-6699
2 | 4000 400 4000-4399 4300 400 4300-4699
3 | 6000 1100 6000-7099 3200 1100 3200-4299
4 | 8000 1000 8000-8999 4700 1000 4700-5699virtual physical ┃ virtual physical ┃ virtual physical ┃ virtual physical ┃ virtual physical
page # | page # ┃ page # | page # ┃ page # | page # ┃ page # | page # ┃ page # | page #
0 | 14 ┃ 18 | invalid ┃ 36 | invalid ┃ 54 | invalid ┃ 72 | invalid
1 | 15 ┃ 19 | invalid ┃ 37 | invalid ┃ 55 | invalid ┃ 73 | invalid
2 | 16 ┃ 20 | 63 ┃ 38 | invalid ┃ 56 | invalid ┃ 74 | invalid
3 | 17 ┃ 21 | 64 ┃ 39 | invalid ┃ 57 | invalid ┃ 75 | invalid
4 | 18 ┃ 22 | 65 ┃ 40 | 43 ┃ 58 | invalid ┃ 76 | invalid
5 | 19 ┃ 23 | 66 ┃ 41 | 44 ┃ 59 | invalid ┃ 77 | invalid
6 | 20 ┃ 24 | invalid ┃ 42 | 45 ┃ 60 | 32 ┃ 78 | invalid
7 | 21 ┃ 25 | invalid ┃ 43 | 46 ┃ 61 | 33 ┃ 79 | invalid
8 | 22 ┃ 26 | invalid ┃ 44 | invalid ┃ 62 | 34 ┃ 80 | 47
9 | 23 ┃ 27 | invalid ┃ 45 | invalid ┃ 63 | 35 ┃ 81 | 48
10 | invalid ┃ 28 | invalid ┃ 46 | invalid ┃ 64 | 36 ┃ 82 | 49
11 | invalid ┃ 29 | invalid ┃ 47 | invalid ┃ 65 | 37 ┃ 83 | 50
12 | invalid ┃ 30 | invalid ┃ 48 | invalid ┃ 66 | 38 ┃ 84 | 51
13 | invalid ┃ 31 | invalid ┃ 49 | invalid ┃ 67 | 39 ┃ 85 | 52
14 | invalid ┃ 32 | invalid ┃ 50 | invalid ┃ 68 | 40 ┃ 86 | 53
15 | invalid ┃ 33 | invalid ┃ 51 | invalid ┃ 69 | 41 ┃ 87 | 54
16 | invalid ┃ 34 | invalid ┃ 52 | invalid ┃ 70 | 42 ┃ 88 | 55
17 | invalid ┃ 35 | invalid ┃ 53 | invalid ┃ 71 | invalid ┃ 89 | 56
90 | invalid
91 | ︙Context
StackExchange Computer Science Q#80215, answer score: 7
Revisions (0)
No revisions yet.