patternMinor
Why are both caller-save registers and callee-save registers needed?
Problem
I am having a difficult time understanding callee-save and caller-save registers. I get that caller-save registers are those whose values are needed after a function call, so the caller saves them in the caller prologue and restores them in the caller epilogue. Callee-save registers are those used by the callee, which saves them in the callee prologue and restores them in the callee epilogue. But why are both needed? If the caller is already saving the registers it needs, why should the callee save them again on use? I am really confused.
Solution
Let's see. Say the CPU has 16 registers. If a subroutine modifies them, they have to be saved before that and restored afterwards. Now, who should do that?
All 16 registers can be saved by the caller, or all 16 can be saved by the callee, or a mixed scheme can be used. In the mixed scheme, say 8 registers are considered transient: the caller doesn't expect their contents to be left intact after a call. Thus callers usually use these registers only for temporary data, i.e. computations between calls. But in the few cases when these registers contain non-transient data, the caller has to save their contents.
The other registers are considered permanent, so the caller expects the callee to leave them untouched. But in the cases when the callee really needs to use more registers, it has to save and restore their contents.
Why is the hybrid scheme advantageous? Usually each procedure needs a few registers for transient data, so with a convention that the callee may modify these registers, you can shorten the calling sequence and therefore make procedure calls a bit faster and more compact.
OTOH, if we consider some registers "permanent", we can try to avoid using them in leaf procedures (i.e. those not calling other procedures), so leaf procedures don't have to save/restore them. Since the CPU usually spends most of its time in leaf procedures, this also means a quite substantial speed improvement.
So, by splitting registers into two classes, we can make the program faster and shorter: leaf subroutines try not to use "permanent" registers and as a result don't need to save/restore any. OTOH, other procedures try not to use "transient" registers to hold values across calls, thus avoiding the need to save/restore them.
So, the hybrid scheme makes the program more efficient than either of the two pure schemes.
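To make the comparison concrete, here is a rough sketch of a toy cost model in Python. It only counts save/restore pairs around a single call to a leaf procedure. The 16-register machine and the 8/8 split come from the text above; the function names (`pick`, `saves_per_call`, `total_saves`) and the "allocator prefers the free class" rule are assumptions made up for illustration, not how any real compiler or ABI is specified.

```python
# Toy model (hypothetical): count save/restore pairs around one call.
ALL_REGS = frozenset(range(16))
TRANSIENT = frozenset(range(8))      # caller-save half of the hybrid split
PERMANENT = ALL_REGS - TRANSIENT     # callee-save half of the hybrid split


def pick(n, preferred):
    """Choose n registers, taking them from `preferred` first."""
    order = sorted(preferred) + sorted(ALL_REGS - preferred)
    return frozenset(order[:n])


def saves_per_call(live_across_call, callee_writes, caller_saved):
    """Return (caller_saves, callee_saves) for a single call.

    live_across_call: registers whose values the caller needs after the call
    callee_writes:    registers the callee overwrites
    caller_saved:     registers the convention lets the callee clobber
    """
    callee_saved = ALL_REGS - caller_saved
    # The caller cannot look inside the callee, so it saves every live
    # register that the convention allows the callee to clobber.
    caller_saves = live_across_call & caller_saved
    # The callee cannot look inside the caller, so it saves every
    # protected register that it overwrites.
    callee_saves = callee_writes & callee_saved
    return len(caller_saves), len(callee_saves)


def total_saves(n_live, n_scratch):
    """Compare the three schemes for one call to a leaf procedure."""
    schemes = {
        "caller-saves-all": ALL_REGS,     # every register is transient
        "callee-saves-all": frozenset(),  # every register is permanent
        "hybrid": TRANSIENT,
    }
    result = {}
    for name, caller_saved in schemes.items():
        callee_saved = ALL_REGS - caller_saved
        # Assumed allocator: the caller keeps long-lived values in
        # callee-saved registers if any exist; the leaf callee takes its
        # scratch space from caller-saved registers if any exist.
        live = pick(n_live, preferred=callee_saved)
        writes = pick(n_scratch, preferred=caller_saved)
        result[name] = sum(saves_per_call(live, writes, caller_saved))
    return result


if __name__ == "__main__":
    # Caller holds 3 values across the call, leaf callee needs 4 scratch regs.
    print(total_saves(3, 4))
```

In this model the caller with 3 live values pays 3 saves under the pure caller-save scheme, the leaf with 4 scratch registers pays 4 saves under the pure callee-save scheme, and neither pays anything under the hybrid split. The model deliberately ignores that a non-leaf caller using "permanent" registers has to save them once in its own prologue; that is a per-invocation cost rather than a per-call cost, so it is amortized when the caller makes many calls.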
Context
StackExchange Computer Science Q#108986, answer score: 2