patternMinor
Turn a vsscanf call into a sscanf call
Viewed 0 times
turnintocallvsscanfsscanf
Problem
I am reviewing some C code (see extract below) that uses inline asm. I believe there are 4 spec violations, 1 inefficiency and 1 honking big design flaw in this code. While some of the problems I'm seeing apply to both the x86 and x64 sections in that code, this post is focused on x64.
Before I'll be allowed to check in a fix, I must first convince people that there is a problem. And the number of people prepared to review inline asm on that project is limited. Hopefully someone here can provide this.
What the code is trying to do:
This code is intended to be a wrapper that turns a vsscanf call (
Here are the problems I believe exist in the current code:
-
As part of the work, this code calls memcpy. However, it looks to me like there isn't enough room reserved on the stack for the call. While memcpy only takes 3 parameters, by spec:
The caller is responsible for allocating space for parameters to the callee, and must always allocate sufficient space for the 4 register parameters, even if the callee doesn’t have that many parameters.
And the 4th parameter would be at
-
Windows 64bit stacks are supposed to be 16 bit aligned, and this seems to call memcpy with a (potentially) unaligned stack (ie no
-
For a brief instant,
Before I'll be allowed to check in a fix, I must first convince people that there is a problem. And the number of people prepared to review inline asm on that project is limited. Hopefully someone here can provide this.
What the code is trying to do:
This code is intended to be a wrapper that turns a vsscanf call (
int vsscanf(const char buffer, const char format, va_list arglist)) into a sscanf call (int sscanf(const char buffer, const char format [,argument] ...)) on (Windows) platforms that don't have a builtin vsscanf. Unfortunately, there is no defined way to know how big a va_list is, so the existing code takes a rather... unconventional approach, which, as it turns out, doesn't work.Here are the problems I believe exist in the current code:
-
As part of the work, this code calls memcpy. However, it looks to me like there isn't enough room reserved on the stack for the call. While memcpy only takes 3 parameters, by spec:
The caller is responsible for allocating space for parameters to the callee, and must always allocate sufficient space for the 4 register parameters, even if the callee doesn’t have that many parameters.
And the 4th parameter would be at
0x18(%%rsp), which is being used for something else. If (for some bizarre reason) memcpy decided to use the memory that was supposed to have been allocated for that 4th parameter, bad things would happen, right?-
Windows 64bit stacks are supposed to be 16 bit aligned, and this seems to call memcpy with a (potentially) unaligned stack (ie no
andq $-16, %%rsp). Problem?-
For a brief instant,
rsp has a wild value (after lea 0xFFFFFFFFFFFFFFD8(%%rsp, %6), %%esp and before subq %5, %%rsp). Solution
As predicted, selling this change is turning out to be a challenge on my own, but that's life. I dislike leaving questions open, so I'm going to close my own question by quoting some specs that confirm my concerns:
-
There isn't enough room reserved on the stack - True per spec:
It contains at least four entries, but always enough space to hold all the parameters needed by any function that may be called. Note that space is always allocated for the register parameters, even if the parameters themselves are never homed to the stack; a callee is guaranteed that space has been allocated for all its parameters.
-
Windows 64bit stacks are supposed to be 16 bit aligned - True per spec:
The stack pointer must be aligned to 16 bytes, except for leaf functions, in any region of code that isn’t part of an epilog or prolog.
-
rsp has a wild value - True per spec
All memory beyond the current address of RSP is considered volatile: The OS, or a debugger, may overwrite this memory during a user debug session, or an interrupt handler. Thus, RSP must always be set before attempting to read or write values to a stack frame.
-
Actual values are passed in registers - True per spec:
The first four integer arguments are passed in registers.
and
The callee has the responsibility of dumping the register parameters into their shadow space if needed.
-
The code only works because the distance between ptr and &ret just happens to be big enough - Debugging shows this to be true. The number of bytes copied varies between debug and retail builds, but increasing the number of parameters being passed has no effect. Once you get past a certain number of params (~20), the stack corrupts and the app crashes.
-
Soes not include the appropriate .seh_* directives - Well, it surely doesn't include any of them. And it's modifying the stack size and calling functions, all of which normally require these directives. I can't say for sure that bad things will happen without them, but at a minimum I'm guessing it's bad practice.
-
There isn't enough room reserved on the stack - True per spec:
It contains at least four entries, but always enough space to hold all the parameters needed by any function that may be called. Note that space is always allocated for the register parameters, even if the parameters themselves are never homed to the stack; a callee is guaranteed that space has been allocated for all its parameters.
-
Windows 64bit stacks are supposed to be 16 bit aligned - True per spec:
The stack pointer must be aligned to 16 bytes, except for leaf functions, in any region of code that isn’t part of an epilog or prolog.
-
rsp has a wild value - True per spec
All memory beyond the current address of RSP is considered volatile: The OS, or a debugger, may overwrite this memory during a user debug session, or an interrupt handler. Thus, RSP must always be set before attempting to read or write values to a stack frame.
-
Actual values are passed in registers - True per spec:
The first four integer arguments are passed in registers.
and
The callee has the responsibility of dumping the register parameters into their shadow space if needed.
-
The code only works because the distance between ptr and &ret just happens to be big enough - Debugging shows this to be true. The number of bytes copied varies between debug and retail builds, but increasing the number of parameters being passed has no effect. Once you get past a certain number of params (~20), the stack corrupts and the app crashes.
-
Soes not include the appropriate .seh_* directives - Well, it surely doesn't include any of them. And it's modifying the stack size and calling functions, all of which normally require these directives. I can't say for sure that bad things will happen without them, but at a minimum I'm guessing it's bad practice.
Context
StackExchange Code Review Q#133266, answer score: 3
Revisions (0)
No revisions yet.