What is the strict aliasing rule?
Problem
When asking about common undefined behavior in C, people sometimes refer to the strict aliasing rule.
What are they talking about?
Solution
A typical situation where you encounter strict aliasing problems is when overlaying a struct (like a device/network msg) onto a buffer of the word size of your system (like a pointer to
uint32_ts or uint16_ts). When you overlay a struct onto such a buffer, or a buffer onto such a struct, through pointer casting, you can easily violate the strict aliasing rule.

So in this kind of setup, if I want to send a message to something, I have to have two incompatible pointers pointing to the same chunk of memory. I might then naively code something like this:

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

typedef struct Msg {
    uint32_t a;
    uint32_t b;
} Msg;

// Sends a buffer over a 16-bit channel
void SendWords(uint16_t *buf, size_t sz) {
    for(size_t i = 0; i < sz; ++i)
        printf("%.4x ", buf[i]);
    printf("\n");
}

int main(void) {
    Msg msg;
    // Send a bunch of messages
    for(int i = 0; i < 10; ++i) {
        msg.a = 0xDEADBEEF;
        msg.b = i;
        SendWords((uint16_t*)&msg, sizeof(msg)/sizeof(uint16_t));
    }
}

The strict aliasing rule makes this setup illegal: dereferencing a pointer that aliases an object that is not of a compatible type, or one of the other types allowed by C 2011 6.5 paragraph 7 (see footnote 1), is undefined behavior. Unfortunately, you can still code this way, maybe get some warnings, have it compile fine, only to have weird unexpected behavior when you run the code. For example, when I compiled it with gcc -O2 it produced this very unexpected output:

0000 0000 0000 0000
0000 0000 0000 0000
0000 0000 0000 0000
...

See https://godbolt.org/z/oGsqe4d75 for how GCC and Clang each produce different unintended outputs.
To see why this behavior is undefined, we have to think about what the strict aliasing rule buys the compiler. Under this rule, the compiler may assume that the writes to msg do not alias the reads of buf in SendWords. Consequently, an optimizing compiler can reorder the instructions so that all assignments to msg happen before or after all the reads from buf in SendWords. In turn it can then deduce that buf[0], ..., buf[3] can be loaded from memory before the loop is executed, or that only the msg values from the last iteration of the loop need to be calculated. Either way, removing the redundant stores and loads (under this rule) makes the generated code faster. Before strict aliasing was introduced, the compiler had to live in a state of paranoia that the contents of buf could be changed by any preceding memory store. So to get an extra performance edge, and assuming most people don't type-pun pointers, the strict aliasing rule was introduced.

So how do I get around this?
- Use memcpy to load/store incompatible types. Compilers can actually optimize away memcpy of small values, but the use of memcpy by itself tells them that the source/destination may alias other types. See https://godbolt.org/z/91MvaP45Y.

- Use a union. Most compilers support this without complaining about strict aliasing. This is allowed in C99 and explicitly allowed in C11.

  union {
      Msg msg;
      uint16_t buf[sizeof(Msg)/sizeof(uint16_t)];
  };

- You can disable strict aliasing in your compiler (-fno-strict-aliasing in clang/gcc).

- You can use char for aliasing instead of your system's word. The rules allow an exception for char (including signed char and unsigned char). It's always assumed that char* aliases other types. This is essentially why memcpy works: it copies chars. However, this won't work the other way: there's no assumption that your struct aliases a buffer of chars.

Beginner beware
This is only one potential minefield when overlaying two types onto each other. You should also learn about endianness, word alignment, and how to deal with alignment issues through packing structs correctly.
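
Coming back to the memcpy workaround from the list above, here is a minimal sketch of how the sending code could be restructured so that the 16-bit words live in a real uint16_t array instead of being viewed through a cast pointer. The names MSG_WORDS, MsgToWords and WordsToMsg are hypothetical helpers made up for this illustration, and the sketch assumes that Msg has no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct Msg {
    uint32_t a;
    uint32_t b;
} Msg;

/* Hypothetical constant: number of 16-bit words needed to hold one Msg. */
#define MSG_WORDS (sizeof(Msg) / sizeof(uint16_t))

/* Sends a buffer over a (pretend) 16-bit channel. */
void SendWords(const uint16_t *buf, size_t sz) {
    for (size_t i = 0; i < sz; ++i)
        printf("%.4x ", (unsigned)buf[i]);
    printf("\n");
}

/* Hypothetical helper: copy the bytes of *msg into a real uint16_t array.
   memcpy copies the object representation byte by byte, so no object is
   ever accessed through an lvalue of an incompatible type. */
void MsgToWords(const Msg *msg, uint16_t *out, size_t out_len) {
    assert(out_len >= MSG_WORDS);
    memcpy(out, msg, sizeof *msg);
}

/* Hypothetical helper for the receiving side: rebuild a Msg from words. */
void WordsToMsg(const uint16_t *in, size_t in_len, Msg *msg) {
    assert(in_len >= MSG_WORDS);
    memcpy(msg, in, sizeof *msg);
}
```

In the loop from the example, you would declare uint16_t words[MSG_WORDS], call MsgToWords(&msg, words, MSG_WORDS) after filling in msg, and then pass words to SendWords. The order in which the two halves of 0xDEADBEEF show up in the output still depends on the endianness of the machine, which is exactly the kind of issue this section warns about.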
Footnote
1 The types that C 2011 6.5 paragraph 7 allows an lvalue to access are:
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the object,
- a type that is the signed or unsigned type corresponding to the effective type of the object,
- a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
- a character type.
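
As an illustration of two entries in this list, the following sketch shows accesses that I believe are permitted: reading any object through a character type, and reading an int through the corresponding unsigned type. The function names ByteSum and ReadAsUnsigned are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum the bytes of any object.
   The object is read through unsigned char, i.e. "a character type". */
unsigned ByteSum(const void *obj, size_t size) {
    assert(obj != NULL);
    const unsigned char *bytes = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < size; ++i)
        sum += bytes[i];
    return sum;
}

/* Hypothetical helper: read an int object through an unsigned lvalue.
   unsigned is "the signed or unsigned type corresponding to the
   effective type of the object", so this access is on the list. */
unsigned ReadAsUnsigned(const int *p) {
    assert(p != NULL);
    return *(const unsigned *)p;
}
```

By contrast, reading the same int through a float* or a uint16_t* matches none of the entries above, which is why the cast in the original example is a problem.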
Context
Stack Overflow Q#98650, score: 733