Why does the Rust compiler not optimize code assuming that two mutable references cannot alias?
Problem
As far as I know, reference/pointer aliasing can hinder the compiler's ability to generate optimized code, since it must ensure the generated binary behaves correctly in the case where the two references/pointers do alias. For instance, in the following C code,
```c
void adds(int *a, int *b) {
    *a += *b;
    *a += *b;
}
```

when compiled by `clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)` with the `-O3` flag, it emits

```
0000000000000000 <adds>:
   0:    8b 07    mov    (%rdi),%eax   # load a into EAX
   2:    03 06    add    (%rsi),%eax   # load-and-add b
   4:    89 07    mov    %eax,(%rdi)   # store into a
   6:    03 06    add    (%rsi),%eax   # load-and-add b again
   8:    89 07    mov    %eax,(%rdi)   # store into a again
   a:    c3       retq
```
Here the code stores back to `(%rdi)` twice in case `int *a` and `int *b` alias.

When we explicitly tell the compiler that these two pointers cannot alias with the `restrict` keyword:

```c
void adds(int * restrict a, int * restrict b) {
    *a += *b;
    *a += *b;
}
```
Then Clang will emit a more optimized version that effectively does `*a += 2 * (*b)`, which is equivalent if (as promised by `restrict`) `*b` isn't modified by assigning to `*a`:

```
0000000000000000 <adds>:
   0:    8b 06    mov    (%rsi),%eax   # load b once
   2:    01 c0    add    %eax,%eax     # double it
   4:    01 07    add    %eax,(%rdi)   # *a += 2 * (*b)
   6:    c3       retq
```
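The difference between the two listings only matters when the pointers alias. As a minimal sketch (the names `adds_two_stores`, `adds_fused`, and `aliasing_demo` are made up for this illustration and are not part of the question), the same logic can be written in Rust with raw pointers, which, unlike `&mut` references, are allowed to alias:

```rust
/// Mirrors the unoptimized listing: re-reads `*b` after each store to `*a`.
///
/// # Safety
/// Both pointers must be valid for reads, and `a` for writes. They MAY alias.
unsafe fn adds_two_stores(a: *mut i32, b: *const i32) {
    unsafe {
        *a += *b;
        *a += *b;
    }
}

/// Mirrors the `restrict` listing: reads `*b` once and adds it twice.
///
/// # Safety
/// Same requirements as `adds_two_stores`.
unsafe fn adds_fused(a: *mut i32, b: *const i32) {
    unsafe {
        let doubled = *b + *b;
        *a += doubled;
    }
}

/// Calls both versions with `a` and `b` pointing at the same `i32`
/// (initially 1) and returns the two results.
fn aliasing_demo() -> (i32, i32) {
    let mut x = 1;
    let p: *mut i32 = &mut x;
    // SAFETY: `p` points to a live local and is the only pointer in use.
    unsafe { adds_two_stores(p, p) }; // 1 -> 2 -> 4
    let two_stores = x;

    let mut y = 1;
    let q: *mut i32 = &mut y;
    // SAFETY: as above.
    unsafe { adds_fused(q, q) }; // 1 + 2 * 1 = 3
    let fused = y;

    (two_stores, fused)
}
```

Calling `aliasing_demo()` returns `(4, 3)`: with aliasing, the two-store version quadruples the value while the fused version triples it. That is why the compiler may only fuse the additions when it knows `a` and `b` are distinct.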
Since Rust makes sure (except in unsafe code) that two mutable references cannot alias, I would think that the compiler should be able to emit the more optimized version of the code.
When I test with the code below and compile it with `rustc 1.35.0` with `-C opt-level=3 --emit obj`,

```rust
#![crate_type = "staticlib"]
#[no_mangle]
fn adds(a: &mut i32, b: &mut i32) {
    *a += *b;
    *a += *b;
}
```
it generates:
`000000000000000
Solution
Rust originally did enable LLVM's `noalias` attribute, but this caused miscompiled code. When all supported LLVM versions no longer miscompile the code, it will be re-enabled.

If you add `-Zmutable-noalias=yes` to the compiler options, you get the expected assembly:

```
adds:
        mov     eax, dword ptr [rsi]
        add     eax, eax
        add     dword ptr [rdi], eax
        ret
```
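In source terms, the assembly above corresponds to reading `*b` once and adding twice its value to `*a`. The sketch below spells that out in safe Rust. `adds_fused` and `both_versions` are hypothetical names used only for illustration, and the inputs are assumed to be small enough that no addition overflows:

```rust
/// The function from the question: `*b` is read before each addition.
fn adds(a: &mut i32, b: &mut i32) {
    *a += *b;
    *a += *b;
}

/// What the `noalias` assembly computes: read `*b` once, add it twice.
fn adds_fused(a: &mut i32, b: &mut i32) {
    let doubled = *b + *b; // mov eax, [rsi]; add eax, eax
    *a += doubled;         // add [rdi], eax
}

/// Runs both versions on the same starting values and returns the results.
fn both_versions(a0: i32, b0: i32) -> (i32, i32) {
    let (mut a1, mut b1) = (a0, b0);
    adds(&mut a1, &mut b1);

    let (mut a2, mut b2) = (a0, b0);
    adds_fused(&mut a2, &mut b2);

    // adds(&mut a1, &mut a1); // rejected: error[E0499], `a1` borrowed mutably twice
    (a1, a2)
}
```

For example, `both_versions(5, 3)` returns `(11, 11)`. Because the borrow checker rejects the aliasing call in the commented-out line, the two versions cannot be told apart from safe code, which is what makes the `noalias` annotation sound in principle.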
Simply put, Rust put the equivalent of C's `restrict` keyword everywhere, far more prevalent than any usual C program. This exercised corner cases of LLVM more than it was able to handle correctly. It turns out that C and C++ programmers simply don't use `restrict` as frequently as `&mut` is used in Rust.

This has happened multiple times.
- Rust 1.0 through 1.7: `noalias` enabled
- Rust 1.8 through 1.27: `noalias` disabled
- Rust 1.28 through 1.29: `noalias` enabled
- Rust 1.30 through 1.53: `noalias` disabled
- Rust 1.54 through ???: `noalias` conditionally enabled depending on the version of LLVM the compiler uses
Related Rust issues

- Current case
  - Incorrect code generation for nalgebra's Matrix::swap_rows() #54462
  - Re-enable noalias annotations by default once LLVM no longer miscompiles them #54878
  - Enable mutable noalias for LLVM >= 12 #82834
  - Regression: Miscompilation due to bug in "mutable noalias" logic #84958
- Previous case
  - Workaround LLVM optimizer bug by not marking &mut pointers as noalias #31545
  - Mark &mut pointers as noalias once LLVM no longer miscompiles them #31681
- Other
  - make use of LLVM's scoped noalias metadata #16515
  - Missed optimization: references from pointers aren't treated as noalias #38941
  - noalias is not enough #53105
  - mutable noalias: re-enable permanently, only for panic=abort, or stabilize flag? #45029
Context
Stack Overflow Q#57259126, score: 498