Community
cancel
Showing results for 
Search instead for 
Did you mean: 
Highlighted
Beginner
22 Views

std::swap bad code gen -- alias analysis insufficient to remove dead store

The following code optimizes well for `custom_swap` and `restrict_std_swap`, but has an additional `mov` instruction for `std_swap`: void custom_swap(int * lhs, int * rhs) { int temp = *lhs; *lhs = *rhs; *rhs = temp; } void restrict_std_swap(int * __restrict lhs, int * __restrict rhs) { int temp = *lhs; *lhs = 0; *lhs = *rhs; *rhs = temp; } void std_swap(int * lhs, int * rhs) { int temp = *lhs; *lhs = 0; *lhs = *rhs; *rhs = temp; } Compiles into this assembly with -O1, -O2, -Os, or -O3: custom_swap(int*, int*): mov eax, DWORD PTR [rsi] #3.13 mov edx, DWORD PTR [rdi] #2.17 mov DWORD PTR [rdi], eax #3.6 mov DWORD PTR [rsi], edx #4.6 ret #5.1 restrict_std_swap(int*, int*): mov edx, DWORD PTR [rdi] #8.17 mov eax, DWORD PTR [rsi] #10.13 mov DWORD PTR [rdi], eax #10.6 mov DWORD PTR [rsi], edx #11.6 ret #12.1 std_swap(int*, int*): mov edx, DWORD PTR [rdi] #15.17 mov DWORD PTR [rdi], 0 #16.6 mov eax, DWORD PTR [rsi] #17.13 mov DWORD PTR [rdi], eax #17.6 mov DWORD PTR [rsi], edx #18.6 ret #19.1 As we see from the example that annotates the parameters with __restrict, the problem appears to be that the risk of *lhs aliasing *rhs disables the optimizer's ability to remove the dead store in the second line of std_swap. It is able to see that if they don't alias, the store in line 2 is dead. It is not able to see that if they do alias, the store in line 3 is dead and the store in line 2 is dead. See it live: https://godbolt.org/z/8nCTnL The real life problem here is that types that manage a resource but do not implement a custom std::swap, as well as all types that recursively contain a type that manages a resource, suffer from reduced performance for using std::swap. The larger, slightly more meaningful test case showing how I arrived at this reduction and its relationship to std::swap: struct unique_ptr { unique_ptr(): ptr(nullptr) { } unique_ptr(unique_ptr && other) noexcept: ptr(other.ptr) { other.ptr = nullptr; } unique_ptr & operator=(unique_ptr && other) noexcept { delete ptr; ptr = nullptr; ptr = other.ptr; other.ptr = nullptr; return *this; } ~unique_ptr() noexcept { delete ptr; } int * ptr; }; void custom_swap(unique_ptr & lhs, unique_ptr & rhs) noexcept { int * temp = lhs.ptr; lhs.ptr = rhs.ptr; rhs.ptr = temp; } void inlined_std_swap(unique_ptr & lhs, unique_ptr & rhs) noexcept { int * temp_ptr = lhs.ptr; lhs.ptr = nullptr; delete lhs.ptr; lhs.ptr = nullptr; lhs.ptr = rhs.ptr; rhs.ptr = nullptr; delete rhs.ptr; rhs.ptr = nullptr; rhs.ptr = temp_ptr; temp_ptr = nullptr; delete temp_ptr; } void std_swap(unique_ptr & lhs, unique_ptr & rhs) noexcept { auto temp = static_cast(lhs); lhs = static_cast(rhs); rhs = static_cast(temp); } See it live: https://godbolt.org/z/tWjmzo
0 Kudos
0 Replies