Does Intel have any plans to support "fine-grained" atomic operations in future x86 processors, in the context of the emerging C/C++0x standard and its support for such operations?
In particular, I mean atomic RMW operations (XADD, XCHG, CMPXCHG, ADD, AND, etc.) with fine-grained memory ordering parameters. For example:
std::atomic_exchange_explicit(x, 1, std::memory_order_relaxed);
std::atomic_fetch_sub_explicit(x, 1, std::memory_order_release);
The main point is that programs relying on the C/C++0x atomic API would be able to benefit transparently from such fine-grained hardware operations.
Since a load on x86 always has acquire semantics and a store always has release semantics, I think it will be difficult to eliminate the acquire/release fences, i.e. to provide truly relaxed operations. But at least the store-load memory fence could be eliminated from atomic RMW operations. Is that possible/feasible?
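
To make the question concrete, here is a minimal sketch of the kind of code I have in mind, written against the standard atomic API. The names flag, refcount, set_flag and release_ref are hypothetical and only for illustration. Neither operation asks for sequential consistency, so in principle neither needs a store-load fence.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared state, for illustration only.
std::atomic<int> flag{0};
std::atomic<int> refcount{2};

// Relaxed exchange: only atomicity is requested, no ordering
// with respect to surrounding loads and stores.
int set_flag() {
    return std::atomic_exchange_explicit(&flag, 1, std::memory_order_relaxed);
}

// Release decrement, as in a typical reference-count drop.
// Returns true when the caller released the last reference.
bool release_ref() {
    int previous = std::atomic_fetch_sub_explicit(
        &refcount, 1, std::memory_order_release);
    if (previous == 1) {
        // Acquire fence so the thread that destroys the object
        // sees all writes made before the other releases.
        std::atomic_thread_fence(std::memory_order_acquire);
        return true;
    }
    return false;
}
```

As far as I understand, on current x86 both calls end up as locked instructions (XCHG with a memory operand is implicitly locked, and the decrement becomes a LOCK-prefixed RMW), which act as full fences regardless of the memory order requested.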