asm volatile("":::"memory"); merely prevents the compiler's optimizer from rearranging your code --- reads and writes that appear in source before that asm block must appear as CPU instructions before that asm block, and reads and writes that appear in source after that asm block must appear as CPU instructions after that asm block. The CPU may still reorder these reads and writes, and they may become visible to other CPUs in the system out of order (but constrained by the CPU memory model).
__sync_synchronize() will generate a suitable CPU instruction to generate a memory fence. On Intel CPUs this may be an MFENCE instruction, or a LOCKed instruction such as XCHG. This has the additional consequence of imposing visibility constraints on the memory accesses at the CPU level.