CUDA C++ Memory Model
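The host/device-coherent, C++-style scoped memory model of CUDA, exposed through libcu++'s cuda::atomic, which takes a thread scope (thread, block, device, or system) as a template parameter. The underlying scoped model dates from CUDA 9 (PTX ISA 6.0). cuda::atomic mirrors the C++11 atomics interface and is layered on, and lowered to, the PTX memory model.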
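As a minimal sketch of the interface described above, the following CUDA snippet declares a device-scope cuda::atomic and updates it with the same member functions and memory orders that std::atomic offers. The names device_counter, count_kernel, and count_threads are hypothetical and chosen only for illustration; the use of managed memory is likewise an assumption made to keep the host code short.

```cuda
#include <cuda/atomic>
#include <cuda_runtime.h>
#include <new>

// Hypothetical alias: an int atomic whose operations are atomic with
// respect to all threads on the same device (thread_scope_device).
using device_counter = cuda::atomic<int, cuda::thread_scope_device>;

// Every launched thread increments the counter once. Relaxed ordering is
// enough here because only the final count matters, not any ordering
// between the increments and other memory accesses.
__global__ void count_kernel(device_counter* counter) {
    counter->fetch_add(1, cuda::std::memory_order_relaxed);
}

// Host helper (hypothetical): returns the number of increments observed,
// or -1 if a CUDA runtime call fails.
int count_threads(int blocks, int threads_per_block) {
    device_counter* counter = nullptr;
    if (cudaMallocManaged(&counter, sizeof(device_counter)) != cudaSuccess) {
        return -1;
    }
    new (counter) device_counter(0);  // construct the atomic in managed memory

    count_kernel<<<blocks, threads_per_block>>>(counter);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(counter);
        return -1;
    }

    // The kernel has completed, so a relaxed load on the host is sufficient.
    int result = counter->load(cuda::std::memory_order_relaxed);
    cudaFree(counter);
    return result;
}
```

Calling count_threads(4, 64) on a working device should return 256, one increment per thread. Changing the scope argument to cuda::thread_scope_system would be the natural choice if host threads also updated the counter while the kernel is running.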
Ordering relationships
- Strictly weaker than
- C11/C++11 Memory Model — CUDA's cuda::atomic mirrors C++11 atomics but adds thread scopes (block, device, system); sub-system scopes admit behaviours C11 forbids.
- Compiles correctly to
- NVIDIA PTX Memory Model — CUDA C++ atomics lower to PTX; the CUDA model is layered directly on the PTX memory model.
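The role of scopes in these relationships can be illustrated with a message-passing litmus test, sketched below under stated assumptions. One thread in block 0 writes a payload and then sets a flag with a release store; one thread in block 1 reads the flag with an acquire load and, only if it sees the flag set, reads the payload. Mailbox, message_passing, and run_message_passing are hypothetical names, and the consumer deliberately does not spin so that the sketch does not depend on forward-progress guarantees between blocks.

```cuda
#include <cuda/atomic>
#include <cuda_runtime.h>
#include <new>

// Device scope is the narrowest scope that contains both communicating
// threads, because they run in different blocks of the same kernel.
using flag_t = cuda::atomic<int, cuda::thread_scope_device>;

struct Mailbox {
    int payload = 0;     // plain (non-atomic) data published by the producer
    flag_t ready{0};     // scoped atomic flag guarding the payload
    int observed = -1;   // what the consumer saw; -1 means "flag not yet set"
};

__global__ void message_passing(Mailbox* box) {
    if (blockIdx.x == 0) {
        // Producer: write the data, then publish it with a release store.
        box->payload = 42;
        box->ready.store(1, cuda::std::memory_order_release);
    } else if (box->ready.load(cuda::std::memory_order_acquire) == 1) {
        // Consumer: the acquire load saw the release store, so the payload
        // write happens-before this read.
        box->observed = box->payload;
    }
}

// Host helper (hypothetical): returns 42 if the consumer saw the flag,
// -1 if it ran before the flag was set, and -2 on a CUDA runtime error.
int run_message_passing() {
    Mailbox* box = nullptr;
    if (cudaMallocManaged(&box, sizeof(Mailbox)) != cudaSuccess) {
        return -2;
    }
    new (box) Mailbox;  // apply the default member initializers

    message_passing<<<2, 1>>>(box);  // two blocks, one thread each
    cudaError_t err = cudaDeviceSynchronize();
    int result = (err == cudaSuccess) ? box->observed : -2;
    cudaFree(box);
    return result;
}
```

With device scope the consumer should never observe a stale payload: the result is either 42 or -1. If flag_t were declared with cuda::thread_scope_block instead, the two threads would fall outside each other's scope, and the model would no longer guarantee that the release and acquire synchronize. A C11 program has no scopes, so it cannot express that weaker variant.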
References
- Daniel Lustig, Sameer Sahasrabuddhe, Olivier Giroux. A Formal Analysis of the NVIDIA PTX Memory Consistency Model. ASPLOS 2019. doi:10.1145/3297858.3304043