CUDA C++ Memory Model

2017 · NVIDIA · language, gpu, scoped

The C++-style scoped memory model exposed by CUDA since version 9, coherent across host and device threads. Programs use it through cuda::atomic, which mirrors the C++11 atomics interface and adds a thread-scope template parameter. The model is layered on, and lowered to, the PTX memory model.
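As a rough illustration of the interface being mirrored, the sketch below shows release/acquire message passing with plain C++11 std::atomic on the host. It is not taken from the CUDA documentation. The function name run_message_passing and the payload value are hypothetical, chosen only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper for illustration: returns the payload the consumer saw.
int run_message_passing() {
    int payload = 0;            // ordinary, non-atomic data
    std::atomic<int> ready{0};  // flag; in CUDA C++ the analogous type would be
                                // cuda::atomic<int, cuda::thread_scope_system>
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                               // write the data first
        ready.store(1, std::memory_order_release);  // then publish the flag
    });

    std::thread consumer([&] {
        // Spin until the flag is visible; acquire pairs with the release above.
        while (ready.load(std::memory_order_acquire) == 0) {
            std::this_thread::yield();
        }
        observed = payload;  // guaranteed to read the producer's write
    });

    producer.join();
    consumer.join();
    assert(observed == 42);  // release/acquire forbids seeing the stale payload
    return observed;
}
```

In CUDA C++ the same pattern would be written with cuda::atomic from the <cuda/atomic> header, whose scope parameter defaults to system scope. Choosing a narrower scope such as cuda::thread_scope_block only orders the accesses among threads within that scope, which is where the model departs from C++11.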

Ordering relationships

Strictly weaker than: C11/C++11 Memory Model. cuda::atomic mirrors C++11 atomics but adds thread scopes (thread, block, device, system); atomics at scopes narrower than system admit behaviours that C11 forbids.
Compiles correctly to: NVIDIA PTX Memory Model. CUDA C++ atomics lower to PTX instructions, and the CUDA model is layered directly on the PTX memory model.

References

Daniel Lustig, Sameer Sahasrabuddhe, Olivier Giroux. A Formal Analysis of the NVIDIA PTX Memory Consistency Model. ASPLOS 2019. doi:10.1145/3297858.3304043
