Global Memory

Global memory is the largest memory space on a GPU and resides in device DRAM (VRAM). Every thread in every block can read and write it, but accesses have much higher latency than shared memory or registers.

Example

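// Allocate global (device) memory and copy host data into it.
// Assumes h_data points to n floats in host memory.
float *d_data = nullptr;
cudaMalloc(&d_data, n * sizeof(float));
cudaMemcpy(d_data, h_data, n * sizeof(float), cudaMemcpyHostToDevice);
// ... launch kernels that read or write d_data ...
cudaFree(d_data);  // release the allocation when it is no longer needed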
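The allocation and copy above can be extended into a complete round trip. The following is a minimal sketch, not a definitive implementation: the `CUDA_CHECK` macro, the `scale` kernel, and the `scale_on_device` helper are hypothetical names introduced here for illustration. It shows threads from every block indexing into the same global array, with each runtime call checked for errors.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Every thread in every block indexes into the same global array.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: copy to global memory, scale in place, copy back.
std::vector<float> scale_on_device(const std::vector<float> &h_in, float factor) {
    int n = static_cast<int>(h_in.size());
    std::vector<float> h_out(n);
    if (n == 0) return h_out;

    float *d_data = nullptr;
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d_data, h_in.data(), n * sizeof(float),
                          cudaMemcpyHostToDevice));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    scale<<<blocks, threads>>>(d_data, n, factor);
    CUDA_CHECK(cudaGetLastError());  // catches launch configuration errors

    // This blocking copy also waits for the kernel to finish.
    CUDA_CHECK(cudaMemcpy(h_out.data(), d_data, n * sizeof(float),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_data));
    return h_out;
}

int main() {
    std::vector<float> h_data(1000, 1.5f);
    std::vector<float> result = scale_on_device(h_data, 2.0f);
    assert(result[0] == 3.0f && result[999] == 3.0f);
    std::printf("first = %.1f, last = %.1f\n", result[0], result[999]);
    return 0;
}
```

With 1000 elements and 256 threads per block, four blocks are launched, and all of them operate on the same `d_data` allocation.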

Key Characteristics

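Size: Gigabytes (bounded by the device's VRAM capacity)
Latency: Roughly 400-800 cycles (varies by GPU architecture)
Accessible by: All threads in all blocks
Access pattern: Coalesced access (consecutive threads in a warp reading consecutive addresses) is critical for performance, because it lets the hardware combine a warp's accesses into a small number of memory transactions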
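The effect of coalescing can be sketched with a copy kernel whose read stride is adjustable. This is an illustrative sketch under stated assumptions: `copy_with_stride` and `time_copy` are hypothetical names, error checking is omitted for brevity, and the timings it prints depend entirely on the GPU it runs on.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// stride == 1: thread i reads in[i], so a warp reads one contiguous span
//              (coalesced).
// stride > 1:  neighbouring threads read addresses `stride` elements apart,
//              so a warp's reads are spread over many memory segments.
// Writes to out[i] are coalesced in both cases; only the reads differ.
__global__ void copy_with_stride(const float *in, float *out, int n, int stride) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int j = (int)(((long long)i * stride) % n);
        out[i] = in[j];
    }
}

// Hypothetical helper: returns the kernel time in milliseconds.
float time_copy(int n, int stride) {
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemset(d_in, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Warm-up launch so one-time startup costs are not timed.
    copy_with_stride<<<blocks, threads>>>(d_in, d_out, n, stride);

    cudaEventRecord(start);
    copy_with_stride<<<blocks, threads>>>(d_in, d_out, n, stride);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until the timed kernel has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_in);
    cudaFree(d_out);
    return ms;
}

int main() {
    const int n = 1 << 24;  // 16M floats, 64 MiB per buffer
    std::printf("stride 1:  %.3f ms\n", time_copy(n, 1));
    std::printf("stride 32: %.3f ms\n", time_copy(n, 32));
    return 0;
}
```

On most GPUs the stride-32 run would be expected to take noticeably longer than the stride-1 run, though the size of the gap depends on the architecture and its caches.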
