CUDA
NVIDIA's parallel computing platform and API
CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform and programming model that lets developers run general-purpose code on NVIDIA GPUs.
Example
// CUDA C++ kernel — adds two arrays element-wise
__global__ void add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];  // guard: the last block may have extra threads
}

// Launch with 256 threads per block; (n + 255) / 256 rounds up so every
// element is covered. a, b, and c must point to device (GPU) memory.
add<<<(n + 255) / 256, 256>>>(a, b, c, n);
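Kernel launches are asynchronous and do not return a status directly, so a common follow-up (a sketch, not part of the original example) is to check for launch errors and synchronize before reading results:

```cuda
// Check that the launch itself succeeded (e.g. invalid configuration)
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess)
    fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));

// Wait for the kernel to finish and surface any runtime errors
err = cudaDeviceSynchronize();
if (err != cudaSuccess)
    fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
```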
Key Concepts
- Host — the CPU and its memory
- Device — the GPU and its memory
- Kernel — a function that runs on the GPU
- CUDA Toolkit — compiler (nvcc), libraries, and profiling tools
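The host/device split above can be sketched end to end: allocate device buffers, copy inputs from host to device, launch the kernel, and copy the result back. This is a minimal sketch using the standard CUDA runtime API, with error checking omitted for brevity:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Host (CPU) memory
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Device (GPU) memory
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_c, bytes);

    // Copy inputs host -> device
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch the kernel, then copy the result device -> host
    // (this cudaMemcpy also synchronizes with the kernel)
    add<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    printf("c[0] = %f\n", h_c[0]);  // expect 3.0

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

Compile with `nvcc add.cu -o add`. Note the naming convention: `h_` prefixes host pointers and `d_` prefixes device pointers, which helps avoid passing a host pointer to a kernel by mistake.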