CUDA Cores
The basic processing units inside NVIDIA GPUs
CUDA cores are the individual arithmetic units inside NVIDIA GPUs that execute floating-point and integer operations. They are grouped into streaming multiprocessors (SMs), and a single GPU can contain thousands of them working in parallel.
Example
# A simple vector add: the elements are split across many GPU threads, which run on the CUDA cores
import torch
a = torch.randn(1_000_000, device="cuda")
b = torch.randn(1_000_000, device="cuda")
c = a + b # distributed across CUDA cores
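A vector of one million elements is far larger than the number of CUDA cores on any current GPU, so the work is divided into many threads that the hardware schedules onto the cores in turn. The sketch below shows only the arithmetic of such a split. The helper `launch_geometry` and the block size of 256 threads are illustrative assumptions, not PyTorch's actual launch configuration, which is internal and may assign several elements to each thread.

```python
import math


def launch_geometry(n_elements: int, threads_per_block: int = 256) -> tuple[int, int]:
    """Hypothetical helper: blocks and total threads for one thread per element.

    threads_per_block=256 is an assumed, commonly used block size.
    """
    # Round up so the final, partially filled block still covers the tail.
    blocks = math.ceil(n_elements / threads_per_block)
    total_threads = blocks * threads_per_block
    return blocks, total_threads


n = 1_000_000
blocks, total_threads = launch_geometry(n)
print(f"{n:,} elements -> {blocks:,} blocks, {total_threads:,} threads")

# Compare against a GPU with 16,384 CUDA cores (the RTX 4090 figure).
cuda_cores = 16_384
print(f"about {total_threads / cuda_cores:.0f} threads per CUDA core")
```

Because there are many more threads than cores, each core works through a share of the threads rather than owning a single element.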
Core Counts by GPU
| GPU | CUDA Cores |
|---|---|
| RTX 4090 | 16,384 |
| A100 | 6,912 |
| H100 (SXM5) | 16,896 |
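These totals follow from the number of streaming multiprocessors (SMs) multiplied by the FP32 cores in each SM. The sketch below reproduces the table from NVIDIA's published SM counts. The dictionary `GPU_SPECS` and the helper `cuda_core_count` are hypothetical names introduced for illustration, and the H100 entry assumes the SXM5 variant.

```python
# Published SM counts and FP32 CUDA cores per SM for each GPU.
# GPU_SPECS and cuda_core_count are illustrative names, not a library API.
GPU_SPECS = {
    "RTX 4090": {"sm_count": 128, "cores_per_sm": 128},
    "A100": {"sm_count": 108, "cores_per_sm": 64},
    "H100 (SXM5)": {"sm_count": 132, "cores_per_sm": 128},
}


def cuda_core_count(sm_count: int, cores_per_sm: int) -> int:
    """Total CUDA cores = number of SMs x FP32 cores per SM."""
    return sm_count * cores_per_sm


for name, spec in GPU_SPECS.items():
    total = cuda_core_count(spec["sm_count"], spec["cores_per_sm"])
    print(f"{name}: {spec['sm_count']} SMs x {spec['cores_per_sm']} = {total:,}")
```

Core counts alone are not a performance ranking across architectures, since the cores per SM, clock speeds, and memory systems differ between GPU generations.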