---
title: "Thread, Block, Grid"
canonical: "https://www.thundercompute.com/glossary/cuda-programming/thread-block-grid"
description: "The hierarchical execution model of CUDA"
sidebarTitle: "Thread / Block / Grid"
icon: "layer-group"
iconType: "solid"
---

CUDA organizes parallel execution into a three-level hierarchy: **threads**, **blocks**, and **grids**.

## Hierarchy

| Level | Description |
|-------|-------------|
| **Thread** | A single unit of execution |
| **Block** | A group of threads that can share memory and synchronize |
| **Grid** | A collection of blocks that together execute a kernel |

## Example

```cpp
// A kernel that doubles each element of an array
__global__ void myKernel(float *data, int n) {
    // Global thread ID: block offset plus position within the block
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) data[tid] *= 2.0f;  // guard against out-of-range threads
}

// Launch a grid of 8 blocks, each with 128 threads = 1024 total threads
myKernel<<<8, 128>>>(data, n);
```

## Key Points

- Threads within a block can use **shared memory** and call `__syncthreads()`
- Blocks are scheduled independently and may run in any order, so threads in different blocks cannot coordinate with `__syncthreads()`
- The maximum number of threads per block is **1024** on current GPUs (compute capability 2.0 and later)

## See Also

- [Kernel](/cuda-programming/kernel)
- [Warp](/cuda-programming/warp)
- [Shared Memory](/memory/shared-memory)
