---
title: "Batch Size"
canonical: "https://www.thundercompute.com/glossary/training/batch-size"
description: "The number of training samples processed in one forward/backward pass"
sidebarTitle: "Batch Size"
icon: "boxes-stacked"
iconType: "solid"
---

**Batch size** is the number of training samples that are processed together in a single forward and backward pass. It directly affects GPU memory usage, training speed, and model convergence.

## Example

```python
from torch.utils.data import DataLoader

# Batch size of 64 — each iteration processes 64 samples
train_loader = DataLoader(dataset, batch_size=64, shuffle=True)

for batch in train_loader:
    inputs, labels = batch
    optimizer.zero_grad()                     # clear gradients from the previous step
    outputs = model(inputs.cuda())            # forward pass on up to 64 samples at once
    loss = criterion(outputs, labels.cuda())  # loss averaged over the batch
    loss.backward()
    optimizer.step()
```
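When a batch doesn't fit in GPU memory, a common workaround is gradient accumulation: run several smaller micro-batches and sum their (scaled) gradients before stepping the optimizer. The sketch below is a minimal, CPU-runnable illustration; the tiny model, synthetic data, and the names `accum_steps` and micro-batch sizes are all illustrative assumptions, not part of the example above.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for a real pipeline: a tiny model and random data.
torch.manual_seed(0)
model = nn.Linear(10, 1)
criterion = nn.MSELoss()

inputs = torch.randn(64, 10)   # one "effective" batch of 64 samples
targets = torch.randn(64, 1)

accum_steps = 4                # 4 micro-batches of 16 simulate batch size 64
model.zero_grad()
for micro_in, micro_tgt in zip(inputs.chunk(accum_steps),
                               targets.chunk(accum_steps)):
    loss = criterion(model(micro_in), micro_tgt)
    (loss / accum_steps).backward()  # scale so summed grads equal the batch mean
# model.parameters() now hold gradients matching a single batch-size-64 pass;
# a real loop would call optimizer.step() here.
```

Because the micro-batches are equal-sized and the loss uses mean reduction, the accumulated gradients match a single full-batch backward pass.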

## Trade-offs

| Larger Batch | Smaller Batch |
|-------------|---------------|
| Better GPU utilization | Lower memory usage |
| Faster wall-clock time per epoch | Can generalize better |
| May need learning rate scaling | Noisier gradient estimates |
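One common heuristic for the learning-rate scaling mentioned above is the linear scaling rule: grow the learning rate in proportion to the batch size. The helper below is a hypothetical sketch (the function name and the base batch size of 256 are assumptions, not a fixed standard).

```python
def scale_lr(base_lr: float, batch_size: int, base_batch_size: int = 256) -> float:
    """Linear scaling rule: lr grows proportionally with batch size."""
    return base_lr * batch_size / base_batch_size

# Doubling the batch size doubles the learning rate under this rule:
scale_lr(0.1, 512)  # 0.1 * 512 / 256 = 0.2
```

In practice this rule is usually paired with a warmup period, and very large batches may need more careful tuning than a single multiplier.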

## See Also

- [Forward Pass](/training/forward-pass)
- [Epoch](/training/epoch)
- [VRAM](/gpu-hardware/vram)
