---
title: "Backpropagation"
canonical: "https://www.thundercompute.com/glossary/training/backpropagation"
description: "The algorithm for computing gradients through a neural network"
sidebarTitle: "Backpropagation"
icon: "arrow-left"
iconType: "solid"
---

**Backpropagation** (backward pass) is the algorithm that computes the gradient of the loss function with respect to each weight in the network, by applying the chain rule layer by layer from output back to input.

## Example

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU if no GPU
model = nn.Linear(784, 10).to(device)
x = torch.randn(32, 784, device=device)
target = torch.randint(0, 10, (32,), device=device)

output = model(x)                            # forward pass
loss = nn.CrossEntropyLoss()(output, target)  # compute loss
loss.backward()                               # backpropagation

print(model.weight.grad.shape)  # [10, 784] — gradient for each weight
```
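The chain rule underlying `loss.backward()` can be checked by hand on a tiny two-step computation, a minimal sketch with made-up scalar values:

```python
import torch

# y = w * x, loss = y ** 2
# Chain rule: dloss/dw = (dloss/dy) * (dy/dw) = 2*y * x
x = torch.tensor(3.0)
w = torch.tensor(2.0, requires_grad=True)

y = w * x          # forward step 1
loss = y ** 2      # forward step 2
loss.backward()    # backpropagation through both steps

manual_grad = 2 * y.detach() * x  # chain rule applied by hand
print(w.grad.item(), manual_grad.item())  # both 36.0
```

Autograd arrives at the same number by composing the local derivatives of each operation, which is exactly what backpropagation does at scale across every layer of a network.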

## Key Points

- Follows the forward pass — requires stored activations
- Computes **gradients** used by the optimizer to update weights
- Typically uses ~2x the compute of the forward pass
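To show how those gradients feed the weight update, here is a minimal sketch of one full training step (the layer sizes and learning rate are illustrative, not from the example above):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)
target = torch.randint(0, 2, (8,))

loss = nn.CrossEntropyLoss()(model(x), target)  # forward pass + loss
opt.zero_grad()   # clear gradients left over from any previous step
loss.backward()   # backpropagation: fills p.grad for every parameter
opt.step()        # SGD update: p -= lr * p.grad
```

Note that `backward()` accumulates into `.grad` rather than overwriting it, which is why `zero_grad()` is called each step.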

## See Also

- [Forward Pass](/training/forward-pass)
- [Epoch](/training/epoch)
