Backpropagation
The algorithm for computing gradients through a neural network
Backpropagation (backward pass) is the algorithm that computes the gradient of the loss function with respect to each weight in the network, by applying the chain rule layer by layer from output back to input.
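To make the chain rule concrete before the full example below, here is a minimal sketch for a single scalar weight, with the hand-derived gradient checked against autograd (values are illustrative):
import torch
x, t = 3.0, 1.0
w = torch.tensor(2.0, requires_grad=True)
y = w * x                    # forward: y = w * x
loss = (y - t) ** 2          # loss = (y - t)^2
# Chain rule by hand: dloss/dw = dloss/dy * dy/dw = 2 * (y - t) * x
manual_grad = 2 * (y.item() - t) * x    # 2 * (6 - 1) * 3 = 30
loss.backward()                          # backpropagation computes the same value
print(w.grad.item(), manual_grad)        # 30.0 30.0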
Example
import torch
import torch.nn as nn
device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU if no GPU
model = nn.Linear(784, 10).to(device)
x = torch.randn(32, 784, device=device)
target = torch.randint(0, 10, (32,), device=device)
output = model(x) # forward pass
loss = nn.CrossEntropyLoss()(output, target) # compute loss
loss.backward() # backpropagation
print(model.weight.grad.shape) # [10, 784] — gradient for each weight
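The gradients left in each parameter's .grad are then consumed by an optimizer. A minimal sketch of one full training step, continuing the example above (SGD and the learning rate are chosen for illustration):
import torch.optim as optim
optimizer = optim.SGD(model.parameters(), lr=0.1)   # illustrative optimizer and learning rate
optimizer.zero_grad()                                # clear gradients from any previous step
output = model(x)                                    # forward pass
loss = nn.CrossEntropyLoss()(output, target)         # compute loss
loss.backward()                                      # backpropagation fills .grad on each parameter
optimizer.step()                                     # update weights using the stored gradients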
Key Points
- Runs after the forward pass and requires the activations stored during it (see the checkpointing sketch after this list)
- Computes gradients used by the optimizer to update weights
- Typically uses ~2x the compute of the forward pass
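Because the backward pass depends on stored activations, memory can be traded for extra compute by recomputing them on the fly. A minimal sketch using torch.utils.checkpoint (layer sizes are illustrative):
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint
block = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x = torch.randn(32, 784, requires_grad=True)
# Activations inside `block` are not stored; they are recomputed during backward
out = checkpoint(block, x, use_reentrant=False)
out.sum().backward()
print(x.grad.shape)  # [32, 784]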