# Thunder Compute > A glossary of GPU computing terms, CUDA programming concepts, and machine learning terminology ## GPU Hardware - [GPU](https://thunder-compute-glossary.jamdesk.app/gpu-hardware/gpu.md): Graphics Processing Unit — a processor designed for parallel computation - [What is VRAM?](https://thunder-compute-glossary.jamdesk.app/gpu-hardware/vram.md): Short definition of VRAM and why GPU memory capacity matters for real workloads. - [What are CUDA Cores?](https://thunder-compute-glossary.jamdesk.app/gpu-hardware/cuda-cores.md): Learn the basics regarding CUDA cores, plus a simple example of how they enable parallel work on NVIDIA GPUs. - [What are Tensor Cores?](https://thunder-compute-glossary.jamdesk.app/gpu-hardware/tensor-cores.md): Specialized GPU cores for matrix multiply-accumulate operations ## CUDA Programming - [CUDA](https://thunder-compute-glossary.jamdesk.app/cuda-programming/cuda.md): NVIDIA's parallel computing platform and API - [Kernel](https://thunder-compute-glossary.jamdesk.app/cuda-programming/kernel.md): A function that executes in parallel on the GPU - [Thread, Block, Grid](https://thunder-compute-glossary.jamdesk.app/cuda-programming/thread-block-grid.md): The hierarchical execution model of CUDA - [Warp](https://thunder-compute-glossary.jamdesk.app/cuda-programming/warp.md): A group of 32 threads that execute in lockstep on a GPU ## Memory - [Global Memory](https://thunder-compute-glossary.jamdesk.app/memory/global-memory.md): The main, large memory space on a GPU accessible by all threads - [Shared Memory](https://thunder-compute-glossary.jamdesk.app/memory/shared-memory.md): Fast, on-chip memory shared among threads in a block - [Memory Bandwidth](https://thunder-compute-glossary.jamdesk.app/memory/memory-bandwidth.md): The rate at which data can be read from or written to GPU memory ## Training - [Forward Pass](https://thunder-compute-glossary.jamdesk.app/training/forward-pass.md): Computing the output of a neural network from input to prediction - [Backpropagation](https://thunder-compute-glossary.jamdesk.app/training/backpropagation.md): The algorithm for computing gradients through a neural network - [Batch Size](https://thunder-compute-glossary.jamdesk.app/training/batch-size.md): The number of training samples processed in one forward/backward pass - [Epoch](https://thunder-compute-glossary.jamdesk.app/training/epoch.md): One complete pass through the entire training dataset ## Inference - [Inference](https://thunder-compute-glossary.jamdesk.app/inference/inference.md): Using a trained model to make predictions on new data - [Quantization](https://thunder-compute-glossary.jamdesk.app/inference/quantization.md): Reducing model precision to use less memory and compute faster - [Throughput vs. Latency](https://thunder-compute-glossary.jamdesk.app/inference/throughput-vs-latency.md): Two key metrics for measuring inference performance ## Parallelism - [Data Parallelism](https://thunder-compute-glossary.jamdesk.app/parallelism/data-parallelism.md): Distributing training data across multiple GPUs, each with a copy of the model - [Model Parallelism](https://thunder-compute-glossary.jamdesk.app/parallelism/model-parallelism.md): Splitting a model across multiple GPUs when it's too large for one - [Tensor Parallelism](https://thunder-compute-glossary.jamdesk.app/parallelism/tensor-parallelism.md): Splitting individual layers across multiple GPUs