HBM (High Bandwidth Memory)

Appears in 1 paper

High-capacity GPU memory (e.g., 80GB on an H100), with lower bandwidth than SRAM.

As used in Paper 21 — Mamba: Linear-Time Sequence Modeling with Selective State Spaces →

High-capacity GPU memory (e.g., 80GB on an H100), with lower bandwidth than on-chip SRAM. Transformers and Mamba can spend a large share of their runtime moving data between HBM and the faster but much smaller SRAM, so efficient algorithms minimise HBM-SRAM communication.
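
To make the cost of HBM-SRAM communication concrete, here is a rough back-of-the-envelope sketch in Python. It counts the bytes moved to and from HBM for a chain of elementwise operations, first with every intermediate written back to HBM, then with the operations fused so intermediates stay in SRAM. The function names, the 2-byte element size, and the simple traffic model (one read and one write per operation) are illustrative assumptions, not something taken from the Mamba paper.

```python
def hbm_bytes_unfused(n_elements: int, n_ops: int, bytes_per_element: int = 2) -> int:
    """HBM traffic when each op reads its input from HBM and writes its output back.

    Assumed model: every op does one full read and one full write of the tensor.
    """
    return n_ops * 2 * n_elements * bytes_per_element


def hbm_bytes_fused(n_elements: int, bytes_per_element: int = 2) -> int:
    """HBM traffic when all ops run in one fused kernel.

    Assumed model: the input is read once, the final output is written once,
    and every intermediate value stays in on-chip SRAM.
    """
    return 2 * n_elements * bytes_per_element


# Hypothetical workload: 3 elementwise ops over 1024 half-precision values.
unfused = hbm_bytes_unfused(n_elements=1024, n_ops=3)  # 12288 bytes
fused = hbm_bytes_fused(n_elements=1024)               # 4096 bytes
reduction = unfused / fused                            # 3.0
```

Under this simplified model the saving grows linearly with the number of fused operations, which is why keeping intermediates in SRAM is attractive whenever a computation is limited by memory bandwidth rather than arithmetic.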