Parameters

Appears in 1 paper

The learnable weights in a neural network model.

As used in Paper 18 — Mistral 7B →

Mistral 7B has roughly 7 billion parameters, while LLaMA 2 13B has roughly 13 billion. More parameters generally give a model higher capacity, but they also require more memory and compute for both training and inference. The Mistral 7B paper reports that the model outperforms LLaMA 2 13B on the benchmarks it evaluated, which suggests that architecture and training choices can offset a smaller parameter count.
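
To make the link between parameter count and memory concrete, here is a minimal sketch of the arithmetic. It assumes weights are stored at 2 bytes each (16-bit precision) and uses the rounded counts of 7 and 13 billion. The helper names `dense_layer_params` and `weight_memory_gb` are hypothetical, chosen for illustration, and the estimate covers weights only, not activations, optimizer state, or caches.

```python
def dense_layer_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """Count the learnable weights in one fully connected layer.

    The weight matrix contributes n_in * n_out parameters, and the
    optional bias vector contributes n_out more.
    """
    return n_in * n_out + (n_out if bias else 0)


def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory needed to hold the weights, in GB (10^9 bytes).

    bytes_per_param=2 is an assumption (16-bit storage); 32-bit
    storage would use 4 bytes per parameter.
    """
    return num_params * bytes_per_param / 1e9


# Rounded parameter counts, used here purely for illustration.
models = {
    "Mistral 7B": 7_000_000_000,
    "LLaMA 2 13B": 13_000_000_000,
}

for name, count in models.items():
    print(f"{name}: ~{weight_memory_gb(count):.0f} GB of weights at 16-bit precision")
```

Under these assumptions the smaller model needs about 14 GB for its weights and the larger about 26 GB, which is why parameter count is often the first figure checked when deciding whether a model fits on a given device.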