Definition
Language Models
Field: Natural Language Processing
A probability distribution over sequences of words.
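As a rough illustration of the definition above, the sketch below builds a toy model that assigns a probability to every word sequence of a fixed length and checks that those probabilities sum to 1. The two-word vocabulary, the probability tables, and the function names are all hypothetical, chosen only for this example. It also assumes each word depends only on the previous word (a bigram simplification), which the definition itself does not require.

```python
from itertools import product

# Hypothetical toy vocabulary and hand-picked probabilities (illustrative only).
VOCAB = ["a", "b"]
START = {"a": 0.6, "b": 0.4}           # P(first word)
NEXT = {                                # P(next word | previous word)
    "a": {"a": 0.5, "b": 0.5},
    "b": {"a": 0.25, "b": 0.75},
}

def sequence_probability(words):
    """Probability of a word sequence under the toy bigram model."""
    if not words:
        return 1.0
    prob = START[words[0]]
    # Multiply in one conditional probability per adjacent word pair.
    for prev, cur in zip(words, words[1:]):
        prob *= NEXT[prev][cur]
    return prob

def total_mass(length):
    """Sum of probabilities over every sequence of the given length."""
    return sum(sequence_probability(seq) for seq in product(VOCAB, repeat=length))
```

Calling `sequence_probability(["a", "b"])` multiplies `START["a"]` by `NEXT["a"]["b"]`, and `total_mass(n)` should come out at 1 (up to floating-point error) for any fixed length `n`, which is what makes this a probability distribution over sequences of that length.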
Sequence of Expressions
Theorem
The Chain Rule
Theorem
Cross-Entropy Loss
Theorem
The Softmax Function
Theorem
Vanishing Gradients
Theorem
The KL Divergence
Principle
Maximum Likelihood Estimation (MLE)
Principle