The KL Divergence

\operatorname{KL}(p \,\|\, q) = \sum_{i} p(i) \log\!\left(\frac{p(i)}{q(i)}\right)

The KL divergence measures how much one probability distribution differs from another; it is frequently used as a regularization term when training language models.
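As a minimal sketch, the discrete sum above can be computed directly. The function name and example distributions here are illustrative, not from the original; it assumes `p` and `q` are same-length discrete distributions with `q(i) > 0` wherever `p(i) > 0`.

```python
import math

def kl_divergence(p, q):
    # Discrete KL divergence: sum of p(i) * log(p(i) / q(i)).
    # Terms with p(i) == 0 contribute 0 by convention, so they are skipped.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))  # positive, since p != q
print(kl_divergence(p, p))  # 0.0, since the distributions are identical
```

Note that the measure is asymmetric: `kl_divergence(p, q)` generally differs from `kl_divergence(q, p)`, so it is a divergence rather than a true distance.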