KL Divergence

Label Smoothing: Intuition, Mathematics, Gradients, and Practical Use

Imagine a teacher grading a multiple-choice exam. If the teacher says, “Only this one answer has any value, and all […]

RLHF is a post-training recipe for turning a broadly capable language model into a more useful assistant. In practice, it

Imagine you are trying to teach a computer to paint. A classic autoencoder is a skilled copier: it learns an

Machine Learning is often described as “data + algorithms”, but mathematics is the glue that makes everything work. At its

The loss function quantifies the difference between the model's predicted output and the actual output (or label) in