§ 01 · Writing

Thoughts,
written down.

Ideas about math, computer science, life, and everything in between.

6 essays

2026

The Intuition Behind Self-Attention

Attention has distinct advantages over its predecessor, the RNN. This article builds the intuition behind attention, so you can understand why it's so powerful and widely used.

machine-learning · transformers

Machine Learning Jun

The Geometry Behind L1/L2 Regularization

L1 prefers sparse weights while L2 prefers small weights. We'll explore why, and how circles and squares help answer the question.

regularization · optimization

Mathematics May

PCA is the Answer to a Constrained Optimization

Eigenvectors of the covariance matrix aren't a coincidence. They fall out of maximizing variance under a unit-norm constraint.

linear-algebra · statistics · optimization · calculus

Machine Learning May

Every Gradient in Your Neural Network Is Just the Chain Rule

Hand-compute every gradient in a neural network. By the end, you'll know why we use backpropagation to train one.

calculus · neural networks · backpropagation

Mathematics Apr

Eigenvectors: The Unifying Language Behind Matrix Decomposition

We'll discuss what an eigenvector is and then relate it to three common forms of matrix decomposition, showing how each form builds upon the previous.

linear algebra · statistics

Life Mar

Hello, World

First post! Why I'm starting this blog, and what to expect.

meta · writing

Essays · 6 published · Drafts arrive when they're ready.