Recent Posts

Primer on Softmax

23 minute read

There is a lot of detail to softmax: numerical issues, why it is paired with cross-entropy (CE), and the implications of its inductive bias.

Word2Vec Explainer

21 minute read

A fun and nuanced explainer of Word2Vec, the earliest successful word embedding algorithm.

Surprising Success of Deep Learning

7 minute read

I recently applied to SERIMATS, where I was asked to answer the question: why is it surprising, from the perspective of classic machine learning, that ne...