Blog
Thoughts on AI, deep learning, and engineering automation.
2024-10-25
Unaligned Multi-Modal Transformers
Most vision-language models assume tight alignment between image patches and text tokens. What happens when we relax that constraint? A look at unaligned multi-modal architectures and why they matter.
Read more →
2024-10-20
Einstein Summation (einsum) for Multi-Dimensional Arrays
einsum notation lets you write tensor contractions such as matrix multiplications, batched operations, and traces in a single readable line. Here is how to use it.
Read more →
2024-09-01
Self-Attention in Transformers Explained
A visual walkthrough of self-attention, the core building block of Transformer models, with a step-by-step implementation in PyTorch.
Read more →