ml

  • Monotonic Attention November 9, 2023
    Write-up explaining my implementation of monotonic attention using a probabilistic graphical model.
  • Retentive Networks and RWKV September 16, 2023
    A short, hand-wavy explainer for the mathematical intuition behind faster attention mechanisms.
  • Streaming Convolutions August 24, 2023
    Working out the math for streaming convolutions.
  • Diffusion versus Flow Matching July 19, 2023
    An accessible introduction to diffusion and flow matching models. This post aims to be both complete and easy-to-follow as a reference for implementing diffusion models yourself.
  • Fast Attention Implementations June 29, 2023
    A reference collection of fast attention implementations.
  • RWKV Language Model Math June 16, 2023
    In-depth explanation of the math behind the RWKV model, with PyTorch implementations, plus a discussion of numerical stability.
  • Robotics Pre-training Idea November 1, 2022
    A collection of my ideas relating to robotics pre-training.
  • HMMs and CRFs April 7, 2020
    A comparison of Hidden Markov Models and Conditional Random Fields, two kinds of probabilistic graphical models.
  • Coding the Viterbi Algorithm in Numpy March 15, 2020
    A demo of how to code the Viterbi algorithm in NumPy.
  • Using Gensim Word2Vec Embeddings in Keras August 2, 2016
    A short post and script on using Gensim Word2Vec embeddings in Keras, with example code.
  • Restricted Boltzmann Machines July 18, 2016
    Building on the Recurrent RBM for sequence modeling. This post relates to what I am doing for my Master's thesis.
  • Question Answering using Keras April 27, 2016
    An in-depth introduction to using Keras for language modeling: word embeddings, recurrent and convolutional neural networks, attentional RNNs, and similarity metrics for vector embeddings.
  • A Neural Network in 28 Lines of Theano February 23, 2016
    A quick introduction to using Theano for deep learning, from the bare-bones to a full neural network.