Principal Component Analysis (PCA) For Dummies

PCA is one of the first tools you’ll put in your data science tool box. Typically, you’ve collected so much data, that you don’t even know where to begin, so you perform PCA to reduce the data down to two dimensions, and then you can plot it, and look for structure. There are plenty of other reasons to perform PCA, but that’s a common one.

However, when you ask how PCA works, you either get a simple graphical explanation, or long webpages that boil down to the statement “The PCAs of your data are the eigenvectors of your data’s covariance matrix”. If you understood what that meant, you wouldn’t be looking up how PCA works! I want to try to provide a gateway between the graphical intuition and that statement about covariance and eigenvectors. Hopefully this will give you 1) a deeper understanding of linear algebra, 2) a nice intuition about what the covariance matrix is, and of course, 3) help you understand how PCA works.

Continue reading

Nonnegative Matrix Factorization for Dummies.

It seems like every paper I look at these days has Nonnegative Matrix Factorization (NMF) in its methods somewhere. From machine learning, to calcium imaging, the seemingly magic ability of NMF to pull apart signals gets a lot of use. In this post I want to explain NMF to people who have zero understanding of linear algebra, show a few applications, and maybe give you some inspiration of how to use NMF in your own work.
Continue reading

Visualizing how FFTs work.

A lot of scientists have performed Fast Fourier Transforms at some point, and those that haven’t, probably are going to in future, or at the very least, have read a paper using it. I’d used them for years before I ever began to think about how they algorithm actually worked. However, if you’ve ever looked it up, unless math is your first language, the explanation probably didn’t help you a lot. Normally you either get an explanation along the lines of “FFTs convert the signal from the time domain to the frequency domain” or you just get this:

X_{(k)}\ = \sum_{n=0}^{N-1} x_{(n)} \cdot e^{-2 \pi i k n / N}

However, the other day I came across an amazing explanation of the algorithm, and I really wanted to share it. While I might not be able to get you to the point that you completely understand the FFT, I think think it might seriously enhance your understanding.
Continue reading