Don’t be afraid of parallel programming in Python

When it comes to writing code, I have always been a believer in the rules of optimization, which state:

The first rule of optimization is: Don’t do it.

The second rule of optimization (for experts only) is: Don’t do it yet.

These rules exist because, in general, if you try to rewrite your code to get a speedup, you will probably waste a lot of time and end up with code that is unreadable, fragile, and only a few milliseconds faster. This is especially true in scientific computing, where we write in high-level languages that use highly optimized libraries to perform computationally intensive tasks.

However, there are times when those rules of optimization can be broken. And there is a super simple way of leveraging parallel programming in Python that can give you a >10x speed up.
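This excerpt doesn't say which approach the full post uses, so here is only a minimal sketch of one common pattern: handing independent jobs to a pool of worker processes with the standard library's multiprocessing.Pool. The function simulate and the job sizes are made up for illustration, and the speedup you actually get depends on how many cores you have and how independent your jobs are.

```python
import math
from multiprocessing import Pool


def simulate(n):
    """Hypothetical stand-in for one slow, independent unit of work."""
    return sum(math.sqrt(i) for i in range(n))


def run_serial(jobs):
    # Baseline: one job after another on a single core.
    return [simulate(n) for n in jobs]


def run_parallel(jobs, workers=None):
    # Pool() defaults to one worker per CPU core; map() keeps the input order.
    with Pool(processes=workers) as pool:
        return pool.map(simulate, jobs)


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    jobs = [200_000] * 8
    assert run_parallel(jobs) == run_serial(jobs)
```

This only pays off when each job is slow enough to outweigh the cost of starting processes and shipping data between them, which is the rules of optimization again.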

Continue reading

Nonnegative Matrix Factorization for Dummies.

It seems like every paper I look at these days has Nonnegative Matrix Factorization (NMF) in its methods somewhere. From machine learning to calcium imaging, the seemingly magic ability of NMF to pull apart signals gets a lot of use. In this post I want to explain NMF to people who have zero understanding of linear algebra, show a few applications, and maybe give you some inspiration for how to use NMF in your own work.
Continue reading

Merging ROIs in suite2p

Suite2p is a wonderful Matlab toolbox written by Marius Pachitariu for analyzing population calcium imaging data. It uses a number of computational tricks to automate and accelerate the process (so no more drawing regions of interest (ROIs) by hand!). However, I spend most of my time imaging dendrites and axons, and here suite2p has a problem. Suite2p uses a heuristic that looks for approximately elliptical ROIs, and hence it tends to split axons/dendrites into a large number of smaller ROIs. The problem was simple: how can we merge the ROIs belonging to single cells? Well, I used the logic that ROIs that belong to the same neuron should have highly correlated calcium signals (yes, I can imagine situations where this won’t be the case in dendrites, but bAPs will still dominate the calcium trace 99.9% of the time). Hence I simply correlate each ROI with every other ROI. ROIs with a correlation coefficient above some user-settable threshold are considered to be part of the same process.
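To make the idea concrete, here is a rough Python sketch of correlation-based merging. It is not the Matlab script itself: the names pearson and merge_rois, the default threshold of 0.8, and the choice to group ROIs transitively (if A matches B and B matches C, all three end up together) are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A flat trace has no defined correlation; treat it as uncorrelated.
    return cov / (sx * sy) if sx and sy else 0.0


def merge_rois(traces, threshold=0.8):
    """Group ROI indices whose traces correlate above the threshold."""
    parent = list(range(len(traces)))  # union-find forest, one root per group

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Correlate each ROI with every other ROI and link the matching pairs.
    for i in range(len(traces)):
        for j in range(i + 1, len(traces)):
            if pearson(traces[i], traces[j]) > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(traces)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up traces: ROIs 0, 1 and 3 follow the same events, ROI 2 does not.
traces = [
    [0, 1, 0, 2, 0, 1],
    [0, 2, 0, 4, 0, 2],
    [1, 0, 1, 0, 1, 0],
    [0, 1.1, 0.1, 1.9, 0, 1],
]
print(merge_rois(traces))  # [[0, 1, 3], [2]]
```

Raising the threshold makes merging more conservative, which is why it is worth keeping it user-settable.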

The main script is available here, and it requires distinguishable_colors.m (which in turn requires the image processing toolbox I believe).


The code is relatively well documented/commented, and there is even a ‘Help!’ button. If anyone has any problems with it, please let me know.

Visualizing how FFTs work.

A lot of scientists have performed Fast Fourier Transforms at some point, and those who haven’t probably will in the future, or at the very least have read a paper that uses one. I’d used them for years before I ever began to think about how the algorithm actually worked. However, if you’ve ever looked it up, unless math is your first language, the explanation probably didn’t help you a lot. Normally you either get an explanation along the lines of “FFTs convert the signal from the time domain to the frequency domain” or you just get this:

X_{(k)}\ = \sum_{n=0}^{N-1} x_{(n)} \cdot e^{-2 \pi i k n / N}

However, the other day I came across an amazing explanation of the algorithm, and I really wanted to share it. While I might not be able to get you to the point that you completely understand the FFT, I think it might seriously enhance your understanding.
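If code is easier to read than sigma notation, the formula above can be written out almost symbol for symbol. This is a sketch of the direct sum only (the discrete Fourier transform, which takes on the order of N² operations); an FFT produces the same numbers with far fewer operations. The function name dft and the test signal are just for illustration.

```python
import cmath
import math


def dft(x):
    """Evaluate X_(k) = sum over n of x_(n) * exp(-2*pi*i*k*n/N) for every k."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


# One full cosine cycle across 8 samples.
signal = [math.cos(2 * math.pi * n / 8) for n in range(8)]
spectrum = dft(signal)

# The energy lands in bin 1 and its mirror image, bin 7.
print([round(abs(X), 6) for X in spectrum])
```

Each output X_(k) is the signal multiplied, sample by sample, by a complex sinusoid that completes k cycles over the N samples, and then summed.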
Continue reading

Extracting data from a scatter graph

I’ve already made a rudimentary script for extracting data from published waveforms and other line graphs. But what I’ve needed recently is to be able to extract data from XY scatter graphs. This is a slightly more complex problem because it requires feature detection of an unknown number of points. You can access the script here, and I’ll go over its use and some of the code in the rest of this post. Continue reading
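To give a feel for what "detecting an unknown number of points" involves, here is a toy Python sketch of one standard approach: label each connected blob of marker pixels and take its centroid. This is not the linked script, just an assumed illustration. The function find_points and the tiny hand-drawn image are hypothetical, and a real figure would first need thresholding to get the mask and axis calibration to turn pixels into data values.

```python
from collections import deque


def find_points(mask):
    """Return the (row, col) centroid of each 4-connected blob of True pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects every pixel of this blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            centroids.append((
                sum(p[0] for p in blob) / len(blob),
                sum(p[1] for p in blob) / len(blob),
            ))
    return centroids


# A made-up 5x8 "scatter plot": '#' marks a marker pixel.
image = [
    "........",
    ".##.....",
    ".##...#.",
    "........",
    "...#....",
]
mask = [[ch == "#" for ch in row] for row in image]
print(find_points(mask))  # [(1.5, 1.5), (2.0, 6.0), (4.0, 3.0)]
```

Overlapping markers merge into a single blob with this approach, so dense plots need something smarter.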

Trying to make the world’s cheapest syringe pump/linear drive

99% of things you buy from scientific suppliers are violently overpriced. Peristaltic pumps for $2000 that only contain $50 worth of equipment. Homeothermic blankets should only cost about three fifty. But the ones that have always annoyed me are syringe drivers. These are nothing more than a stepper motor, a lead screw and a bit of electronics. The budget ones tend to start at around $300, and they go up to ten times that. I’m not standing for that, and neither should you. So I wanted to make one myself.


Continue reading

Extracting raw data from figures

Because I’m a cynical bastard, I regularly try to figure out what the real content of a published waveform is. For me, it’s usually someone’s EEG data that supposedly has some FFT peak that I can’t really believe. So instead of poring over waveforms with Photoshop (read: Microsoft Paint) to figure out the data that’s in an image, some time ago I wrote a program in Python to allow you to automatically get the numbers.

So I finally translated it to JavaScript, so all of you can benefit from it (and also I can use it at SfN).
Continue reading

Shouting Into the Void: Interacting with PubMed part II

So you’ve just been accepted for publication. At this point, after months of extra experiments and back and forths with reviewers, you’re probably well and truly sick of your paper. However, as the months roll by and you look at the paper again, you note a cute bit of data analysis here, a nice turn of phrase there, and with a gleam in your eye you look to see how many citations you’ve gotten. And unless you’re very lucky, that number might well still be in single digits. 12 months of work, and fewer than 10 people have ever cited your work. Maybe you feel like it was all for nothing. Well I’m here to make you feel better, because your work was much more important than that. Continue reading

Interacting with PubMed. Part I

As a scientist, your life’s work is your publication list. I like to be intimate with mine. Sometimes I just stare at it. I’d buy it a glass of wine if I could. Maybe even caress it softly. Sure, she ain’t much to look at, but she’s mine, and I want to show her off. And if you want a job, you’re going to want to show yours off too. So I’m going to show you how to scrape your publication list from PubMed with Python. Continue reading
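As a taste of what that can look like, here is a minimal sketch that asks NCBI's E-utilities ESearch endpoint for the PubMed IDs matching an author. The full post may well do it differently; the helper names build_search_url and fetch_pmids and the author name "Smith J" are placeholders made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(author, max_results=100):
    """Build an ESearch URL for papers by the given author."""
    params = {
        "db": "pubmed",
        "term": f"{author}[Author]",  # restrict the search to the author field
        "retmax": max_results,
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


def fetch_pmids(author, max_results=100):
    """Query PubMed (needs network access) and return a list of PubMed IDs."""
    with urlopen(build_search_url(author, max_results)) as response:
        data = json.load(response)
    return data["esearchresult"]["idlist"]


# "Smith J" is a placeholder; swap in your own name as PubMed indexes it.
print(build_search_url("Smith J", max_results=5))
```

Common names will pull in other people's papers, so expect to narrow the search term (for example by affiliation) before showing the list off.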