Erasing background from an image

I have two opaque images –  one with an object and a background, and another with just the background.  Like:

I want to subtract the background from the image so that the alpha blended result is visually identical, but the foreground is as transparent as possible.

E.g:

lion

Desired output (All images under Reuse With Modification license)

I’m sure that this must have been, but I couldn’t find a single correct way of doing this!

I asked a developer from the image editor gimp team, and they replied that the standard way is to create an alpha mask on the front image from the difference between the two images.  i.e. for each pixel in both layers, subtract the rgb values, average that difference between the three channels, and then use that as an alpha.

But this is clearly not correct.  Imagine the foreground has a green piece of semi-transparent glass against a red background.  Just using an alpha mask is clearly not going to subtract the background because you need to actually modify the rgb values in the top layer image to remove all the red.

So what is the correct solution?  Let’s do the calculations.

If we have a solution, the for a solid background with a semi-transparent foreground layer that is alpha blended on top, the final visual color is:

out_{rgb} = src_{rgb} * src_{alpha} + dst_{rgb} \cdot (1-src_{alpha})

We want the visual result to be the same, so we know the value of out_{rgb} – that’s our original foreground+background image.  And we know dst_{rgb} – that’s our background image.  We want to now create a new foreground image, src_{rgb}, with the maximum value of src_{alpha}.

So to restate this again – I want to know how to change the top layer src so that I can have the maximum possible alpha without changing the final visual image at all.  I.e. remove as much of the background as possible from our foreground+background image.

Note that we also have the constraint that for each color channel, that src_{rgb} \le 1 since each rgb pixel value is between 0 and 1.  So:

src_{alpha} \le (out_{rgb} - dst_{rgb})/(1-dst_{rgb})

So:

src_{alpha} = Min((out_r - dst_r)/(1-dst_r), out_g - dst_g)/(1-dst_g), out_b - dst_b)/(1-dst_b))\\ src_{rgb} = (dst_{rgb} \cdot (1-src_{alpha}) - out_{rgb})/src_{alpha}

Proposal

Add an option for the gimp eraser tool to ‘remove layers underneath’, which grabs the rgb value of the layer underneath and applies the formula using the alpha in the brush as a normal erasure would, but bounding the alpha to be no more than the equation above, and modifying the rgb values accordingly.

Result

I showed this to the Gimp team, and they found a way to do this with the latest version in git.  Open the two images as layers.  For the top layer do: Layer->Transparency->Add Alpha Channel.  Select the Clone tool.  On the background layer, ctrl click anywhere to set the Clone source.  In the Clone tool options, choose Default and Color erase, and set alignment to Registered.  Make the size large, select the top layer again, and click on it to erase everything.

Result is:

Screenshot_20170307_133539.png

When the background is a very different color, it works great – the sky was very nicely erased.  But when the colors are too similar, it goes completely wrong.

Overall..  a failure.  But interesting.

Practical Deep Neural Network in a few lines of code with TensorFlow

This is using the very latest TensorFlow tf.contrib.learn API.  The documentation is extremely thin, and as of writing there are no tutorials out there similar to this.  The API may change, and I don’t actually know if I’m doing things correctly.  This is just trial and error 🙂

So let’s say you want to:

  1. Train a Deep Neural Network by using some existing data, and training it against known labels.  e.g. labeled images.
  2. Use that trained network in your app to predict y given x.  E.g. guess the best label for a given image

    And as a bonus:

  3. Have some nice graphs of how well the training is doing

I assume that the data you have in a plain array-of-arrays or numpy 2D array.  It should be easy to change this to use csv or panda etc data. A row is a single example, and a column is a particular feature (e.g. house price, or pixel intensity at a specific location)

So, without further ado, here’s the function that will be shared between your training and your app

def get_tf_model():
    """ Setup our Deep Neural Network and LOAD any existing model
       
        Returns:
          ( Our (maybe trained) model, a helper input function)
    """
    # FEATURES is a short description for each column
    # of your data.  Change this to match your data
    FEATURES = ["x1", "x2"]
    # Set this to describe your columns.  If they are all real values,
    # you don't need to change it.
    feature_columns = [tf.contrib.layers.real_valued_column(k) for k in FEATURES]

    # Build 3 layer DNN.  You can change this however you want, or use a
    # linear regressor, or use a classifier etc.
    # NOTE:
    #   This will LOAD any existing model in the "model_dir" directory!
    # The documentation fails to mention this point as of writing
    regressor = tf.contrib.learn.DNNRegressor(feature_columns=feature_columns,
                                              hidden_units=[128, 128, 128],
                                              model_dir="tf_model")

    def input_fn(x_data, y_data = None):
        # Note the 'shape' parameter is to suppress a very noisy warning.
        # You can probably remove this parameter in a month or two.
        feature_cols = {k: tf.constant(x_data[:,i], shape=[len(x_data[:,i]), 1])
                        for i, k in enumerate(FEATURES)}
        if y_data is None:
            return feature_cols
        labels = tf.constant(y_data)
        return feature_cols, labels

    return regressor, input_fn

Now the code for training. Note that it’s absolutely fine to carry on training a model that is already trained, if you have some new data for it.

x_test = None
y_test = None
regressor, input_fn = get_tf_model()

def train(training_data_y, training_data_x):
    global x_test, y_test, regressor, input_fn

    if x_test is None:
        x_train = np.array(training_data_x[:-20], dtype=np.float32)
        x_test = np.array(training_data_x[-20:], dtype=np.float32)
        y_train = np.array(training_data_y[:-20], dtype=np.float32)
        y_test = np.array(training_data_y[-20:], dtype=np.float32)
    else:
        x_train = np.array(training_data_x, dtype=np.float32)
        y_train = np.array(training_data_y, dtype=np.float32)

    print("Training model ...")
    # Fit model.
    regressor.fit(input_fn=lambda: input_fn(x_train, y_train), steps=2000)

    ev = regressor.evaluate(input_fn=lambda: input_fn(x_test, y_test), steps=1)
    print('  -  Trained Loss: {0:f}'.format(ev["loss"]))

And now finally, the code to use this model in the app. Note that no explicit loading is needed, because it’s loading it from the model_dir
 

regressor, input_fn = get_tf_model()
def predict(x_data):
    return regressor.predict(input_fn=lambda:input_fn(x_data))

Isn’t that simple?

 
We can also view a graph of the loss etc with:

tensorboard tf_model  # or whatever you set model_dir to

Then navigating to http://127.0.1.1:6006/ in the browser

Worst/Trickiest code I have ever seen

It’s easy to write bad code, but it takes a real genius to produce truly terrible code.  And the guys who wrote the python program hyperopt were clearly very clever.

Have a look at this function:  (don’t worry about what it is doing) from tpe.py

# These produce conditional estimators for various prior distributions
@adaptive_parzen_sampler('uniform')
def ap_uniform_sampler(obs, prior_weight, low, high, size=(), rng=None):
    prior_mu = 0.5 * (high + low)
    prior_sigma = 1.0 * (high - low)
    weights, mus, sigmas = scope.adaptive_parzen_normal(obs,
        prior_weight, prior_mu, prior_sigma)
    return scope.GMM1(weights, mus, sigmas, low=low, high=high, q=None,
size=size, rng=rng)

The details don’t matter here, but clearly it’s calling some function “adaptive_parzen_normal”  which returns three values, then it passes that to another function called “GMM1”  and returns the result.

Pretty straight forward?  With me so far?  Great.

Now here is some code that calls this function:

fn = adaptive_parzen_samplers[node.name]
named_args = [[kw, memo[arg]] for (kw, arg) in node.named_args]
a_args = [obs_above, prior_weight] + aa
a_post = fn(*a_args, **dict(named_args))

Okay this is getting quite messy, but with a bit of thinking we can understand it.  It’s just calling the  ‘ap_uniform_sampler’  function, whatever that does, but letting us pass in parameters in some funky way.

So a_post is basically whatever “GMM1” returns  (which is a list of numbers, fwiw)

Okay, let’s continue!

fn_lpdf = getattr(scope, a_post.name + '_lpdf')
a_kwargs = dict([(n, a) for n, a in a_post.named_args if n not in ('rng', 'size')])
above_llik = fn_lpdf(*([b_post] + a_post.pos_args), **a_kwargs)

and that’s it.  There’s no more code using a_post.

This took me a whole day to figure out what on earth is going on.  But I’ll give you, the reader, a hint.  This is not running any algorithm – it’s constructing an Abstract Syntax Tree and manipulating it.

If you want, try and see if you can figure out what it’s doing.

Answer: Continue reading

Python – Hyperopt – Finding the optimal hyper parameters

This comes up as a very common question in ##machinelearning chat room – How do you chose the right parameters for your Neural Network model?  How do you chose how many hidden layers to use, or what your drop out parameter should be, or what the learning rate should be, and so on?

Or in other fields, say you’re doing an engineering or science experiment, how do you chose the right parameters to test?

This problem is called hyperoptimization.

One possible way is to do a “grid search”.  You keep all the parameters constant, then vary one by one and see what effect it has.  You adjust a different parameter, then repeat on the first parameter, trying to either to test out every possible combination.  The trouble is that the number of combinations blows dramatically, to the power of the number of parameters.

But never fear – there are smarter ways to do this, including random search, Bayesian Search, and Tree of Parzen Estimators (TPE).  The good thing is that you don’t actually need to know any of this is, in order to actually use it.

Let’s take a toy example, and since it’s valentines at the time of writing, let’s imagine instead our error space is:

heart.png

Toy hyperparameter space function.  I spent way too long making this pretty 😉

Where the axes are filling in for our hyperparameters (e.g. learning rate, size of hidden layer), and the z value is our error value (e.g. percentage of cat photos incorrectly identified as dogs, etc.  Something that we want to minimize).

This diagram was generated with the notebook python code:


%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np

def heart(x, y):
    return -((x**2 + y**2 - 1)**3 - x**2 *y**3);

x = y = np.arange(-2, 2, 0.01)
X, Y = np.meshgrid(x,y)
Z = heart(X, Y)

plt.figure()
def plotHeart():
CS = plt.contour(X, Y, Z, levels=[-0.5, -0.2, -0.1, 0, 1, 20],
            colors=["0.9", "0.9", "0.9", "red", "0.9", "0.9"])
plt.clabel(CS, inline=1, fontsize=10)
plotHeart()

So, say you’ve written a Tensorflow, or keras, or scipy, etc program in Python.  For example, tensorflow/mnist.py.  Have a quick look at that – see how they’ve got various hyperparameters that need tuning.  There’s some complicated error function that we can’t directly visualize, in some high dimensional hyperparameter space.  And we’re trying to find the parameters that minimize this error.

Here’s my dummy program that will stand in for your complex number or cat recognition program:

class HyperParameters:
    def __init__(self):
        self.learning_rate = 1
        self.size_of_hidden_layer = 1

    def runAndGetError(self):
        """This function should call your tensorflow etc
        code which does the whole training procedure and returns
        the TRAINING error.  Lower is better!"""
        return heart(self.learning_rate, self.size_of_hidden_layer)
hyperparameters = HyperParameters();

def main():
    pass # This is where you do all the normal training code etc

if __name__ == "__main__":
    main()

Note that we’ve got a class for all our hyperparameters.  I recommend you having this!  Note also that we don’t actual run the training unless __name__ == “__main__”.  The point here is to make sure that our program runs as normal if we just run it directly, but we can also create a second program that uses hyperopt to tune this without having to touch our main program at all.  This sort of separation is really nice.

You can put discrete ints, category ints, real, bools, etc in the

Don’t put any hyperopt code in your main program!

Now, without touching your main program at all, we create our hyperopt tuner:

#!/usr/bin/python3
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
import heart.hyperparameters as hyperparameters  # heart.py in the same folder
import numpy as np
import signal
import sys
import pickle

# The key in the space must match a variable name in HyperParameters
space = {
    'learning_rate': hp.uniform('learning_rate', -2, 2),
    'size_of_hidden_layer': hp.uniform('size_of_hidden_layer', -2, 2),
}

num_trials = 10
trials = Trials()

def summarizeTrials():
    print()
    print()
    print("Trials is:", np.sort(np.array([x for x in trials.losses() if x is not None])))

def main():
    global trials
    loadFromPickle = True
    try:
        if loadFromPickle:
            trials = pickle.load(open("hyperopt.p", "rb"))
    except:
        print("Starting new trials file")

    def objective(args):
        for key, value in args.items():
            if int(value) == value:
                value = int(value)
            setattr(hyperparameters, key, value)
        score = hyperparameters.runAndGetError()
        #print("Score:", score, "for", args)
        return {'loss': score, 'status': STATUS_OK}

    for i in range(num_trials):
        best = fmin(objective,
                space=space,
                algo=tpe.suggest,
                max_evals=(i+1),
                trials=trials)
        save_trials()
    summarizeTrials()
    print(i, "Best result was: ", best)

def save_trials():
    pickle.dump(trials, open("hyperopt.p", "wb"))

def signal_handler(signal, frame):
    summarizeTrials()
    save_trials()
    sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)

if __name__ == '__main__':
    main()

And now let’s plot this to see what is going on and see how hyperopt is going to sample from our space:

results = [ (x['misc']['vals']['learning_rate'][0], x['misc']['vals']['size_of_hidden_layer'][0]) for x in trials.trials if 'loss' in x['result']]
plt.figure()
for i, result in enumerate(results):
    plt.annotate(i, xy = result, va="center", ha="center")
plotHeart()
plt.show()
heart2

Hyperopt TPE testing 10 samples from our space.  First trial labeled ‘0’

heart3

These suggestions for trial evaluations look awful!  Why is not sampling close to the middle, where there are known good results?

What on earth is hyperopt doing?

It hasn’t even come close to the minimum points (at 0,0), while with just 10 samples, three of the samples are all bunched right next to each other.  On repeated runs, sometimes 50% of the points are right on the border too!

Even with 50 samples you end up with clumping, and even though it finds an excellent value at 2, it doesn’t improve on that even after 48 samples!

There has to be something going on here.

Strange Hyperopt behavior

Digging into the hyperopt code, turns out that there’s a:

_default_n_startup_jobs = 20

parameter.  So for the first 20 runs, the ‘TPE’ behavior isn’t used at all.  Digging around in the code reveals that this can be adjusted in our code by using:

from functools import partial
...    best = fmin(objective,...
                algo=partial(tpe.suggest, n_startup_jobs=1),

Which now produces:

heart4.png

Which is looking a lot nicer!

I’m not very happy with the initial sampling – just look at the position of the first 5 trials, but this is already a lot better.  But let’s try to crack open this black box a bit.

 


How it works.

  1. We split our experimental observations into a small number of Good Observations, and the rest as the Bad Observations.  They are split by how badly they did (e.g. percentage of images that they incorrectly classified).  The uses uses the least-bad  γ·√(num_observations) fraction for the good observations (obs_good) and the rest for the bad observations (obs_bad).This is done in the code with:
    obs_good, obs_bad = scope.ap_filter_trials(obs_x, obs_y, gamma)
  2. Pick random values of x. but making it more likely to pick values of x which are similar to the x values in the Good Observations that we’ve already seen.  So if x=3 and x=6 usually give a good result, we’re more likely to pick values close to that.  This is mathematically done by sampling from a Gaussian Mixture Model comprised of Gaussians with a mean of each good observation’s x value.

    This is done in code with: (I’ve renamed the variables)

    good_weights, good_mus, good_sigmas = adaptive_parzen_normal(
                         obs_good, prior_weight, prior_mu, prior_sigma)
    x_samples = scope.GMM1(weights, mus, sigmas)
  3. Calculate p(x | y is good) for our Good Samples and p(x | y is bad) for the Bad Samples.  This is done by getting the probability density of the Gaussian Mixture Model for each sample value.  As an implementation note, hyperopt actually calculates the log for each of these values (llik = log likelihood).

    Now, in code, this is actually done in a very complicated way that I decided to devote its own post to.

    So here instead is the equivalent of what is done in code:

    bad_weights, bad_mus, bad_sigmas = adaptive_parzen_normal(
                         obs_bad, prior_weight, prior_mu, prior_sigma)
    good_llik = GMM1_lpdf(x_samples, good_weights, good_mus, good_sigmas)
    bad_llik = GMM1_lpdf(x_samples, bad_weights, bad_mus, bad_sigmas)
  4. Find x value that maximizes the Expected Improvement: Chose the x value from our x_samples that maximizes:    p(x | y is good) / p(x | y is bad).   This is mathematically equal to choosing the x value that maximizes log(p(x | y is good) – log(p(x | y is good).

    This is done in code with:

     score = good_llik - bad_llik  # Element-wise vector subtraction
     best = np.argmax(score)
     memo = x_samples[best]
  5. Repeat for each hyperparameter.  Note that this assumes that every hyperparameter is completely independent from the others, which is obviously false.  To compensate, it weights older observations lower than newer ones, in the hope that it starts to converge all the parameters to good parameters that don’t change much.

Let’s put this all together, and graph what is going on:

def miscs_to_idxs_vals(miscs):
keys = list(miscs[0]['idxs'].keys())
idxs = dict([(k, []) for k in keys])
vals = dict([(k, []) for k in keys])
for misc in miscs:
for node_id in idxs:
t_idxs = misc['idxs'][node_id]
t_vals = misc['vals'][node_id]
assert len(t_idxs) == len(t_vals)
assert t_idxs == [] or t_idxs == [misc['tid']]
idxs[node_id].extend(t_idxs)
vals[node_id].extend(t_vals)
return idxs, vals

def get_lpdf_ast(parameter_name):
idxs, vals = miscs_to_idxs_vals(trials.miscs)
idxs = idxs[parameter_name]
vals = vals[parameter_name]
losses = trials.losses()

from hyperopt import pyll
from hyperopt.tpe import ap_filter_trials
from hyperopt.tpe import adaptive_parzen_samplers

qu = space[parameter_name].pos_args[0].pos_args[1] #pyll.scope.uniform(-2, 2)
fn = adaptive_parzen_samplers['uniform']
fn_kwargs = dict(size=(4,), rng=np.random)
s_below = pyll.Literal()
s_above = pyll.Literal()
b_args = [s_below, prior_weight] + qu.pos_args
b_post = fn(*b_args, **fn_kwargs)
a_args = [s_above, prior_weight] + qu.pos_args
a_post = fn(*a_args, **fn_kwargs)

start_val = space[parameter_name].pos_args[0].pos_args[1].pos_args[0].eval()
end_val = space[parameter_name].pos_args[0].pos_args[1].pos_args[1].eval()
x_samples = np.arange(start_val,end_val,0.1)

# print b_post
# print a_post
fn_lpdf = getattr(pyll.scope, a_post.name + '_lpdf')
# calculate the llik of b_post under both distributions
a_kwargs = dict([(n, a) for n, a in a_post.named_args
if n not in ('rng', 'size')])
b_kwargs = dict([(n, a) for n, a in b_post.named_args
if n not in ('rng', 'size')])
below_llik = fn_lpdf(*([x_samples] + b_post.pos_args), **b_kwargs)
above_llik = fn_lpdf(*([x_samples] + a_post.pos_args), **a_kwargs)
new_node = pyll.scope.broadcast_best(x_samples, below_llik, above_llik)

return s_below, s_above, below_llik, above_llik, new_node

def get_lpdf(ii, s_below, s_above, below_llik, above_llik, new_node):
below, above = ap_filter_trials(idxs[:ii], vals[:ii], idxs[:ii], losses[:ii], gamma)
memo = {s_below: below, s_above: above}
bl, al, nv = pyll.rec_eval([below_llik, above_llik, new_node],
memo=memo)
return bl, al, nv
parameter_name = "learning_rate"
s_below, s_above, below_llik, above_llik, new_node = get_lpdf_ast(parameter_name)
plotAnnotatedHeart()
plt.show()
plt.figure()
plt.text(0.06, 0.5, '⟵ Number of Trials', ha='center', va='center', rotation='vertical')
num_subplots = 10
subplot_i =1
for ii in range(1, len(idxs)-1, 1+int(len(idxs)/num_subplots)):
if ii > len(idxs):
break
bl, al, nv = get_lpdf(ii, s_below, s_above, below_llik, above_llik, new_node)
plt.subplot(num_subplots, 1, subplot_i)
subplot_i += 1
plt.plot(all_vals, bl, c='g')
plt.plot(all_vals, al, c='r')
plt.plot(all_vals, bl-al, c='k')
plt.axvline(nv[0], c='k')
plt.show()

Phew!   That took… quite a lot of code!  I wish hyperopt made this easier.  I should try to make a patch for hyperopt to make it easier.

Anyway, so here’s the result (drum roll).  First the loss function overlaid with where we are running our trials:

heart5heart6-graph

The graph on the bottom is for the x axis only variable of the heart graph.  The x axis is the same, and the y axis is the log(p(x|y is good) in green, log(p(x|y is bad)) in red, and log(p(x|y is good) – p(x|y is bad)) in black.  The vertical part indicates the largest value for the black line, which is the x value that will give the Maximum Improvement to the energy.  We have one graph after every 10 trials, with the bottom graph being after 100 trials (aka observations).

We can start to see what is going wrong here.  I don’t like that it’s not sampling the center enough, even though it can clearly see that the highest energy is there (the green peaks there).  I don’t like that it tends to clump together measurements.  I’m sure with a bit of thinking we can do better than this!

But I’ll think about it and leave that for another post.

 

More realistic example

Here’s a hyperopt ‘space’ that I used for Q-Learning, to give an idea of the options for a more realistic model:

# The key in the space must match a variable name in HyperParameters
space = {
    # Learning rate should be between 0.00001 and 1
    'learning_rate':
        hp.loguniform('learning_rate', np.log(1e-5), np.log(1)),
    'use_double_q_learning':
        hp.choice('use_double_q_learning', [True, False]),
    'layer1_size': hp.quniform('layer1_size', 10, 100, 1),
    'layer2_size': hp.quniform('layer2_size', 10, 100, 1),
    'layer3_size': hp.quniform('layer3_size', 10, 100, 1),
    'future_discount_max': hp.uniform('future_discount_max', 0.5, 0.99),
    'future_discount_increment':
        hp.loguniform('future_discount_increment', np.log(0.001), np.log(0.1)),
    'recall_memory_size': hp.quniform('recall_memory_size', 1, 100, 1),
    # This doesn't tune very well:
    # 'random_action_epsilon':
    #    hp.loguniform('random_action_epsilon', np.log(0.0001), np.log(0.1)),
    'recall_memory_num_experiences_per_recall':
       hp.quniform('recall_memory_num_experiences_per_recall', 10, 2000, 1),
    'num_epochs': hp.quniform('num_epochs', 1, 10, 1),
}

Tensorflow for Neurobiologists

I couldn’t find anyone else that has done this, so I made this really quick guide.  This uses tensorflow which is a complete overkill for this specific problem, but I figure that a simple example is much easier to follow.

Install and run python3 notebook, and tensorflow.  In Linux, as a user without using sudo:

$ pip3 install --upgrade --user ipython[all] tensorflow matplotlib
$ ipython3  notebook

Then in the notebook window, do New->Python 3

Here’s an example I made earlier. You can download the latest version on github here: https://github.com/johnflux/Spike-Triggered-Average-in-TensorFlow

Spike Triggered Average in TensorFlow

The data is an experimentally recorded set of spikes recorded from the famous H1 motion-sensitive neuron of the fly (Calliphora vicina) from the lab of Dr Robert de Ruyter van Steveninck.

This is a complete rewrite of non-tensorflow code in the Coursera course Computational Neuroscience by University of Washington. I am thoroughly enjoying this course!

Here we use TensorFlow to find out how the neuron is reacting to the data, to see what causes the neuron to trigger.

%matplotlib inline
import pickle
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()

FILENAME = 'data.pickle'

with open(FILENAME, 'rb') as f:
    data = pickle.load(f)

stim = tf.constant(data['stim'])
rho = tf.constant(data['rho'])
sampling_period = 2 # The data was sampled at 500hz = 2ms
window_size = 150 # Let's use a 300ms / sampling_period sliding window

We now have our data loaded into tensorflow as a constant, which means that we can easily ‘run’ our tensorflow graph. For example, let’s examine stim and rho:

print("Spike-train time-series =", rho.eval(),
      "\nStimulus time-series     =", stim.eval())
Spike-train time-series = [0 0 0 ..., 0 0 0] 
Stimulus time-series    = [-111.94824219  -81.80664062 
    10.21972656 ...,  9.78515625 24.11132812 50.25390625]

rho is an binary array where a 1 indicates a spike. Let’s turn that into an array of indices of where the value is 1, but ignoring the first window_size elements.

Note: We can use the [] and + operations on a tensorflow variable, and it correctly adds those operations to the graph. This is equivalent to using the tf.slice and tf.add operations.

spike_times = tf.where(tf.not_equal(rho[window_size:-1], 0)) + window_size
print("Time indices where there is a spike:\n", spike_times.eval())
Time indices where there is a spike:
 [[   158]
 [   160]
 [   162]
 ..., 
 [599936]
 [599941]
 [599947]]
def getStimWindow(index):
    i = tf.cast(index, tf.int32)
    return stim[i-window_size+1:i+1]
stim_windows = tf.map_fn(lambda x: getStimWindow(x[0]), spike_times, dtype=tf.float64)
spike_triggered_average = tf.reduce_mean(stim_windows, 0).eval()
print("Spike triggered averaged is:", spike_triggered_average[0:5], "(truncated)")
Spike triggered averaged is: [-0.33083048 -0.29083503 -0.23076012 -0.24636984 -0.10962767] (truncated)

Now let’s plot this!

time = (np.arange(-window_size, 0) + 1) * sampling_period
plt.plot(time, spike_triggered_average)
plt.xlabel('Time (ms)')
plt.ylabel('Stimulus')
plt.title('Spike-Triggered Average')

plt.show()

output_8_0

It’s… beautiful!

What we are looking at here, is that we’ve discovered that our neuron is doing a leaky integration of the stimulus. And when that integration adds up to a certain value, it triggers.

Do see the github repo for full source: https://github.com/johnflux/Spike-Triggered-Average-in-TensorFlow

Update: I was curious how much noise there was. There’s a plot with 1 standard deviation in light blue:

mean, var = tf.nn.moments(stim_windows,axes=[0])
plt.errorbar(time, spike_triggered_average, yerr=tf.sqrt(var).eval(), ecolor="#0000ff33")

spike2

Yikes!  This is why the input signal MUST be Gaussian, and why we need lots of data to average over.  For this, we’re averaging over 53583 thousand windows.

3D Camera Recording Software in React

This is project that I’ve been working on for the past 6 months.  We are now officially selling the software, so I wanted to give a brief overview of how I implemented this.  There’s no secrets here – nothing here that you couldn’t work out easily from looking at the site.

I implemented the front end in React + Redux in Javascript using a websocket and RESTful API for communication.  I worked with a UI designer and a team that implemented the backend.

Some quick lessons learned:

  • React is really nice 🙂
  • I initially implemented the gamma correction graphs in HTML 5 Canvas, but found it difficult to get it to look nice on high density displays.  I switched to creating SVGs instead.
  • Modifying SVGs on the fly using React is really really nice!
  • I initially developed without using Redux, but was eventually persuaded into switching to Redux.  Honestly, I’m really not convinced it was the right choice.  Dan Abramov, the co-author of Redux, wrote an article called: “You might not need Redux” which I agree with.
  • I intended from the start to have a “fake backend” that acted like a simulator of the feel hardware and used the same websocket protocol.  This turned out to be extremely useful, and paid itself back over and over.
  • Keep in touch with the end users as much as possible, and release early and release often.  There’s a lot of hate for the word ‘Agile’, but I’ve used it in 3 major projects so far, and I’m a true believer in it.
  • Visual Studio Code (not Visual Studio) works in Linux flawlessly and is absolutely awesome.  It’s won me over after being a heavy vim user for 15 years.  It supports ES7 and JSX syntax.
  • Javascript lint is awesome, even though at the start the rules seem totally ridiculous.  (e.g. you have to have colors in uppercase in the css).  The rules still feel ridiculous, but it doesn’t take long to just do it the way lint wants.
  • Don’t go for a minimal react setup, but use a boilerplate with the most common things included, such as this.  It can seem really heavy for a beginner, but things like babel, router, image loader, ES7, etc are just awesome.