Biped Robot

I’ve always wanted to make a walking robot.  I wanted to build something fairly quickly and cheaply that I could try to get walking.

And so, 24 hours of hardware and software hacking later:

[Video: the robot standing and waving]

He’s only waving by a small amount because otherwise he falls over 🙂  It took a day and a half to do, so overall I’m pretty pleased with it.  It uses 17 MG996R servos and a Chinese RT Robot 32-channel servo controller board.

Reverse Engineering Servo board

The controller board, amazingly, comes with incomplete instructions.  The result is that anyone trying to use this board will find that it just doesn’t work: the board completely ignores the commands that are supposed to work.

I downloaded the example software that they provide, which does work.  I ran the software through strace like:

$ strace  ./ServoController 2>&1 | tee dump.txt

Searching in dump.txt for ttyACM0 reveals the hidden initialization protocol.  They do:

open("/dev/ttyACM0", O_RDWR|O_NOCTTY|O_NONBLOCK) = 9
write(9, "~RT", 3)                      = 3
read(9, "RT", 2)                        = 2
read(9, "\27", 1)                       = 1
ioctl(9, TCSBRK, 1)                     = 0
write(9, "~OL", 3)                      = 3
write(9, "#1P1539\r\n", 9)              = 9

(The TCSBRK ioctl basically just blocks until nothing is left to be sent.)  Translating this into Python we get:


#!/usr/bin/python
import serial
from time import sleep

ser = serial.Serial('/dev/ttyACM0', 9600)
ser.write(b'~RT')            # handshake seen in the strace: board replies "RT" plus one more byte
print(repr(ser.read(3)))
ser.write(b'~OL')            # second init command seen in the strace
ser.flush()
ser.write(b'#1P2000\r\n')    # move motor 1 to position 2000
sleep(1)
ser.write(b'#1P1000\r\n')    # move motor 1 to position 1000
print("done")

(Looking at the strace output over multiple runs, sometimes it writes “~OL” and sometimes just “OL”.  I don’t know why, but it didn’t seem to make a difference.  That’s the capital letter O, by the way.)

Feedback

I wanted to have a crude sensor measurement of which way up it is.  After all, how can it stand up if it doesn’t know where up is?  On other projects, I’ve used an accelerometer + gyro + magnetometer, and fused the data with a Kalman filter or similar.  But honestly it’s a huge amount of work to get right, especially if you want to calibrate them (the magnetometer in particular).  So I wanted to skip all that.

Two possible ideas:

  1. There’s a really quick hack that I’ve used before – simply place the robot underneath a ceiling light, and use a photosensitive diode to detect the light (see my Self Balancing Robot).  Its resistance is at its lowest when the robot is upright 🙂   (More specifically, make a voltage divider with it and then measure the voltage with an Arduino.)  It’s extremely crude, but the nice thing about it is that it’s dead cheap, insensitive to vibrational noise, and still surprisingly sensitive.  It’s also as fast as your ADC.
  2. Use an Android phone.

I want to move quickly on this project, so I decided to give the second way a go.  Before dealing with vibration etc, I first wanted to know whether it could work, and what the latency would be if I just transmitted the Android fused orientation data across wifi (UDP) to my router, then to my laptop, which then talks via USB to the serial board which then finally moves the servo.
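To give a flavour of that pipeline, a rough Python sketch of the laptop side might look like this (the UDP port, packet format and scaling constants are just illustrative assumptions, not my actual code):

#!/usr/bin/python
# Rough sketch: receive orientation over UDP and forward a position command over serial.
# Assumptions: packets are "pitch,roll" in degrees, port 5555, servo range 1000-2000.
import socket
import serial

ser = serial.Serial('/dev/ttyACM0', 9600)
ser.write(b'~RT')
ser.read(3)            # discard the "RT" + status byte handshake reply
ser.write(b'~OL')
ser.flush()

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('', 5555))  # assumed port

while True:
    data, _ = sock.recvfrom(1024)
    pitch = float(data.decode().split(',')[0])
    # Map roughly -90..90 degrees onto servo pulse widths 1000..2000
    position = int(1500 + (pitch / 90.0) * 500)
    ser.write('#1P{}\r\n'.format(position).encode())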

So, I transmitted the data and used the phone tilt to control two of the servos on the arm, then recorded with the same phone’s camera at the same time.   The result is:

I used a video editor (OpenShot) to load up the video, then measured the time between when the camera moved and when the arm moved.  I took 6 such measurements, and found a delay of 6 or 7 frames each time – at 30fps, that’s between 200ms and 233ms.

That is… exactly what TowerKing says is the latency of the servo itself (under 0.2s).  Which means that I’m unable to measure any latency due to the network setup.  That’s really promising!

I do wonder if 200ms is going to be a low enough latency (more expensive hobby servos go down to 100ms), but it should be enough.  I previously did quite extensive experimental tests on latency in the stabilization of a PID-controlled quadcopter in my own simulator, where a 200ms delay was found to be controllable, but not ideal; 50ms was far better.  But I have no idea how well that lesson will transfer to robot stabilization.

But it is good enough for this quick and dirty project.  This was done in about 0.5 days, bringing the total so far up to 2 full days of work.

Cost and Time Breakdown so far

Parts cost:
Metal skeleton: $99 USD
17x MG996R servo motors: $49 USD
RT Robot 32ch servo control board: $25 USD
Delivery from China: $40 USD
USB cable: $2 USD
Android phone: (used my own phone)
Total: $215 USD

For tools, I used nothing more than some screwdrivers and needle-nosed pliers, and a bench power supply (around $120 in total). I could have gotten 17x MG995 servos for a total of $45, but I wanted the metal gears that the MG996R provides.

Time breakdown:
Mechanical build: 1 day
Reverse engineering servo board: 0.5 days
Hooking up to Android phone + writing some visualization code: 0.5 days
Blogging about it 🙂: 0.5 days
Total: 2.5 days

Future Plans – Q Learning

My plan is to hang him loosely upright by a piece of string, and then make a neural network in TensorFlow to control him, trying to get him to stand fully upright without having to deal with recovering from a collapsed, lying-down position.

Specifically, I want to get him to balance upright using Q learning.  One thing I’m worried about is the sheer amount of time required to physically run each test.  When each test takes a long time compared to the available compute power, that just screams out for Bayesian learning.   So…  Bayesian Q-parameter estimation?  Is there such a thing?  A 10-second Google search doesn’t find anything.  Or Bayesian policy network tuning?    I need to have a think about it 🙂
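For reference, the core of tabular Q-learning is tiny; the sketch below is just the textbook update rule, with placeholder states, actions and rewards, since choosing a sensible state/action representation for the robot is exactly the open question:

import random

# Minimal tabular Q-learning sketch.  The state/action encoding and the reward
# are placeholders; they would have to come from the phone's orientation data
# and the servo commands.
alpha, gamma, epsilon = 0.1, 0.99, 0.1   # learning rate, discount, exploration rate
Q = {}                                    # Q[(state, action)] -> estimated return

def choose_action(state, actions):
    if random.random() < epsilon:
        return random.choice(actions)                          # explore
    return max(actions, key=lambda a: Q.get((state, a), 0.0))  # exploit

def update(state, action, reward, next_state, actions):
    # Q(s,a) <- Q(s,a) + alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)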

React + Twine

Twine is a neat ‘open-source tool for telling interactive, nonlinear stories’, as its blurb says.  It lets you write stories using variables, conditionals, and other simple programming primitives.  It supports three different languages, but I’ll focus only on the default, Harlowe.  Some example code:

Hello $name. (if: $isMale)[You're male!](else:)[You're female!].

This works pretty well, but it can be really cumbersome to do anything complicated.  I was curious to see how well it could be integrated into React.  Harlowe purposefully does not expose any of its internals, so our solution is going to need to be pretty hacky.

The solution that I came up with was to add some JavaScript to a startup tag that Harlowe will run, attaching the necessary private variable to the global window object, and then to load and run the Harlowe engine in the React componentWillMount function.  Like so:

import React, { Component } from 'react';
import './App.css';

class App extends Component {
  componentWillMount() {
    const script = document.createElement("script");
    script.src = "harlowe-min.js";
    script.async = true;
    document.body.appendChild(script);
  }

  render() {
    return (
      <div className="App">
        <header id="header">My header</header>
        <div id="container">
          <main id="center" className="column">
            <article>
              <tw-story dangerouslySetInnerHTML={{__html: ""}}></tw-story>
              <tw-storydata startnode={1} style={{display:"none"}}>
                <script role="script" id="twine-user-script" type="text/twine-javascript">{`
                  if (!window.harlowe){ window.harlowe = State; }
                `}</script>
                <tw-passagedata pid={1} name="Start">**Success!**(set: $foo to 1) Foo is $foo</tw-passagedata>
              </tw-storydata>
            </article>
          </main>

          <nav id="left" className="column">
            <h3>Left heading</h3>
            <ul>
              <li><a href="#">Some Link</a></li>
            </ul>
          </nav>
          <div id="right" className="column">
            <h3>Right heading</h3>
          </div>
        </div>
        <div id="footer-wrapper">
          <footer id="footer">&nbsp;</footer>
        </div>
      </div>
    );
  }
}

export default App;

It seemed to work just fine without the

dangerouslySetInnerHTML={{__html: ""}}

but I added it to make it clear to React that we don’t want React to manage the contents of this element.

Unfortunately I couldn’t proceed much further than this in trying to make them integrate nicely. I couldn’t see a way to hook into the engine to know when a variable is updated via, say, a user clicking on a (link:..)[].

There is a VarRef object that would allow exactly this with jQuery, by doing something like:

var setState = this.setState.bind(this);
VarRef.on('set', (obj, name, value) => {
  if (obj === State.variables) {
    setState({[name]: value})
  }
})

Unfortunately the VarRef object is not exposed in any scope that I can find. The source code is available, so VarRef could be exposed with a fairly simple change, but not without doing that, as far as I can tell.

Changing the color of an image in HTML with an SVG feColorMatrix filter

In a previous post, I changed the color of a simple image (of a pair of eyes) by converting it to an SVG first, and then changing the SVG colors.

But what if you want to take a complex image and change the color?

 

In Photoshop/Gimp this can be achieved by creating a new layer on top of the image and filling it with a solid color, and then setting its Mode to ‘Multiply’.  But how can we reproduce this on the web?

There are various solutions using SVG filters (feFlood and feBlend), but these are not very well supported in browsers.  So I’ve come up with a solution that is well supported in all modern browsers, including IE: an feColorMatrix filter that multiplies each color channel, roughly like this:

<svg width="0" height="0" style="position:absolute">
  <filter id="tint">
    <!-- Each 0.5 scales one channel: red, green and blue respectively -->
    <feColorMatrix type="matrix"
      values="0.5 0   0   0 0
              0   0.5 0   0 0
              0   0   0.5 0 0
              0   0   0   1 0"/>
  </filter>
</svg>

<img src="..." style="filter: url(#tint)"/>

Replace the numbers 0.5 with the RGB values (on a 0 to 1 scale) of the color that you want.  For example, in React:


hexToRgb(hex) {
  var result = /^#?([a-f\d]{2})([a-f\d]{2})([a-f\d]{2})$/i.exec(hex);
  return result ? {
    r: parseInt(result[1], 16),
    g: parseInt(result[2], 16),
    b: parseInt(result[3], 16)
  } : {r: 255, g: 255, b: 255};
}

skinColorDef(colorAsString) {
  const hex = hexToRgb(colorAsString); /* <-- See next for an improvement */
  const r = hex.r/255;
  const g = hex.g/255;
  const b = hex.b/255;
  return (
    /* A filter definition that multiplies the image by the chosen color (the id is arbitrary) */
    <filter id="skinColor">
      <feColorMatrix type="matrix"
        values={r + " 0 0 0 0  0 " + g + " 0 0 0  0 0 " + b + " 0 0  0 0 0 1 0"}/>
    </filter>
  );
}

 

We can now tint our image with a color like skinColorDef(“#e7b48f”).

But let’s make a small improvement.  It’s not obvious what the final color is going to be, because the tint color is multiplied by the color in the image.  So let’s make it more intuitive by first looking at the main color in the image (e.g. using the color picker in Gimp/Photoshop) and then dividing (i.e. ‘un-multiplying’) the colorAsString by that color.

For example, the skin color in that girl image is #fff2f2, which is (255, 242, 242).  So:


divideByImageSkinColor(rgb) {
  return {r: rgb.r * (255/255), g: rgb.g * (255/242), b: rgb.b * (255/242)}
}

and modify the skinColorDef like:


skinColorDef(colorAsString) {
  const hex = divideByImageSkinColor(hexToRgb(colorAsString));

Now we can just choose the colors directly.  For skin, the Fitzpatrick Scale is a nice place to start:

[Image: Fitzpatrick skin tone color chart]

We can now use these RGB values directly in our skinColorDef function.  Here’s an example HTML combobox to select the color (the onChange function is left to you to implement):



<select onchange="...">
  <option>Light</option>
  <option>Fair</option>
  <option>Medium</option>
  <option>Olive</option>
  <option>Brown</option>
  <option>Black</option>
</select>


And that’s it!

Sidenote: Many years ago, I wrote the graphics drivers (when I worked at Imagination Technologies) to accelerate this sort of multiply operation using shaders.  That driver is used in the iPhone, iPad, TomTom, and many other small devices.

Photoshop/gimp layers to SVG

Ever wanted to export multiple layers in a Gimp or Photoshop image, with each layer as its own PNG, but the whole thing then wrapped up as an SVG?

The usefulness is that an artist can create an image of, say, a person, with eyes of various different colours in multiple layers.  Then we can create an SVG file that we can embed in an html page, and then change the color of the eyes through Javascript.

So take this example.  In this image we have a face made up of various layers, and the layers are further grouped in GroupLayers.

So imagine having this image, then in Javascript on your page being able to swap out just the eye image.  Or just the mouth image.

To achieve this, I had to modify an existing gimp python script from 5 years ago that has since bitrotted.  Back when it was written, there was no such thing as group layers, so the script doesn’t work now.  A bit of hacking, and I get:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: Erdem Guven <zuencap@yahoo.com>
# Copyright 2016 John Tapsell
# Copyright 2010 Erdem Guven
# Copyright 2009 Chris Mohler
# "Only Visible" and filename formatting introduced by mh
# License: GPL v3+
# Version 0.2
# GIMP plugin to export as SVG

# Save this to ~/.gimp-*/plug-ins/export_svg.py

from gimpfu import *
import os, re, gettext

gettext.install("gimp20-python", gimp.locale_directory, unicode=True)

def format_filename(imagename, layer):
	layername = layer.name.decode('utf-8')
	regex = re.compile("[^-\w]", re.UNICODE)
	filename = imagename + '-' + regex.sub('_', layername) + '.png'
	return filename

def export_layers(dupe, layers, imagename, path, only_visible, inkscape_layers):
	images = ""
	for layer in layers:
		if not only_visible or layer.visible:
			style=""
			if layer.opacity != 100.0:
				style="opacity:"+str(layer.opacity/100.0)+";"
			if not layer.visible:
				style+="display:none"
			if style != "":
				style = 'style="'+style+'"'

			if hasattr(layer,"layers"):
				image = '<g inkscape:groupmode="layer" inkscape:label="%s" %s>' % (layer.name.decode('utf-8'),style)
				image += export_layers(dupe, layer.layers, imagename, path, only_visible, inkscape_layers)
				image += '</g>'
				images = image + images
			else:
				filename = format_filename(imagename, layer)
				fullpath = os.path.join(path, filename);
				pdb.file_png_save_defaults(dupe, layer, fullpath, filename)

				image = ""
				if inkscape_layers:
					image = '<g inkscape:groupmode="layer" inkscape:label="%s" %s>' % (layer.name.decode('utf-8'),style)
					style = ""
				image += ('<image xlink:href="%s" x="%d" y="%d" width="%d" height="%d" %s/>\n' %
					(filename,layer.offsets[0],layer.offsets[1],layer.width,layer.height,style))
				if inkscape_layers:
					image += '</g>'
				images = image + images
		dupe.remove_layer(layer)
	return images

def export_as_svg(img, drw, imagename, path, only_visible=False, inkscape_layers=True):
	dupe = img.duplicate()

	images = export_layers(dupe, dupe.layers, imagename, path, only_visible, inkscape_layers)

	svgpath = os.path.join(path, imagename+".svg");
	svgfile = open(svgpath, "w")
	svgfile.write("""<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Generator: GIMP export as svg plugin -->

<svg xmlns:xlink="http://www.w3.org/1999/xlink" """)
	if inkscape_layers:
		svgfile.write('xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" ')
	svgfile.write('width="%d" height="%d">' % (img.width, img.height));
	svgfile.write(images);
	svgfile.write("</svg>");

register(
	proc_name=("python-fu-export-as-svg"),
	blurb=("Export as SVG"),
	help=("Export an svg file and an individual PNG file per layer."),
	author=("Erdem Guven <zuencap@yahoo.com>"),
	copyright=("Erdem Guven"),
	date=("2016"),
	label=("Export as SVG"),
	imagetypes=("*"),
	params=[
		(PF_IMAGE, "img", "Image", None),
		(PF_DRAWABLE, "drw", "Drawable", None),
		(PF_STRING, "imagename", "File prefix for images", "img"),
		(PF_DIRNAME, "path", "Save PNGs here", os.getcwd()),
		(PF_BOOL, "only_visible", "Only Visible Layers?", False),
		(PF_BOOL, "inkscape_layers", "Create Inkscape Layers?", True),
		   ],
	results=[],
	function=(export_as_svg),
	menu=("<Image>/File"),
	domain=("gimp20-python", gimp.locale_directory)
	)

main()

(Note that if you get an error ‘cannot pickle GroupLayers’, this is a bug in GIMP. It can be fixed by editing the line

    350: gimpshelf.shelf[key] = defaults

)

Which when run, produces:

girl-Layer12.png
girl-Layer9.png
girl-Layer11.png
girl-Layer14.png
girl.svg

(I later renamed the layers to something more sensible 🙂 )

The (abbreviated) svg file looks like:


<g inkscape:groupmode="layer" inkscape:label="Expression" >

<g inkscape:groupmode="layer" inkscape:label="Eyes" >

<g inkscape:groupmode="layer" inkscape:label="Layer10" ><image xlink:href="girl-Layer10.png" x="594" y="479" width="311" height="86" /></g>

<g inkscape:groupmode="layer" inkscape:label="Layer14" ><image xlink:href="girl-Layer14.png" x="664" y="470" width="176" height="22" /></g>

<g inkscape:groupmode="layer" inkscape:label="Layer11" ><image xlink:href="girl-Layer11.png" x="614" y="483" width="268" height="85" /></g>

<g inkscape:groupmode="layer" inkscape:label="Layer9" ><image xlink:href="girl-Layer9.png" x="578" y="474" width="339" height="96" /></g>

<g inkscape:groupmode="layer" inkscape:label="Layer12" ><image xlink:href="girl-Layer12.png" x="626" y="514" width="252" height="30" /></g>

</g>

</g>

We can now paste the contents of that SVG directly into our html file, add an id to the groups or image tag, and use CSS or Javascript to set the style to show and hide different layers as needed.

CSS Styling

This all works as-is, but I wanted to go a bit further.  I didn’t actually have different colors of the eyes, and I also wanted to be able to easily change the color.  So I used Inkscape’s Trace Bitmap to turn the layer with the eyes into a vector, like this:

[Image: the eye layer traced to an SVG]

Unfortunately, WordPress.com won’t let me actually use SVG images, so this is a PNG of an SVG created from a PNG….

I used as few colors as possible in the SVG, resulting in just 4 colors used in 4 paths.  I manually edited the SVG, and moved the color style to its own tag, like so:


<defs>
  <style type="text/css"><![CDATA[
    #eyecolor_darkest  { fill:#34435a; }
    #eyecolor_dark     { fill:#5670a1; }
    #eyecolor_light    { fill:#6c8abb; }
    #eyecolor_lightest { fill:#b4dae5; }
  ]]></style>

</defs>

<path id="eyecolor_darkest" ..../>

The result is that I now have an svg of a pair of eyes that can be colored through css.  For example, green:

[Image: the eyes recolored green]

Which can now be used directly in the head SVG in an HTML page, and styled through normal CSS:

[Image: the full head with green eyes]

Colors

For the sake of completeness, I wanted to let the user change the colors, but not have to make them specify each color individually. I have 4 colors used for the eye, but they are obviously related. Looking at the blue colors in HSL space we get:

RGB:#34435a =  hsl(216, 27%, 28%)
RGB:#5670a1 =  hsl(219, 30%, 48%)
RGB:#6c8abb =  hsl(217, 37%, 58%)
RGB:#b4dae5 =  hsl(193, 49%, 80%)

Annoyingly, the lightest color has a different hue. I viewed this color in GIMP, changed the hue to 216, then tried to find the closest saturation and value that matched it. 216, 85%, 87% seemed the best fit.

So, armed with this, we now have a way to set the color of the eye with a single hue:

#eyecolor_darkest  =  hsl(hue, 27%, 28%)
#eyecolor_dark     =  hsl(hue, 30%, 48%)
#eyecolor_light    =  hsl(hue, 37%, 58%)
#eyecolor_lightest =  hsl(hue, 85%, 87%)

Or in code:

function setEyeColorHue(hue) {
    document.getElementById("eyecolor_darkest").style.fill = "hsl("+hue+", 27%, 28%)";
    document.getElementById("eyecolor_dark").style.fill = "hsl("+hue+", 30%, 48%)";
    document.getElementById("eyecolor_light").style.fill = "hsl("+hue+", 37%, 58%)";
    document.getElementById("eyecolor_lightest").style.fill = "hsl("+hue+", 85%, 87%)";
}
<label for="hue">Color:</label>
<input type="range" id="hue" min="0" value="216" max="359" step="1" oninput="setEyeColorHue(this.value)" onchange="setEyeColorHue(this.value)"/>

Tinting a more complex image

But what if the image is more complex, and you don’t want to convert it to an SVG?  E.g.

The solution is to apply a filter to multiply the layer by another color.

See my follow-up post: Changing the color of an image in HTML with an SVG feColorMatrix filter

 

Simple HTML

I frequently want a simple single-file HTML page that generates some text dynamically based on some inputs at the top of the page.  This can be done in React etc. of course, but sometimes my use case is so simple that it’s overkill.

For example, to generate some template code based on a few input parameters. Or to make some calculations based on inputs, or to make a customizable story, etc.

With this in mind, I produced the following minimal HTML, using the handlebars processor, that lets me do exactly this:


<!DOCTYPE html>
<html>
<head>
	<meta charset="UTF-8">
    <title>Welcome</title>
    <script type="application/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/handlebars.js/4.0.5/handlebars.min.js"></script>
    <script id="result-template" type="text/x-handlebars-template">
Hello {{name}}!  You are {{age}} years old.  {{#xif "this.age > 18"}} That's really old! {{else}} So young! {{/xif}}


Put the rest of your page here.
    </script>
</head>

<body>
<h2>What is your name?</h2>
Name: <input type="text" id="name" value="Bob"/>

        Age: <input type="number" id="age" value=32 min=0 ><p/><p/>
<div id="resultDiv"></div>
<script>
		var inputs = document.querySelectorAll("input");
		function update() {
			var params = {};
			for (i = 0; i < inputs.length; ++i) {
				params[inputs[i].id] = (inputs[i].type === "number")?Number(inputs[i].value):inputs[i].value;
			}
			document.querySelector("#resultDiv").innerHTML = template(params);
		}
		document.addEventListener("DOMContentLoaded", function() {
			Handlebars.registerHelper("xif", function (expression, options) {
   				 return Handlebars.helpers["x"].apply(this, [expression, options]) ? options.fn(this) : options.inverse(this);
			});
			Handlebars.registerHelper("x", function (expression, options) {
				try { return Function.apply(this, ["window", "return " + expression + " ;"]).call(this, window); } catch (e) { console.warn("{{x " + expression + "}} error: ", e); }
			});

			var source = document.querySelector("#result-template").innerHTML;
			template = Handlebars.compile(source);
			for (i = 0; i < inputs.length; ++i) {
				// Use 'input' to update as the user types, or 'change' on loss of focus
				inputs[i].addEventListener("input", update);
			}
			update();
		});
	</script>
</body>
</html>

Which produces a result like:

[Screenshot: single-page HTML that changes the page on user input]

Unity vs Unreal Engine 4

I implemented two medium-sized projects, one in Unreal Engine 4 and one in Unity 5.

Unfortunately these were both for clients, so I can’t talk about any specifics.  I do, however, want to give some general thoughts on the comparison between them.

Pros and Cons:

  • Unreal Engine 4 seems to have a lot more advanced features, but I didn’t personally use any of them; they didn’t seem easy to use.
  • Unity 5 was much more intuitive for me to use.
  • The Unity 5 asset store was so much nicer to use.  I could buy an asset and import it into my game with a couple of clicks.  With UE4 it seemed so much more difficult.
  • UE4’s VR support simply didn’t work on a Mac.  This sucked because my artists all use Macs.   More annoyingly, it didn’t say why it didn’t work; it just disabled the Preview In VR button, giving no reason.   And the reasons were written up in an internal bug report (UE-11247 apparently) that the UE4 developers constantly refer to, but that users aren’t actually allowed to view or see the status of!
  • I much preferred having a managed language (C# or JavaScript) in Unity to the C++ support in UE4.  Mistakes in C++ code meant crashing the whole app, and it also led to long compile times.   A mistake in C# just meant an exception that the app could easily recover from.
  • I tried really hard to get on with UE4’s Blueprint, which is basically a visual “programming” language.  But implementing even a fairly simple mathematical formula would result in 20+ nodes.  Implementing a simple polynomial like y = 3x² + 2x + 5 was incredibly painful, dragging out nodes for each operation.
[Image: example Blueprint graph]

Blueprint quickly becomes a mess. This is a random example from the web.

  • UE4’s blueprints become particularly annoying when users are asking questions about them.  They’ll paste a screenshot of their blueprint saying that they have a problem.  Someone else then has to try to decipher what is going on from a screenshot, with really no easy way to reproduce.  Users who want to copy a blueprint have to do so manually, node by node..
    I would really love for UE4 to mix in a scripting language, like Javascript.
  • UE4 has lots of cool features, but they are really difficult to just use.  For example, it has a lot of support for adding grass.  You can just paint grass onto your terrain..  except that you can’t because you don’t have any actual grass assets by default.
    The official UE4 tutorials say that to add grass, you should import the whole 6.4 GB Open World Demo Collection to your project!
    But then, even that isn’t enough because it doesn’t have any actual grass materials!  You have to then create your own grass material which is quite a long process.  This was really typical of my experience with UE4.  Why not just have a single ‘grass’ asset that could be instantly used, and then let the user tweak it in more complicated ways if they want to later on?
    Compare this to Unity.  You go to: Assets > Import Package > Terrain Assets  click on the tree or grass that you want, and that’s it.  You can then start painting with that tree or grass immediately.  If you later want to make your own trees, it comes with a tree editor, built in!
  • Unity’s support for Android was much better than UE4’s.
  • UE4 taxed my system a lot more than Unity.  For my beefy desktop, that was no problem.  But the artists had Mac laptops that really struggled.
  • I really like Unity’s GameObject plus Component approach.  Basically, you make a fairly generic GameObject that is in your scene, and then you attach multiple components to it.  For example, if you want a button, your button GameObject would have a mesh, a material, a renderer (to draw the material on the mesh), a hit box (to know when the user presses it) and presumably some custom script component that runs when you hit it.
    And because your custom scripts are written in C# or JavaScript, you get lovely automatic introspection on the class variables, and any variables are automatically added to the GUI!

Overall, I guess I’ve become a Unity fanboy.  Which is a shame, because I started with UE4 and I really wanted to like it.  I have been with UE4 for 2 years, and was a paying sponsor for a year.

I feel that the trouble is their different audiences.  UE4 is obviously targeted towards much larger studios, who want advanced features and don’t care about built in assets etc.  Unity on the other hand is targeted towards Indie developers who want to make quick prototypes and cheap products easily.

This has resulted in a sort of stigma against Unity projects, because there is a glut of rubbish games produced by novices in Unity.  Unity charges about $1,500 per developer to remove the start-up Unity splashscreen, so most indie developers don’t pay that fee.  Only the good games which sell well can afford to remove that splashscreen.

The result being that if you start up a random indie game on steam greenlight, for example, and see the Unity splashscreen, you know that the game is unlikely to be that good.  Hence a stigma.

Logistic Regression and Regularization

Tons has been written about regularization, but I wanted to see it for myself to try to get an intuitive feel for it.

I loaded a dataset from Google into Python (a set of images of letters) and implemented a double for-loop to run a logistic regression with different training set sizes and different regularization parameters.  (The value shown in the graph is actually 1/regularization.)

 

# Assumes train_dataset/train_labels and valid_dataset/valid_labels are already loaded.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model

def doLogisticRegression(trainsize, regularizer):
    fitmodel = linear_model.LogisticRegression(C=regularizer)
    train_datasubset = train_dataset[0:trainsize,:,:].reshape(trainsize, -1)
    fitmodel.fit(train_datasubset, train_labels[0:trainsize])
    return [fitmodel.score(train_datasubset, train_labels[0:trainsize]),
            fitmodel.score(valid_dataset.reshape(valid_dataset.shape[0], -1), valid_labels)
            ]

print(train_dataset.shape[0])
trainsizes = [50,200,300,400,500,600,700,800,900,1000,2000,3000,4000,5000, 10000, 50000, 100000, 200000]
plt.xscale('log')
plots = []
for regularizer in [1, 0.1, 0.01, 0.001]:
    results = np.array([doLogisticRegression(x, regularizer) for x in trainsizes])
    dashedplot = plt.plot(trainsizes, results[:,1], '--', label=("r:" + str(regularizer)))
    plt.plot(trainsizes, results[:,0], c=dashedplot[0].get_color(), label=("r:" + str(regularizer)))
    plots.append(dashedplot[0])
plt.legend(loc='best', handles=plots)

The result is very interesting. The solid line is the training set accuracy, and the dashed line is the validation set accuracy. The vertical axis is the accuracy rate (percentage of images recognized as the correct letter) and the horizontal axis is the number of training examples.

[Graph: image-to-letter recognition accuracy against training size, for various values of r = 1/regularization_factor.  Solid line is training set accuracy, dashed line is validation set accuracy.]

First, I find it fascinating that a pure logistic regression can recognize letters with 82% accuracy. If you added in spell checking, and ran this over an image, you could probably get a pretty decent OCR system, from purely a logistic regression.

Second, it’s interesting to see the effect of the regularization term. At less than about 500 training examples, the regularization term only hurts the algorithm (a value of 1 means essentially no regularization). At about 500 training examples though, the strong regularization really helps. As the number of training examples increases, regularization makes less and less of an impact, and everything converges at around 200,000 training samples.

It’s quite clear at this point, at 200,000 training samples, that we are unlikely to get more improvements with more training samples.

A good rule of thumb that I’ve read is that you need approximately 50 training samples per feature. Since we have 28×28 = 784 features, that would be about 40,000 training samples, which is actually only a couple of percent below our peak performance at 200,000 training samples (which is 200,000/784 ≈ 255 training samples per feature).

At this point, we could state fairly confidently that we need to improve the model if we want to improve performance.

Stochastic Gradient Descent

I reran with the same data but with stochastic gradient descent (batch size 128) and no regularization.  The accuracy on the validation set (after 9000 runs) was about the same as the best case with the logistic regression (81%), but it took only a fraction of the time: just a few minutes, versus a few hours for the logistic regression.
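For what it’s worth, a minimal scikit-learn sketch of that minibatch run would look something like the following.  I’m assuming the same flattened train/valid arrays as in the snippet above; the details (learning rate schedule, exact number of steps) won’t necessarily match whatever the original run used.

import numpy as np
from sklearn.linear_model import SGDClassifier

# Minibatch SGD with logistic loss and no regularization penalty.
# (Older scikit-learn versions spell these loss='log' and penalty='none'.)
batch_size = 128
X = train_dataset.reshape(train_dataset.shape[0], -1)
X_valid = valid_dataset.reshape(valid_dataset.shape[0], -1)
classes = np.unique(train_labels)

sgd = SGDClassifier(loss='log_loss', penalty=None)
for step in range(9000):
    i = (step * batch_size) % (X.shape[0] - batch_size)
    sgd.partial_fit(X[i:i + batch_size], train_labels[i:i + batch_size], classes=classes)

print(sgd.score(X_valid, valid_labels))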

Stochastic Gradient Descent with 1 hidden layer

I added a hidden layer (of size 1024) and reran. The accuracy was only marginally better (84%).  Doubling the number of runs increased this accuracy to 86%.
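Again for reference, here is a sketch of the one-hidden-layer version using scikit-learn’s MLPClassifier; however the original network was actually built, the shape of the model is the same.

from sklearn.neural_network import MLPClassifier

# One hidden layer of 1024 ReLU units, trained with minibatches of 128.
# Assumes the same flattened train/valid arrays as in the previous snippets.
X = train_dataset.reshape(train_dataset.shape[0], -1)
X_valid = valid_dataset.reshape(valid_dataset.shape[0], -1)

mlp = MLPClassifier(hidden_layer_sizes=(1024,), batch_size=128, max_iter=20)
mlp.fit(X, train_labels)
print(mlp.score(X_valid, valid_labels))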