This takes text, for example: (line numbers added for clarity)
1. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
2. incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
3. nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
4. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
5. fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
6. culpa qui officia deserunt mollit anim id est laborum.
And produces output without the hardwrapped newlines, like: (line numbers added for clarity)
1. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
2. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
The rules are:
Two adjacent lines are considered to have been hardwrapped and should be remerged, if the first line ends with a letter, comma or hyphen.
But don’t merge if the second line starts with a space or utf8 opening quote.
Everything between utf8 speechmarks “..” will be treated as one line.
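A minimal Python sketch of those rules (it glosses over details, such as exactly how hyphenated words should be rejoined):

import re

def unwrap(text):
    # Treat everything between utf8 speechmarks as one line, by first
    # replacing any newlines inside “...” with spaces.
    text = re.sub(r'“[^”]*”', lambda m: m.group(0).replace('\n', ' '), text)

    out = []
    for line in text.split('\n'):
        # Merge with the previous line if it ends with a letter, comma or hyphen,
        # and this line doesn't start with a space or an opening quote.
        if out and line and re.search(r'[A-Za-z,-]$', out[-1]) and not re.match(r'[ “]', line):
            out[-1] += ' ' + line
        else:
            out.append(line)
    return '\n'.join(out)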
This is an ongoing multi-year project. I need to write this up properly, but I’ve got so much stuff that I’m going to attack this in stages.
Current Status
14 motors, controlled by a raspberry pi. Arm doesn’t quite work yet. Everything else works.
Front end – I wrote this in ReactJS, and it communicates with the robot via a websocket to ROS, using ‘rosbridge’. Sorry it’s so pink – it uses infrared LEDs to work at night, but they are kinda overpowering. Also sorry for the mess – I have two children…
In the top left, the green circle is a ‘nipple’ – to let you move the rover about via the webpage, either via mouse or finger press.
In the top right, the xbox controller image shows what buttons are being pressed on an xbox controller, to control the robot. The xbox controller can either be connected to the rover, or connected to the PC / mobile phone – the webpage relays controller commands through to the rover.
Beneath the xbox controller is the list of running processes (“ros nodes”), with cpu and mem information.
Below that are the error messages etc.
Console UI
I’m a strong believer in trying to make my projects ‘transparent’ – as in, when you connect, it should be obvious what is going on, and how to do things.
With that in mind, I always create nice colored scripts that show the current status.
Below is the output when I ssh into the raspberry pi. It shows:
– Wifi status and IP address
– Currently running ROS Node processes, and their mem and cpu status. Colored Green for good, Red for stopped/crashed.
– Whether ‘rosmon’ is running in ‘screen’, and how to connect to it.
– The command line shows the current git status
See those links for the source code. I think that every ROS project can benefit from these.
Boogie Rocker Design
Ultrasonic Sensor
I played about with making a robot face. I combined an Ultrasonic Sensor (HC-SR04) and NeoPixel leds. The leds reflect the distance – one pixel per 10 cm.
I’m wondering if I can do SLAM – Simultaneous Localization And Mapping – with a single (or a few) very cheap ultrasonic sensors. Well, the answer is almost certainly no, but I’m very curious to see how far I can get.
NOTE: This is an old design – from more than two years ago
I wanted a simple way to track the movement and orientation of an object (in this case, my Samurai robot). TrackIR’s SDK is not public, and has no python support as far as I know.
I looked at the symbols in the dll that is distributed with the TrackIR app, and with some googling around on the function names, I could piece together how to use this API.
Here’s the result – my console program running in a window in the middle. I have pressed ctrl+c after 70ms, to cancel, to give a better screenshot:
Implementation Details
Python lets you call functions in any .dll library pretty easily (although the documentation for doing so is pretty awful). The entire code is approximately 500 lines, with half of that being comments, and I’ve put it on github here:
Query the Windows registry for where the TrackIR software is installed.
Load the NPClient64.dll in that TrackIR software folder, using python ctypes.WinDLL (Note, I’m using 64 bit python, so I want the 64 bit library)
Expose each function in the dll that we want to use. This can get a bit cryptic, and took a bit of messing about to get right. For example, there is a function in the dll called ‘NP_StopCursor’, which in C is just a parameterless function returning a status code.
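Exposing it through ctypes looks roughly like this. This is a minimal sketch rather than the actual code from trackir.py, and the hard-coded path is only an example – in the real code it comes from the registry lookup in step 1:

import ctypes

# Load the 64-bit client library (example path; normally found via the registry).
trackir_dll = ctypes.WinDLL(r"C:\Program Files (x86)\TrackIR5\NPClient64.dll")

# In C the function is roughly:  int NP_StopCursor(void);
NP_StopCursor = trackir_dll.NP_StopCursor
NP_StopCursor.argtypes = []             # takes no parameters
NP_StopCursor.restype = ctypes.c_int    # returns a status code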
where ‘NP_StopCursor’ is now a function that can then be called with: NP_StopCursor(). See the trackir.py file for more, including how to pass parameters that are data structures etc.
Note: Python ctypes supports a way to mark a parameter as an ‘output’, but I didn’t fully understand what it was doing. It seemed to just work, and appeared to automatically allocate the memory for you, but the documentation didn’t explain it, and I didn’t entirely trust that I wasn’t just corrupting random memory, so I stuck to making all the parameters input parameters and ‘new’ing the data structure myself.
Create a graphical window and get its handle (hWnd), then call the functions in the TrackIR .dll to set up the polling, passing the handle along. The TrackIR software polls roughly every 10 seconds to check that the given graphical window is still alive, and shuts down the connection if not.
Poll the DLL at 120hz to get the 6DOF position and orientation information.
ROS Integration
I wanted to get this to stream 6DOF data to ROS as well, to open up possibilities such as being able to determine the position of a robot, or to control the robot with your head etc.
The easiest way is to run a websocket server on the linux ROS server, and send the data as ROS messages over the websocket.
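For example, a rough sketch of the sending side using the roslibpy client to talk to rosbridge (the hostname, topic name and message type here are just placeholders, not necessarily what I used):

import roslibpy

ros = roslibpy.Ros(host='my-ros-server', port=9090)  # rosbridge's default websocket port
ros.run()

pose_topic = roslibpy.Topic(ros, '/trackir/pose', 'geometry_msgs/Pose')

def publish_pose(x, y, z, qx, qy, qz, qw):
    # Publish one 6DOF sample as a geometry_msgs/Pose dictionary.
    pose_topic.publish(roslibpy.Message({
        'position': {'x': x, 'y': y, 'z': z},
        'orientation': {'x': qx, 'y': qy, 'z': qz, 'w': qw},
    }))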
I saw this for sale in a Christmas market for 500 yen (~$5 USD) and immediately bought it. My wife questioned my sanity, since it’s for “boy’s day” (It’s a “五月人形”) and we have no boys.
But, I could see the robotic potential!
Note: I have put the gold plated metal parts through an ultrasonic cleaner and buffed them with British Museum wax. But I’m not sure how to remove the corrosion – please message me if you have any ideas.
Next step is to build a wooden skeleton to mount the motors:
I’ve used 2 MG996 servo motors (presumably clones).
And a quick and dirty arduino program to move the head in a circular motion:
#include <Servo.h>

Servo up_down_servo;     // create servo object to control a servo
Servo left_right_servo;  // create servo object to control a servo

int pos = 0;             // variable to store the servo position

void setup() {
  up_down_servo.attach(8);     // attaches the servo on pin 8 to the servo object
  up_down_servo.write(170);    // Maximum is 170 and that's looking forward, and smallest is about 140 and that is looking a bit downwards
  left_right_servo.attach(9);  // attaches the servo on pin 9 to the servo object
  left_right_servo.write(170); // 155 is facing forward. 170 is looking to the left for him
  Serial.begin(115200);
  Serial.println("up_down, left_right"); // Output a CSV stream, showing the desired positions
}

void loop() {
  const float graduations = 1000.0;
  for (pos = 0; pos < graduations; pos++) {
    const float up_down_angle = 160 + 10 * sin(pos / graduations * 2.0 * 3.1415);
    const float left_right_angle = 155 + 5 * cos(pos / graduations * 2.0 * 3.1415);
    left_right_servo.write(left_right_angle);
    up_down_servo.write(up_down_angle);
    Serial.print(up_down_angle);
    Serial.print(",");
    Serial.println(left_right_angle);
    delay(15);
  }
}
And the result:
Jerky Movement
Unfortunately the movement of the head is extremely jerky. To quantify this, let’s track the position accurately.
To do so, I mounted a TrackIR target: 3 small reflective markers, which are then picked up by a camera. The positions of the reflective markers are determined, and from them the 6DOF position of the head is calculated.
Actually logging the data from TrackIR was tricky. To achieve this, I used FreePIE – an application for ‘bridging and emulating input devices’. It supports TrackIR as an input, and allows you to run a python function on input.
I wrote the following script to output CSV:
#Use Z to force flushing file to disk
import os

def update():
    # Note that you need to set the curves for these to 1:1 in the TrackIR gui,
    # which needs to be running at the same time.
    yaw = trackIR.yaw
    pitch = trackIR.pitch
    roll = trackIR.roll
    x = trackIR.x
    y = trackIR.y
    z = trackIR.z
    csv_str = yaw.ToString() + "," + pitch.ToString() + "," + roll.ToString() + "," + x.ToString() + "," + y.ToString() + "," + z.ToString() + "\n"
    #diagnostics.debug(csv_str)
    f.write(csv_str)

if starting:
    global f
    trackIR.update += update
    f = open("trackir output.csv", "w")
    diagnostics.debug("Logging to: " + os.path.realpath(f.name))
    f.write("yaw,pitch,roll,x,y,z\n")

if keyboard.getPressed(Key.Z):
    f.flush()
    diagnostics.debug("flushing")
And taped the head tracker like so:
Note 1: I also set the TrackIR response curves to all be 1:1 and disabled smoothing of the input. We want the raw data. Although I later realised that freepie exposes the raw data, so I could have just used that instead.
Note 2: If you enable logging in FreePIE, it outputs the log to “C:\Program Files (x86)\FreePIE\trackirworker.txt”
Note 3: For some reason, FreePIE is only updating the TrackIR readings at 4hz. I don’t know why – I looked into the code, but without much success. I filed a bug.
And here are the results:
There is a lot of jerkiness here. This is supposed to look like a beautiful sinusoidal curve:
To fix this, and make the movement smooth, we need to understand where this jerkiness comes from.
Source of jerkiness
A quick word about motivation: For almost my entire life, I’ve wanted to build robots with smooth natural-looking movement, and I’ve hated the unnatural jerkiness that you often see.
A quick word about terminology: ‘jerk’ has a real technical meaning – it is the third derivative of position with respect to time. Since we are aiming for a circular movement, and since the 3rd derivative of a sinusoidal movement is still sinusoidal, even in the ideal situation there is still technically a non-zero jerk. So here I mean jerkiness in a more informal manner. I have previously written about, and written code for, motion planning with a guaranteed jerk of 0 when controlling a car, for which the jerkiness really must be minimized to prevent whiplash and discomfort.
We must understand how a servo works:
How a servo works
A servo consists of a small geared DC motor, a potentiometer to measure position, and a control circuit that presumably uses some basic PID algorithm to power the motor to try to match the potentiometer reading to the pulse width given on the input line:
To prevent the motor from ‘rocking’ back and forth around the desired value, the control circuit implements a ‘dead band width’ – a region in which it considers the current position to be ‘close enough’ to the given pulse.
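As a toy model (just to illustrate the idea – I have no idea what the real control circuit does beyond this), the effect of the dead band is something like:

def servo_drive(target_us, measured_us, dead_band_us=5.0, kp=1.0):
    """Toy model of the servo's internal control loop, working in pulse-width units."""
    error = target_us - measured_us
    if abs(error) <= dead_band_us / 2.0:
        return 0.0            # inside the dead band: 'close enough', don't drive the motor
    return kp * error         # otherwise drive the motor proportionally towards the target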
For the MG996R servo that I’m using, the spec sheet says:
Note the ‘Dead band width’ of 5µs. The servo moves about 130 degrees when given a duty cycle between 800µs and 2200µs. NOTE: This means that the maximum duty cycle is only about 10% of the PWM period.
So this means that we move approximately 10µs per degree: (2200µs – 800µs)/130° ≈ 10µs.
So our dead band width is approximately 0.5°. Note that this means ±0.25°.
The Arduino Servo::write(int angle) command takes an integer number of degrees, and by default maps 544µs–2400µs onto 0°–180°, i.e. (2400µs – 544µs)/180° ≈ 10µs per degree. This means that we are introducing four times as much ‘jerkiness’ into our movement as we need to, simply from sloppy API usage. (Numbers taken from the Servo.h code on github.)
So let’s change the code to use the four-times-as-accurate Servo::writeMicroseconds(int microseconds), which lets us specify the number of microseconds for the duty cycle directly:
#include <Servo.h>

Servo up_down_servo;     // create servo object to control a servo
Servo left_right_servo;  // create servo object to control a servo

int pos = 0;             // variable to store the servo position

void servoAccurateWrite(Servo &servo, float angle) {
  /* 544 is the default used by the Servo object, but it's ~800 on the MG996R */
  const int min_duty_cycle_us = MIN_PULSE_WIDTH;
  /* 2400 is the default used by the Servo object, but it's ~2200 on the MG996R */
  const int max_duty_cycle_us = MAX_PULSE_WIDTH;
  /* 180 is the default used by the Servo object, but it's ~130 on the MG996R */
  const int min_to_max_angle = 180;
  const int duty_cycle_us = min_duty_cycle_us + angle * (max_duty_cycle_us - min_duty_cycle_us) / min_to_max_angle;
  servo.writeMicroseconds(duty_cycle_us);
}

void setup() {
  up_down_servo.attach(8);     // attaches the servo on pin 8 to the servo object
  up_down_servo.write(170);    // Maximum is 170 and that's looking forward, and smallest is about 140 and that is looking a bit downwards
  left_right_servo.attach(9);  // attaches the servo on pin 9 to the servo object
  left_right_servo.write(170); // 155 is facing forward. 170 is looking to the left for him
  Serial.begin(115200);
  Serial.println("up_down, left_right"); // Output a CSV stream, showing the desired positions
}

void loop() {
  const float graduations = 1000.0;
  for (pos = 0; pos < graduations; pos++) {
    const float up_down_angle = 160 + 10 * sin(pos / graduations * 2.0 * 3.1415);
    const float left_right_angle = 155 + 5 * cos(pos / graduations * 2.0 * 3.1415);
    servoAccurateWrite(left_right_servo, left_right_angle);
    servoAccurateWrite(up_down_servo, up_down_angle);
    Serial.print(up_down_angle);
    Serial.print(",");
    Serial.println(left_right_angle);
    delay(15);
  }
}
And the result of the position, as reported by TrackIR:
What a huge difference this makes! It doesn’t look very sinusoidal, but it’s a lot smoother.
Imagine you have a bluetooth device somewhere in your house, and you want to try to locate it. You have several other devices in the house that can see it, and so you want to triangulate its position.
This article is about the result I achieved, and the methodology.
I spent two solid weeks working on this, and this was the result:
Estimating the most probable position of a bluetooth device, based on 7 strength readings.
We are seeing a map, overlaid in blue with the most likely position of a lost bluetooth device.
And a more advanced example, this time with many different devices, and a more advanced algorithm (discussed further below).
Note that some of the devices have very large uncertainties in their position.
Estimating position of multiple bluetooth devices, based on RSSI strength.
There are quite a few articles, videos and books on estimating the distance of a bluetooth device (or wifi hub) based on knowing the RSSI signal strength readings.
But I couldn’t find any that gave the uncertainty of that estimation.
Noise is Normally distributed, with mean 0 and variance σ²
d is our distance (above) or estimated distance (below)
Rearranging to get an estimated distance, we get:
Now Noise is sampled from a Normal distribution, with mean = 0 and variance = σ², so let’s write our estimated d as a random variable:
Important note: the random variable d is distributed as the probability of the rssi given the distance, not the probability of the distance given the rssi. This is important, and it means that we need to at least renormalize the probabilities over all possible distances to make sure that they add up to 1. See the section at the end for more details.
Adding a constant to a normal distribution just shifts the mean:
Now let’s have a bit of fun, by switching it to base e. This isn’t actually necessary, but it makes it straightforward to match up with wikipedia’s formulas later, so:
Distance in meters against probability density, for an rssi value of -80, A=-30, n=3, sigma^2=40
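To make this concrete, here’s a small Python sketch. It assumes the usual log-distance path-loss model, rssi = A − 10·n·log10(d) + Noise, which matches the parameters used in the graph above:

import numpy as np

A = -30.0      # expected rssi at 1 metre
n = 3.0        # path loss exponent
sigma2 = 40.0  # noise variance
rssi = -80.0   # measured value

def rssi_likelihood(rssi, d):
    """p(rssi | distance d), assuming rssi = A - 10*n*log10(d) + Noise."""
    mu = A - 10.0 * n * np.log10(d)
    return np.exp(-(rssi - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

distances = np.linspace(0.1, 100.0, 1000)
density = rssi_likelihood(rssi, distances)   # density as a function of distance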
Bayes Theorem
I mentioned earlier:
Important note: the random variable d is distributed as the probability of the rssi given the distance, not the probability of the distance given the rssi. This is important, and it means that we need to at least renormalize the probabilities over all possible distances to make sure that they add up to 1. See the section at the end for more details.
So using the above graph as an example, say that our measured RSSI was -80 and the probability density for d = 40 meters is 2%.
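Continuing the sketch above: renormalizing over all the candidate distances (i.e. assuming a flat prior over distance) gives something we can actually treat as p(d | rssi):

# density and distances come from the earlier sketch
bin_width = distances[1] - distances[0]
posterior = density / (density.sum() * bin_width)  # a proper density over distance
print(posterior.sum() * bin_width)                 # ~1.0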
The Laplace Transform is a tool used throughout mathematics, science and engineering, and there are many books and web pages about it.
And yet I cannot find a single decent visualization of it! Not a single person that I can find appears to have tried to actually visualize what it is doing. There are plenty of animations for the Fourier Transform like:
But nothing for Laplace Transform that I can find.
So, I will attempt to fill that gap.
What is the Laplace Transform?
It’s a way to represent a function that is 0 for time < 0 (typically) as a sum of many waves that look more like:
Graph of an exponentially growing sinusoidal wave
Note that what I just said isn’t entirely true, because there’s an imaginary component here too, and we’re actually integrating. So take this as a crude idea just to get started, and let’s move onto the math to get a better idea:
Math
The goal of this is to visualize how the Laplace Transform works:
To do this, we need to look at the definition of the inverse Laplace Transform:
While pretty, it’s not so nice to work with, so let’s make the substitution:
so that our new limits are just −∞ to +∞, and giving:
Which we will now approximate as:
Code
The code turned out to be a bit too large for a blog post, so I’ve put it here:
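But the core of the approximation is small. Here’s a much-simplified Python sketch, assuming the standard Bromwich-integral definition of the inverse transform (integrate F(s)·e^(st) along the vertical line s = σ + jω, and divide by 2πj):

import numpy as np

def inverse_laplace_approx(F, t, sigma=1.0, omega_max=100.0, steps=2000):
    """Approximate f(t) by a finite sum over s = sigma + j*omega, omega in [-omega_max, omega_max]."""
    omegas = np.linspace(-omega_max, omega_max, steps)
    d_omega = omegas[1] - omegas[0]
    s = sigma + 1j * omegas
    # The j from ds = j*d_omega cancels the 1/j in the 1/(2*pi*j) prefactor.
    total = np.sum(F(s) * np.exp(s * t)) * d_omega / (2.0 * np.pi)
    return total.real  # the imaginary parts cancel for a real f(t)

# Example: F(s) = 1/s^2 is the Laplace transform of f(t) = t
print(inverse_laplace_approx(lambda s: 1.0 / s**2, t=2.0))  # ~2.0, up to discretization error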
Note: The graphs say “Next frequency to add: … where “, but really it should be “Next two frequencies to add: … where ” since we are adding two frequencies at a time, in such a way that their imaginary parts cancel out, allowing us to keep everything real in the plots. I fixed this comment in the code, but didn’t want to rerender all the videos.
A cubic polynomial:
A cosine wave:
Now a square wave. This has infinities going to infinity, so it’s not technically possible to plot. But I tried anyway, and it seems to visually work:
Note the overshoot ‘ringing’ at the corners in the square wave. This is the Gibbs phenomenon and occurs in Fourier Transforms as well. See that link for more information.
Now one that it absolutely can’t handle: the Dirac delta function, δ(t) – a function that is 0 everywhere, except a sharp peak at exactly time = 0. In the S domain, this is a constant, meaning that we never converge. But visually it’s still cool.
Note that this never ‘settles down’ (converges) because the frequency is constantly increasing while the magnitude remains constant.
There is visual ‘aliasing’ (like how a wheel can appear to go backwards as its speed increases). This is not “real” – it is an artifact of trying to render high frequency waves. If we rendered (and played back) the video at a higher resolution, the effect would disappear.
At the very end, it appears as if the wave is just about to converge. This is not a coincidence and it isn’t real. It happens because the frequency of the waves becomes too high so that we just don’t see them, making the line appear to go smooth, when in reality the waves are just too close together to see.
The code automatically calculates this point and sets our time step such that it only breaks down at the very end of the video. If we made the timestep smaller, this effect would disappear.
Sometimes you come across a weird bug where the output seems to be completely impossible. And it’s extremely hard to debug or search on google for, if you don’t know about this:
If your code has Undefined Behaviour, the compiler is allowed to assume it won’t happen, and can ‘optimize out’ chunks of your code.
Here’s a code snippet in a real bug I found yesterday:
#define FOO_SIZE 10
int foo2[FOO_SIZE];
...
int n=0;
do {
    p->foo[n] = foo2[n];
    n++;
} while( (foo2[n] != 0) && (n < FOO_SIZE) );
printk("n is %d\n", n);
This printed out:
n is 165
and the system crashed.
But how can n become larger than FOO_SIZE=10 ? Because the compiler ‘optimized’ away the check. Here’s the asm:
What we are seeing here is that the “n < FOO_SIZE” check has been completely removed by the compiler. Why?
Because in the check, if n == FOO_SIZE, we would be checking if: foo2[FOO_SIZE] != 0. But this is out of bounds for foo2. The compiler knows that this is out of bounds, and so knows that this is undefined behaviour. But the compiler is allowed to assume that undefined behaviour doesn’t happen, and so it assumes that n can NEVER be >= FOO_SIZE. Thus it can remove the n < FOO_SIZE check.
This can be fixed by switching the order of the && like:
} while( (n < FOO_SIZE) && (foo2[n] != 0) );
Or, by checking n-1 instead (which is slightly different behaviour, but good enough for me. I was changing a lot of code with this bug.)
Here I assume these photos are in camera_cal/calibration*.jpg
Even if you ultimately want to calibrate the images from C++, Python, JavaScript etc, the calibration itself can be done separately, in a more convenient language, producing a matrix that you can use pretty much anywhere. In my case, I calibrate in a Jupyter notebook, but use the matrix in C++.
%matplotlib inline
import numpy as np
import cv2
import glob
import matplotlib.pyplot as plt
#%matplotlib qt

# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
objp = np.zeros((6*9,3), np.float32)
objp[:,:2] = np.mgrid[0:9,0:6].T.reshape(-1,2)

# Arrays to store object points and image points from all the images.
objpoints = [] # 3d points in real world space
imgpoints = [] # 2d points in image plane.

# Make a list of calibration images
images = glob.glob('camera_cal/calibration*.jpg')

# Step through the list and search for chessboard corners
for fname in images:
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Find the chessboard corners
    ret, corners = cv2.findChessboardCorners(gray, (9,6), None)

    # If found, add object points, image points
    if ret == True:
        objpoints.append(objp)
        imgpoints.append(corners)

        # Draw and display the corners
        img = cv2.drawChessboardCorners(img, (9,6), corners, ret)
        plt.imshow(img)
        plt.show()

ret, camera_undistort_matrix, camera_undistort_dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
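Having got the camera matrix and distortion coefficients, undistorting an image is then a one-liner (the filename here is just an example):

img = cv2.imread('camera_cal/calibration1.jpg')
undistorted = cv2.undistort(img, camera_undistort_matrix, camera_undistort_dist)
plt.imshow(cv2.cvtColor(undistorted, cv2.COLOR_BGR2RGB))
plt.show()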
In the top left is the original image, in the top right is the undistorted image. There is little difference to the eye, but correcting the distortion lets us now apply further transforms, such as a perspective transform, to provide an apparent “top-down view”.
It is also useful for applying other transforms key to stitching together images from multiple cameras.
It was cool to see my VR software being used by the Dallas Cowboys:
To make this a bit more of a technical post, instead of purely a brag, we use a fair amount of machine learning behind the scenes. We look for similar items in the scene between cameras, to match up and align the seams:
We try to ‘autotune’ all the parameters. We adjust both the seam position and the camera warp to minimize the squared RGB pixel difference in the overlapped areas. This is actually quite computationally expensive, so we run this in CUDA to accelerate it.
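As a toy illustration of that cost (nothing like the production CUDA code), for a given seam and warp it is essentially the sum of squared RGB differences over the overlap region:

import numpy as np

def overlap_cost(img_a, img_b, overlap_mask):
    """img_a, img_b: HxWx3 arrays from two cameras, already warped into the shared
    panorama space; overlap_mask: HxW boolean array marking the overlapping pixels."""
    diff = img_a[overlap_mask].astype(np.float64) - img_b[overlap_mask].astype(np.float64)
    return np.sum(diff ** 2)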
A way to make a webpage that lets you have an interactive dialogue by choosing options, like the old Interactive Fiction games.
The input file is very simple:
PRINT Hello!
WAIT ; This is a comment. Wait for a key press or mouse click
PRINT I would like to ask you a question.
PRINTW Please don't be nervous. ; The 'W' means WAIT afterwards
PRINT Are you happy?
PRINT [0] Yes
PRINT [1] Not really...
CALL INPUTINT(0, 1)
IF RESULT == 0
PRINTW I'M SO HAPPY THAT YOU'RE HAPPY!
ELSE
PRINTW Then I am miserable 😦
ENDIF
The challenge is to make a webpage that could read that input, and run it, producing the interactive output shown in the video above.
Perhaps have a think if you don’t know how you would implement this. It is perhaps not so obvious.
Here is the approach that I decided to take. Note that I only wanted to spend a weekend on this, so I took various shortcuts that I wouldn’t recommend if writing for production:
Read the input file in, using Javascript with a Tokenizer and Parser.
Output javascript using the above Parser, effectively making a transpiler from the input language to javascript. Call outside javascript functions to ‘PRINT’ etc.
Use ‘eval’ to run the resulting transpiled code. Yes, eval is evil and this is a case of don’t do what I do.
Use Promises, via async and await, to ‘pause’ and allow interaction. Implement ‘WAIT’, ‘PRINTW’ ‘INPUTINT’ etc functions in javascript with Promises, so that they only resolve when the user has clicked, typed a number etc.
Display output by just appending to a list, and displaying that list in React.
1. Read the file in, using Javascript with a Tokenizer and Parser
I used jison. Although the README blurb says that it is for Context Free Grammars (because it is based on BISON, which is likewise), it does actually support context. This was vital, because the input file language is not actually a context free grammar.
2. Output javascript using the above Parser, effectively making a transpiler from the input language to javascript
The correct way to do this would be to create an Abstract Syntax Tree, however I didn’t want to take too long on this, and instead simply outputted javascript code as a string.
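To give a flavour of what that means, here’s a Python-flavoured sketch of the idea (the real version is JavaScript built with jison, and handles far more of the language) – each statement in the input simply becomes a string of JavaScript:

def transpile(statement):
    cmd, _, arg = statement.partition(" ")
    if cmd == "PRINT":
        return "print(" + repr(arg) + ");"
    if cmd == "PRINTW":
        return "await printw(" + repr(arg) + ");"   # print, then wait
    if cmd == "WAIT":
        return "await wait();"
    if cmd == "CALL":
        return "RESULT = await " + arg + ";"        # e.g. CALL INPUTINT(0, 1)
    raise ValueError("unhandled statement: " + statement)

program = "\n".join(transpile(s) for s in ["PRINT Hello!", "WAIT", "PRINTW Are you happy?"])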
3. Use ‘eval’ to run the resulting transpiled code.
This is very frowned upon, but this was a weekend project, so….
There is one trick that I used here. I wrap the entire program to be ‘eval’ed like:
(async function() { ..... })()
This allows the program code inside to use async and await to wait for input, and the eval is returning a Promise. One minor point – when we use eval to evaluate this, we want to catch errors that the Promise throws, to provide clear feedback to the user if there are problems. E.g.
try {
    await eval(program);
} catch(e) { ... }
4. Use Promises, via async and await, to ‘pause’ and allow interaction. Implement ‘WAIT’, ‘PRINTW’ ‘INPUTINT’ etc functions in javascript with Promises, so that they only resolve when the user has clicked, typed a number etc.
async function wait() {
    await new Promise(
        resolve => cb_setClickListener(() => resolve())
    );
}
So we subscribe to a click listener via the callback ‘cb_setClickListener’ and then resolve the promise (and thus resume running the program) when the click is published.
Inside the React page, we now listen for clicks and publish it to the callback:
And likewise for keypresses. (Note, I’ve simplified the code here a bit. In the real code, I pass the keypressed etc, so that INPUTINT can listen to a particular key).
5. Display output by just appending to a list, and displaying that list in React.
This way, when presented with a choice, the user can just click a link instead. I maintain two separate output lists, so that I can disable the links once we are done with them.
Conclusion and notes
It worked very nicely! I further extended this to support variables, entering strings, showing pictures, and so on.
The language is actually a real language, but it is very niche and used almost exclusively by Japanese developers. It hasn’t seen much acceptance or use outside of Japan.