USB i2c convertor chip. A lesson in USB Latency.

I was interested in transmitting data from an i2c sensor to an ARM board running FreeRTOS along a long wire, and decided to use a i2c to i2c converter. Specifically a chip called MCP2221.

A single sensor reading from the i2c sensor is 2 bytes. I wanted to read 128 samples per second, so I need 256 bytes per second of data.

I figured we’d be able to do easily, what with i2c at 400khz and USB at 12 M/s. There is considerable overhead – each sensor reading needs four USB packets:

  1. Send USB packet from arm board to usb chip telling it to read from i2c sensor
  2. Receive ACK from USB chip
  3. Send USB packet from arm board to usb chip telling it to send us the data that it just read
  4. Receive the data read

Each USB packet is 64 bytes, so we have an overhead of 256 bytes for every 2 bytes of data that we want to read. This means we need a throughput of (256 bytes per sample) * (128 samples per second) = 32kb/sec.

Trying to get 32kb/s out of a 1.5MB/s USB link should be easy, I naively thought.

I wrote a driver for the chip for FreeRTOS, and tested it out.

I got a miserable 40 samples per second. Far out from the target of 128 samples per second.

So what was going wrong? Obviously I suspected a mistake with my driver. So first step is to try to reproduce on linux. I downloaded the MCP2221 Linux driver, and tested the speed of it, like so:

#!/usr/bin/env python
import smbus
import time
import sys

CONTINUOUS_MODE = True
I2C_ADDRESS = 0x48

if (len(sys.argv) == 1):
print("Please specify the bus, like:\n sudo i2c_sensor_read.py $(sudo i2cdetect -l | sed -ne 's/^i2c-\([0-9]\+\).*i2c-mcp2221.*$/\1/p')");
exit(1);

bus = smbus.SMBus(int(sys.argv[1]))

def getContinuousValue():
values = bus.read_i2c_block_data(I2C_ADDRESS, 0x0);
value = (values[0] << 8) + values[1]
return value;

last_epoch_time = 0

write_bytes([0x01, 0x4, 0x83]);
while True:
epoch_time = int(time.time()*1000);
epoch_diff = epoch_time - last_epoch_time;
last_epoch_time = epoch_time;
hz = 0;
if (epoch_diff!=0):
hz = int(1000/epoch_diff);
print str(getContinuousValue()) + " (took " + str(int(epoch_diff)) + " ms - " + str(hz) + " per second)";

This produced a miserable 62.5 samples per second. i.e. 16ms. I even tested on windows and got the same result.

So what was going on?

I tested with a oscilloscope. It was taking 1ms to send a USB packet to receive the reply, compared to the 8ms for a roundtrip that we see in the driver.

This 8ms for a roundtrip is 4ms average for sending and receiving a packet. The code in the driver (after I modified it) looks like:

static int mcp2221_ll_cmd(struct i2c_mcp2221 *dev)
{
       int rv;

       /* tell everybody to leave the URB alone */
       dev-&gt;ongoing_usb_ll_op = 1;

       /* submit the interrupt out ep packet */
       if (usb_submit_urb(dev-&gt;interrupt_out_urb, GFP_KERNEL)) {
               dev_err(&dev-&gt;interface-&gt;dev,
                               "mcp2221(ll): usb_submit_urb intr out failed\n");
               dev-&gt;ongoing_usb_ll_op = 0;
               return -EIO;
       }

       /* wait for its completion */
       rv = wait_event_interruptible(dev-&gt;usb_urb_completion_wait,
                       (!dev-&gt;ongoing_usb_ll_op));
       if (rv interface-&gt;dev, "mcp2221(ll): wait interrupted\n");
               goto ll_exit_clear_flag;
       }

       /* tell everybody to leave the URB alone */
       dev-&gt;ongoing_usb_ll_op = 1;

       /* submit the interrupt in ep packet */
       if (usb_submit_urb(dev-&gt;interrupt_in_urb, GFP_KERNEL)) {
               dev_err(&dev-&gt;interface-&gt;dev, "mcp2221(ll):
usb_submit_urb intr in failed\n");
               dev-&gt;ongoing_usb_ll_op = 0;
               return -EIO;
       }

       /* wait for its completion */
       rv = wait_event_interruptible(dev-&gt;usb_urb_completion_wait,
                       (!dev-&gt;ongoing_usb_ll_op));
       if (rv interface-&gt;dev, "mcp2221(ll): wait interrupted\n");
               goto ll_exit_clear_flag;
       }

ll_exit_clear_flag:
       dev-&gt;ongoing_usb_ll_op = 0;
       return rv;
}

I spoke to the Alan Stern and Greg KH, the Linux kernel USB maintainers, who were not surprised by this 8ms delay at all, and told me:

Why are we seeing such a large latency?
I don’t see anything wrong here, USB isn’t guaranteed to have any specific latency, what you are doing here is the worst-possible-case for a USB system (i.e. send a packet, wait for it to complete, send another one, etc.) There are lots of ways to get much better throughput if that is what you are wanting to do.

Can we make this faster?

For a horrid protocol like this, no, there isn’t any way to make it go faster, sorry. That is designed in such a way to make it the worst possible thing for a USB system, go kick the person who designed such a thing (hint, they have no idea how USB works…)

I strongly suggest going and getting a different sensor chip, don’t encourage such behaviour by actually buying their hardware.

Where is the delay coming from?

Multiple places: time to submit the request, time to reserve bandwidth for the previously unused interrupt endpoint, time to complete the transfer, all multiplied by 2.

You can get more information from usbmon (see Documentation/usb/usbmon.txt in the kernel source). But Greg is right; the protocol you described is terrible. There’s no need for a multiple ping-pong interchange like that; all you should need to do is wait for the device to send the next bit (or whatever) of data as soon as it becomes available.