Why I'm ditching the Arduino software platform

I'm getting set up for my next project and decided to update my development environment. I've finally decided to entirely ditch the Arduino software environment and just use the boards. I stopped using the Arduino IDE some time ago, but now I'm going whole hog and ditching the Arduino library as well. Why? Well, it's simple:

Significant parts of it are a pile of junk.

I know that's a pretty strong statement, so I'd better back it up with evidence. OK, let's start with the hardware serial IO code. Before version 1.0 of the Arduino platform, reading from the serial ports was interrupt-driven but writing wasn't. Instead, the code went into a spin loop, polling the transmit status bit until the USART was idle before sending each character. Why was that a problem? Well, if you wrote an 80-character string at 9600 baud it would take (8 data bits + 1 start bit + 1 stop bit) * 80 / 9600 = 0.083 seconds, i.e. 83 milliseconds. That's a huge amount of time for the CPU to spend doing nothing but output. I found a number of posts where people were complaining that doing reasonable amounts of IO screwed up all the other bits of their sketches, and no wonder. Admittedly the Arduino 1.0 release notes say that's been changed so that output now uses interrupts as well, but that's not the end of the problems.
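To make the arithmetic above concrete, here is a small host-side sketch. The function names are made up for illustration, and the 16 MHz clock is an assumption (the post does not state a clock speed).

```cpp
#include <cassert>
#include <cstdint>

// Time in milliseconds to clock `chars` characters out of a UART,
// assuming 8N1 framing: 1 start bit + 8 data bits + 1 stop bit.
double transmit_time_ms(std::uint32_t chars, std::uint32_t baud) {
    assert(baud > 0);
    const std::uint32_t bits_per_char = 1 + 8 + 1;
    return 1000.0 * bits_per_char * chars / baud;
}

// CPU cycles spent busy-waiting for that long at a given clock rate.
// The 16 MHz default is an assumption, not a figure from the post.
double cycles_burned(double time_ms, double clock_hz = 16e6) {
    return time_ms / 1000.0 * clock_hz;
}
```

Under those assumptions, an 80-character write at 9600 baud keeps a polling transmitter busy for roughly 83 ms, which is well over a million clock cycles that the rest of the sketch does not get.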

Let's take a peek at the HardwareSerial.cpp file. First thing to note is that two 64-byte buffers are allocated for each USART, even if it isn't used. That's 128 bytes on a Duemilanove and 512 bytes on a Mega, or roughly 6% of the available SRAM in each case (2K and 8K respectively). On the Duemilanove that's reasonable as there's only one USART, but on the Mega it means 384 bytes of precious memory are wasted when, as is usual, only one USART is in use.

OK, let's look at the new write() function that does interrupt-driven output:

size_t HardwareSerial::write(uint8_t c)
{
  int i = (_tx_buffer->head + 1) % SERIAL_BUFFER_SIZE;

  // If the output buffer is full, there's nothing for it other than to
  // wait for the interrupt handler to empty it a bit
  // ???: return 0 here instead?
  while (i == _tx_buffer->tail)
    ;

  _tx_buffer->buffer[_tx_buffer->head] = c;
  _tx_buffer->head = i;

  sbi(*_ucsrb, _udrie);
  
  return 1;
}

Is there a problem? Let's look at the definition of _tx_buffer:

struct ring_buffer
{
  unsigned char buffer[SERIAL_BUFFER_SIZE];
  volatile unsigned int head;
  volatile unsigned int tail;
};

Oh dear. head and tail are declared as unsigned int, i.e. 16 bits, 2 bytes. They are accessed by both the write routine and the interrupt service routine that actually transmits the data, yet there's no locking in the write routine, so the accesses aren't atomic. Why is that an issue? Well, the avr-libc documentation makes it clear:

A typical example that requires atomic access is a 16 (or more) bit variable that is shared between the main execution path and an ISR. While declaring such a variable as volatile ensures that the compiler will not optimize accesses to it away, it does not guarantee atomic access to it.

The documentation goes on to explain the sorts of symptoms you'll see if you ignore this; see the avr-libc documentation for <util/atomic.h> if you want the full details. This is inexcusably shoddy code - the constraints on accessing variables that are shared between ISR and non-ISR code are well-known. What really concerns me is that people will use the Arduino code as an example of 'good' AVR code, and it isn't. In many places it's frankly awful.
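The failure mode is easy to simulate on a host. The sketch below is only a model: the struct and function names are hypothetical, the "interrupt" is an ordinary function call, and the low-byte-first load order is an assumption about the generated code. It shows how a 16-bit value read as two separate bytes can come back as a number that was never stored.

```cpp
#include <cassert>
#include <cstdint>

// A 16-bit value as an 8-bit CPU sees it: two separate bytes in memory.
struct Shared16 {
    std::uint8_t lo;
    std::uint8_t hi;
};

std::uint16_t value_of(const Shared16& v) {
    return static_cast<std::uint16_t>((v.hi << 8) | v.lo);
}

// Stand-in for an ISR that advances the shared value by one.
void fake_isr(Shared16& v) {
    const std::uint16_t next = static_cast<std::uint16_t>(value_of(v) + 1);
    v.lo = static_cast<std::uint8_t>(next & 0xFF);
    v.hi = static_cast<std::uint8_t>(next >> 8);
}

// Unprotected read: the "interrupt" fires between the two byte loads.
std::uint16_t torn_read(Shared16& v) {
    const std::uint8_t lo = v.lo;   // first load
    fake_isr(v);                    // interrupt lands here
    const std::uint8_t hi = v.hi;   // second load sees the updated high byte
    return static_cast<std::uint16_t>((hi << 8) | lo);
}

// Protected read: both loads complete before the pending "interrupt"
// is serviced, which is what an atomic block achieves on real hardware.
std::uint16_t guarded_read(Shared16& v) {
    const std::uint8_t lo = v.lo;
    const std::uint8_t hi = v.hi;
    fake_isr(v);
    return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

Starting from 0x00FF, the unprotected read returns 0x01FF, which is neither the old value nor the new one (0x0100). On a real AVR the guard would normally be ATOMIC_BLOCK from <util/atomic.h>. Another option, offered here only as a suggestion, is to make the indices a single byte, which is plenty for a 64-byte buffer.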

"So what?" you say, "That's only one chunk of code that's a bit naff." Unfortunately it's not an isolated instance. Let's move on now to look at one of the newer features that has been added to the Arduino platform, the re-implemented String class. Ok, let's build a minimal program that uses it:

#include "WString.h"
int main(void) {
    String bloat = "hello world";
    return 0;
}

And let's build it:

WString.cpp: In member function ‘int String::lastIndexOf(char, unsigned int) const’:
WString.cpp:503:38: error: comparison of unsigned expression < 0 is always false [-Werror=type-limits]
WString.cpp: In member function ‘int String::lastIndexOf(const String&, unsigned int) const’:
WString.cpp:519:63: error: comparison of unsigned expression < 0 is always false [-Werror=type-limits]
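The diagnostic is worth a closer look: a check of the form `x < 0` on an unsigned value can never fire, so a caller who passes -1 actually delivers a very large index. The helper below is hypothetical and is not the Arduino implementation; it is only a sketch of how a backwards search with an unsigned start index can be range-checked against the string length instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not the Arduino code: search backwards for `c`
// in `s`, starting at index `from`. Returns -1 if `c` is not found or
// if `from` is out of range.
int last_index_of(const char* s, char c, unsigned int from) {
    assert(s != nullptr);
    const std::size_t len = std::strlen(s);
    // A negative argument arrives here as a huge unsigned value, so the
    // only meaningful range check is against the length.
    if (from >= len) return -1;
    for (std::size_t i = static_cast<std::size_t>(from) + 1; i-- > 0;) {
        if (s[i] == c) return static_cast<int>(i);
    }
    return -1;
}
```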

Sigh. One would think that the Arduino developers would at least turn on warnings when they are compiling their code, but they don't. And in this case, the consequence is a bug. So, temporarily comment out the offending lines so we get a successful build, and:

/opt/avr-gcc/bin/avr-size build/test.elf
   text    data     bss     dec     hex filename
  10194      20       5   10219    27eb build/test.elf

Can that really be right? 10K for a one-line program? Unfortunately it is. Any mention of String pulls in the entirety of the class, as well as all the avr-libc routines it references. So on a Duemilanove, which only has 32K of flash to start with, a third of the available program memory is gone before you start. At the time the class was being rewritten I expressed my opinion that it was probably a bad idea and that the Arduino developers really needed to target the platform they actually had and not the one they wished they had. And that's not the end of the issues with the String class. On a constrained-memory platform such as the AVR, a class like String that relies on malloc, creates lots of temporaries, fragments the (tiny) heap and has no real ability to deal with out-of-memory conditions is a recipe for problems, and those problems will manifest themselves as random, mysterious and un-diagnosable run-time errors. Sure enough, a quick Google search shows that's exactly what tends to happen - just about the worst possible outcome for a platform that's targeted at neophytes.
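One common alternative on small targets is to format into a fixed-size buffer that the caller owns, so there is no heap traffic and running out of room is an explicit, checkable condition. The following is a minimal sketch of that idea; `format_reading` is a made-up name, and note that snprintf carries its own flash cost.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Format a labelled reading into a caller-supplied buffer.
// Returns true if the whole message fitted, false if it was truncated.
// Either way the buffer is left NUL-terminated.
bool format_reading(char* out, std::size_t size, const char* label, int value) {
    assert(out != nullptr && size > 0);
    const int needed = std::snprintf(out, size, "%s=%d", label, value);
    return needed >= 0 && static_cast<std::size_t>(needed) < size;
}
```

The point of the design is that the worst case is known at compile time: the buffer is as big as it is declared, and a message that doesn't fit produces a false return rather than a corrupted heap.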

That's just two examples - there are others as well, such as the well-known performance problems with pin access, which may be up to 50x slower than direct pin access. In fact the only two remaining parts of the Arduino libraries that I still use are the millisecond clock and the serial IO, and they are easy enough to replace, so that's what I'm doing.
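For readers who haven't seen it, direct pin access amounts to setting or clearing one bit in a port register. The sketch below models that on a host with an ordinary variable standing in for the register; `Port`, `pin_high` and `pin_low` are illustrative names, not part of any library.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a memory-mapped I/O port so the sketch runs on a host.
// On an AVR this would be a real register such as PORTB.
using Port = volatile std::uint8_t;

// Drive one pin of the port high by setting its bit.
inline void pin_high(Port& port, std::uint8_t bit) {
    assert(bit < 8);
    port = static_cast<std::uint8_t>(port | (1u << bit));
}

// Drive one pin of the port low by clearing its bit.
inline void pin_low(Port& port, std::uint8_t bit) {
    assert(bit < 8);
    port = static_cast<std::uint8_t>(port & ~(1u << bit));
}
```

When the port and bit are compile-time constants, a compiler targeting the AVR can typically reduce such a call to a single bit-set or bit-clear instruction, whereas a call that takes an Arduino pin number has to translate it to a port and bit at run time.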

While I applaud the aims of the Arduino project, the realities of the restricted hardware platform have to be taken into consideration. In addition, one of the aims of the project is to:

provide a well-designed, maintainable, and stable platform for the future

and despite its unquestionable success on many other fronts, on that one I feel the Arduino platform is less than entirely successful. I for one won't be using any of the software any more; it's just not what I consider to be acceptable quality.

Categories : Tech, AVR

Scala snippet of the day

The following is useful for benchmarking code from inside the REPL:

def timed(op: => Unit) = {
  val start = System.nanoTime
  op
  (System.nanoTime - start) / 1e9
}

Then you can use it like this:

scala> timed { Thread.sleep(5000) }
res0: Double = 5.010083526

That's syntactic sugar for an anonymous function definition and a call to timed. If a function takes only a single parameter you can pass the argument inside { } instead of ( ), so if we wrote that out longhand it would be:

scala> def fn = Thread.sleep(5000)
fn: Unit

scala> timed(fn)
res0: Double = 5.010088248

The other neat bit is the use of Scala's by-name parameter mechanism to pass the op parameter. That's the funny-looking op: => Unit in the parameter list of timed. Normally Scala uses by-value parameters, i.e. the line timed(fn) would result in the evaluation of fn, and then the returned value (in this case, nothing) being passed into timed. However, with a by-name parameter the evaluation of fn takes place inside timed, and by wrapping it inside timing instrumentation we can time how long op takes to execute.

Categories : Web, Tech