Don't delay()

It's very common for Arduino sketches to use the delay() library routine to control timing when performing time-related operations such as LED animations. Unfortunately delay() is toxic if you need to do more than one thing at once - for example animating more than one LED strip - as calling delay() just makes the CPU spin until the required amount of time has passed, and obviously nothing else can happen until the delay() call returns. The same is true for anything that uses any kind of spin-loop to wait for an event, for example polling a switch or a rotary encoder until it changes state (commonly used as a way of debouncing). The Arduino ecosphere is full of such example code and to be blunt all of it is completely useless - unless of course all you want to do is dedicate your whole microcontroller to dealing with a single IO device.
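
As a rough sketch of the alternative, the usual pattern is to poll millis() and act only once an interval has elapsed, so that loop() never blocks. The Interval struct, its member names and the pin numbers below are hypothetical and exist only for illustration. The current time is passed in as a parameter so the logic can also be exercised off-target.

```cpp
#include <stdint.h>

// Hypothetical helper (not part of any Arduino library): reports when a
// fixed period has elapsed, without ever blocking the caller.
struct Interval {
    uint32_t period;  // interval length in milliseconds
    uint32_t last;    // time at which the interval last fired

    // Returns true when at least one period has elapsed since the last
    // firing. The unsigned subtraction stays correct when the millisecond
    // counter wraps around.
    bool due(uint32_t now) {
        if ((uint32_t)(now - last) >= period) {
            last += period;  // advance by one period to avoid drift
            return true;
        }
        return false;
    }
};

#ifdef ARDUINO
#include <Arduino.h>

// Two LEDs blinking at independent rates; the pin numbers are assumptions.
const uint8_t LED_A = 12, LED_B = 13;
Interval fast = {250, 0};
Interval slow = {1000, 0};

void setup() {
    pinMode(LED_A, OUTPUT);
    pinMode(LED_B, OUTPUT);
}

void loop() {
    uint32_t now = millis();
    if (fast.due(now)) digitalWrite(LED_A, !digitalRead(LED_A));
    if (slow.due(now)) digitalWrite(LED_B, !digitalRead(LED_B));
    // Nothing here waits, so other devices can be polled in the same loop.
}
#endif
```

Each pass through loop() takes only a few microseconds, so any number of such intervals can share the CPU, provided none of the actions they trigger blocks.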

It is possible to partially work around this issue by using timer interrupts to trigger actions, because the interrupt service routine will be called even if delay() is currently executing. However interrupts have their issues as well. The issues around interrupt service routine overhead and reentrancy are fairly widely understood. The issues around atomicity are less well understood, even in commercial products. Simply declaring a variable as volatile isn't sufficient: the AVR is an 8-bit processor, so a read or write of any multi-byte variable takes several instructions and an interrupt can arrive part-way through. Except for very simple cases you need to be sure that interrupts are disabled during every access to variables that are shared between ISR and non-ISR code. And if you are accessing several variables, or a structure, you need to make sure interrupts are disabled around the entire block.
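
A minimal sketch of what that looks like in practice, assuming avr-libc's ATOMIC_BLOCK macro from <util/atomic.h> is available. The Encoder structure and the function names are hypothetical, and the CRITICAL_SECTION wrapper is my own shorthand so that the same code also compiles off-target, where there are no interrupts to mask.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <util/atomic.h>  // avr-libc: ATOMIC_BLOCK, ATOMIC_RESTORESTATE
// Disables interrupts for the following block, then restores the
// previous interrupt state on exit.
#define CRITICAL_SECTION ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
#else
#define CRITICAL_SECTION  // off-target build: nothing to mask
#endif

// Hypothetical state shared between an ISR and the main loop.
struct Encoder {
    int16_t  position;
    uint32_t lastChange;
};
volatile Encoder shared = {0, 0};

// Intended to be called from the ISR. On the AVR, interrupts are already
// disabled on entry to an ISR, so no extra protection is needed here.
void recordStep(int16_t delta, uint32_t now) {
    shared.position = (int16_t)(shared.position + delta);
    shared.lastChange = now;
}

// Called from non-ISR code. Both fields are copied inside one critical
// section, so the caller always sees a consistent pair of values.
Encoder snapshot() {
    Encoder copy;
    CRITICAL_SECTION {
        copy.position = shared.position;
        copy.lastChange = shared.lastChange;
    }
    return copy;
}
```

Taking a snapshot and then working on the copy keeps the time spent with interrupts disabled as short as possible.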

Beyond the relatively simple use of interrupts, the next thing that's often considered is a Real-Time Operating System (RTOS) such as FreeRTOS. This seems like an attractive solution - a preemptive scheduler, interprocess communication and synchronisation all come as standard. Unfortunately when running on a microcontroller such as the ATMega there are a number of issues that a RTOS either doesn't solve or, even worse, directly creates:

  • A RTOS may help manage shared software resources by providing features such as thread-safe queues and lists, but it can't really solve the problems related to shared hardware resources. For example, when several tasks need to access the SPI bus they each need to complete their work before relinquishing the bus. If a task is communicating with a peripheral on the bus, scheduling another task in the middle of the operation is not possible so RTOS preemption support is irrelevant - preemption needs to be disabled anyway until the current task has completed its bus transaction.
  • Supporting preemption means that tasks can potentially be suspended and resumed at any point. This means that the complete state of a task needs to be saved somewhere, including all the processor registers and the current stack for the thread. This overhead is significant - the ATMega168 only has 1KB of SRAM, and the ATMega328P has 2KB. Reports say that as few as 3-4 FreeRTOS tasks can be run on Arduino-class microcontrollers.
  • avr-libc, the core of the runtime system, isn't thread-safe, so if you use a RTOS you need either to lock around every call into avr-libc or to find a complete replacement for it. The situation with commonly-used Arduino libraries is just as bad - they weren't written with threads in mind, and will almost certainly break in strange and mysterious ways if used with a RTOS.

All in all, a RTOS isn't a good solution for severely constrained platforms such as the ATMega, yet we really need something to help us manage concurrency. That was the impetus behind the creation of my task library. It has the following features:

  • Non-preemptive, cooperative scheduler.
  • Fixed-priority scheduling governed by task list ordering.
  • Task list fixed at compile time.
  • Small code size - 28 bytes for the Task class and 116 bytes for the scheduler.
  • Low RAM requirements - 4 bytes + 2 bytes per task.
  • No thread stacks required - each task runs to completion, so the stack is fully unwound before task switching.
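
To make the scheduling model concrete, here is a rough sketch of a run-to-completion scheduler with priority set by list order. It is not the actual API of the task library: the Task interface, runOnce() and the Periodic example task are all hypothetical names chosen for illustration, and the real class and scheduler may be structured differently.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical task interface (an assumption, not the library's real one).
class Task {
public:
    virtual bool canRun(uint32_t now) = 0;  // is there work to do?
    virtual void run(uint32_t now) = 0;     // do one small chunk, then return
};

// Runs the first ready task in list order, so earlier entries have higher
// priority. Returns the index of the task that ran, or -1 if none was ready.
int runOnce(Task* const tasks[], size_t count, uint32_t now) {
    for (size_t i = 0; i < count; i++) {
        if (tasks[i]->canRun(now)) {
            tasks[i]->run(now);  // runs to completion; no stack to save
            return (int)i;
        }
    }
    return -1;
}

// Example task: becomes ready every `period` ms and counts its runs.
class Periodic : public Task {
public:
    explicit Periodic(uint32_t period) : runs(0), period_(period), last_(0) {}
    bool canRun(uint32_t now) override {
        return (uint32_t)(now - last_) >= period_;
    }
    void run(uint32_t now) override { last_ = now; runs++; }
    uint16_t runs;
private:
    uint32_t period_, last_;
};

#ifdef ARDUINO
#include <Arduino.h>

Periodic animate(20), report(1000);
// The list is fixed at compile time; its order is the priority order.
Task* const taskList[] = { &animate, &report };

void setup() {}

void loop() {
    runOnce(taskList, sizeof(taskList) / sizeof(taskList[0]), millis());
}
#endif
```

Because the scan restarts from the top of the list on every call, a lower-priority task only gets to run when nothing ahead of it is ready, which is what defers low-priority work under load.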

Whilst the task manager is undoubtedly much more limited than a RTOS, the limitations are ones that can usually be lived with. In compensation it becomes possible to schedule many tens of tasks even on a severely constrained platform such as the ATMega. As long as we can break the workload into small enough chunks, we can just schedule the tasks in a cooperative fashion. By providing priority scheduling we can ensure that even when temporarily overloaded the system will still remain responsive by scheduling the most important tasks first, and deferring the lower-priority tasks until the load drops again. In any case, preemption is no help if the system can't actually keep up with the rate of events it is being expected to handle.

So, in summary, don't use delay() when there are better alternatives available! :-)

Categories : Tech, AVR

Re: Don't delay()

What about the idea of building an inter-MCU lock, permitting concurrent access to any resource in a multi-core environment? You'll find my old-tech solution here. An easier, faster and prettier solution would be to implement the lock system on an MCU, which would allow several locks to be handled (and several resources protected) on a single chip. My design was primarily made to discover logic gates ;)

Re: Don't delay()

That's certainly an interesting approach. Another possibility would be to use the bus arbitration support that's provided for both the SPI and I2C interfaces on the ATMega. Those busses both support a single master and multiple slaves, and provide contention management. I know it's there because it has caused me grief in the past :-) From looking at the docs I think I2C might be the better choice as it has more sophisticated bus management and addressing features.

Re: Don't delay()

That's precisely why I wouldn't use busses for inter-core synchronisation: busses have high-level negotiation that would take time :) My lock has a 6MHz clock, meaning it takes about 0.6ms to acquire a lock (on an interrupt) when 4 concurrent MCUs fight together. I don't think busses would be as fast (but that's an intuition, I am not sure about it). On a 20MHz MCU acting as a lock service, this would be even faster :)

Re: Don't delay()

How many clock cycles does your hardware take to arbitrate?  0.6 milliseconds seems kinda long...

The SPI bus can be clocked at up to 1/2 of the CPU clock rate, i.e. on a 16MHz Arduino, 8MHz. TWI is a good bit slower - 400kHz, which may/may not be an issue, depending on what the synchronisation is being used for.

Re: Don't delay()

Have you looked at protothreads?