Event-driven programming strategy

23 jan 2023

You need a strategy to write effective programs. Without a strategy you end up with spaghetti code.

Events in the real world are arbitrary and asynchronous. The whole point of pushbutton switches is that people press them whenever they need to, not when the computer wants to. Embedded systems need to respond to the sensors, controls, and actuators attached to them.

Event loops are a way to coerce linear, one-step-at-a-time stored-program computers, MCUs like Arduino, into handling as many tasks as you can think up, and doing so clearly, succinctly, understandably and maintainably.

The event loop model

I agonized over how to describe this system incrementally, introducing one idea at a time, but it always ended up tedious and plodding. So instead I've laid out some examples that introduce these interrelated ideas, timers and state machines among them, all at once.

With these you can code pretty much any interactive, real-time behavior you can think of. With these ideas, and the simple skeletal framework code in the examples, you can write real programs. With the addition of one more fundamental idea, Messages (not covered here, yet anyway), you will have the basis for a genuine but simple real-time "operating system"; this has been highly developed in my roadster and Flock projects.

Event-loop programming is not new. The event-loop model is what set the first Macintosh apart from its DOS competitor in the early '80s: where the Mac was (almost) entirely event-driven, DOS was trivially single-threaded. If you popped a floppy disk out when a program needed it, the entire system froze with the prompt "ERROR: R)etry, I)gnore, A)bort?". The Mac mostly handled this with grace; you could back out and try again, no work lost. Event loops made the mouse and keyboard work independently.

Here is a more formal approach to teaching about Arduino and event-loop programming at CMU.

Event-loop examples, built up from BLINK

As crappy as the Arduino-supplied BLINK program is, it hints at the right approach: turn LED on, delay, turn LED off, delay... then exit. Control passes right back in at the top, endlessly; the LED blinks. The problem is that delay() function. It is a serious disappointment that so much of the Example code is built around delay().

BLINK matters. It's the first thing every newbie runs, and it and those other simple examples are looked to for basic ideas. And they fail you, right from the start. delay() is harmful.

Set up to run these examples

DOWNLOAD THIS FILE: HowTo-SRTimer.zip





ILLUSTRATION #1: baseline: SRTimer_example_1

Load SRTimer_example_1 in the Arduino IDE and connect your Arduino board. Under Tools, select the board type (Uno, etc.) and the serial port, then compile and upload to your board. This is the basic BLINK example with some fancy stuff added.

Click the little magnifying-glass icon in the upper right of the toolbar to open the serial monitor. Select 115200 bits/sec in the pulldown in the bottom right. The IDE will remember this setting.

Explanation

The statement A includes my library into your program. The library contains a bunch of code that you can use that you couldn't before. SRResources is documented here. B directs the compiler to place an object of type SRLoopTally right here, and name it SRL. C sets up the USB serial port for use in debugging; every program I write does some variation of this.

D is the setup function for SRLoopTally; by Arduino convention most objects that need setting up have a method called "begin". What begin does depends on the object. For SRLoopTally it says how often to print out its little tally of statistics (5 seconds).

Last, the code within loop() is the plain old dumb BLINK example code, with one addition: at E, the invocation ("calling") of the loop statistics tally, SRL.tally().

What SRLoopTally does

Once instantiated and set up, SRL.tally() runs in every iteration through loop(). Internally, the object generates statistics on how much time the code inside loop() takes to execute, and how many iterations (how many times loop() is called) happen per second. This will soon become interesting.

Discussion

The built-in LED should be blinking. Let it blink for 10 seconds or so. The serial monitor will display:

The text loop: 1/s, 1000mS avg, 1001mS max is the work of the SRL.tally(). It is very terse, but states that loop() ran through, completed, one iteration in one second ("1/s"), that the average loop execution time was 1000 milliseconds, and the maximum time any one iteration took was 1001 milliseconds.
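To get a feel for what a tally like this has to keep track of, here is a rough sketch in desktop C++. It is not SRLoopTally's source or its API: the LoopStats name, its members, and the idea of passing in a microsecond timestamp (which on a board would come from micros()) are all assumptions made up for illustration.

```cpp
#include <stdint.h>

// Hypothetical stand-in, NOT SRLoopTally: tracks iteration count,
// average time and maximum time from timestamps handed in by the caller.
struct LoopStats {
    uint32_t lastUs = 0;    // timestamp of the previous tally() call
    uint32_t count = 0;     // completed iterations in this window
    uint32_t totalUs = 0;   // sum of iteration times in this window
    uint32_t maxUs = 0;     // longest single iteration in this window
    bool started = false;   // false until the first tally() call

    // Call once per pass through loop() with the current time in microseconds.
    void tally(uint32_t nowUs) {
        if (started) {
            uint32_t dt = nowUs - lastUs;   // unsigned math survives rollover
            ++count;
            totalUs += dt;
            if (dt > maxUs) maxUs = dt;
        }
        started = true;
        lastUs = nowUs;
    }

    // Average iteration time, or 0 if nothing has been measured yet.
    uint32_t averageUs() const { return count ? totalUs / count : 0; }
};
```

Fed one timestamp per pass of the delay()-based BLINK, a structure like this would see roughly one iteration per second, each about a million microseconds long, which is what the terse report above is saying.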

Because this delay()-based blink is so stupid and wasteful, yes, it's all redundant: one blink cycle took one second. This is not news, is it?

The "useful work" of BLINK is contained in the two digitalWrite() statements. The two delay()s essentially halt the machine -- they block. Blocking is bad. Blocking code is about the worst thing you can do in a machine that is supposed to be responsive or reactive to its environment.



ILLUSTRATION #2: inline-coded timer: SRTimer_example_2

Explanation

This program does exactly what example 1 did, with a big difference... But first let's look at the additions and changes.

First of all, we've added two of the fundamental structures mentioned, a crude timer and a simple state machine. This is still a "blink" program, it does the same thing as before to the LED, but so far all we've done is make it more complicated. This added complexity has great value, and will become simpler and shorter in subsequent examples.

The first structure, marked in red, is a timer. It uses a feature available in all Arduino compatibles, workalikes, and pretty much every microcontroller available anywhere: a millisecond "clock". millis() is a function that returns the number of milliseconds that have passed since the controller was reset/powered on. It is strongly analogous to a clock on the wall, time passes whether you're looking at it or not; millis() "runs in the background", it's actually a hardware feature of the CPU. Read about millis() elsewhere, here we will just use it.

millis() hands us a "time line" that goes in the forward direction that we can mark with a timer, here called T1. At program start T1 contains 0. At program line 27 the code compares millis() to T1. Time has certainly passed (it always does), so by the time it gets to line 27 millis() is certainly greater than 0, and so we're inside the if. The first thing done is to set the timer for next time: T1 is set 500 milliseconds into the future. millis() returns "now" o'clock; add 500 to that. Then it executes the state machine.

Lines 30 through 40 comprise our state machine. The state variable, ingeniously named state, directly controls which case statement is executed. state was explicitly set to 0 in setup(). This wasn't strictly necessary since new global variables are always set to zero but it tells us humans what the programmer's intent is. It's a good habit to develop.

We know that the first time through, state is 0, so the first case is executed. This turns the LED on, then sets state to 1, then break makes it exit the switch statement. The next thing executed is SRL.tally(), as in the previous example, and then loop() terminates -- only to be reinvoked immediately.

I will not tediously explain subsequent loop iterations. Note that since we advance the timer 500 milliseconds, that test against millis() will be false, and the state machine will not be executed, iterating repeatedly until somehow, half a second passes.

Which branch will the program take then?
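The timer-plus-state-machine structure just described can be sketched roughly as follows. This is not the SRTimer_example_2 listing; it is a desktop C++ approximation in which millis() and the LED are replaced by plain variables (fakeNow and ledOn are stand-ins I made up), and the exact comparison operator is my guess.

```cpp
#include <stdint.h>

// Desktop stand-ins (assumptions): on a real board millis() and the LED
// come from the Arduino core; here they are ordinary variables.
static uint32_t fakeNow = 0;                  // simulated milliseconds since reset
static uint32_t millis() { return fakeNow; }
static bool ledOn = false;                    // stands in for digitalWrite()

static uint32_t T1 = 0;                       // the timer: next time to act
static int state = 0;                         // 0: turn LED on next, 1: turn it off

// One pass through the loop. Returns right away unless the timer is due.
// Note: this simple comparison misbehaves when millis() wraps around
// (after about 49.7 days); subtracting a saved start time avoids that.
void loopOnce() {
    if (millis() >= T1) {                     // has "now" reached the mark?
        T1 = millis() + 500;                  // set the mark for next time
        switch (state) {
        case 0:
            ledOn = true;
            state = 1;
            break;
        case 1:
            ledOn = false;
            state = 0;
            break;
        }
    }
    // ...any other non-blocking work would go here...
}
```

Calling loopOnce() thousands of times between marks costs almost nothing: each call is one comparison and a return.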

Discussion

We just went over it line by line. Now push back in your chair and look at this running program as a dynamic, "living" thing, and not line by line. How does it behave? What does loop() do, most of the time? This loop entry, test, tally, exit, how long does that take? How many per second does it execute? Why has no one ever asked you this before? lol.

It should be obvious, and the serial monitor results will tell you, that the CPU is not doing very much work at all to blink that LED (the goal forgotten in this discussion). loop() spends most of its time (...) waiting for the clock on the wall (millis()) to change; twice a second it executes a total of four tiny statements.

Instead of one iteration a second, it is now doing over 50,000 iterations/second, with an average of 17 microseconds each. (The maximum loop time of some 800 uS is in fact SRLoopTally's own overhead.)

Blinking an LED takes 34 uS every second (17 uS on, 17 uS off).

What else can you do in all that free time?

The added complexity of example 2, vs. example 1, is the state machine and timer. Working together these two concepts eliminated blocking.

Blocking vs non-blocking

Blocking means waiting for an event or time or a resource to become available. Blocking means literally no other thing can happen while it waits for this one thing. If you are boiling eggs on the stove, you can stand there for 13 minutes [I hate soft-boiled eggs] and do nothing else, or you can start the process, and periodically check the clock while you do something else, like make coffee.

The state machine, coupled with the timer, eliminates blocking. It "unwinds the loop", it turns a loop -- repeating over and over the same task -- into a series of simple decisions and executed statements.

Here's a general rule to identify blocking code: code execution goes "backwards". This includes for (; ;), while ( ), and do { } while ( ); statements. These are all loops, and loops potentially block.

Loosely, code execution should start at the top and continue unimpeded to the closing brace or return statement.

What makes a loop bad is waiting for some event or resource that is not ready or available.

Task                                    Blocking?
Filling an array with a value           non-blocking
Waiting for a switch press or release   BLOCKING
Reading a Serial character              BLOCKING
Searching an array, table, etc.         non-blocking
Waiting for a device READY pin          BLOCKING or non-blocking...
Data copy or events that never end      BLOCKING or non-blocking...

Waiting for an attached device or component or resource to become ready may or may not block, depending on the device. Very often, waiting is reasonable if the datasheet tells you the wait is a few microseconds or milliseconds. Each instance requires a bit of research.
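The "waiting for a switch press" case can be written both ways, and seeing them side by side may help. This is a sketch under assumptions: readSwitch() is a made-up stand-in for reading an input pin, and there is no debouncing, which a real switch would need.

```cpp
// Hypothetical stand-in for reading a switch pin (assumption).
// volatile because, like a real input pin, it can change behind our back.
static volatile bool switchPressed = false;
static bool readSwitch() { return switchPressed; }

static int pressCount = 0;        // how many presses have been handled
static bool wasPressed = false;   // switch reading from the previous pass

// BLOCKING version, for contrast only: execution goes "backwards" and
// nothing else runs until someone presses the switch.
void waitForPressBlocking() {
    while (!readSwitch()) {
        // stuck here
    }
    ++pressCount;
}

// Non-blocking version: look once, act on a new press, and return.
// Call it on every pass through loop().
void checkSwitch() {
    bool pressed = readSwitch();
    if (pressed && !wasPressed) {  // edge: released -> pressed
        ++pressCount;
    }
    wasPressed = pressed;
}
```

The non-blocking version needs one extra variable to remember what it saw last time. That memory is the smallest possible state machine.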

Serial ports, keyboards and radios, like switches, may deliver input at any time or not at all. The Arduino Serial object has the available() method precisely for this: it returns the number of characters already received, and if that is nonzero then read() will return a character without blocking.
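A minimal sketch of that check-then-read pattern follows. So that it runs on a desktop, FakeSerial is a made-up stand-in offering only available() and read(), which behave like the Arduino calls of the same names; pollSerial() and the line buffer are likewise my own illustration.

```cpp
#include <string>

// Desktop stand-in for the Arduino Serial object (assumption): just the
// two calls used below.
struct FakeSerial {
    std::string pending;                       // received but not yet read
    int available() const { return static_cast<int>(pending.size()); }
    int read() {                               // -1 when nothing is waiting
        if (pending.empty()) return -1;
        char c = pending.front();
        pending.erase(0, 1);
        return static_cast<unsigned char>(c);
    }
};

static FakeSerial Serial;
static std::string line;                       // characters collected so far

// Call on every pass through loop(). Takes whatever has already arrived
// and never waits for more. Returns true once a '\n' has been seen.
bool pollSerial() {
    while (Serial.available() > 0) {           // bounded by what is buffered
        char c = static_cast<char>(Serial.read());
        if (c == '\n') return true;
        line += c;
    }
    return false;
}
```

The while loop here does go "backwards", but it can only run as many times as there are characters already waiting, so it terminates promptly.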

Some jobs, like writing lots of data to a big LCD display, even though non-blocking in that they terminate in a known, fixed time, might still take long enough to make other code sluggish. When you need to write thousands of bytes to a device (such as simple animation to an LCD) consider a state machine that writes a few hundred at a time then exits, picking up where it left off in the next call/iteration.
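One way to sketch such a "write a bit, then pick up where you left off" job is shown below. ChunkedWriter, deviceWrite() and the 256-byte chunk size are all hypothetical names and numbers chosen for illustration; a real sketch would replace deviceWrite() with whatever sends a byte to the display.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical device write (assumption): this stand-in only counts bytes.
static size_t bytesSent = 0;
static void deviceWrite(uint8_t) { ++bytesSent; }

// Writes a large buffer one slice at a time, remembering its position
// between calls.
struct ChunkedWriter {
    const uint8_t* data = nullptr;
    size_t length = 0;
    size_t position = 0;    // next byte to send
    size_t chunk = 256;     // bytes per call; tune to taste

    // Begin a new transfer.
    void start(const uint8_t* buf, size_t len) {
        data = buf;
        length = len;
        position = 0;
    }

    // True while there is still data left to send.
    bool busy() const { return position < length; }

    // Call once per loop() pass. Sends at most `chunk` bytes, then returns.
    void run() {
        size_t end = position + chunk;
        if (end > length) end = length;
        while (position < end) {
            deviceWrite(data[position++]);
        }
    }
};
```

The main loop calls run() on every pass; other tasks get a turn between slices, and busy() tells you when the whole transfer has finished.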

With a little practice, writing code that does not block becomes natural, especially when you have easy solutions like the state machine to resolve multi-step decisions.



ILLUSTRATION #3: using SRTimer: SRTimer_example_3

Explanation

This series of examples is heading towards a particular end, a structure that can be expanded to handle a large number of sub-tasks, using a succinct style of event-loop coding.

This example improves on the last by using the SRTimer object, with which you can create any number of timers in the abstract. These timers can each be set, reset, force-triggered, and asked how long until they fire. By default timers automatically restart after they fire, as in this blink example. SRTimer also contains one-shot (monostable) and holdoff timers that run only once until explicitly triggered.

Documentation for SRTimer is here.

A major feature of using something like SRTimer is clarity of intent. With it you can write code such as "when this timer fires do this thing".
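To show what "when this timer fires do this thing" can look like in code, here is a rough sketch of a tiny auto-restarting timer. It is not SRTimer and does not use SRTimer's API, which is documented separately; IntervalTimer, fired() and reset() are names I invented, and the current time is passed in where a board would call millis().

```cpp
#include <stdint.h>

// Hypothetical auto-restarting timer, NOT the SRTimer API (assumption).
class IntervalTimer {
public:
    explicit IntervalTimer(uint32_t intervalMs)
        : interval_(intervalMs), last_(0) {}

    // Returns true once each time the interval has elapsed, then re-arms.
    bool fired(uint32_t nowMs) {
        if (nowMs - last_ >= interval_) {   // unsigned subtraction is rollover-safe
            last_ = nowMs;
            return true;
        }
        return false;
    }

    // Start the interval over from nowMs.
    void reset(uint32_t nowMs) { last_ = nowMs; }

private:
    uint32_t interval_;   // period in milliseconds
    uint32_t last_;       // time the timer last fired or was reset
};

static bool ledOn = false;                  // stands in for the LED pin
static IntervalTimer blinkTimer(500);

// "When this timer fires, do this thing."
void loopOnce(uint32_t nowMs) {
    if (blinkTimer.fired(nowMs)) {
        ledOn = !ledOn;
    }
}
```

All the bookkeeping lives inside the object, so the code in the loop reads as a statement of intent.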

Discussion

Note that using SRTimer increased overhead a bit (a few more lines of code in the common global area above setup()) but the code in loop() is actually shorter; fewer lines. It doesn't matter much in programs this small, but as complexity increases, you want code to remain concise for clarity, and you'll find fixed overhead even when long just isn't that distracting.



ILLUSTRATION #4: blink two LEDs: SRTimer_example_4

Explanation

OK now we apply this so-far abstract advantage.

There is no practical way to blink two LEDs at independent rates, at all, using delay(). Here, it is easy, and by example 5, trivial.

I am intentionally leading you towards a particular and peculiar style of coding -- mine. If you are an experienced programmer you have your own. I'll suggest only that you consider clarity of intent when you write. If you're a novice this style is as good as any and you'll inevitably modify it to suit your needs. Take your own work seriously. There is no excuse for ugly code. Any program you persist at should be readable and, critically, reusable.

Examples 4 and 5 require that you add a second LED, here, to pin 12. A resistor and LED in series.

Discussion

Note the output of SRL.tally() shows that blinking two LEDs takes longer than one, though not quite twice as long: 28 microseconds. The number of iterations/second has dropped from 50,000 to 35,000 accordingly. The maximum is about the same, because SRLoopTally's overhead doesn't change.

Note also that I re-styled one of the blinkers in a very compact way. Instead of testing that the timer has fired, it tests for it not having fired, and exits if so. This of course accomplishes the same thing. Why do it this way?

There is a long history of argument over how code should be structured. The general consensus is that there should be only one exit (at the bottom), and the first blinker does adhere to that. But it is also recognized that for straightforward tests such as this, tests that enable all the code that follows, putting negative-test checks at the top adds clarity. This also eliminates a level of braces and indents in from the left margin. Again, this is a short, simple subroutine, so these details are superfluous. I think it adds clarity to this and similar subroutines.
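The two styles can be put side by side in a small sketch. This is not the SRTimer_example_4 listing: the 500 and 300 millisecond rates, the variable names, and passing the time in as a parameter (where a board would call millis()) are my assumptions.

```cpp
#include <stdint.h>

// Stand-ins for this sketch (assumptions): two LED flags and the time
// each blinker last acted.
static bool led1 = false, led2 = false;
static uint32_t last1 = 0, last2 = 0;

// Style 1: positive test, single exit at the bottom.
void blink1(uint32_t nowMs) {
    if (nowMs - last1 >= 500) {
        last1 = nowMs;
        led1 = !led1;
    }
}

// Style 2: negative test at the top, early return, one less level of braces.
void blink2(uint32_t nowMs) {
    if (nowMs - last2 < 300) return;    // not time yet: nothing to do
    last2 = nowMs;
    led2 = !led2;
}

// Both blinkers get a turn on every pass; neither one ever waits.
void loopOnce(uint32_t nowMs) {
    blink1(nowMs);
    blink2(nowMs);
}
```

Because neither function blocks, the two LEDs run at unrelated rates without knowing anything about each other.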

Code should be (dangerous words, that) written to maximize clarity. When a state machine such as an LED-blinker is this simple, this compact style adds clarity by highlighting the similarities and differences; the symmetry is clearly visible.

State machines can sometimes be long and complex, and the C/C++ switch statement is often problematic and considered error-prone (a forgotten break silently falls through into the next case). It is also very useful at times, and can be extremely clear. Long if ... else if ... chains, clearly indented and labelled, work also. Effort used to style code for readability is worth it.

This is fairly representative of timer- and event-driven code execution. This is the beginning of multi-tasking programming.



ILLUSTRATION #5: objects: SRTimer_example_5

Explanation

Example 5 is no longer about timers, blinking, or state machines; it is straight-up an example of C++ objects being used to contain and define a "subtask", each packaged as a separate program and tied together into a functional program by the main source file. I call these separate objects task loops.

The two blinkers of example 4 here each have their own tab, and the tab is a .h file. This means the IDE doesn't automatically include it as part of your program, you must do it with an explicit #include statement, as shown here. This lets you "insert code here" where you want it rather than where the IDE wants it. In this case it matters.

Note that each task loop is structured like an Arduino sketch. It has a setup() for doing one-time things and a loop() meant to be invoked over and over to accomplish the work.

This leaves the main loop to serve as an executive, a supervisor program that does little or no work itself; it simply serves to execute each of the task loops' loop() code. SRL.tally() remains in the main module, and provides overview statistics on your program.
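The overall shape of this arrangement might look something like the sketch below. It is not the SRTimer_example_5 source: BlinkTask, mainSetup() and mainLoop() are hypothetical names, pin 13 for the first LED is an assumption (pin 12 is the second LED added earlier), and the time is passed in where a board would call millis().

```cpp
#include <stdint.h>

// Hypothetical task-loop shape (assumption): each task owns its state and
// exposes setup() and loop(), like a miniature sketch.
class BlinkTask {
public:
    BlinkTask(int pin, uint32_t intervalMs)
        : pin_(pin), interval_(intervalMs), last_(0), on_(false) {}

    // One-time things.
    void setup(uint32_t nowMs) { last_ = nowMs; on_ = false; }

    // Invoked over and over; does a little work when due, then returns.
    void loop(uint32_t nowMs) {
        if (nowMs - last_ < interval_) return;
        last_ = nowMs;
        on_ = !on_;           // a real task would write on_ to pin_ here
    }

    bool isOn() const { return on_; }
    int pin() const { return pin_; }

private:
    int pin_;
    uint32_t interval_;
    uint32_t last_;
    bool on_;
};

static BlinkTask blinkA(13, 500);   // pin 13 is an assumption
static BlinkTask blinkB(12, 300);

// The executive: no work of its own, it just gives every task a turn.
void mainSetup(uint32_t nowMs) { blinkA.setup(nowMs); blinkB.setup(nowMs); }
void mainLoop(uint32_t nowMs)  { blinkA.loop(nowMs);  blinkB.loop(nowMs); }
```

Adding a third task means writing one more object and adding one line to each of mainSetup() and mainLoop(); nothing already written has to change.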