Ron Kreymborg (28-Nov-2001)
Elsewhere on these pages (TaskR, TaskR2), I have described small, non-premptive task management programs that provide simple and controlled process delay and management. While more than adequate for many microcontroller situations, real-time environments with a heavy I/O or computational overhead need a preemptive process management system with process prorities to ensure real-time response.
In particular a simple task switcher provides no support for complex programs where several processes must interact, and the programmer finds him or herself managing a plethora of flags that indicate such things as when the serial port is busy, or when a task is ready to handle new data, etc. In addition, tasks that are waiting for an event have no option but to continually reschedule themselves to wakeup and check if what they are waiting for has occurred. These are not functions easily added to a task scheduler, but are provided as a matter of course by semaphores in any preemptive operating system worth its salt.
A search of the literature or the Web will turn up several systems that are in the public domain at the source level. (Tiny, Nut, etc). However, I had always been interested in the Xinu operating system, and many years ago bought the book by Douglas Comer titled Operating System Design and subtitled The Xinu Approach (ISBN 0-13-637554-5). I pulled this down from the bookshelf one recent weekend and started a re-read to see if the code could be adapted to the smaller environment of a microcontroller.
The version in the book is written for a DEC PDP/11-2, a 64K single address space microprocessor. Various versions on the web exist for Intel '86 and Motorola 68000 micros. In each case the authors have hacked the original code to make it fit the new hardware environment. I have departed from this practise by moving the processor based code out of the Xinu modules into separate files. This makes porting the OS to other micros a darn sight easier (I am also doing a Mitsubishi M16C port).
Much of the later chapters in the book deal with managing devices and peripherals that you expect to find in a normal interactive, disk based operating system. For the moment I have just implemented the kernel, although adding a disk and comms interface (TCP/IP, CAN, Bluetooth, etc) is in the pipeline. The code for many of these capabilities already exists at sites such as...
The microXinu kernel includes pre-emption, process priorities, semaphores, timed waits, and message passing. Timed waits can be in terms of the system clock tick or a number of seconds. Messages are two bytes long and can thus be a pointer.
The key to understanding process management and scheduling in Xinu is understanding the create function and what happens during the process switch function ctxsw.
Create must set up an initial processor environment that defines the process at startup. This includes initial register values, startup parameters, and an allocated working stack and program counter. The stack is allocated using normal avrgcc malloc functions. Create must load into the top of this stack the address of a function that will manage that process's eventual shutdown. For the AVR, any startup parameters must be loaded into the correct registers ready for unpacking on process start. In the current implementation these are all pointers except for the parameter count. The stack pointer must be initialised to ensure that when the process exits it cleanly executes the shutdown function userret.
The actual process switch is done in the ctxsw function. This is written in assembler as it must save the state of the current process as represented by the current processor state, and then reload the processor registers with the state of the process being switched to. This may be a new process or one that has been previously switched away from for some reason, or the null process if there are no other processes ready to run. The key point here is that all process switching occurs at this point, and every process resumes running as though they have just returned from ctxsw.
The null process is a simple, empty loop. Where power supply energy is limited, a processor like the '103 could easily replace this with a sleep instruction that powers down the cpu until the next interrupt.
While having 32 registers in the AVR micros is great for assembly programming, in a preemptive OS they can be something of a liability. Because there is no way to tell which registers are in use, the whole lot must be saved and reloaded during a process switch, thus increasing the switching times. And because there is absolutely no support for task switching in the AVR instruction set, each register must be individually pushed onto the stack. In fact, to push the stack pointer it must first be transferred to one of the registers using IO instructions! As an example, for a process waiting on an interrupt via a semaphore, the elapsed time from the interrupt occurrence until the process restarts is 210 uSecs for a 4MHz ATmega103 (this is using avrgcc). Of this, 52 uSecs are associated with saving and restoring the AVR state. The clock interrupt itself only takes about 8 uSec when there is nothing to do.
A little over a fifth of a millisecond is not a bad realtime response. In most of my projects the few devices that do require a fast response (comms for example) could be handled from within the interrupt routines, and the other various number crunching and IO tasks running simultaneously could live with these switch times.
I usually tick the system clock (Timer0) at 15.625 mSecs. This gives a process switch time that is a little over 1% of processing time, and a convenient figure to count down for a realtime clock. Xinu provides a constant called QUANTUM defining the number of clock ticks that must elapse before a process switch. This is decremented by the clock interrupt and reset in resched, providing a simple means for increasing the process run time where necessary. However, most IO tasks will never run for this length of time, and typically complete or block waiting for some event.
A process is initiated by calling create, passing the process name, priority, stack size, and any parameters. It returns the allocated process number with the process ready to run on the ready queue. The process number is passed to resume to actually start the process running. A process will run until it voluntarily exits. During its life it may be interrupted many times, both by normal peripheral interrupts and other processes with the same or higher priorities. Thus the run time of a process cannot be predefined or pre-established.
Semaphores solve the major problem with TaskR2. A process can flag it is waiting for an event and if that event is not ready, Xinu simply suspends the process until it is. The event can be an IO operation completeed, a message from another process (synchronization), or a device or function now free to handle a particular request. With this capability most IO tasks can be managed on an exclusive access basis. By this I mean a process can own a device even while waiting for the associated IO operation to complete. Owning a device means other processes also wishing to use it must wait until the owning process relenquishes that ownership. This is important where the entire operation may take considerable time and must run to conclusion. An good example would be a number of processes sending messages over a shared serial line (RS232, SPI, I2C, etc).
Semaphores are a necessary and powerful addition to a preemptive operating system. However they are not enough where a process must complete a particular sequence of IO operations without being interrupted. For most micros, turning off global interrupts for the duration of the operation is the only option. Naturally this must be used with great care, and should only be used where the operation elapsed time can be measured in microseconds.
However simply turning off interrupts is not enough in a preemptive system. In many cases the current code wll not know if it is being called with interrupts enabled or not. To prevent catastrophy, it must therefore save the current state of the interrupt flag before setting it to off, and restore it on exit. In a micro this is most easily implemented by pushing the processor state register onto the stack then clearing the interrupt bit. Processing within the critical region then proceeds. Subsequent exit requires popping the state off the stack and copying it back to the state register, restoring whatever state the interrupt flag had on entry. In my case this is implemented by an assembler macro.
So how do you use microXinu? Here I am assuming you are working with an AVR microprocessor. First the micro.c and micro.h files need to be modified for your microprocessor. The supplied examples have versions for the ATmega103 and 8515 (micro103 and micro8515). The included makefile compiles to a library file for a '103. The makefile8515 shows the changes to compile for an 8515. You must add the library to the make file for your application by adding the parameter -lxinu to the link line.
The kernel must be running before you start any processes, so very early in your startup code you will need to call the initxinu function. This initialises all major structures and starts the clock ticking. The example ticks TIMER0 at 15.625 mSecs from a standard 32768 Hz crystal, but you can change this to suit your application. Ticking much faster raises the percentage of processor time spent switching processes, much slower and response time will start to suffer.
This tick time was chosen to allow the use of a simple counter to manage a real time clock. Part of the sample microxinu initialisation starts a number of example housekeeping processes, one of which maintains a real time clock. This process is given a high priority and uses a semaphore initialised to zero. The process starts and the wait call thus immediately suspends it waiting on this semaphore. The clock tick interrupt decrements a counter that goes to zero after one second. When this occurs the interrupt routine calls signal for this semaphore, incrementing the semaphore and waking the high priority clock process. This runs immediately and is aware by the fact that it is running that another second has gone by, so it updates its internal counters that maintain time of day information.
This computation every second would probably be an extravagance for a real system which is more likely to use a long seconds counter that is zeroed when the time is set (giving about 136 years). The computation for derived structures like real time and the date would then be done on request. However it is meant to show how the operating system itself can include processes as long as their run time is extremely short.
When initxinu returns the OS is running. You can now start processes with the create and resume or ready functions. You typically use ready for the initial set of processes started as part of initialisation and before enabling interrupts. Create returns the process ID with the process suspended in the ready queue. Processes can be passed parameters much like starting from a command line. A count and a corresponding array of pointers can be included in the create call where these are required.
Example 1 shows a single process that must turn on a led for 100 mSec, and off for 900 mSec for a one second flash rate. It is started with no parameters, these values being hardcoded. It initialises a binary state variable that will monitor whether the led is on or off. This is set to indicate off and then the program enters an infinite loop. Each time the loop is entered the state variable is checked. If it indicates off then the led is turned on and the subsequent delay set for 100 mSec. If it is already on then it is turned off and the delay set for 900 mSec. The process then puts itself on the delay queue for the specified delay time.
Example 2 shows the use of a samaphore to manage a serial resource, here the uart transmitter. It also shows how two threads can use the same process with different parameters. The SendString function takes a pointer to a null terminated string and sends it to the uart. First it calls the wait function. This will suspend the calling process if the uart transmitter is busy, but otherwise acquires the semaphore and returns immediately. When it returns the string is sent to the serial output function. This records the string length and pointer and sends the first character before returning. The AVR interrupt sends the subsequent characters. When all have been sent the interrupt calls the signal function to release the semaphore.
Example 3 is more complex in that many more processes are initially started. Note that some processes start other processes as part of their respective tasks.
Note how the number of tasks and semahores is set in the conf.h file. You will need to define these values for your application before compiling microxinu. Defining the consant STATS will make microxinu keep track of the maximum number of processes, queued tasks, semaphores, etc used during that lifetime. Summarise this data at the completion of a test sequence to get an idea of the extent of these variables in your application. Add an additional margin and use them to recompile microxinu with a ram requirement that better suits your application.
The STATS option formats its results into ascii lines and passes them to a user supplied function for action. Depending upon you application, this can be passed to a uart or recorded in some way that allows subsequent access for review. I usually hook it up to Windows Terminal and capture the results to a disk file for later analysis. Of course this mechanism in itself is a semaphore controlled resource, and in a busy system will easily become the resource that sets the pace of execution. However, bearing this in mind that your system will not run at its full speed, it is very helpful in the early stages of process design.
Remember that Comer himself noted that the reason for most Xinu crashes was stack overflow, and this has been verified by others. Without memory protection hardware there is no easy solution here. The aforementioned STATS option fills a process' stack with known values at allocation and counts the unused entries when the process is killed, so at least this gives you an idea and helps in setting stack sizes per process.