Taking advantage of the Cortex-M3’s pre-emptive context switches

The ARM Cortex-M3 (CM3) architecture is a 32-bit microcontroller core designed to replace many 8-bit and 16-bit devices by offering faster speeds and advanced system features.


Leveraging these advanced features requires a sound understanding of the CM3 hardware as well as dedicated systems software development. This article explains the CM3 hardware used for pre-emptive context switching as well as how to develop systems software routines that enable multi-tasking programs.


Understanding the CM3 Hardware

The CM3 has dedicated multi-tasking hardware including task-switching interrupts (SysTick and PendSV) and two stack pointers. The SysTick hardware consists of a 24-bit timer that triggers an interrupt each time it counts to zero. The PendSV interrupt is a software request, which can manually force a context switch.

Essentially, the SysTick interrupt enables round-robin style scheduling, while the PendSV interrupt supports FIFO-style scheduling.

The stack pointers for the CM3 include the main stack pointer (MSP) and the process stack pointer (PSP). The MSP is always used in handler mode (when an interrupt is being serviced) and can optionally be used during regular program execution. The PSP can be used only during regular program execution. This gives the system designer several options. The ARM Cortex-M3 technical reference manual suggests one configuration:


For a basic protected thread model, the user threads run in Thread mode using the process stack, and the kernel and the interrupts run privileged using the main stack.


Alternatively, the MSP can be used exclusively for handling interrupts (privileged) while the PSP serves all other execution threads (unprivileged).


To create a pre-emptive, multitasking system with the CM3 hardware, the systems designer must design a task table as well as routines for: initializing the switching system, creating new tasks, and handling the context switching interrupts. An entry in the task table can be as simple as the task stack pointer and a set of flags telling the context switcher which tasks to execute.


#include <stdint.h>

typedef struct {
    void *stack;     // Task's stack pointer
    uint32_t flags;  // "In use" flag and dynamic execution flag
} task_table_t;


This implementation uses two flags. One flag indicates if an entry is “in use” while the other is a dynamic execution flag. The “in use” flag is helpful when creating tasks to indicate which table entries are available.


The dynamic execution flag allows the context switcher to quickly decide whether or not to execute any given task. If memory is scarce, a NULL stack pointer can mark an entry as unused in place of the “in use” flag, and the least significant bit (LSb) of the stack pointer can act as the dynamic execution flag (since the stack is word-aligned, its two least significant bits are not used). Doing this adds overhead when switching, because those two bits must be masked off before the value is used as a stack pointer.
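
The packed representation described above can be sketched in portable C as follows. This is only an illustration: the names task_pack, task_stack, task_in_use, and task_can_execute are hypothetical, and the choice of bit 0 for the execution flag is an assumption.

```c
#include <stdint.h>
#include <stddef.h>

#define TASK_EXEC_FLAG ((uintptr_t)0x1) /* bit 0: dynamic execution flag (assumed) */
#define TASK_FLAG_MASK ((uintptr_t)0x3) /* two low bits are free on a word-aligned stack */

/* Pack a word-aligned stack pointer and the execution flag into one word. */
static inline uintptr_t task_pack(void *stack, int execute)
{
    return ((uintptr_t)stack & ~TASK_FLAG_MASK) | (execute ? TASK_EXEC_FLAG : 0);
}

/* Recover the real stack pointer; this mask is the extra per-switch cost. */
static inline void *task_stack(uintptr_t packed)
{
    return (void *)(packed & ~TASK_FLAG_MASK);
}

/* An entry is in use when the masked stack pointer is not NULL. */
static inline int task_in_use(uintptr_t packed)
{
    return task_stack(packed) != NULL;
}

/* A task may run only if its entry is in use and the execution bit is set. */
static inline int task_can_execute(uintptr_t packed)
{
    return task_in_use(packed) && (packed & TASK_EXEC_FLAG) != 0;
}
```

The trade-off is one fewer word per table entry in exchange for a mask operation every time the stack pointer is read.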


A detailed understanding of the CM3 stacking hardware and register convention is imperative to implementing multi-tasking handling routines. Figure 1 below shows the different values assigned to the PSP when switching tasks.



Figure 1: Values assigned to the PSP when switching tasks

When an interrupt request is serviced on the CM3, some registers (see “Hardware Stack Frame” in Figure 1) are automatically pushed by hardware onto the current stack (in this case, the process stack).


Software must save the remaining general-purpose registers (see “Software Stack Frame”). The following describes the chronological values assigned to the PSP when performing a context switch (reference Figure 1):


* PSP(0): Just before an interrupt request is serviced

* PSP(1): Just after an interrupt request is serviced

* PSP(2): After the context switcher saves the necessary registers on the stack

* PSP(3): After the context switcher reassigns the PSP to a new execution thread

* PSP(4): After the context switcher loads the last known state of the new thread

* PSP(5): After the interrupt request returns and execution of the new thread begins/resumes
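
As a rough illustration of the first steps in this sequence, the two frames can be written as C structures and the PSP values derived from their sizes. The type and function names here are hypothetical, and the sketch assumes the hardware inserts no extra alignment word when stacking.

```c
#include <stdint.h>

/* Registers pushed automatically by the CM3 on exception entry,
 * listed from the lowest stack address upward. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} hw_stack_frame_t;

/* Remaining general-purpose registers (r4-r11) saved by software. */
typedef struct {
    uint32_t r4, r5, r6, r7, r8, r9, r10, r11;
} sw_stack_frame_t;

/* PSP(1): the stack grows downward, so hardware stacking subtracts
 * the frame size. Assumes no alignment padding word is inserted. */
static inline uint32_t psp_after_hw_stacking(uint32_t psp0)
{
    return psp0 - (uint32_t)sizeof(hw_stack_frame_t);
}

/* PSP(2): the context switcher pushes r4-r11 below the hardware frame. */
static inline uint32_t psp_after_sw_stacking(uint32_t psp1)
{
    return psp1 - (uint32_t)sizeof(sw_stack_frame_t);
}
```

Restoring a task reverses the arithmetic: popping the software frame and then the hardware frame each adds 32 bytes back to the PSP.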


Designing Context Switching Routines

Using the above details about the CM3 stacking and registers, the systems designer needs to create just three routines: the context switcher, the system initializer, and the task creator. Figure 2 below shows a software flow diagram of each routine.



Figure 2. Software flow diagram of system routines

The context switcher is invoked only through the SysTick and PendSV interrupt requests. It immediately pushes the software stack frame onto the process stack. It then saves the current value of the PSP in the task table entry of the previously executing task. Next, it decides which task to execute. This implementation, designed for fast switching, traverses the task table starting with the previously executing task and switches to the next task that has the “execute” flag set.


The switching algorithm can be as simple or as sophisticated as the systems designer wishes and may take task priority, CPU time, or other factors into account. Once the next task is determined, the PSP is assigned the value of the new task’s stack pointer retrieved from the task table.


Lastly, immediately before returning from the interrupt, the software pops the software stack frame from the process stack. When the interrupt returns, the CM3 interrupt handling hardware pops the hardware stack frame and execution of the new task begins/resumes.
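
The portable part of the context switcher described above (saving the PSP, choosing the next task, and returning the new PSP) might look like the following sketch. The register push and pop around it would be written in assembly and are not shown. The flag bit positions, TASK_TOTAL, the current_task variable, and the function name switch_context are assumptions made for illustration, as is the decision to stay on the current task when nothing else is runnable.

```c
#include <stdint.h>
#include <stddef.h>

#define TASK_FLAG_IN_USE  (1u << 0) /* assumed bit position */
#define TASK_FLAG_EXECUTE (1u << 1) /* assumed bit position */
#define TASK_TOTAL 4                /* assumed table size */

typedef struct {
    void *stack;    /* Task's stack pointer */
    uint32_t flags; /* "In use" flag and dynamic execution flag */
} task_table_t;

task_table_t task_table[TASK_TOTAL];
int current_task;

/* Called with the PSP value after the software stack frame has been
 * pushed; returns the PSP to load for the task that runs next. */
void *switch_context(void *saved_psp)
{
    const uint32_t runnable = TASK_FLAG_IN_USE | TASK_FLAG_EXECUTE;
    int i;

    /* Save the outgoing task's stack pointer in its table entry. */
    task_table[current_task].stack = saved_psp;

    /* Walk the table starting just after the current task, wrapping
     * around, so the current task is considered last. */
    for (i = 1; i <= TASK_TOTAL; i++) {
        int next = (current_task + i) % TASK_TOTAL;
        if ((task_table[next].flags & runnable) == runnable) {
            current_task = next;
            break;
        }
    }

    /* If no entry was runnable, current_task is unchanged. */
    return task_table[current_task].stack;
}
```

Because the search is a single pass over a small array, the switching time is bounded by the table size.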


In Figure 2, points “A” and “B” mark the locations where additional functionality may be added to the context switcher. Task timers are one example: a single hardware timer can be used to track the CPU time utilized by each task. To do this, an entry for the task time is added to the task table.


At point “A”, the task timer for the previously executing task is saved in the task table. At point “B”, the task timer for the upcoming task is loaded from the task table. The systems designer may also consider setting the privilege level for the new task or adding task specific memory protection. The CM3 hardware supports both of these advanced features.
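
A minimal sketch of the task timer idea follows, assuming a free-running hardware counter that software can both read and write. The counter is passed in by pointer so the logic stays independent of any particular timer peripheral; the task_time array and both function names are hypothetical.

```c
#include <stdint.h>

#define TASK_TOTAL 4 /* assumed table size */

/* Hypothetical per-task time entry added to the task table. */
uint32_t task_time[TASK_TOTAL];

/* Point "A": save the outgoing task's accumulated time from the timer. */
void task_timer_save(volatile uint32_t *timer_count, int prev_task)
{
    task_time[prev_task] = *timer_count;
}

/* Point "B": load the incoming task's accumulated time into the timer,
 * so the counter continues from where that task left off. */
void task_timer_load(volatile uint32_t *timer_count, int next_task)
{
    *timer_count = task_time[next_task];
}
```

With this arrangement the counter always holds the running total for whichever task is executing, and the table holds the totals for all the others.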


The system initializer shown in Figure 2 initializes the first task’s stack as well as the switching-related hardware, namely the SysTick and PendSV interrupts. Figure 3 below shows the values of an initialized task stack, which may be allocated dynamically or statically.


The hardware stack frame must be populated correctly in order for the task to start and stop properly. The values of the software stack frame are ignored. Nonetheless, the initial value of the stack pointer must point to the bottom of the software stack frame (its lowest address, since the stack grows downward) in order for the context switcher to load the software stack frame when switching to the task.



Figure 3. Values of an initialized task stack

Once the task’s stack is ready, the SysTick and PendSV interrupts are initialized. The SysTick reload register is loaded with the value that sets the round-robin interrupt time. Because the timer counts down from the reload value to zero, the interrupt interval is the reload value plus one, multiplied by the period of the SysTick clock (that is, divided by the clock frequency).
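
As a worked example of the timing calculation: with a reload value N, the 24-bit counter reaches zero every N + 1 clock cycles, so the interval is (N + 1) / f_clk. The helper names below are hypothetical, and returning 0 for an unusable configuration is simply a convention chosen for this sketch.

```c
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu /* the SysTick counter is 24 bits wide */

/* Reload value for the requested switching rate, or 0 if the inputs
 * are invalid or the value does not fit in the 24-bit register. */
uint32_t systick_reload(uint32_t clock_hz, uint32_t switch_hz)
{
    uint32_t ticks;

    if (clock_hz == 0 || switch_hz == 0) {
        return 0;
    }
    ticks = clock_hz / switch_hz; /* clock cycles per interrupt */
    if (ticks < 2 || ticks - 1 > SYSTICK_MAX_RELOAD) {
        return 0;
    }
    return ticks - 1; /* the counter runs from reload down to 0 */
}

/* Interrupt interval in microseconds for a given reload value. */
uint32_t systick_interval_us(uint32_t clock_hz, uint32_t reload)
{
    if (clock_hz == 0) {
        return 0;
    }
    return (uint32_t)(((uint64_t)reload + 1u) * 1000000u / clock_hz);
}
```

For instance, a 72 MHz clock and a 1 kHz switching rate give a reload value of 71999, while a 1 Hz rate at the same clock would need 71999999, which exceeds the 24-bit limit.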


Once the SysTick timing is configured, the interrupt is enabled to start switching. The PendSV interrupt is enabled by default, and no initialization is required. After all initialization is complete, the PendSV interrupt is used to force a context switch, after which execution never returns to the task initialization function.
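
The hardware setup can be sketched as below. To keep the example independent of any vendor header, the registers are passed in by pointer; on a CM3 the SysTick block sits at 0xE000E010 and the Interrupt Control and State Register (ICSR) at 0xE000ED04. The structure and function names are hypothetical, and selecting the processor clock as the SysTick source is an assumption.

```c
#include <stdint.h>

/* Layout of the SysTick register block. */
typedef struct {
    volatile uint32_t ctrl;  /* control and status */
    volatile uint32_t load;  /* reload value */
    volatile uint32_t val;   /* current value */
    volatile uint32_t calib; /* calibration value */
} systick_regs_t;

#define SYSTICK_CTRL_ENABLE    (1u << 0)  /* start the counter */
#define SYSTICK_CTRL_TICKINT   (1u << 1)  /* interrupt on reaching zero */
#define SYSTICK_CTRL_CLKSOURCE (1u << 2)  /* use the processor clock (assumed) */
#define ICSR_PENDSVSET         (1u << 28) /* set PendSV pending */

/* Configure SysTick for round-robin switching, then pend PendSV to
 * force the first context switch. */
void start_switching(systick_regs_t *systick, volatile uint32_t *icsr,
                     uint32_t reload)
{
    systick->load = reload & 0x00FFFFFFu; /* 24-bit reload value */
    systick->val = 0;                     /* a write clears the counter */
    systick->ctrl = SYSTICK_CTRL_ENABLE | SYSTICK_CTRL_TICKINT |
                    SYSTICK_CTRL_CLKSOURCE;
    *icsr = ICSR_PENDSVSET;               /* software-requested switch */
}
```

On the target, the call would be along the lines of start_switching((systick_regs_t *)0xE000E010, (volatile uint32_t *)0xE000ED04, reload).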


The initial task can create additional tasks using the task creator routine (see Figure 3 above). Creating an additional task involves preparing the task’s stack and configuring an entry in the task table. The task creator routine finds an unused entry in the task table, populates the entry, and initializes the stack.


If there are no unused entries in the task table, no more tasks can be created unless the systems designer integrates a mechanism to dynamically resize the task table. If an available entry is found, the task creator routine initializes the stack for the new task. The calling function provides the memory location and size of the stack. This allows the caller to dynamically or statically allocate the stack.
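
The table search performed by the task creator could be sketched like this. The function name task_claim_entry, the flag bit positions, and the fixed TASK_TOTAL are assumptions, as is rounding the top of the stack down to an 8-byte boundary. The execution flag is deliberately left clear here so the task is not scheduled until its stack frames have been initialized.

```c
#include <stdint.h>
#include <stddef.h>

#define TASK_FLAG_IN_USE  (1u << 0) /* assumed bit position */
#define TASK_FLAG_EXECUTE (1u << 1) /* assumed bit position */
#define TASK_TOTAL 4                /* assumed fixed table size */

typedef struct {
    void *stack;    /* Task's stack pointer */
    uint32_t flags; /* "In use" flag and dynamic execution flag */
} task_table_t;

task_table_t task_table[TASK_TOTAL];

/* Claim the first unused entry for a stack supplied by the caller.
 * Returns the entry index, or -1 if the table is full. */
int task_claim_entry(void *stack_mem, size_t stack_size)
{
    int i;

    for (i = 0; i < TASK_TOTAL; i++) {
        if ((task_table[i].flags & TASK_FLAG_IN_USE) == 0) {
            /* The stack grows downward, so start at the highest
             * address, rounded down to an 8-byte boundary. */
            uintptr_t top = ((uintptr_t)stack_mem + stack_size) &
                            ~(uintptr_t)7;
            task_table[i].stack = (void *)top;
            task_table[i].flags = TASK_FLAG_IN_USE;
            return i;
        }
    }
    return -1; /* no unused entries */
}
```

Since the caller owns the memory, the same routine works whether the stack comes from a static array or from a dynamic allocator.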


The task switching routines can then be implemented in systems with or without a dynamic memory allocator. The stack initialization of a new task uses the same approach as initializing the initial task. The hardware stack frame must be properly initialized while the software stack frame is allocated but can be left uninitialized. For POSIX style threads, the type for new thread routines is:


void * routine(void * args);


Using this template, args is assigned to r0 in the hardware stack frame; additional arguments, if desired, are assigned to r1, r2, and r3 according to the “Procedure Call Standard for the ARM Architecture ABI” revision r2.08. The routine is assigned as the start function, and a routine defined by the systems designer serves as the stop function.


The stop function erases the task table entry so that the context switcher no longer executes the task; both the “in use” and dynamic execution flags should be cleared. The stop function can also be provided by the caller, allowing the developer to free the stack or do any other task cleanup that might be required. The result of routine() is passed to the stop function in r0 using the following prototype for the stop function:


void stop_function(void * ret);


The stop function can then be used to pass the return value to any functions having requested this data. Under POSIX, the return value of routine() is required for implementing the pthread_join() function.
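
Putting the stack initialization together, a new task's frames might be prepared as in the sketch below. It assumes the start function corresponds to the stacked PC and the stop function to the stacked LR, so that returning from routine() lands in the stop function with the result in r0. The names task_init_stack, example_routine, and example_stop are hypothetical, and reg_t is uintptr_t only so the sketch also compiles off-target (on the CM3 it is 32 bits wide).

```c
#include <stdint.h>
#include <stddef.h>

typedef uintptr_t reg_t; /* one register-sized stack slot */

typedef struct { reg_t r4, r5, r6, r7, r8, r9, r10, r11; } sw_stack_frame_t;
typedef struct { reg_t r0, r1, r2, r3, r12, lr, pc, xpsr; } hw_stack_frame_t;

#define XPSR_THUMB ((reg_t)1 << 24) /* Thumb state bit, required on the CM3 */

/* Build the initial frames below stack_top and return the stack
 * pointer value to store in the task table. */
void *task_init_stack(void *stack_top, void *(*routine)(void *),
                      void *args, void (*stop_function)(void *))
{
    hw_stack_frame_t *hw = (hw_stack_frame_t *)stack_top - 1;
    sw_stack_frame_t *sw = (sw_stack_frame_t *)hw - 1;

    hw->r0 = (reg_t)args;                 /* first argument to routine() */
    hw->r1 = hw->r2 = hw->r3 = hw->r12 = 0;
    hw->lr = (reg_t)stop_function;        /* routine() returns here */
    hw->pc = (reg_t)routine & ~(reg_t)1;  /* start address, bit 0 cleared */
    hw->xpsr = XPSR_THUMB;

    /* The software frame is allocated but left uninitialized. */
    return sw;
}

/* Hypothetical task and stop function used only for illustration. */
void *example_routine(void *args) { return args; }
void example_stop(void *ret) { (void)ret; for (;;) { } }
```

Clearing bit 0 of the stacked PC follows a common practice in RTOS ports for this core; the returned pointer sits at the bottom of the software stack frame, which is where the context switcher expects it.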


Conclusion

The CM3 has many advanced features that lend themselves to creating an embedded operating system. These features are context-switching friendly and include two stacks designed for switching system integration as well as two interrupts which enable support for round robin and FIFO switching algorithms.


The systems developer must have a sound understanding of the CM3 switching hardware as well as the stacking and register conventions in order to create the three routines required for context switching: the context switcher, system initialization, and the task creator.


Tyler Gilbert is the lead developer on CoActionOS, an embedded development platform for the ARM Cortex-M architecture (visit www.coactionos.com to learn more). He welcomes your feedback at tgil@coactionos.com.


References:
1 – ARM Cortex-M3 Technical Reference Manual
2 – http://pubs.opengroup.org/onlinepubs/../pthread_join.html
3 – http://pubs.opengroup.org/onlinepubs/../functions/pthread_create.html
4 – http://pubs.opengroup.org/onlinepubs/../functions/pthread_exit.html

This article is provided courtesy of Embedded.com and Embedded Systems Design Magazine. Copyright © 2011 UBM. All rights reserved.
