Current area: HOME -> Unbound Stacks and Stoppable Tasks
Unbound Stacks and Stoppable Tasks
by Ralph Moore, President, Micro Digital Inc.
It is generally recognized that UNIX-style processes are not appropriate for most embedded systems.
This is because memory and performance overhead are too great. It is not widely recognized that
ordinary tasks (a.k.a. threads) can also have too much overhead for some embedded system
requirements.
This is primarily due to stacks. There is no practical way to avoid requiring one stack per active task.1
If, for example, one tries to use one monolithic stack for all tasks, a problem develops since tasks are
resumed in a different order than they are suspended. Soon, most tasks will have fixed size stack spaces
which do not permit deeper nesting of functions or handling of interrupts, etc. Although various complex
stratagems might be employed the universal solution is a stack per task.
Unfortunately even a lean kernel is likely to impose the need for about 200 bytes of stack. Added to this
are the requirements of the tasks, themselves, including maximum function nesting which, in C, can
easily add another 200 bytes. Some standard C library functions (e.g. sprintf( ) ) can add much more
than this just by themselves. This does not even include saving a coprocessor state or making room for
floating point emulation.
Finally, space is needed for the maximum possible nesting of interrupts another 200 bytes, or so. It is
true that interrupt service routines could have their own stacks. However, this adds delay at the most
critical point of most embedded systems namely, at the point of interrupt processing. Hence,
although done in non-real time operating systems (e.g. MS-DOS) interrupt stacks are not a good idea
for high-performance, embedded systems. Also, separate interrupt stacks create other complexities and
may not even save much, if any, memory.
Hence, we end up with on the order of 600 bytes of stack per task. This is a minimal requirement.
Special tasks may require even more. For example, graphic tasks can require as much as 10,000 bytes
of stack. Note also, that so far, we are thinking mostly in terms of 16-bit stacks. 32-bit processors
require about twice as much stack. However, 32-bit systems usually do not have as tight memory
constraints, so we will continue to focus, here, on 16-bit systems. 600 bytes per task may not seem like
much. However, this depends upon how beneficial it is to break a particular application into a large
(1: Except for link service routines which run in the context of the current task and act like foreground tasks with limited
capabilities. These are the subject of another article. See "Link Service Routines" by Ralph Moore.)
Number of small tasks. If say, 50 is an optimum number of tasks for a particular application, then 30KB
of RAM may be a bit too much.
There is a tendency among developers to divide an application into just a few tasks rather than to look
at what the application really needs. This I call the "big os mindset." There is an implicit assumption in
such thinking that tasks have far too much overhead to be used extensively in embedded designs. This is
not correct.
An example will clarify this. Suppose we are developing a communication system. Usually it works well
to have at least two tasks a receive task and a send task. Suppose there are 10 identical channels.
How many tasks should there be? I say 20. However, most designers would struggle with only 2. Why
is 20 better?
Because:
The code is simpler it only has to deal with one channel. The exact same
receive code is used by all 10 receive tasks. Similarly for the send code. Thus to have 20 tasks does not
add any more code than having two. In fact, the amount of code required is less.
More of what needs to be done is being done by kernel code. For example, kernel objects such as semaphores and
messages, are likely to be used more and application mechanisms such as flags and buffers, less. Kernel
code is a few orders of magnitude more reliable than brand-new application code.
It is scalable channels can be added or removed very easily.
It is dynamic tasks can be created and deleted on the fly.
The final design is easier to understand and, hence, maintain by different people.
Interfaces between sections of code are defined by the kernel API. Hence, they are more robust and
better documented.
The kernel adds extensive error checking and handling. These seldom get put into application code. This speeds debugging.
So in this simple example, there are abundant reasons to support a fine-grained task structure. This is not what is taught in most university IS departments. There, the massive-task concept of UNIX pervades and it colors thinking. I think that when dealing with the needs of high-performance, highreliability embedded systems, some rethinking of basic concepts is appropriate.
First, we need a kernel (not an os) which is mean and lean, so to speak. Then we need to learn how to best utilize that kernel
to achieve project goals.
So, let's get back to the subject of stacks. Wouldn't it be nice if at least some tasks could share stacks?
Then there would not be so many. This is where we introduce the concept of the unbound stack.
What is an unbound stack? An unbound stack is a stack which is not permanently bound to a particular task
it can be shared. How is this possible? It works as follows: Unbound stacks are all the same size and
belong to a single stack pool. When a new task is first dispatched (i.e. started running), unless it already
has a permanently-bound stack, it is loaned a stack from the stack pool. As long as the task is active
(i.e. either running or suspended) it keeps the stack. Hence the task and stack function normally when
the task is running. However, when the task completes its function and is ready to wait for more work, it
gives up the stack. We call this stopping the task as opposed to suspending it. In the latter case, the
task would keep the stack, but in the former case, the task has no need for the stack so the task gives it
up, (since a stopped task is later restarted from the beginning rather than resumed where it left off.)
The stack goes back to the stack pool where it is available to be used by another task. Now the task is
in what we call the unbound mode:
Figure 1: Task Modes
In this mode the task has no state variables (e.g. register contents) and no auto variables worth
remembering. This state is appropriate only if the task has completed all its work and is ready to start
over again, or to restart, as we call it.
The conventional task is created with task_main( ) as its main function. When this task first starts, it
initializes, unlocks itself so it can be preempted by higher priority tasks, then goes into an infinite while
loop. At the start of this loop the task waits at xchg for a message. When a message is received, it is
processed, then the task waits at xchg for another message. This repeats for each new message
received.
The stoppable task has a much different structure. It is created with task_main_init( ) as its initial main
function. When the task first runs, task_main_init( ) does the initialization, the task's main function is
changed to task_main( ), and the task stops on xchg. When a message is received the task restarts,
except this time with task_main( ) as its main function, and msg is passed in as a parameter. The
message is processed and the task again stops on xchg. Now, task_main( ) will restart for each new
message received.
What is the advantage of this? Let's return to our example and look at the 10 send tasks. Suppose
outgoing traffic is very light and a send task spends 90% of its time waiting at an exchange for another
message to send. Why tie up a valuable stack all this time? Quite possibly, 2 stacks (or even 1 stack)
would satisfy the needs of the 10 tasks. Hence using stoppable tasks would, in this one area, give an
80% reduction in stack space!
To reiterate what has been alluded to before, the memory savings, per se, is not the most important
gain.. What is most important is the freedom to utilize a fine-grained task structure to achieve an
optimum architecture for the application at hand.
What about performance? It turns out that getting and binding a free stack is an efficient process which
costs no more time than handling a bound stack. Hence, there is no downside performance-wise.
What happens, "once in a blue moon," when there aren't enough stacks? Simple, some tasks wait.
Processing is stacked up due to limited processor bandwidth, anyway, so waiting for free-stacks does
not degrade performance further. At worst a high-priority task could miss a deadline. However, such a
task can be given a permanently bound stack so that tis won't happen.
A final point: Notice that the stoppable task is not unlocked. This is not a requirement it can be
unlocked. However, then it can be preempted and will thus hold its borrowed stack longer. This is not a
problem, but a bigger stack pool may be required, as a consequence. Ideally, a stoppable task is a
tiger task sort of like an interrupt service routine. It is best to let it rip. We call tasks like this a
nonpre (nonpreemptible) task.
Although unbound stacks and stoppable tasks are not a panacea for all embedded system needs, they
are good additional tools to have in your toolbox. They have been an integral part of smx for the past 7
years and many applications have benefited from them. Hence, they are not just intellectual curiosities.
They make a contribution to the evolving theory of optimum embedded system kernels.
About the author
Ralph Moore graduated in physics from Caltech in 1962. He
did computer research at Honeywell then went to work for
Scientific Data Systems in 1965 as a logic designer. He
started his own consulting business, Micro Digital Associates,
in 1975 doing embedded systems (except no one called them that
then). At that time the 2MHz 8080 cost over $300 in 100 quantity!
Over the years the work mix changed steadily from hardware to
software. In 1988, Mr. Moore designed SMX, a real-time multi-
tasking kernel, and took Micro Digital into the software product
business. Today he continues as chief architect and president
of Micro Digital, Inc. You can contact him at Ralph@smxinfo.com
can b still fine Hello,
The article will b still fine if had
the tools for stack overflow analysis and
the stack's importance in embedded arena .
It would have been a delicious dish if
it provides a graphical analysis of the
stack analysing tools .
I have found a tool called stack analyser
that states to give us a control over
the stack used by embedded applications.
WHich is the best stack analysing tool
for embedded applications ? Let me know
the free/commercial variants of the tool !!
With StackAnalyzer -->
* StackAnalyzer provides automatic tool support to calculate the stack usage of your application. The analysis results are valid for all inputs and each task execution.
* StackAnalyzer not only reduces development effort, but also helps to prevent runtime errors due to stack overflow.
* The analysis results provide valuable feedback in optimizing the stack usage of your application.
StackAnalyzer for HC12 features:
* Detailed and precise information on stack usage by application tasks.
* Intuitive handling of complex software projects.
* Stack analysis for all application levels: routines, basic blocks, assembly instructions.
LET ME KNOW SOMETHING MORE THAN THIS.
CHEEEEEEEEEEEEEEEERS !!!
Computerworld The most comprehensive source of news and analysis on the technologies and management issues of information technology. ...
subscribe now