This document is Copyright 1994 ARM Ltd, and has been included on this 
disc with their kind permission. This manual is supplied "as is"; ARM 
Limited ("ARM") makes no warranty, express or implied, of the 
merchantability of this document or its fitness for any particular 
purpose. In no circumstances shall ARM be liable for any damage, loss 
of profits, or any indirect or consequential loss arising out of the 
use of these recipes or inability to use these recipes, even if ARM has 
been advised of the possibility of such loss.
---------------------------------------------------------------------------

4. Interfacing Assembly Language and C
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4.1 Register Usage under the ARM Procedure Call Standard
--------------------------------------------------------
4.1.1 About this Recipe
-----------------------
In this recipe you will learn about:

     the basic issues involved with interfacing ARM Assembly Language code 
     to C programs;
     the basic concepts of the ARM Procedure Call Standard (or APCS), with 
     more detail on register usage issues.

The supporting example illustrates:

     a simple function written in assembler which is linkable with C modules;
     some of the issues involved with the APCS.

4.1.2 Introduction to the APCS
------------------------------
The ARM Procedure Call Standard is a set of rules which govern calls between 
functions which are visible between separately compiled or assembled code 
fragments.
The following are defined by the APCS:

     constraints on the use of registers;
     stack conventions;
     the format of a stack backtrace data structure;
     argument passing and result return;
     support for the ARM shared library mechanism.

Code which is produced by compilers is expected to adhere to the APCS at all 
times.  Such code is said to be strictly conforming.
Hand written code is expected to adhere to the APCS when making calls to 
externally visible functions.  Such code is said to be conforming.
The ARM Procedure Call Standard comprises a family of variants.  The 
following independent choices need to be made to fix the variant of the APCS 
required:

     Is the Program Counter 32-bit or 26-bit?
     Is stack limit checking explicit or implicit? ie. is stack limit 
     checking performed by code, or is it checked by memory management 
     hardware?
     Should floating point values be passed in floating point registers?
     Is code reentrant or non-reentrant?

Code which conforms to one APCS variant does not conform to any of the 
other variants.
For the full specification of the APCS see ARM Procedure Call Standard 
starting on page 38 of the Technical Specifications.

4.1.3 Register Names and Usage under the APCS
---------------------------------------------
The following table summarises the names and uses allocated to the ARM and 
Floating Point registers under the APCS (note that not all ARM systems 
support floating point):

Name     Register     APCS Role
----     --------     ---------
 a1        r0      argument 1 / integer result
 a2        r1      argument 2
 a3        r2      argument 3
 a4        r3      argument 4
 v1-v5    r4-r8    register variables
 sb        r9      static base
 sl        r10     stack limit / stack chunk handle
 fp        r11     frame pointer
 ip        r12     new-static base in inter-link-unit calls
 sp        r13     lower end of current stack frame
 lr        r14     link address
 pc        r15     program counter
 f0        f0      FP argument 1 / FP result
 f1        f1      FP argument 2
 f2        f2      FP argument 3
 f3        f3      FP argument 4
 f4-f7    f4-f7    FP register variables

Simplistically:

a1-a4, f0-f3
     are used to pass arguments to functions.  a1 is also used to return 
     integer results, and f0 to return FP results.  These registers can be 
     corrupted by a called function.

v1-v5, f4-f7
     are used as register variables.  They must be preserved by called 
     functions.

sb, sl, fp, ip, sp, lr, pc
     have a dedicated role in some APCS variants, some of the time, ie. 
     there are times when some of these registers can be used for other 
     purposes even when strictly conforming to the APCS.  In some variants 
     of the APCS sb and sl are available as additional variable registers 
     v6 and v7 respectively.

As stated previously, hand coded assembler routines need not conform 
strictly to the APCS, but need only conform.  This means that all registers 
which do not need to be used in their APCS role by an assembler routine (eg. 
fp) can be used as working registers as long as their value on entry is 
restored before returning.
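
The preservation rule can be summarised as a small lookup, sketched here in 
C.  The names apcs_name, APCS_PRESERVED_MASK and apcs_must_preserve are 
hypothetical and are not part of any ARM tool.  The sketch assumes that sb 
and sl are preserved in every variant (either in their dedicated role or as 
v6 and v7), and it leaves lr out because a called function only needs lr in 
order to return.

```c
#include <stdbool.h>

/* APCS register names indexed by ARM register number (r0-r15). */
static const char *const apcs_name[16] = {
    "a1", "a2", "a3", "a4", "v1", "v2", "v3", "v4",
    "v5", "sb", "sl", "fp", "ip", "sp", "lr", "pc"
};

/* Bit n set: rn must hold its entry value when the function returns.
   Assumed here to be v1-v5, sb, sl, fp (r4-r11) and sp (r13). */
#define APCS_PRESERVED_MASK 0x2FF0u

static bool apcs_must_preserve(unsigned reg)
{
    return reg < 16 && ((APCS_PRESERVED_MASK >> reg) & 1u) != 0;
}
```

For example, apcs_must_preserve(11) is true for fp, while 
apcs_must_preserve(12) is false for ip.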

4.1.4 64 Bit Integer Addition
-----------------------------
The purpose of this example is to examine coding a small function in ARM 
Assembly Language, in a way which will enable it to be used from C modules.  
First, however, the function is coded in C, and the compiler's output 
examined.
Let us consider writing a 64 bit integer addition routine in C, where the 
data structure used to store 64 bit integers is a two word structure.  The 
obvious way to code the addition of these double length integers in 
assembler is to make use of the Carry flag from the low word addition in the 
high word addition.  However, there is no way to specify this in C.

A possible way to code around this in C is as follows:

void add_64(int64 *dest, int64 *src1, int64 *src2)
{ unsigned hibit1=src1->lo >> 31, hibit2=src2->lo >> 31, hibit3;
  dest->lo=src1->lo + src2->lo;
  hibit3=dest->lo >> 31;
  dest->hi=src1->hi + src2->hi +
           ((hibit1 & hibit2) || ((hibit1 | hibit2) && !hibit3));
  return;
}

Explanation
-----------
The highest bits of the low words in the two operands are calculated 
(shifting them into bit 0, while clearing the rest of the register).  These, 
together with the highest bit of the low word of the sum, are then used to 
determine the value of the carry bit: a carry occurs if both operand bits 
are set, or if at least one of them is set and the corresponding bit of the 
sum is clear.
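
As a cross-check of the carry rule, the following portable C sketch derives 
the carry out of a 32-bit addition in two ways: from the top bits, and by 
testing whether the sum has wrapped round to less than an operand.  Both 
function names are hypothetical, and the comparison idiom is a common C 
technique rather than something taken from the supplied examples.

```c
#include <stdint.h>

/* Carry out of a 32-bit addition, derived from the top bits only. */
static unsigned carry_from_hibits(uint32_t a, uint32_t b)
{
    unsigned h1 = a >> 31, h2 = b >> 31;
    unsigned h3 = (uint32_t)(a + b) >> 31;
    /* Both top bits set, or one set and the top bit of the sum clear. */
    return (h1 & h2) | ((h1 | h2) & (h3 ^ 1u));
}

/* The same carry, detected by unsigned wrap-around of the sum. */
static unsigned carry_from_compare(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b) < a;
}
```

The two functions should agree for every pair of operands, for instance 
both give 1 for 0xFFFFFFFF + 1 and 0 for 0x7FFFFFFF + 1.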

Examining the Compiler's Output
-------------------------------
If the 64 bit integer addition routine is used a great deal, then a poor 
implementation such as this is likely to be inadequate.  To see just how 
good or bad this implementation is let us look at the actual code which the 
compiler produces.
Set the current directory to examples.  The above code can be found in 
add64_1.c, which we can compile to ARM Assembly Language source as follows:

armcc -li -apcs 3/32bit -S add64_1.c

The -S flag tells armcc to produce ARM Assembly Language source (suitable 
for armasm) rather than producing object code.  The -li flag tells armcc to 
compile for a little-endian memory and the -apcs option specifies that the 
32 bit version of APCS 3 should be used.  You can omit these options if your 
armcc has been configured for this default (see The ARM Tool Reconfiguration 
Utility (reconfig) starting on page 45 of the User Manual for details).
Looking at the output file, add64_1.s, we can see that this is indeed an 
inefficient implementation.

Modifying the Compiler's Output
-------------------------------
Let us go back to the original intention of coding the 64 bit integer 
addition using the ARM's Carry flag.  Since use of the Carry flag cannot be 
specified in C, we can get the compiler to produce almost the right code, 
and then modify it by hand.  Let us start with (incorrect) code which does 
not perform the carry addition:

void add_64(int64 *dest, int64 *src1, int64 *src2)
{ dest->lo=src1->lo + src2->lo;
  dest->hi=src1->hi + src2->hi;
  return;
}

To compile this to give assembler suitable for use with armasm first set the 
current directory to examples, and issue this command (the options used are 
described above):

armcc -li -apcs 3/32bit -S add64_2.c

This will produce source in add64_2.s, which will include something like the 
following code (it may be slightly different with the version of armcc 
supplied with this release):

add_64
    LDR    a4,[a2,#0]
    LDR    ip,[a3,#0]
    ADD    a4,a4,ip
    STR    a4,[a1,#0]
    LDR    a2,[a2,#4]
    LDR    a3,[a3,#4]
    ADD    a2,a2,a3
    STR    a2,[a1,#4]
    MOV    pc,lr

Comparing this carefully with the C source, we can see that the first ADD 
instruction produces the low order word, and the second produces the high 
order word.  All we need to do to get the carry from the low to high word 
right is change the first ADD to ADDS (add and set flags), and the second 
ADD to an ADC (add with carry).  This modified code is available in the 
examples directory as add64_3.s.

What effect did the APCS have on this example?
----------------------------------------------
Look at the above code again.  The most obvious way in which the APCS has 
affected the code produced is that the registers are all given APCS style 
names, as introduced earlier in this recipe.
a1 clearly holds a pointer to the destination structure, a2 and a3 pointers 
to the operand structures.  Both a4 and ip are used as temporary registers, 
which are not preserved.  The conditions under which ip can be corrupted 
will be discussed later in this recipe.
This is a simple leaf function, which uses few temporary registers.  
Therefore no registers are saved to the stack, and none need to be restored 
on exit.  Thus a simple "MOV pc,lr" can be used to return.
If we had wished to return a result, perhaps the carry out from this 
addition, then it would be loaded into a1 prior to exit.  In this example, 
this could be done by changing the second ADD to ADCS (add with carry and 
set flags), and adding the following instructions to load a1 with 1 or 0 
depending on the carry out from the high order addition.

    MOV    a1, #0
    ADC    a1, a1, #0

Back to the first inefficient implementation
--------------------------------------------
Although the first C implementation was inefficient, it shows us more about 
the APCS than the more efficient hand modified version.
We have already seen a4 and ip being used as non-preserved temporary 
registers.  However, here v1 and lr are also used as temporary registers.  
v1 is preserved by storing it (together with lr) on entry.  lr is corrupted, 
but a copy is saved, onto the stack, and is reloaded into pc at the same 
time that v1 is restored.
Thus there is still only a single exit instruction, but now it is:

    LDMIA  sp!,{v1,pc}
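
The save and restore pair can be pictured with a small C model of a full 
descending stack.  The sketch assumes that the matching entry instruction 
is STMDB sp!,{v1,lr}, which is not listed above; stack_model, push_v1_lr 
and pop_v1_pc are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Model of a full descending stack: sp indexes the last word pushed. */
typedef struct {
    uint32_t mem[16];   /* small stack area, one element per word   */
    unsigned sp;        /* starts at 16, meaning the stack is empty */
} stack_model;

/* STMDB sp!,{v1,lr}: the lower numbered register is stored at the
   lower address, and sp is written back. */
static void push_v1_lr(stack_model *s, uint32_t v1, uint32_t lr)
{
    s->sp -= 2;
    s->mem[s->sp]     = v1;
    s->mem[s->sp + 1] = lr;
}

/* LDMIA sp!,{v1,pc}: restores v1 and loads the saved lr into pc,
   so the restore and the return happen in one instruction. */
static void pop_v1_pc(stack_model *s, uint32_t *v1, uint32_t *pc)
{
    *v1 = s->mem[s->sp];
    *pc = s->mem[s->sp + 1];
    s->sp += 2;
}
```

After a push followed by a pop, sp is back at its entry value, v1 holds its 
entry value, and pc holds the return address that was in lr on entry.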

4.1.5 More Detailed APCS Register Usage Information
---------------------------------------------------
It was stated initially that sb,sl,fp,ip,sp and lr are dedicated registers, 
but in the example we saw ip and lr being used as temporary registers.  
Indeed, there are times when these registers are not used for their APCS 
roles, and it is useful to know about these situations, so that efficient 
(but safe) code can be written to make use of as many of the registers as 
possible and thereby avoid doing unnecessary register saving and restoring.

ip   This register is used only during function call.  It is conventionally 
     used as a local code generation temporary register.  At other times it 
     can be used as a corruptible temporary register.

lr   This register holds the address to which control must return on 
     function exit.  It can be (and often is) used as a temporary register 
     after pushing its contents onto the stack.  This value can then be 
     reloaded straight into the PC, as was the case in Back to the first 
     inefficient implementation starting on page 65.

sp   This is the stack pointer, which is always valid in strictly conforming 
     code, but need only be preserved in hand written code.  Note, however, 
     that if any use of the stack is to be made by hand written code, sp 
     must be available. 

sl   This is the stack limit register.  If stack limit checking is explicit 
     (ie. it is performed by code when stack pushes occur, rather than by 
     memory management hardware causing a trap when stack overflow occurs), 
     then sl must be valid whenever sp is valid.  If stack checking is 
     implicit sl is instead treated as v7, an additional register variable 
     (which must be preserved by called functions).

fp   This is the frame pointer register.  It contains either zero, or a 
     pointer to the most recently created stack backtrace data structure.  
     As with the stack pointer, this must be preserved, but in hand written 
     code need not be available at all instants.  It should, however, be 
     valid whenever any strictly conforming functions are called.  For more 
     information refer to Function Invocations and Backtrace Structures 
     starting on page 43 of the Technical Specifications.

sb   This is the static base register.  If the variant of the APCS being 
     used is reentrant, then this register is used to access an array of 
     static data pointers to allow code to access data reentrantly.  For 
     more information see Reentrant vs Non-Reentrant Code starting on 
     page 46 of the Technical Specifications.  However, if the variant of 
     the APCS being used is not reentrant then sb is instead available as 
     an additional register variable, v6 (which must be preserved by called 
     functions).

Thus sp, sl, fp and sb must all be preserved on function exit for APCS 
conforming code.

4.1.6 Related Topics
--------------------

     Passing and Returning structs starting on page 67;
     In-Line SWIs starting on page 72.

4.2 Passing and Returning structs
---------------------------------
4.2.1 About this Recipe
-----------------------
In this recipe you will learn about:

     the way structs are normally passed to and from functions;
     cases when this is automatically optimised;
     how to tell the compiler to return a struct value using several 
     registers.

4.2.2 The Default Way to Pass and Return a struct
-------------------------------------------------
Unless special conditions apply (detailed in following sections), C 
structures are:

     passed in registers which if necessary overflow onto the stack;
     returned via a pointer to the memory location of the result.

For struct-valued functions a pointer to the location where the struct 
result is to be placed is passed in a1, (the first argument register).  The 
first argument is then passed in a2, the second in a3 etc.

It is as if:

struct s f(int x)

were compiled as:

void f(struct s *result, int x)
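
The equivalence can be seen by writing both forms out in C.  In this sketch 
the contents of struct s and the name f_lowered are hypothetical; the point 
is only that the caller of the second form supplies the location in which 
the result is built.

```c
struct s { int a, b, c; };   /* hypothetical three word struct */

/* As written in the source: a struct-valued function. */
static struct s f(int x)
{
    struct s r = { x, x + 1, x + 2 };
    return r;
}

/* Roughly what is compiled: the caller passes the result location
   as a hidden first argument, and nothing is returned by value. */
static void f_lowered(struct s *result, int x)
{
    result->a = x;
    result->b = x + 1;
    result->c = x + 2;
}
```

A call such as "r = f(5);" then behaves like "f_lowered(&r, 5);".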

As a demonstration of the default way in which structures are passed and 
returned consider the following code:

typedef struct two_ch_struct
{ char ch1;
  char ch2;
} two_ch;

two_ch max( two_ch a, two_ch b )
{ return (a.ch1>b.ch1) ? a : b;
}

This code is available in the examples directory as two_ch.c.  It can be 
compiled to produce Assembly Language source by using the following command:

armcc -S two_ch.c -li -apcs 3/32bit

Where -li and -apcs 3/32bit can be omitted if armcc has been configured 
appropriately already - see The ARM Tool Reconfiguration Utility (reconfig) 
starting on page 45 of the User Manual for more details.
Here is the code which armcc produced (the version of armcc supplied with 
this release may produce slightly different output to that listed here):

max
    MOV    ip,sp
    STMDB  sp!,{a1-a3,fp,ip,lr,pc}
    SUB    fp,ip,#4
    LDRB   a3,[fp,#-&14]
    LDRB   a2,[fp,#-&10]
    CMP    a3,a2
    SUBLE  a2,fp,#&10
    SUBGT  a2,fp,#&14
    LDR    a2,[a2,#0]
    STR    a2,[a1,#0]
    LDMDB  fp,{fp,sp,pc}

The STMDB instruction saves the arguments onto the stack, together with the 
frame pointer, stack pointer, link register and current pc value (this 
sequence of values is the stack backtrace data structure).
a2 and a3 are then used as temporary registers to hold the required part 
of the structures passed, and a1 as a pointer to an area in memory in which 
the resulting struct is placed - all as expected.
For a basic explanation of register naming and usage under the APCS, see 
Register Usage under the ARM Procedure Call Standard starting on page 62.  
Detailed information can be found in C Language Calling Conventions 
starting on page 47 of the Technical Specifications.

4.2.3 The Optimisation of Integer-like Structures
-------------------------------------------------
The ARM Procedure Call Standard specifies different rules for returning 
integer-like structs.  An integer-like struct is one which has the following 
properties:

     The size of the struct is no larger than one word;
     The byte offset of each addressable sub-field is 0 (bit-fields are not 
     addressable).

Thus the following structs are integer-like:

struct
{ unsigned a:8, b:8, c:8, d:8;
}

union polymorphic_ptr
{ struct A *a;
  struct B *b;
  int      *i;
}

Whereas the structure used in the previous example is not integer-like:

struct { char ch1, ch2; }

Integer-like structs are returned by returning the struct's contents in a1 
rather than a pointer to the struct's contents.  Thus a1 is not needed to 
pass a pointer to a result struct in memory, and can instead be used to pass 
the first argument.
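
The two properties can be checked mechanically with sizeof and offsetof, as 
in the following sketch.  The macros FITS_IN_WORD and AT_OFFSET_ZERO and the 
union word_view are hypothetical, and the word size is fixed at the 4 bytes 
of an ARM word.  Bit-fields cannot be given to offsetof, which matches the 
rule that they are not addressable.

```c
#include <stddef.h>

struct two_ch     { char ch1, ch2; };
struct half_words { unsigned field1:16, field2:16; };
union  word_view  { unsigned u; int i; };   /* illustration only */

#define WORD_BYTES 4u   /* size of one ARM word */

/* First property: the whole object fits in one word. */
#define FITS_IN_WORD(type) (sizeof(type) <= WORD_BYTES)

/* Second property, tested one addressable member at a time. */
#define AT_OFFSET_ZERO(type, member) (offsetof(type, member) == 0)
```

For struct two_ch the size test passes but AT_OFFSET_ZERO fails for ch2, so 
the struct is not integer-like.  Every member of union word_view is at 
offset 0, so it is.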

For example, consider the following code:

typedef struct half_words_struct
{ unsigned field1:16;
  unsigned field2:16;
} half_words;

half_words max( half_words a, half_words b )
{ half_words x;
  x= (a.field1>b.field1) ? a : b;
  return x;
}

We would expect arguments a and b to be passed in registers a1 and a2, and 
since half_words_struct is integer-like we expect the result structure to 
be passed back directly in a1 (rather than a1 being used to return a 
pointer to the result half_words_struct).
The above code is available in the examples directory as half_str.c.  It can 
be compiled to produce Assembly Language source by using the following 
command:

armcc -S half_str.c -li -apcs 3/32bit

Where -li and -apcs 3/32bit can be omitted if armcc has been configured 
appropriately already - see The ARM Tool Reconfiguration Utility (reconfig) 
starting on page 45 of the User Manual for more details.
Here is the code which armcc produced (the version of armcc supplied with 
this release may produce slightly different output to that listed here):

max
    MOV    a3,a1,LSL #16
    MOV    a3,a3,LSR #16
    MOV    a4,a2,LSL #16
    MOV    a4,a4,LSR #16
    CMP    a3,a4
    MOVLE  a1,a2
    MOV    pc,lr

Clearly the contents of the half_words structure are returned directly in 
a1 as expected.
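
The listing can be read as the following C model, in which each struct is 
handled as a single 32-bit word.  This is a sketch only: max_half_words is a 
hypothetical name, and field1 is assumed to occupy bits 0-15 of the word, 
which is what the shifts in the listing extract.

```c
#include <stdint.h>

/* Each argument is a whole half_words struct held in one register. */
static uint32_t max_half_words(uint32_t a, uint32_t b)
{
    uint32_t fa = (a << 16) >> 16;  /* MOV a3,a1,LSL #16; MOV a3,a3,LSR #16 */
    uint32_t fb = (b << 16) >> 16;  /* MOV a4,a2,LSL #16; MOV a4,a4,LSR #16 */
    return (fa > fb) ? a : b;       /* CMP a3,a4; MOVLE a1,a2               */
}
```

No memory is touched at any point: both arguments arrive in registers and 
the chosen struct is returned as a register value.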

4.2.4 Returning Non Integer-Like structs in Registers
-----------------------------------------------------
There are occasions when a function needs to return more than one value.  
The normal way to achieve this is to define a structure which holds all the 
values to be returned, and return this.
As we have seen, this will result in a pointer to the structure being passed 
in a1, which will then be dereferenced to store the values returned.
For some applications in which such a function is time critical, the 
overhead involved in "wrapping" and then "unwrapping" this structure can be 
significant.  However, there is a way to tell the compiler that a structure 
should be returned in the argument registers a1 - a4.  Clearly this is only 
useful for returning structures which are no larger than 4 words.
The way to tell the compiler to return a structure in the argument registers 
is to use the keyword "__value_in_regs".

Multiplication - Returning a 64-bit Result
------------------------------------------
To illustrate how to use __value_in_regs, let us consider writing a function 
which multiplies two 32-bit integers together and returns the 64-bit result.
The way this function must work is to split the two 32-bit numbers (a, b) 
into high and low 16-bit parts (a_hi, a_lo, b_hi, b_lo).  The four 
multiplications a_lo * b_lo, a_hi * b_lo, a_lo * b_hi, a_hi * b_hi must be 
performed, and the results added together, taking care to deal with carry 
correctly.
Since the problem involves dealing with carry correctly, coding this 
function in C will not produce optimal code (see 64 Bit Integer Addition 
starting on page 63 for more details).  Therefore we will want to code the 
function in ARM Assembly Language.  The following code performs the 
algorithm just described:

; On entry a1 and a2 contain the 32-bit integers to be multiplied (a, b)
; On exit a1 and a2 contain the result (a1 bits 0-31, a2 bits 32-63) 

mul64
    MOV    ip, a1, LSR #16        ; ip = a_hi
    MOV    a4, a2, LSR #16        ; a4 = b_hi
    BIC    a1, a1, ip, LSL #16    ; a1 = a_lo
    BIC    a2, a2, a4, LSL #16    ; a2 = b_lo
    MUL    a3, a1, a2             ; a3 = a_lo * b_lo        (m_lo)
    MUL    a2, ip, a2             ; a2 = a_hi * b_lo        (m_mid1)
    MUL    a1, a4, a1             ; a1 = a_lo * b_hi        (m_mid2)
    MUL    a4, ip, a4             ; a4 = a_hi * b_hi        (m_hi)
    ADDS   ip, a2, a1             ; ip = m_mid1 + m_mid2    (m_mid)
    ADDCS  a4, a4, #&10000        ; a4 = m_hi + carry       (m_hi')
    ADDS   a1, a3, ip, LSL #16    ; a1 = m_lo + (m_mid<<16)
    ADC    a2, a4, ip, LSR #16    ; a2 = m_hi' + (m_mid>>16) + carry
    MOV    pc, lr
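
For checking the arithmetic on a host machine, the same steps can be 
followed in portable C.  This is a model of the algorithm and not a 
replacement for the assembly routine; the names mul64_result and 
mul64_model are hypothetical, and each carry is detected with an unsigned 
comparison in place of the processor's carry flag.

```c
#include <stdint.h>

typedef struct { uint32_t lo, hi; } mul64_result;

/* 32 x 32 -> 64 bit multiply built from four 16 x 16 bit products. */
static mul64_result mul64_model(uint32_t a, uint32_t b)
{
    uint32_t a_hi = a >> 16, a_lo = a & 0xFFFFu;
    uint32_t b_hi = b >> 16, b_lo = b & 0xFFFFu;
    uint32_t m_lo   = a_lo * b_lo;
    uint32_t m_mid1 = a_hi * b_lo;
    uint32_t m_mid2 = a_lo * b_hi;
    uint32_t m_hi   = a_hi * b_hi;
    uint32_t m_mid  = m_mid1 + m_mid2;            /* ADDS               */
    mul64_result r;
    if (m_mid < m_mid1)                           /* carry out of m_mid */
        m_hi += 0x10000u;                         /* ADDCS              */
    r.lo = m_lo + (m_mid << 16);                  /* ADDS               */
    r.hi = m_hi + (m_mid >> 16) + (r.lo < m_lo);  /* ADC                */
    return r;
}
```

With the operands 0x12345678 and 0x10000001 the model gives 0x92345678 for 
the low word and 0x01234567 for the high word.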

Clearly the mul64 routine is fine for use with Assembly Language modules, 
but in order to use it from C we need to be able to tell the compiler that 
this routine returns its 64-bit result in registers.  This can be done by 
making the following declarations in a header file:

typedef struct int64_struct
{ unsigned int lo;
  unsigned int hi;
} int64;

__value_in_regs extern int64 mul64(unsigned a, unsigned b);

The Assembly Language code above, and the declarations above together with a 
test program are all in the examples directory, as the files: mul64.s, 
mul64.h, int64.h and multest.c.  To compile, assemble and link these to 
produce an executable image suitable for armsd first set your current 
directory to examples, and then execute the following commands:

armasm mul64.s -o mul64.o -li
armcc -c multest.c -li -apcs 3/32bit
armlink mul64.o multest.o somewhere/armlib.32l -o multest

Where somewhere is the directory in which the semi-hosted C libraries reside 
(eg. the lib directory of the ARM Software Tools Release).  Note also that 
-li and -apcs 3/32bit can be omitted if armcc and armasm (and armsd below) 
have been configured appropriately - see The ARM Tool Reconfiguration 
Utility (reconfig) starting on page 45 of the User Manual for more details.

multest can then be run under armsd as follows:

> armsd -li multest
A.R.M. Source-level Debugger, version 4.10 (A.R.M.) [Aug 26 1992]
ARMulator V1.20, 512 Kb RAM, MMU present, Demon 1.01, FPE, Little endian.
Object program file multest
armsd: go
Enter two unsigned 32-bit numbers in hex eg.(100 FF43D)
12345678 10000001
Least significant word of result is 92345678
Most  significant word of result is  1234567
Program terminated normally at PC = 0x00008418
      0x00008418: 0xef000011 .... : >  swi     0x11
armsd: quit
Quitting
>

To convince yourself that __value_in_regs is being used try removing it from 
mul64.h, recompile multest.c, relink multest, and rerun armsd.  This time 
the answers returned will be incorrect, as the result is no longer expected 
to be returned in registers, but instead in a block of memory (ie. the code 
now has a bug).

4.2.5 Related Topics
--------------------
      Register Usage under the ARM Procedure Call Standard starting on page 62;
      ARM6 Multiplier Performance Issues starting on page 38.

4.3 In-Line SWIs
----------------
4.3.1 About This Recipe
-----------------------
This recipe shows how the ARM C Compiler can be used to generate in-line 
SWIs directly from C.

4.3.2 Introduction
------------------
The ARM instruction set provides the Software Interrupt (SWI) instruction to 
call Operating System routines.  It is useful to be able to generate such 
operating system calls from C without having to call hand crafted ARM 
Assembly Language to provide an interface between C and the SWI.
The ARM C Compiler provides a mechanism which allows many SWIs to be called 
efficiently from C.  SWIs which conform to the following rules can be 
compiled in-line,  without additional calling overhead:

     The arguments to the SWI (if any) must be passed in r0-r3 only.
     The results returned from the SWI (if any) must be returned in r0-r3 
     only.

The following sections demonstrate how to use the in-line SWI facility of 
armcc for a variety of different SWIs which conform to these rules.  These 
SWIs are taken from the ARM Debug Monitor interface, which is described in 
Standard Monitor SWIs starting on page 105 of the Technical Specifications.
In the examples below, the following options are used with armcc:

-li   This specifies that the target is a little endian ARM.

-apcs 3/32bit
      This specifies that the 32 bit variant of APCS 3 should be used.

4.3.3 Using a SWI which returns no result
-----------------------------------------
Consider SWI_WriteC, which we want to be SWI number 0.
This SWI is intended to write a byte to the debugging channel.  The byte to 
be written is passed in r0.
The following C code, intended to write a Carriage Return / Line Feed 
sequence to the debugging channel, can be found in the examples directory as 
newline.c:

void __swi(0) SWI_WriteC(int ch);

void output_newline(void)
{ SWI_WriteC(13);
  SWI_WriteC(10);
}

Look carefully at the declaration of SWI_WriteC.  __swi(0) is the way in 
which the SWI_WriteC 'function' is declared to be in-line SWI number 0.
This code can be compiled to produce ARM Assembly Language source using:

armcc -S -li -apcs 3/32bit newline.c -o newline.s

The code produced for the output_newline function is:

output_newline
    MOV    a1,#&d
    SWI    &0
    MOV    a1,#&a
    SWI    &0
    MOV    pc,lr

Please note that the version of armcc supplied with this release may produce 
slightly different output to that listed here.

4.3.4 Using a SWI which returns one result
------------------------------------------
Consider SWI_ReadC, which we want to be SWI number 4.
This SWI is intended to read a byte from the debug channel, returning it in 
r0.
The following C code, a naive read-a-line routine, can be found in the 
examples directory as readline.c:

char __swi(4) SWI_ReadC(void);

void readline(char *buffer)
{ char ch;
  do {
    *buffer++=ch=SWI_ReadC();
  } while (ch!=13);
  *buffer=0;
}

Again, the way in which SWI_ReadC is declared should be noted: it is a 
function which takes no arguments and returns a char, and is implemented as 
in-line SWI number 4. 
This code can be compiled to produce ARM Assembler source using:

armcc -S -li -apcs 3/32bit readline.c -o readline.s

The code produced for the readline function is:

readline
    STMDB  sp!,{lr}
    MOV    lr,a1
|L000008.J4.readline|
    SWI    &4
    STRB   a1,[lr],#1
    CMP    a1,#&d
    BNE    |L000008.J4.readline|
    MOV    a1,#0
    STRB   a1,[lr,#0]
    LDMIA  sp!,{pc}

Please note that the version of armcc supplied with this release may produce 
slightly different output to that listed here.

4.3.5 Using a SWI which returns 2-4 results
-------------------------------------------
If a SWI returns two, three or four results then its declaration must 
specify that it is a struct-valued SWI, and the special keyword 
__value_in_regs must also be used.  This is because a struct-valued function 
is usually treated much as if it were a void function with a pointer to 
where to return the struct as the first argument.  See Passing and 
Returning structs starting on page 67 for more details.
As an example consider SWI_InstallHandler, which we want to be SWI number 
0x70.
On entry r0 contains the exception number, r1 contains the workspace 
pointer, r2 contains the address of the handler.
On exit r0 is undefined, r2 contains the address of the previous handler and 
r1 the previous handler's workspace pointer.
The following C code fragment demonstrates how this SWI could be declared 
and used in C:

typedef struct SWI_InstallHandler_struct
{ unsigned exception;
  unsigned workspace;
  unsigned handler;
} SWI_InstallHandler_block;


SWI_InstallHandler_block 
  __value_in_regs  
    __swi(0x70) SWI_InstallHandler(unsigned r0, unsigned r1, unsigned r2);

void InstallHandler(SWI_InstallHandler_block *regs_in,
                    SWI_InstallHandler_block *regs_out)
{ *regs_out=SWI_InstallHandler(regs_in->exception,
                               regs_in->workspace,
                               regs_in->handler);
}

This code is provided in the examples directory as installh.c, and can be 
compiled to produce ARM Assembler source using:

armcc -S -li -apcs 3/32bit installh.c -o installh.s 

The code which armcc produces is:

InstallHandler
    STMDB  sp!,{lr}
    MOV    lr,a2
    LDMIA  a1,{a1-a3}
    SWI    &70
    STMIA  lr,{a1-a3}
    LDMIA  sp!,{pc}

Please note that the version of armcc supplied with this release may produce 
slightly different output to that listed here.

4.3.6 The SWI Number is not Known Until Run Time
------------------------------------------------
If a SWI is to be called, but the number of the SWI is not known until run 
time, then the mechanisms discussed above are not appropriate.
This situation might occur when there are a number of related operations 
which can be performed on an object, and these various operations are 
implemented by SWIs with different numbers.
There are several ways to deal with this, including:

     The SWI instruction can be constructed from the SWI Number, stored 
     somewhere and then executed.
     A 'generic' SWI can be used which takes as an extra argument a code for 
     the actual operation to be performed on its arguments.  This 'generic' 
     SWI must then decode the operation and then perform it.
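
For the first method, the instruction word itself is simple to build.  The 
sketch below assumes the standard ARM SWI encoding with the "always" 
condition, which agrees with the word 0xef000011 shown for "swi 0x11" in 
the armsd session earlier in this chapter.  The name encode_swi is 
hypothetical, and storing and then executing the word is target specific 
and is not shown.

```c
#include <stdint.h>

/* Build an always-executed ARM SWI instruction for a 24-bit number. */
static uint32_t encode_swi(uint32_t swi_number)
{
    return 0xEF000000u | (swi_number & 0x00FFFFFFu);
}
```

For example, encode_swi(0x11) yields 0xEF000011.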

A mechanism has been added to armcc to support the second method outlined 
here.  The operation is specified by a value which is passed in r12 (ip).  
The arguments to the 'generic' SWI are as usual passed in registers r0-r3, 
and values may optionally be returned in r0-r3 using the mechanisms 
described above.  The operation number passed in r12 may well be the number 
of the SWI to be called by the 'generic' SWI, but it need not be.

Here is a C fragment which uses a 'generic', or 'indirect' SWI:

unsigned __swi_indirect(0x80)
    SWI_ManipulateObject(unsigned operationNumber, unsigned object,
                         unsigned parameter);

unsigned DoSelectedManipulation(unsigned object, unsigned parameter,
                                unsigned operation)
{ return SWI_ManipulateObject(operation, object, parameter);
}

This code is provided in the examples directory as swimanip.c, and can be 
compiled to produce ARM Assembler source using:

armcc -S -li -apcs 3/32bit swimanip.c -o swimanip.s 

The code which armcc produces is:

DoSelectedManipulation
    MOV    ip,a3
    SWI    &80
    MOV    pc,lr

Please note that the version of armcc supplied with this release may produce 
slightly different output to that listed here.
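
The handler side of a 'generic' SWI is not part of this recipe, but its 
shape can be sketched as a host-side C model.  Everything here is 
hypothetical: the operation codes, the function names, and the choice of 
returning 0 for an unknown operation.  The model only shows the operation 
selector arriving separately from the arguments, as it would in ip.

```c
#include <stdint.h>

/* Hypothetical operation codes, as they would be carried in r12 (ip). */
enum { OP_ADD = 0, OP_SUB = 1, OP_XOR = 2 };

/* Model of a 'generic' SWI handler: ip selects the operation, r0 and
   r1 hold its arguments, and the return value stands for r0 on exit. */
static uint32_t generic_swi_handler(uint32_t ip, uint32_t r0, uint32_t r1)
{
    switch (ip) {
    case OP_ADD: return r0 + r1;
    case OP_SUB: return r0 - r1;
    case OP_XOR: return r0 ^ r1;
    default:     return 0;   /* unknown operation: arbitrary choice */
    }
}

/* Model of the calling side: the operation goes to ip, and the
   remaining arguments go to r0 and r1 in order. */
static uint32_t do_selected_manipulation(uint32_t object,
                                         uint32_t parameter,
                                         uint32_t operation)
{
    return generic_swi_handler(operation, object, parameter);
}
```

A call such as do_selected_manipulation(10, 3, OP_SUB) then returns 7.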

4.3.7 Related Topics
--------------------
      Register Usage under the ARM Procedure Call Standard starting on page 62;
      Passing and Returning structs starting on page 67;
      C Programming for Deeply Embedded Applications starting on page 87 for 
      example programs which make use of in-line SWIs.
