
Chapter 18. Modifying Clozure CL

18.4. Using AltiVec in Clozure CL LAP functions

18.4.1. Overview

It's now possible to use AltiVec instructions in PPC LAP (assembler) functions.

The lisp kernel detects the presence or absence of AltiVec and preserves AltiVec state on lisp thread switch and in response to exceptions, but the implementation doesn't otherwise use vector operations.

This section doesn't cover PPC LAP programming in general. Ideally, some other document would.

It does explain the AltiVec register-usage conventions in Clozure CL and the use of some LAP macros that help to enforce those conventions.

All of the global symbols described below are exported from the CCL package. Note that LAP macro names, PPC instruction names, and (in most cases) register names are treated as strings, so this applies only to function and global variable names.

Much of the Clozure CL support for AltiVec LAP programming is based on work contributed to MCL by Shannon Spires.

18.4.2. Register usage conventions

Clozure CL LAP functions that use AltiVec instructions must interoperate with each other and with C functions, which suggests that they should follow the C AltiVec register-usage conventions: vr0-vr1 are scratch registers, vr2-vr13 hold parameters and the return value, vr14-vr19 are temporaries, and vr20-vr31 are callee-saved non-volatile registers.
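
The register classes listed above can be summarized in a small lookup helper. The following is a minimal sketch in portable Common Lisp, not LAP; the names VECTOR-REGISTER-CLASS and NON-VOLATILE-VECTOR-REGISTER-P are hypothetical and are not part of Clozure CL.

```lisp
;;; Hypothetical helpers (not part of Clozure CL) that encode the
;;; AltiVec register-usage conventions described in this section.

(defun vector-register-class (n)
  "Return the usage class of vector register number N (0-31)."
  (check-type n (integer 0 31))
  (cond ((<= n 1)  :scratch)              ; vr0-vr1
        ((<= n 13) :parameter-or-return)  ; vr2-vr13
        ((<= n 19) :temporary)            ; vr14-vr19
        (t         :non-volatile)))       ; vr20-vr31, callee-saved

(defun non-volatile-vector-register-p (n)
  "True if vector register N must be saved before being assigned to."
  (eq (vector-register-class n) :non-volatile))
```

For example, (vector-register-class 27) returns :NON-VOLATILE, and (remove-if-not #'non-volatile-vector-register-p '(1 2 3 27)) returns (27), the one register in that list that a callee would have to save and restore.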

The EABI (Embedded Application Binary Interface) used in LinuxPPC doesn't ascribe particular significance to the vrsave special-purpose register; on other platforms (notably MacOS), it's used as a bitmap which indicates to system-level code which vector registers contain meaningful values.
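
To make the bitmap idea concrete, here is a rough sketch in portable Common Lisp of how a VRSAVE mask could be computed for a set of registers. It assumes the usual PowerPC bit numbering, in which bit 0 (the most significant bit of the 32-bit register) corresponds to vr0 and bit 31 to vr31. The function name VRSAVE-MASK is hypothetical and is not part of Clozure CL.

```lisp
;;; Hypothetical helper (not part of Clozure CL).
;;; Assumes VRSAVE bit 0 is the most significant bit and marks vr0.

(defun vrsave-mask (register-numbers)
  "Return a 32-bit mask with one bit set per vector register number."
  (reduce #'logior register-numbers
          :key (lambda (n)
                 (check-type n (integer 0 31))
                 (ash 1 (- 31 n)))       ; vrN maps to bit (31 - N) from the LSB
          :initial-value 0))

;; Registers vr1, vr2, vr3 and vr27:
;; (format nil "~8,'0X" (vrsave-mask '(1 2 3 27)))  ; => "70000010"
```

Code that marks additional registers as live would, under this assumption, combine such a mask with the current VRSAVE value using a bitwise OR.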

The WITH-ALTIVEC-REGISTERS LAP macro generates code that saves, updates, and restores VRSAVE on platforms where this is required (as indicated by the value of the special variable that controls this behavior) and ignores VRSAVE on platforms that don't require it to be maintained.

On all PPC platforms, it's necessary to save any non-volatile vector registers (vr20 .. vr31) before assigning to them and to restore such registers before returning to the caller.

On platforms that require VRSAVE to be maintained, it's not necessary to mention the "use" of vector registers that are used as incoming parameters, since the caller is assumed to have marked them as live already. Mentioning them in a WITH-ALTIVEC-REGISTERS form isn't incorrect, but it's unnecessary in many interesting cases. One can likewise assume that the caller of any function that returns a vector value in vr2 has already set the appropriate bit in VRSAVE to indicate that this register is live. One could therefore write a leaf function that adds the unsigned bytes in vr3 and vr2 (with saturation, as the vaddubs instruction does) and returns the result in vr2 as:

(defppclapfunction vaddubs ((y vr3) (z vr2))
  (vaddubs z y z)
  (blr))
      

When vector registers that aren't incoming parameters are used in a LAP function, WITH-ALTIVEC-REGISTERS takes care of maintaining VRSAVE and of saving/restoring any non-volatile vector registers:

(defppclapfunction load-array ((n arg_z))
  (check-nargs 1)
  (with-altivec-registers (vr1 vr2 vr3 vr27) ; Clobbers imm0
    (li imm0 arch::misc-data-offset)
    (lvx vr1 arg_z imm0)                ; load MSQ
    (lvsl vr27 arg_z imm0)              ; set the permute vector
    (addi imm0 imm0 16)                 ; address of LSQ
    (lvx vr2 arg_z imm0)                ; load LSQ
    (vperm vr3 vr1 vr2 vr27)           ; aligned result appears in VR3
    (dbg t))                         ; Look at result in some debugger
  (blr))
      

AltiVec registers are not preserved by CATCH and UNWIND-PROTECT. Since AltiVec is only accessible from LAP in Clozure CL and since LAP functions rarely use high-level control structures, this should rarely be a problem in practice.

LAP functions that use non-volatile vector registers and that call other code (Lisp code in particular) which may use CATCH or UNWIND-PROTECT should save those vector registers before such a call and restore them on return. This is one of the intended uses of the WITH-VECTOR-BUFFER LAP macro.

