October 23, 2007 The following paragraph does not apply to the United Kingdom or any country or state where such provisions are inconsistent with local law. The specifications in this manual are subject to change without notice. This manual is provided "AS IS". International Business Machines Corp. makes no warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. International Business Machines Corp. does not warrant that the contents of this publication or the accompanying source code examples, whether individually or as one or more groups, will meet your requirements or that the publication or the accompanying source code examples are error-free. This publication could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. Address comments to IBM Corporation, 11400 Burnett Road, Austin, Texas 78758-3493. IBM may use or distribute whatever information you supply in any way it believes appropriate without incurring any obligation to you. The following terms are trademarks of the International Business Machines Corporation in the United States and/or other countries: IBM® Power ISA PowerPC® Power Architecture PowerPC Architecture Power Family RISC/System 6000® POWER POWER2 POWER4 POWER4+ POWER5 System/370 System z The POWER ARCHITECTURE and POWER.ORG. word marks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org. AltiVec is a trademark of Freescale Semiconductor, Inc. used under license. Notice to U.S. Government Users—Documentation Related to Restricted Rights—Use, duplication or disclosure is subject to restrictions set fourth in GSA ADP Schedule Contract with IBM Corporation. © Copyright International Business Machines Corporation, 1994, 2007. All rights reserved. # **Preface** The roots of the Power ISA (Instruction Set Architecture) extend back over a quarter of a century, to IBM Research. The POWER (Performance Optimization With Enhanced RISC) Architecture was introduced with the RISC System/6000 product family in early 1990. In 1991, Apple, IBM, and Motorola began the collaboration to evolve to the PowerPC Architecture, expanding the architecture's applicability. In 1997, Motorola and IBM began another collaboration, focused on optimizing PowerPC for embedded systems, which produced Book E. In 2006, Freescale and IBM collaborated on the creation of the Power ISA Version 2.03, which represented the reunification of the architecture by combining Book E content with the more general purpose PowerPC Version 2.02. A significant benefit of the reunification is the establishment of a single, compatible, 64-bit programming model. The combining also extends explicit architectural endorsement and control to Auxiliary Processing Units (APUs), units of function that were originally developed as implementation- or product family-specific extensions in the context of the Book E allocated opcode space. With the resulting architectural superset comes a framework that clearly establishes requirements and identifies options. To a very large extent, application program compatibility has been maintained throughout the history of the architecture, with the main exception being application exploitation of APUs. The framework identifies the base, pervasive, part of the architecture, and differentiates it from "categories" of optional function (see Section 1.3.5 of Book I). Because of the substantial differences in the supervisor (privileged) architecture that developed as Book E was optimized for embedded systems, the supervisor architectures for embedded and general purpose implementations are represented as mutually exclusive categories. Future versions of the architecture will seek to converge on a common solution where possible. This document defines the Power ISAversion 2.05. It is comprised of five books and a set of appendices. Book I, *Power ISA User Instruction Set Architecture*, covers the base instruction set and related facilities available to the application programmer. It includes five chapters derived from APU function, including the vector extension also known as Altivec. Book II, *Power ISA Virtual Environment Architecture*, defines the storage model and related instructions and facilities available to the application programmer. Book III-S, *Power ISA Operating Environment Architecture*, defines the supervisor instructions and related facilities used for general purpose implementations. Book III-E, *Power ISA Operating Environment Architecture*, defines the supervisor instructions and related facilities used for embedded implementations. It was derived from Book E and extended to include APU function. Book VLE, *Power ISA Variable Length Encoded Instructions Architecture*, defines alternative instruction encodings and definitions intended to increase instruction density for very low end implementations. It was derived from an APU description developed by Freescale Semiconductor. As used in this document, the term "Power ISA" refers to the instructions and facilities described in Books I, II, III-S. III-E. and VLE. Usage of the phrase "Book III" refers to both Book III-S and Book III-E. An exception to this rule is when, at the beginning of a Section or Book, it is specified that usage of the phrase "Book III" implies only either "Book III-S" or "Book III-E". Change bars have been included to indicate changes from the Power ISA Version 2.04. # **Summary of Changes in Power ISA 2.05** The PowerISA was created by applying the following requests for change(RFCs) to Power ISA version 2.04. <u>Power Management Architecture:</u> Four new hypervisor-level instructions are added that put the processor into power-saving modes in which execution is suspended and power consumption is reduced to varying degrees; see Chapter 3.3.2 of Book III-S. <u>Decimal Floating-Point</u>: Decimal Floating-Point (DFP) support is added as a category to the architecture; see Chapter 5 of Book I - III. <u>PCR (Program Compatibility Register)</u>: A hypervisor-accessible register is added that controls the availability of processor resources not available on implementations of previous versions of the architecture; see Section 2.6 of Book III-S. The next 2 RFCs facilitate Decimal Floating Point emulation, as an alternative to supporting the instructions described in RFC02080 in hardware. Binary Coded Decimal Assist (BCD) Instructions: Three new hypervisor-level instructions are aded that operate on Binary Coded Decimal operands; see Section 4.4.3 of Book III-S. <u>Hypervisor Emulation Assistance</u>: Illegal Instruction type Program interrupts are routed to a new interrupt (Hypervisor Emulation Assistance interrupt), which goes to the hypervisor, and the instruction image is copied into a new register (HEIR); see Section 6.5.19 of Book III-S. <u>Changes to mtspr and mfspr</u>: The behavior of the *mtspr* and *mfspr* instructions is defined for when the specified SPR is inaccessable due to the privilege level; see Section 4.4.5 of Book III-S. <u>Load Floating-Point as Integer Word</u>: A new instruction is added that loads the specified word into the low-order half of an FPR, and propagates the sign bit to fill the high-order half of the FPR; see Section 4.6.2 of Book I. <u>FPSCR</u> extended to 64 bits: The FPSCR is extended to 64 bits to accomodate an anticipated need for more floating-point status and control bits. The *mffs*, *mtfsfi*, and *mtfsf* instructions are extended to provide a means of managing the extended FPSCR; see Section 4.2.2 of Book I. <u>Floating-Point Copy Sign</u>: A new instruction is added that combines the sign from one register with the rest of the floating-point number in another register, and the result is placed in the target register. This instruction can be used for building a floating-point number efficiently (without bitwise manipulation); see Section 4.6.5 of Book I. <u>Parity Instructions</u>: Two new instructions are added that compute the parity on a word and a doubleword; see Section 3.3.12 of Book I. <u>Compare Byte Instruction:</u> A new instruction is added that compares each byte of a register to a byte-sized token: see Section 3.3.12 of Book I. <u>Come-From Address Register</u>: A new hypervisor-accessible register is added that is set to the effective address of the *rfid* instruction upon execution of the instruction. When a *Branch* instruction is executed and the branch is taken, the register is set to the effective address of an instruction in the instruction cache block containing the *Branch* instruction; see Section 8.1.1 of Book III-S. <u>Floating-Point Estimate</u>: Additional language is added to allow for implementations that provide higher than the minimum architected precision. (To cover current implementations, a new variant is introduced as category: Phased-Out that allows denormalized operands to be treated as 0, however this variant will be removed in the next revision.) See Section 4.6.6.1 of Book I. Load/Store Floating-Point Double Pair: New instructions are added that transfer pairs of doublewords between adjacent locations in memory and adjacent floating-point registers. These instructions are categorized as category: Phased-Out so that software does not develop a dependency on them. See Section 4.6.1 of Book I. <u>Miscellaneous Changes for V 2.05</u>: Miscellaneous primarily editorial enhancements. <u>Disable Secondary Page Table Search</u>: A new field is added to the LPCR to disable the secondary hash function during a page table search; see Section 5.7.7.3 of Book III-S. <u>SLB Find Entry ESID Instruction</u>: A new instruction is added that searches the SLB for an entry that matches the ESID specified by a GPR operand; see Section 5.9.3.1 of Book III-S. <u>Executed no-op Instruction</u>: The instruction *xori 0,0,0* is designated as a form of no-op that is excluded from run-time optimizations related to no-ops; see Section 3.3.12 of Book I. <u>Relaxed Page Table Alignment:</u> The option is provided to align the Page Table at any 2<sup>18</sup>-byte boundary instead of at a boundary that is a multiple of its size; see Section 5.7.7.4 of Book III-S. <u>Data Cache Block Flush Local Primary</u>: A new variant of the *dcbf* instruction is added that flushes the specified block from the local primary cache, but not from lower level caches; see Section 3.3.2 of Book II. <u>Reserved no-op</u>: Book E reserved-no-op instructions are added to Power ISA as category: Phased-In. These instructions are intended to be redefined as performance hint type instructions in the future while treated as no-ops in earlier processors; see Section 1.8.3 of Book I. <u>Stream Prefetching Extensions</u>: The ability to specify a default prefetch depth for hardware-detected and software-specified streams is added. Also, software-specification of store data streams is introduced; see Section 3.3.2 of Book II. <u>Mutex Hint:</u> A hint specification is added to the *Load And Reserve* instructions to indicate the type of mutual exclusion algorithm represented by the corresponding sequence of instructions; see Section 3.4.2 of Book II. <u>Enhanced Lookaside Buffer Management</u>Hint bits are added to the *slbia* instruction for limiting the invalidation of implementation-specific lookaside information (e.g. ERAT); see Section 5.9.3.1 of Book III-S. Caching Inhibited Load/Store Instructions (Hypervisor Only): The RMI bit in the LPCR is redefined as reserved. Eight new hypervisor-level instructions are added to replace the function previously provided by the RMI bit: The storage accesses caused by the new instructions are performed as though the specified stor- age location is Caching Inhibited and Guarded; see Section 4.4.1 of Book III-S. <u>Hypervisor Maintenance Interrupt</u>: A new type of interrupt is added that is caused by certain conditions in the hardware requiring the attention of the Hypervisor but that are not serious enough to require a Machine Check; see Section 6.5.20 of Book III-S. Mediated External Interrupt: A new type of External exception is added, called the "Mediated External exception". The currently defined External exception is renamed to be a "Direct External exception". A new bit, called the "Mediated External Exception Request" (MER) bit, is added to the LPCR, to indicate that a Mediated External exception is requested; see Section 6.5.7 of Book III-S. Scaled Processor Utilization of Resources (SPURR): A new SPR is added that measures the fraction of hardware resources used by a processor (as does the PURR), but takes into account changes in processing capacity made to help manage the thermal environment; see Section 7.6 of Book III-S. #### **Version Verification** See the Power ISA representative for your company. # **Table of Contents** | | 1.6.8 XL-FORM | |----------------------------------------------|---------------------------------------------| | | 1.6.9 XFX-FORM | | | 1.6.10 XFL-FORM | | | 1.6.11 XS-FORM | | | 1.6.12 XO-FORM | | Prefaceiii | 1.6.13 A-FORM | | Summary of Changes in Power ISA 2.05 | 1.6.14 M-FORM | | iv | 1.6.15 MD-FORM | | IV | 1.6.16 MDS-FORM | | Table of Contents vii | 1.6.17 VA-FORM | | Table of Contentsvii | 1.6.18 VC-FORM | | | 1.6.19 VX-FORM17 | | Figuresxxiii | 1.6.20 EVX-FORM | | | 1.6.21 EVS-FORM | | Book I: | 1.6.22 Z22-FORM | | | 1.6.23 Z23-FORM | | Power ISA User Instruction Set | 1.6.24 Instruction Fields 18 | | | 1.7 Classes of Instructions 21 | | Architecture1 | 1.7.1 Defined Instruction Class 21 | | | 1.7.2 Illegal Instruction Class 21 | | Chapter 1. Introduction 3 | 1.7.3 Reserved Instruction Class 21 | | 1.1 Overview | 1.8 Forms of Defined Instructions 21 | | 1.2 Instruction Mnemonics and Operands3 | 1.8.1 Preferred Instruction Forms 21 | | 1.3 Document Conventions 4 | 1.8.2 Invalid Instruction Forms 21 | | 1.3.1 Definitions 4 | 1.8.3 Reserved-no-op Instructions [Cate- | | 1.3.2 Notation 4 | gory: Phased-In (sV2.07)] 22 | | 1.3.3 Reserved Fields and Reserved Val- | 1.9 Exceptions | | ues 5 | 1.10 Storage Addressing | | 1.3.4 Description of Instruction Operation 7 | 1.10.1 Storage Operands 23 | | 1.3.5 Categories 9 | 1.10.2 Instruction Fetches | | 1.3.5.1 Phased-In/Phased-Out 10 | 1.10.3 Effective Address Calculation 26 | | 1.3.5.2 Corequisite Category 10 | | | 1.3.5.3 Category Notation 10 | Chapter 2. Branch Processor29 | | 1.3.6 Environments | 2.1 Branch Processor Overview 29 | | 1.4 Processor Overview 12 | 2.2 Instruction Execution Order 29 | | 1.5 Computation modes 14 | 2.3 Branch Processor Registers 30 | | 1.5.1 Modes [Category: Server] 14 | 2.3.1 Condition Register 30 | | 1.5.2 Modes [Category: Embedded] 14 | 2.3.2 Link Register | | 1.6 Instruction formats | 2.3.3 Count Register | | 1.6.1 I-FORM | 2.4 Branch Instructions 31 | | 1.6.2 B-FORM | 2.5 Condition Register Instructions 37 | | 1.6.3 SC-FORM | 2.5.1 Condition Register Logical Instruc- | | 1.6.4 D-FORM | tions | | 1.6.5 DS-FORM | 2.5.2 Condition Register Field Instruction. | | 1.6.6 DQ-FORM | 38 | | 1.6.7 X-FORM 16 | 2.6 System Call Instruction 39 | | Chapter 3. Fixed-Point Processor . 41 | 3.3.14.2 Move To/From System Registers | |-----------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------| | 3.1 Fixed-Point Processor Overview41 | [Category: Embedded] 97 | | 3.2 Fixed-Point Processor Registers 42 | | | 3.2.1 General Purpose Registers 42 | Chapter 4. Floating-Point Processor | | 3.2.2 Fixed-Point Exception Register42 | [Category: Floating-Point] 99 | | 3.2.3 Program Priority Register [Category: | 4.1 Floating-Point Processor Overview 99 | | Server] | 4.2 Floating-Point Processor Registers 100 | | 3.2.4 Software Use SPRs [Category: | 4.2.1 Floating-Point Registers 100 | | Embedded]43 | 4.2.2 Floating-Point Status and Control | | 3.2.5 Device Control Registers | Register101 | | [Category: Embedded] | 4.3 Floating-Point Data | | 3.3.1 Fixed-Point Storage Access Instruc- | 4.3.1 Data Format | | tions | 4.3.2 Value Representation 104 | | 3.3.1.1 Storage Access Exceptions44 | 4.3.3 Sign of Result 105 | | 3.3.2 Fixed-Point Load Instructions 44 | 4.3.4 Normalization and | | 3.3.2.1 64-bit Fixed-Point Load Instruc- | Denormalization | | tions [Category: 64-Bit]49 | 4.3.5 Data Handling and Precision 106<br>4.3.5.1 Single-Precision Operands 106 | | 3.3.3 Fixed-Point Store Instructions51 | | | 3.3.3.1 64-bit Fixed-Point Store Instruc- | 4.3.5.2 Integer-Valued Operands 107<br>4.3.6 Rounding 107 | | tions [Category: 64-Bit]54 | 4.4 Floating-Point Exceptions 108 | | 3.3.4 Fixed-Point Load and Store with Byte | 4.4.1 Invalid Operation Exception 110 | | Reversal Instructions | 4.4.1.1 Definition | | 3.3.5 Fixed-Point Load and Store Multiple | 4.4.1.2 Action | | Instructions | 4.4.2 Zero Divide Exception | | 3.3.6 Fixed-Point Move Assist Instructions | 4.4.2.1 Definition | | [Category: Move Assist] | 4.4.2.2 Action | | 3.3.7 Other Fixed-Point Instructions 61 | 4.4.3 Overflow Exception | | 3.3.8 Fixed-Point Arithmetic Instructions 62 | 4.4.3.1 Definition | | 3.3.8.1 64-bit Fixed-Point Arithmetic | 4.4.3.2 Action | | Instructions [Category: 64-Bit] 69 | 4.4.4 Underflow Exception 112 | | 3.3.9 Fixed-Point Compare Instructions.71 | 4.4.4.1 Definition | | 3.3.10 Fixed-Point Trap Instructions 73 | 4.4.4.2 Action | | 3.3.10.1 64-bit Fixed-Point Trap Instruc- | 4.4.5 Inexact Exception 113 | | tions [Category: 64-Bit]74 | 4.4.5.1 Definition | | 3.3.11 Fixed-Point Select [Category: | 4.4.5.2 Action | | Phased-In (sV2.06)] | 4.5 Floating-Point Execution Models . 113 | | 3.3.12 Fixed-Point Logical Instructions .75 | 4.5.1 Execution Model for IEEE Opera- | | 3.3.12.1 64-bit Fixed-Point Logical Instructions [Category: 64-Bit]81 | tions | | 3.3.12.2 Phased-In Fixed-Point Logical | 4.5.2 Execution Model for | | Instructions [Category: Phased-In | Multiply-Add Type Instructions 115 | | (sV2.05)]81 | 4.6 Floating-Point Processor Instructions. | | 3.3.13 Fixed-Point Rotate and Shift | 116 | | Instructions | 4.6.1 Floating-Point Storage Access | | 3.3.13.1 Fixed-Point Rotate Instructions 82 | Instructions | | 3.3.13.1.1 64-bit Fixed-Point Rotate | 4.6.1.1 Storage Access Exceptions 117 | | Instructions [Category: 64-Bit] 85 | <ul><li>4.6.2 Floating-Point Load Instructions 117</li><li>4.6.3 Floating-Point Store Instructions 121</li></ul> | | 3.3.13.2 Fixed-Point Shift Instructions88 | 4.6.4 Floating-Point Load Store Double- | | 3.3.13.2.1 64-bit Fixed-Point Shift Instruc- | word Pair Instructions [Category: Floating- | | tions [Category: 64-Bit]90 | Point.Phased-Out] | | 3.3.14 Move To/From System Register | 4.6.5 Floating-Point Move Instructions 126 | | Instructions | 4.6.6 Floating-Point Arithmetic Instructions | | 3.3.14.1 Move to/From One Condition | 127 | | Register Field Instructions [Category: | 4.6.6.1 Floating-Point Elementary Arith- | | Phased-In (sV2.05)] | metic Instructions | | 4.6.6.2 Floating-Point Multiply-Add Instruc- | 5.5.11 Summary of Normal Rounding And | |-----------------------------------------------------------------------|---------------------------------------------| | tions | Range Actions 160 | | 4.6.7 Floating-Point Rounding and Con- | 5.6 DFP Instruction Descriptions 162 | | version Instructions | 5.6.1 DFP Arithmetic Instructions 163 | | 4.6.7.1 Floating-Point Rounding Instruc- | 5.6.2 DFP Compare Instructions 167 | | tion | 5.6.3 DFP Test Instructions170 | | 4.6.7.2 Floating-Point Convert To/From | 5.6.4 DFP Quantum Adjustment Instruc- | | Integer Instructions 134 | tions | | 4.6.7.3 Floating Round to Integer Instruc- | 5.6.5 DFP Conversion Instructions 182 | | tions [Category: Floating-Point.Phased-In | 5.6.5.1 DFP Data-Format Conversion | | (sV2.05)] | Instructions | | 4.6.8 Floating-Point Compare Instructions | 5.6.5.2 DFP Data-Type Conversion | | 138 | Instructions | | 4.6.9 Floating-Point Select Instruction 139 | 5.6.6 DFP Format Instructions 187 | | 4.6.10 Floating-Point Status and Control | 5.6.7 DFP Instruction Summary 191 | | Register Instructions | 3.0.7 Did instruction Summary 191 | | Register instructions | Chapter 6 Vector Processor | | Chanter F. Desimal Electing Beint | Chapter 6. Vector Processor | | Chapter 5. Decimal Floating-Point | [Category: Vector] 193 | | [Category: Decimal Floating-Point] | 6.1 Vector Processor Overview 194 | | 143 | 6.2 Chapter Conventions 194 | | 5.1 Decimal Floating-Point (DFP) Proces- | 6.2.1 Description of Instruction Operation. | | sor Overview | 194 | | 5.2 DFP Register Handling 144 | 6.3 Vector Processor Registers 195 | | 5.2.1 DFP Usage of Floating-Point Regis- | 6.3.1 Vector Registers 195 | | ters | 6.3.2 Vector Status and Control Register. | | 5.3 DFP Support for Non-DFP Data Types | 195 | | 146 | 6.3.3 VR Save Register 196 | | 5.4 DFP Number Representation 147 | 6.4 Vector Storage Access Operations 196 | | 5.4.1 DFP Data Format | 6.4.1 Accessing Unaligned Storage Oper- | | 5.4.1.1 Fields Within the Data Format 148 | ands198 | | 5.4.1.2 Summary of DFP Data Formats | 6.5 Vector Integer Operations 199 | | 149 | 6.5.1 Integer Saturation 199 | | 5.4.1.3 Preferred DPD Encoding 149 | 6.6 Vector Floating-Point Operations . 200 | | 5.4.2 Classes of DFP Data | 6.6.1 Floating-Point Overview 200 | | 5.5 DFP Execution Model | 6.6.2 Floating-Point Exceptions 200 | | 5.5.1 Rounding | 6.6.2.1 NaN Operand Exception 201 | | 5.5.2 Rounding Mode Specification 151 | 6.6.2.2 Invalid Operation Exception 201 | | 5.5.3 Formation of Final Result 152 | 6.6.2.3 Zero Divide Exception 201 | | 5.5.3.1 Use of Ideal Exponent 152 | 6.6.2.4 Log of Zero Exception 201 | | 5.5.4 Arithmetic Operations 152 | 6.6.2.5 Overflow Exception 201 | | 5.5.4.1 Sign of Arithmetic Result 152 | 6.6.2.6 Underflow Exception 202 | | 5.5.5 Compare Operations | 6.7 Vector Storage Access Instructions | | 5.5.6 Test Operations | 202 | | 5.5.7 Quantum Adjustment Operations 153 | 6.7.1 Storage Access Exceptions 202 | | | 6.7.2 Vector Load Instructions 203 | | 5.5.8 Conversion Operations 153<br>5.5.8.1 Data-Format Conversion 153 | 6.7.3 Vector Store Instructions 206 | | | 6.7.4 Vector Alignment Support Instruc- | | 71 | tions | | 5.5.9 Format Operations | 6.8 Vector Permute and Formatting | | 5.5.10 DFP Exceptions | Instructions | | 5.5.10.1 Invalid Operation Exception . 156 | 6.8.1 Vector Pack and Unpack Instructions | | 5.5.10.2 Zero Divide Exception 157 | 209 | | 5.5.10.3 Overflow Exception | 6.8.2 Vector Merge Instructions 214 | | 5.5.10.4 Underflow Exception 158 | 6.8.3 Vector Splat Instructions 214 | | 5.5.10.5 Inexact Exception 159 | 6.8.4 Vector Permute Instruction 217 | | | 6.8.5 Vector Select Instruction 217 | | | 0.0.3 VEGIOI SEIEGI IIISII UGIIOII 217 | | 6.8.6 Vector Shift Instructions 218 | 7.3.8 Saturation, Shift, and Bit Reverse | |----------------------------------------------|-------------------------------------------| | 6.9 Vector Integer Instructions | Models | | 6.9.1 Vector Integer Arithmetic Instructions | 7.3.8.1 Saturation 267 | | 220 | 7.3.8.2 Shift Left 267 | | 6.9.1.1 Vector Integer Add Instructions 220 | 7.3.8.3 Bit Reverse 267 | | 6.9.1.2 Vector Integer Subtract Instruc- | 7.3.9 SPE Instruction Set 268 | | tions | | | 6.9.1.3 Vector Integer Multiply Instructions | Chapter 8. Embedded Floating-Point | | 226 | | | 6.9.1.4 Vector Integer Multiply-Add/Sum | [Category: SPE.Embedded Float Scal | | - · · · · · · · · · · · · · · · · · · · | ar Double] | | Instructions | <del>-</del> | | 6.9.1.5 Vector Integer Sum-Across Instruc- | [Category: SPE.Embedded Float Scal | | tions | ar Single] | | 6.9.1.6 Vector Integer Average Instruc- | [Category: SPE.Embedded Float Vect | | tions | or] | | 6.9.1.7 Vector Integer Maximum and Mini- | - | | mum Instructions237 | 8.1 Overview | | 6.9.2 Vector Integer Compare Instructions | 8.2 Programming Model 316 | | 241 | 8.2.1 Signal Processing Embedded Float- | | 6.9.3 Vector Logical Instructions 244 | ing-Point Status and Control Register | | 6.9.4 Vector Integer Rotate and Shift | (SPEFSCR) | | Instructions | 8.2.2 Floating-Point Data Formats 316 | | 6.10 Vector Floating-Point Instruction Set . | 8.2.3 Exception Conditions 317 | | 249 | 8.2.3.1 Denormalized Values on Input 317 | | 6.10.1 Vector Floating-Point Arithmetic | 8.2.3.2 Embedded Floating-Point Overflow | | Instructions | and Underflow | | 6.10.2 Vector Floating-Point Maximum and | 8.2.3.3 Embedded Floating-Point Invalid | | Minimum Instructions | Operation/Input Errors 317 | | 6.10.3 Vector Floating-Point Rounding and | 8.2.3.4 Embedded Floating-Point Round | | Conversion Instructions | (Inexact) | | 6.10.4 Vector Floating-Point Compare | 8.2.3.5 Émbedded Floating-Point Divide | | Instructions | by Zero | | 6.10.5 Vector Floating-Point Estimate | 8.2.3.6 Default Results | | Instructions | 8.2.4 IEEE 754 Compliance 318 | | 6.11 Vector Status and Control Register | 8.2.4.1 Sticky Bit Handling For Exception | | Instructions | Conditions | | 111311 40110113 | 8.3 Embedded Floating-Point Instructions | | Chapter 7 Signal Brossesing Engine | 319 | | Chapter 7. Signal Processing Engine | 8.3.1 Load/Store Instructions 319 | | (SPE) | 8.3.2 SPE.Embedded Float Vector Instruc- | | [Category: Signal Processing Engine | tions [Category: SPE.Embedded Float | | ]261 | Vector] | | <b>-</b> | 8.3.3 SPE.Embedded Float Scalar Single | | 7.1 Overview | Instructions | | 7.2 Nomenclature and Conventions 261 | [Category: SPE.Embedded Float Scalar | | 7.3 Programming Model | Single] | | 7.3.1 General Operation | 8.3.4 SPE.Embedded Float Scalar Double | | 7.3.2 GPR Registers | Instructions | | 7.3.3 Accumulator Register | [Category: SPE.Embedded Float Scalar | | 7.3.4 Signal Processing Embedded Float- | Double] | | ing-Point Status and Control Register | 8.4 Embedded Floating-Point Results | | (SPEFSCR) | Summary344 | | 7.3.5 Data Formats | Julillary | | 7.3.5.1 Integer Format | | | 7.3.5.2 Fractional Format265 | | | 7.3.6 Computational Operations 266 | | | 7.3.7 SPE Instructions | | | Chapter 9. Legacy Move Assist Instruction [Category: Legacy Move Assist] | <ul> <li>C.5 Convert to Single-Precision Embedded Floating-Point from Integer Word. 381</li> <li>C.6 Convert to Double-Precision Embedded Floating-Point from Integer Word. 381</li> <li>C.7 Convert to Double-Precision Embedded Floating-Point from Integer Double-</li> </ul> | |---------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Accumulate Instructions [Category: Legacy Integer Multiply- | word382 | | Accumulate] | Appendix D. Assembler Extended Mnemonics | | Appendix A. Suggested Floating- Point Models [Category: Floating- Point] | D.1 Symbols | | Appendix C. Embedded Floating-<br>Point RTL Functions | Appendix E. Programming Examples 397 | | [Category: SPE.Embedded Float Scalar Double] [Category: SPE.Embedded Float Scalar Single] [Category: SPE.Embedded Float Vector] | E.1 Multiple-Precision Shifts | | E.2.6 Conversion from Signed Fixed-Point | Chapter 2. Effect of Operand | |-------------------------------------------------------------------------|----------------------------------------------------------------------------------| | Integer Doubleword to Floating-Point Num- | Placement on Performance 421 | | ber401 E.2.7 Conversion from Unsigned Fixed- | 2.1 Instruction Restart 423 | | Point Integer Doubleword to Floating-Point | | | Number401 | Chapter 3. Storage Control | | E.2.8 Conversion from Signed Fixed-Point | Instructions 425 | | Integer Word to Floating-Point Number 401 | 3.1 Parameters Useful to Application Pro- | | E.3 Floating-Point Selection [Category: | grams | | Floating-Point]402 | 3.2 Data Stream Control Register (DSCR) | | E.3.1 Comparison to Zero | [Category: Stream] | | E.3.2 Minimum and Maximum | 3.3 Cache Management Instructions . 427 3.3.1 Instruction Cache Instructions 428 | | Constructions | 3.3.2 Data Cache Instructions 429 | | E.3.4 Notes | 3.3.2.1 Obsolete Data Cache Instructions | | E.4 Vector Unaligned Storage Operations | [Category: Vector.Phased-Out] 437 | | [Category: Vector]403 | 3.4 Synchronization Instructions 440 | | E.4.1 Loading a Unaligned Quadword | 3.4.1 Instruction Synchronize Instruction . | | Using Permute from Big-Endian Storage | 440 | | 403 | 3.4.2 Load and Reserve and Store Condi- | | Book II: | tional Instructions | | BOOK II. | Conditional Instructions [Category: 64-Bit] | | B 104 W 4 1 E 1 | 444 | | Power ISA Virtual Environment | 3.4.3 Memory Barrier Instructions 446 | | Architecture 405 | 3.4.4 Wait Instruction 449 | | 01 4 64 04 11 40 | OL | | Chapter 1. Storage Model 407 | Chapter 4. Time Base 451 | | 1.1 Definitions | 4.1 Time Base Overview | | 1.2 Introduction | 4.2 Time Base | | 1.4 Single-copy Atomicity | 4.3 Alternate Time Base [Category: Alter- | | 1.5 Cache Model | nate Time Base] 454 | | 1.6 Storage Control Attributes 410 | • | | 1.6.1 Write Through Required 410 | Chapter 5. External Control | | 1.6.2 Caching Inhibited | [Category: External Control] 455 | | 1.6.3 Memory Coherence Required [Cate- | 5.1 External Access Instructions 456 | | gory: Memory Coherence] 411<br>1.6.4 Guarded | | | 1.6.5 Endianness [Category: Embed- | Appendix A. Assembler Extended | | ded.Little-Endian] | Mnemonics 457 | | 1.6.6 Variable Length Encoded (VLE) | A.1 Data Cache Block Flush Mnemonics. | | Instructions | 457 | | 1.7 Shared Storage | A.2 Synchronize Mnemonics 457 | | 1.7.1 Storage Access Ordering413 1.7.2 Storage Ordering of I/O Accesses | Ammandia D. Dramanania a Evananta | | 415 | Appendix B. Programming Examples | | 1.7.3 Atomic Update | for Sharing Storage 459 | | 1.7.3.1 Reservations | B.1 Atomic Update Primitives 459 | | 1.7.3.2 Forward Progress 417 | B.2 Lock Acquisition and Release, and | | 1.8 Instruction Storage417 | Related Techniques 461 B.2.1 Lock Acquisition and Import Barriers | | 1.8.1 Concurrent Modification and Execu- | 461 | | tion of Instructions419 | B.2.1.1 Acquire Lock and Import Shared | | | Storage | | | | | B.2.1.2 Obtain Pointer and Import Shared Storage | 3.3.2.1 Entering and Exiting Power-Saving Mode | |--------------------------------------------------|------------------------------------------------| | 462 | Chapter 4. Fixed-Point Processor 487 | | B.2.2.1 Export Shared Storage and | 4.1 Fixed-Point Processor Overview 487 | | Release Lock | 4.2 Special Purpose Registers 487 | | B.2.2.2 Export Shared Storage and | 4.3 Fixed-Point Processor Registers 487 | | Release Lock using lwsync 462 | 4.3.1 Processor Version Register 487 | | B.2.3 Safe Fetch | 4.3.2 Processor Identification Register 487 | | B.3 List Insertion | 4.3.3 Control Register 488 | | B.4 Notes | 4.3.4 Program Priority Register 488 | | - · · · · · | 4.3.5 Software-use SPRs 489 | | Book III-S: | 4.4 Fixed-Point Processor Instructions 490 | | | 4.4.1 Fixed-Point Load and Store Caching | | Power ISA Operating Environment | Inhibited Instructions 490 | | Architecture - Server Environment | 4.4.2 Fixed-Point Load and Store Quad- | | | word Instructions [Category: Load/Store | | 465 | Quadword]493 | | | 4.4.3 Binary Coded Decimal (BCD) Assis- | | Chapter 1. Introduction 467 | tance Instructions [Category: BCD Assis- | | 1.1 Overview | tance] | | 1.2 Document Conventions 467 | 4.4.4 OR Instruction 496 | | 1.2.1 Definitions and Notation 467 | 4.4.5 Move To/From System Register | | 1.2.2 Reserved Fields 468 | Instructions | | 1.3 General Systems Overview 468 | | | 1.4 Exceptions 469 | Chapter 5. Storage Control505 | | 1.5 Synchronization 469 | 5.1 Overview 505 | | 1.5.1 Context Synchronization 469 | 5.2 Storage Exceptions 506 | | 1.5.2 Execution Synchronization 469 | 5.3 Instruction Fetch 506 | | | 5.3.1 Implicit Branch 506 | | Chapter 2. Logical Partitioning | 5.3.2 Address Wrapping Combined with | | (LPAR) 471 | Changing MSR Bit SF 506 | | 2.1 Overview 471 | 5.4 Data Access 506 | | 2.2 Logical Partitioning Control Register | 5.5 Performing Operations | | (LPCR) | Out-of-Order | | 2.3 Real Mode Offset Register (RMOR). | 5.6 Invalid Real Address 507 | | 473 | 5.7 Storage Addressing 508 | | 2.4 Hypervisor Real Mode Offset Register | 5.7.1 32-Bit Mode | | (HRMOR) 474 | 5.7.2 Virtualized Partition Memory (VPM) | | 2.5 Logical Partition | Mode | | Identification Register (LPIDR) 474 | 5.7.3 Real And Virtual Real Addressing Modes | | 2.6 Processor Compatibility Register | 5.7.3.1 Hypervisor Offset Real Mode | | (PCR) 474 | Address | | 2.7 Other Hypervisor Resources 475 | 5.7.3.2 Offset Real Mode Address 509 | | 2.8 Sharing Hypervisor Resources 476 | 5.7.3.3 Storage Control Attributes for | | 2.9 Hypervisor Interrupt Little-Endian | Accesses in Real and Hypervisor Real | | (HILE) Bit 476 | Addressing Modes510 | | | 5.7.3.3.1 Hypervisor Real Mode Storage | | Chapter 3. Branch Processor 477 | Control510 | | 3.1 Branch Processor Overview 477 | 5.7.3.4 Virtual Real Mode Addressing | | 3.2 Branch Processor Registers 477 | Mechanism510 | | 3.2.1 Machine State Register 477 | 5.7.3.5 Storage Control Attributes for | | 3.3 Branch Processor Instructions 479 | Implicit Storage Accesses 511 | | 3.3.1 System Linkage Instructions 479 | 5.7.4 Address Ranges Having Defined | | 3.3.2 Power-Saving Mode Instructions 481 | Uses | | | | | 5.7.5 Address Translation Overview514 | 6.2.2 Hypervisor Machine Status Save/ | |--------------------------------------------|--------------------------------------------| | 5.7.6 Virtual Address Generation 514 | Restore Registers 548 | | 5.7.6.1 Segment Lookaside Buffer (SLB) . | 6.2.3 Data Address Register 548 | | 514 | 6.2.4 Hypervisor Data Address Register | | 5.7.6.2 SLB Search 515 | 548 | | 5.7.7 Virtual to Real Translation 517 | 6.2.5 Data Storage Interrupt | | 5.7.7.1 Page Table518 | Status Register 548 | | 5.7.7.2 Storage Description | 6.2.6 Hypervisor Data Storage Interrupt | | Register 1 | Status Register 549 | | 5.7.7.3 Page Table Search 520 | 6.2.7 Hypervisor Emulation Instruction | | 5.7.7.4 Relaxed Page Table Alignment | Register [Category: Hypervisor Emula- | | [Category: Server.Relaxed Page Table | tion Assistance] 549 | | Alignment]522 | 6.2.8 Hypervisor Maintenance Exception | | 5.7.8 Reference and Change Recording | Register | | 522 | 6.2.9 Hypervisor Maintenance Exception | | 5.7.9 Storage and Virtual Page Class Key | Enable Register 549 | | Protection | 6.3 Interrupt Synchronization 550 | | 5.7.9.1 Virtual Page Class Key Protection. | 6.4 Interrupt Classes 550 | | 524 | 6.4.1 Precise Interrupt 550 | | 5.7.9.2 Storage Protection, Address | 6.4.2 Imprecise Interrupt 550 | | Translation Enabled | 6.4.3 Interrupt Processing 551 | | 5.7.9.3 Storage Protection, Address | 6.4.4 Implicit alteration of HSRR0 and | | Translation Disabled526 | HSRR1 554 | | 5.8 Storage Control Attributes 527 | 6.5 Interrupt Definitions 555 | | 5.8.1 Guarded Storage527 | 6.5.1 System Reset Interrupt 556 | | 5.8.1.1 Out-of-Order Accesses to Guarded | 6.5.2 Machine Check Interrupt 557 | | Storage527 | 6.5.3 Data Storage Interrupt 559 | | 5.8.2 Storage Control Bits 527 | 6.5.4 Data Segment Interrupt 560 | | 5.8.2.1 Storage Control Bit Restrictions | 6.5.5 Instruction Storage Interrupt 560 | | 528 | 6.5.6 Instruction Segment | | 5.8.2.2 Altering the Storage Control Bits | Interrupt | | 528 | 6.5.7 External Interrupt 561 | | 5.9 Storage Control Instructions 529 | 6.5.8 Alignment Interrupt 562 | | 5.9.1 Cache Management Instructions 529 | 6.5.9 Program Interrupt 563 | | 5.9.2 Synchronize Instruction529 | 6.5.10 Floating-Point Unavailable | | 5.9.3 Lookaside Buffer | Interrupt | | Management | 6.5.11 Decrementer Interrupt 565 | | 5.9.3.1 SLB Management Instructions 530 | 6.5.12 Hypervisor Decrementer | | 5.9.3.2 Bridge to SLB Architecture [Cate- | Interrupt | | gory:Server.Phased-Out]536 | 6.5.13 System Call Interrupt 565 | | 5.9.3.2.1 Segment Register | 6.5.14 Trace Interrupt [Category: Trace] . | | Manipulation Instructions 536 | 565 | | 5.9.3.3 TLB Management Instructions .539 | 6.5.15 Hypervisor Data Storage Inter- | | 5.10 Page Table Update Synchronization | rupt 566 | | Requirements | 6.5.16 Hypervisor Instruction Storage | | 5.10.1 Page Table Updates 543 | Interrupt | | 5.10.1.1 Adding a Page Table Entry544 | 6.5.17 Hypervisor Data Segment Inter- | | 5.10.1.2 Modifying a Page Table Entry 545 | rupt 567 | | 5.10.1.3 Deleting a Page Table Entry .546 | 6.5.18 Hypervisor Instruction Segment | | | Interrupt | | Chapter 6. Interrupts 547 | 6.5.19 Hypervisor Emulation Assis- | | 6.1 Overview | tance Interrupt [Category: Hypervisor | | 6.2 Interrupt Registers | Emulation Assistance] 568 | | 6.2.1 Machine Status Save/Restore Regis- | 6.5.20 Hypervisor Maintenance Interrupt. | | ters | 568 | | 6.5.21 Performance Monitor Interrupt [Category: Server.Performance Monitor] | B.2.4 Monitor Mode Control Register A596 B.2.5 Sampled Instruction Address Register | |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 6.7.2 Ordered Exceptions 571<br>6.8 Interrupt Priorities 571 | Extensions599 | | Chapter 7. Timer Facilities | Appendix D. Interpretation of the DSISR as Set by an Alignment Interrupt601 Appendix E. Programming Examples | | 7.3.1 Writing and Reading the Decrementer | E.1 Unsigned Single-PrecisionBCD Arithmetic | | Chapter 8.Debug Facilities5818.1Overview5818.1.1Come-From Address Register5818.1.2Data Address Breakpoint581 | Book III-E: Power ISA Operating Environment Architecture - Embedded | | Chapter 9. External Control [Category: External Control] 583 9.1 External Access Register 583 | Environment | | 9.2 External Access Instructions 583 | Chapter 1. Introduction | | Chapter 10. Synchronization Requirements for Context Alterations 585 | 1.3 Document Conventions 607 1.3.1 Definitions and Notation 607 1.3.2 Reserved Fields 608 1.4 General Systems Overview 608 | | Appendix A. Assembler Extended Mnemonics | 1.5Exceptions | | Appendix B. Example Performance Monitor | Chapter 2. Branch Processor611 2.1 Branch Processor Overview611 2.2 Branch Processor Registers611 2.2.1 Machine State Register611 2.3 Branch Processor Instructions613 2.4 System Linkage Instructions613 | | <ul> <li>B.2.1 Performance Monitor Counter Registers</li> <li>B.2.2 Monitor Mode Control Register 0 594</li> <li>B.2.3 Monitor Mode Control Register 1 596</li> </ul> | Chapter 3. Fixed-Point Processor 617 3.1 Fixed-Point Processor Overview 617 3.2 Special Purpose Registers 617 | | 3.3 Fixed-Point Processor Registers617 | 4.9.2.1 Lock Setting and Clearing 653 | |---------------------------------------------|----------------------------------------------| | 3.3.1 Processor Version Register 617 | 4.9.2.2 Error Conditions 653 | | 3.3.2 Processor Identification Register 617 | 4.9.2.2.1 Overlocking 653 | | 3.3.3 Software-use SPRs618 | 4.9.2.2.2 Unable-to-lock and Unable-to- | | 3.3.4 External Process ID Registers [Cate- | unlock Conditions 654 | | gory: Embedded.External PID]619 | 4.9.2.3 Cache Locking Instructions 655 | | 3.3.4.1 External Process ID Load Context | 4.9.3 Synchronize Instruction 657 | | (EPLC) Register | 4.9.4 Lookaside Buffer | | 3.3.4.2 External Process ID Store Context | Management657 | | (EPSC) Register620 | 4.9.4.1 TLB Management Instructions 658 | | 3.4 Fixed-Point Processor Instructions 621 | · · | | 3.4.1 Move To/From System Register | Chapter 5. Interrupts and Exceptions | | Instructions | 661 | | 3.4.2 External Process ID Instructions | | | [Category: Embedded.External PID]627 | 5.1 Overview | | . 0, | 5.2 Interrupt Registers 662 | | Chapter 4. Storage Control 639 | 5.2.1 Save/Restore Register 0 662 | | 4.1 Storage Addressing 639 | 5.2.2 Save/Restore Register 1 662 | | 4.1 Storage Addressing | 5.2.3 Critical Save/Restore Register 0 663 | | 4.3 Instruction Fetch | 5.2.4 Critical Save/Restore Register 1 663 | | | 5.2.5 Debug Save/Restore Register 0 | | 4.3.1 Implicit Branch640 | [Category: Embedded.Enhanced Debug]. | | 4.3.2 Address Wrapping Combined with | 663 | | Changing MSR Bit CM | 5.2.6 Debug Save/Restore Register 1 | | 4.4 Data Access | [Category: Embedded.Enhanced Debug]. | | 4.5 Performing Operations | 663 | | Out-of-Order | 5.2.7 Data Exception Address Register | | 4.6 Invalid Real Address | 664 | | 4.7 Storage Control | 5.2.8 Interrupt Vector Prefix Register . 664 | | 4.7.1 Storage Control Registers 641 | 5.2.9 Exception Syndrome Register 665 | | 4.7.1.1 Process ID Register | 5.2.10 Interrupt Vector Offset Registers | | 4.7.1.2 Translation Lookaside Buffer641 | 666 | | 4.7.2 Page Identification | 5.2.11 Machine Check Registers 666 | | 4.7.3 Address Translation | 5.2.11.1 Machine Check Save/Restore | | 4.7.4 Storage Access Control 647 | Register 0 667 | | 4.7.4.1 Execute Access | 5.2.11.2 Machine Check Save/Restore | | 4.7.4.2 Write Access | Register 1 667 | | 4.7.4.3 Read Access | 5.2.11.3 Machine Check Syndrome Regis- | | 4.7.4.4 Storage Access Control Applied to | ter | | Cache Management Instructions 647 | 5.2.12 External Proxy Register [Category: | | 4.7.4.5 Storage Access Control Applied to | External Proxy] 667 | | String Instructions | 5.3 Exceptions 668 | | 4.7.5 TLB Management | 5.4 Interrupt Classification 668 | | 4.8 Storage Control Attributes 649 | 5.4.1 Asynchronous Interrupts 668 | | 4.8.1 Guarded Storage | 5.4.2 Synchronous Interrupts 668 | | 4.8.1.1 Out-of-Order Accesses to Guarded | 5.4.2.1 Synchronous, Precise Interrupts . | | Storage | 669 | | 4.8.2 User-Definable | 5.4.2.2 Synchronous, Imprecise Interrupts | | 4.8.3 Storage Control Bits | 669 | | 4.8.3.1 Storage Control Bit Restrictions | 5.4.3 Interrupt Classes 669 | | 650 | 5.4.4 Machine Check Interrupts 669 | | 4.8.3.2 Altering the Storage Control Bits | 5.5 Interrupt Processing 670 | | 651 | 5.6 Interrupt Definitions 672 | | 4.9 Storage Control Instructions 652 | 5.6.1 Critical Input Interrupt 674 | | 4.9.1 Cache Management Instructions 652 | 5.6.2 Machine Check Interrupt 674 | | 4.9.2 Cache Locking [Category: Embed- | 5.6.3 Data Storage Interrupt 675 | | ded Cache Locking] | 5.6.4 Instruction Storage Interrupt 676 | | | | | <ul> <li>5.6.5 External Input Interrupt 676</li> <li>5.6.6 Alignment Interrupt 677</li> <li>5.6.7 Program Interrupt 678</li> <li>5.6.8 Floating-Point Unavailable Interrupt</li> <li>679</li> </ul> | 5.9.1.5 Exception Priorities for Defined Trap Instructions | |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 5.6.9 System Call Interrupt 679 5.6.10 Auxiliary Processor Unavailable Interrupt 679 | Branch Instructions 691 5.9.1.8 Exception Priorities for Defined Return From Interrupt Instructions 691 | | 5.6.11 Decrementer Interrupt 680 5.6.12 Fixed-Interval Timer Interrupt 680 5.6.13 Watchdog Timer Interrupt 680 | 5.9.1.9 Exception Priorities for Other Defined Instructions | | 5.6.14 Data TLB Error Interrupt 681<br>5.6.15 Instruction TLB Error Interrupt . 681<br>5.6.16 Debug Interrupt 682 | Instructions | | 5.6.17 SPE/Embedded Floating-Point/ Vector Unavailable Interrupt [Categories: SPE.Embedded Float Scalar | 693 6.1 Background 693 6.2 Reset Mechanisms 693 | | Double, SPE.Embedded Float Vector,<br>Vector] | 6.3 Processor State After Reset 693 6.4 Software Initialization Requirements 694 | | Interrupt [Categories: SPE.Embedded Float Scalar Double, SPE.Embedded Float Scalar Sin- | Chapter 7. Timer Facilities 695 | | gle, SPE.Embedded Float Vector] 684 5.6.19 Embedded Floating-Point Round Interrupt [Categories: SPE.Embedded Float Scalar Double, SPE.Embedded Float Scalar Sin- | 7.1 Overview | | gle, SPE.Embedded Float Vector] 684<br>5.6.20 Performance Monitor Interrupt [Category: Embedded.Performance Monitor] 685 | menter | | <ul><li>5.6.21 Processor Doorbell Interrupt [Category: Embedded.Processor Control] 685</li><li>5.6.22 Processor Doorbell Critical Interrupt [Category: Embedded.Processor Control].</li></ul> | 7.5 Timer Control Register 698 7.5.1 Timer Status Register 700 7.6 Fixed-Interval Timer 700 7.7 Watchdog Timer 701 7.0 702 | | <ul><li>685</li><li>5.7 Partially Executed Instructions 686</li><li>5.8 Interrupt Ordering and Masking 687</li><li>5.8.1 Guidelines for System Software 688</li></ul> | 7.8 Freezing the Timer Facilities 702 Chapter 8. Debug Facilities 703 8.1 Overview | | 5.8.2 Interrupt Order | 8.2 Internal Debug Mode | | Instructions 690 5.9.1.1 Exception Priorities for Defined Floating-Point Load and Store Instructions 690 | 8.4 Debug Events | | 5.9.1.2 Exception Priorities for Other Defined Load and Store Instructions and Defined Cache Management Instructions. 690 | 8.4.2 Data Address Compare Debug Event | | <ul><li>5.9.1.3 Exception Priorities for Other</li><li>Defined Floating-Point Instructions 690</li><li>5.9.1.4 Exception Priorities for Defined</li><li>Privileged Instructions 690</li></ul> | 709 8.4.6 Interrupt Taken Debug Event709 8.4.6.1 Causes of Interrupt Taken Debug Events709 | | Description | High728 A.2.1.2 Data Cache Debug Tag Register | |-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------| | 8.4.8 Unconditional Debug Event710 | A.2.1.2 Data Cache Debug Tag Register | | 8.4.8 Unconditional Debug Event710 | | | | Low | | | A.2.1.3 Instruction Cache Debug Data | | [Category: Embedded.Enhanced Debug] | Register | | 710 | A.2.1.4 Instruction Cache Debug Tag Reg- | | 8.4.10 Critical Interrupt Return Debug | ister High | | Event [Category: Embedded.Enhanced | A.2.1.5 Instruction Cache Debug Tag Reg- | | Debug] | ister Low | | | | | 8.5 Debug Registers | A.2.2 Embedded Cache Debug Instruc- | | 8.5.1 Debug Control Registers 711 | tions 730 | | 8.5.1.1 Debug Control Register 0 (DBCR0) | | | 711 | Appendix B. Assembler Extended | | 8.5.1.2 Debug Control Register 1 (DBCR1) | Mnemonics 733 | | 712 | B.1 Move To/From Special Purpose Reg- | | 8.5.1.3 Debug Control Register 2 (DBCR2) | ister Mnemonics 734 | | 714 | ister winemonics | | 8.5.2 Debug Status Register | Annondia C. Cuidelines for C4 bit | | 8.5.3 Instruction Address Compare Regis- | Appendix C. Guidelines for 64-bit | | ters | Implementations in 32-bit Mode and | | 8.5.4 Data Address Compare Registers | 32-bit Implementations 735 | | 716 | C.1 Hardware Guidelines 735 | | 8.5.5 Data Value Compare Registers .717 | C.1.1 64-bit Specific Instructions 735 | | 8.6 Debugger Notify Halt Instruction | C.1.2 Registers on 32-bit Implementations | | [Category: Embedded.Enhanced Debug] | 735 | | 718 | | | | C.1.3 Addressing on 32-bit Implementa- | | Chapter 9. Processor Control | tions | | | C.1.4 TLB Fields on 32-bit Implementa- | | [Category: Embedded Processor | tions | | | | | Control] 719 | C.2 32-bit Software Guidelines 735 | | Control]719 | | | <b>Control]</b> | C.2 32-bit Software Guidelines 735 | | Control] 719 9.1 Overview .719 9.2 Programming Model .719 | C.2 32-bit Software Guidelines 735<br>C.2.1 32-bit Instruction Selection 735 | | Control] | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage | | Control]7199.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control | | Control] 719 9.1 Overview 719 9.2 Programming Model 719 9.2.1 Processor Message Handling and Filtering 719 9.2.1.1 Doorbell Message Filtering 720 | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type | | Control]7199.1 Overview.7199.2 Programming Model.7199.2.1 Processor Message Handling and<br>Filtering.7199.2.1.1 Doorbell Message Filtering.7209.2.1.2 Doorbell Critical Message Filtering | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control | | Control]7199.1 Overview7199.2 Programming Model7199.2.1 Processor Message Handling and<br>Filtering7199.2.1.1 Doorbell Message Filtering7209.2.1.2 Doorbell Critical Message Filtering720 | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] 737 | | Control]7199.1 Overview.7199.2 Programming Model.7199.2.1 Processor Message Handling and<br>Filtering.7199.2.1.1 Doorbell Message Filtering.7209.2.1.2 Doorbell Critical Message Filtering | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | Control]7199.1 Overview7199.2 Programming Model7199.2.1 Processor Message Handling and<br>Filtering7199.2.1.1 Doorbell Message Filtering7209.2.1.2 Doorbell Critical Message Filtering720 | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines 735 C.2.1 32-bit Instruction Selection 735 Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] | | 9.1 Overview | C.2 32-bit Software Guidelines | | 9.1 Overview | C.2 32-bit Software Guidelines | | 9.1 Overview | C.2 32-bit Software Guidelines | | 9.1 Overview | C.2 32-bit Software Guidelines | | 9.1 Overview | C.2 32-bit Software Guidelines | | 9.1 Overview | C.2 32-bit Software Guidelines | | D.2.5.1 MMU Configuration Register (MMUCFG) | ronment759 | |-------------------------------------------------------------------------------------------------------------|--------------------------------------------------| | (TLBnCFG) | Observan A. Wariahla Langth Francisco | | D.2.5.3 MMU Control and Status Register | Chapter 1. Variable Length Encoding | | (MMUCSR0) | Introduction761 | | D.3 Page Identification and Address | 1.1 Overview 761 | | Translation | 1.2 Documentation Conventions 762 | | D.4 TLB Management744 | 1.2.1 Description of Instruction Operation. | | D.4.1 Reading TLB Entries 744 | 762 | | D.4.2 Writing TLB Entries 744 | 1.3 Instruction Mnemonics and Operands | | D.4.3 Invalidating TLB Entries 745 | 762 | | D.4.4 Searching TLB Entries 745 | 1.4 VLE Instruction Formats 762 | | D.4.5 TLB Replacement Hardware Assist 745 | 1.4.1 BD8-form (16-bit Branch Instruc- | | | tions) | | D.5 32-bit and 64-bit Specific MMU Behavior | 1.4.2 C-form (16-bit Control Instructions) . 762 | | D.6 Type FSL MMU Instructions 747 | 1.4.3 IM5-form (16-bit register + immediate | | D.0 Type FOE WINTO MISTROCIONS 141 | Instructions)762 | | Appendix E. Example Performance | 1.4.4 OIM5-form (16-bit register + offset | | | immediate Instructions) 762 | | Monitor [Category: | 1.4.5 IM7-form (16-bit Load immediate | | Embedded Performance Monitor] 751 | Instructions) | | E.1 Overview 751 | 1.4.6 R-form (16-bit Monadic Instructions) | | E.2 Programming Model 751 | 763 | | E.2.1 Event Counting | 1.4.7 RR-form (16-bit Dyadic Instructions) | | E.2.2 Processor Context Configurability | 763 | | 752 | 1.4.8 SD4-form (16-bit Load/Store Instruc- | | E.2.3 Event Selection | tions) | | E.2.4 Thresholds | 1.4.9 BD15-form | | <ul><li>E.2.5 Performance Monitor Exception 753</li><li>E.2.6 Performance Monitor Interrupt . 753</li></ul> | 1.4.10 BD24-form | | E.3 Performance Monitor Registers 753 | 1.4.11 D8-form | | E.3.1 Performance Monitor Global Control | 1.4.12 I16A-form | | Register 0 | 1.4.13 | | E.3.2 Performance Monitor Local Control | 1.4.15 SCI8-form | | A Registers | 1.4.16 LI20-form | | E.3.3 Performance Monitor Local Control | 1.4.17 Instruction Fields 763 | | B Registers | 1.4.17 Illistraction 1 lotas 700 | | E.3.4 Performance Monitor Counter Registers | Chapter 2. VLE Storage Addressing . 767 | | E.5 Performance Monitor Software Usage | 2.1 Data Storage Addressing Modes . 767 | | Notes | 2.2 Instruction Storage Addressing Modes | | E.5.1 Chaining Counters 757 | 768 | | E.5.2 Thresholding 757 | 2.2.1 Misaligned, Mismatched, and Byte | | Ç | Ordering Instruction Storage Exceptions | | Book VLE: | 768 | | | 2.2.2 VLE Exception Syndrome Bits 768 | | Power ISA Operating Environment | Chapter 3. VLE Compatibility with | | Architecture - | Books I–III71 | | | | | | 3.1 Overview | | | 3.2 VLE Processor and Storage Control Extensions | | | 3.2.1 Instruction Extensions | | | J.Z. 1 111311 UUIIUI LAIGII SIUI 13 | | 3.2.2 MMU Extensions | 7.6 External PID 811 7.7 Embedded Performance Monitor . 812 7.8 Processor Control 812 | |------------------------------------------------------------------|---------------------------------------------------------------------------------------| | Chapter 4. Branch Operation | 7.0 7.1000000. 001.1101.1111.111.111.111.111. | | Instructions | Appendix A. VLE Instruction Set | | 4.1 Branch Processor Registers 773 4.1.1 Condition Register (CR) | Sorted by Mnemonic 813 | | 4.1.1.1 Condition Register Setting for | Appendix B. VLE Instruction Set | | Compare Instructions | • • | | 4.1.1.2 Condition Register Setting for the | Sorted by Opcode 829 | | Bit Test Instruction | | | 4.1.2 Link Register (LR) | Appendices: | | 4.1.3 Count Register (CTR) | | | 4.2 Branch Instructions | Power ISA Book I-III Appendices 845 | | 4.3 System Linkage Instructions 778 | Tower to A Dock I in Appointment of the | | 4.4 Condition Register Instructions781 | Appendix A. Incompatibilities with | | | | | Chapter 5. Fixed-Point Instructions. | the POWER Architecture 847 | | 783 | A.1 New Instructions, Formerly Privileged | | 5.1 Fixed-Point Load Instructions 783 | Instructions | | 5.2 Fixed-Point Store Instructions 787 | A.2 Newly Privileged | | 5.3 Fixed-Point Load and Store with Byte | Instructions | | Reversal Instructions | Instructions | | 5.4 Fixed-Point Load and Store Multiple | A.4 Reserved Bits in Registers 847 | | Instructions | A.5 Alignment Check 847 | | 5.5 Fixed-Point Arithmetic Instructions.791 | A.6 Condition Register 848 | | 5.6 Fixed-Point Compare and Bit Test | A.7 LK and Rc Bits 848 | | Instructions795 | A.8 BO Field | | 5.7 Fixed-Point Trap Instructions 799 | A.9 BH Field | | 5.8 Fixed-Point Select Instruction 799 | A.10 Branch Conditional to Count Register | | 5.9 Fixed-Point Logical, Bit, and Move | 848 | | Instructions | A.11 System Call 848 | | tions805 | A.12 Fixed-Point Exception | | 5.11 Move To/From System Register | Register (XER) 849 | | Instructions | A.13 Update Forms of Storage Access | | mon donorio i i i i i i i i i i i i i i i i i i | Instructions | | Chapter 6. Storage Control | A.14 Multiple Register Loads 849 A.15 Load/Store Multiple Instructions . 849 | | Instructions | A.16 Move Assist Instructions 849 | | 6.1 Storage Synchronization Instructions. | A.17 Move To/From SPR 849 | | 809 | A.18 Effects of Exceptions on FPSCR Bits | | 6.2 Cache Management Instructions810 | FR and FI | | 6.3 Cache Locking Instructions 810 | A.19 Store Floating-Point Single Instruc- | | 6.4 TLB Management Instructions810 | tions | | 6.5 Instruction Alignment and Byte Order- | A.20 Move From FPSCR 850 | | ing | A.21 Zeroing Bytes in the Data Cache 850 | | | A.22 Synchronization 850 | | Chapter 7. Additional Categories | A.23 Move To Machine State Register | | Available in VLE | Instruction | | 7.1 Move Assist | A.24 Direct-Store Segments 850 | | 7.2 Vector | A.25 Segment Register | | 7.3 Signal Processing Engine 811 | Manipulation Instructions 850 A.26 TLB Entry Invalidation 851 | | 7.4 Embedded Floating Point811 | A.26 ILB Entry invalidation | | 7.5 Legacy Move Assist 811 | A.28 Floating-Point Interrupts 851 | | | , 1.20 1.0ating 1.0int interrupto 001 | | A.29 Timing Facilities 851 A.29.1 Real-Time Clock 851 A.29.2 Decrementer 851 A.30 Deleted Instructions 852 A.31 Discontinued Opcodes 852 A.32 POWER2 Compatibility 853 A.32.1 Cross-Reference for Changed 853 POWER2 Mnemonics 853 A.32.2 Load/Store Floating-Point Double 853 A.32.3 Floating-Point Conversion to Integer 853 A.32.4 Floating-Point Interrupts 854 A.32.5 Trace 854 A.33 Deleted Instructions 854 A.33.1 Discontinued Opcodes 854 | |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Appendix B. Platform Support Requirements 855 | | Appendix C. Complete SPR List. 859 | | Appendix D. Illegal Instructions . 863 | | Appendix E. Reserved Instructions . 865 | | Appendix F. Opcode Maps 867 | | Appendix G. Power ISA Instruction Set Sorted by Mnemonic 889 | | Appendix H. Power ISA Instruction Set Sorted by Category 907 | | Appendix I. Power ISA Instruction Set Sorted by Opcode 925 | | Index 943 | | Last Page - End of Document 953 | # **Figures** | Preface | . iii | 35. | Condition Register | 30 | |-------------------------------------------------|----------|------|-----------------------------------------------------|------| | | | 36. | Link Register | 31 | | Table of Contents | :: | 37. | Count Register | 31 | | Table of Contents | VII | 38. | BO field encodings | 32 | | | | | "at" bit encodings | | | Figuresx | viii | | BH field encodings | | | 1 igui 05 | XIII | | General Purpose Registers | 42 | | | | | Fixed-Point Exception Register | 42 | | Book I: | | | Program Priority Register | 43 | | | | | Software-use SPRs | | | Dower ISA Hear Instruction Set Archite | | | | | | Power ISA User Instruction Set Archite | | | | 101 | | ture | . 1 | | Floating-Point Status and Control | | | | | | • | 101 | | 1. Category Listing | 9 | 18 | | 103 | | Logical processing model | 12 | | | 103 | | | 13 | | 31 3 | 103 | | 3. Power ISA user register set | 14 | | 3 | 104 | | 4. I instruction format | | | <b>3</b> 1 | | | 5. B instruction format | 14 | | 1 1 | 104 | | 6. SC instruction format | 15 | | | 108 | | 7. D instruction format | 15 | | | 114 | | 8. DS instruction format | 15 | | , , | 114 | | 9. DQ instruction format | 15 | 56. | Location of the Guard, Round, and | | | 10. X Instruction Format | 16 | | • | 114 | | 11. XL instruction format | 16 | | , , | 115 | | 12. XFX instruction format | 16 | 58. | Location of the Guard, Round, and Sticky bits in | the | | 13. XFL instruction format | 16 | | 1 7 | 115 | | 14. XS instruction format | 17 | | 3 | 147 | | 15. XO instruction format | 17 | 61. | Format for Signed Decimal Data | 147 | | 16. A instruction format | 17 | | | 147 | | 17. M instruction format | 17 | 63. | DFP Short format | 148 | | 18. MD instruction format | 17 | 64. | DFP Long format | 148 | | 19. MDS instruction format | 17 | 65. | DFP Extended format | 148 | | 20. VA instruction format | 17 | 66. | Encoding of the G field for Special Symbols . | 148 | | 21. VC instruction format | 17 | | Encoding of bits 0:4 of the G field for Finite Numb | oers | | 22. VX instruction format | 17 | | 148 | | | 23. EVX instruction format | 17 | 68. | Summary of DFP Formats | 149 | | 24. EVS instruction format | 17 | | | | | 25. Z22 instruction format | 18 | | 150 | | | 26. Z23 instruction format | 18 | 70 | | 150 | | 27. Storage operands and byte ordering | 23 | | | 151 | | 28. C structure 's', showing values of elements | 24 | 72 | Encoding of DFP Rounding-Mode Control (DRI | | | 29. Big-Endian mapping of structure 's' | 24 | 12. | 151 | ٠, . | | 30. Little-Endian mapping of structure 's' | 24 | 73 | Primary Encoding of Rounding-Mode Control | 152 | | 31. Instructions and byte ordering | 24<br>24 | | Secondary Encoding of Rounding-Mode Control | | | | 24<br>24 | 74. | 152 | л | | 32. Assembly language program 'p' | | 75 | | 150 | | 33. Big-Endian mapping of program 'p' | 24 | | | 152 | | 34. Little-Endian mapping of program 'p' | 25 | 7 b. | Overflow Results When Exception Is Disabled | 108 | | 77. Rounding and Range Actions (Part 1) 160 | 4. Logical Partition Identification Register 474 | |------------------------------------------------------|-----------------------------------------------------------| | 78. Rounding and Range Actions (Part 2) 161 | 5. Processor Compatibility Register 474 | | 79. Actions: Add 164 | 6. Machine State Register 477 | | 80. Actions: Multiply | 7. Processor Version Register 487 | | 81. Actions: Divide | 8. Processor Identification Register 488 | | 82. Actions: Compare Unordered 168 | 9. Control Register 488 | | 83. Actions: Compare Ordered 169 | 10. Program Priority Register | | 84. Actions: Test Exponent | 11. Software-use SPRs 489 | | 85. Actions: Test Significance | 12. SPRs for use by hypervisor programs 489 | | 86. DFP Quantize examples | 13. Priority levels for or Rx,Rx,Rx | | 87. Actions (part 1) Quantize | 14. SPR encodings | | | | | 88. Actions (part2) Quantize | 15. SLBE for VRMA | | 89. DFP Reround examples 177 | 16. Address translation overview 514 | | 90. Actions: Reround | 17. Translation of 64-bit effective address to | | 91. Actions: Round to FP Integer With Inexact 180 | 78 bit virtual address 514 | | 92. Actions: Round to FP Integer Without Inexact 181 | 18. SLB Entry | | 93. Actions: Data-Format Conversion Instructions 182 | 19. SLBL <sub>L LP</sub> Encoding 515 | | 94. Actions: Convert To Fixed 186 | 20. Translation of 78-bit virtual address to 60-bit real | | 95. Actions: Insert Biased Exponent 189 | address 517 | | 96. Decimal Floating-Point Instructions Summary 191 | 21. Page Table Entry 518 | | 97. Vector Register elements 195 | 22. Format of PTE <sub>LP</sub> | | 98. Vector Registers | 23. SDR1 | | 99. Vector Status and Control Register 195 | 24. Setting the Reference and Change bits 523 | | 100. VR Save Register | 25. Authority Mask Register (AMR) 524 | | 101. Aligned quadword storage operand 197 | 26. PP bit protection states, address | | 102. Vector Register contents for aligned quadword | translation enabled | | Load or Store | 27. Protection states, address translation | | 103. Unaligned quadword storage operand 197 | disabled | | 9 1 9 1 | | | 3 | 28. Storage control bits | | 105. GPR | 29. GPR contents for slbmte | | 106. Accumulator | 30. GPR contents for slbmfev | | 107. Signal Processing and Embedded Floating-Point | 31. GPR contents for slbmfee 534 | | Status and Control Register | 32. GPR contents for slbfee | | 108. Floating-Point Data Format | 33. GPR contents for mtsr, mtsrin, mfsr, and | | | mfsrin | | Book II: | 34. Save/Restore Registers 548 | | 2001 | 35. Hypervisor Save/Restore Registers 548 | | | 36. Data Address Register 548 | | Power ISA Virtual Environment Architec- | 37. Hypervisor Data Address Register 548 | | ture 405 | 38. Data Storage Interrupt Status Register 548 | | 100 | 39. Hypervisor Data Storage Interrupt Status Register 549 | | 1. Performance effects of storage operand placement | 40. Hypervisor Emulation Instruction Register 549 | | 422 | | | 2. [Category: Server] Performance effects of storage | 41. Hypervisor Maintenance Exception Register. 549 | | operand placement, Little-Endian 422 | 42. Hypervisor Maintenance Exception Enable Regis- | | 3. Data Stream Control Register 426 | ter | | 4. Time Base | 43. MSR setting due to interrupt | | 5. Alternate Time Base | 44. Effective address of interrupt vector by | | J. Alternate Time Dase 404 | interrupt type556 | | | 45. Time Base 575 | | Book III-S: | 46. Decrementer 576 | | | 47. Hypervisor Decrementer 577 | | Dower ICA Operation Francisco and Arabi | 48. Processor Utilization of Resources Register . 578 | | Power ISA Operating Environment Archi- | 49. Scaled Processor Utilization of Resources Register | | tecture - Server Environment 465 | 578 | | | 50. Come-From Address Register 581 | | 1 Logical Partitioning Control Pagister 474 | 51. Data Address Breakpoint Register 582 | | 1. Logical Partitioning Control Register 471 | 52. Data Address Breakpoint Register Extension 582 | | 2. Real Mode Offset Register | 53. External Access Register 583 | | 3. Hypervisor Real Mode Offset Register 474 | 55. External / tools it register | | 54. | Performance Monitor SPR encodings for | | 40. MMU Control and Status Register 0 744 | |-----|--------------------------------------------------------------------|------------|--------------------------------------------------------------------------------------------------------| | 55. | mfspr | 593 | 41. Processor States and PMLCan Bit Settings. 752 42. [User] Performance Monitor Global Control Regis- | | 56 | mtspr | 593<br>593 | ter 0 | | | Monitor Mode Control Register 0 | 594 | ters | | 58. | Monitor Mode Control Register 1 | 596<br>596 | 44. [User] Performance Monitor Local Control B Register | | 60. | Sampled Instruction Address Register Sampled Data Address Register | 597<br>597 | 45. [User] Performance Monitor Counter Registers | | 01. | Campica Data / taarooo / togotor | 001 | 46. Embedded.Peformance Monitor PMRs 756 | | Во | ook III-E: | | | | | | | Book VLE: | | Po | ower ISA Operating Environment Arc | chi- | | | tec | cture - Embedded Environment | 605 | Power ISA Operating Environment Archi- | | | | | tecture - | | 1. | Machine State Register | 611 | Variable Length Encoding (VLE) Environ | | | Processor Version Register | 617 | ment | | | Processor Identification Register | 618 | | | 4. | Special Purpose Registers | 618 | 4. DD0 instruction format | | 5. | External Process ID Load Context Register | 619 | 1. BD8 instruction format 762 2. C instruction format 762 | | 6. | External Process ID Store Context Register | 620 | | | 7. | SPR Numbers | 621 | | | 8. | Virtual Address to TLB Entry Match Process . | 644 | 4. OIM5 instruction format | | 9. | Effective-to-Real Address Translation Flow | 645 | 5. IM7 instruction format | | 10. | Access Control Process | 646 | 6. R instruction format | | 11. | Storage control bits | 650 | 7. RR instruction format | | | Exception Syndrome Register | | 8. SD4 instruction format | | | Definitions | 665 | 9. BD15 instruction format | | 13. | Interrupt Vector Offset Register | | 10. BD24 instruction format | | | Assignments | 666 | 11. D8 instruction format | | 14. | External Proxy Register | 667 | 12. I16A instruction format | | | Interrupt and Exception Types | 673 | 13. I16L instruction format | | | Interrupt Hierarchy | 687 | 14. M instruction format | | | Machine State Register Initial Values | 693 | 15. SC18 instruction format 763 | | | TLB Initial Values | 694 | 16. Ll20 instruction format | | | Time Base | 695 | 17. Condition Register | | | Decrementer | 697 | 18. BO32 field encodings 775 | | | Decrementer | 698 | 19. BO16 field encodings 775 | | 22. | | | | | | Watchdog State Machine | 701 | Appendices: | | | Watchdog Timer Controls | 702 | пропаюсь. | | | Data Cache Debug Tag Register High | 728 | | | | Data Cache Debug Tag Register Low | 728 | Power ISA Book I-III Appendices 845 | | | Instruction Cache Debug Data Register | 729 | | | | Instruction Cache Debug Tag Register High. | 729 | 20. Platform Support Requirements 856 | | | Instruction Cache Debug Tag Register Low . | 729 | 21. SPR Numbers | | | Process ID Register (PID0–PID2) | 737 | 21. 01 1(1(01)010 | | | - · · · · · · · · · · · · · · · · · · · | 738 | | | | MAS0 register | 730<br>739 | Index | | | • | | | | | MAS2 register | 739<br>740 | Last Page - End of Document 953 | | | MAS3 register | | Lactings Lind of Doddinent 900 | | | MASS register | 740 | | | | MAS6 register | 741 | | | | MAS7 register | 741 | | | | MMU Configuration Register | 743 | | | 39. | TLB Configuration Register | 743 | | # **Book I:** **Power ISA User Instruction Set Architecture** # **Chapter 1. Introduction** | 1.1 Overview | 1.6.12 XO-FORM 17 | |----------------------------------------------|------------------------------------------| | 1.2 Instruction Mnemonics and Operands3 | 1.6.13 A-FORM17 | | 1.3 Document Conventions 4 | 1.6.14 M-FORM 17 | | 1.3.1 Definitions 4 | 1.6.15 MD-FORM 17 | | 1.3.2 Notation 4 | 1.6.16 MDS-FORM 17 | | 1.3.3 Reserved Fields and Reserved Val- | 1.6.17 VA-FORM | | ues | 1.6.18 VC-FORM 17 | | 1.3.4 Description of Instruction Operation 7 | 1.6.19 VX-FORM17 | | 1.3.5 Categories 9 | 1.6.20 EVX-FORM 17 | | 1.3.5.1 Phased-In/Phased-Out 10 | 1.6.21 EVS-FORM 17 | | 1.3.5.2 Corequisite Category 10 | 1.6.22 Z22-FORM | | 1.3.5.3 Category Notation 11 | 1.6.23 Z23-FORM | | 1.3.6 Environments | 1.6.24 Instruction Fields 18 | | 1.4 Processor Overview 12 | 1.7 Classes of Instructions 21 | | 1.5 Computation modes 14 | 1.7.1 Defined Instruction Class 21 | | 1.5.1 Modes [Category: Server] 14 | 1.7.2 Illegal Instruction Class 21 | | 1.5.2 Modes [Category: Embedded] 14 | 1.7.3 Reserved Instruction Class 21 | | 1.6 Instruction formats 14 | 1.8 Forms of Defined Instructions 21 | | 1.6.1 I-FORM | 1.8.1 Preferred Instruction Forms 21 | | 1.6.2 B-FORM 15 | 1.8.2 Invalid Instruction Forms 21 | | 1.6.3 SC-FORM 15 | 1.8.3 Reserved-no-op Instructions [Cate- | | 1.6.4 D-FORM 15 | gory: Phased-In (sV2.07)] 22 | | 1.6.5 DS-FORM 15 | 1.9 Exceptions22 | | 1.6.6 DQ-FORM 15 | 1.10 Storage Addressing23 | | 1.6.7 X-FORM | 1.10.1 Storage Operands 23 | | 1.6.8 XL-FORM 16 | 1.10.2 Instruction Fetches 24 | | 1.6.9 XFX-FORM 16 | 1.10.3 Effective Address Calculation 26 | | 1.6.10 XFL-FORM 16 | | | 1.6.11 XS-FORM 17 | | # 1.1 Overview This chapter describes computation modes, document conventions, a processor overview, instruction formats, storage addressing, and instruction fetching. # **1.2 Instruction Mnemonics and Operands** The description of each instruction includes the mnemonic and a formatted list of operands. Some examples are the following. stw RS,D(RA) addis RT,RA,SI Power ISA-compliant Assemblers will support the mnemonics and operand lists exactly as shown. They should also provide certain extended mnemonics, such as the ones described in Appendix D of Book I. # 1.3 Document Conventions # 1.3.1 Definitions The following definitions are used throughout this document. # program A sequence of related instructions. ### ■ application program A program that uses only the instructions and resources described in Books I and II. # quadwords, doublewords, words, halfwords, and bytes 128 bits, 64 bits, 32 bits, 16 bits, and 8 bits, respectively. ## positive Means greater than zero. ## negative Means less than zero. # floating-point single format (or simply single format) Refers to the representation of a single-precision binary floating-point value in a register or storage. # floating-point double format (or simply double format) Refers to the representation of a double-precision binary floating-point value in a register or storage. # ■ system library program A component of the system software that can be called by an application program using a *Branch* instruction. # ■ system service program A component of the system software that can be called by an application program using a *System Call* instruction. ## ■ system trap handler A component of the system software that receives control when the conditions specified in a *Trap* instruction are satisfied. ### system error handler A component of the system software that receives control when an error occurs. The system error handler includes a component for each of the various kinds of error. These error-specific components are referred to as the system alignment error handler, the system data storage error handler, etc. ### latency Refers to the interval from the time an instruction begins execution until it produces a result that is available for use by a subsequent instruction. #### unavailable Refers to a resource that cannot be used by the program. For example, storage is unavailable if access to it is denied. See Book III. #### undefined value May vary between implementations, and between different executions on the same implementation, and similarly for register contents, storage contents, etc., that are specified as being undefined. # boundedly undefined The results of executing a given instruction are said to be boundedly undefined if they could have been achieved by executing an arbitrary finite sequence of instructions (none of which yields boundedly undefined results) in the state the processor was in before executing the given instruction. Boundedly undefined results may include the presentation of inconsistent state to the system error handler as described in Section 1.8.1 of Book II. Boundedly undefined results for a given instruction may vary between implementations, and between different executions on the same implementation. #### ■ "must" If software violates a rule that is stated using the word "must" (e.g., "this field must be set to 0"), the results are boundedly undefined unless otherwise stated. # sequential execution model The model of program execution described in Section 2.2, "Instruction Execution Order" on page 29. # ■ Auxiliary Processor An implementation-specific processing unit. Previous versions of the architecture use the term Auxiliary Processing Unit (APU) to describe this extension of the architecture. Architectural support for auxiliary processors is part of the Embedded category. # 1.3.2 Notation The following notation is used throughout the Power ISA documents. - All numbers are decimal unless specified in some special way. - Obnnnn means a number expressed in binary format. - 0xnnnn means a number expressed in hexadecimal format. Underscores may be used between digits. - RT, RA, R1, ... refer to General Purpose Registers. - FRT, FRA, FR1, ... refer to Floating-Point Registers. - FRTp, FRAp, FRBp, ... refer to an even-odd pair of Floating-Point Registers. Values must be even, otherwise the instruction form is invalid. - VRT, VRA, VR1, ... refer to Vector Registers. - (x) means the contents of register x, where x is the name of an instruction field. For example, (RA) means the contents of register RA, and (FRA) means the contents of register FRA, where RA and FRA are instruction fields. Names such as LR and CTR denote registers, not fields, so parentheses are not used with them. Parentheses are also omitted when register x is the register into which the result of an operation is placed. - (RA|0) means the contents of register RA if the RA field has the value 1-31, or the value 0 if the RA field is 0. - Bits in registers, instructions, fields, and bit strings are specified as follows. In the last three items (definition of X<sub>p</sub> etc.), if X is a field that specifies a GPR, FPR, or VR (e.g., the RS field of an instruction), the definitions apply to the register, not to the field. - Bits in instructions, fields, and bit strings are numbered from left to right, starting with bit 0 - For all registers except the Vector category, bits in registers that are less than 64 bits start with bit number 64-L, where L is the register length; for the Vector category, bits in registers that are less than 128 bits start with bit number 128-L. - The leftmost bit of a sequence of bits is the most significant bit of the sequence. - X<sub>p</sub> means bit p of register/instruction/field/ bit\_string X. - X<sub>p:q</sub> means bits p through q of register/instruction/field/bit\_string X. - X<sub>p q ...</sub> means bits p, q, ... of register/instruction/field/bit string X. - ¬(RA) means the one's complement of the contents of register RA. - A period (.) as the last character of an instruction mnemonic means that the instruction records status information in certain fields of the Condition Register as a side effect of execution. - The symbol || is used to describe the concatenation of two values. For example, 010 || 111 is the same as 010111. - x<sup>n</sup> means x raised to the n<sup>th</sup> power. - nx means the replication of x, n times (i.e., x concatenated to itself n-1 times). (n)0 and (n)1 are special cases: - n0 means a field of n bits with each bit equal to 0. Thus 50 is equivalent to 0b00000. - n1 means a field of n bits with each bit equal to 1. Thus <sup>5</sup>1 is equivalent to 0b111111. - Each bit and field in instructions, and in status and control registers (e.g., XER, FPSCR) and Special Purpose Registers, is either defined or reserved. Some defined fields contain reserved values. In such cases when this document refers to the specific field, it refers only to the defined values, unless otherwise specified. - /, //, ///, ... denotes a reserved field, in a register, instruction, field, or bit string. - ?, ??, ???, ... denotes an implementation-dependent field in a register, instruction, field or bit string. # 1.3.3 Reserved Fields and Reserved Values Reserved fields in instructions are ignored by the processor. This is a requirement in the Server environment and is being phased into the Embedded environment. In some cases a defined field of an instruction has certain values that are reserved. This includes cases in which the field is shown in the instruction layout as containing a particular value; in such cases all other values of the field are reserved. In general, if an instruction is coded such that a defined field contains a reserved value the instruction form is invalid; see Section 1.8.2 on page 21. The only exceptions to the preceding rule is that it does not apply to Reserved and Illegal classes of instructions (see Section 1.7) or to portions of defined fields that are specified, in the instruction description, as being treated as reserved fields. To maximize compatibility with future architecture extensions, software must ensure that reserved fields in instructions contain zero and that defined fields of instructions do not contain reserved values. The handling of reserved bits in System Registers (e.g., XER, FPSCR) is implementation-dependent. Unless otherwise stated, software is permitted to write any value to such a bit. A subsequent reading of the bit returns 0 if the value last written to the bit was 0 and returns an undefined value (0 or 1) otherwise. In some cases a defined field of a System Register has certain values that are reserved. Software must not set a defined field of a System Register to a reserved value. References elsewhere in this document to a defined field (in an instruction or System Register) that has reserved values assume the field does not contain a reserved value, unless otherwise stated or obvious from context. #### **Assembler Note** Assemblers should report uses of reserved values of defined fields of instructions as errors. # Programming Note It is the responsibility of software to preserve bits that are now reserved in System Registers, because they may be assigned a meaning in some future version of the architecture. In order to accomplish this preservation in implementation-independent fashion, software should do the following. - Initialize each such register supplying zeros for all reserved bits. - Alter (defined) bit(s) in the register by reading the register, altering only the desired bit(s), and then writing the new value back to the register. The XER and FPSCR are partial exceptions to this recommendation. Software can alter the status bits in these registers, preserving the reserved bits, by executing instructions that have the side effect of altering the status bits. Similarly, software can alter any defined bit in the FPSCR by executing a Floating-Point Status and Control Register instruction. Using such instructions is likely to yield better performance than using the method described in the second item above. # 1.3.4 Description of Instruction Operation Instruction descriptions (including related material such as the introduction to the section describing the instructions) mention that the instruction may cause a system error handler to be invoked, under certain conditions, if and only if the system error handler may treat the case as a programming error. (An instruction may cause a system error handler to be invoked under other conditions as well; see Chapter 6 of Book III-S and Chapter 5 of Book III-E). A formal description is given of the operation of each instruction. In addition, the operation of most instructions is described by a semiformal language at the register transfer level (RTL). This RTL uses the notation given below, in addition to the notation described in Section 1.3.2. Some of this notation is also used in the formal descriptions of instructions. RTL notation not summarized here should be self-explanatory. The RTL descriptions cover the normal execution of the instruction, except that "standard" setting of status registers, such as the Condition Register, is not shown. ("Non-standard" setting of these registers, such as the setting of the Condition Register by the *Compare* instructions, is shown.) The RTL descriptions do not cover cases in which the system error handler is invoked, or for which the results are boundedly undefined. The RTL descriptions specify the architectural transformation performed by the execution of an instruction. They do not imply any particular implementation. | Notation | Meaning | |---------------------------------|----------------------------------------------| | ← | Assignment | | ← <sub>iea</sub> | Assignment of an instruction effective | | 700 | address. In 32-bit mode the high-order 32 | | | bits of the 64-bit target address are set to | | | 0. | | ¬ | NOT logical operator | | + | Two's complement addition | | - | Two's complement subtraction, unary | | | minus | | × | Multiplication | | $\times_{si}$ | Signed-integer multiplication | | × <sub>ui</sub> | Unsigned-integer multiplication | | / | Division | | ÷ | Division, with result truncated to integer | | $\sqrt{}$ | Square root | | =, ≠ | Equals, Not Equals relations | | <, ≤, >, ≥ | Signed comparison relations | | < <sup>u</sup> , > <sup>u</sup> | Unsigned comparison relations | | ? | Unordered comparison relation | | &, | AND, OR logical operators | | ⊕, ≡ | Exclusive OR, Equivalence logical opera- | | ADC(v) | tors ((a≡b) = (a⊕¬b))<br>Absolute value of x | | ABS(x) | ADSUIDLE VAIDE OF X | | CEIL(x) DCR(x) | Least integer ≥ x Device Control Register x Result of converting x from floating-point | |--------------------------|------------------------------------------------------------------------------------------------------------------------------| | DOUBLE(X) | single format to floating-point double format, using the model shown on page 117 | | EXTS(x) | Result of extending x on the left with sign bits | | FLOOR(x) | Greatest integer ≤ x | | GPR(x) | General Purpose Register x Mask having 1s in positions x through y | | MASK(X, y) | (wrapping if x > y) and 0s elsewhere | | MEM(x, y) | Contents of a sequence of y bytes of storage. The sequence depends on the byte ordering used for storage access, as follows. | | | Big-Endian byte ordering: | | | The sequence starts with the byte at | | | address x and ends with the byte at | | | address x+y-1. Little-Endian byte ordering: | | | The sequence starts with the byte at | | | address x+y-1 and ends with the byte at address x. | | ROTL <sub>64</sub> (x, y | <i>y</i> ) | | | Result of rotating the 64-bit value x left y positions | | ROTL <sub>32</sub> (x, ) | | | | Result of rotating the 64-bit value $x x $ left y positions, where x is 32 bits long | | SINGLE(x) | | | ( ) | double format to floating-point single for- | | | mat, using the model shown on page 121 | | SPR(x)<br>TRAP | Special Purpose Register x | | characteriza | Invoke the system trap handler | | Characteriza | Reference to the setting of status bits, in a | | | standard way that is explained in the text | | undefined | An undefined value. | | | | | CIA | Current Instruction Address, which is the | | OΙΛ | 64-bit address of the instruction being | CIA Current Instruction Address, which is the 64-bit address of the instruction being described by a sequence of RTL. Used by relative branches to set the Next Instruction Address (NIA), and by Branch instructions with LK=1 to set the Link Register. Does not correspond to any architected register. NIA Next Instruction Address, which is the register. Next Instruction Address, which is the 64-bit address of the next instruction to be executed. For a successful branch, the next instruction address is the branch target address: in RTL, this is indicated by assigning a value to NIA. For other instructions that cause non-sequential instruction fetching (see Book III), the RTL is similar. For instructions that do not branch, and do not otherwise cause instruction fetching to be non-sequential, the next instruction address is CIA+4 (VLE behavior is different; see Book VLE). Does not correspond to any architected register. if... then... else... Conditional execution, indenting shows range; else is optional. Do loop, indenting shows range. "To" and/ do or "by" clauses specify incrementing an iteration variable, and a "while" clause gives termination conditions. Leave innermost do loop, or do loop leave described in leave statement. For loop, indenting shows range. Clause for after "for" specifies the entities for which to execute the body of the loop. The precedence rules for RTL operators are summarized in Table 1. Operators higher in the table are applied before those lower in the table. Operators at the same level in the table associate from left to right, from right to left, or not at all, as shown. (For example, - associates from left to right, so a-b-c = (a-b)-c.) Parentheses are used to override the evaluation order implied by the table or to increase clarity; parenthesized expressions are evaluated before serving as operands. | Table 1: Operator precedence | | | |---------------------------------------------------------------------|---------------|--| | Operators | Associativity | | | subscript, function evaluation | left to right | | | pre-superscript (replication),<br>post-superscript (exponentiation) | right to left | | | unary -, ¬ | right to left | | | ×, ÷ | left to right | | | +, -, | left to right | | | II | left to right | | | $=, \neq, <, \leq, >, \geq, <^{U}, >^{U}, ?$ | left to right | | | &, ⊕, ≡ | left to right | | | | left to right | | | : (range) | none | | | ←,← <sub>iea</sub> | none | | # 1.3.5 Categories Each facility (including registers and fields therein) and instruction is in exactly one of the categories listed in Figure 1. A category may be defined as a *dependent category*. These are categories that are supported only if the category they are dependent on is also supported. Depen- dent categories are identified by the "." in their category name, e.g., if an implementation supports the Floating-Point.Record category, then the Floating-Point category is also supported. An implementation that supports a facility or instruction in a given category, except for the two categories described in Section 1.3.5.1, supports all facilities and instructions in that category. | Category | Abvr. | Notes | | |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--| | Base | В | Required for all implementations | | | Server | s | Required for Server implementations | | | Embedded | E | Required for Embedded implementations | | | Alternate Time Base | ATB | An additional Time Base; see Book II | | | BCD Assistance | BCDA | Binary Coded Decimal Assistance Instructions | | | Cache Specification | cs | Specify a specific cache for some instructions; see Book II | | | Decimal Floating-Point | DFP | Decimal Floating-Point facilities | | | Embedded.Cache Debug | E.CD | Provides direct access to cache data and directory content | | | Embedded.Cache Initialization | E.CI | Instructions that invalidate the entire cache | | | Embedded.Enhanced Debug | E.ED | Embedded Enhanced Debug facility; see Book III-E | | | Embedded.External PID | E.PD | Embedded External PID facility; see Book III-E | | | Embedded.Little-Endian | E.LE | Embedded Little-Endian page attribute; see Book III-E | | | Embedded.MMU Type FSL | E.MF | Embedded MMU example Type FSL; see Book III-E | | | Embedded.Performance Monitor | E.PM | Embedded performance monitor example; see Book III-E | | | Embedded.Processor Control | E.PC | Processor control facility; see Book III-E | | | Embedded Cache Locking | ECL | Embedded Cache Locking facility; see Book III-E | | | External Control | EC | External Control facility; see Book II | | | External Proxy | EXP | External Proxy facility; see Book III-E | | | Floating-Point<br>Floating-Point.Record | FP<br>FP.R | Floating-Point Facilities Floating-Point instructions with Rc=1 | | | Hypervisor Emulation Assistance | HEA | Hypervisor Emulation Assistance Facilities | | | Legacy Integer Multiply-Accumulate <sup>1</sup> | LMA | Legacy Integer Multiply-accumulate instructions | | | Legacy Move Assist | LMV | Determine Left most Zero Byte instruction | | | Load/Store Quadword | LSQ | Load/Store Quadword instructions; see Book III-S | | | Memory Coherence | ММС | Requirement for Memory Coherence; see Book II | | | Move Assist | MA | Move Assist instructions | | | Processor Compatibility | PCR | Processor Compatibility Register | | | Server.Performance Monitor | S.PM | Performance monitor example for Servers; see Book III-S | | | Server.Relaxed Page Table Alignment | S.RPTA | HTAB alignment on 256 KB boundary; see Book III-S | | | Signal Processing Engine <sup>1, 2</sup> SPE.Embedded Float Scalar Double SPE.Embedded Float Scalar Single SPE.Embedded Float Vector | SP<br>SP.FD<br>SP.FS<br>SP.FV | Facility for signal processing GPR-based Floating-Point double-precision instruction set GPR-based Floating-Point single-precision instruction set GPR-based Floating-Point Vector instruction set | | | Stream | STM | Stream variant of <i>dcbt</i> instruction; see Book II | | | Trace | TRC | Trace Facility; see Book III-S | | | Because of overlapping opcode usage, SPE is mutually exclusive with Vector and with Legacy Integer Multiply-Accumulate, and Legacy Integer Multiply-Accumulate is mutually exclusive with Vector. The SPE-dependent Floating-Point categories are collectively referred to as SPE.Embedded Float_* or SP.*. | | | | Figure 1. Category Listing (Sheet 1 of 2) | Category | Abvr. | Notes | |---------------------------------------------|-------|---------------------------------------------------------------------------| | Variable Length Encoding | VLE | Variable Length Encoding facility; see Book VLE | | Vector <sup>1</sup><br>Vector.Little-Endian | - | Vector facilities<br>Little-Endian support for Vector storage operations. | | Wait | WT | wait instruction; see Book II | | 64-Bit | 64 | Required for 64-bit implementations; not defined for 32-bit impl's | Because of overlapping opcode usage, SPE is mutually exclusive with Vector and with Legacy Integer Multiply-Accumulate, and Legacy Integer Multiply-Accumulate is mutually exclusive with Vector. The SPE-dependent Floating-Point categories are collectively referred to as SPE.Embedded Float\_\* or SP.\*. Figure 1. Category Listing (Sheet 2 of 2) An instruction in a category that is not supported by the implementation is treated as an illegal instruction or an unimplemented instruction on that implementation (see Section 1.7.2). For an instruction that is supported by the implementation with field values that are defined by the architecture, the field values defined as part of a category that is not supported by the implementation are treated as reserved values on that implementation (see Section 1.3.3 and Section 1.8.2). Bits in a register that are in a category that is not supported by the implementation are treated as reserved. # 1.3.5.1 Phased-In/Phased-Out There are two special dependent categories, Phased-In and Phased-Out, defined below. Phased-In (sVxxx) These are facilities and instructions that, in some future version of the architecture, will be required as part of the category they are dependent on. > Starting with version 2.05, servers may not implement a facility in this category until the version indicated. Starting with the version indicated, servers must implement the facility. Servers that comply with earlier versions of this architecture may have optionally implemented features that were category Phased-In. #### Phased-Out These are facilities and instructions that, in some future version of the architecture, will be dropped out of the architecture. System developers should develop a migration plan to eliminate use of them in new systems. These facilities are required for the Server Platform. # Programming Note - Warning: Instructions and facilities being phased out of the architecture are likely to perform poorly on future implementations. New programs should not use them. ### Programming Note – Facilities are categorized as Phased-In only in cases where there is a difference between the Server and Embedded environments. As soon as the facility is supported by both environments, the Phased-In categorization will be removed. # 1.3.5.2 Corequisite Category A corequisite category is an additional category that is associated with an instruction or facility, and must be implemented if the instruction or facility is implemented. #### 1.3.5.3 Category Notation Instructions and facilities are considered part of the Base category unless otherwise marked. If a section is marked with a specific category tag, all material in that section and its subsections are considered part of the category, unless otherwise marked. Overview sections may contain discussion of instructions and facilities from various categories without being explicitly marked. An example of a category tag is: [Category: Server]. An example of a dependent category is: [Category: Server.Phased-In] The shorthand <E> and <S> may also be used for Category: Embedded and Server respectively. ## 1.3.6 Environments All implementations support one of the two defined environments, Server or Embedded. Environments refer to common subsets of instructions that are shared across many implementations. The Server environment describes implementations that support Category: Base and Server. The Embedded environment describes implementations that support Category: Base and Embedded. ## 1.4 Processor Overview The processor implements the instruction set, the storage model, and other facilities defined in this document. There are four basic classes of instructions: - branch instructions (Chapter 2) - fixed-point instructions (Chapter 3), and other instructions that use the fixed-point registers (Chapters 7, 8, 9, and 10) - floating-point instructions (Chapter 4) and decimal floating-point instructions (Chapter 5) - vector instructions (Chapter 6) Fixed-point instructions operate on byte, halfword, word, and doubleword operands. Floating-point instructions operate on single-precision and double-precision floating-point operands. Vector instructions operate on vectors of scalar quantities and on scalar quantities where the scalar size is byte, halfword, word, and quadword. The Power ISA uses instructions that are four bytes long and word-aligned (VLE has different instruction characteristics; see Book VLE). It provides for byte, halfword, word, and doubleword operand fetches and stores between storage and a set of 32 General Purpose Registers (GPRs). It provides for word and doubleword operand fetches, and stores between storage and a set of 32 Floating-Point Registers (FPRs). It also provides for byte, halfword, word, and quadword operand fetches and stores between storage and a set of 32 Vector Registers (VRs). Signed integers are represented in two's complement form. There are no computational instructions that modify storage; instructions that reference storage may reformat the data (e.g. load halfword algebraic). To use a storage operand in a computation and then modify the same or another storage location, the contents of the storage operand must be loaded into a register, modified, and then stored back to the target location. Figure 2 is a logical representation of instruction processing. Figure 3 shows the registers of the Power ISA User Instruction Set Architecture. Figure 2. Logical processing model # CR 32 63 "Condition Register" on page 30 <sup>&</sup>quot;Count Register" on page 31 <sup>&</sup>quot;General Purpose Registers" on page 42 <sup>&</sup>quot;Fixed-Point Exception Register" on page 42 ## Category: Embedded: | | SPRG4 | | |---|-------|----| | | SPRG5 | | | | SPRG6 | | | | SPRG7 | | | 0 | | 63 | <sup>&</sup>quot;Software-use SPRs" on page 43. ## Category: Embedded, Vector | | VRSAVE | | |----|--------|----| | 32 | | 63 | "VR Save Register" on page 196 # Category: Floating-Point, Decimal Floating-Point: | | FPR 0 | |---|--------| | | FPR 1 | | | • • • | | | | | F | FPR 30 | | F | FPR 31 | | 0 | 63 | "Floating-Point Registers" on page 101 | | FPSCR | | |---|-------|----| | 0 | | 63 | "Floating-Point Registers" on page 101 and "DFP Usage of Floating-Point Registers" on page 144. ## **Category: Vector:** | | VR 0 | | |---|-------|-----| | | VR 1 | | | | ••• | | | | ••• | | | | VR 30 | | | | VR 31 | | | 0 | | 127 | "Vector Registers" on page 195 "Vector Status and Control Register" on page 195 ## Category: SPE: "Accumulator" on page 262 "Signal Processing and Embedded Floating-Point Status and Control Register" on page 262 Figure 3. Power ISA user register set ## 1.5 Computation modes ## 1.5.1 Modes [Category: Server] Processors provide two execution modes, 64-bit mode and 32-bit mode. In both of these modes, instructions that set a 64-bit register affect all 64 bits. The computational mode controls how the effective address is interpreted, how status bits are set, how the Link Register is set by Branch instructions in which LK=1, and how the Count Register is tested by Branch Conditional instructions. Nearly all instructions are available in both modes (the only exceptions are a few instructions that are defined in Book III-S). In both modes, effective address computations use all 64 bits of the relevant registers (General Purpose Registers, Link Register, Count Register, etc.) and produce a 64-bit result. However, in 32-bit mode the high-order 32 bits of the computed effective address are ignored for the purpose of addressing storage; see Section 1.10.3 for additional details. # 1.5.2 Modes [Category: Embedded] Processors may provide 32-bit mode, or both 64-bit mode and 32-bit mode. The modes differ in the following ways. - In 64-bit mode, the processor behaves as described for 64-bit mode in the Server environment; see Section 1.5.1. - In 32-bit mode, instructions other than SP. SP.Embedded Float Scalar Double. SP.Embedded Float Vector use only the lower 32 bits of a GPR and produce a 32-bit result. Results written to the GPRs write only the lower 32-bits and the upper 32 bits are undefined except for SP.Embedded Float Scalar Single instructions which leave the upper 32-bits unchanged. SP, SP.Embedded Float Scalar Double. SP.Embedded Float Vector instructions use all 64 bits of a GPR and produce a 64-bit result regardless of the mode. Instructions that set condition bits do so based on the 32-bit result computed. Effective addresses and all SPRs operate on the lower 32 bits only unless otherwise stated. The instructions in the 64-Bit category are not necessarily available; if they are not available, attempting to execute such an instruction causes the system illegal instruction error handler to be invoked. Floating-Point and and Decimal Floating-Point instructions operate on FPRs, and Vector instructions operate VPRs, independent of mode. ## 1.6 Instruction formats All instructions are four bytes long and word-aligned (except for VLE instructions; see Book VLE). Thus, whenever instruction addresses are presented to the processor (as in *Branch* instructions) the low-order two bits are ignored. Similarly, whenever the processor develops an instruction address the low-order two bits are zero. Bits 0:5 always specify the opcode (OPCD, below). Many instructions also have an extended opcode (XO, below). The remaining bits of the instruction contain one or more fields as shown below for the different instruction formats. The format diagrams given below show horizontally all valid combinations of instruction fields. The diagrams include instruction fields that are used only by instructions defined in Book II or in Book III. ## **Split Field Notation** In some cases an instruction field occupies more than one contiguous sequence of bits, or occupies one contiguous sequence of bits that are used in permuted order. Such a field is called a *split field*. In the format diagrams given below and in the individual instruction layouts, the name of a split field is shown in small letters, once for each of the contiguous sequences. In the RTL description of an instruction having a split field, and in certain other places where individual bits of a split field are identified, the name of the field in small letters represents the concatenation of the sequences from left to right. In all other places, the name of the field is capitalized and represents the concatenation of the sequences in some order, which need not be left to right, as described for each affected instruction. ## 1.6.1 I-FORM Figure 4. I instruction format ## 1.6.2 B-FORM Figure 5. B instruction format ## 1.6.3 SC-FORM | 0 | 6 | 11 | 16 | 20 | 27 | 30 | 31 | |------|-----|-----|-----|-----|----|----|----| | OPCD | /// | /// | // | LEV | // | 1 | / | | OPCD | /// | /// | /// | /// | // | 1 | / | Figure 6. SC instruction format ## 1.6.4 **D-FORM** | OPCD RT RA D OPCD RT RA SI OPCD RS RA D OPCD RS RA UI OPCD BF / L RA SI OPCD BF / L RA UI OPCD TO RA SI OPCD FRT RA D | 0 | 6 | 11 | 16 31 | |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|--------|----|-------| | OPCD RS RA D OPCD RS RA UI OPCD BF / L RA SI OPCD BF / L RA UI OPCD TO RA SI | OPCD | RT | RA | D | | OPCD RS RA UI OPCD BF / L RA SI OPCD BF / L RA UI OPCD TO RA SI | OPCD | RT | RA | SI | | OPCD BF / L RA SI OPCD BF / L RA UI OPCD TO RA SI | OPCD | RS | RA | D | | OPCD BF / L RA UI OPCD TO RA SI | OPCD | RS | RA | UI | | OPCD TO RA SI | OPCD | BF / L | RA | SI | | | OPCD | BF / L | RA | UI | | OPCD FRT RA D | OPCD | TO | RA | SI | | | OPCD | FRT | RA | D | | OPCD FRS RA D | OPCD | FRS | RA | D | Figure 7. D instruction format ## 1.6.5 **DS-FORM** | 0 | 6 | 11 | 16 | 30 31 | |------|------|----|----|-------| | OPCD | RT | RA | DS | XO | | OPCD | RS | RA | DS | XO | | OPCD | RSp | RA | DS | XO | | OPCD | FRTp | RA | DS | XO | | OPCD | FRSp | RA | DS | XO | Figure 8. DS instruction format ## 1.6.6 **DQ-FORM** Figure 9. DQ instruction format ## 1.6.7 X-FORM | | 0 | 6 | 11 | 16 | 21 | 31 | |---|------|--------|--------|------|----|----| | | OPCD | RT | RA | /// | XO | / | | | OPCD | RT | RA | RB | XO | / | | Ī | OPCD | RT | RA | RB | XO | ΕH | | | OPCD | RT | RA | NB | XO | / | | | OPCD | RT | / SR | /// | XO | / | | | OPCD | RT | /// | RB | XO | / | | Ī | OPCD | RT | /// | RB | XO | 1 | | | OPCD | RT | /// | /// | XO | / | | | OPCD | RS | RA | RB | XO | Rc | | Ī | OPCD | RT | RA | RB | XO | Rc | | | OPCD | RS | RA | RB | XO | 1 | | | OPCD | RS | RA | RB | XO | / | | | OPCD | RS | RA | NB | XO | / | | | OPCD | RS | RA | SH | XO | Rc | | | OPCD | RS | RA | /// | XO | Rc | | | OPCD | RS | RA | /// | XO | / | | | OPCD | RS | / SR | /// | XO | / | | | OPCD | RS | /// | RB | XO | / | | | OPCD | RS | /// | /// | XO | / | | | OPCD | RS | /// L | /// | XO | / | | | OPCD | TH | RA | RB | XO | / | | | OPCD | BF / L | RA | RB | XO | / | | | OPCD | BF // | FRA | FRB | XO | / | | | OPCD | BF // | BFA // | /// | XO | / | | | OPCD | BF // | /// W | U / | XO | Rc | | | OPCD | BF // | /// | /// | XO | / | | | OPCD | TH | RA | RB | XO | / | | | OPCD | / CT | /// | /// | XO | / | | | OPCD | / CT | RA | RB | XO | / | | | OPCD | /// L | RA | RB | XO | / | | | OPCD | /// L | _ /// | RB | XO | / | | | OPCD | /// L | /// | /// | XO | / | | | OPCD | TO | RA | RB | XO | / | | | OPCD | FRT | RA | RB | XO | / | | | OPCD | FRT | FRA | FRB | XO | / | | | OPCD | FRTp | RA | RB | XO | / | | | OPCD | FRT | /// | FRB | XO | Rc | | | OPCD | FRT | /// | FRBp | XO | Rc | | | OPCD | FRT | /// | /// | XO | Rc | | | OPCD | FRTp | /// | FRB | XO | Rc | | | OPCD | FRTp | /// | FRBp | XO | Rc | | | OPCD | FRTp | FRA | FRBp | XO | Rc | | | OPCD | FRTp | FRAp | FRBp | XO | Rc | | | OPCD | BF // | FRA | FRBp | XO | / | | | OPCD | BF // | FRAp | FRBp | XO | / | | | OPCD | FRT | S | FRB | XO | Rc | Figure 10. X Instruction Format | 0 | 6 | 11 | 16 | 21 | 31 | |------|-------|-----|-------|----|----| | OPCD | FRTp | S | FRBp | XO | Rc | | OPCD | FRS | RA | RB | XO | / | | OPCD | FRSp | RA | RB | XO | / | | OPCD | BT | /// | /// | XO | Rc | | OPCD | /// | RA | RB | XO | / | | OPCD | /// | /// | RB | XO | / | | OPCD | /// | /// | /// | XO | / | | OPCD | /// | /// | E /// | XO | / | | OPCD | // IH | /// | /// | XO | / | | OPCD | ??? | RA | RB | XO | ? | | OPCD | ??? | ??? | ??? | XO | / | | OPCD | VRT | RA | RB | XO | / | | OPCD | VRS | RA | RB | XO | / | | OPCD | MO | /// | /// | XO | / | Figure 10. X Instruction Format ## 1.6.8 XL-FORM | 0 | 6 | | 11 | | 16 | | 21 | 31 | |------|-----|----|-----|----|-----|----|----|----| | OPCD | В٦ | _ | BA | ١ | BI | 3 | XO | / | | OPCD | BC | ) | BI | BI | | ВН | XO | LK | | OPCD | BF | // | BFA | // | /// | / | XO | / | | OPCD | /// | ' | /// | ' | /// | / | XO | / | Figure 11. XL instruction format ## 1.6.9 XFX-FORM | 0 | 6 | 11 | | | 21 | 31 | |------|-----|----|------|---|----|----| | OPCD | RT | | spr | | XO | / | | OPCD | RT | | tbr | | XO | / | | OPCD | RT | 0 | /// | | XO | / | | OPCD | RT | 1 | FXM | / | XO | / | | OPCD | RT | | dcr | | XO | / | | OPCD | RT | | pmrn | | XO | / | | OPCD | DUI | | DUIS | | XO | / | | OPCD | RS | 0 | FXM | / | XO | / | | OPCD | RS | 1 | FXM | / | XO | / | | OPCD | RS | | spr | | XO | / | | OPCD | RS | | dcr | | XO | / | | OPCD | RS | | pmrn | | XO | / | Figure 12. XFX instruction format ## 1.6.10 XFL-FORM | 0 | 6 7 | | 15 16 | 21 | 31 | |-----|-----|-----|-------|----|----| | OPC | D L | FLM | W FRB | XO | Rc | Figure 13. XFL instruction format ## 1.6.11 XS-FORM | ( | 0 | 6 | 11 | 16 | 21 | 30 | 31 | |---|------|----|----|----|----|----|----| | Ī | OPCD | RS | RA | sh | XO | sh | Rc | Figure 14. XS instruction format ## 1.6.12 XO-FORM | 0 | 6 | 11 | 16 | 21 | 22 | 31 | |------|----|----|-----|----|----|----| | OPCD | RT | RA | RB | OE | XO | Rc | | OPCD | RT | RA | RB | / | XO | Rc | | OPCD | RT | RA | RB | / | XO | / | | OPCD | RT | RA | /// | OE | XO | Rc | Figure 15. XO instruction format ## 1.6.13 A-FORM | 0 | 6 | 11 | | 16 | 21 | 26 | 31 | |------|-----|-----|---|-----|-----|----|----| | OPCD | FRT | FRA | ١ | FRB | FRC | XO | Rc | | OPCD | FRT | FRA | ١ | FRB | /// | XO | Rc | | OPCD | FRT | FRA | | /// | FRC | XO | Rc | | OPCD | FRT | /// | | FRB | /// | XO | Rc | | OPCD | FRT | /// | L | FRB | /// | XO | Rc | | OPCD | RT | RA | | RB | BC | XO | / | Figure 16. A instruction format ## 1.6.14 M-FORM | 0 | 6 | 11 | 16 | 21 | 26 | 31 | |------|----|----|----|----|----|----| | OPCD | RS | RA | RB | MB | ME | Rc | | OPCD | RS | RA | SH | MB | ME | Rc | Figure 17. M instruction format ## 1.6.15 MD-FORM | | 0 | 6 | 11 | 16 | 21 | 27 | 30 | 31 | |---|------|----|----|----|----|----|----|----| | Ī | OPCD | RS | RA | sh | mb | ХО | sh | Rc | | ĺ | OPCD | RS | RA | sh | me | ХО | sh | Rc | Figure 18. MD instruction format ## 1.6.16 MDS-FORM | - | 0 | 6 | 11 | 16 | 21 | 27 | 31 | |---|------|----|----|----|----|----|----| | Ī | OPCD | RS | RA | RB | mb | XO | Rc | | Ī | OPCD | RS | RA | RB | me | XO | Rc | Figure 19. MDS instruction format ## 1.6.17 VA-FORM | | 0 | 6 | 11 | 16 | 21 | 26 | 31 | |---|------|-----|-----|-----|-------|----|----| | , | OPCD | VRT | VRA | VRB | VRC | XO | | | | OPCD | VRT | VRA | VRB | / SHB | XO | | Figure 20. VA instruction format ## 1.6.18 VC-FORM | | 0 | 6 | 11 | 16 | 21 | 22 | 31 | |---|------|-----|-----|-----|----|----|----| | 1 | OPCD | VRT | VRA | VRB | Rc | XO | | Figure 21. VC instruction format ## 1.6.19 VX-FORM | 0 | 6 | 11 | 16 | 21 31 | |------|-----|---------|-----|-------| | OPCD | VRT | VRA | VRB | XO | | OPCD | VRT | /// | VRB | XO | | OPCD | VRT | UIM | VRB | XO | | OPCD | VRT | / UIM | VRB | XO | | OPCD | VRT | // UIM | VRB | XO | | OPCD | VRT | /// UIM | VRB | XO | | OPCD | VRT | SIM | /// | XO | | OPCD | VRT | /// | | XO | | OPCD | | /// | VRB | XO | | L | | | 1 | 1 | Figure 22. VX instruction format ## 1.6.20 **EVX-FORM** | 0 | 6 | 11 | 16 | 21 31 | |------|-------|-----|-----|-------| | OPCD | RS | RA | RB | XO | | OPCD | RS | RA | UI | XO | | OPCD | RT | /// | RB | XO | | OPCD | RT | RA | RB | XO | | OPCD | RT | RA | /// | XO | | OPCD | RT | UI | RB | XO | | OPCD | BF // | RA | RB | XO | | OPCD | RŤ | RA | UI | XO | | OPCD | RT | SI | /// | XO | Figure 23. EVX instruction format ## 1.6.21 **EVS-FORM** | 0 | 6 | 11 | 16 | 21 | 29 3 | 31 | |------|----|----|----|----|------|----| | OPCD | RT | RA | RB | XO | BF/ | ١ | Figure 24. EVS instruction format ## 1.6.22 **Z22-FORM** | 0 | 6 | | 11 15 | 16 | 22 | 31 | |------|----|----|-------|-----|----|----| | OPCD | BF | // | FRA | DCM | XO | / | | OPCD | BF | // | FRAp | DCM | XO | / | | OPCD | BF | // | FRA | DGM | XO | / | | OPCD | BF | // | FRAp | DGM | XO | / | | OPCD | FR | Т | FRA | SH | XO | Rc | | OPCD | FR | Гр | FRAp | SH | ХО | Rc | Figure 25. Z22 instruction format ## 1.6.23 Z23-FORM | 0 | 6 | 11 | 16 | 21 | 23 | 31 | |------|------|------|--------|-----|----|----| | OPCD | FRT | TE | FRB | RMC | XO | Ro | | OPCD | FRTp | TE | FRBp | RMC | XO | Rc | | OPCD | FRT | FRA | FRB | RMC | XO | Rc | | OPCD | FRTp | FRA | FRBp | RMC | XO | Rc | | OPCD | FRTp | FRAp | FRBp | RMC | XO | Rc | | OPCD | FRT | /// | R FRB | RMC | XO | Rc | | OPCD | FRTp | /// | R FRBp | RMC | XO | Rc | Figure 26. Z23 instruction format ## 1.6.24 Instruction Fields #### AA (30) Absolute Address bit. - O The immediate field represents an address relative to the current instruction address. For I-form branches the effective address of the branch target is the sum of the LI field sign-extended to 64 bits and the address of the branch instruction. For B-form branches the effective address of the branch target is the sum of the BD field sign-extended to 64 bits and the address of the branch instruction. - 1 The immediate field represents an absolute address. For I-form branches the effective address of the branch target is the LI field sign-extended to 64 bits. For B-form branches the effective address of the branch target is the BD field sign-extended to 64 bits. #### BA (11:15) Field used to specify a bit in the CR to be used as a source. BB (16:20) Field used to specify a bit in the CR to be used as a source. #### BC (21:25) Field used to specify a bit in the CR to be used as a source. #### BD (16:29) Immediate field used to specify a 14-bit signed two's complement branch displacement which is concatenated on the right with 0b00 and sign-extended to 64 bits. #### BF (6:8) Field used to specify one of the CR fields or one of the FPSCR fields to be used as a target. ## BFA (11:13 or 29:31) Field used to specify one of the CR fields or one of the FPSCR fields to be used as a source. #### BH (19:20) Field used to specify a hint in the *Branch Conditional to Link Register* and *Branch Conditional to Count Register* instructions. The encoding is described in Section 2.4, "Branch Instructions". #### BI (11:15) Field used to specify a bit in the CR to be tested by a *Branch Conditional* instruction. #### BO (6:10) Field used to specify options for the *Branch Conditional* instructions. The encoding is described in Section 2.4, "Branch Instructions". #### BT (6:10) Field used to specify a bit in the CR or in the FPSCR to be used as a target. #### CT (7:10) Field used in X-form instructions to specify a cache target (see Section 3.3.2 of Book II). #### D (16:31) Immediate field used to specify a 16-bit signed two's complement integer which is sign-extended to 64 bits. #### DCM (16:21) Immediate field used as the Data Class Mask. #### DCR (11:20) Field used by the *Move To/From Device Control Register* instructions (see Book III-E). #### DGM (16:21) Immediate field used as the Data Group Mask. #### DQ (16:27) Immediate field used to specify a 12-bit signed two's complement integer which is concatenated on the right with 0b0000 and sign-extended to 64 bits. ## DS (16:29) Immediate field used to specify a 14-bit signed two's complement integer which is concatenated on the right with 0b00 and sign-extended to 64 bits. #### **DUI (6:10)** Field used by the *dnh* instruction (see Book II). #### **DUIS (11:20)** Field used by the *dnh* instruction (see Book II). ### E (16) Field used by the *Write MSR External Enable* instruction (see Book III-E). ## EH (31) Field used to specify a hint in the *Load and Reserve* instructions. The meaning is described in Section 3.4.2, "Load and Reserve and Store Conditional Instructions", in Book II. ### FLM (7:14) Field mask used to identify the FPSCR fields that are to be updated by the *mtfsf* instruction. #### FRA (11:15) Field used to specify an FPR to be used as a source. #### FRAp (11:15) Field used to specify an even/odd pair of FPRs to be concatenated and used as a source. #### FRB (16:20) Field used to specify an FPR to be used as a source. ## FRBp (16:20) Field used to specify an even/odd pair of FPRs to be concatenated and used as a source. ## FRC (21:25) Field used to specify an FPR to be used as a source. #### FRS (6:10) Field used to specify an FPR to be used as a source. ### FRSp (6:10) Field used to specify an even/odd pair of FPRs to be concatenated and used as a source. ## FRT (6:10) Field used to specify an FPR to be used as a target. ## FRTp (6:10) Field used to specify an even/odd pair of FPRs to be concatenated and used as a target. ## FXM (12:19) Field mask used to identify the CR fields that are to be written by the *mtcrf* and *mtocrf* instructions, or read by the *mfocrf* instruction. #### IH (8:10) Field used to specify a hint in the *SLB Invalidate All* instruction. The meaning is described in Section 5.9.3.1, "SLB Management Instructions", in Book III-S. ## L (6) Field used to specify whether the *mtfsf* instruction updates the entire FPSCR. ## L (10 or 15) Field used to specify whether a fixed-point Compare instruction is to compare 64-bit numbers or 32-bit numbers. Field used by the *Data Cache Block Flush* instruction (see Section 3.3.2 of Book II). Field used by the *Move To Machine State Register* and *TLB Invalidate Entry* instructions (see Book III). Field used to specify whether the *Floating-Point Estimate* instructions may treat denormalized operands as 0. #### L (9:10) Field used by the *Synchronize* instruction (see Section 3.4.1 of Book II). #### LEV (20:26) Field used by the System Call instruction. #### LI (6:29) Immediate field used to specify a 24-bit signed two's complement integer which is concatenated on the right with 0b00 and sign-extended to 64 bits. ## LK (31) LINK bit. - 0 Do not set the Link Register. - Set the Link Register. The address of the instruction following the *Branch* instruction is placed into the Link Register. ## MB (21:25) and ME (26:30) Fields used in M-form instructions to specify a 64-bit mask consisting of 1-bits from bit MB+32 through bit ME+32 inclusive and 0-bits elsewhere, as described in Section 3.3.13, "Fixed-Point Rotate and Shift Instructions" on page 82. ## MB (21:26) Field used in MD-form and MDS-form instructions to specify the first 1-bit of a 64-bit mask, as described in Section 3.3.13, "Fixed-Point Rotate and Shift Instructions" on page 82. ## ME (21:26) Field used in MD-form and MDS-form instructions to specify the last 1-bit of a 64-bit mask, as described in Section 3.3.13, "Fixed-Point Rotate and Shift Instructions" on page 82. #### MO (6:10) Field used in X-form instructions to specify a subset of storage accesses. #### NB (16:20) Field used to specify the number of bytes to move in an immediate Move Assist instruction. ## **OPCD (0:5)** Primary opcode field. ## OE (21) Field used by XO-form instructions to enable setting OV and SO in the XER. #### PMRN (11:20) Field used to specify a Performance Monitor Register for the *mfpmr* and *mtpmr* instructions. ## R (15) Immediate field that specifies whether the RMC is specifiying the primary or secondary encoding ## RA (11:15) Field used to specify a GPR to be used as a source or as a target. ### RB (16:20) Field used to specify a GPR to be used as a source. ## Rc (21 OR 31) RECORD bit. - 0 Do not alter the Condition Register. - 1 Set Condition Register Field 0, Field 1, or Field 6 as described in Section 2.3.1, "Condition Register" on page 30. ## RMC (21:22) Immediate field used for DFP rounding mode control. #### RS (6:10) Field used to specify a GPR to be used as a source. ### RSp (6:10) Field used to specify an even/odd pair of GPRs to be concatenated and used as a source. ## RT (6:10) Field used to specify a GPR to be used as a target. #### RTp (6:10) Field used to specify an even/odd pair of GPRs to be concatenated and used as a target. ## S (11) Immediate field that specifies signed versus unsigned conversion. ## SH (16:20, or 16:20 and 30, or 16:21) Field used to specify a shift amount. #### SHB (22:25) Field used to specify a shift amount in bytes. #### SI (16:31 or 11:15) Immediate field used to specify a 16-bit signed integer. ## SIM (11:15) Immediate field used to specify a 5-bit signed integer. ### SP (11:12) Immediate field that specifies signed versus unsigned conversion. #### SPR (11:20) Field used to specify a Special Purpose Register for the *mtspr* and *mfspr* instructions. ## SR (12:15) Field used by the Segment Register Manipulation instructions (see Book III-S). #### TBR (11:20) Field used by the *Move From Time Base* instruction (see Section 4.2.1 of Book II). #### TE (11:15) Immediate field that specifies a DFP exponent. #### TH (6:10 Field used by the data stream variant of the *dcbt* and *dcbtst* instructions (see Section 3.3.2 of Book II). ## TO (6:10) Field used to specify the conditions on which to trap. The encoding is described in Section 3.3.10, "Fixed-Point Trap Instructions" on page 73. #### U (16:19) Immediate field used as the data to be placed into a field in the FPSCR. #### UI (11:15, 16:20, or 16:31) Immediate field used to specify an unsigned integer. #### UIM (11:15, 12:15, 13:15, 14:15) Immediate field used to specify an unsigned integer. #### VRA (11:15) Field used to specify a VR to be used as a source. ## VRB (16:20) Field used to specify a VR to be used as a source. #### VRC (21:25) Field used to specify a VR to be used as a source. #### VRS (6:10) Field used to specify a VR to be used as a source. #### VRT (6:10) Field used to specify a VR to be used as a target. #### W (15) Field used by the *mtfsfi* and *mtfsf* instructions to specify the target word in the FPSCR. XO (21:28, 21:29, 21:30, 21:31, 22:30, 22:31, 23:30, 26:30, 26:31, 27:29, 27:30, or 30:31) Extended opcode field. ## 1.7 Classes of Instructions An instruction falls into exactly one of the following three classes: Defined Illegal Reserved The class is determined by examining the opcode, and the extended opcode if any. If the opcode, or combination of opcode and extended opcode, is not that of a defined instruction or a reserved instruction, the instruction is illegal. ## 1.7.1 Defined Instruction Class This class of instructions contains all the instructions defined in this document. A defined instruction can have preferred and/or invalid forms, as described in Section 1.8.1, "Preferred Instruction Forms" and Section 1.8.2, "Invalid Instruction Forms". Instructions that are part of a category that is not supported are treated as illegal instructions. ## 1.7.2 Illegal Instruction Class This class of instructions contains the set of instructions described in Appendix D of Book Appendices. Illegal instructions are available for future extensions of the Power ISA; that is, some future version of the Power ISA may define any of these instructions to perform new functions. Any attempt to execute an illegal instruction will cause the system illegal instruction error handler to be invoked and will have no other effect. An instruction consisting entirely of binary 0s is guaranteed always to be an illegal instruction. This increases the probability that an attempt to execute data or uninitialized storage will result in the invocation of the system illegal instruction error handler. ## 1.7.3 Reserved Instruction Class This class of instructions contains the set of instructions described in Appendix E of Book Appendices. Reserved instructions are allocated to specific purposes that are outside the scope of the Power ISA. Any attempt to execute a reserved instruction will: - perform the actions described by the implementation if the instruction is implemented; or - cause the system illegal instruction error handler to be invoked if the instruction is not implemented. # 1.8 Forms of Defined Instructions ## 1.8.1 Preferred Instruction Forms Some of the defined instructions have preferred forms. For such an instruction, the preferred form will execute in an efficient manner, but any other form may take significantly longer to execute than the preferred form. Instructions having preferred forms are: - the Condition Register Logical instructions - the Load/Store Multiple instructions - the Load/Store String instructions - the Or Immediate instruction (preferred form of no-op) - the Move To Condition Register Fields instruction ## 1.8.2 Invalid Instruction Forms Some of the defined instructions can be coded in a form that is invalid. An instruction form is invalid if one or more fields of the instruction, excluding the opcode field(s), are coded incorrectly in a manner that can be deduced by examining only the instruction encoding. In general, any attempt to execute an invalid form of an instruction will either cause the system illegal instruction error handler to be invoked or yield boundedly undefined results. Exceptions to this rule are stated in the instruction descriptions. Some instruction forms are invalid because the instruction contains a reserved value in a defined field (see Section 1.3.3 on page 5); these invalid forms are not discussed further. All other invalid forms are identified in the instruction descriptions. References to instructions elsewhere in this document assume the instruction form is not invalid, unless otherwise stated or obvious from context. #### **Assembler Note** Assemblers should report uses of invalid instruction forms as errors. # 1.8.3 Reserved-no-op Instructions [Category: Phased-In (sV2.07)] Reserved-no-op instructions include the following extended opcodes under primary opcode 31: 530, 562, 594, 626, 658, 690, 722, and 754. Reserved-no-op instructions are provided in the architecture to anticipate the eventual adoption of performance hint instructions to the architecture. For these instructions, which cause no visible change to architected state, employing a reserved-no-op opcode will allow software to use this new capability on new implementations that support it while remaining compatible with existing implementations that may not support the new function. When a reserved-no-op instruction is executed, no operation is performed. Reserved-no-op instructions are not assigned instruction names or mnemonics. There are no individual descriptions of reserved-no-op instructions in this document ## 1.9 Exceptions There are two kinds of exception, those caused directly by the execution of an instruction and those caused by an asynchronous event. In either case, the exception may cause one of several components of the system software to be invoked. The exceptions that can be caused directly by the execution of an instruction include the following: - an attempt to execute an illegal instruction, or an attempt by an application program to execute a "privileged" instruction (see Book III) (system illegal instruction error handler or system privileged instruction error handler) - the execution of a defined instruction using an invalid form (system illegal instruction error handler or system privileged instruction error handler) - an attempt to execute an instruction that is not provided by the implementation (system illegal instruction error handler) - an attempt to access a storage location that is unavailable (system instruction storage error handler or system data storage error handler) - an attempt to access storage with an effective address alignment that is invalid for the instruction (system alignment error handler) - the execution of a System Call instruction (system service program) - the execution of a Trap instruction that traps (system trap handler) - the execution of a floating-point instruction that causes a floating-point enabled exception to exist (system floating-point enabled exception error handler) - the execution of an auxiliary processor instruction that causes an auxiliary processor enabled exception to exist (system auxiliary processor enabled exception error handler) The exceptions that can be caused by an asynchronous event are described in Book III. The invocation of the system error handler is precise, except that the invocation of the auxiliary processor enabled exception error handler may be imprecise, and if one of the imprecise modes for invoking the system floating-point enabled exception error handler is in effect (see page 109), then the invocation of the system floating-point enabled exception error handler may also be imprecise. When the system error handler is invoked imprecisely, the excepting instruction does not appear to complete before the next instruction starts (because one of the effects of the excepting instruction, namely the invocation of the system error handler, has not yet occurred). Additional information about exception handling can be found in Book III. ## 1.10 Storage Addressing A program references storage using the effective address computed by the processor when it executes a *Storage Access* or *Branch* instruction (or certain other instructions described in Book II and Book III), or when it fetches the next sequential instruction. Bytes in storage are numbered consecutively starting with 0. Each number is the address of the corresponding byte. The byte ordering (Big-Endian or Little-Endian) for a storage access is specified by the operating system. In the Embedded environment this ordering is a page attribute (see Book II) and is specified independently for each virtual page, while in the Server environment it is a mode (see Book III-S) and applies to all storage. ## 1.10.1 Storage Operands Storage operands may be bytes, halfwords, words, doublewords, or quadwords (see book III), or, for the *Load/Store Multiple* and *Move Assist* instructions, a sequence of bytes or words. The address of a storage operand is the address of its first byte (i.e., of its lowest-numbered byte). Operand length is implicit for each instruction. The operand of a single-register *Storage Access* instruction or quadword *Load* or *Store* instruction, has a "natural" alignment boundary equal to the operand length. In other words, the "natural" address of an operand is an integral multiple of the operand length. A storage operand is said to be *aligned* if it is aligned at its natural boundary; otherwise it is said to be *unaligned*. See the following table. | Operand | Length | Addr <sub>60:63</sub> if aligned | |------------|----------|----------------------------------| | Byte | 8 bits | xxxx | | Halfword | 2 bytes | xxx0 | | Word | 4 bytes | xx00 | | Doubleword | 8 bytes | x000 | | Quadword | 16 bytes | 0000 | **Note:** An "x" in an address bit position indicates that the bit can be 0 or 1 independent of the contents of other bits in the address. The concept of alignment is also applied more generally, to any datum in storage. For example, a 12-byte datum in storage is said to be word-aligned if its address is an integral multiple of 4. Some instructions require their storage operands to have certain alignments. In addition, alignment may affect performance. For single-register *Storage Access* instructions and quadword *Load* and *Store* instructions, the best performance is obtained when storage operands are aligned. Additional effects of data placement on performance are described in Chapter 2 of Book II. When a storage operand of length N bytes starting at effective address EA is copied between storage and a register that is R bytes long (i.e., the register contains bytes numbered from 0, most significant, through R-1, least significant), the bytes of the operand are placed into the register or into storage in a manner that depends on the byte ordering for the storage access as shown in Figure 27, unless otherwise specified in the instruction description. | Big-Endian Byte Ordering | | | | | | | |---------------------------------------|-----------------------------------------|--|--|--|--|--| | Big-Liidiaii | Big-Elidiali Byte Ordering | | | | | | | Load | Store | | | | | | | for i=0 to N-1: | for i=0 to N-1: | | | | | | | $RT_{(R-N)+i} \leftarrow MEM(EA+i,1)$ | $MEM(EA+i,1) \leftarrow (RS)_{(R-N)+i}$ | | | | | | | Little-Endian | Byte Ordering | | | | | | | Load | Store | | | | | | | for i=0 to N-1: | for i=0 to N-1: | | | | | | | $RT_{(R-1)-i} \leftarrow MEM(EA+i,1)$ | $MEM(EA+i,1) \leftarrow (RS)_{(R-1)-i}$ | | | | | | | Notes: | | | | | | | - 1. In this table, subscripts refer to bytes in a register rather than to bits as defined in Section 1.3.2. - This table does not apply to the *Ivebx*, *Ivehx*, *Ivewx*, *stvebx*, *stvehx*, and *stvewx* instructions. Figure 27. Storage operands and byte ordering Figure 28 shows an example of a C language structure **s** containing an assortment of scalars and one character string. The value assumed to be in each structure element is shown in hex in the C comments; these values are used below to show how the bytes making up each structure element are mapped into storage. It is assumed that structure s is compiled for 32-bit mode or for a 32-bit implementation. (This affects the length of the pointer to c.) C structure mapping rules permit the use of padding (skipped bytes) in order to align the scalars on desirable boundaries. Figures 29 and 30 show each scalar aligned at its natural boundary. This alignment introduces padding of four bytes between **a** and **b**, one byte between **d** and **e**, and two bytes between **e** and **f**. The same amount of padding is present for both Big-Endian and Little-Endian mappings. The Big-Endian mapping of structure **s** is shown in Figure 29. Addresses are shown in hex at the left of each doubleword, and in small figures below each byte. The contents of each byte, as indicated in the C example in Figure 28, are shown in hex (as characters for the elements of the string). The Little-Endian mapping of structure s is shown in Figure 30. Doublewords are shown laid out from right to left, which is the common way of showing storage maps for processors that implement only Little-Endian byte ordering. ``` struct { int /* 0x1112_1314 word 0x2122_2324_2526_2728 doubleword double b; char * /* 0x3132_3334 word C; /* 'A', 'B', 'C', 'D', 'E', 'F', 'G' char d[7]; array of bytes /* 0x5152 short e; halfword /* 0x6162_6364 word int f; } s; ``` Figure 28. C structure 's', showing values of elements | 00 | 11 | 12 | 13 | 14 | | | | | |----|-----|-----|------|----|-----|-----|-----|-----| | | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | | 08 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | | | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F | | 10 | 31 | 32 | 33 | 34 | `A' | 'B' | `C' | 'D' | | | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | | 18 | `E' | `F′ | , G, | | 51 | 52 | | | | | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F | | 20 | 61 | 62 | 63 | 64 | | | | | | | 20 | 21 | 22 | 23 | | | | | Figure 29. Big-Endian mapping of structure 's' | | | | | 11 | 12 | 13 | 14 | 00 | |-----|-----|-----|-----|----|-----|-----|-----|----| | 07 | 06 | 05 | 04 | 03 | 02 | 01 | 00 | | | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 08 | | 0F | 0E | 0D | 0C | 0B | 0A | 09 | 08 | | | 'D' | `C' | `B' | `A' | 31 | 32 | 33 | 34 | 10 | | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | | | | | 51 | 52 | | `G' | `F' | `E' | 18 | | 1F | 1E | 1D | 1C | 1B | 1A | 19 | 18 | | | | | | | 61 | 62 | 63 | 64 | 20 | | | | | | 23 | 22 | 21 | 20 | | Figure 30. Little-Endian mapping of structure 's' ## 1.10.2 Instruction Fetches Instructions are always four bytes long and word-aligned (except for VLE instructions; see Book VLE). When an instruction starting at effective address EA is fetched from storage, the relative order of the bytes within the instruction depend on the byte ordering for the storage access as shown in Figure 31. | Big-Endian Byte Ordering | |------------------------------------------| | for i=0 to 3: | | inst <sub>i</sub> ← MEM(EA+i,1) | | Little-Endian Byte Ordering | | for i=0 to 3: | | inst <sub>3-i</sub> ← MEM(EA+i,1) | | Note: In this table, subscripts refer to | | bytes of the instruction rather than | | to bits as defined in Section 1.3.2. | Figure 31. Instructions and byte ordering Figure 32 shows an example of a small assembly lanquage program $\mathbf{p}$ . | loop: | | | |-------|--------|----------| | | cmplwi | r5,0 | | | beq | done | | | lwzux | r4,r5,r6 | | | add | r7,r7,r4 | | | subi | r5,r5,4 | | | b | loop | | | | | | done: | | | | | stw | r7,total | Figure 32. Assembly language program 'p' The Big-Endian mapping of program **p** is shown in Figure 33 (assuming the program starts at address 0). | 00 | loop: cmplwi r5,0 | | | | | beq | done | | |----|-------------------|-------|------|------|----|-------|-------|----| | | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | | 08 | lwzux r4,r5,r6 | | | | a | dd r7 | ,r7,r | 4 | | | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F | | 10 | subi r5,r5,4 | | | | | b 1 | оор | | | | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | | 18 | done | : stw | r7,t | otal | | | | | | | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F | Figure 33. Big-Endian mapping of program 'p' The Little-Endian mapping of program ${\bf p}$ is shown in Figure 34. | oop: cmplwi r5,0 | loop | | done | beq | | |-------------------|-------|----|-------|-------|----| | 03 02 01 00 | 03 | 04 | 05 | 06 | 07 | | lwzux r4,r5,r6 08 | lwz | 4 | ,r7,r | dd r7 | a | | OB OA 09 08 | 0B | 0C | 0D | 0E | 0F | | subi r5,r5,4 | su | | oop | b 1 | | | 3 12 11 10 | 13 | 14 | 15 | 16 | 17 | | one: stw r7,total | done: | | | | | | IB 1A 19 18 | 1B | 1C | 1D | 1E | 1F | Figure 34. Little-Endian mapping of program 'p' ## **Programming Note** The terms *Big-Endian* and *Little-Endian* come from Part I, Chapter 4, of Jonathan Swift's *Gulliver's Travels*. Here is the complete passage, from the edition printed in 1734 by George Faulkner in Dublin. ... our Histories of six Thousand Moons make no Mention of any other Regions, than the two great Empires of *Lilliput* and *Blefuscu*. Which two mighty Powers have, as I was going to tell you, been engaged in a most obstinate War for six and thirty Moons past. It began upon the following Occasion. It is allowed on all Hands, that the primitive Way of breaking Eggs before we eat them, was upon the larger End: But his present Majesty's Grand-father, while he was a Boy, going to eat an Egg, and breaking it according to the ancient Practice, happened to cut one of his Fingers. Whereupon the Emperor his Father, published an Edict, commanding all his Subjects, upon great Penalties, to break the smaller End of their Eggs. The People so highly resented this Law, that our Histories tell us, there have been six Rebellions raised on that Account; wherein one Emperor lost his Life, and another his Crown. These civil Commotions were constantly fomented by the Monarchs of Blefuscu; and when they were quelled, the Exiles always fled for Refuge to that Empire. It is computed that eleven Thousand Persons have, at several Times, suffered Death, rather than submit to break their Eggs at the smaller End. Many hundred large Volumes have been published upon this Controversy: But the Books of the Big-Endians have been long forbidden, and the whole Party rendered incapable by Law of holding Employments. During the Course of these Troubles, the Emperors of Blefuscu did frequently expostulate by their Ambassadors, accusing us of making a Schism in Religion, by offending against a fundamental Doctrine of our great Prophet Lustrog, in the fifty-fourth Chapter of the Brundrecal, (which is their Alcoran.) This, however, is thought to be a mere Strain upon the text: For the Words are these; That all true Believers shall break their Eggs at the convenient End: and which is the convenient End, seems, in my humble Opinion, to be left to every Man's Conscience, or at least in the Power of the chief Magistrate to determine. Now the Big-Endian Exiles have found so much Credit in the Emperor of Blefuscu's Court; and so much private Assistance and Encouragement from their Party here at home, that a bloody War has been carried on between the two Empires for six and thirty Moons with various Success; during which Time we have lost Forty Capital Ships, and a much greater Number of smaller Vessels, together with thirty thousand of our best Seamen and Soldiers; and the Damage received by the Enemy is reckoned to be somewhat greater than ours. However, they have now equipped a numerous Fleet, and are just preparing to make a Descent upon us: and his Imperial Majesty, placing great Confidence in your Valour and Strength, hath commanded me to lay this Account of his Affairs before you. # 1.10.3 Effective Address Calculation An effective address is computed by the processor when executing a *Storage Access* or *Branch* instruction (or certain other instructions described in Book II, Book III, and Book VLE) when fetching the next sequential instruction, or when invoking a system error handler. The following provides an overview of this process. More detail is provided in the individual instruction descriptions. Effective address calculations, for both data and instruction accesses, use 64-bit two's complement addition. All 64 bits of each address component participate in the calculation regardless of mode (32-bit or 64-bit). In this computation one operand is an address (which is by definition an unsigned number) and the second is a signed offset. Carries out of the most significant bit are ignored. In 64-bit mode, the entire 64-bit result comprises the 64-bit effective address. The effective address arithmetic wraps around from the maximum address, $2^{64}$ – 1, to address 0, except that if the current instruction is at effective address $2^{64}$ – 4 the effective address of the next sequential instruction is undefined. In 32-bit mode, the low-order 32 bits of the 64-bit result, preceded by 32 0 bits, comprise the 64-bit effective address for the purpose of addressing storage. When an effective address is placed into a register by an instruction or event, the value placed into the high-order 32 bits of the register differs between the Server environment and the Embedded environment. - Server environment: - Load with Update and Store with Update instructions set the high-order 32 bits of register RA to the high-order 32 bits of the 64-bit result. - In all other cases (e.g., the Link Register when set by Branch instructions having LK=1, Special Purpose Registers when set to an effec- tive address by invocation of a system error handler) the high-order 32 bits of the register are set to 0s except as described in the last sentence of this paragraph. Embedded environment: The high-order 32 bits of the register are set to an undefined value. As used to address storage, the effective address arithmetic appears to wrap around from the maximum address, $2^{32}$ - 1, to address 0, except that if the current instruction is at effective address $2^{32}$ - 4 the effective address of the next sequential instruction is undefined. The 64-bit current instruction address is not affected by a change from 32-bit mode to 64-bit mode, but is affected by a change from 64-bit mode to 32-bit mode. In the latter case, the high-order 32 bits are set to 0. The same is true for the 64-bit next instruction address, except as described in the last item of the list below. RA is a field in the instruction which specifies an address component in the computation of an effective address. A zero in the RA field indicates the absence of the corresponding address component. A value of zero is substituted for the absent component of the effective address computation. This substitution is shown in the instruction descriptions as (RAIO). Effective addresses are computed as follows. In the descriptions below, it should be understood that "the contents of a GPR" refers to the entire 64-bit contents, independent of mode, but that in 32-bit mode only bits 32:63 of the 64-bit result of the computation are used to address storage. - With X-form instructions, in computing the effective address of a data element, the contents of the GPR designated by RB (or the value zero for *Iswi* and *stswi*) are added to the contents of the GPR designated by RA or to zero if RA=0. - With D-form instructions, the 16-bit D field is sign-extended to form a 64-bit address component. In computing the effective address of a data element, this address component is added to the contents of the GPR designated by RA or to zero if RA=0. - With DS-form instructions, the 14-bit DS field is concatenated on the right with 0b00 and sign-extended to form a 64-bit address component. In computing the effective address of a data element, this address component is added to the contents of the GPR designated by RA or to zero if RA=0. - With I-form Branch instructions, the 24-bit LI field is concatenated on the right with 0b00 and sign-extended to form a 64-bit address component. If AA=0, this address component is added to the address of the Branch instruction to form the effective address of the next instruction. If AA=1, - this address component is the effective address of the next instruction. - With B-form Branch instructions, the 14-bit BD field is concatenated on the right with 0b00 and sign-extended to form a 64-bit address component. If AA=0, this address component is added to the address of the Branch instruction to form the effective address of the next instruction. If AA=1, this address component is the effective address of the next instruction. - With XL-form Branch instructions, bits 0:61 of the Link Register or the Count Register are concatenated on the right with 0b00 to form the effective address of the next instruction. - With sequential instruction fetching, the value 4 is added to the address of the current instruction to form the effective address of the next instruction, except that if the current instruction is at the maximum instruction effective address for the mode (2<sup>64</sup> 4 in 64-bit mode, 2<sup>32</sup> 4 in 32-bit mode) the effective address of the next sequential instruction is undefined. (There is one other exception to this rule; this exception involves changing between 32-bit mode and 64-bit mode and is described in Section 5.3.2 of Book III-S and Section 4.3.2 of Book III-E.) If the size of the operand of a storage access instruction is more than one byte, the effective address for each byte after the first is computed by adding 1 to the effective address of the preceding byte. ## **Chapter 2. Branch Processor** | 0.4 December December 20 and 12 | O.S. O. allifa a Deviate allege of a second | |------------------------------------|----------------------------------------------| | 2.1 Branch Processor Overview 29 | 2.5 Condition Register Instructions 37 | | 2.2 Instruction Execution Order 29 | 2.5.1 Condition Register Logical Instruc- | | 2.3 Branch Processor Registers 30 | tions | | 2.3.1 Condition Register 30 | 2.5.2 Condition Register Field Instruction . | | 2.3.2 Link Register 31 | 38 | | 2.3.3 Count Register | 2.6 System Call Instruction 39 | | 2.4 Branch Instructions | • | ## 2.1 Branch Processor Overview This chapter describes the registers and instructions that make up the Branch Processor facility. ## 2.2 Instruction Execution Order In general, instructions appear to execute sequentially, in the order in which they appear in storage. The exceptions to this rule are listed below. - Branch instructions for which the branch is taken cause execution to continue at the target address specified by the Branch instruction. - Trap instructions for which the trap conditions are satisfied, and System Call instructions, cause the appropriate system handler to be invoked. - Exceptions can cause the system error handler to be invoked, as described in Section 1.9, "Exceptions" on page 22. - Returning from a system service program, system trap handler, or system error handler causes execution to continue at a specified address. The model of program execution in which the processor appears to execute one instruction at a time, completing each instruction before beginning to execute the next instruction is called the "sequential execution model". In general, the processor obeys the sequential execution model. For the instructions and facilities defined in this Book, the only exceptions to this rule are the following. A floating-point exception occurs when the processor is running in one of the Imprecise floating-point exception modes (see Section 4.4). The instruction that causes the exception need not complete before the next instruction begins execution, with respect to setting exception bits and (if the exception is enabled) invoking the system error handler. A Store instruction modifies one or more bytes in an area of storage that contains instructions that will subsequently be executed. Before an instruction in that area of storage is executed, software synchronization is required to ensure that the instructions executed are consistent with the results produced by the Store instruction. #### **Programming Note** This software synchronization will generally be provided by system library programs (see Section 1.8 of Book II). Application programs should call the appropriate system library program before attempting to execute modified instructions. ## 2.3 Branch Processor Registers ## 2.3.1 Condition Register The Condition Register (CR) is a 32-bit register which reflects the result of certain operations, and provides a mechanism for testing (and branching). Figure 35. Condition Register The bits in the Condition Register are grouped into eight 4-bit fields, named CR Field 0 (CR0), ..., CR Field 7 (CR7), which are set in one of the following ways. - Specified fields of the CR can be set by a move to the CR from a GPR (mtcrf, mtocrf). - A specified field of the CR can be set by a move to the CR from another CR field (*mcrf*), from XER<sub>32:35</sub> (*mcrxr*), or from the FPSCR (*mcrfs*). - CR Field 0 can be set as the implicit result of a fixed-point instruction. - CR Field 1 can be set as the implicit result of a floating-point instruction. - CR Field 6 can be set as the implicit result of a vector instruction. - A specified CR field can be set as the result of a Compare instruction. Instructions are provided to perform logical operations on individual CR bits and to test individual CR bits. For all fixed-point instructions in which Rc=1, and for *addic.*, *andi.*, and *andis.*, the first three bits of CR Field 0 (bits 32:34 of the Condition Register) are set by signed comparison of the result to zero, and the fourth bit of CR Field 0 (bit 35 of the Condition Register) is copied from the SO field of the XER. "Result" here refers to the entire 64-bit value placed into the target register in 64-bit mode, and to bits 32:63 of the 64-bit value placed into the target register in 32-bit mode. ``` if (64-bit mode) then M \leftarrow 0 else M \leftarrow 32 if (target_register)<sub>M:63</sub> < 0 then c \leftarrow 0b100 else if (target_register)<sub>M:63</sub> > 0 then c \leftarrow 0b010 else c \leftarrow 0b000 CR0 \leftarrow c \mid | XER<sub>SO</sub> ``` If any portion of the result is undefined, then the value placed into the first three bits of CR Field 0 is undefined. The bits of CR Field 0 are interpreted as follows. ## Bit Description 0 **Negative** (LT) The result is negative. 1 **Positive** (GT) The result is positive. 2 **Zero** (EQ) The result is zero. 3 **Summary Overflow** (SO) This is a copy of the contents of XER<sub>SO</sub> at the completion of the instruction. The **stwcx.** and **stdcx.** instructions (see Section 3.4.2, "Load and Reserve and Store Conditional Instructions", in Book II) also set CR Field 0. For all floating-point instructions in which Rc=1, CR Field 1 (bits 36:39 of the Condition Register) is set to the Floating-Point exception status, copied from bits 0:3 of the Floating-Point Status and Control Register. This occurs regardless of whether any exceptions are enabled, and regardless of whether the writing of the result is suppressed (see Section 4.4, "Floating-Point Exceptions" on page 108). These bits are interpreted as follows. #### Bit Description - 0 Floating-Point Exception Summary (FX) This is a copy of the contents of FPSCR<sub>FX</sub> at the completion of the instruction. - 1 Floating-Point Enabled Exception Summary (FEX) This is a copy of the contents of FPSCR<sub>FEX</sub> at the completion of the instruction. Floating-Point Invalid Operation Exception Summary (VX) This is a copy of the contents of $\mathsf{FPSCR}_{\mathsf{VX}}$ at the completion of the instruction. 3 Floating-Point Overflow Exception (OX) This is a copy of the contents of FPSCR<sub>OX</sub> at the completion of the instruction. For *Compare* instructions, a specified CR field is set to reflect the result of the comparison. The bits of the specified CR field are interpreted as follows. A complete description of how the bits are set is given in the instruction descriptions in Section 3.3.9, "Fixed-Point Compare Instructions" on page 71, Section 4.6.8, "Floating-Point Compare Instructions" on page 138, and Section 7.3.9, "SPE Instruction Set" on page 268. #### Bit Description 0 Less Than, Floating-Point Less Than (LT, For fixed-point Compare instructions, (RA) < SI or (RB) (signed comparison) or (RA) < UI or (RB) (unsigned comparison). For floating-point Compare instructions, (FRA) < (FRB). - 1 **Greater Than, Floating-Point Greater Than** (GT, FG) - For fixed-point Compare instructions, (RA) > SI or (RB) (signed comparison) or (RA) > UI or (RB) (unsigned comparison). For floating-point Compare instructions, (FRA) > (FRB). - 2 Equal, Floating-Point Equal (EQ, FE) For fixed-point Compare instructions, (RA) = SI, UI, or (RB). For floating-point Compare instructions, (FRA) = (FRB). - 3 Summary Overflow, Floating-Point Unordered (SO,FU) For fixed-point *Compare* instructions, this is a copy of the contents of XER<sub>SO</sub> at the completion of the instruction. For floating-point *Compare* instructions, one or both of (FRA) and (FRB) is a NaN. ## 2.3.2 Link Register The Link Register (LR) is a 64-bit register. It can be used to provide the branch target address for the *Branch Conditional to Link Register* instruction, and it holds the return address after Branch instructions for which LK=1. Figure 36. Link Register ## 2.3.3 Count Register The Count Register (CTR) is a 64-bit register. It can be used to hold a loop count that can be decremented during execution of Branch instructions that contain an appropriately coded BO field. If the value in the Count Register is 0 before being decremented, it is -1 afterward. The Count Register can also be used to provide the branch target address for the *Branch Conditional to Count Register* instruction. Figure 37. Count Register ## 2.4 Branch Instructions The sequence of instruction execution can be changed by the *Branch* instructions. Because all instructions are on word boundaries, bits 62 and 63 of the generated branch target address are ignored by the processor in performing the branch. The *Branch* instructions compute the effective address (EA) of the target in one of the following four ways, as described in Section 1.10.3, "Effective Address Calculation" on page 26. - Adding a displacement to the address of the Branch instruction (Branch or Branch Conditional with AA=0). - Specifying an absolute address (Branch or Branch Conditional with AA=1). - Using the address contained in the Link Register (Branch Conditional to Link Register). - Using the address contained in the Count Register (Branch Conditional to Count Register). In all four cases, in 32-bit mode the final step in the address computation is setting the high-order 32 bits of the target address to 0. For the first two methods, the target addresses can be computed sufficiently ahead of the *Branch* instruction that instructions can be prefetched along the target path. For the third and fourth methods, prefetching instructions along the target path is also possible provided the Link Register or the Count Register is loaded sufficiently ahead of the *Branch* instruction. Branching can be conditional or unconditional, and the return address can optionally be provided. If the return address is to be provided (LK=1), the effective address of the instruction following the *Branch* instruction is placed into the Link Register after the branch target address has been computed; this is done regardless of whether the branch is taken. For *Branch Conditional* instructions, the BO field specifies the conditions under which the branch is taken, as shown in Figure 38. In the figure, M=0 in 64-bit mode and M=32 in 32-bit mode. | во | Description | |------------------|--------------------------------------------------------------------------------------------------| | 0000z | Decrement the CTR, then branch if the decremented CTR <sub>M:63</sub> ≠0 and CR <sub>BI</sub> =0 | | 0001z | Decrement the CTR, then branch if the decremented CTR <sub>M:63</sub> =0 and CR <sub>BI</sub> =0 | | 001at | Branch if CR <sub>BI</sub> =0 | | 0100z | Decrement the CTR, then branch if the decremented CTR <sub>M:63</sub> ≠0 and CR <sub>BI</sub> =1 | | 0101z | Decrement the CTR, then branch if the decremented CTR <sub>M:63</sub> =0 and CR <sub>BI</sub> =1 | | 011at | Branch if CR <sub>BI</sub> =1 | | 1a00t | Decrement the CTR, then branch if the decremented CTR <sub>M:63</sub> ≠0 | | 1a01t | Decrement the CTR, then branch if the decremented CTR <sub>M:63</sub> =0 | | 1z1zz | Branch always | | Notes:<br>1. "z" | denotes a bit that is ignored. | #### Figure 38. BO field encodings The "a" and "t" bits of the BO field can be used by software to provide a hint about whether the branch is likely to be taken or is likely not to be taken, as shown in Figure 39. 2. The "a" and "t" bits are used as described below. | at | Hint | |----|-------------------------------------------| | 00 | No hint is given | | 01 | Reserved | | 10 | The branch is very likely not to be taken | | 11 | The branch is very likely to be taken | Figure 39. "at" bit encodings ### - Programming Note - Many implementations have dynamic mechanisms for predicting whether a branch will be taken. Because the dynamic prediction is likely to be very accurate, and is likely to be overridden by any hint provided by the "at" bits, the "at" bits should be set to 0b00 unless the static prediction implied by at=0b10 or at=0b11 is highly likely to be correct. For Branch Conditional to Link Register and Branch Conditional to Count Register instructions, the BH field provides a hint about the use of the instruction, as shown in Figure 40. | вн | Hint | | | | |----|----------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--| | 00 | bclr[l]: | The instruction is a subroutine return | | | | | bcctr[I]: | The instruction is not a subroutine return; the target address is likely to be the same as the target address used the preceding time the branch was taken | | | | 01 | bclr[l]: | The instruction is not a subroutine return; the target address is likely to be the same as the target address used the preceding time the branch was taken | | | | | bcctr[I]: | Reserved | | | | 10 | Reserve | d | | | | 11 | <pre>bclr[I] and bcctr[I]: The target address is not<br/>predictable</pre> | | | | Figure 40. BH field encodings ## Programming Note - The hint provided by the BH field is independent of the hint provided by the "at" bits (e.g., the BH field provides no indication of whether the branch is likely to be taken). ## **Extended mnemonics for branches** Many extended mnemonics are provided so that *Branch Conditional* instructions can be coded with portions of the BO and BI fields as part of the mnemonic rather than as part of a numeric operand. Some of these are shown as examples with the Branch instructions. See Appendix D for additional extended mnemonics. ## Programming Note — The hints provided by the "at" bits and by the BH field do not affect the results of executing the instruction. The "z" bits should be set to 0, because they may be assigned a meaning in some future version of the architecture. ## **Programming Note** Many implementations have dynamic mechanisms for predicting the target addresses of <code>bclr[I]</code> and <code>bcctr[I]</code> instructions. These mechanisms may cache return addresses (i.e., Link Register values set by <code>Branch</code> instructions for which LK=1 and for which the branch was taken) and recently used branch target addresses. To obtain the best performance across the widest range of implementations, the programmer should obey the following rules. - Use Branch instructions for which LK=1 only as subroutine calls (including function calls, etc.). - Pair each subroutine call (i.e., each Branch instruction for which LK=1 and the branch is taken) with a bclr instruction that returns from the subroutine and has BH=0b00. - Do not use *bclrl* as a subroutine call. (Some implementations access the return address cache at most once per instruction; such implementations are likely to treat *bclrl* as a subroutine return, and not as a subroutine call.) - For bclr[I] and bcctr[I], use the appropriate value in the BH field. The following are examples of programming conventions that obey these rules. In the examples, BH is assumed to contain 0b00 unless otherwise stated. In addition, the "at" bits are assumed to be coded appropriately. Let A, B, and Glue be specific programs. #### ■ Loop counts: Keep them in the Count Register, and use a **bc** instruction (LK=0) to decrement the count and to branch back to the beginning of the loop if the decremented count is nonzero. Computed goto's, case statements, etc.: Use the Count Register to hold the address to branch to, and use a *bcctr* instruction (LK=0, and BH=0b11 if appropriate) to branch to the selected address. ■ Direct subroutine linkage: Here A calls B and B returns to A. The two branches should be as follows. - A calls B: use a bl or bcl instruction (LK=1). - B returns to A: use a *bclr* instruction (LK=0) (the return address is in, or can be restored to, the Link Register). - Indirect subroutine linkage: Here A calls Glue, Glue calls B, and B returns to A rather than to Glue. (Such a calling sequence is common in linkage code used when the subroutine that the programmer wants to call, here B, is in a different module from the caller; the Binder inserts "glue" code to mediate the branch.) The three branches should be as follows. - A calls Glue: use a bl or bcl instruction (LK=1). - Glue calls B: place the address of B into the Count Register, and use a *bcctr* instruction (LK=0). - B returns to A: use a *bcIr* instruction (LK=0) (the return address is in, or can be restored to, the Link Register). #### ■ Function call: Here A calls a function, the identity of which may vary from one instance of the call to another, instead of calling a specific program B. This case should be handled using the conventions of the preceding two bullets, depending on whether the call is direct or indirect, with the following differences. - If the call is direct, place the address of the function into the Count Register, and use a bcctrl instruction (LK=1) instead of a bl or bcl instruction. - For the bcctr[I] instruction that branches to the function, use BH=0b11 if appropriate. #### Compatibility Note - The bits corresponding to the current "a" and "t" bits, and to the current "z" bits except in the "branch always" BO encoding, had different meanings in versions of the architecture that precede Version 2.00. - The bit corresponding to the "t" bit was called the "y" bit. The "y" bit indicated whether to use the architected default prediction (y=0) or to use the complement of the default prediction (y=1). The default prediction was defined as follows. - If the instruction is **bc[/][a]** with a negative value in the displacement field, the branch is taken. (This is the only case in which the prediction corresponding to the "y" bit differs from the prediction corresponding to the "t" bit.) - In all other cases (bc[I][a] with a nonnegative value in the displacement field, bc[r[I], or bcctr[I]), the branch is not taken. - The BO encodings that test both the Count Register and the Condition Register had a "y" bit in place of the current "z" bit. The meaning of the "y" bit was as described in the preceding item. - The "a" bit was a "z" bit. Because these bits have always been defined either to be ignored or to be treated as hints, a given program will produce the same result on any implementation regardless of the values of the bits. Also, because even the "y" bit is ignored, in practice, by most processors that comply with versions of the architecture that precede Version 2.00, the performance of a given program on those processors will not be affected by the values of the bits. | Branch | | | I-form | |--------|-------------|----|-------------| | b | target_addr | | (AA=0 LK=0) | | ba | target_addr | | (AA=1 LK=0) | | bl | target_addr | | (AA=0 LK=1) | | bla | target_addr | | (AA=1 LK=1) | | | | | | | 18 | | LI | AA LK | 30 31 | if AA then | $NIA \leftarrow_{iea}$ | EXTS(LI 0b00) | | |------------|------------------------|-----------------|-------| | else | NIA ←iea | CIA + EXTS(LI | 0b00) | | if LK then | | | | target\_addr specifies the branch target address. If AA=0 then the branch target address is the sum of LI || 0b00 sign-extended and the address of this instruction, with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If AA=1 then the branch target address is the value LI || 0b00 sign-extended, with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If LK=1 then the effective address of the instruction following the Branch instruction is placed into the Link Register. #### **Special Registers Altered:** ## Branch Conditional B-form | bc | BO,BI,target_addr | (AA=0 LK=0) | |------|-------------------|-------------| | bca | BO,BI,target_addr | (AA=1 LK=0) | | bcl | BO,BI,target_addr | (AA=0 LK=1) | | bcla | BO,BI,target_addr | (AA=1 LK=1) | | | | | | 16 | ВО | BI | BD | AΑ | LK | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 30 | 31 | ``` if (64-bit mode) then M \leftarrow 0 else M \leftarrow 32 if \neg BO_2 then CTR \leftarrow CTR - 1 ctr_ok \leftarrow BO_2 | ((CTR<sub>M:63</sub> \neq 0) \oplus BO_3) cond_ok \leftarrow BO_0 | (CR<sub>BI+32</sub> \equiv BO_1) if ctr_ok & cond_ok then if AA then NIA \leftarrow iea EXTS(BD || 0b00) else NIA \leftarrow iea CIA + EXTS(BD || 0b00) if LK then LR \leftarrow iea CIA + 4 ``` BI+32 specifies the Condition Register bit to be tested. The BO field is used to resolve the branch as described in Figure 38. *target\_addr* specifies the branch target address. If AA=0 then the branch target address is the sum of BD || 0b00 sign-extended and the address of this instruction, with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If AA=1 then the branch target address is the value BD || 0b00 sign-extended, with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If LK=1 then the effective address of the instruction following the *Branch* instruction is placed into the Link Register. ## **Special Registers Altered:** $$\begin{array}{ccc} \mathsf{CTR} & & \mathsf{(if} \ \mathsf{BO}_2 \!\!=\!\! \mathsf{0}) \\ \mathsf{LR} & & \mathsf{(if} \ \mathsf{LK} \!\!=\!\! \mathsf{1}) \\ \end{array}$$ #### **Extended Mnemonics:** Examples of extended mnemonics for *Branch Conditional*: | Extended: | | Equiv | alent to: | |-----------|------------|-------|-------------| | blt | target | bc | 12,0,target | | bne | cr2,target | bc | 4,10,target | | bdnz | target | bc | 16,0,target | ## Branch Conditional to Link Register XL-form | bclr | BO,BI,BH | (LK=0) | |-------|----------|--------| | bclrl | BO,BI,BH | (LK=1) | | 19 | ВО | BI | /// | ВН | 16 | LK | |----|----|----|-----|----|----|----| | 0 | 6 | 11 | 16 | 19 | 21 | 31 | ``` if (64-bit mode) then M \leftarrow 0 else M \leftarrow 32 if \neg BO_2 then CTR \leftarrow CTR - 1 ctr_ok \leftarrow BO_2 | ((CTR_M:63 \neq 0) \oplus BO_3 cond_ok \leftarrow BO_0 | (CR_BI+32 \equiv BO_1) if ctr_ok & cond_ok then NIA \leftarrowiea LR_0:61 || 0b00 if LK then LR \leftarrowiea CIA + 4 ``` BI+32 specifies the Condition Register bit to be tested. The BO field is used to resolve the branch as described in Figure 38. The BH field is used as described in Figure 40. The branch target address is $LR_{0:61} \parallel 0b00$ , with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If LK=1 then the effective address of the instruction following the *Branch* instruction is placed into the Link Register. ## **Special Registers Altered:** $$\begin{array}{ll} \text{CTR} & \text{ (if BO}_2\text{=}0) \\ \text{LR} & \text{ (if LK=1)} \end{array}$$ #### **Extended Mnemonics:** Examples of extended mnemonics for *Branch Conditional to Link Register*. | Extended: | | Equiva | Equivalent to: | | | |-----------|-----|--------|----------------|--|--| | bclr | 4,6 | bclr | 4,6,0 | | | | bltlr | | bclr | 12,0,0 | | | | bnelr | cr2 | bclr | 4,10,0 | | | | bdnzlr | | bclr | 16,0,0 | | | ## - Programming Note bcIr, bcIrI, bcctr, and bcctrI each serve as both a basic and an extended mnemonic. The Assembler will recognize a bcIr, bcIrI, bcctr, or bcctrI mnemonic with three operands as the basic form, and a bcIr, bcIrI, bcctr, or bcctrI mnemonic with two operands as the extended form. In the extended form the BH operand is omitted and assumed to be 0b00. ## Branch Conditional to Count Register XL-form | bcctr | BO,BI,BH | (LK=0) | |-----------------|----------|--------| | bcctrl BO,BI,BH | | (LK=1) | | | | | | 19 | ВО | BI | /// | ВН | 528 | LK | |----|----|----|-----|------|-----|----| | 0 | 6 | 11 | 16 | 19 2 | 21 | 31 | $$\begin{array}{lll} {\rm cond\_ok} \; \leftarrow \; {\rm BO_0} \; | \; ({\rm CR_{BI+32}} \equiv {\rm BO_1}) \\ {\rm if} \; {\rm cond\_ok} \; {\rm then} \; {\rm NIA} \; \leftarrow_{\rm iea} \; {\rm CTR_{0:61}} \; | \; | \; {\rm 0b00} \\ {\rm if} \; {\rm LK} \; {\rm then} \; {\rm LR} \; \leftarrow_{\rm iea} \; {\rm CIA} \; + \; 4 \end{array}$$ BI+32 specifies the Condition Register bit to be tested. The BO field is used to resolve the branch as described in Figure 38. The BH field is used as described in Figure 40. The branch target address is $CTR_{0:61} \parallel 0b00$ , with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If LK=1 then the effective address of the instruction following the *Branch* instruction is placed into the Link Register. If the "decrement and test CTR" option is specified (BO<sub>2</sub>=0), the instruction form is invalid. ## **Special Registers Altered:** #### **Extended Mnemonics:** Examples of extended mnemonics for *Branch Conditional to Count Register*. | Extended: | | Equiva | lent to: | |-----------|-----|--------|----------| | bcctr | 4,6 | bcctr | 4,6,0 | | bltctr | | bcctr | 12,0,0 | | bnectr | cr2 | bcctr | 4,10,0 | ## 2.5 Condition Register Instructions ## 2.5.1 Condition Register Logical Instructions The Condition Register Logical instructions have preferred forms; see Section 1.8.1. In the preferred forms, the BT and BB fields satisfy the following rule. The bit specified by BT is in the same Condition Register field as the bit specified by BB. ## **Extended mnemonics for Condition** Register logical operations A set of extended mnemonics is provided that allow additional Condition Register logical operations, beyond those provided by the basic Condition Register Logical instructions, to be coded easily. Some of these are shown as examples with the Condition Register Logical instructions. See Appendix D for additional extended mnemonics. ## **Condition Register AND** XL-form **Condition Register NAND** XL-form crand BT,BA,BB | 19 | BT | BA | BB | 257 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$CR_{BT+32} \leftarrow \neg (CR_{BA+32} \& CR_{BB+32})$$ BT,BA,BB $CR_{BT+32} \leftarrow CR_{BA+32} \& CR_{BB+32}$ The bit in the Condition Register specified by BA+32 is ANDed with the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. ## **Special Registers Altered:** $CR_{BT+32}$ The bit in the Condition Register specified by BA+32 is ANDed with the bit in the Condition Register specified by BB+32, and the complemented result is placed into the bit in the Condition Register specified by BT+32. BB 225 31 ## Special Registers Altered: CR<sub>BT+32</sub> crnand ## Condition Register OR XL-form #### Condition Register XOR XL-form cror BT,BA,BB | 19 | RT | RΑ | RR | 449 | / | |----|----|------|----|--------------|----| | 13 | ٠, | D, ( | | ''' | ′ | | 0 | 6 | 11 | 16 | 21 | 31 | | O | U | 1 | 10 | <del>-</del> | 01 | $$CR_{BT+32} \leftarrow CR_{BA+32} \mid CR_{BB+32}$$ The bit in the Condition Register specified by BA+32 is ORed with the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. Special Registers Altered: $CR_{BT+32} \leftarrow CR_{BA+32} \oplus CR_{BB+32}$ XORed with the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. The bit in the Condition Register specified by BA+32 is ## **Special Registers Altered:** CR<sub>BT+32</sub> #### **Extended Mnemonics:** Example of extended mnemonics for Condition Register OR: **Extended Mnemonics:** CR<sub>BT+32</sub> Extended: Equivalent to: crmove Bx,By cror Bx,By,By Example of extended mnemonics for Condition Register XOR: > Extended: Equivalent to: crclr Bx crxor Bx,Bx,Bx ## **Condition Register NOR** XL-form ## Condition Register Equivalent XL-form crnor BT,BA,BB | 19 | BT | BA | BB | 33 | / | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | BT,BA,BB $$CR_{BT+32} \leftarrow \neg (CR_{BA+32} \mid CR_{BB+32})$$ 0 6 11 16 21 31 The bit in the Condition Register specified by BA+32 is ORed with the bit in the Condition Register specified by BB+32, and the complemented result is placed into the bit in the Condition Register specified by BT+32. The bit in the Condition Register specified by BA+32 is XORed with the bit in the Condition Register specified by BB+32, and the complemented result is placed into the bit in the Condition Register specified by BT+32. ## **Special Registers Altered:** ## **Special Registers Altered:** $CR_{BT+32} \leftarrow CR_{BA+32} \equiv CR_{BB+32}$ CR<sub>BT+32</sub> CR<sub>BT+32</sub> creqv #### **Extended Mnemonics:** #### **Extended Mnemonics:** Example of extended mnemonics for *Condition Register NOR*: Example of extended mnemonics for *Condition Register Equivalent*: **Extended:** Equivalent to: crnot Bx,By crnor Bx,By,By Extended: Equivalent to: crset Bx creqv Bx,Bx,Bx ## Condition Register AND with Complement XL-form ## Condition Register OR with Complement XL-form crandc BT,BA,BB | BA,BB | |-------| | | $$CR_{BT+32} \leftarrow CR_{BA+32} \& \neg CR_{BB+32}$$ $$CR_{BT+32} \leftarrow CR_{BA+32} \mid \neg CR_{BB+32}$$ The bit in the Condition Register specified by BA+32 is ANDed with the complement of the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. The bit in the Condition Register specified by BA+32 is ORed with the complement of the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. ## **Special Registers Altered:** ## **Special Registers Altered:** CR<sub>BT+32</sub> CR<sub>BT+32</sub> ## 2.5.2 Condition Register Field Instruction ## Move Condition Register Field XL-form mcrf BF,BFA $\mathtt{CR}_{4 \times \mathtt{BF} + 32:4 \times \mathtt{BF} + 35} \; \leftarrow \; \mathtt{CR}_{4 \times \mathtt{BFA} + 32:4 \times \mathtt{BFA} + 35}$ The contents of Condition Register field BFA are copied to Condition Register field BF. ## **Special Registers Altered:** CR field BF ## 2.6 System Call Instruction This instruction provides the means by which a program can call upon the system to perform a service. ## System Call SC-form sc LEV | 17 | /// | /// | // | LEV | // | 1 | / | |----|-----|-----|----|-----|----|----|----| | 0 | 6 | 11 | 16 | 20 | 27 | 30 | 31 | This instruction calls the system to perform a service. A complete description of this instruction can be found in Book III. The use of the LEV field is described in Book III. The LEV values greater than 1 are reserved, and bits 0:5 of the LEV field (instruction bits 20:25) are treated as a reserved field. When control is returned to the program that executed the *System Call* instruction, the contents of the registers will depend on the register conventions used by the program providing the system service. This instruction is context synchronizing (see Book III). #### **Special Registers Altered:** Dependent on the system service ## Programming Note sc serves as both a basic and an extended mnemonic. The Assembler will recognize an sc mnemonic with one operand as the basic form, and an sc mnemonic with no operand as the extended form. In the extended form the LEV operand is omitted and assumed to be 0. In application programs the value of the LEV operand for $\boldsymbol{sc}$ should be 0. ## **Chapter 3. Fixed-Point Processor** | 3.1 Fixed-Point Processor Overview . 41 3.2 Fixed-Point Processor Registers . 42 3.2.1 General Purpose Registers . 42 3.2.2 Fixed-Point Exception Register . 42 3.2.3 Program Priority Register [Category: Server] | | | | | | |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------|----------------------------------------| | 3.3.8.1 64-bit Fixed-Point Arithmetic | 3.2 Fixed-Point 3.2.1 General I 3.2.2 Fixed-Po 3.2.3 Program Server] | Purpose Registers | 3.3.10 Fixed 3.3.10.1 64- tions [Categ 3.3.11 Fixed Phased-In (: 3.3.12 Fixed 3.3.12.1 64- tions [Categ 3.3.12.2 Ph Instructions (sV2.05)] 3.3.13 Fixed Instructions 3.3.13.1 Fix 3.3.13.1.1 6 Instructions 3.3.13.2 Fix 3.3.13.2.1 6 tions [Categ 3.3.14 Move Instructions 3.3.14.1 Move Register Fie Phased-In (: 3.3.14.2 Move | d-Point Trap Instructions | 73777777777777777777777777777777777777 | ## 3.1 Fixed-Point Processor Overview This chapter describes the registers and instructions that make up the Fixed-Point Processor facility. ## 3.2 Fixed-Point Processor Registers ## 3.2.1 General Purpose Registers All manipulation of information is done in registers internal to the Fixed-Point Processor. The principal storage internal to the Fixed-Point Processor is a set of 32 General Purpose Registers (GPRs). See Figure 41. | | GPR 0 | |---|--------| | | GPR 1 | | | | | | | | | GPR 30 | | | GPR 31 | | 0 | 63 | Figure 41. General Purpose Registers Each GPR is a 64-bit register. # 3.2.2 Fixed-Point Exception Register The Fixed-Point Exception Register (XER) is a 64-bit register. #### Figure 42. Fixed-Point Exception Register The bit definitions for the Fixed-Point Exception Register are shown below. Here M=0 in 64-bit mode and M=32 in 32-bit mode. The bits are set based on the operation of an instruction considered as a whole, not on intermediate results (e.g., the *Subtract From Carrying* instruction, the result of which is specified as the sum of three values, sets bits in the Fixed-Point Exception Register based on the entire operation, not on an intermediate sum). ## Bit(s Description #### 0:31 Reserved ## 32 **Summary Overflow** (SO) The Summary Overflow bit is set to 1 whenever an instruction (except *mtspr*) sets the Overflow bit. Once set, the SO bit remains set until it is cleared by an *mtspr* instruction (specifying the XER) or an *mcrxr* instruction. It is not altered by *Compare* instructions, nor by other instructions (except *mtspr* to the XER, and *mcrxr*) that cannot overflow. Executing an *mtspr* instruction to the XER, supplying the values 0 for SO and 1 for OV, causes SO to be set to 0 and OV to be set to 1 ## 33 **Overflow** (OV) The Overflow bit is set to indicate that an overflow has occurred during execution of an instruction. XO-form *Add*, *Subtract From*, and *Negate* instructions having OE=1 set it to 1 if the carry out of bit M is not equal to the carry out of bit M+1, and set it to 0 otherwise. XO-form *Multiply Low* and *Divide* instructions having OE=1 set it to 1 if the result cannot be represented in 64 bits (*mulld*, *divd*, *divdu*) or in 32 bits (*mullw*, *divw*, *divwu*), and set it to 0 otherwise. The OV bit is not altered by *Compare* instructions, nor by other instructions (except *mtspr* to the XER, and *mcrxr*) that cannot overflow. #### [Category: Legacy Integer Multiply-Accumulate] XO-form Legacy Integer Multiply-Accumulate instructions set OV when OE=1 to reflect overflow of the 32-bit result. For signed-integer accumulation, overflow occurs when the add produces a carry out of bit 32 that is not equal to the carry out of bit 33. For unsigned-integer accumulation, overflow occurs when the add produces a carry out of bit 32. #### 34 **Carry** (CA) The Carry bit is set as follows, during execution of certain instructions. Add Carrying, Subtract From Carrying, Add Extended, and Subtract From Extended types of instructions set it to 1 if there is a carry out of bit M, and set it to 0 otherwise. Shift Right Algebraic instructions set it to 1 if any 1-bits have been shifted out of a negative operand, and set it to 0 otherwise. The CA bit is not altered by Compare instructions, nor by other instructions (except Shift Right Algebraic, mtspr to the XER, and mcrxr) that cannot carry. #### 35:56 Reserved 57:63 This field specifies the number of bytes to be transferred by a *Load String Indexed* or *Store String Indexed* instruction. [Category: Legacy Move Assist] This field is used as a target by **dmlzb** to indicate the byte location of the leftmost zero byte found. # 3.2.3 Program Priority Register [Category: Server] The Program Priority Register (PPR) is a 64-bit register that controls the program's priority. The layout of the PPR is shown in Figure 43. ## Bit(s) Description ## 11:13 Program Priority (PRI) 010 low 011 medium low 100 medium (normal) 44:63 implementation-specific (read-only; values written to this field by software are ignored) All other fields are reserved. Figure 43. Program Priority Register ## Programming Note By setting the PRI field, a programmer may be able to improve system throughput by causing system resources to be used more efficiently. E.g., if a program is waiting on a lock (see Section B.2 of Book II), it could set low priority, with the result that more processor resources would be diverted to the program that holds the lock. This diversion of resources may enable the lock-holding program to complete the operation under the lock more quickly, and then relinquish the lock to the waiting program. ## Programming Note or Rx,Rx,Rx can be used to modify the PRI field; see Section 3.3.14. ## Programming Note When the system error handler is invoked, the PRI field may be set to an undefined value. # 3.2.4 Software Use SPRs [Category: Embedded] Software Use SPRs are 64-bit registers that have no defined functionality. SPRG4-7 can be read by applica- tion programs. Additional Software Use SPRs are defined in Book III. | | SPRG4 | | |---|-------|----| | | SPRG5 | | | | SPRG6 | | | | SPRG7 | | | 0 | | 63 | Figure 44. Software-use SPRs The VRSAVE is a 32-bit register that also can be used as a software use SPR. VRSAVE is also defined as part of Category: Embedded and Vector (see Section 6.3.3) #### Programming Note - USPRG0 was made a 32-bit register and renamed to VRSAVE; see Section 6.3.3 # 3.2.5 Device Control Registers [Category: Embedded] Device Control Registers (DCRs) are on-chip registers that exist architecturally outside the processor and thus are not actually part of the processor architecture. This specification simply defines the existence of a Device Control Register 'address space' and the instructions to access them and does not define the Device Control Registers themselves. Device Control Registers may control the use of on-chip peripherals, such as memory controllers (the definition of specific Device Control Registers is implementation-dependent). The contents of user-mode-accessible Device Control Registers can be read using *mfdcrux* and written using *mtdcrux*. ## 3.3 Fixed-Point Processor Instructions ## 3.3.1 Fixed-Point Storage Access Instructions The Storage Access instructions compute the effective address (EA) of the storage to be accessed as described in Section 1.10.3 on page 26. #### Programming Note - The *la* extended mnemonic permits computing an effective address as a *Load* or *Store* instruction would, but loads the address itself into a GPR rather than loading the value that is in storage at that address. ### **Programming Note** The DS field in DS-form Storage Access instructions is a word offset, not a byte offset like the D field in D-form *Storage Access* instructions. However, for programming convenience, Assemblers should support the specification of byte offsets for both forms of instruction. ## 3.3.1.1 Storage Access Exceptions Storage accesses will cause the system data storage error handler to be invoked if the program is not allowed to modify the target storage (*Store* only), or if the program attempts to access storage that is unavailable. ## 3.3.2 Fixed-Point Load Instructions The byte, halfword, word, or doubleword in storage addressed by EA is loaded into register RT. Many of the *Load* instructions have an "update" form, in which register RA is updated with the effective address. For these forms, if RA≠0 and RA≠RT, the effective address is placed into register RA and the storage element (byte, halfword, word, or doubleword) addressed by EA is loaded into RT. ## - Programming Note - In some implementations, the *Load Algebraic* and *Load with Update* instructions may have greater latency than other types of *Load* instructions. Moreover, *Load with Update* instructions may take longer to execute in some implementations than the corresponding pair of a non-update *Load* instruction and an *Add* instruction. ## Load Byte and Zero #### D-form ## Load Byte and Zero Indexed X-form RT,RA,RB lbz RT,D(RA) | | 34 | RT | RA | D | | |---|----|----|----|----|----| | 0 | | 6 | 11 | 16 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) RT \leftarrow 560 | MEM(EA, 1) ``` Let the effective address (EA) be the sum (RA|0)+ D. The byte in storage addressed by EA is loaded into RT $_{56:63}$ . RT $_{0:55}$ are set to 0. ## **Special Registers Altered:** None ## None ## Load Byte and Zero with Update D-form lbzu RT,D(RA) $$\begin{array}{lll} \texttt{EA} & \leftarrow & (\texttt{RA}) & + & \texttt{EXTS} \, (\texttt{D}) \\ \texttt{RT} & \leftarrow & ^{56}\texttt{0} & \big| \, \big| & \texttt{MEM} \, (\texttt{EA}, \ 1) \\ \texttt{RA} & \leftarrow & \texttt{EA} \end{array}$$ Let the effective address (EA) be the sum (RA)+ D. The byte in storage addressed by EA is loaded into RT $_{56:63}$ . RT $_{0:55}$ are set to 0. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. #### **Special Registers Altered:** None 31 RT RA RB 87 / 31 ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow 560 | MEM(EA, 1) ``` lbzx Let the effective address (EA) be the sum (RA|0)+ (RB). The byte in storage addressed by EA is loaded into $RT_{56:63}$ . $RT_{0:55}$ are set to 0. ## **Special Registers Altered:** ## Load Byte and Zero with Update Indexed X-form lbzux RT,RA,RB | 31 | RT | RA | RB | 119 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` EA \leftarrow (RA) + (RB) RT \leftarrow <sup>56</sup>0 || MEM(EA, 1) RA \leftarrow EA ``` Let the effective address (EA) be the sum (RA)+ (RB). The byte in storage addressed by EA is loaded into $RT_{56:63}$ . $RT_{0:55}$ are set to 0. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. ## **Special Registers Altered:** None ## Load Halfword and Zero #### D-form ## Load Halfword and Zero Indexed X-form lhz RT,D(RA) | | 40 | RT | RA | D | | |---|----|----|----|------|---| | 0 | | 6 | 11 | 16 3 | 1 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) RT $\leftarrow$ 480 | MEM(EA, 2) Let the effective address (EA) be the sum (RA|0)+ D. The halfword in storage addressed by EA is loaded into RT<sub>48:63</sub>. RT<sub>0:47</sub> are set to 0. ## **Special Registers Altered:** None ## lhzx RT,RA,RB | 31 | RT | RA | RB | 279 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) RT $\leftarrow$ $^{48}$ 0 | MEM(EA, 2) Let the effective address (EA) be the sum (RA|0)+ (RB). The halfword in storage addressed by EA is loaded into RT $_{48:63}$ . RT $_{0:47}$ are set to 0. ## **Special Registers Altered:** None ## Load Halfword and Zero with Update D-form ## Ihzu RT,D(RA) | 41 | RT | RA | D | | |----|----|----|----|----| | 0 | 6 | 11 | 16 | 31 | $$EA \leftarrow (RA) + EXTS(D)$$ $RT \leftarrow {}^{48}O \mid \mid MEM(EA, 2)$ Let the effective address (EA) be the sum (RA)+ D. The halfword in storage addressed by EA is loaded into $RT_{48:63}$ . $RT_{0:47}$ are set to 0. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. ## **Special Registers Altered:** None ## Load Halfword and Zero with Update Indexed X-form Ihzux RT,RA,RB | 31 | RT | RA | RB | 311 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$EA \leftarrow (RA) + (RB)$$ $RT \leftarrow {}^{48}0 \mid \mid MEM(EA, 2)$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+ (RB). The halfword in storage addressed by EA is loaded into $RT_{48:63}$ . $RT_{0:47}$ are set to 0. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. ## **Special Registers Altered:** None # Load Halfword Algebraic #### D-form # Load Halfword Algebraic Indexed X-form RT,RA,RB lha RT,D(RA) | | 42 | RT | RA | D | | |---|----|----|----|----|----| | 0 | | 6 | 11 | 16 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) RT \leftarrow EXTS(MEM(EA, 2)) ``` Let the effective address (EA) be the sum (RA|0)+ D. The halfword in storage addressed by EA is loaded into RT $_{48:63}$ . RT $_{0:47}$ are filled with a copy of bit 0 of the loaded halfword. #### **Special Registers Altered:** None # Load Halfword Algebraic with Update Load Halfword Algebraic w D-form Ihau RT,D(RA) | 13 | RT | RΔ | П | | |----|----|----|----|----| | 0 | 6 | 11 | 16 | 31 | ``` EA \leftarrow (RA) + EXTS(D) RT \leftarrow EXTS(MEM(EA, 2)) RA \leftarrow EA ``` Let the effective address (EA) be the sum (RA)+ D. The halfword in storage addressed by EA is loaded into RT $_{48:63}$ . RT $_{0:47}$ are filled with a copy of bit 0 of the loaded halfword. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. #### **Special Registers Altered:** None 31 | RT | RA | RB | 343 | / 0 | 6 | 11 | 16 | 21 | 31 ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow EXTS (MEM(EA, 2)) ``` Let the effective address (EA) be the sum (RA|0)+ (RB). The halfword in storage addressed by EA is loaded into RT $_{48:63}$ . RT $_{0:47}$ are filled with a copy of bit 0 of the loaded halfword. #### **Special Registers Altered:** None lhax # Load Halfword Algebraic with Update Indexed X-form Ihaux RT,RA,RB | 31 | RT | RA | RB | 375 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` EA \leftarrow (RA) + (RB) RT \leftarrow EXTS(MEM(EA, 2)) RA \leftarrow EA ``` Let the effective address (EA) be the sum (RA)+ (RB). The halfword in storage addressed by EA is loaded into RT $_{48:63}$ . RT $_{0:47}$ are filled with a copy of bit 0 of the loaded halfword. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. ## **Special Registers Altered:** #### Load Word and Zero #### D-form # Load Word and Zero Indexed X-form RB 16 23 31 RA 11 lwz RT,D(RA) | ſ | 32 | RT | RA | D | |---|----|----|----|-------| | ( | ) | 6 | 11 | 16 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) RT \leftarrow 320 | MEM(EA, 4) ``` Let the effective address (EA) be the sum (RA|0)+ D. The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. #### **Special Registers Altered:** None # if RA = 0 then b $\leftarrow$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) RT $\leftarrow$ 320 || MEM(EA, 4) RT,RA,RB RT Let the effective address (EA) be the sum (RA|0)+ (RB). The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. # **Special Registers Altered:** None lwzx 31 # Load Word and Zero with Update D-form #### lwzu RT,D(RA) $$EA \leftarrow (RA) + EXTS(D)$$ $RT \leftarrow ^{32}O \mid \mid MEM(EA, 4)$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+ D. The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. #### Special Registers Altered: None # Load Word and Zero with Update Indexed X-form lwzux RT,RA,RB | 31 | RT | RA | RB | 55 | / | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | EA $$\leftarrow$$ (RA) + (RB) RT $\leftarrow$ <sup>32</sup>0 | | MEM(EA, 4) RA $\leftarrow$ EA Let the effective address (EA) be the sum (RA)+ (RB). The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. # **Special Registers Altered:** # 3.3.2.1 64-bit Fixed-Point Load Instructions [Category: 64-Bit] # Load Word Algebraic # DS-form # Load Word Algebraic Indexed RT,RA,RB X-form lwa RT,DS(RA) | 58 | RT | RA | DS | 2 | |----|----|----|----|-------| | 0 | 6 | 11 | 16 | 30 31 | | if | RA | = | 0 | then | b | $\leftarrow$ | 0 | |-----|--------------|---|---|--------|----|--------------|-------| | els | se | | | | b | $\leftarrow$ | (RA) | | EΑ | $\leftarrow$ | b | + | EXTS ( | DS | | 0b00) | | | | | | (MFM ( | | | | Let the effective address (EA) be the sum (RA|0)+ (DS||0b00). The word in storage addressed by EA is loaded into RT $_{32:63}$ . RT $_{0:31}$ are filled with a copy of bit 0 of the loaded word. #### **Special Registers Altered:** None ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow EXTS (MEM(EA, 4)) ``` Let the effective address (EA) be the sum (RA|0)+ (RB). The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are filled with a copy of bit 0 of the loaded word. # **Special Registers Altered:** None lwax ## Load Word Algebraic with Update Indexed X-form lwaux RT,RA,RB | 31 | RT | RA | RB | 373 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` EA \leftarrow (RA) + (RB) RT \leftarrow EXTS(MEM(EA, 4)) RA \leftarrow EA ``` Let the effective address (EA) be the sum (RA)+ (RB). The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are filled with a copy of bit 0 of the loaded word. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. # **Special Registers Altered:** #### Load Doubleword #### DS-form # Load Doubleword Indexed RT,RA,RB X-form 31 21 ld RT,DS(RA) | | 58 | RT | RA | DS | 0 | |---|----|----|----|----|-------| | 0 | | 6 | 11 | 16 | 30 31 | ``` if RA = 0 then b \leftarrow 0 b \leftarrow (RA) EA \leftarrow b + EXTS(DS | | 0b00) RT \leftarrow MEM(EA, 8) ``` Let the effective address (EA) be the sum (RA|0)+ (DS||0b00). The doubleword in storage addressed by EA is loaded into RT. #### **Special Registers Altered:** None # 11 if RA = 0 then b $\leftarrow$ 0 RA ``` else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow MEM(EA, 8) ``` RT Let the effective address (EA) be the sum (RA|0)+ (RB). The doubleword in storage addressed by EA is loaded into RT. RB 21 16 #### **Special Registers Altered:** None ldx 31 #### Load Doubleword with Update DS-form #### ldu RT,DS(RA) | 58 | RT | RA | DS | 1 | |----|----|----|----|-------| | 0 | 6 | 11 | 16 | 30 31 | ``` EA \leftarrow (RA) + EXTS(DS | | 0b00) RT \leftarrow MEM(EA, 8) RA ← EA ``` Let the effective address (EA) be the sum (RA)+ (DS||0b00). The doubleword in storage addressed by EA is loaded into RT. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. #### Special Registers Altered: None # Load Doubleword with Update Indexed X-form ldux RT,RA,RB | 31 | RT | RA | RB | 53 | / | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` EA \leftarrow (RA) + (RB) RT \leftarrow MEM(EA, 8) RA ← EA ``` Let the effective address (EA) be the sum (RA)+ (RB). The doubleword in storage addressed by EA is loaded into RT. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. ## **Special Registers Altered:** # 3.3.3 Fixed-Point Store Instructions The contents of register RS are stored into the byte, halfword, word, or doubleword in storage addressed by EA. Many of the Store instructions have an "update" form, in which register RA is updated with the effective address. For these forms, the following rules apply. - If RA≠0, the effective address is placed into regis- - If RS=RA, the contents of register RS are copied to the target storage element and then EA is placed into RA (RS). # Store Byte #### D-form # Store Byte Indexed RS,RA,RB # X-form stb RS,D(RA) | 38 | RS | RA | D | | |----|----|----|----|----| | 0 | 6 | 11 | 16 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) MEM(EA, 1) $\leftarrow$ (RS)<sub>56:63</sub> Let the effective address (EA) be the sum (RA|0)+ D. (RS)<sub>56:63</sub> are stored into the byte in storage addressed by EA. # **Special Registers Altered:** None stbu RA ← EA stbx | 31 | RS | RA | RB | 215 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) MEM(EA, 1) $\leftarrow$ (RS)<sub>56:63</sub> Let the effective address (EA) be the sum (RA|0)+ (RB). (RS)<sub>56:63</sub> are stored into the byte in storage addressed by EA. #### **Special Registers Altered:** RS,RA,RB None stbux # Store Byte with Update RS,D(RA) #### D-form #### Store Byte with Update Indexed X-form | ĺ | 39 | RS | RA | D | | |---|----|----|----|----|----| | ١ | 0 | 6 | 11 | 16 | 31 | | EA ← (RA) | + EXTS(D) | |------------|-------------------------| | MEM(EA, 1) | ← (RS) <sub>56:63</sub> | Let the effective address (EA) be the sum (RA)+ D. (RS)<sub>56:63</sub> are stored into the byte in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** None | 31 | RS | RA | RB | 247 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{lll} \text{EA} \leftarrow & (\text{RA}) + (\text{RB}) \\ \text{MEM} (\text{EA}, 1) \leftarrow & (\text{RS})_{56:63} \\ \text{RA} \leftarrow & \text{EA} \end{array}$$ Let the effective address (EA) be the sum (RA)+ (RB). (RS)<sub>56:63</sub> are stored into the byte in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** #### Store Halfword #### D-form #### Store Halfword Indexed RS,RA,RB 11 RA X-form 31 407 RS,D(RA) sth | ſ | 44 | RS | RA | D | | |---|----|----|----|-------|---| | | 0 | 6 | 11 | 16 31 | ı | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) MEM(EA, 2) $\leftarrow$ (RS)<sub>48:63</sub> Let the effective address (EA) be the sum (RA|0)+ D. (RS)<sub>48:63</sub> are stored into the halfword in storage addressed by EA. #### **Special Registers Altered:** None # if RA = 0 then b $\leftarrow$ 0 b ← (RA) RS $EA \leftarrow b + (RB)$ sthx else 31 $\texttt{MEM}(\texttt{EA, 2}) \leftarrow (\texttt{RS})_{48:63}$ Let the effective address (EA) be the sum (RA|0)+(RB). $(RS)_{48:63}$ are stored into the halfword in storage addressed by EA. RB 21 16 #### **Special Registers Altered:** None # Store Halfword with Update # **D-form** sthu RS,D(RA) | 45 | RS | RA | D | | |----|----|----|----|----| | 0 | 6 | 11 | 16 | 31 | $$EA \leftarrow (RA) + EXTS(D)$$ $MEM(EA, 2) \leftarrow (RS)_{48:63}$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+ D. $(RS)_{48:63}$ are stored into the halfword in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### Special Registers Altered: None # Store Halfword with Update Indexed X-form RS,RA,RB sthux | 3′ | 1 R | S RA | RB | 439 | / | |----|-----|------|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$EA \leftarrow (RA) + (RB)$$ $MEM(EA, 2) \leftarrow (RS)_{48:63}$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+ (RB). (RS)<sub>48:63</sub> are stored into the halfword in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. # **Special Registers Altered:** #### Store Word #### D-form # Store Word Indexed RS,RA,RB stwx # X-form stw RS,D(RA) | 36 | RS | RA | D | | |----|----|----|------|----| | 0 | 6 | 11 | 16 3 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) MEM(EA, 4) $\leftarrow$ (RS)<sub>32:63</sub> Let the effective address (EA) be the sum (RA|0)+ D. (RS) $_{32:63}$ are stored into the word in storage addressed by EA. #### **Special Registers Altered:** None # 31 RS RA RB 151 / if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) MEM(EA, 4) $\leftarrow$ (RS)<sub>32:63</sub> Let the effective address (EA) be the sum (RA|0)+ (RB). (RS) $_{32:63}$ are stored into the word in storage addressed by EA. #### **Special Registers Altered:** None # Store Word with Update # D-form # Store Word with Update Indexed X-form stwu RS,D(RA) | 37 | RS | RA | D | ٦ | |----|----|----|------|---| | 0 | 6 | 11 | 16 3 | 1 | $$EA \leftarrow (RA) + EXTS(D)$$ $MEM(EA, 4) \leftarrow (RS)_{32:63}$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+ D. $(RS)_{32:63}$ are stored into the word in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** None | 0.0.0. | ora mar opeans masses | | |--------|-----------------------|--| | stwux | RS,RA,RB | | | 31 | RS | RA | RB | 183 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$EA \leftarrow (RA) + (RB)$$ $MEM(EA, 4) \leftarrow (RS)_{32:63}$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+ (RB). (RS)<sub>32:63</sub> are stored into the word in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** # 3.3.3.1 64-bit Fixed-Point Store Instructions [Category: 64-Bit] #### Store Doubleword #### DS-form #### Store Doubleword Indexed X-form std RS,DS(RA) | 62 | RS | RA | DS | 0 | |----|----|----|----|-------| | 0 | 6 | 11 | 16 | 30 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(DS | | 0b00) MEM(EA, 8) $\leftarrow$ (RS) Let the effective address (EA) be the sum (RA|0)+ (DS||0b00). (RS) is stored into the doubleword in storage addressed by EA. # **Special Registers Altered:** None # stdx RS,RA,RB | 31 | RS | RA | RB | 149 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) MEM(EA, 8) $\leftarrow$ (RS) Let the effective address (EA) be the sum (RA $|0\rangle$ + (RB). (RS) is stored into the doubleword in storage addressed by EA. # **Special Registers Altered:** None # Store Doubleword with Update DS-form ## stdu RS,DS(RA) | 62 | RS | RA | DS | 1 | |----|----|----|----|-------| | 02 | | | 50 | • | | 0 | 6 | 11 | 16 | 30 31 | | | | | | | $$EA \leftarrow (RA) + EXTS(DS \mid \mid 0b00)$$ $MEM(EA, 8) \leftarrow (RS)$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+ (DS||0b00). (RS) is stored into the doubleword in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### Special Registers Altered: None # Store Doubleword with Update Indexed X-form stdux RS,RA,RB | | 31 | RS | RA | RB | 181 | / | |---|----|----|----|----|-----|----| | ( | ) | 6 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA)+ (RB). (RS) is stored into the doubleword in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. # **Special Registers Altered:** # 3.3.4 Fixed-Point Load and Store with Byte Reversal Instructions ## - Programming Note These instructions have the effect of loading and storing data in the opposite byte ordering from that which would be used by other *Load* and *Store* instructions. #### Programming Note In some implementations, the Load Byte-Reverse instructions may have greater latency than other Load instructions. # Load Halfword Byte-Reverse Indexed X-form Ihbrx RT,RA,RB | 31 | RT | RA | RB | 790 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) load_data \leftarrow MEM(EA, 2) RT \leftarrow 480 || load_data<sub>8:15</sub> || load_data<sub>0:7</sub> ``` Let the effective address (EA) be the sum (RA|0)+(RB). Bits 0:7 of the halfword in storage addressed by EA are loaded into RT $_{56:63}$ . Bits 8:15 of the halfword in storage addressed by EA are loaded into RT $_{48:55}$ . RT $_{0:47}$ are set to 0. #### **Special Registers Altered:** None # Store Halfword Byte-Reverse Indexed X-form sthbrx RS,RA,RB | 31 | RS | RA | RB | 918 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA, 2) \leftarrow (RS)<sub>56:63</sub> | (RS)<sub>48:55</sub> ``` Let the effective address (EA) be the sum (RA|0)+(RB). $(RS)_{56:63}$ are stored into bits 0:7 of the halfword in storage addressed by EA. $(RS)_{48:55}$ are stored into bits 8:15 of the halfword in storage addressed by EA. ## **Special Registers Altered:** None #### Load Word Byte-Reverse Indexed X-form lwbrx RT,RA,RB | 31 | RT | RA | RB | 534 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) load_data \leftarrow MEM(EA, 4) RT \leftarrow 320 || load_data<sub>24:31</sub> || load_data<sub>16:23</sub> || load_data<sub>0:7</sub> ``` Let the effective address (EA) be the sum (RA|0)+ (RB). Bits 0:7 of the word in storage addressed by EA are loaded into RT $_{56:63}$ . Bits 8:15 of the word in storage addressed by EA are loaded into RT $_{48:55}$ . Bits 16:23 of the word in storage addressed by EA are loaded into RT $_{40:47}$ . Bits 24:31 of the word in storage addressed by EA are loaded into RT $_{32:39}$ . RT $_{0:31}$ are set to 0. #### **Special Registers Altered:** None #### Store Word Byte-Reverse Indexed X-form stwbrx RS,RA,RB | 31 | RS | RA | RB | 662 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA, 4) \leftarrow (RS)<sub>56:63</sub> || (RS)<sub>48:55</sub> || (RS)<sub>40:47</sub> || (RS)<sub>32:39</sub> ``` Let the effective address (EA) be the sum (RA|0)+(RB). $(RS)_{56:63}$ are stored into bits 0:7 of the word in storage addressed by EA. $(RS)_{48:55}$ are stored into bits 8:15 of the word in storage addressed by EA. $(RS)_{40:47}$ are stored into bits 16:23 of the word in storage addressed by EA. $(RS)_{32:39}$ are stored into bits 24:31 of the word in storage addressed by EA. # **Special Registers Altered:** # 3.3.5 Fixed-Point Load and Store Multiple Instructions The Load/Store Multiple instructions have preferred forms; see Section 1.8.1, "Preferred Instruction Forms" on page 21. In the preferred forms, storage alignment satisfies the following rule. ■ The combination of the EA and RT (RS) is such that the low-order byte of GPR 31 is loaded (stored) from (into) the last byte of an aligned quadword in storage. For the Server environment, the *Load/Store Multiple* instructions are not supported in Little-Endian mode. If they are executed in Little-Endian mode, the system alignment error handler is invoked. # Load Multiple Word # **D-form** lmw RT,D(RA) ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) r \leftarrow RT do while r \leq 31 GPR(r) \leftarrow 320 || MEM(EA, 4) r \leftarrow r + 1 EA \leftarrow EA + 4 ``` Let n = (32-RT). Let the effective address (EA) be the sum (RA|0)+ D. n consecutive words starting at EA are loaded into the low-order 32 bits of GPRs RT through 31. The high-order 32 bits of these GPRs are set to zero. If RA is in the range of registers to be loaded, including the case in which RA=0, the instruction form is invalid. ## **Special Registers Altered:** # Store Multiple Word # **D-form** stmw RS,D(RA) | 47 | RS | RA | D | |----|----|----|-------------------| | 0 | 6 | 11 | 16 3 <sup>-</sup> | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) r \leftarrow RS do while r \leq 31 MEM(EA, 4) \leftarrow GPR(r)<sub>32:63</sub> r \leftarrow r + 1 EA \leftarrow EA + 4 ``` Let n = (32-RS). Let the effective address (EA) be the sum (RA|0)+ D. n consecutive words starting at EA are stored from the low-order 32 bits of GPRs RS through 31. # **Special Registers Altered:** # 3.3.6 Fixed-Point Move Assist Instructions [Category: Move Assist] The Move Assist instructions allow movement of data from storage to registers or from registers to storage without concern for alignment. These instructions can be used for a short move between arbitrary storage locations or to initiate a long move between unaligned storage fields. The Load/Store String instructions have preferred forms; see Section 1.8.1, "Preferred Instruction Forms" on page 21. In the preferred forms, register usage satisfies the following rules. ■ RS = 4 or 5 - RT = 4 or 5 - last register loaded/stored ≤ 12 For some implementations, using GPR 4 for RS and RT may result in slightly faster execution than using GPR 5. For the Server environment, the *Move Assist* instructions are not supported in Little-Endian mode. If they are executed in Little-Endian mode, the system alignment error handler may be invoked or the instructions may be treated as no-ops if the number of bytes specified by the instruction is 0. # **Load String Word Immediate** # X-form # Load String Word Indexed X-form Iswi RT,RA,NB | ſ | 31 | RT | RA | NB | 597 | / | |---|----|----|----|----|-----|----| | ŀ | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then EA \leftarrow 0 else EA \leftarrow (RA) if NB = 0 then n \leftarrow 32 n \leftarrow NB else r ← RT - 1 i ← 32 do while n > 0 if i = 32 then r \leftarrow r + 1 \pmod{32} GPR(r) \leftarrow 0 GPR(r)_{i:i+7} \leftarrow MEM(EA, 1) i \leftarrow i + 8 if i = 64 then i \leftarrow 32 EA \leftarrow EA + 1 n \leftarrow n - 1 ``` Let the effective address (EA) be (RA|0). Let n = NB if $NB \neq 0$ , n = 32 if NB = 0; n is the number of bytes to load. Let nr = CEIL(n/4); nr is the number of registers to receive data. n consecutive bytes starting at EA are loaded into GPRs RT through RT+nr-1. Data are loaded into the low-order four bytes of each GPR; the high-order four bytes are set to 0. Bytes are loaded left to right in each register. The sequence of registers wraps around to GPR 0 if required. If the low-order four bytes of register RT+nr-1 are only partially filled, the unfilled low-order byte(s) of that register are set to 0. If RA is in the range of registers to be loaded, including the case in which RA=0, the instruction form is invalid. #### **Special Registers Altered:** None #### Iswx RT,RA,RB | | 31 | RT | RA | RB | 533 | / | |---|----|----|------|-----|-----|-----| | | ٠. | | 10/1 | 110 | 000 | l ′ | | 0 | | 6 | l11 | 16 | 21 | 31 | | | | - | | - | | - | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) n \leftarrow XER_{57:63} r ← RT - 1 i ← 32 RT ← undefined do while n > 0 if i = 32 then r \leftarrow r + 1 \pmod{32} GPR(r) \leftarrow 0 GPR(r)_{i:i+7} \leftarrow MEM(EA, 1) i \leftarrow i + 8 if i = 64 then i \leftarrow 32 EA \leftarrow EA + 1 n \leftarrow n - 1 ``` Let the effective address (EA) be the sum (RA|0)+ (RB). Let n=XER<sub>57:63</sub>; n is the number of bytes to load. Let nr=CEIL(n/4); nr is the number of registers to receive data. If n>0, n consecutive bytes starting at EA are loaded into GPRs RT through RT+nr-1. Data are loaded into the low-order four bytes of each GPR; the high-order four bytes are set to 0. Bytes are loaded left to right in each register. The sequence of registers wraps around to GPR 0 if required. If the low-order four bytes of register RT+nr-1 are only partially filled, the unfilled low-order byte(s) of that register are set to 0. If n=0, the contents of register RT are undefined. If RA or RB is in the range of registers to be loaded, including the case in which RA=0, the instruction is treated as if the instruction form were invalid. If RT=RA or RT=RB, the instruction form is invalid. #### **Special Registers Altered:** # Store String Word Immediate X-form # Store String Word Indexed X-form stswi RS,RA,NB | | 31 | RS | RA | NB | 725 | / | |---|----|----|----|----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then EA \leftarrow 0 else EA \leftarrow (RA) if NB = 0 then n \leftarrow 32 else n \leftarrow NB r \leftarrow RS - 1 i \leftarrow 32 do while n > 0 if i = 32 then r \leftarrow r + 1 (mod 32) MEM(EA, 1) \leftarrow GPR(r)<sub>i:i+7</sub> i \leftarrow i + 8 if i = 64 then i \leftarrow 32 EA \leftarrow EA + 1 n \leftarrow n - 1 ``` Let the effective address (EA) be (RA|0). Let n = NB if $NB \neq 0$ , n = 32 if NB = 0; n = 12 is the number of bytes to store. Let n = CEIL(n/4); n = 12 is the number of registers to supply data. n consecutive bytes starting at EA are stored from GPRs RS through RS+nr-1. Data are stored from the low-order four bytes of each GPR. Bytes are stored left to right from each register. The sequence of registers wraps around to GPR 0 if required. # **Special Registers Altered:** None ### stswx RS,RA,RB | 31 | RS | RA | RB | 661 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) n \leftarrow XER<sub>57:63</sub> r \leftarrow RS - 1 i \leftarrow 32 do while n > 0 if i = 32 then r \leftarrow r + 1 (mod 32) MEM(EA, 1) \leftarrow GPR(r)<sub>i:i+7</sub> i \leftarrow i + 8 if i = 64 then i \leftarrow 32 EA \leftarrow EA + 1 n \leftarrow n - 1 ``` Let the effective address (EA) be the sum (RA|0)+ (RB). Let $n = XER_{57:63}$ ; n is the number of bytes to store. Let n = CEIL(n/4); nr is the number of registers to supply data. If n>0, n consecutive bytes starting at EA are stored from GPRs RS through RS+nr-1. Data are stored from the low-order four bytes of each GPR. Bytes are stored left to right from each register. The sequence of registers wraps around to GPR 0 if required. If n=0, no bytes are stored. #### **Special Registers Altered:** # 3.3.7 Other Fixed-Point Instructions The remainder of the fixed-point instructions use the contents of the General Purpose Registers (GPRs) as source operands, and place results into GPRs, into the Fixed-Point Exception Register (XER), and into Condition Register fields. In addition, the *Trap* instructions test the contents of a GPR or XER bit, invoking the system trap handler if the result of the specified test is true. These instructions treat the source operands as signed integers unless the instruction is explicitly identified as performing an unsigned operation. The X-form and XO-form instructions with Rc=1, and the D-form instructions *addic.*, *andi.*, and *andis.*, set the first three bits of CR Field 0 to characterize the result placed into the target register. In 64-bit mode, these bits are set by signed comparison of the result to zero. In 32-bit mode, these bits are set by signed comparison of the low-order 32 bits of the result to zero. Unless otherwise noted and when appropriate, when CR Field 0 and the XER are set they reflect the value placed into the target register. # **Programming Note** Instructions with the OE bit set or that set CA may execute slowly or may prevent the execution of subsequent instructions until the instruction has completed. # 3.3.8 Fixed-Point Arithmetic Instructions The XO-form Arithmetic instructions with Rc=1, and the D-form Arithmetic instruction addic., set the first three bits of CR Field 0 as described in Section 3.3.7, "Other Fixed-Point Instructions". addic, addic., subfic, addc, subfc, adde, subfe, addme, subfme, addze, and subfze always set CA, to reflect the carry out of bit 0 in 64-bit mode and out of bit 32 in 32-bit mode. The XO-form Arithmetic instructions set SO and OV when OE=1 to reflect overflow of the result. Except for the Multiply Low and Divide instructions, the setting of these bits is mode-dependent, and reflects overflow of the 64-bit result in 64-bit mode and overflow of the low-order 32-bit result in 32-bit mode. For XO-form Multiply Low and Divide instructions, the setting of these bits is mode-independent, and reflects overflow of the 64-bit result for mulld, divd, and divdu, and overflow of the low-order 32-bit result for mullw, divw, and divwu. #### Programming Note - Notice that CR Field 0 may not reflect the "true" (infinitely precise) result if overflow occurs. # Extended mnemonics for addition and subtraction Several extended mnemonics are provided that use the Add Immediate and Add Immediate Shifted instructions to load an immediate value or an address into a target register. Some of these are shown as examples with the two instructions. The Power ISA supplies Subtract From instructions, which subtract the second operand from the third. A set of extended mnemonics is provided that use the more "normal" order, in which the third operand is subtracted from the second, with the third operand being either an immediate field or a register. Some of these are shown as examples with the appropriate Add and Subtract From instructions. See Appendix D for additional extended mnemonics. #### Add Immediate #### **D-form** #### Add Immediate Shifted addi RT,RA,SI | | 14 | RT | RA | SI | | |---|----|----|----|----|----| | 0 | | 6 | 11 | 16 | 31 | if RA = 0 then $RT \leftarrow EXTS(SI)$ $RT \leftarrow (RA) + EXTS(SI)$ The sum (RA|0) + SI is placed into register RT. ## **Special Registers Altered:** None #### **Extended Mnemonics:** Examples of extended mnemonics for Add Immediate: | Extended: | | Equiv | Equivalent to: | | | |-----------|-------------|-------|----------------|--|--| | li | Rx,value | addi | Rx,0,value | | | | la | Rx,disp(Ry) | addi | Rx,Ry,disp | | | | subi | Rx,Ry,value | addi | Rx,Ry,-value | | | ## Programming Note - addi, addis, add, and subf are the preferred instructions for addition and subtraction, because they set few status bits. Notice that addi and addis use the value 0, not the contents of GPR 0, if RA=0. D-form addis RT,RA,SI | 15 | RT | RA | SI | | |----|----|----|------|---| | 0 | 6 | 11 | 16 3 | 1 | if RA = 0 then RT $\leftarrow$ EXTS(SI || $^{16}$ 0) $RT \leftarrow (RA) + EXTS(SI || ^{16}0)$ The sum (RA|0) + (SI || 0x0000) is placed into register # **Special Registers Altered:** None #### **Extended Mnemonics:** Examples of extended mnemonics for Add Immediate Shifted: | Extended: | | Equivalent to: | | | |-----------|-------------|----------------|--------------|--| | lis | Rx,value | addis | Rx,0,value | | | suhis | Rx Ry value | addis | Rx Rv -value | | | Add | | XO-form | Subtract From | XO-form | |------------------------------|----------------------------------------------|----------------------------------------------------------|----------------------------------------------|------------------------------------| | add<br>add.<br>addo<br>addo. | RT,RA,RB<br>RT,RA,RB<br>RT,RA,RB<br>RT,RA,RB | (OE=0 Rc=0)<br>(OE=0 Rc=1)<br>(OE=1 Rc=0)<br>(OE=1 Rc=1) | subf RT,RA, subfo RT,RA, subfo RT,RA, subfo. | ,RB (OE=0 Rc=1)<br>,RB (OE=1 Rc=0) | | 31<br>0 6 | RT RA RB | OE 266 Rc 31 | 31 RT 0 6 1 | RA RB OE 40 Rc 16 21 22 31 | $RT \leftarrow (RA) + (RB)$ The sum (RA) + (RB) is placed into register RT. # **Special Registers Altered:** $RT \leftarrow \neg (RA) + (RB) + 1$ The sum $\neg$ (RA) + (RB) +1 is placed into register RT. #### **Special Registers Altered:** | CR0 | (if Rc=1) | |-------|-----------| | SO OV | (if OE=1) | #### **Extended Mnemonics:** Example of extended mnemonics for Subtract From: | Extended: | | Equivalent to: | | | |-----------|----------|----------------|----------|--| | sub | Rx,Ry,Rz | subf | Rx,Rz,Ry | | # Add Immediate Carrying D-form addic RT,RA,SI $RT \leftarrow (RA) + EXTS(SI)$ The sum (RA) + SI is placed into register RT. #### **Special Registers Altered:** CA # **Extended Mnemonics:** Example of extended mnemonics for *Add Immediate Carrying*: Extended: Equivalent to: subic Rx,Ry,value addic Rx,Ry,-value # Add Immediate Carrying and Record D-form addic. RT,RA,SI RT ← (RA) + EXTS(SI) The sum (RA) + SI is placed into register RT. # **Special Registers Altered:** CR0 CA #### **Extended Mnemonics:** Example of extended mnemonics for *Add Immediate* Carrying and Record: **Extended:** Equivalent to: subic. Rx,Ry,value addic. Rx,Ry,-value # Subtract From Immediate Carrying D-form subfic RT,RA,SI $RT \leftarrow \neg(RA) + EXTS(SI) + 1$ The sum $\neg$ (RA) + SI + 1 is placed into register RT. # **Special Registers Altered:** CA | Add Ca | rrying | XO-form | |----------------------------------|----------------------------------------------|----------------------------------------------------------| | addc<br>addc.<br>addco<br>addco. | RT,RA,RB<br>RT,RA,RB<br>RT,RA,RB<br>RT,RA,RB | (OE=0 Rc=0)<br>(OE=0 Rc=1)<br>(OE=1 Rc=0)<br>(OE=1 Rc=1) | | 31 | RT RA | RB OF 10 Rc | | 31 | RT | RA | RB | OE | 10 | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$RT \leftarrow (RA) + (RB)$$ The sum (RA) + (RB) is placed into register RT. # **Special Registers Altered:** CA CR0 (if Rc=1) SO OV (if OE=1) | Subtract From Carrying | XO-form | |------------------------|---------| | | | | subfc | RT,RA,RB | (OE=0 Rc=0) | |---------|----------|-------------| | subfc. | RT,RA,RB | (OE=0 Rc=1) | | subfco | RT,RA,RB | (OE=1 Rc=0) | | subfco. | RT,RA,RB | (OE=1 Rc=1) | | | | | | 31 | RT | RA | RB | OE | 8 | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$RT \leftarrow \neg (RA) + (RB) + 1$$ The sum $\neg$ (RA) + (RB) + 1 is placed into register RT. # **Special Registers Altered:** # **Extended Mnemonics:** Example of extended mnemonics for *Subtract From Carrying*: **Extended:** Equivalent to: subc Rx,Ry,Rz subfc Rx,Rz,Ry | Add Extended | | | | XO-f | orm | | | |----------------------------------|------------------------------|--------------|----|-------|--------------------------------------|----------------|--| | adde<br>adde.<br>addeo<br>addeo. | RT,R<br>RT,R<br>RT,R<br>RT,R | A,RB<br>A,RB | | ( | OE=0 F<br>OE=0 F<br>OE=1 F<br>OE=1 F | Rc=1)<br>Rc=0) | | | 31 | RT | RA | RB | OE | 138 | Rc | | | 0 | 6 | 11 | 16 | 21 22 | | 31 | | The sum (RA) + (RB) + CA is placed into register RT. # **Special Registers Altered:** # Add to Minus One Extended XO-form | addme addme. | RT,RA<br>RT.RA | (OE=0 Rc=0)<br>(OE=0 Rc=1) | |--------------|----------------|----------------------------| | addmeo | RT,RA | (OE=1 Rc=0) | | addmeo. | RT,RA | (OE=1 Rc=1) | | 31 | RT | RA | /// | OE | 234 | Rc | |----|----|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$RT \leftarrow (RA) + CA - 1$$ The sum (RA) + CA + $^{64}$ 1 is placed into register RT. #### **Special Registers Altered:** # Subtract From Extended XO-form | subfe | RT,RA,RB | (OE=0 Rc=0) | |---------|----------|-------------| | subfe. | RT,RA,RB | (OE=0 Rc=1) | | subfeo | RT,RA,RB | (OE=1 Rc=0) | | subfeo. | RT,RA,RB | (OE=1 Rc=1) | | | | | | 31 | RT | RA | RB | OE | 136 | Rc | |----|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$RT \leftarrow \neg (RA) + (RB) + CA$$ The sum $\neg$ (RA) + (RB) + CA is placed into register RT. # **Special Registers Altered:** # Subtract From Minus One Extended XO-form | subfme | RT,RA | (OE=0 Rc=0) | |----------|-------|-------------| | subfme. | RT,RA | (OE=0 Rc=1) | | subfmeo | RT,RA | (OE=1 Rc=0) | | subfmeo. | RT,RA | (OE=1 Rc=1) | | 31 | RT | RA | /// | OE | 232 | Rc | |----|----|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | The sum $\neg$ (RA) + CA + $^{64}$ 1 is placed into register RT. ## **Special Registers Altered:** #### Add to Zero Extended XO-form Subtract From Zero Extended XO-form addze RT,RA (OE=0 Rc=0) subfze RT,RA (OE=0 Rc=0) subfze. addze. RT,RA (OE=0 Rc=1) RT,RA (OE=0 Rc=1) addzeo RT,RA (OE=1 Rc=0) subfzeo RT,RA (OE=1 Rc=0) addzeo. RT,RA (OE=1 Rc=1) subfzeo. RT,RA (OE=1 Rc=1) 202 Rc RT RA /// OE Rc 31 RT RA /// OE 200 21 16 21 22 31 11 16 22 31 The sum (RA) + CA is placed into register RT. ### Special Registers Altered: #### RT ← ¬(RA) + CA The sum $\neg(RA)$ + CA is placed into register RT. ## **Special Registers Altered:** #### Programming Note - The setting of CA by the *Add* and *Subtract From* instructions, including the Extended versions thereof, is mode-dependent. If a sequence of these instructions is used to perform extended-precision addition or subtraction, the same mode should be used throughout the sequence. #### XO-form Negate RT,RA (OE=0 Rc=0) neg (OE=0 Rc=1) neg. RT,RA (OE=1 Rc=0) nego RT,RA nego. RT,RA (OE=1 Rc=1) 31 RT RA /// OE 104 Rc 11 16 21 22 31 $$RT \leftarrow \neg (RA) + 1$$ The sum $\neg(RA) + 1$ is placed into register RT. If the processor is in 64-bit mode and register RA contains the most negative 64-bit number $(0x8000\_0000\_0000\_0000)$ , the result is the most negative number and, if OE=1, OV is set to 1. Similarly, if the processor is in 32-bit mode and $(RA)_{32:63}$ contain the most negative 32-bit number $(0x8000\_0000)$ , the low-order 32 bits of the result contain the most negative 32-bit number and, if OE=1, OV is set to 1. ## **Special Registers Altered:** # Multiply Low Immediate #### **D-form** mulli RT,RA,SI | 7 | RT | RA | SI | |---|----|----|------| | 0 | 6 | 11 | 16 3 | $$\begin{array}{l} \operatorname{prod}_{0:127} \leftarrow (\operatorname{RA}) \times \operatorname{EXTS}(\operatorname{SI}) \\ \operatorname{RT} \leftarrow \operatorname{prod}_{64:127} \end{array}$$ The 64-bit first operand is (RA). The 64-bit second operand is the sign-extended value of the SI field. The low-order 64 bits of the 128-bit product of the operands are placed into register RT. Both operands and the product are interpreted as signed integers. ## **Special Registers Altered:** None ## Multiply Low Word | XO | -fo | rm | |----|-----|----| | | | | | mullw<br>mullw. | RT,RA,RB<br>RT.RA.RB | (OE=0 Rc=0)<br>(OE=0 Rc=1) | |-----------------|----------------------|----------------------------| | mullwo | RT,RA,RB | (OE=0 Rc=1) | | mullwo. | RT,RA,RB | (OE=1 Rc=1) | | 31 | RT | RA | RB | OE | 235 | Rc | |----|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$\mathtt{RT} \leftarrow \mathtt{(RA)_{32:63}} \times \mathtt{(RB)_{32:63}}$$ The 32-bit operands are the low-order 32 bits of RA and of RB. The 64-bit product of the operands is placed into register RT. If OE=1 then OV is set to 1 if the product cannot be represented in 32 bits. Both operands and the product are interpreted as signed integers. #### **Special Registers Altered:** | CR0 | (if Rc=1) | |-------|-----------| | SO OV | (if OE=1) | #### Programming Note For *mulli* and *mullw*, the low-order 32 bits of the product are the correct 32-bit product for 32-bit mode. For *mulli* and *mulld*, the low-order 64 bits of the product are independent of whether the operands are regarded as signed or unsigned 64-bit integers. For *mulli* and *mullw*, the low-order 32 bits of the product are independent of whether the operands are regarded as signed or unsigned 32-bit integers. # Multiply High Word # XO-form $\begin{array}{lll} \text{mulhw} & \text{RT,RA,RB} & \text{(Rc=0)} \\ \text{mulhw.} & \text{RT,RA,RB} & \text{(Rc=1)} \end{array}$ | 31 | RT | RA | RB | / | 75 | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$\begin{array}{l} \operatorname{prod}_{0:63} \leftarrow (\operatorname{RA})_{32:63} \times (\operatorname{RB})_{32:63} \\ \operatorname{RT}_{32:63} \leftarrow \operatorname{prod}_{0:31} \\ \operatorname{RT}_{0:31} \leftarrow \operatorname{undefined} \end{array}$$ The 32-bit operands are the low-order 32 bits of RA and of RB. The high-order 32 bits of the 64-bit product of the operands are placed into $RT_{32:63}$ . The contents of $RT_{0:31}$ are undefined. Both operands and the product are interpreted as signed integers. #### **Special Registers Altered:** CR0 (bits 0:2 undefined in 64-bit mode) (if Rc=1) # Multiply High Word Unsigned XO-form | mulhwu | RT,RA,RB | (Rc=0) | |---------|----------|--------| | mulhwu. | RT,RA,RB | (Rc=1) | | 31 | RT | RA | RB | / | 11 | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$\begin{array}{l} \operatorname{prod}_{0:63} \leftarrow \left( \operatorname{RA} \right)_{32:63} \times \left( \operatorname{RB} \right)_{32:63} \\ \operatorname{RT}_{32:63} \leftarrow \operatorname{prod}_{0:31} \\ \operatorname{RT}_{0:31} \leftarrow \operatorname{undefined} \end{array}$$ The 32-bit operands are the low-order 32 bits of RA and of RB. The high-order 32 bits of the 64-bit product of the operands are placed into $RT_{32:63}$ . The contents of $RT_{0:31}$ are undefined. Both operands and the product are interpreted as unsigned integers, except that if Rc=1 the first three bits of CR Field 0 are set by signed comparison of the result to zero. # **Special Registers Altered:** CR0 (bits 0:2undefined in 64-bit mode) (if Rc=1) # Divide Word XO-form divw RT,RA,RB (OE=0 Rc=0) divw RT,RA,RB (OE=0 Rc=1) divwo RT,RA,RB (OE=1 Rc=0) divwo RT,RA,RB (OE=1 Rc=1) | | 31 | RT | RA | RB | OE | 491 | Rc | |---|----|----|----|----|----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 22 | 31 | ``` \begin{array}{l} \operatorname{dividend_{0:63}} \leftarrow \operatorname{EXTS}\left(\left(\operatorname{RA}\right)_{32:63}\right) \\ \operatorname{divisor_{0:63}} \leftarrow \operatorname{EXTS}\left(\left(\operatorname{RB}\right)_{32:63}\right) \\ \operatorname{RT}_{32:63} \leftarrow \operatorname{dividend} \div \operatorname{divisor} \\ \operatorname{RT}_{0:31} \leftarrow \operatorname{undefined} \end{array} ``` The 64-bit dividend is the sign-extended value of $(RA)_{32:63}$ . The 64-bit divisor is the sign-extended value of $(RB)_{32:63}$ . The 64-bit quotient is formed. The low-order 32 bits of the 64-bit quotient are placed into $RT_{32:63}$ . The contents of $RT_{0:31}$ are undefined. The remainder is not supplied as a result. Both operands and the quotient are interpreted as signed integers. The quotient is the unique signed integer that satisfies ``` dividend = (quotient \times divisor) + r ``` where $0 \le r < |divisor|$ if the dividend is nonnegative, and $-|divisor| < r \le 0$ if the dividend is negative. If an attempt is made to perform any of the divisions ``` 0x8000\_0000 \div -1 <anything> ÷ 0 ``` then the contents of register RT are undefined as are (if Rc=1) the contents of the LT, GT, and EQ bits of CR Field 0. In these cases, if OE=1 then OV is set to 1. #### **Special Registers Altered:** ``` CR0 (bits 0:2 undefined in 64-bit mode) (if Rc=1) SO OV (if OE=1) ``` #### Programming Note The 32-bit signed remainder of dividing (RA) $_{32:63}$ by (RB) $_{32:63}$ can be computed as follows, except in the case that (RA) $_{32:63}$ = -2<sup>31</sup> and (RB) $_{32:63}$ = -1. ``` divw RT,RA,RB # RT = quotient mullw RT,RT,RB # RT = quotient×divisor subf RT,RT,RA # RT = remainder ``` # Divide Word Unsigned XO-form | divwu | RT,RA,RB | (OE=0 Rc=0) | |---------|----------|-------------| | divwu. | RT,RA,RB | (OE=0 Rc=1) | | divwuo | RT,RA,RB | (OE=1 Rc=0) | | divwuo. | RT,RA,RB | (OE=1 Rc=1) | | 31 | RT | RA | RB | OE | 459 | Rc | |----|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` \begin{array}{l} \mbox{dividend}_{0:63} \leftarrow \mbox{$^{32}$0} \mid \mid \mbox{(RA)}_{32:63} \\ \mbox{divisor}_{0:63} \leftarrow \mbox{$^{32}$0} \mid \mid \mbox{(RB)}_{32:63} \\ \mbox{RT}_{32:63} \leftarrow \mbox{dividend} \div \mbox{divisor} \\ \mbox{RT}_{0:31} \leftarrow \mbox{undefined} \end{array} ``` The 64-bit dividend is the zero-extended value of (RA) $_{32:63}$ . The 64-bit divisor is the zero-extended value of (RB) $_{32:63}$ . The 64-bit quotient is formed. The low-order 32 bits of the 64-bit quotient are placed into RT $_{32:63}$ . The contents of RT $_{0:31}$ are undefined. The remainder is not supplied as a result. Both operands and the quotient are interpreted as unsigned integers, except that if Rc=1 the first three bits of CR Field 0 are set by signed comparison of the result to zero. The quotient is the unique unsigned integer that satisfies ``` dividend = (quotient \times divisor) + r ``` where $0 \le r < \text{divisor}$ . If an attempt is made to perform the division ``` <anything> ÷ 0 ``` then the contents of register RT are undefined as are (if Rc=1) the contents of the LT, GT, and EQ bits of CR Field 0. In this case, if OE=1 then OV is set to 1. #### Special Registers Altered: ``` CR0 (bits 0:2 undefined in 64-bit mode) (if Rc=1) SO OV (if OE=1) ``` #### Programming Note - The 32-bit unsigned remainder of dividing (RA) $_{32:63}$ by (RB) $_{32:63}$ can be computed as follows. ``` divwu RT,RA,RB # RT = quotient mullw RT,RT,RB # RT = quotient×divisor subf RT,RT,RA # RT = remainder ``` # 3.3.8.1 64-bit Fixed-Point Arithmetic Instructions [Category: 64-Bit] # Multiply Low Doubleword #### XO-form # mulld RT,RA,RB (OE=0 Rc=0) mulld. RT,RA,RB (OE=0 Rc=1) mulldo RT,RA,RB (OE=1 Rc=0) mulldo. RT,RA,RB (OE=1 Rc=1) | 31 | RT | RA | RB | OE | 233 | Rc | |----|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$\begin{array}{l} \operatorname{prod}_{0:127} \leftarrow (\operatorname{RA}) \ \times \ (\operatorname{RB}) \\ \operatorname{RT} \ \leftarrow \ \operatorname{prod}_{64:127} \end{array}$$ The 64-bit operands are (RA) and (RB). The low-order 64 bits of the 128-bit product of the operands are placed into register RT. If OE=1 then OV is set to 1 if the product cannot be represented in 64 bits. Both operands and the product are interpreted as signed integers. # **Special Registers Altered:** ### Programming Note The XO-form *Multiply* instructions may execute faster on some implementations if RB contains the operand having the smaller absolute value. # Multiply High Doubleword Unsigned XO-form | mulhdu | RT,RA,RB | (Rc=0) | |---------|----------|--------| | mulhdu. | RT,RA,RB | (Rc=1) | | 31 | RT | RA | RB | / | 9 | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$\begin{array}{l} \operatorname{prod}_{0:127} \leftarrow (\operatorname{RA}) \ \times \ (\operatorname{RB}) \\ \operatorname{RT} \ \leftarrow \ \operatorname{prod}_{0:63} \end{array}$$ The 64-bit operands are (RA) and (RB). The high-order 64 bits of the 128-bit product of the operands are placed into register RT. Both operands and the product are interpreted as unsigned integers, except that if Rc=1 the first three bits of CR Field 0 are set by signed comparison of the result to zero. #### **Special Registers Altered:** # Multiply High Doubleword XO-form | mulhd | RT,RA,RB | (Rc=0) | |--------|----------|--------| | mulhd. | RT,RA,RB | (Rc=1) | | 31 | RT | RA | RB | / | 73 | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | $$prod_{0:127} \leftarrow (RA) \times (RB)$$ $RT \leftarrow prod_{0:63}$ The 64-bit operands are (RA) and (RB). The high-order 64 bits of the 128-bit product of the operands are placed into register RT. Both operands and the product are interpreted as signed integers. # **Special Registers Altered:** # **Divide Doubleword** #### XO-form # Divide Doubleword Unsigned XO-form | divd<br>divd.<br>divdo<br>divdo. | RT,R.<br>RT,R.<br>RT,R.<br>RT,R. | A,RB<br>A,RB | | | (OE=0 F<br>(OE=0 F<br>(OE=1 F<br>(OE=1 F | Rc=1)<br>Rc=0) | |----------------------------------|----------------------------------|--------------|----|----|------------------------------------------|----------------| | 31 | RT | RA | RR | OF | 489 | Rc | | 31 | RT | RA | RB | OE | 489 | Rc | |----|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` \begin{array}{l} \operatorname{dividend}_{0:63} \leftarrow (\operatorname{RA}) \\ \operatorname{divisor}_{0:63} \leftarrow (\operatorname{RB}) \\ \operatorname{RT} \leftarrow \operatorname{dividend} \div \operatorname{divisor} \end{array} ``` The 64-bit dividend is (RA). The 64-bit divisor is (RB). The 64-bit quotient of the dividend and divisor is placed into register RT. The remainder is not supplied as a result. Both operands and the quotient are interpreted as signed integers. The quotient is the unique signed integer that satisfies ``` dividend = (quotient \times divisor) + r ``` where $0 \le r < |divisor|$ if the dividend is nonnegative, and $-|divisor| < r \le 0$ if the dividend is negative. If an attempt is made to perform any of the divisions ``` 0x8000\_0000\_0000\_0000 \div -1 <anything> \div 0 ``` then the contents of register RT are undefined as are (if Rc=1) the contents of the LT, GT, and EQ bits of CR Field 0. In these cases, if OE=1 then OV is set to 1. #### **Special Registers Altered:** | CR0 | (if Rc=1) | |-------|-----------| | SO OV | (if OE=1) | #### **Programming Note** The 64-bit signed remainder of dividing (RA) by (RB) can be computed as follows, except in the case that (RA) = $-2^{63}$ and (RB) = -1. ``` divd RT,RA,RB # RT = quotient mulld RT,RT,RB # RT = quotient×divisor subf RT,RT,RA # RT = remainder ``` | divdu | RT,RA,RB | (OE=0 Rc=0) | |---------|----------|-------------| | divdu. | RT.RA.RB | (OE=0 Rc=1) | | divduo | RT,RA,RB | (OE=1 Rc=0) | | divduo. | RT,RA,RB | (OE=1 Rc=1) | | 31 | RT | RA | RB | OE | 457 | Rc | |----|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` \begin{array}{l} \operatorname{dividend}_{0:63} \leftarrow (\operatorname{RA}) \\ \operatorname{divisor}_{0:63} \leftarrow (\operatorname{RB}) \\ \operatorname{RT} \leftarrow \operatorname{dividend} \div \operatorname{divisor} \end{array} ``` The 64-bit dividend is (RA). The 64-bit divisor is (RB). The 64-bit quotient of the dividend and divisor is placed into register RT. The remainder is not supplied as a result. Both operands and the quotient are interpreted as unsigned integers, except that if Rc=1 the first three bits of CR Field 0 are set by signed comparison of the result to zero. The quotient is the unique unsigned integer that satisfies $$dividend = (quotient \times divisor) + r$$ where $0 \le r < divisor$ . If an attempt is made to perform the division ``` <anything> ÷ 0 ``` then the contents of register RT are undefined as are (if Rc=1) the contents of the LT, GT, and EQ bits of CR Field 0. In this case, if OE=1 then OV is set to 1. # **Special Registers Altered:** ``` CR0 (if Rc=1) SO OV (if OE=1) ``` #### Programming Note The 64-bit unsigned remainder of dividing (RA) by (RB) can be computed as follows. ``` divdu RT,RA,RB # RT = quotient mulld RT,RT,RB # RT = quotient×divisor subf RT,RT,RA # RT = remainder ``` # 3.3.9 Fixed-Point Compare Instructions The fixed-point Compare instructions compare the contents of register RA with (1) the sign-extended value of the SI field, (2) the zero-extended value of the UI field, or (3) the contents of register RB. The comparison is signed for *cmpi* and *cmp*, and unsigned for *cmpli* and cmpl. The L field controls whether the operands are treated as 64-bit or 32-bit quantities, as follows: L **Operand length** 0 32-bit operands 64-bit operands 1 L=1 is part of Category: 64-Bit. When the operands are treated as 32-bit signed quantities, bit 32 of the register (RA or RB) is the sign bit. The Compare instructions set one bit in the leftmost three bits of the designated CR field to 1, and the other two to 0. XER<sub>SO</sub> is copied to bit 3 of the designated CR field. The CR field is set as follows ### **Bit Name Description** (RA) < SI or (RB) (signed comparison) LT (RA) < UI or (RB) (unsigned comparison) (RA) > SI or (RB) (signed comparison) GT (RA) > UI or (RB) (unsigned comparison) FΩ (RA) = SI. UI. or (RB)2 SO Summary Overflow from the XER # **Extended mnemonics for compares** A set of extended mnemonics is provided so that compares can be coded with the operand length as part of the mnemonic rather than as a numeric operand. Some of these are shown as examples with the Compare instructions. See Appendix D for additional extended mnemonics. # Compare Immediate #### D-form cmp BF,L,RA,SI cmpi | 11 | BF | / | L | RA | SI | | |----|----|---|----|----|----|----| | 0 | 6 | 9 | 10 | 11 | 16 | 31 | ``` if L = 0 then a \leftarrow EXTS((RA)<sub>32:63</sub>) else a \leftarrow (RA) a < EXTS(SI) then c \leftarrow 0b100 else if a > EXTS(SI) then c \leftarrow 0b010 c ← 0b001 CR_{4\times BF+32:4\times BF+35} \leftarrow c \mid\mid XER_{SO} ``` The contents of register RA ((RA)<sub>32:63</sub> sign-extended to 64 bits if L=0) are compared with the sign-extended value of the SI field, treating the operands as signed integers. The result of the comparison is placed into CR field BF. #### **Special Registers Altered:** CR field BF #### **Extended Mnemonics:** Examples of extended mnemonics for Compare Immediate: | Extend | ed: | Equivalent to: | | | | |--------|--------------|----------------|--------------|--|--| | cmpdi | Rx,value | cmpi | 0,1,Rx,value | | | | cmpwi | cr3,Rx,value | cmpi | 3,0,Rx,value | | | # Compare X-form | 31 | BF | / | L | RA | RB | 0 | / | |----|----|---|----|----|----|----|----| | 0 | 6 | ۵ | 10 | 11 | 16 | 21 | 21 | ``` if L = 0 then a \leftarrow EXTS((RA)<sub>32:63</sub>) b \leftarrow EXTS((RB)_{32:63}) else a \leftarrow (RA) b \leftarrow (RB) if a < b \text{ then } c \leftarrow 0b100 else if a > b then c \leftarrow 0b010 c ← 0b001 CR_{4\times BF+32:4\times BF+35} \leftarrow c \mid\mid XER_{SO} ``` BF,L,RA,RB The contents of register RA ((RA)<sub>32:63</sub> if L=0) are compared with the contents of register RB ((RB)<sub>32:63</sub> if L=0), treating the operands as signed integers. The result of the comparison is placed into CR field BF. #### **Special Registers Altered:** CR field BF #### **Extended Mnemonics:** Examples of extended mnemonics for Compare: | Extend | ded: | Equivalent to: | | | | | |--------|-----------|----------------|-----------|--|--|--| | cmpd | Rx,Ry | cmp | 0,1,Rx,Ry | | | | | cmpw | cr3,Rx,Ry | cmp | 3,0,Rx,Ry | | | | # Compare Logical Immediate D-form # Compare Logical X-form cmpli BF,L,RA,UI | 10 | BF | / | L | RA | | UI | | |----|----|---|----|----|----|----|----| | 0 | 6 | 9 | 10 | 11 | 16 | | 31 | The contents of register RA ((RA) $_{32:63}$ zero-extended to 64 bits if L=0) are compared with $^{48}0 \parallel$ UI, treating the operands as unsigned integers. The result of the comparison is placed into CR field BF. # **Special Registers Altered:** CR field BF #### **Extended Mnemonics:** Examples of extended mnemonics for Compare Logical Immediate: | Extend | ed: | Equiva | Equivalent to: | | | | |--------|--------------|--------|----------------|--|--|--| | cmpldi | Rx,value | cmpli | 0,1,Rx,value | | | | | cmplwi | cr3,Rx,value | cmpli | 3,0,Rx,value | | | | # cmpl BF,L,RA,RB ``` if L = 0 then a \leftarrow 320 | (RA) 32:63 b \leftarrow 320 | (RB) 32:63 else a \leftarrow (RA) b \leftarrow (RB) if a < b then c \leftarrow 0b100 else if a > b then c \leftarrow 0b010 else c \leftarrow 0b001 \cot CR_{4\times BF+32:4\times BF+35} \leftarrow c | XER<sub>SO</sub> ``` The contents of register RA ((RA) $_{32:63}$ if L=0) are compared with the contents of register RB ((RB) $_{32:63}$ if L=0), treating the operands as unsigned integers. The result of the comparison is placed into CR field BF. # **Special Registers Altered:** CR field BF #### **Extended Mnemonics:** Examples of extended mnemonics for Compare Logical: | Extend | ed: | Equivalent to: | | | | |--------|-----------|----------------|-----------|--|--| | cmpld | Rx,Ry | cmpl | 0,1,Rx,Ry | | | | cmplw | cr3,Rx,Ry | cmpl | 3,0,Rx,Ry | | | # 3.3.10 Fixed-Point Trap Instructions The Trap instructions are provided to test for a specified set of conditions. If any of the conditions tested by a Trap instruction are met, the system trap handler is invoked. If none of the tested conditions are met, instruction execution continues normally. The contents of register RA are compared with either the sign-extended value of the SI field or the contents of register RB, depending on the Trap instruction. For tdi and td, the entire contents of RA (and RB) participate in the comparison; for twi and tw, only the contents of the low-order 32 bits of RA (and RB) participate in the comparison. This comparison results in five conditions which are ANDed with TO. If the result is not 0 the system trap handler is invoked. These conditions are as follows. #### TO Bit ANDed with Condition - 0 Less Than, using signed comparison - 1 Greater Than, using signed comparison - 2 - 3 Less Than, using unsigned comparison - Greater Than, using unsigned comparison # **Extended mnemonics for traps** A set of extended mnemonics is provided so that traps can be coded with the condition as part of the mnemonic rather than as a numeric operand. Some of these are shown as examples with the Trap instructions. See Appendix D for additional extended mnemonics. # Trap Word Immediate #### D-form tw ### X-form TO, RA, SI twi | 3 | TO | RA | SI | | |---|----|----|----|----| | 0 | 6 | 11 | 16 | 31 | ``` a \leftarrow EXTS((RA)_{32:63}) if (a < EXTS(SI)) & TO_0 then TRAP if (a > EXTS(SI)) & TO_1 then TRAP if (a = EXTS(SI)) & TO<sub>2</sub> then TRAP if (a <^u EXTS(SI)) & TO_3 then TRAP if (a > u EXTS(SI)) & TO_4 then TRAP ``` The contents of RA<sub>32:63</sub> are compared with the sign-extended value of the SI field. If any bit in the TO field is set to 1 and its corresponding condition is met by the result of the comparison, the system trap handler is invoked. If the trap conditions are met, this instruction is context synchronizing (see Book III). #### **Special Registers Altered:** None #### **Extended Mnemonics:** Examples of extended mnemonics for Trap Word Immediate: | Extended: | | Equiv | /alent to: | |-----------|----------|-------|------------| | twgti | Rx,value | twi | 8,Rx,value | | twllei | Rx,value | twi | 6,Rx,value | # Trap Word | | 31 | TO | RA | RB | 4 | / | |---|----|----|----|----|----|----| | _ | | 6 | 11 | 16 | 21 | 21 | ``` a \leftarrow EXTS((RA)_{32:63}) b \leftarrow EXTS((RB)_{32:63}) if (a < b) \& TO_0 then TRAP if (a > b) \& TO_1 then TRAP if (a = b) \& TO_2 then TRAP if (a <^{\rm u} b) & {\rm TO_3} then TRAP if (a >^{\rm u} b) & {\rm TO_4} then TRAP ``` TO,RA,RB The contents of RA<sub>32:63</sub> are compared with the contents of RB<sub>32:63</sub>. If any bit in the TO field is set to 1 and its corresponding condition is met by the result of the comparison, the system trap handler is invoked. If the trap conditions are met, this instruction is context synchronizing (see Book III). #### **Special Registers Altered:** None #### **Extended Mnemonics:** Examples of extended mnemonics for *Trap Word*: | Extended: | | Equivalent to: | | | |-----------|-------|----------------|---------|--| | tweq | Rx,Ry | tw | 4,Rx,Ry | | | twlge | Rx,Ry | tw | 5,Rx,Ry | | | trap | | tw | 31,0,0 | | # 3.3.10.1 64-bit Fixed-Point Trap Instructions [Category: 64-Bit] ## Trap Doubleword Immediate D-form tdi TO,RA,SI | 2 | TO | RA | SI | |---|----|----|-------| | 0 | 6 | 11 | 16 31 | ``` \begin{array}{l} \mathbf{a} \leftarrow (\mathtt{RA}) \\ \mathbf{b} \leftarrow \mathtt{EXTS}(\mathtt{SI}) \\ \text{if } (\mathbf{a} < \mathbf{b}) & \mathtt{TO_0} \text{ then TRAP} \\ \text{if } (\mathbf{a} > \mathbf{b}) & \mathtt{TO_1} \text{ then TRAP} \\ \text{if } (\mathbf{a} = \mathbf{b}) & \mathtt{TO_2} \text{ then TRAP} \\ \text{if } (\mathbf{a} <^{\mathtt{u}} \ \mathbf{b}) & \mathtt{TO_3} \text{ then TRAP} \\ \text{if } (\mathbf{a} >^{\mathtt{u}} \ \mathbf{b}) & \mathtt{TO_4} & \text{then TRAP} \\ \end{array} ``` The contents of register RA are compared with the sign-extended value of the SI field. If any bit in the TO field is set to 1 and its corresponding condition is met by the result of the comparison, the system trap handler is invoked. If the trap conditions are met, this instruction is context synchronizing (see Book III). #### **Special Registers Altered:** None #### **Extended Mnemonics:** Examples of extended mnemonics for *Trap Double-word Immediate*: | Extended: | | Equi | Equivalent to: | | |-----------|----------|------|----------------|--| | tdlti | Rx,value | tdi | 16,Rx,value | | | tdnei | Rx,value | tdi | 24,Rx,value | | # Trap Doubleword X-form td TO,RA,RB | 31 | TO | RA | RB | 68 | / | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` a \leftarrow (RA) b \leftarrow (RB) if (a < b) \& TO_0 then TRAP if (a > b) \& TO_1 then TRAP if (a = b) \& TO_2 then TRAP if (a <^u b) \& TO_3 then TRAP if (a >^u b) \& TO_4 then TRAP ``` The contents of register RA are compared with the contents of register RB. If any bit in the TO field is set to 1 and its corresponding condition is met by the result of the comparison, the system trap handler is invoked. If the trap conditions are met, this instruction is context synchronizing (see Book III). # **Special Registers Altered:** None #### **Extended Mnemonics:** Examples of extended mnemonics for *Trap Double-word*: | Extended: | | Equi | valent to: | |-----------|-------|------|------------| | tdge | Rx,Ry | td | 12,Rx,Ry | | tdlnl | Rx,Ry | td | 5,Rx,Rv | # 3.3.11 Fixed-Point Select [Category: Phased-In (sV2.06)] # Integer Select A-form isel RT,RA,RB,BC | 31 | RT | RA | RB | BC | 15 | / | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | ``` if RA=0 then a \leftarrow0 else a \leftarrow (RA) if CR_{BC+32}=1 then RT \leftarrow a else RT \leftarrow (RB) ``` If the contents of bit BC+32 of the Condition Register are equal to 1, then the contents of register RA (or 0) are placed into register RT. Otherwise, the contents of register RB are placed into register RT. #### **Special Registers Altered:** # 3.3.12 Fixed-Point Logical Instructions The *Logical* instructions perform bit-parallel operations on 64-bit operands. The X-form Logical instructions with Rc=1, and the D-form *Logical* instructions *andi.* and *andis.*, set the first three bits of CR Field 0 as described in Section 3.3.7, "Other Fixed-Point Instructions" on page 61. The Logical instructions do not change the SO, OV, and CA bits in the XER. # Extended mnemonics for logical operations Extended mnemonics are provided that generate two different types of "no-ops" (instructions that do nothing). The first type is the preferred form, which is optimized to minimize its use of the processor's execution resources. This form is based on the *OR Immediate* instruction. The second type is the executed form, which is intended to consume the same amount of the processor's execution resources as if it were not a no-op. This form is based on the *XOR Immediate* instruction. (There are also no-ops which affect program priority, for which extended mnemonics have not been assigned.) Extended mnemonics are provided that use the *OR* and *NOR* instructions to copy the contents of one register to another, with and without complementing. These are shown as examples with the two instructions. See Appendix D, "Assembler Extended Mnemonics" on page 383 for additional extended mnemonics. #### AND Immediate D-form andi. RA,RS,UI $$RA \leftarrow (RS) \& (^{48}O \mid \mid UI)$$ The contents of register RS are ANDed with <sup>48</sup>0 || UI and the result is placed into register RA. #### **Special Registers Altered:** CR0 #### AND Immediate Shifted D-form andis. RA,RS,UI | 29 | RS | RA | UI | | |----|----|----|----|----| | _ | _ | | | | | 0 | 6 | 11 | 16 | 31 | $$RA \leftarrow (RS) \& (^{32}O \mid \mid UI \mid \mid ^{16}O)$$ The contents of register RS are ANDed with $^{32}0 \parallel$ UI $\parallel$ $^{16}0$ and the result is placed into register RA. # **Special Registers Altered:** CR0 #### OR Immediate D-form ori RA,RS,UI | Ī | 24 | RS | RA | UI | | |---|----|----|----|----|----| | ( | ) | 6 | 11 | 16 | 31 | $$RA \leftarrow (RS) \mid (^{48}0 \mid \mid UI)$$ The contents of register RS are ORed with <sup>48</sup>0 || UI and the result is placed into register RA. The preferred "no-op" (an instruction that does nothing) is: ## **Special Registers Altered:** None #### **Extended Mnemonics:** Example of extended mnemonics for OR Immediate: | Extended: | Equiv | alent to: | |-----------|-------|-----------| | no-op | ori | 0,0,0 | # **OR Immediate Shifted** D-form oris RA,RS,UI $$RA \leftarrow (RS) \mid (^{32}0 \mid \mid UI \mid \mid ^{16}0)$$ The contents of register RS are ORed with <sup>32</sup>0 || UI || <sup>16</sup>0 and the result is placed into register RA. # **Special Registers Altered:** None # XOR Immediate **D-form** **D-form** RA,RS,UI xori | 26 | RS | RA | UI | | |----|----|----|----|----| | 0 | 6 | 11 | 16 | 31 | $$RA \leftarrow (RS) XOR (^{48}O \mid \mid UI)$$ The contents of register RS are XORed with 480 || UI and the result is placed into register RA. The executed form of a "no-op" (an instruction that does nothing, but consumes execution resources nevertheless) is: ## **Special Registers Altered:** #### **Extended Mnemonics:** Example of extended mnemonics for XOR Immediate: | Extended: | Equivalent to: | | | |-----------|----------------|-------|--| | xnop | xori | 0,0,0 | | # XOR Immediate Shifted RA,RS,UI xoris | 27 | RS | RA | UI | |----|----|----|-------| | 0 | 6 | 11 | 16 31 | $$RA \leftarrow (RS) XOR (^{32}O || UI || ^{16}O)$$ The contents of register RS are XORed with <sup>32</sup>0 || UI || <sup>16</sup>0 and the result is placed into register RA. # **Special Registers Altered:** None # Programming Note — The executed form of no-op should be used only when the intent is to alter the timing of a program. | AND | | | | Х- | form | |-------------|--------------|----|----|----|------------------| | and<br>and. | RA,R<br>RA,R | , | | | (Rc=0)<br>(Rc=1) | | 31 | RS | RA | RB | 28 | Rc | 16 RA ← (RS) & (RB) The contents of register RS are ANDed with the contents of register RB and the result is placed into register RA ### **Special Registers Altered:** CR0 (if Rc=1) | xor | RA,RS,RB | (Rc=0) | |------|----------|--------| | xor. | RA,RS,RB | (Rc=1) | | 31 | RS | RA | RB | 316 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $RA \leftarrow (RS) \oplus (RB)$ The contents of register RS are XORed with the contents of register RB and the result is placed into register RA #### **Special Registers Altered:** CR0 (if Rc=1) #### NAND X-form | nand | RA,RS,RB | (Rc=0) | |-------|----------|--------| | nand. | RA,RS,RB | (Rc=1) | | 31 | RS | RA | RB | 476 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $RA \leftarrow \neg((RS) \& (RB))$ The contents of register RS are ANDed with the contents of register RB and the complemented result is placed into register RA. # **Special Registers Altered:** CR0 (if Rc=1) # **Programming Note** **nand** or **nor** with RS=RB can be used to obtain the one's complement. | OR | X-form | |----|----------| | UK | X-10/111 | | or | RA,RS,RB | (Rc=0) | |-----|----------|--------| | or. | RA,RS,RB | (Rc=1) | | 31 | RS | RA | RB | 444 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | RA ← (RS) | (RB) 31 The contents of register RS are ORed with the contents of register RB and the result is placed into register RA. For implementations that support the PPR (see Section 3.2.3), *or Rx*,*Rx*, *Rx* can be used to set PPR<sub>PRI</sub> as shown in Figure 45. *or*. *Rx*,*Rx*, *Rx* does not set PPR<sub>PRI</sub>. | Rx | PPR <sub>PRI</sub> | Priority | |----|--------------------|-----------------| | 1 | 010 | low | | 6 | 011 | medium low | | 2 | 100 | medium (normal) | Figure 45. Priority levels for or Rx,Rx,Rx # **Special Registers Altered:** CR0 (if Rc=1) #### **Extended Mnemonics:** Example of extended mnemonics for OR: Extended: Equivalent to: mr Rx,Ry or Rx,Ry,Ry #### Programming Note **Warning:** Other forms of *or Rx,Rx,Rx* that are not described in Figure 45 may also cause program priority to change. Use of these forms should be avoided except when software explicitly intends to alter program priority. If a no-op is needed, the preferred no-op (*ori* 0,0,0) should be used. | NOR | | | | | X-fo | rm | E | quiva | lent | | | | X | -form | |-------------|---------|----------------|----------|----|------|--------------|----------|-------|---------|----------------|----------|----|-----|------------------| | nor<br>nor. | | RS,RB<br>RS,RB | | | ` | c=0)<br>c=1) | eq<br>eq | | | RS,RB<br>RS,RB | | | | (Rc=0)<br>(Rc=1) | | 31<br>0 | RS<br>6 | RA<br>11 | RB<br>16 | 21 | 124 | Rc<br>31 | 0 | 31 | RS<br>6 | RA<br>11 | RB<br>16 | 21 | 284 | Rc<br>31 | X-form $$RA \leftarrow \neg((RS) \mid (RB))$$ The contents of register RS are ORed with the contents of register RB and the complemented result is placed into register RA. # **Special Registers Altered:** CR0 (if Rc=1) #### **Extended Mnemonics:** Example of extended mnemonics for NOR: | Extended: | | Equivalent to: | | | |-----------|-------|----------------|----------|--| | not | Rx,Ry | nor | Rx,Ry,Ry | | # $RA \leftarrow (RS) \equiv (RB)$ The contents of register RS are XORed with the contents of register RB and the complemented result is placed into register RA. #### **Special Registers Altered:** CR0 (if Rc=1) # AND with Complement RA,RS,RB (Rc=0)andc andc. RA,RS,RB (Rc=1) | 31 | RS | RA | RB | 60 | Rc | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RA \leftarrow (RS) \& \neg (RB)$$ The contents of register RS are ANDed with the complement of the contents of register RB and the result is placed into register RA. #### **Special Registers Altered:** (if Rc=1) CR0 #### OR with Complement X-form | orc | RA,RS,RB | (Rc=0) | |------|----------|--------| | orc. | RA,RS,RB | (Rc=1) | | 31 | RS | RA | RB | 412 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RA \leftarrow (RS) \mid \neg (RB)$$ The contents of register RS are ORed with the complement of the contents of register RB and the result is placed into register RA. # **Special Registers Altered:** (if Rc=1) CR0 (Rc=0) # Extend Sign Byte # X-form # Extend Sign Halfword X-form extsb RA,RS (Rc=0) extsb. RA,RS (Rc=1) | extsh. | RA,RS | S | | | (Rc=1) | |--------|-------|----|-----|-----|--------| | 24 | DC | DΛ | 111 | 000 | D- | | 31 | RS | RA | /// | 954 | Rc | |----|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \text{s} \leftarrow (\text{RS})_{56} \\ \text{RA}_{56:63} \leftarrow (\text{RS})_{56:63} \\ \text{RA}_{0:55} \leftarrow ^{56} \text{s} \end{array}$$ $$\begin{array}{l} \mathtt{s} \leftarrow \mathtt{(RS)}_{48} \\ \mathtt{RA}_{48:63} \leftarrow \mathtt{(RS)}_{48:63} \\ \mathtt{RA}_{0:47} \leftarrow \phantom{}^{48}\mathtt{s} \end{array}$$ extsh (RS) $_{56:63}$ are placed into RA $_{56:63}$ . RA $_{0:55}$ are filled with a copy of (RS) $_{56}$ . (RS) $_{48:63}$ are placed into RA $_{48:63}$ . RA $_{0:47}$ are filled with a copy of (RS) $_{48}$ . # **Special Registers Altered:** # **Special Registers Altered:** RA,RS # Count Leading Zeros Word #### X-form | cntlzw | RA,RS | (Rc=0) | |---------|-------|--------| | cntlzw. | RA,RS | (Rc=1) | | 31 | RS | RA | /// | 26 | Rc | |----|----|----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$n \leftarrow 32$$ do while $n < 64$ if $(RS)_n = 1$ then leave $n \leftarrow n + 1$ $RA \leftarrow n - 32$ A count of the number of consecutive zero bits starting at bit 32 of register RS is placed into register RA. This number ranges from 0 to 32, inclusive. If Rc=1, CR Field 0 is set to reflect the result. #### **Special Registers Altered:** Programming Note CR0 (if Rc=1) # · # For both *Count Leading Zeros* instructions, if Rc=1 then LT is set to 0 in CR Field 0. # Compare Bytes ### X-form cmpb RA,RS,RB do n = 0 to 7 $$\begin{array}{ll} \text{if } \mathrm{RS}_{8\times n:\, 8\times n+7} &=& (\mathrm{RB})_{\, 8\times n:\, 8\times n+7} \text{ then} \\ \mathrm{RA}_{8\times n:\, 8\times n+7} &\leftarrow& ^{8}1 \\ \text{else} & \mathrm{RA}_{8\times n:\, 8\times n+7} &\leftarrow& ^{8}0 \end{array}$$ Each byte of the contents of register RS is compared to each corresponding byte of the contents in register RB. If they are equal, the corresponding byte in RA is set to 0xFF. Otherwise the corresponding byte in RA is set to 0x00. # **Special Registers Altered:** # Parity Doubleword X-form prtyd RA,RS [Category: 64-bit] | 31 | RS | RA | /// | 186 | / | |----|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$s \leftarrow 0$$ do i = 0 to 7 $s \leftarrow s \oplus (RS)_{i \times 8+7}$ RA $\leftarrow ^{63}0 \mid \mid s$ The least significant bit in each byte of the contents of register RS is examined. If there is an odd number of one bits the value 1 is placed into register RA; otherwise the value 0 is placed into register RA. ### **Special Registers Altered:** None # **Parity Word** X-form prtyw RA,RS | 31 | RS | RA | /// | 154 | / | |----|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} s \leftarrow 0 \\ t \leftarrow 0 \\ \text{do i = 0 to 3} \\ s \leftarrow s \oplus (RS)_{i \times 8+7} \\ \text{do i = 4 to 7} \\ t \leftarrow t \oplus (RS)_{i \times 8+7} \\ \text{RA}_{0:31} \leftarrow {}^{31}0 \mid \mid s \\ \text{RA}_{32:63} \leftarrow {}^{31}0 \mid \mid t \end{array} ``` The least significant bit in each byte of $(RS)_{0:31}$ is examined. If there is an odd number of one bits the value 1 is placed into $RA_{0:31}$ ; otherwise the value 0 is placed into $RA_{0:31}$ . The least significant bit in each byte of $(RS)_{32:63}$ is examined. If there is an odd number of one bits the value 1 is placed into $RA_{32:63}$ ; otherwise the value 0 is placed into $RA_{32:63}$ . # **Special Registers Altered:** None #### Programming Note - The *Parity* instructions are designed to be used in conjunction with the *Population Count* instruction to compute the parity of words or a doubleword. The parity of the upper and lower words in (RS) can be computed as follows. ``` popentb RA, RS prtyw RA, RA ``` The parity of (RS) can be computed as follows. ``` popentb RA, RS prtyd RA, RA ``` # 3.3.12.1 64-bit Fixed-Point Logical Instructions [Category: 64-Bit] # **Extend Sign Word** # X-form | extsw | RA,RS | (Rc=0) | |--------|-------|--------| | extsw. | RA,RS | (Rc=1) | | 31 | RS | RA | /// | 986 | Rc | |----|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$s \leftarrow (RS)_{32}$$ $RA_{32:63} \leftarrow (RS)_{32:63}$ $RA_{0:31} \leftarrow ^{32}s$ (RS) $_{32:63}$ are placed into RA $_{32:63}$ . RA $_{0:31}$ are filled with a copy of (RS) $_{32}$ . # Special Registers Altered: CR0 (if Rc=1) # Count Leading Zeros Doubleword X-form | cntlzd | RA,RS | (Rc=0) | |---------|-------|--------| | cntlzd. | RA,RS | (Rc=1) | | 31 | RS | RA | /// | 58 | Rc | |----|----|----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` n \leftarrow 0 do while n < 64 if (RS)_n = 1 then leave n \leftarrow n + 1 RA \leftarrow n ``` A count of the number of consecutive zero bits starting at bit 0 of register RS is placed into register RA. This number ranges from 0 to 64, inclusive. If Rc=1, CR Field 0 is set to reflect the result. #### **Special Registers Altered:** CR0 (if Rc=1) # 3.3.12.2 Phased-In Fixed-Point Logical Instructions [Category: Phased-In (sV2.05)] # **Population Count Bytes** X-form popcntb RA, RS | 31 | RS | RA | /// | 122 | / | |----|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{lll} \text{do i} &=& 0 \text{ to } 7 \\ & n \leftarrow & 0 \\ & \text{do j} &=& 0 \text{ to } 7 \\ & \text{ if } (\text{RS})_{(i\times8)+j} &=& 1 \text{ then } \\ & n \leftarrow & n+1 \\ & \text{RA}_{(i\times8):(i\times8)+7} \leftarrow & n \end{array} ``` A count of the number of one bits in each byte of register RS is placed into the corresponding byte of register RA. This number ranges from 0 to 8, inclusive. # **Special Registers Altered:** None #### Programming Note - The total number of one bits in register RS can be computed as follows. In this example it is assumed that register RB contains the value 0x0101\_0101\_0101\_0101 ``` popentb RA,RS mulld RT,RA,RB srdi RT,RT,56 # RT = population count ``` # 3.3.13 Fixed-Point Rotate and Shift Instructions The Fixed-Point Processor performs rotation operations on data from a GPR and returns the result, or a portion of the result, to a GPR. The rotation operations rotate a 64-bit quantity left by a specified number of bit positions. Bits that exit from position 0 enter at position 63. Two types of rotation operation are supported. For the first type, denoted rotate<sub>64</sub> or ROTL<sub>64</sub>, the value rotated is the given 64-bit value. The rotate<sub>64</sub> operation is used to rotate a given 64-bit quantity. For the second type, denoted rotate<sub>32</sub> or ROTL<sub>32</sub>, the value rotated consists of two copies of bits 32:63 of the given 64-bit value, one copy in bits 0:31 and the other in bits 32:63. The rotate<sub>32</sub> operation is used to rotate a given 32-bit quantity. The *Rotate* and *Shift* instructions employ a mask generator. The mask is 64 bits long, and consists of 1-bits from a start bit, *mstart*, through and including a stop bit, *mstop*, and 0-bits elsewhere. The values of *mstart* and *mstop* range from 0 to 63. If *mstart* > *mstop*, the 1-bits wrap around from position 63 to position 0. Thus the mask is formed as follows: ``` if mstart ≤ mstop then mask<sub>mstart:mstop</sub> = ones mask<sub>all</sub> other bits = zeros else mask<sub>mstart:63</sub> = ones mask<sub>0:mstop</sub> = ones mask<sub>all</sub> other bits = zeros ``` There is no way to specify an all-zero mask. For instructions that use the rotate $_{32}$ operation, the mask start and stop positions are always in the low-order 32 bits of the mask. The use of the mask is described in following sections. The Rotate and Shift instructions with Rc=1 set the first three bits of CR field 0 as described in Section 3.3.7, "Other Fixed-Point Instructions" on page 61. Rotate and Shift instructions do not change the OV and SO bits. Rotate and Shift instructions, except algebraic right shifts, do not change the CA bit. # Extended mnemonics for rotates and shifts The *Rotate* and *Shift* instructions, while powerful, can be complicated to code (they have up to five operands). A set of extended mnemonics is provided that allow simpler coding of often-used functions such as clearing the leftmost or rightmost bits of a register, left justifying or right justifying an arbitrary field, and performing simple rotates and shifts. Some of these are shown as examples with the *Rotate* instructions. See Appendix D, "Assembler Extended Mnemonics" on page 383 for additional extended mnemonics. # 3.3.13.1 Fixed-Point Rotate Instructions These instructions rotate the contents of a register. The result of the rotation is - inserted into the target register under control of a mask (if a mask bit is 1 the associated bit of the rotated data is placed into the target register, and if the mask bit is 0 the associated bit in the target register remains unchanged); or - ANDed with a mask before being placed into the target register. The Rotate Left instructions allow right-rotation of the contents of a register to be performed (in concept) by a left-rotation of 64-n, where n is the number of bits by which to rotate right. They allow right-rotation of the contents of the low-order 32 bits of a register to be performed (in concept) by a left-rotation of 32-n, where n is the number of bits by which to rotate right. # Rotate Left Word Immediate then AND with Mask M-form | 21 | RS | RA | SH | MB | ME | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | | 21 | 26 | 31 | ``` n \leftarrow SH r \leftarrow ROTL_{32}((RS)_{32:63}, n) m \leftarrow MASK(MB+32, ME+32) RA \leftarrow r \& m ``` The contents of register RS are rotated<sub>32</sub> left SH bits. A mask is generated having 1-bits from bit MB+32 through bit ME+32 and 0-bits elsewhere. The rotated data are ANDed with the generated mask and the result is placed into register RA. #### **Special Registers Altered:** CR0 (if Rc=1) #### **Extended Mnemonics:** Examples of extended mnemonics for Rotate Left Word Immediate then AND with Mask: | Extended: | | Equiva | Equivalent to: | | | |-----------|-----------|--------|-----------------|--|--| | extlwi | Rx,Ry,n,b | rlwinm | Rx,Ry,b,0,n-1 | | | | srwi | Rx,Ry,n | rlwinm | Rx,Ry,32-n,n,31 | | | | clrrwi | Rx,Ry,n | rlwinm | Rx,Ry,0,0,31-n | | | #### **Programming Note** Let RSL represent the low-order 32 bits of register RS, with the bits numbered from 0 through 31. rlwinm can be used to extract an n-bit field that starts at bit position b in RSL, right-justified into the low-order 32 bits of register RA (clearing the remaining 32-n bits of the low-order 32 bits of RA), by setting SH=b+n, MB=32-n, and ME=31. It can be used to extract an n-bit field that starts at bit position b in RSL, left-justified into the low-order 32 bits of register RA (clearing the remaining 32-n bits of the low-order 32 bits of RA), by setting SH=b, MB = 0, and ME=n-1. It can be used to rotate the contents of the low-order 32 bits of a register left (right) by n bits, by setting SH=n (32-n), MB=0, and ME=31. It can be used to shift the contents of the low-order 32 bits of a register right by n bits, by setting SH=32-n, MB=n, and ME=31. It can be used to clear the high-order b bits of the low-order 32 bits of the contents of a register and then shift the result left by n bits, by setting SH=n, MB=b-n, and ME=31-n. It can be used to clear the low-order n bits of the low-order 32 bits of a register, by setting SH=0, MB=0, and ME=31-n. For all the uses given above, the high-order 32 bits of register RA are cleared. Extended mnemonics are provided for all of these uses; see Appendix D, "Assembler Extended Mnemonics" on page 383. ### Rotate Left Word then AND with Mask M-form | rlwnm | RA,RS,RB,MB,ME | (Rc=0) | |--------|----------------|--------| | rlwnm. | RA,RS,RB,MB,ME | (Rc=1) | | 23 | RS | RA | RB | MB | ME | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | ``` n \leftarrow (RB)_{59:63} r \leftarrow ROTL_{32}((RS)_{32:63}, n) m \leftarrow MASK(MB+32, ME+32) PA \leftarrow r \cdot R \cdot m ``` The contents of register RS are rotated<sub>32</sub> left the number of bits specified by (RB)<sub>59:63</sub>. A mask is generated having 1-bits from bit MB+32 through bit ME+32 and 0-bits elsewhere. The rotated data are ANDed with the generated mask and the result is placed into register RA. #### Special Registers Altered: CR0 (if Rc=1) #### **Extended Mnemonics:** Example of extended mnemonics for Rotate Left Word then AND with Mask: Extended: Equivalent to: rotlw Rx,Ry,Rz rlwnm Rx,Ry,Rz,0,31 #### Programming Note Let RSL represent the low-order 32 bits of register RS, with the bits numbered from 0 through 31. **rlwnm** can be used to extract an n-bit field that starts at variable bit position b in RSL, right-justified into the low-order 32 bits of register RA (clearing the remaining 32-n bits of the low-order 32 bits of RA), by setting RB<sub>59:63</sub>=b+n, MB=32-n, and ME=31. It can be used to extract an n-bit field that starts at variable bit position b in RSL, left-justified into the low-order 32 bits of register RA (clearing the remaining 32-n bits of the low-order 32 bits of RA), by setting RB<sub>59:63</sub>=b, MB = 0, and ME=n-1. It can be used to rotate the contents of the low-order 32 bits of a register left (right) by variable n bits, by setting RB<sub>59:63</sub>=n (32-n), MB=0, and ME=31. For all the uses given above, the high-order 32 bits of register RA are cleared. Extended mnemonics are provided for some of these uses; see Appendix D, "Assembler Extended Mnemonics" on page 383. #### Rotate Left Word Immediate then Mask Insert M-form rlwimi RA,RS,SH,MB,ME (Rc=0) rlwimi. RA,RS,SH,MB,ME (Rc=1) | 20 | RS | RA | SH | MB | ME | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | ``` n \leftarrow SH r \leftarrow ROTL_{32}((RS)_{32:63}, n) m \leftarrow MASK(MB+32, ME+32) RA \leftarrow r&m \mid (RA) \& \neg m ``` The contents of register RS are rotated<sub>32</sub> left SH bits. A mask is generated having 1-bits from bit MB+32 through bit ME+32 and 0-bits elsewhere. The rotated data are inserted into register RA under control of the generated mask. #### **Special Registers Altered:** CR0 (if Rc=1) #### **Extended Mnemonics:** Example of extended mnemonics for Rotate Left Word Immediate then Mask Insert: Extended: Equivalent to: inslwi Rx,Ry,n,b rlwimi Rx,Ry,32-b,b,b+n-1 #### **Programming Note** Let RAL represent the low-order 32 bits of register RA, with the bits numbered from 0 through 31. **rIwimi** can be used to insert an n-bit field that is left-justified in the low-order 32 bits of register RS, into RAL starting at bit position b, by setting SH=32-b, MB=b, and ME=(b+n)-1. It can be used to insert an n-bit field that is right-justified in the low-order 32 bits of register RS, into RAL starting at bit position b, by setting SH=32-(b+n), MB=b, and ME=(b+n)-1. Extended mnemonics are provided for both of these uses; see Appendix D, "Assembler Extended Mnemonics" on page 383. #### 3.3.13.1.1 64-bit Fixed-Point Rotate Instructions [Category: 64-Bit] ### Rotate Left Doubleword Immediate then Clear Left MD-form | rldicl | RA,RS,SH,MB | (Rc=0) | |---------|-------------|--------| | rldicl. | RA,RS,SH,MB | (Rc=1) | | | 30 | RS | RA | sh | mb | 0 | sh | Rc | |---|----|----|----|----|----|----|----|----| | ١ | 0 | 6 | 11 | 16 | 21 | 27 | 30 | 31 | $$\begin{array}{l} \text{n} \leftarrow \text{sh}_5 \mid\mid \text{sh}_{0:4} \\ \text{r} \leftarrow \text{ROTL}_{64}\left(\left(\text{RS}\right), \text{ n}\right) \\ \text{b} \leftarrow \text{mb}_5 \mid\mid \text{mb}_{0:4} \\ \text{m} \leftarrow \text{MASK}\left(\text{b}, \text{ 63}\right) \\ \text{RA} \leftarrow \text{r} \& \text{m} \end{array}$$ The contents of register RS are rotated<sub>64</sub> left SH bits. A mask is generated having 1-bits from bit MB through bit 63 and 0-bits elsewhere. The rotated data are ANDed with the generated mask and the result is placed into register RA. #### **Special Registers Altered:** CR0 (if Rc=1) #### **Extended Mnemonics:** Examples of extended mnemonics for Rotate Left Doubleword Immediate then Clear Left. | Extended: | | Equiva | Equivalent to: | | | |-----------|-----------|--------|----------------|--|--| | extrdi | Rx,Ry,n,b | rldicl | Rx,Ry,b+n,64-n | | | | srdi | Rx,Ry,n | rldicl | Rx,Ry,64-n,n | | | | clrldi | Rx.Rv.n | rldicl | Rx.Rv.0.n | | | #### Programming Note – **rIdicI** can be used to extract an n-bit field that starts at bit position b in register RS, right-justified into register RA (clearing the remaining 64-n bits of RA), by setting SH=b+n and MB=64-n. It can be used to rotate the contents of a register left (right) by n bits, by setting SH=n (64-n) and MB=0. It can be used to shift the contents of a register right by n bits, by setting SH=64-n and MB=n. It can be used to clear the high-order n bits of a register, by setting SH=0 and MB=n. Extended mnemonics are provided for all of these uses; see Appendix D, "Assembler Extended Mnemonics" on page 383. # Rotate Left Doubleword Immediate then Clear Right MD-form | 30 | RS | RA | sh | me | 1 | sh | Rc | |----|----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 27 | 30 | 31 | $$\begin{array}{lll} n \leftarrow sh_5 & || & sh_{0:4} \\ r \leftarrow & ROTL_{64}\left(\left(RS\right), & n\right) \\ e \leftarrow & me_5 & || & me_{0:4} \\ m \leftarrow & MASK\left(0, e\right) \\ RA \leftarrow & r & \& & m \end{array}$$ The contents of register RS are rotated<sub>64</sub> left SH bits. A mask is generated having 1-bits from bit 0 through bit ME and 0-bits elsewhere. The rotated data are ANDed with the generated mask and the result is placed into register RA. #### **Special Registers Altered:** CR0 (if Rc=1) #### **Extended Mnemonics:** Examples of extended mnemonics for Rotate Left Doubleword Immediate then Clear Right. | Extended: | | Equivalent to: | | | |-----------|-----------|----------------|--------------|--| | extldi | Rx,Ry,n,b | rldicr | Rx,Ry,b,n-1 | | | sldi | Rx,Ry,n | rldicr | Rx,Ry,n,63-n | | | clrrdi | Rx,Ry,n | rldicr | Rx,Ry,0,63-n | | #### Programming Note - **rIdicr** can be used to extract an n-bit field that starts at bit position b in register RS, left-justified into register RA (clearing the remaining 64-n bits of RA), by setting SH=b and ME=n-1. It can be used to rotate the contents of a register left (right) by n bits, by setting SH=n (64-n) and ME=63. It can be used to shift the contents of a register left by n bits, by setting SH=n and ME=63-n. It can be used to clear the low-order n bits of a register, by setting SH=0 and ME=63-n. Extended mnemonics are provided for all of these uses (some devolve to *rldicl*); see Appendix D, "Assembler Extended Mnemonics" on page 383. ### Rotate Left Doubleword Immediate then Clear MD-form #### | 30 | RS | RA | sh | mb | 2 | sh | Rc | |----|----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 27 | 30 | 31 | $$\begin{array}{lll} & n \leftarrow sh_5 \mid \mid sh_{0:4} \\ & r \leftarrow ROTL_{64}\left(\left(RS\right), \; n\right) \\ & b \leftarrow mb_5 \mid \mid mb_{0:4} \\ & m \leftarrow MASK(b, \; \neg n) \\ & RA \leftarrow r \; \& \; m \end{array}$$ The contents of register RS are rotated<sub>64</sub> left SH bits. A mask is generated having 1-bits from bit MB through bit 63-SH and 0-bits elsewhere. The rotated data are ANDed with the generated mask and the result is placed into register RA. #### **Special Registers Altered:** #### **Extended Mnemonics:** Example of extended mnemonics for Rotate Left Doubleword Immediate then Clear. | Extended: | | Equivalent to: | | | |-----------|-----------|----------------|-------------|--| | clrlsldi | Rx.Rv.b.n | rldic | Rx.Rv.n.b-n | | #### Programming Note *rldic* can be used to clear the high-order b bits of the contents of a register and then shift the result left by n bits, by setting SH=n and MB=b-n. It can be used to clear the high-order n bits of a register, by setting SH=0 and MB=n. Extended mnemonics are provided for both of these uses (the second devolves to *rldicl*); see Appendix D, "Assembler Extended Mnemonics" on page 383. #### Rotate Left Doubleword then Clear Left MDS-form | rldcl | RA,RS,RB,MB | (Rc=0) | |--------|-------------|--------| | rldcl. | RA,RS,RB,MB | (Rc=1) | | 30 | RS | RA | RB | mb | 8 | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 27 | 31 | ``` \begin{array}{l} \text{n} \leftarrow (\text{RB})_{58:63} \\ \text{r} \leftarrow \text{ROTL}_{64} ((\text{RS}), \text{ n}) \\ \text{b} \leftarrow \text{mb}_5 \mid \mid \text{mb}_{0:4} \\ \text{m} \leftarrow \text{MASK}(\text{b}, 63) \\ \text{RA} \leftarrow \text{r} \& \text{m} \end{array} ``` The contents of register RS are rotated<sub>64</sub> left the number of bits specified by (RB)<sub>58:63</sub>. A mask is generated having 1-bits from bit MB through bit 63 and 0-bits elsewhere. The rotated data are ANDed with the generated mask and the result is placed into register RA. #### **Special Registers Altered:** #### **Extended Mnemonics:** Example of extended mnemonics for Rotate Left Doubleword then Clear Left. | Extended: | | Equiv | alent to: | |-----------|----------|-------|------------| | rotld | Rx,Ry,Rz | rldcl | Rx.Rv.Rz.0 | #### Programming Note **rIdcI** can be used to extract an n-bit field that starts at variable bit position b in register RS, right-justified into register RA (clearing the remaining 64-n bits of RA), by setting RB<sub>58:63</sub>=b+n and MB=64-n. It can be used to rotate the contents of a register left (right) by variable n bits, by setting RB<sub>58:63</sub>=n (64-n) and MB=0. Extended mnemonics are provided for some of these uses; see Appendix D, "Assembler Extended Mnemonics" on page 383. ### Rotate Left Doubleword then Clear Right MDS-form | rldcr | RA,RS,RB,ME | (Rc=0) | |--------|-------------|--------| | rldcr. | RA,RS,RB,ME | (Rc=1) | | 30 | RS | RA | RB | me | 9 | Rc | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 27 | 31 | ``` \begin{array}{l} \text{n} \leftarrow (\text{RB})_{58:63} \\ \text{r} \leftarrow \text{ROTL}_{64}((\text{RS}), \text{ n}) \\ \text{e} \leftarrow \text{me}_5 \mid \mid \text{me}_{0:4} \\ \text{m} \leftarrow \text{MASK}(0, \text{ e}) \\ \text{RA} \leftarrow \text{r} \text{ \& m} \end{array} ``` The contents of register RS are rotated<sub>64</sub> left the number of bits specified by (RB)<sub>58:63</sub>. A mask is generated having 1-bits from bit 0 through bit ME and 0-bits elsewhere. The rotated data are ANDed with the generated mask and the result is placed into register RA. #### **Special Registers Altered:** #### **Programming Note** **rldcr** can be used to extract an n-bit field that starts at variable bit position b in register RS, left-justified into register RA (clearing the remaining 64-n bits of RA), by setting RB<sub>58:63</sub>=b and ME=n-1. It can be used to rotate the contents of a register left (right) by variable n bits, by setting RB<sub>58:63</sub>=n (64-n) and ME=63. Extended mnemonics are provided for some of these uses (some devolve to *rldcl*); see Appendix D, "Assembler Extended Mnemonics" on page 383. #### Rotate Left Doubleword Immediate then Mask Insert MD-form | rldimi | RA,RS,SH,MB | (Rc=0) | |---------|-------------|--------| | rldimi. | RA,RS,SH,MB | (Rc=1) | | 30 | RS | RA | sh | mb | 3 | sh | Rc | |----|----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 27 | 30 | 31 | ``` \begin{array}{lll} n &\leftarrow sh_5 & || & sh_0 :_4 \\ r &\leftarrow & ROTL_{64} \left( \left( RS \right), & n \right) \\ b &\leftarrow & mb_5 & || & mb_0 :_4 \\ m &\leftarrow & MASK \left( b, & \neg n \right) \\ RA &\leftarrow & r&m & | & (RA) & \neg m \end{array} ``` The contents of register RS are rotated<sub>64</sub> left SH bits. A mask is generated having 1-bits from bit MB through bit 63-SH and 0-bits elsewhere. The rotated data are inserted into register RA under control of the generated mask. #### **Special Registers Altered:** #### **Extended Mnemonics:** Example of extended mnemonics for Rotate Left Doubleword Immediate then Mask Insert. **Extended: Equivalent to:** insrdi Rx,Ry,n,b rldimi Rx,Ry,64-(b+n),b #### **Programming Note** *rldimi* can be used to insert an n-bit field that is right-justified in register RS, into register RA starting at bit position b, by setting SH=64-(b+n) and MB=b. An extended mnemonic is provided for this use; see Appendix D, "Assembler Extended Mnemonics" on page 383. #### 3.3.13.2 Fixed-Point Shift Instructions The instructions in this section perform left and right shifts. #### **Extended mnemonics for shifts** Immediate-form logical (unsigned) shift operations are obtained by specifying appropriate masks and shift values for certain *Rotate* instructions. A set of extended mnemonics is provided to make coding of such shifts simpler and easier to understand. Some of these are shown as examples with the *Rotate* instructions. See Appendix D, "Assembler Extended Mnemonics" on page 383 for additional extended mnemonics. #### Programming Note Any Shift Right Algebraic instruction, followed by *addze*, can be used to divide quickly by 2<sup>n</sup>. The setting of the CA bit by the Shift Right Algebraic instructions is independent of mode. #### Programming Note Multiple-precision shifts can be programmed as shown in Section E.1, "Multiple-Precision Shifts" on page 397. #### Shift Left Word X-form | slw | RA,RS,RB | (Rc=0) | |------|----------|--------| | slw. | RA,RS,RB | (Rc=1) | | 31 | RS | RA | RB | 24 | Rc | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} n \leftarrow (\text{RB})_{59:63} \\ r \leftarrow \text{ROTL}_{32}((\text{RS})_{32:63}, \ n) \\ \text{if (RB)}_{58} = 0 \ \text{then} \\ m \leftarrow \text{MASK}(32, \ 63-n) \\ \text{else m} \leftarrow ^{64}_{0} \\ \text{RA} \leftarrow \text{r \& m} \end{array} ``` The contents of the low-order 32 bits of register RS are shifted left the number of bits specified by (RB) $_{58:63}$ . Bits shifted out of position 32 are lost. Zeros are supplied to the vacated positions on the right. The 32-bit result is placed into RA $_{32:63}$ . RA $_{0:31}$ are set to zero. Shift amounts from 32 to 63 give a zero result. #### **Special Registers Altered:** CR0 (if Rc=1) #### Shift Right Word X-form | srw | RA,RS,RB | (Rc=0) | |------|----------|--------| | srw. | RA,RS,RB | (Rc=1) | | 31 | RS | RA | RB | 536 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} n \leftarrow (RB)_{59:63} \\ r \leftarrow ROTL_{32}((RS)_{32:63}, 64-n) \\ \text{if } (RB)_{58} = 0 \text{ then} \\ m \leftarrow MASK(n+32, 63) \\ \text{else } m \leftarrow {}^{64}0 \\ RA \leftarrow r \& m \end{array} ``` The contents of the low-order 32 bits of register RS are shifted right the number of bits specified by (RB) $_{58:63}$ . Bits shifted out of position 63 are lost. Zeros are supplied to the vacated positions on the left. The 32-bit result is placed into RA $_{32:63}$ . RA $_{0:31}$ are set to zero. Shift amounts from 32 to 63 give a zero result. #### **Special Registers Altered:** CR0 (if Rc=1) #### Shift Right Algebraic Word Immediate X-form | srawi | RA,RS,SH | (Rc=0) | |--------|----------|--------| | srawi. | RA,RS,SH | (Rc=1) | | I | 31 | RS | RA | SH | 824 | Rc | |---|----|----|----|----|-----|----| | | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} {\rm n} \leftarrow {\rm SH} \\ {\rm r} \leftarrow {\rm ROTL}_{32} (\,({\rm RS})_{\,32:63}, \,\, 64\text{-n}) \\ {\rm m} \leftarrow {\rm MASK} ({\rm n+32}, \,\, 63) \\ {\rm s} \leftarrow ({\rm RS})_{\,32} \\ {\rm RA} \leftarrow {\rm r\&m} \mid (^{64}{\rm s})\,\&\neg {\rm m} \\ {\rm CA} \leftarrow {\rm s} \& (\,({\rm r\&\neg m})_{\,32:63} \neq 0) \end{array} ``` The contents of the low-order 32 bits of register RS are shifted right SH bits. Bits shifted out of position 63 are lost. Bit 32 of RS is replicated to fill the vacated positions on the left. The 32-bit result is placed into RA $_{32:63}$ . Bit 32 of RS is replicated to fill RA $_{0:31}$ . CA is set to 1 if the low-order 32 bits of (RS) contain a negative number and any 1-bits are shifted out of position 63; otherwise CA is set to 0. A shift amount of zero causes RA to receive EXTS((RS) $_{32:63}$ ), and CA to be set to 0. #### **Special Registers Altered:** #### Shift Right Algebraic Word X-form | sraw | RA,RS,RB | (Rc=0) | |-------|----------|--------| | sraw. | RA,RS,RB | (Rc=1) | | 31 | RS | RA | RB | 792 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` n \leftarrow (RB)_{59:63} r \leftarrow ROTL_{32}((RS)_{32:63}, 64-n) if (RB)_{58} = 0 then m \leftarrow MASK(n+32, 63) else m \leftarrow ^{64}0 s \leftarrow (RS)_{32} RA \leftarrow r&m \mid (^{64}s)\& \neg m CA \leftarrow s \& ((r\& \neg m)_{32:63} \neq 0) ``` The contents of the low-order 32 bits of register RS are shifted right the number of bits specified by (RB)<sub>58:63</sub>. Bits shifted out of position 63 are lost. Bit 32 of RS is replicated to fill the vacated positions on the left. The 32-bit result is placed into RA<sub>32:63</sub>. Bit 32 of RS is replicated to fill RA<sub>0:31</sub>. CA is set to 1 if the low-order 32 bits of (RS) contain a negative number and any 1-bits are shifted out of position 63; otherwise CA is set to 0. A shift amount of zero causes RA to receive EXTS((RS)<sub>32:63</sub>), and CA to be set to 0. Shift amounts from 32 to 63 give a result of 64 sign bits, and cause CA to receive the sign bit of (RS)<sub>32:63</sub>. #### **Special Registers Altered:** # 3.3.13.2.1 64-bit Fixed-Point Shift Instructions [Category: 64-Bit] # Shift Left Doubleword X-form sld RA,RS,RB (Rc=0) sld. RA,RS,RB (Rc=1) 31 RS RA RB 27 Rc 0 6 11 16 21 31 $$\begin{array}{l} \text{n} \leftarrow (\text{RB})_{58:63} \\ \text{r} \leftarrow \text{ROTL}_{64} ((\text{RS}) \text{, n}) \\ \text{if } (\text{RB})_{57} = \text{0 then} \\ \text{m} \leftarrow \text{MASK}(\text{0, 63-n}) \\ \text{else m} \leftarrow ^{64} \text{0} \\ \text{RA} \leftarrow \text{r} \text{\& m} \end{array}$$ The contents of register RS are shifted left the number of bits specified by $(RB)_{57:63}$ . Bits shifted out of position 0 are lost. Zeros are supplied to the vacated positions on the right. The result is placed into register RA. Shift amounts from 64 to 127 give a zero result. #### **Special Registers Altered:** #### Shift Right Doubleword X-form | srd | RA,RS,RB | (Rc=0) | |------|----------|--------| | srd. | RA,RS,RB | (Rc=1) | | | | | | 31 | RS | RA | RB | 539 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \text{n} \leftarrow (\text{RB})_{58:63} \\ \text{r} \leftarrow \text{ROTL}_{64} ((\text{RS}) \text{, } 64\text{-n}) \\ \text{if } (\text{RB})_{57} = \text{0 then} \\ \text{m} \leftarrow \text{MASK} (\text{n, } 63) \\ \text{else m} \leftarrow ^{64} \text{0} \\ \text{RA} \leftarrow \text{r} \text{\& m} \end{array} ``` The contents of register RS are shifted right the number of bits specified by $(RB)_{57:63}$ . Bits shifted out of position 63 are lost. Zeros are supplied to the vacated positions on the left. The result is placed into register RA. Shift amounts from 64 to 127 give a zero result. #### **Special Registers Altered:** #### Shift Right Algebraic Doubleword Immediate XS-form | sradi | RA,RS,SH | (Rc=0) | |--------|----------|--------| | sradi. | RA,RS,SH | (Rc=1) | | 31 | RS | RA | sh | 413 | sh | Rc | |----|----|----|----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 30 | 31 | ``` \begin{array}{lll} n & \leftarrow & sh_5 & \big| \big| & sh_{0:4} \\ r & \leftarrow & ROTL_{64} \left( (RS) \,, \, 64-n \right) \\ m & \leftarrow & MASK(n, \, 63) \\ s & \leftarrow & (RS)_0 \\ RA & \leftarrow & r\&m \mid (^{64}s)\& \neg m \\ CA & \leftarrow & s \& \left( (r\& \neg m) \neq 0 \right) \end{array} ``` The contents of register RS are shifted right SH bits. Bits shifted out of position 63 are lost. Bit 0 of RS is replicated to fill the vacated positions on the left. The result is placed into register RA. CA is set to 1 if (RS) is negative and any 1-bits are shifted out of position 63; otherwise CA is set to 0. A shift amount of zero causes RA to be set equal to (RS), and CA to be set to 0. #### Special Registers Altered: #### Shift Right Algebraic Doubleword X-form | srad | RA,RS,RB | (Rc=0) | |-------|----------|--------| | srad. | RA,RS,RB | (Rc=1) | | 31 | RS | RA | RB | 794 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` n ← (RB)<sub>58:63</sub> r ← ROTL<sub>64</sub>((RS), 64-n) if (RB)<sub>57</sub> = 0 then m ← MASK(n, 63) else m ← ^{64}0 s ← (RS)<sub>0</sub> RA ← r&m | ^{64}s)&¬m CA ← s & ((r&¬m)≠0) ``` The contents of register RS are shifted right the number of bits specified by $(RB)_{57:63}$ . Bits shifted out of position 63 are lost. Bit 0 of RS is replicated to fill the vacated positions on the left. The result is placed into register RA. CA is set to 1 if (RS) is negative and any 1-bits are shifted out of position 63; otherwise CA is set to 0. A shift amount of zero causes RA to be set equal to (RS), and CA to be set to 0. Shift amounts from 64 to 127 give a result of 64 sign bits in RA, and cause CA to receive the sign bit of (RS). #### **Special Registers Altered:** #### 3.3.14 Move To/From System Register Instructions The Move To Condition Register Fields instruction has a preferred form; see Section 1.8.1, "Preferred Instruction Forms" on page 21. In the preferred form, the FXM field satisfies the following rule. ■ Exactly one bit of the FXM field is set to 1. #### **Extended mnemonics** Extended mnemonics are provided for the *mtspr* and *mfspr* instructions so that they can be coded with the SPR name as part of the mnemonic rather than as a numeric operand. An extended mnemonic is provided for the *mtcrf* instruction for compatibility with old software (written for a version of the architecture that precedes Version 2.00) that uses it to set the entire Condition Register. Some of these extended mnemonics are shown as examples with the relevant instructions. See Appendix D, "Assembler Extended Mnemonics" on page 383 for additional extended mnemonics. #### Move To Special Purpose Register XFX-form mtspr SPR,RS | 31 | RS | spr | 467 | / | |----|----|-----|-----|----| | 0 | 6 | 11 | 21 | 31 | ``` \begin{array}{ll} n \leftarrow spr_{5:9} \mid \mid spr_{0:4} \\ \text{if length}(SPR(n)) = 64 \text{ then} \\ SPR(n) \leftarrow (RS) \\ \text{else} \\ SPR(n) \leftarrow (RS)_{32:63} \end{array} ``` The SPR field denotes a Special Purpose Register, encoded as shown in the table below. The contents of register RS are placed into the designated Special Purpose Register. For Special Purpose Registers that are 32 bits long, the low-order 32 bits of RS are placed into the SPR. | ٦ | ecimal | SPR <sup>1</sup> | Register | | | |---|-----------------------------------------|---------------------------------------|----------------------|--|--| | u | ecimai | spr <sub>5:9</sub> spr <sub>0:4</sub> | Name | | | | | 1 | 00000 00001 | XER | | | | | 8 | 00000 01000 | LR | | | | | 9 | 00000 01001 | CTR | | | | | 256 | 01000 00000 | VRSAVE <sup>2</sup> | | | | | 512 | 10000 00000 | SPEFSCR <sup>3</sup> | | | | | 896 | 11100 00000 | PPR <sup>4</sup> | | | | 1 | Note tha | t the order of the tw | o 5-bit halves | | | | | of the SF | PR number is rever | sed. | | | | 2 | Category: Embedded and Vector ( <e></e> | | | | | | | see Prog | gramming Note in S | Section 3.2.4). | | | | 3 | Category | /: SPE. | | | | If execution of this instruction specifying an SPR number other than one of the values shown above is attempted, then one of the following occurs. - If spr<sub>0</sub> = 0, the illegal instruction error handler is invoked. - If spr<sub>0</sub> = 1, the system privileged instruction error handler is invoked. A complete description of this instruction can be found in Book III. #### **Special Registers Altered:** See above 4 Category: Server. #### **Extended Mnemonics:** Examples of extended mnemonics for *Move To Special Purpose Register*: | Extended: | | Equiva | lent to: | |-----------|----|--------|----------| | mtxer | Rx | mtspr | 1,Rx | | mtlr | Rx | mtspr | 8,Rx | | mtctr | Rx | mtspr | 9,Rx | #### Compiler and Assembler Note - For the *mtspr* and *mfspr* instructions, the SPR number coded in assembler language does not appear directly as a 10-bit binary number in the instruction. The number coded is split into two 5-bit halves that are reversed in the instruction, with the high-order 5 bits appearing in bits 16:20 of the instruction and the low-order 5 bits in bits 11:15. ### Move From Special Purpose Register XFX-form mfspr RT,SPR | 31 | RT | spr | 339 | / | |----|----|-----|-----|----| | 0 | 6 | 11 | 21 | 31 | $$\begin{array}{lll} n \leftarrow spr_{5.9} & \mid \mid spr_{0:4} \\ \text{if length}(SPR(n)) & = 64 \text{ then} \\ RT \leftarrow SPR(n) \\ \text{else} \\ RT \leftarrow & ^{32}0 & \mid \mid SPR(n) \end{array}$$ The SPR field denotes a Special Purpose Register, encoded as shown in the table below. The contents of the designated Special Purpose Register are placed into register RT. For Special Purpose Registers that are 32 bits long, the low-order 32 bits of RT receive the contents of the Special Purpose Register and the high-order 32 bits of RT are set to zero. | decimal | SPR <sup>1</sup> | Register | |---------|---------------------------------------|----------------------| | decimai | spr <sub>5:9</sub> spr <sub>0:4</sub> | Name | | 1 | 00000 00001 | XER | | 8 | 00000 01000 | LR | | 9 | 00000 01001 | CTR | | 136 | 00100 01000 | CTRL | | 256 | 01000 00000 | VRSAVE <sup>2</sup> | | 259 | 01000 00011 | SPRG3 | | 260 | 01000 00100 | SPRG4 <sup>3</sup> | | 261 | 01000 00101 | SPRG5 <sup>3</sup> | | 262 | 01000 00110 | SPRG6 <sup>3</sup> | | 263 | 01000 00111 | SPRG7 <sup>3</sup> | | 268 | 01000 01100 | TB <sup>4</sup> | | 269 | 01000 01101 | TBU <sup>4</sup> | | 512 | 10000 00000 | SPEFSCR <sup>5</sup> | | 526 | 10000 01110 | ATB <sup>4,6</sup> | | 527 | 10000 01111 | ATBU <sup>4,6</sup> | | 896 | 11100 00000 | PPR <sup>7</sup> | - Note that the order of the two 5-bit halves of the SPR number is reversed. - <sup>2</sup> Category: Embedded and Vector (<E> see Programming Note in Section 3.2.4). - 3 Category: Embedded. - See Chapter 4 of Book II. - <sup>5</sup> Category: SPE. - 6 Category: Alternate Time Base. - <sup>7</sup> Category: Server. If execution of this instruction specifying an SPR number other than one of the values shown above is attempted, then one of the following occurs. - If spr<sub>0</sub> = 0, the illegal instruction error handler is invoked. - If spr<sub>0</sub> = 1, the system privileged instruction error handler is invoked. A complete description of this instruction can be found in Book III. #### **Special Registers Altered:** None #### **Extended Mnemonics:** Examples of extended mnemonics for Move From Special Purpose Register: | Extended: | | Equiva | Equivalent to: | | | |-----------|----|--------|----------------|--|--| | mfxer | Rx | mfspr | Rx,1 | | | | mflr | Rx | mfspr | Rx,8 | | | | mfctr | Rx | mfspr | Rx,9 | | | #### - Note - See the Notes that appear with *mtspr*. #### Move To Condition Register Fields XFX-form mtcrf FXM,RS | 31 | RS | 0 | FXM | / | 144 | / | |----|----|----|-----|----|-----|----| | 0 | 6 | 11 | 12 | 20 | 21 | 31 | The contents of bits 32:63 of register RS are placed into the Condition Register under control of the field mask specified by FXM. The field mask identifies the 4-bit fields affected. Let i be an integer in the range 0-7. If $FXM_i=1$ then CR field i (CR bits $4\times i+32:4\times i+35$ ) is set to the contents of the corresponding field of the low-order 32 bits of RS. #### **Special Registers Altered:** CR fields selected by mask #### **Extended Mnemonics:** Example of extended mnemonics for *Move To Condition Register Fields*: Extended: Equivalent to: mtcr Rx mtcrf 0xFF,Rx #### Programming Note In the preferred form of this instruction (*mtocrf*), only one Condition Register field is updated. #### Move From Condition Register XFX-form mfcr RT $$RT \leftarrow ^{32}O \mid \mid CR$$ The contents of the Condition Register are placed into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. #### **Special Registers Altered:** None # 3.3.14.1 Move to/From One Condition Register Field Instructions [Category: Phased-In (sV2.05)] ### Move To One Condition Register Field XFX-form mtocrf FXM,RS [Category: Phased-In] | 31 | RS | 1 | FXM | / | 144 | 7 | |----|----|----|-----|----|-----|----| | 0 | 6 | 11 | 12 | 20 | 21 | 31 | If exactly one bit of the FXM field is set to 1, let n be the position of that bit in the field ( $0 \le n \le 7$ ). The contents of bits $4 \times n + 32 : 4 \times n + 35$ of register RS are placed into CR field n (CR bits $4 \times n + 32 : 4 \times n + 35$ ). Otherwise, the contents of the Condition Register are undefined. #### **Special Registers Altered:** CR field selected by FXM #### Programming Note - These forms of the *mtcrf* and *mfcr* instructions are intended to replace the old forms of the instructions (the forms shown in page 95), which will eventually be phased out of the architecture. The new forms are backward compatible with most processors that comply with versions of the architecture that precede Version 2.00. On those processors, the new forms are treated as the old forms. However, on some processors that comply with versions of the architecture that precede Version 2.00 the new forms may be treated as follows: **mtocrf:** may cause the system illegal instruction error handler to be invoked mfocrf: may place an undefined value into register RT #### Move From One Condition Register Field XFX-form mfocrf RT,FXM [Category: Phased-In] | 31 | RT | 1 FXM | / 19 | / | |----|----|-------|-------|----| | 0 | 6 | 11 12 | 20 21 | 31 | ``` \begin{split} \text{RT} &\leftarrow \text{undefined} \\ \text{count} &\leftarrow 0 \\ \text{do i = 0 to 7} \\ &\quad \text{if } \text{FXM}_i = 1 \text{ then} \\ &\quad \text{n} &\leftarrow \text{i} \\ &\quad \text{count} &\leftarrow \text{count} + 1 \\ \text{if } \text{count} &= 1 \text{ then} \\ &\quad \text{RT}_{4 \times n + 32 : 4 \times n + 35} &\leftarrow \text{CR}_{4 \times n + 32 : 4 \times n + 35} \end{split} ``` If exactly one bit of the FXM field is set to 1, let n be the position of that bit in the field ( $0 \le n \le 7$ ). The contents of CR field n (CR bits $4 \times n + 32 \cdot 4 \times n + 35$ ) are placed into bits $4 \times n + 32 \cdot 4 \times n + 35$ of register RT and the contents of the remaining bits of register RT are undefined. Otherwise, the contents of register RT are undefined. #### **Special Registers Altered:** None #### 3.3.14.2 Move To/From System Registers [Category: Embedded] #### Move to Condition Register from XER X-form mcrxr BF | 31 | BF | // | /// | /// | 512 | / | |----|----|----|-----|-----|-----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | $\begin{array}{c} \operatorname{CR}_{4 \times \operatorname{BF} + 32:4 \times \operatorname{BF} + 35} \leftarrow \operatorname{XER}_{32:35} \\ \operatorname{XER}_{32:35} \leftarrow \operatorname{Ob0000} \end{array}$ The contents of XER<sub>32:35</sub> are copied to Condition Register field BF. XER<sub>32:35</sub> are set to zero. #### **Special Registers Altered:** CR field BF XER<sub>32:35</sub> #### Move From APID Indirect X-form mfapidi RT,RA | 31 | RT | RA | /// | 275 | / | |----|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | RT ← implementation-dependent value based on (RA) The contents of RA are provided to any auxiliary processors that may be present. A value, that is implementation-dependent, is placed in RT. #### **Special Registers Altered:** None #### Programming Note - This instruction is provided as a mechanism for software to query the presence and configuration of one or more auxiliary processors. See the implementation for details on the behavior of this instruction. # Move To Device Control Register User-mode Indexed X-form mtdcrux RS,RA | 31 | RS | RA | /// | 419 | / | |----|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | DCRN ← (RA) DCR(DCRN) ← RS Let the contents of register RA denote a Device Control Register. (The supported Device Control Registers are implementation-dependent.) The contents of RS are placed into the designated Device Control Register. For 32-bit Device Control Registers, the contents of bits 32:63 of RS are placed into the Device Control Register. See "Move To Device Control Register Indexed X-form" on page 624 in Book III for more information on this instruction. #### **Special Registers Altered:** Implementation-dependent # Move From Device Control Register User-mode Indexed X-form mfdcrux RT,RA | 31 | RT | RA | /// | 291 | / | |----|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | DCRN ← (RA) RT ← DCR (DCRN) Let the contents of register RA denote a Device Control Register. (The supported Device Control Registers are implementation-dependent.) The contents of the designated Device Control Register are placed into RT. For 32-bit Device Control Registers, the contents of bits 32:63 of the designated Device Control Register are placed into RT. See "Move From Device Control Register Indexed X-form" on page 625 in Book III for more information on this instruction. #### **Special Registers Altered:** Implementation-dependent # **Chapter 4. Floating-Point Processor [Category: Floating-Point]** | 4.1 Floating-Point Processor Overview 99 | 4.5.2 Execution Model for | |----------------------------------------------------------------------------|-------------------------------------------------| | 4.2 Floating-Point Processor Registers 100 | Multiply-Add Type Instructions 115 | | 4.2.1 Floating-Point Registers 100 4.2.2 Floating-Point Status and Control | 4.6 Floating-Point Processor Instructions . 116 | | Register | 4.6.1 Floating-Point Storage Access | | 4.3 Floating-Point Data 103 | Instructions | | 4.3.1 Data Format | 4.6.1.1 Storage Access Exceptions 117 | | 4.3.2 Value Representation 104 | 4.6.2 Floating-Point Load Instructions. 117 | | 4.3.3 Sign of Result 105 | 4.6.3 Floating-Point Store Instructions 121 | | 4.3.4 Normalization and | 4.6.4 Floating-Point Load Store Double- | | Denormalization | word Pair Instructions [Category: Float- | | 4.3.5 Data Handling and Precision 106 | ing-Point.Phased-Out] | | 4.3.5.1 Single-Precision Operands 106 | 4.6.5 Floating-Point Move Instructions 126 | | 4.3.5.2 Integer-Valued Operands 107 | 4.6.6 Floating-Point Arithmetic Instructions | | 4.3.6 Rounding 107 | 127 | | 4.4 Floating-Point Exceptions 108 | 4.6.6.1 Floating-Point Elementary Arith- | | 4.4.1 Invalid Operation Exception 110 | metic Instructions | | 4.4.1.1 Definition | 4.6.6.2 Floating-Point Multiply-Add Instruc- | | 4.4.1.2 Action | tions132 | | 4.4.2 Zero Divide Exception | 4.6.7 Floating-Point Rounding and Con- | | 4.4.2.1 Definition111 | version Instructions | | 4.4.2.2 Action | 4.6.7.1 Floating-Point Rounding Instruc- | | 4.4.3 Overflow Exception | tion134 | | 4.4.3.1 Definition111 | 4.6.7.2 Floating-Point Convert To/From | | 4.4.3.2 Action | Integer Instructions | | 4.4.4 Underflow Exception 112 | 4.6.7.3 Floating Round to Integer Instruc- | | 4.4.4.1 Definition | tions [Category: Floating-Point.Phased-In | | 4.4.4.2 Action | (sV2.05)] | | 4.4.5 Inexact Exception | 4.6.8 Floating-Point Compare Instructions | | 4.4.5.1 Definition | 138 | | 4.4.5.2 Action | 4.6.9 Floating-Point Select Instruction 139 | | 4.5 Floating-Point Execution Models . 113 | 4.6.10 Floating-Point Status and Control | | 4.5.1 Execution Model for IEEE Opera- | Register Instructions 140 | | tions 113 | | # **4.1 Floating-Point Processor** Overview This chapter describes the registers and instructions that make up the Floating-Point Processor facility. The processor (augmented by appropriate software support, where required) implements a floating-point system compliant with the ANSI/IEEE Standard 754-1985, "IEEE Standard for Binary Floating-Point Arithmetic" (hereafter referred to as "the IEEE standard"). That standard defines certain required "operations" (addition, subtraction, etc.). Herein, the term "floating-point operation" is used to refer to one of these required operations and to additional operations defined (e.g., those performed by *Multiply-Add* or *Reciprocal Estimate* instructions). A Non-IEEE mode is also provided. This mode, which may produce results not in strict compliance with the IEEE standard, allows shorter latency. Instructions are provided to perform arithmetic, rounding, conversion, comparison, and other operations in floating-point registers; to move floating-point data between storage and these registers; and to manipulate the Floating-Point Status and Control Register explicitly. These instructions are divided into two categories. #### computational instructions The computational instructions are those that perform addition, subtraction, multiplication, division, extracting the square root, rounding, conversion, comparison, and combinations of these operations. These instructions provide the floating-point operations. They place status information into the Floating-Point Status and Control Register. They are the instructions described in Sections 4.6.6 through 4.6.8. #### non-computational instructions The non-computational instructions are those that perform loads and stores, move the contents of a floating-point register to another floating-point register possibly altering the sign, manipulate the Floating-Point Status and Control Register explicitly, and select the value from one of two floating-point registers based on the value in a third floating-point register. The operations performed by these instructions are not considered floating-point operations. With the exception of the instructions that manipulate the Floating-Point Status and Control Register explicitly, they do not alter the Floating-Point Status and Control Register. They are the instructions described in Sections 4.6.2 through 4.6.5, and 4.6.10. A floating-point number consists of a signed exponent and a signed significand. The quantity expressed by this number is the product of the significand and the number 2<sup>exponent</sup>. Encodings are provided in the data format to represent finite numeric values, ±Infinity, and values that are "Not a Number" (NaN). Operations involving infinities produce results obeying traditional mathematical conventions. NaNs have no mathematical interpretation. Their encoding permits a variable diagnostic information field. They may be used to indicate such things as uninitialized variables and can be produced by certain invalid operations. There is one class of exceptional events that occur during instruction execution that is unique to the Floating-Point Processor: the Floating-Point Exception. Floating-point exceptions are signaled with bits set in the Floating-Point Status and Control Register (FPSCR). They can cause the system floating-point enabled exception error handler to be invoked, precisely or imprecisely, if the proper control bits are set. #### Floating-Point Exceptions The following floating-point exceptions are detected by the processor: | Invalid Operation Exception | (VX) | |-----------------------------|----------| | SNaN | (VXSNAN) | | Infinity - Infinity | (VXISI) | | Infinity÷Infinity | (VXIDI) | | Zero÷Zero | (VXZDZ) | | Infinity×Zero | (VXIMZ) | | Invalid Compare | (VXVC) | | Software-Defined Condition | (VXSOFT) | | Invalid Square Root | (VXSQRT) | | Invalid Integer Convert | (VXCVI) | | Zero Divide Exception | (ZX) | | Overflow Exception | (OX) | | Underflow Exception | (UX) | | Inexact Exception | (XX) | Each floating-point exception, and each category of Invalid Operation Exception, has an exception bit in the FPSCR. In addition, each floating-point exception has a corresponding enable bit in the FPSCR. See Section 4.2.2, "Floating-Point Status and Control Register" on page 101 for a description of these exception and enable bits, and Section 4.4, "Floating-Point Exceptions" on page 108 for a detailed discussion of floating-point exceptions, including the effects of the enable bits. # 4.2 Floating-Point Processor Registers #### 4.2.1 Floating-Point Registers Implementations of this architecture provide 32 floating-point registers (FPRs). The floating-point instruction formats provide 5-bit fields for specifying the FPRs to be used in the execution of the instruction. The FPRs are numbered 0-31. See Figure 46 on page 101. Each FPR contains 64 bits that support the floating-point double format. Every instruction that interprets the contents of an FPR as a floating-point value uses the floating-point double format for this interpretation. The computational instructions, and the *Move* and *Select* instructions, operate on data located in FPRs and, with the exception of the *Compare* instructions, place the result value into an FPR and optionally (when Rc=1) place status information into the Condition Reg- ister. Instruction forms with Rc=1 are part of Category: Floating-Point.Record. Load Double and Store Double instructions are provided that transfer 64 bits of data between storage and the FPRs with no conversion. Load Single instructions are provided to transfer and convert floating-point values in floating-point single format from storage to the same value in floating-point double format in the FPRs. Store Single instructions are provided to transfer and convert floating-point values in floating-point double format from the FPRs to the same value in floating-point single format in storage. Instructions are provided that manipulate the Floating-Point Status and Control Register and the Condition Register explicitly. Some of these instructions copy data from an FPR to the Floating-Point Status and Control Register or vice versa. The computational instructions and the *Select* instruction accept values from the FPRs in double format. For single-precision arithmetic instructions, all input values must be representable in single format; if they are not, the result placed into the target FPR, and the setting of status bits in the FPSCR and in the Condition Register (if Rc=1), are undefined. | | FPR 0 | | |---|--------|----| | | FPR 1 | | | | | | | | | | | | FPR 30 | | | | FPR 31 | | | 0 | | 63 | Figure 46. Floating-Point Registers # 4.2.2 Floating-Point Status and Control Register The Floating-Point Status and Control Register (FPSCR) controls the handling of floating-point exceptions and records status resulting from the floating-point operations. Bits 32:55 are status bits. Bits 56:63 are control bits. The exception bits in the FPSCR (bits 35:44, 53:55) are sticky; that is, once set to 1 they remain set to 1 until they are set to 0 by an *mcrfs*, *mtfsfi*, *mtfsf*, or *mtfsb0* instruction. The exception summary bits in the FPSCR (FX, FEX, and VX, which are bits 32:34) are not considered to be "exception bits", and only FX is sticky. FEX and VX are simply the ORs of other FPSCR bits. Therefore these two bits are not listed among the FPSCR bits affected by the various instructions. Figure 47. Floating-Point Status and Control Register The bit definitions for the FPSCR are as follows. #### Bit(s) Description #### 0:31 Reserved 32 Floating-Point Exception Summary (FX) Every floating-point instruction, except mtfsfi and mtfsf, implicitly sets FPSCR<sub>FX</sub> to 1 if that instruction causes any of the floating-point exception bits in the FPSCR to change from 0 to 1. mcrfs, mtfsfi, mtfsf, mtfsb0, and mtfsb1 can alter FPSCR<sub>FX</sub> explicitly. #### Programming Note - FPSCR<sub>FX</sub> is defined not to be altered implicitly by *mtfsfi* and *mtfsf* because permitting these instructions to alter FPSCR<sub>FX</sub> implicitly could cause a paradox. An example is an *mtfsfi* or *mtfsf* instruction that supplies 0 for FPSCR<sub>FX</sub> and 1 for FPSCR<sub>OX</sub>, and is executed when FPSCR<sub>OX</sub>=0. See also the Programming Notes with the definition of these two instructions. #### 33 Floating-Point Enabled Exception Summary (FEX) This bit is the OR of all the floating-point exception bits masked by their respective enable bits. *mcrfs*, *mtfsfi*, *mtfsf*, *mtfsb0*, and *mtfsb1* cannot alter FPSCR<sub>FEX</sub> explicitly. # 34 Floating-Point Invalid Operation Exception Summary (VX) This bit is the OR of all the Invalid Operation exception bits. *mcrfs*, *mtfsfi*, *mtfsf*, *mtfsb0*, and *mtfsb1* cannot alter FPSCR<sub>VX</sub> explicitly. - 35 Floating-Point Overflow Exception (OX) See Section 4.4.3, "Overflow Exception" on page 111. - 36 Floating-Point Underflow Exception (UX) See Section 4.4.4, "Underflow Exception" on page 112. - 37 Floating-Point Zero Divide Exception (ZX) See Section 4.4.2, "Zero Divide Exception" on page 111. - 38 Floating-Point Inexact Exception (XX) See Section 4.4.5, "Inexact Exception" on page 113. $\mathsf{FPSCR}_{\mathsf{XX}}$ is a sticky version of $\mathsf{FPSCR}_{\mathsf{FI}}$ (see below). Thus the following rules completely describe how $\mathsf{FPSCR}_{\mathsf{XX}}$ is set by a given instruction. - If the instruction affects FPSCR<sub>FI</sub>, the new value of FPSCR<sub>XX</sub> is obtained by ORing the old value of FPSCR<sub>XX</sub> with the new value of FPSCR<sub>FI</sub>. - If the instruction does not affect FPSCR<sub>FI</sub>, the value of FPSCR<sub>XX</sub> is unchanged. - 39 Floating-Point Invalid Operation Exception (SNaN) (VXSNAN) See Section 4.4.1 "Invalid Operation Excep- See Section 4.4.1, "Invalid Operation Exception" on page 110. - 40 Floating-Point Invalid Operation Exception (∞ ∞) (VXISI) See Section 4.4.1. - Floating-Point Invalid Operation Exception (∞ ÷ ∞) (VXIDI) See Section 4.4.1. - 42 Floating-Point Invalid Operation Exception (0÷0) (VXZDZ) See Section 4.4.1. - 43 Floating-Point Invalid Operation Exception (∞×0) (VXIMZ) See Section 4.4.1. - 44 Floating-Point Invalid Operation Exception (Invalid Compare) (VXVC) See Section 4.4.1. - 45 Floating-Point Fraction Rounded (FR) The last Arithmetic or Rounding and Conversion instruction incremented the fraction during rounding. See Section 4.3.6, "Rounding" on page 107. This bit is not sticky. - 46 Floating-Point Fraction Inexact (FI) The last Arithmetic or Rounding and Conversion instruction either produced an inexact result during rounding or caused a disabled Overflow Exception. See Section 4.3.6. This bit is not sticky. See the definition of $FPSCR_{XX}$ , above, regarding the relationship between $FPSCR_{FI}$ and $FPSCR_{XX}$ . #### 47:51 Floating-Point Result Flags (FPRF) Arithmetic, rounding, and Convert From Integer instructions set this field based on the result placed into the target register and on the target precision, except that if any portion of the result is undefined then the value placed into FPRF is undefined. Floating-point Compare instructions set this field based on the relative values of the operands being compared. For Convert To Integer instructions, the value placed into FPRF is undefined. Additional details are given below. #### **Programming Note** A single-precision operation that produces a denormalized result sets FPRF to indicate a denormalized number. When possible, single-precision denormalized numbers are represented in normalized double format in the target register. 47 Floating-Point Result Class Descriptor (C) Arithmetic, rounding, and *Convert From Integer* instructions may set this bit with the FPCC bits, to indicate the class of the result as shown in Figure 48 on page 103. 48:51 Floating-Point Condition Code (FPCC) Floating-point *Compare* instructions set one of the FPCC bits to 1 and the other three FPCC bits to 0. Arithmetic, rounding, and *Convert From Integer* instructions may set the FPCC bits with the C bit, to indicate the class of the result as shown in Figure 48 on page 103. Note that in this case the high-order three bits of the FPCC retain their relational significance indicating that the value is less than, greater than, or equal to zero. - 48 Floating-Point Less Than or Negative (FL or <) - 49 Floating-Point Greater Than or Positive (FG or >) - 50 Floating-Point Equal or Zero (FE or =) - 51 Floating-Point Unordered or NaN (FU or ?) - 52 Reserved - 53 Floating-Point Invalid Operation Exception (Software-Defined Condition) (VXSOFT) This bit can be altered only by *mcrfs*, *mtfsfi*, *mtfsb1*, See Section 4.4.1. #### **Programming Note -** FPSCR<sub>VXSOFT</sub> can be used by software to indicate the occurrence of an arbitrary, software-defined, condition that is to be treated as an Invalid Operation Exception. For example, the bit could be set by a program that computes a base 10 logarithm if the supplied input is negative. 54 Floating-Point Invalid Operation Exception (Invalid Square Root) (VXSQRT) See Section 4.4.1. - 55 Floating-Point Invalid Operation Exception (Invalid Integer Convert) (VXCVI) See Section 4.4.1. - 56 Floating-Point Invalid Operation Exception Enable (VE) See Section 4.4.1. - 57 Floating-Point Overflow Exception Enable (OE) See Section 4.4.3, "Overflow Exception" on page 111. - 58 Floating-Point Underflow Exception Enable (UE) See Section 4.4.4, "Underflow Exception" on page 112. - 59 Floating-Point Zero Divide Exception Enable (ZE) See Section 4.4.2, "Zero Divide Exception" on page 111. - 60 Floating-Point Inexact Exception Enable (XE) See Section 4.4.5, "Inexact Exception" on page 113. - 61 Floating-Point Non-IEEE Mode (NI) Floating-point non-IEEE mode is optional. If floating-point non-IEEE mode is not implemented, this bit is treated as reserved, and the remainder of the definition of this bit does not apply. - If floating-point non-IEEE mode is implemented, this bit has the following meaning. - 0 The processor is not in floating-point non-IEEE mode (i.e., all floating-point operations conform to the IEEE standard). - 1 The processor is in floating-point non-IEEE mode. When the processor is in floating-point non-IEEE mode, the remaining FPSCR bits may have meanings different from those given in this document, and floating-point operations need not conform to the IEEE standard. The effects of executing a given floating-point instruction with FPSCR $_{\rm NI}$ =1, and any additional requirements for using non-IEEE mode, are implementation-dependent. The results of executing a given instruction in non-IEEE mode may vary between implementations, and between different executions on the same implementation. #### Programming Note - When the processor is in floating-point non-IEEE mode, the results of floating-point operations may be approximate, and performance for these operations may be better, more predictable, or less data-dependent than when the processor is not in non-IEEE mode. For example, in non-IEEE mode an implementation may return 0 instead of a denormalized number, and may return a large number instead of an infinity. - 62:63 *Floating-Point Rounding Control* (RN) See Section 4.3.6, "Rounding" on page 107. - 00 Round to Nearest - 01 Round toward Zero - 10 Round toward +Infinity - 11 Round toward -Infinity | Result<br>Flags | Result Value Class | |-----------------|---------------------------------------| | C < > = ? | | | 1 0 0 0 1 | Quiet NaN | | 0 1 0 0 1 | - Infinity | | 0 1 0 0 0 | <ul> <li>Normalized Number</li> </ul> | | 1 1 0 0 0 | - Denormalized Number | | 10010 | - Zero | | 0 0 0 1 0 | + Zero | | 10100 | + Denormalized Number | | 0 0 1 0 0 | + Normalized Number | | 0 0 1 0 1 | + Infinity | Figure 48. Floating-Point Result Flags #### 4.3 Floating-Point Data #### 4.3.1 Data Format This architecture defines the representation of a floating-point value in two different binary fixed-length formats. The format may be a 32-bit single format for a single-precision value or a 64-bit double format for a double-precision value. The single format may be used for data in storage. The double format may be used for data in storage and for data in floating-point registers. The lengths of the exponent and the fraction fields differ between these two formats. The structure of the single and double formats is shown below. | S EXP | | FRACTION | | |-------|----|----------|----| | 32 33 | 41 | | 63 | Figure 49. Floating-point single format Figure 50. Floating-point double format Values in floating-point format are composed of three fields: S sign bit EXP exponent+bias FRACTION fraction Representation of numeric values in the floating-point formats consists of a sign bit (S), a biased exponent (EXP), and the fraction portion (FRACTION) of the significand. The significand consists of a leading implied bit concatenated on the right with the FRACTION. This leading implied bit is 1 for normalized numbers and 0 for denormalized numbers and is located in the unit bit position (i.e., the first bit to the left of the binary point). Values representable within the two floating-point formats can be specified by the parameters listed in Figure 51. | | Format | | |------------------|--------|--------| | | Single | Double | | | | | | Exponent Bias | +127 | +1023 | | Maximum Exponent | +127 | +1023 | | Minimum Exponent | -126 | -1022 | | Widths (bits) | | | | Format | 32 | 64 | | Sign | 1 | 1 | | Exponent | 8 | 11 | | Fraction | 23 | 52 | | Significand | 24 | 53 | Figure 51. IEEE floating-point fields The architecture requires that the FPRs of the Floating-Point Processor support the floating-point double format only. #### 4.3.2 Value Representation This architecture defines numeric and non-numeric values representable within each of the two supported formats. The numeric values are approximations to the real numbers and include the normalized numbers, denormalized numbers, and zero values. The non-numeric values representable are the infinities and the Not a Numbers (NaNs). The infinities are adjoined to the real numbers, but are not numbers themselves, and the standard rules of arithmetic do not hold when they are used in an operation. They are related to the real numbers by order alone. It is possible however to define restricted operations among numbers and infini- ties as defined below. The relative location on the real number line for each of the defined entities is shown in Figure 52. Figure 52. Approximation to real numbers The NaNs are not related to the numeric values or infinities by order or value but are encodings used to convey diagnostic information such as the representation of uninitialized variables. The following is a description of the different floating-point values defined in the architecture: #### Binary floating-point numbers Machine representable values used as approximations to real numbers. Three categories of numbers are supported: normalized numbers, denormalized numbers, and zero values. #### **Normalized numbers** (± NOR) These are values that have a biased exponent value in the range: - 1 to 254 in single format - 1 to 2046 in double format They are values in which the implied unit bit is 1. Normalized numbers are interpreted as follows: NOR = $$(-1)^s \times 2^E \times (1.fraction)$$ where s is the sign, E is the unbiased exponent, and 1.fraction is the significand, which is composed of a leading unit bit (implied bit) and a fraction part. The ranges covered by the magnitude (M) of a normalized floating-point number are approximately equal to: Single Format: $$1.2x10^{-38} \le M \le 3.4x10^{38}$$ Double Format: $$2.2x10^{-308} \le M \le 1.8x10^{308}$$ #### Zero values (± 0) These are values that have a biased exponent value of zero and a fraction value of zero. Zeros can have a positive or negative sign. The sign of zero is ignored by comparison operations (i.e., comparison regards +0 as equal to -0). #### **Denormalized numbers** (± DEN) These are values that have a biased exponent value of zero and a nonzero fraction value. They are nonzero numbers smaller in magnitude than the representable normalized numbers. They are values in which the implied unit bit is 0. Denormalized numbers are interpreted as follows: $$DEN = (-1)^{s} \times 2^{Emin} \times (0.fraction)$$ where Emin is the minimum representable exponent value (-126 for single-precision, -1022 for double-precision). #### Infinities $(\pm \infty)$ These are values that have the maximum biased exponent value: 255 in single format 2047 in double format and a zero fraction value. They are used to approximate values greater in magnitude than the maximum normalized value. Infinity arithmetic is defined as the limiting case of real arithmetic, with restricted operations defined among numbers and infinities. Infinities and the real numbers can be related by ordering in the affine sense: - ∞ < every finite number < + ∞ Arithmetic on infinities is always exact and does not signal any exception, except when an exception occurs due to the invalid operations as described in Section 4.4.1, "Invalid Operation Exception" on page 110. For comparison operations, +Infinity compares equal to +Infinity and -Infinity compares equal to -Infinity. #### Not a Numbers (NaNs) These are values that have the maximum biased exponent value and a nonzero fraction value. The sign bit is ignored (i.e., NaNs are neither positive nor negative). If the high-order bit of the fraction field is 0 then the NaN is a *Signaling NaN*; otherwise it is a *Quiet NaN*. Signaling NaNs are used to signal exceptions when they appear as operands of computational instructions. Quiet NaNs are used to represent the results of certain invalid operations, such as invalid arithmetic operations on infinities or on NaNs, when Invalid Operation Exception is disabled (FPSCR<sub>VE</sub>=0). Quiet NaNs propagate through all floating-point operations except ordered comparison, *Floating Round to Single-Precision*, and conversion to integer. Quiet NaNs do not signal exceptions, except for ordered comparison and conversion to integer operations. Specific encodings in QNaNs can thus be preserved through a sequence of floating-point operations, and used to convey diagnostic information to help identify results from invalid operations. When a QNaN is the result of a floating-point operation because one of the operands is a NaN or because a QNaN was generated due to a disabled Invalid Operation Exception, then the following rule is applied to determine the NaN with the high-order fraction bit set to 1 that is to be stored as the result. if (FRA) is a NaN then FRT ← (FRA) else if (FRB) is a NaN then if instruction is *frsp* then FRT ← (FRB)<sub>0:34</sub> || <sup>29</sup>0 else FRT ← (FRB) else if (FRC) is a NaN then FRT ← (FRC) else if generated QNaN then FRT ← generated QNaN If the operand specified by FRA is a NaN, then that NaN is stored as the result. Otherwise, if the operand specified by FRB is a NaN (if the instruction specifies an FRB operand), then that NaN is stored as the result, with the low-order 29 bits of the result set to 0 if the instruction is frsp. Otherwise, if the operand specified by FRC is a NaN (if the instruction specifies an FRC operand), then that NaN is stored as the result. Otherwise, if a QNaN was generated due to a disabled Invalid Operation Exception, then that QNaN is stored as the result. If a QNaN is to be generated as a result, then the QNaN generated has a sign bit of 0, an exponent field of all 1s, and a high-order fraction bit of 1 with all other fraction bits 0. Any instruction that generates a QNaN as the result of a disabled Invalid Operation Exception generates QNaN this 0x7FF8\_0000\_0000\_0000). A double-precision NaN is considered to be representable in single format if and only if the low-order 29 bits of the double-precision NaN's fraction are zero. #### 4.3.3 Sign of Result The following rules govern the sign of the result of an arithmetic, rounding, or conversion operation, when the operation does not yield an exception. They apply even when the operands or results are zeros or infinities. ■ The sign of the result of an add operation is the sign of the operand having the larger absolute value. If both operands have the same sign, the sign of the result of an add operation is the same as the sign of the operands. The sign of the result of the subtract operation x-y is the same as the sign of the result of the add operation x+(-y). When the sum of two operands with opposite sign, or the difference of two operands with the same sign, is exactly zero, the sign of the result is positive in all rounding modes except Round toward -Infinity, in which mode the sign is negative. - The sign of the result of a multiply or divide operation is the Exclusive OR of the signs of the operands. - The sign of the result of a Square Root or Reciprocal Square Root Estimate operation is always positive, except that the square root of -0 is -0 and the reciprocal square root of -0 is -Infinity. - The sign of the result of a Round to Single-Precision, or Convert From Integer, or Round to Integer operation is the sign of the operand being converted. For the *Multiply-Add* instructions, the rules given above are applied first to the multiply operation and then to the add or subtract operation (one of the inputs to the add or subtract operation is the result of the multiply operation). # 4.3.4 Normalization and Denormalization The intermediate result of an arithmetic or *frsp* instruction may require normalization and/or denormalization as described below. Normalization and denormalization do not affect the sign of the result. When an arithmetic or rounding instruction produces an intermediate result which carries out of the significand, or in which the significand is nonzero but has a leading zero bit, it is not a normalized number and must be normalized before it is stored. For the carry-out case, the significand is shifted right one bit, with a one shifted into the leading significand bit, and the exponent is incremented by one. For the leading-zero case, the significand is shifted left while decrementing its exponent by one for each bit shifted, until the leading significand bit becomes one. The Guard bit and the Round bit (see Section 4.5.1, "Execution Model for IEEE Operations" on page 113) participate in the shift with zeros shifted into the Round bit. The exponent is regarded as if its range were unlimited. After normalization, or if normalization was not required, the intermediate result may have a nonzero significand and an exponent value that is less than the minimum value that can be represented in the format specified for the result. In this case, the intermediate result is said to be "Tiny" and the stored result is determined by the rules described in Section 4.4.4, "Underflow Exception". These rules may require denormalization. A number is denormalized by shifting its significand right while incrementing its exponent by 1 for each bit shifted, until the exponent is equal to the format's minimum value. If any significant bits are lost in this shifting process then "Loss of Accuracy" has occurred (See Section 4.4.4, "Underflow Exception" on page 112) and Underflow Exception is signaled. #### 4.3.5 Data Handling and Precision Most of the *Floating-Point Processor Architecture*, including all computational, *Move*, and *Select* instructions, use the floating-point double format to represent data in the FPRs. Single-precision and integer-valued operands may be manipulated using double-precision operations. Instructions are provided to coerce these values from a double format operand. Instructions are also provided for manipulations which do not require double-precision. In addition, instructions are provided to access a true single-precision representation in storage, and a fixed-point integer representation in GPRs. #### 4.3.5.1 Single-Precision Operands For single format data, a format conversion from single to double is performed when loading from storage into an FPR and a format conversion from double to single is performed when storing from an FPR to storage. No floating-point exceptions are caused by these instructions. An instruction is provided to explicitly convert a double format operand in an FPR to single-precision. Floating-point single-precision is enabled with four types of instruction. #### 1. Load Floating-Point Single This form of instruction accesses a single-precision operand in single format in storage, converts it to double format, and loads it into an FPR. No floating-point exceptions are caused by these instructions. #### 2. Round to Floating-Point Single-Precision The Floating Round to Single-Precision instruction rounds a double-precision operand to single-precision, checking the exponent for single-precision range and handling any exceptions according to respective enable bits, and places that operand into an FPR in double format. For results produced by single-precision arithmetic instructions, single-precision loads, and other instances of the Floating Round to Single-Precision instruction, this operation does not alter the value. #### 3. Single-Precision Arithmetic Instructions This form of instruction takes operands from the FPRs in double format, performs the operation as if it produced an intermediate result having infinite precision and unbounded exponent range, and then coerces this intermediate result to fit in single format. Status bits, in the FPSCR and optionally in the Condition Register, are set to reflect the single-precision result. The result is then converted to double format and placed into an FPR. The result lies in the range supported by the single format. All input values must be representable in single format; if they are not, the result placed into the target FPR, and the setting of status bits in the FPSCR and in the Condition Register (if Rc=1), are undefined. #### 4. Store Floating-Point Single This form of instruction converts a double-precision operand to single format and stores that operand into storage. No floating-point exceptions are caused by these instructions. (The value being stored is effectively assumed to be the result of an instruction of one of the preceding three types.) When the result of a *Load Floating-Point Single*, *Floating Round to Single-Precision*, or single-precision arithmetic instruction is stored in an FPR, the low-order 29 FRACTION bits are zero. #### Programming Note — The Floating Round to Single-Precision instruction is provided to allow value conversion from double-precision to single-precision with appropriate exception checking and rounding. This instruction should be used to convert double-precision floating-point values (produced by double-precision load and arithmetic instructions and by fcfid) to single-precision values prior to storing them into single format storage elements or using them as operands for single-precision arithmetic instructions. Values produced by single-precision load and arithmetic instructions are already single-precision values and can be stored directly into single format storage elements, or used directly as operands for single-precision arithmetic instructions, without preceding the store, or the arithmetic instruction, by a Floating Round to Single-Precision instruction. #### Programming Note - A single-precision value can be used in double-precision arithmetic operations. The reverse is true only if the double-precision value is representable in single format. Some implementations may execute single-precision arithmetic instructions faster than double-precision arithmetic instructions. Therefore, if double-precision accuracy is not required, single-precision data and instructions should be used. #### 4.3.5.2 Integer-Valued Operands Instructions are provided to round floating-point operands to integer values in floating-point format. To facilitate exchange of data between the floating-point and fixed-point processors, instructions are provided to convert between floating-point double format and fixed-point integer format in an FPR. Computation on integer-valued operands may be performed using arithmetic instructions of the required precision. (The results may not be integer values.) The two groups of instructions provided specifically to support integer-valued operands are described below. #### 1. Floating Round to Integer The Floating Round to Integer instructions round a double-precision operand to an integer value in floating-point double format. These instructions may cause Invalid Operation (VXSNAN) exceptions. See Sections 4.3.6 and 4.5.1 for more information about rounding. 2. Floating Convert To/From Integer The Floating Convert To Integer instructions convert a double-precision operand to a 32-bit or 64-bit signed fixed-point integer format. Variants are provided both to perform rounding based on the value of FPSCR<sub>RN</sub> and to round toward zero. These instructions may cause Invalid Operation (VXSNaN, VXCVI) and Inexact exceptions. The Floating Convert From Integer instruction converts a 64-bit signed fixed-point integer to a double-precision floating-point integer. Because of the limitations of the source format, only an Inexact exception may be generated. #### 4.3.6 Rounding The material in this section applies to operations that have numeric operands (i.e., operands that are not infinities or NaNs). Rounding the intermediate result of such an operation may cause an Overflow Exception, an Underflow Exception, or an Inexact Exception. The remainder of this section assumes that the operation causes no exceptions and that the result is numeric. See Section 4.3.2, "Value Representation" and Section 4.4, "Floating-Point Exceptions" for the cases not covered here. The Arithmetic and Rounding and Conversion instructions round their intermediate results. With the exception of the Estimate instructions, these instructions produce an intermediate result that can be regarded as having infinite precision and unbounded exponent range. All but two groups of these instructions normalize or denormalize the intermediate result prior to rounding and then place the final result into the target FPR in double format. The Floating Round to Integer and Floating Convert To Integer instructions with biased exponents ranging from 1022 through 1074 are prepared for rounding by repetitively shifting the significand right one position and incrementing the biased exponent until it reaches a value of 1075. (Intermediate results with biased exponents 1075 or larger are already integers, and with biased exponents 1021 or less round to zero.) After rounding, the final result for Floating Round to Integer is normalized and put in double format, and for Floating Convert To Integer is converted to a signed fixed-point integer. FPSCR bits FR and FI generally indicate the results of rounding. Each of the instructions which rounds its intermediate result sets these bits. If the fraction is incremented during rounding then FR is set to 1, otherwise FR is set to 0. If the result is inexact then FI is set to 1, otherwise FI is set to zero. The *Round to Integer* instructions are exceptions to this rule, setting FR and FI to 0. The *Estimate* instructions set FR and FI to undefined values. The remaining floating-point instructions do not alter FR and FI. Four user-selectable rounding modes are provided through the Floating-Point Rounding Control field in the FPSCR. See Section 4.2.2, "Floating-Point Status and Control Register". These are encoded as follows. | RN | Rounding Mode | |----|------------------------| | 00 | Round to Nearest | | 01 | Round toward Zero | | 10 | Round toward +Infinity | | 11 | Round toward -Infinity | Let Z be the intermediate arithmetic result or the operand of a convert operation. If Z can be represented exactly in the target format, then the result in all rounding modes is Z as represented in the target format. If Z cannot be represented exactly in the target format, let Z1 and Z2 bound Z as the next larger and next smaller numbers representable in the target format. Then Z1 or Z2 can be used to approximate the result in the target format. Figure 53 shows the relation of Z, Z1, and Z2 in this case. The following rules specify the rounding in the four modes. "LSB" means "least significant bit". Figure 53. Selection of Z1 and Z2 #### **Round to Nearest** Choose the value that is closer to Z (Z1 or Z2). In case of a tie, choose the one that is even (least significant bit 0). #### **Round toward Zero** Choose the smaller in magnitude (Z1 or Z2). #### Round toward +Infinity Choose Z1. #### Round toward -Infinity Choose Z2. See Section 4.5.1, "Execution Model for IEEE Operations" on page 113 for a detailed explanation of rounding. #### 4.4 Floating-Point Exceptions This architecture defines the following floating-point exceptions: ■ Invalid Operation Exception **SNaN** Infinity - Infinity Infinity + Infinity Zero + Zero Infinity × Zero Invalid Compare Software-Defined Condition Invalid Square Root Invalid Integer Convert - Zero Divide Exception - Overflow Exception - Underflow Exception - Inexact Exception These exceptions, other than Invalid Operation Exception due to Software-Defined Condition, may occur during execution of computational instructions. An Invalid Operation Exception due to Software-Defined Condition occurs when a *Move To FPSCR* instruction sets FPSCR<sub>VXSOFT</sub> to 1. Each floating-point exception, and each category of Invalid Operation Exception, has an exception bit in the FPSCR. In addition, each floating-point exception has a corresponding enable bit in the FPSCR. The exception bit indicates occurrence of the corresponding exception. If an exception occurs, the corresponding enable bit governs the result produced by the instruction and, in conjunction with the FE0 and FE1 bits (see page 109), whether and how the system floating-point enabled exception error handler is invoked. (In general, the enabling specified by the enable bit is of invoking the system error handler, not of permitting the exception to occur. The occurrence of an exception depends only on the instruction and its inputs, not on the setting of any control bits. The only deviation from this general rule is that the occurrence of an Underflow Exception may depend on the setting of the enable bit.) A single instruction, other than *mtfsfi* or *mtfsf*, may set more than one exception bit only in the following cases: - Inexact Exception may be set with Overflow Exception. - Inexact Exception may be set with Underflow Exception. - Invalid Operation Exception (SNaN) is set with Invalid Operation Exception (∞×0) for Multiply-Add instructions for which the values being multiplied are infinity and zero and the value being added is an SNaN. - Invalid Operation Exception (SNaN) may be set with Invalid Operation Exception (Invalid Compare) for Compare Ordered instructions. - Invalid Operation Exception (SNaN) may be set with Invalid Operation Exception (Invalid Integer Convert) for Convert To Integer instructions. When an exception occurs the writing of a result to the target register may be suppressed or a result may be delivered, depending on the exception. The writing of a result to the target register is suppressed for the following kinds of exception, so that there is no possibility that one of the operands is lost: - Enabled Invalid Operation - Enabled Zero Divide For the remaining kinds of exception, a result is generated and written to the destination specified by the instruction causing the exception. The result may be a different value for the enabled and disabled conditions for some of these exceptions. The kinds of exception that deliver a result are the following: - Disabled Invalid Operation - Disabled Zero Divide - Disabled Overflow - Disabled Underflow - Disabled Inexact - Enabled Overflow - Enabled Underflow - Enabled Inexact Subsequent sections define each of the floating-point exceptions and specify the action that is taken when they are detected. The IEEE standard specifies the handling of exceptional conditions in terms of "traps" and "trap handlers". In this architecture, an FPSCR exception enable bit of 1 causes generation of the result value specified in the IEEE standard for the "trap enabled" case; the expectation is that the exception will be detected by software, which will revise the result. An FPSCR exception enable bit of 0 causes generation of the "default result" value specified for the "trap disabled" (or "no trap occurs" or "trap is not implemented") case; the expectation is that the exception will not be detected by software, which will simply use the default result. The result to be delivered in each case for each exception is described in the sections below. The IEEE default behavior when an exception occurs is to generate a default value and not to notify software. In this architecture, if the IEEE default behavior when an exception occurs is desired for all exceptions, all FPSCR exception enable bits should be set to 0 and Ignore Exceptions Mode (see below) should be used. In this case the system floating-point enabled exception error handler is not invoked, even if floating-point exceptions occur: software can inspect the FPSCR exception bits if necessary, to determine whether exceptions have occurred. In this architecture, if software is to be notified that a given kind of exception has occurred, the corresponding FPSCR exception enable bit must be set to 1 and a mode other than Ignore Exceptions Mode must be used. In this case the system floating-point enabled exception error handler is invoked if an enabled float- ing-point exception occurs. The system floating-point enabled exception error handler is also invoked if a *Move To FPSCR* instruction causes an exception bit and the corresponding enable bit both to be 1; the *Move To FPSCR* instruction is considered to cause the enabled exception. The FE0 and FE1 bits control whether and how the system floating-point enabled exception error handler is invoked if an enabled floating-point exception occurs. The location of these bits and the requirements for altering them are described in Book III. (The system floating-point enabled exception error handler is never invoked because of a disabled floating-point exception.) The effects of the four possible settings of these bits are as follows. #### FE0 FE1 Description #### 0 0 Ignore Exceptions Mode Floating-point exceptions do not cause the system floating-point enabled exception error handler to be invoked. #### 0 1 Imprecise Nonrecoverable Mode The system floating-point enabled exception error handler is invoked at some point at or beyond the instruction that caused the enabled exception. It may not be possible to identify the excepting instruction or the data that caused the exception. Results produced by the excepting instruction may have been used by or may have affected subsequent instructions that are executed before the error handler is invoked. #### 1 0 Imprecise Recoverable Mode The system floating-point enabled exception error handler is invoked at some point at or beyond the instruction that caused the enabled exception. Sufficient information is provided to the error handler that it can identify the excepting instruction and the operands, and correct the result. No results produced by the excepting instruction have been used by or have affected subsequent instructions that are executed before the error handler is invoked. #### 1 1 Precise Mode The system floating-point enabled exception error handler is invoked precisely at the instruction that caused the enabled exception. In all cases, the question of whether a floating-point result is stored, and what value is stored, is governed by the FPSCR exception enable bits, as described in subsequent sections, and is not affected by the value of the FE0 and FE1 bits. In all cases in which the system floating-point enabled exception error handler is invoked, all instructions before the instruction at which the system floating-point enabled exception error handler is invoked have completed, and no instruction after the instruction at which the system floating-point enabled exception error handler is invoked has begun execution. The instruction at which the system floating-point enabled exception error handler is invoked has completed if it is the excepting instruction and there is only one such instruction. Otherwise it has not begun execution (or may have been partially executed in some cases, as described in Book III). #### - Programming Note In any of the three non-Precise modes, a *Floating-Point Status and Control Register* instruction can be used to force any exceptions, due to instructions initiated before the *Floating-Point Status and Control Register* instruction, to be recorded in the FPSCR. (This forcing is superfluous for Precise Mode.) In either of the Imprecise modes, a *Floating-Point Status and Control Register* instruction can be used to force any invocations of the system floating-point enabled exception error handler, due to instructions initiated before the *Floating-Point Status and Control Register* instruction, to occur. (This forcing has no effect in Ignore Exceptions Mode, and is superfluous for Precise Mode.) The last sentence of the paragraph preceding this Programming Note can apply only in the Imprecise modes, or if the mode has just been changed from Ignore Exceptions Mode to some other mode. (It always applies in the latter case.) In order to obtain the best performance across the widest range of implementations, the programmer should obey the following guidelines. - If the IEEE default results are acceptable to the application, Ignore Exceptions Mode should be used with all FPSCR exception enable bits set to 0 - If the IEEE default results are not acceptable to the application, Imprecise Nonrecoverable Mode should be used, or Imprecise Recoverable Mode if recoverability is needed, with FPSCR exception enable bits set to 1 for those exceptions for which the system floating-point enabled exception error handler is to be invoked. - Ignore Exceptions Mode should not, in general, be used when any FPSCR exception enable bits are set to 1. - Precise Mode may degrade performance in some implementations, perhaps substantially, and therefore should be used only for debugging and other specialized applications. #### 4.4.1 Invalid Operation Exception #### 4.4.1.1 Definition An Invalid Operation Exception occurs when an operand is invalid for the specified operation. The invalid operations are: - Any floating-point operation on a Signaling NaN (SNaN) - For add or subtract operations, magnitude subtraction of infinities (∞ ∞) - Division of infinity by infinity $(\infty \div \infty)$ - Division of zero by zero $(0 \div 0)$ - Multiplication of infinity by zero ( $\infty \times 0$ ) - Ordered comparison involving a NaN (Invalid Compare) - Square root or reciprocal square root of a negative (and nonzero) number (Invalid Square Root) - Integer convert involving a number too large in magnitude to be represented in the target format, or involving an infinity or a NaN (Invalid Integer Convert) An Invalid Operation Exception also occurs when an *mtfsfi*, *mtfsf*, or *mtfsb1* instruction is executed that sets FPSCR<sub>VXSOFT</sub> to 1 (Software-Defined Condition). #### 4.4.1.2 Action The action to be taken depends on the setting of the Invalid Operation Exception Enable bit of the FPSCR. When Invalid Operation Exception is enabled (FPSCR $_{\text{VE}}$ =1) and an Invalid Operation Exception occurs, the following actions are taken: 1. One or two Invalid Operation Exceptions are set | FPSCR <sub>VXSNAN</sub> | (if SNaN) | |-------------------------|---------------------------| | FPSCR <sub>VXISI</sub> | $(if \infty - \infty)$ | | FPSCR <sub>VXIDI</sub> | $(if \infty \div \infty)$ | | FPSCR <sub>VXZDZ</sub> | $(if 0 \div 0)$ | | FPSCR <sub>VXIMZ</sub> | $(if \infty \times 0)$ | | FPSCR <sub>VXVC</sub> | (if invalid comp) | | FPSCR <sub>VXSOFT</sub> | (if sfw-def cond) | | FPSCR <sub>VXSQRT</sub> | (if invalid sqrt) | | FPSCR <sub>VXCVI</sub> | (if invalid int cvrt) | | | | 2. If the operation is an arithmetic, Floating Round to Single-Precision, Floating Round to Integer, or convert to integer operation, the target FPR is unchanged FPSCR<sub>FR FI</sub> are set to zero FPSCR<sub>FPRF</sub> is unchanged 3. If the operation is a compare, FPSCR<sub>FR FI C</sub> are unchanged FPSCR<sub>FPCC</sub> is set to reflect unordered If an *mtfsfi*, *mtfsf*, or *mtfsb1* instruction is executed that sets FPSCR<sub>VXSOFT</sub> to 1, The FPSCR is set as specified in the instruction description. When Invalid Operation Exception is disabled (FPSCR $_{\text{VE}}$ =0) and an Invalid Operation Exception occurs, the following actions are taken: 1. One or two Invalid Operation Exceptions are set | FPSCR <sub>VXSNAN</sub> | (if SNaN) | |-------------------------|---------------------------| | FPSCR <sub>VXISI</sub> | (if ∞ - ∞) | | FPSCR <sub>VXIDI</sub> | $(if \infty \div \infty)$ | | FPSCR <sub>VXZDZ</sub> | $(if 0 \div 0)$ | | FPSCR <sub>VXIMZ</sub> | $(if \infty \times 0)$ | | FPSCR <sub>VXVC</sub> | (if invalid comp) | | FPSCR <sub>VXSOFT</sub> | (if sfw-def cond) | | FPSCR <sub>VXSQRT</sub> | (if invalid sqrt) | | FPSCR <sub>VXCVI</sub> | (if invalid int cvrt) | 2. If the operation is an arithmetic or *Floating Round* to *Single-Precision* operation, the target FPR is set to a Quiet NaN FPSCR<sub>FR FI</sub> are set to zero FPSCR<sub>FPRF</sub> is set to indicate the class of the result (Quiet NaN) If the operation is a convert to 64-bit integer operation, the target FPR is set as follows: FRT is set to the most positive 64-bit integer if the operand in FRB is a positive number or $+\infty$ , and to the most negative 64-bit integer if the operand in FRB is a negative number, $-\infty$ , or NaN FPSCR<sub>FR FI</sub> are set to zero FPSCR<sub>FPRF</sub> is undefined If the operation is a convert to 32-bit integer operation, the target FPR is set as follows: FRT<sub>0:31</sub> ← undefined FRT<sub>32:63</sub> are set to the most positive 32-bit integer if the operand in FRB is a positive number or +infinity, and to the most negative 32-bit integer if the operand in FRB is a negative number, -infinity, or NaN FPSCR<sub>FR FI</sub> are set to zero FPSCR<sub>FPRF</sub> is undefined 5. If the operation is a compare, FPSCR<sub>FR FI C</sub> are unchanged FPSCR<sub>FPCC</sub> is set to reflect unordered If an mtfsfi, mtfsf, or mtfsb1 instruction is executed that sets FPSCR<sub>VXSOFT</sub> to 1, The FPSCR is set as specified in the instruction description. #### 4.4.2 Zero Divide Exception #### 4.4.2.1 Definition A Zero Divide Exception occurs when a *Divide* instruction is executed with a zero divisor value and a finite nonzero dividend value. It also occurs when a *Reciprocal Estimate* instruction (*fre*[s] or *frsqrte*[s]) is executed with an operand value of zero. #### 4.4.2.2 Action The action to be taken depends on the setting of the Zero Divide Exception Enable bit of the FPSCR. When Zero Divide Exception is enabled (FPSCR<sub>ZE</sub>=1) and a Zero Divide Exception occurs, the following actions are taken: 1. Zero Divide Exception is set FPSCR<sub>ZX</sub> ← 1 - 2. The target FPR is unchanged - 3. FPSCR<sub>FR FI</sub> are set to zero - 4. FPSCR<sub>FPRF</sub> is unchanged When Zero Divide Exception is disabled (FPSCR<sub>ZE</sub>=0) and a Zero Divide Exception occurs, the following actions are taken: 1. Zero Divide Exception is set FPSCR<sub>ZX</sub> ← 1 - The target FPR is set to ± Infinity, where the sign is determined by the XOR of the signs of the operands - 3. FPSCR<sub>FR FI</sub> are set to zero - FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result (± Infinity) #### 4.4.3 Overflow Exception #### 4.4.3.1 Definition An Overflow Exception occurs when the magnitude of what would have been the rounded result if the exponent range were unbounded exceeds that of the largest finite number of the specified result precision. #### 4.4.3.2 Action The action to be taken depends on the setting of the Overflow Exception Enable bit of the FPSCR. When Overflow Exception is enabled (FPSCR<sub>OE</sub>=1) and an Overflow Exception occurs, the following actions are taken: 1. Overflow Exception is set FPSCR<sub>OX</sub> ← 1 - For double-precision arithmetic instructions, the exponent of the normalized intermediate result is adjusted by subtracting 1536 - 3. For single-precision arithmetic instructions and the *Floating Round to Single-Precision* instruction, the exponent of the normalized intermediate result is adjusted by subtracting 192 - 4. The adjusted rounded result is placed into the target FPR - FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result (± Normal Number) When Overflow Exception is disabled (FPSCR<sub>OE</sub>=0) and an Overflow Exception occurs, the following actions are taken: 1. Overflow Exception is set FPSCR<sub>OX</sub> ← 1 2. Inexact Exception is set FPSCR<sub>XX</sub> ← 1 - The result is determined by the rounding mode (FPSCR<sub>RN</sub>) and the sign of the intermediate result as follows: - Round to Nearest Store $\pm$ Infinity, where the sign is the sign of the intermediate result - Round toward Zero Store the format's largest finite number with the sign of the intermediate result - Round toward + Infinity For negative overflow, store the format's most negative finite number; for positive overflow, store +Infinity - Round toward Infinity For negative overflow, store Infinity; for positive overflow, store the format's largest finite number - 4. The result is placed into the target FPR - 5. $FPSCR_{FR}$ is undefined - 6. FPSCR<sub>FI</sub> is set to 1 - 7. FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result ( $\pm$ Infinity or $\pm$ Normal Number) #### 4.4.4 Underflow Exception #### 4.4.4.1 Definition Underflow Exception is defined separately for the enabled and disabled states: Enabled: Underflow occurs when the intermediate result is "Tiny". ■ Disabled: Underflow occurs when the intermediate result is "Tiny" and there is "Loss of Accuracy". A "Tiny" result is detected before rounding, when a nonzero intermediate result computed as though both the precision and the exponent range were unbounded would be less in magnitude than the smallest normalized number. If the intermediate result is "Tiny" and Underflow Exception is disabled (FPSCR<sub>UE</sub>=0) then the intermediate result is denormalized (see Section 4.3.4, "Normalization and Denormalization" on page 106) and rounded (see Section 4.3.6, "Rounding" on page 107) before being placed into the target FPR. "Loss of Accuracy" is detected when the delivered result value differs from what would have been computed were both the precision and the exponent range unbounded. #### 4.4.4.2 Action The action to be taken depends on the setting of the Underflow Exception Enable bit of the FPSCR. When Underflow Exception is enabled (FPSCR $_{\rm UE}$ =1) and an Underflow Exception occurs, the following actions are taken: - 1. Underflow Exception is set $FPSCR_{UX} \leftarrow 1$ - For double-precision arithmetic instructions, the exponent of the normalized intermediate result is adjusted by adding 1536 - For single-precision arithmetic instructions and the Floating Round to Single-Precision instruction, the exponent of the normalized intermediate result is adjusted by adding 192 - The adjusted rounded result is placed into the target FPR - FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result (± Normalized Number) #### Programming Note - The FR and FI bits are provided to allow the system floating-point enabled exception error handler, when invoked because of an Underflow Exception, to simulate a "trap disabled" environment. That is, the FR and FI bits allow the system floating-point enabled exception error handler to unround the result, thus allowing the result to be denormalized. When Underflow Exception is disabled (FPSCR $_{\rm UE}$ =0) and an Underflow Exception occurs, the following actions are taken: - 1. Underflow Exception is set $FPSCR_{UX} \leftarrow 1$ - 2. The rounded result is placed into the target FPR - FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result (± Normalized Number, ± Denormalized Number, or ± Zero) #### 4.4.5 Inexact Exception #### 4.4.5.1 Definition An Inexact Exception occurs when one of two conditions occur during rounding: - The rounded result differs from the intermediate result assuming both the precision and the exponent range of the intermediate result to be unbounded. In this case the result is said to be inexact. (If the rounding causes an enabled Overflow Exception or an enabled Underflow Exception, an Inexact Exception also occurs only if the significands of the rounded result and the intermediate result differ.) - The rounded result overflows and Overflow Exception is disabled. #### 4.4.5.2 Action The action to be taken does not depend on the setting of the Inexact Exception Enable bit of the FPSCR. When an Inexact Exception occurs, the following actions are taken: - Inexact Exception is set FPSCR<sub>XX</sub> ← 1 - The rounded or overflowed result is placed into the target FPR - FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result #### Programming Note - In some implementations, enabling Inexact Exceptions may degrade performance more than does enabling other types of floating-point exception. # 4.5 Floating-Point Execution Models All implementations of this architecture must provide the equivalent of the following execution models to ensure that identical results are obtained. Special rules are provided in the definition of the computational instructions for the infinities, denormalized numbers and NaNs. The material in the remainder of this section applies to instructions that have numeric operands and a numeric result (i.e., operands and result that are not infinities or NaNs), and that cause no exceptions. See Section 4.3.2 and Section 4.4 for the cases not covered here. Although the double format specifies an 11-bit exponent, exponent arithmetic makes use of two additional bits to avoid potential transient overflow conditions. One extra bit is required when denormalized double-precision numbers are prenormalized. The second bit is required to permit the computation of the adjusted exponent value in the following cases when the corresponding exception enable bit is 1: - Underflow during multiplication using a denormalized operand. - Overflow during division using a denormalized divisor. The IEEE standard includes 32-bit and 64-bit arithmetic. The standard requires that single-precision arithmetic be provided for single-precision operands. The standard permits double-precision floating-point operations to have either (or both) single-precision or double-precision operands, but states that single-precision floating-point operations should not accept double-precision operands. The Power ISA follows these guidelines; double-precision arithmetic instructions can have operands of either or both precisions, while single-precision arithmetic instructions require all operands to be single-precision. Double-precision arithmetic instructions and *fcfid* produce double-precision values, while single-precision arithmetic instructions produce single-precision values. For arithmetic instructions, conversions from double-precision to single-precision must be done explicitly by software, while conversions from single-precision to double-precision are done implicitly. # 4.5.1 Execution Model for IEEE Operations The following description uses 64-bit arithmetic as an example. 32-bit arithmetic is similar except that the FRACTION is a 23-bit field, and the single-precision Guard, Round, and Sticky bits (described in this section) are logically adjacent to the 23-bit FRACTION field. IEEE-conforming significand arithmetic is considered to be performed with a floating-point accumulator having the following format, where bits 0:55 comprise the significand of the intermediate result. | S | CL | FRACTION | G | R | Χ | |---|-----|----------|----|----|----| | | 0 1 | | 53 | 54 | 55 | Figure 54. IEEE 64-bit execution model The S bit is the sign bit. The C bit is the carry bit, which captures the carry out of the significand. The L bit is the leading unit bit of the significand, which receives the implicit bit from the operand. The FRACTION is a 52-bit field that accepts the fraction of the operand. The Guard (G), Round (R), and Sticky (X) bits are extensions to the low-order bits of the accumulator. The G and R bits are required for postnormalization of the result. The G, R, and X bits are required during rounding to determine if the intermediate result is equally near the two nearest representable values. The X bit serves as an extension to the G and R bits by representing the logical OR of all bits that may appear to the low-order side of the R bit, due either to shifting the accumulator right or to other generation of low-order result bits. The G and R bits participate in the left shifts with zeros being shifted into the R bit. Figure 55 shows the significance of the G, R, and X bits with respect to the intermediate result (IR), the representable number next lower in magnitude (NL), and the representable number next higher in magnitude (NH). | GRX | Interpretation | |-------|-----------------------------| | 000 | IR is exact | | 0 0 1 | | | 010 | IR closer to NL | | 011 | | | 100 | IR midway between NL and NH | | 101 | | | 110 | IR closer to NH | | 111 | | Figure 55. Interpretation of G, R, and X bits Figure 56 shows the positions of the Guard, Round, and Sticky bits for double-precision and single-precision floating-point numbers relative to the accumulator illustrated in Figure 54. | Format | | Round | Sticky | |--------|-------|-------|----------------------| | Double | G bit | R bit | X bit | | Single | 24 | 25 | OR of 26:52, G, R, X | Figure 56. Location of the Guard, Round, and Sticky bits in the IEEE execution model The significand of the intermediate result is prepared for rounding by shifting its contents right, if required, until the least significant bit to be retained is in the low-order bit position of the fraction. Four user-selectable rounding modes are provided through FPSCR<sub>RN</sub> as described in Section 4.3.6, "Rounding" on page 107. Using Z1 and Z2 as defined on page 107, the rules for rounding in each mode are as follows. #### ■ Round to Nearest #### Guard bit = 0 The result is truncated. (Result exact (GRX=000) or closest to next lower value in magnitude (GRX=001, 010, or 011)) #### Guard bit = 1 Depends on Round and Sticky bits: #### Case a If the Round or Sticky bit is 1 (inclusive), the result is incremented. (Result closest to next higher value in magnitude (GRX=101, 110, or 111)) #### Case b If the Round and Sticky bits are 0 (result midway between closest representable values), then if the low-order bit of the result is 1 the result is incremented. Otherwise (the low-order bit of the result is 0) the result is truncated (this is the case of a tie rounded to even). #### ■ Round toward Zero Choose the smaller in magnitude of Z1 or Z2. If the Guard, Round, or Sticky bit is nonzero, the result is inexact. - Round toward + Infinity Choose Z1. - Round toward Infinity Choose Z2. If rounding results in a carry into C, the significand is shifted right one position and the exponent is incremented by one. This yields an inexact result, and possibly also exponent overflow. If any of the Guard, Round, or Sticky bits is nonzero, then the result is also inexact. Fraction bits are stored to the target FPR. For Floating Round to Integer, Floating Round to Single-Precision, and single-precision arithmetic instructions, low-order zeros must be appended as appropriate to fill out the double-precision fraction. # 4.5.2 Execution Model for Multiply-Add Type Instructions The Power ISA provides a special form of instruction that performs up to three operations in one instruction (a multiplication, an addition, and a negation). With this added capability comes the special ability to produce a more exact intermediate result as input to the rounder. 32-bit arithmetic is similar except that the FRACTION field is smaller. Multiply-add significand arithmetic is considered to be performed with a floating-point accumulator having the following format, where bits 0:106 comprise the significand of the intermediate result. | S | С | L | FRACTION | X' | |---|---|---|----------|-----| | 0 | 1 | 2 | 3 | 106 | Figure 57. Multiply-add 64-bit execution model The first part of the operation is a multiplication. The multiplication has two 53-bit significands as inputs, which are assumed to be prenormalized, and produces a result conforming to the above model. If there is a carry out of the significand (into the C bit), then the significand is shifted right one position, shifting the L bit (leading unit bit) into the most significant bit of the FRACTION and shifting the C bit (carry out) into the L bit. All 106 bits (L bit, the FRACTION) of the product take part in the add operation. If the exponents of the two inputs to the adder are not equal, the significand of the operand with the smaller exponent is aligned (shifted) to the right by an amount that is added to that exponent to make it equal to the other input's exponent. Zeros are shifted into the left of the significand as it is aligned and bits shifted out of bit 105 of the significand are ORed into the X' bit. The add operation also produces a result conforming to the above model with the X' bit taking part in the add operation. The result of the addition is then normalized, with all bits of the addition result, except the X' bit, participating in the shift. The normalized result serves as the intermediate result that is input to the rounder. For rounding, the conceptual Guard, Round, and Sticky bits are defined in terms of accumulator bits. Figure 58 shows the positions of the Guard, Round, and Sticky bits for double-precision and single-precision floating-point numbers in the multiply-add execution model. | Format | Guard | Round | Sticky | |--------|-------|-------|------------------| | Double | 53 | 54 | OR of 55:105, X' | | Single | 24 | 25 | OR of 26:105, X' | Figure 58. Location of the Guard, Round, and Sticky bits in the multiply-add execution model The rules for rounding the intermediate result are the same as those given in Section 4.5.1. If the instruction is *Floating Negative Multiply-Add* or *Floating Negative Multiply-Subtract*, the final result is negated. # 4.6 Floating-Point Processor Instructions For each instruction in this section that defines the use of an Rc bit, the behavior defined for the instruction corresponding to Rc=1 is considered part of the Floating-Point.Record category. #### 4.6.1 Floating-Point Storage Access Instructions The Storage Access instructions compute the effective address (EA) of the storage to be accessed as described in Section 1.10.3, "Effective Address Calculation" on page 26. #### - Programming Note The *la* extended mnemonic permits computing an effective address as a *Load* or *Store* instruction would, but loads the address itself into a GPR rather than loading the value that is in storage at that address. This extended mnemonic is described in Section D.9, "Miscellaneous Mnemonics" on page 393. #### 4.6.1.1 Storage Access Exceptions Storage accesses will cause the system data storage error handler to be invoked if the program is not allowed to modify the target storage (*Store* only), or if the program attempts to access storage that is unavailable. #### 4.6.2 Floating-Point Load Instructions There are three basic forms of load instruction: single-precision, double-precision, and integer. The integer form is provided by the *Load Floating-Point as Integer Word Algebraic* instruction, described on page 120. Because the FPRs support only floating-point double format, single-precision *Load Floating-Point* instructions convert single-precision data to double format prior to loading the operand into the target FPR. The conversion and loading steps are as follows. Let $WORD_{0:31}$ be the floating-point single-precision operand accessed from storage. #### Normalized Operand ``` if WORD_{1:8} > 0 and WORD_{1:8} < 255 then FRT_{0:1} \leftarrow WORD_{0:1} FRT_2 \leftarrow \neg WORD_1 FRT_3 \leftarrow \neg WORD_1 FRT_4 \leftarrow \neg WORD_1 FRT_{5:63} \leftarrow WORD_{2:31} \mid\mid^{29}0 ``` #### **Denormalized Operand** ``` if WORD_{1:8} = 0 and WORD_{9:31} \neq 0 then sign \leftarrow WORD_0 exp \leftarrow -126 frac_{0:52} \leftarrow 0b0 \parallel WORD_{9:31} \parallel^{29}0 normalize the operand do while frac_0 = 0 frac_{0:52} \leftarrow frac_{1:52} \parallel 0b0 ``` ``` \exp \leftarrow \exp - 1 FRT_0 \leftarrow sign FRT_{1:11} \leftarrow \exp + 1023 FRT_{12:63} \leftarrow frac_{1:52} Zero / Infinity / NaN if WORD_{1:8} = 255 or WORD_{1:31} = 0 then FRT_{0:1} \leftarrow WORD_{0:1} FRT_2 \leftarrow WORD_1 FRT_3 \leftarrow WORD_1 FRT_4 \leftarrow WORD_1 FRT_4 \leftarrow WORD_1 FRT_5:63 \leftarrow WORD_{2:31} \parallel^{29}0 ``` For double-precision *Load Floating-Point* instructions and for the *Load Floating-Point as Integer Word Algebraic* instruction no conversion is required, as the data from storage are copied directly into the FPR. Many of the *Load Floating-Point* instructions have an "update" form, in which register RA is updated with the effective address. For these forms, if RA≠0, the effective address is placed into register RA and the storage element (word or doubleword) addressed by EA is loaded into FRT. **Note:** Recall that RA and RB denote General Purpose Registers, while FRT denotes a Floating-Point Register. #### Load Floating-Point Single #### D-form #### Ifs FRT,D(RA) | 48 | FRT | RA | D | ٦ | |----|-----|----|------|---| | 0 | 6 | 11 | 16 3 | 1 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) FRT \leftarrow DOUBLE(MEM(EA, 4)) ``` Let the effective address (EA) be the sum (RA|0)+D. The word in storage addressed by EA is interpreted as a floating-point single-precision operand. This word is converted to floating-point double format (see page 117) and placed into register FRT. #### **Special Registers Altered:** None # Load Floating-Point Single with Update D-form Ifsu FRT,D(RA) | 49 | FRT | RA | D | | |----|-----|----|----|----| | 0 | 6 | 11 | 16 | 31 | ``` \begin{array}{lll} \texttt{EA} & \leftarrow & (\texttt{RA}) & + & \texttt{EXTS} \, (\texttt{D}) \\ \texttt{FRT} & \leftarrow & \texttt{DOUBLE} \, (\texttt{MEM} \, (\texttt{EA}, \ 4) \, ) \\ \texttt{RA} & \leftarrow & \texttt{EA} \end{array} ``` Let the effective address (EA) be the sum (RA)+D. The word in storage addressed by EA is interpreted as a floating-point single-precision operand. This word is converted to floating-point double format (see page 117) and placed into register FRT. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** None ### Load Floating-Point Single Indexed X-form Ifsx FRT,RA,RB | 31 | FRT | RA | RB | 535 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) FRT \leftarrow DOUBLE (MEM(EA, 4)) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The word in storage addressed by EA is interpreted as a floating-point single-precision operand. This word is converted to floating-point double format (see page 117) and placed into register FRT. #### **Special Registers Altered:** None ### Load Floating-Point Single with Update Indexed X-form Ifsux FRT,RA,RB | | 31 | FRT | RA | RB | 567 | / | |---|----|-----|----|----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{lll} \texttt{EA} & \leftarrow & (\texttt{RA}) & + & (\texttt{RB}) \\ \texttt{FRT} & \leftarrow & \texttt{DOUBLE} \, (\texttt{MEM} \, (\texttt{EA}, \ 4) \, ) \\ \texttt{RA} & \leftarrow & \texttt{EA} \end{array} ``` Let the effective address (EA) be the sum (RA)+(RB). The word in storage addressed by EA is interpreted as a floating-point single-precision operand. This word is converted to floating-point double format (see page 117) and placed into register FRT. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** None 599 X-form 31 #### Load Floating-Point Double D-form ### lfd FRT,D(RA) | 50 | FRT | RA | D | |----|-----|----|-------------------| | 0 | 6 | 11 | 16 3 <sup>-</sup> | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) FRT $\leftarrow$ MEM(EA, 8) Let the effective address (EA) be the sum (RA|0)+D. The doubleword in storage addressed by EA is loaded into register FRT. #### **Special Registers Altered:** None Let the effective address (EA) be the sum (RA|0)+(RB). RB 21 16 The doubleword in storage addressed by EA is loaded into register FRT. Load Floating-Point Double Indexed RA b ← (RA) FRT,RA,RB **FRT** if RA = 0 then $b \leftarrow 0$ #### **Special Registers Altered:** None $EA \leftarrow b + (RB)$ $FRT \leftarrow MEM(EA, 8)$ lfdx else 31 ### Load Floating-Point Double with Update D-form Ifdu FRT,D(RA) | Γ | 51 | FRT | RA | D | | |---|----|-----|----|----|----| | C | ) | 6 | 11 | 16 | 31 | $$EA \leftarrow (RA) + EXTS(D)$$ $FRT \leftarrow MEM(EA, 8)$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+D. The doubleword in storage addressed by EA is loaded into register FRT. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** None ### Load Floating-Point Double with Update Indexed X-form Ifdux FRT,RA,RB | 31 | FRT | RA | RB | 631 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$EA \leftarrow (RA) + (RB)$$ $FRT \leftarrow MEM(EA, 8)$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+(RB). The doubleword in storage addressed by EA is loaded into register FRT. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** #### Load Floating-Point as Integer Word Algebraic Indexed X-form Ifiwax FRT,RA,RB | 31 | FRT | RA | RB | 855 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) FRT \leftarrow EXTS(MEM(EA, 4)) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The word in storage addressed by EA is loaded into ${\rm FRT}_{32:63}.~{\rm FRT}_{0:31}$ are filled with a copy of bit 0 of the loaded word. #### **Special Registers Altered:** #### 4.6.3 Floating-Point Store Instructions There are three basic forms of store instruction: single-precision, double-precision, and integer. The integer form is provided by the *Store Floating-Point as Integer Word* instruction, described on page 124. Because the FPRs support only floating-point double format for floating-point data, single-precision *Store Floating-Point* instructions convert double-precision data to single format prior to storing the operand into storage. The conversion steps are as follows. Let WORD<sub>0:31</sub> be the word in storage written to. ### No Denormalization Required (includes Zero / Infinity / NaN) ``` if FRS_{1:11} > 896 or FRS_{1:63} = 0 then WORD_{0:1} \leftarrow FRS_{0:1} WORD_{2:31} \leftarrow FRS_{5:34} ``` #### **Denormalization Required** ``` if 874 \le FRS_{1:11} \le 896 then sign \leftarrow FRS_0 exp \leftarrow FRS_{1:11} - 1023 frac_{0:52} \leftarrow 0b1 \parallel FRS_{12:63} denormalize operand do while exp < -126 frac_{0:52} \leftarrow 0b0 \parallel frac_{0:51} exp \leftarrow exp + 1 WORD_0 \leftarrow sign WORD_{1:8} \leftarrow 0x00 WORD_{9:31} \leftarrow frac_{1:23} else WORD \leftarrow undefined ``` Notice that if the value to be stored by a single-precision *Store Floating-Point* instruction is larger in magnitude than the maximum number representable in single format, the first case above (No Denormalization Required) applies. The result stored in WORD is then a well-defined value, but is not numerically equal to the value in the source register (i.e., the result of a sin- gle-precision *Load Floating-Point* from WORD will not compare equal to the contents of the original source register). For double-precision Store Floating-Point instructions and for the Store Floating-Point as Integer Word instruction no conversion is required, as the data from the FPR are copied directly into storage. Many of the *Store Floating-Point* instructions have an "update" form, in which register RA is updated with the effective address. For these forms, if RA≠0, the effective address is placed into register RA. **Note:** Recall that RA and RB denote General Purpose Registers, while FRS denotes a Floating-Point Register. #### Store Floating-Point Single #### D-form #### stfs FRS,D(RA) | 52 | FRS | RA | D | | |----|-----|----|------|---| | 0 | 6 | 11 | 16 3 | 1 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) MEM(EA, 4) \leftarrow SINGLE((FRS)) ``` Let the effective address (EA) be the sum (RA|0)+D. The contents of register FRS are converted to single format (see page 121) and stored into the word in storage addressed by EA. #### **Special Registers Altered:** None ### Store Floating-Point Single with Update D-form stfsu FRS,D(RA) | 53 | FRS | RA | D | | |----|-----|----|----|----| | 0 | 6 | 11 | 16 | 31 | ``` \text{EA} \leftarrow (\text{RA}) + \text{EXTS}(\text{D}) \text{MEM}(\text{EA}, 4) \leftarrow \text{SINGLE}((\text{FRS})) \text{RA} \leftarrow \text{EA} ``` Let the effective address (EA) be the sum (RA)+D. The contents of register FRS are converted to single format (see page 121) and stored into the word in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** None ### Store Floating-Point Single Indexed X-form stfsx FRS,RA,RB | Ī | 31 | FRS | RA | RB | 663 | / | |---|----|-----|----|----|-----|----| | C | ) | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA, 4) \leftarrow SINGLE((FRS)) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The contents of register FRS are converted to single format (see page 121) and stored into the word in storage addressed by EA. #### **Special Registers Altered:** None ### Store Floating-Point Single with Update Indexed X-form stfsux FRS,RA,RB | 31 | FRS | RA | RB | 695 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` EA ← (RA) + (RB) MEM(EA, 4) ← SINGLE((FRS)) RA ← EA ``` Let the effective address (EA) be the sum (RA)+(RB). The contents of register FRS are converted to single format (see page 121) and stored into the word in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** #### Store Floating-Point Double **D-form** #### stfd FRS,D(RA) | ſ | 54 | FRS | RA | D | | |---|----|-----|----|----|----| | ( | 0 | 6 | 11 | 16 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) MEM(EA, 8) $\leftarrow$ (FRS) Let the effective address (EA) be the sum (RA|0)+D. The contents of register FRS are stored into the doubleword in storage addressed by EA. #### **Special Registers Altered:** None ### Store Floating-Point Double with Update D-form stfdu FRS,D(RA) | 55 | FRS | RA | D | | |----|-----|----|----|----| | 0 | 6 | 11 | 16 | 31 | $$EA \leftarrow (RA) + EXTS(D)$$ $MEM(EA, 8) \leftarrow (FRS)$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+D. The contents of register FRS are stored into the doubleword in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** None ### Store Floating-Point Double Indexed X-form stfdx FRS,RA,RB | 31 | FRS | RA | RB | 727 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if RA = 0 then $$b \leftarrow 0$$ else $b \leftarrow (RA)$ EA $\leftarrow b + (RB)$ MEM(EA, 8) $\leftarrow (FRS)$ Let the effective address (EA) be the sum (RA|0)+(RB). The contents of register FRS are stored into the doubleword in storage addressed by EA. #### **Special Registers Altered:** None ### Store Floating-Point Double with Update Indexed X-form stfdux FRS,RA,RB | 31 | FRS | RA | RB | 759 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$EA \leftarrow (RA) + (RB)$$ $MEM(EA, 8) \leftarrow (FRS)$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA)+(RB). The contents of register FRS are stored into the doubleword in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** ### Store Floating-Point as Integer Word Indexed X-form stfiwx FRS,RA,RB | 31 | FRS | RA | RB | 983 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA, 4) \leftarrow (FRS)<sub>32:63</sub> ``` Let the effective address (EA) be the sum (RA|0)+(RB). (FRS)<sub>32:63</sub> are stored, without conversion, into the word in storage addressed by EA. If the contents of register FRS were produced, either directly or indirectly, by a Load Floating-Point Single instruction, a single-precision Arithmetic instruction, or frsp, then the value stored is undefined. (The contents of register FRS are produced directly by such an instruction if FRS is the target register for the instruction. The contents of register FRS are produced indirectly by such an instruction if FRS is the final target register of a sequence of one or more Floating-Point Move instructions, with the input to the sequence having been produced directly by such an instruction.) #### **Special Registers Altered:** # 4.6.4 Floating-Point Load Store Doubleword Pair Instructions [Category: Floating-PointPhased-Out] For *Ifdp[x]*, the doubleword-pair in storage addressed by EA is loaded into an even-odd pair of FPRs with the even-numbered FPR being loaded with the leftmost doubleword from storage and the odd-numbered FPR being loaded with the rightmost doubleword. For **stfdp[x]**, the content of an even-odd pair of FPRs is stored into the doubleword-pair in storage addressed by EA, with the even-numbered FPR being stored into the leftmost doubleword in storage and the odd-numbered FPR being stored into the rightmost doubleword. #### Programming Note The instructions described in this section should not be used to access an operand in DFP128 format when MSR<sub>LE</sub>=1. #### Load Floating-Point Double Pair DS-form Ifdp FRTp,DS(RA) ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(DS||0b00) FRTp \leftarrow MEM(EA, 16) ``` Let the effective address (EA) be the sum (RA|0) + (DS||0b00). The doubleword-pair in storage addressed by EA is placed into register-pair FRTp. If FRTp is odd, the instruction form is invalid. #### Special Registers Altered: None #### Load Floating-Point Double Pair Indexed X-form Ifdpx FRTp,RA,RB | 04 EDT DA DD | | | |---------------------|--------|---| | 31 FRID RA RB | 791 I/ | ! | | 0 6 11 16 21 | 3- | 1 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) FRTD \leftarrow MEM(EA, 16) ``` Let the effective address (EA) be the sum (RA|0) + (RB). The doubleword-pair in storage addressed by EA is placed into register-pair FRTp. If FRTp is odd, the instruction form is invalid. #### Special Registers Altered: None #### Store Floating-Point Double Pair DS-form stfdp FRSp,DS(RA) | 61 | FRSp | RA | DS | 00 | |----|------|----|----|-------| | 0 | 6 | 11 | 16 | 30 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(DS||0b00) MEM(EA, 16) \leftarrow FRSp ``` Let the effective address (EA) be the sum (RA|0) + (DS||0b00). The contents of register-pair FRSp are stored into the doubleword-pair in storage addressed by EA. If FRSp is odd, the instruction form is invalid. #### **Special Registers Altered:** None #### Store Floating-Point Double Pair Indexed X-form stfdpx FRSp,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA, 16) \leftarrow FRSp ``` Let the effective address (EA) be the sum (RA|0) + (RB). The contents of register-pair FRSp are stored into the doubleword-pair in storage addressed by EA. If FRSp is odd, the instruction form is invalid. #### **Special Registers Altered:** fmr fmr. #### 4.6.5 Floating-Point Move Instructions These instructions copy data from one floating-point register to another, altering the sign bit (bit 0) as described below for *fneg*, *fabs*, *fnabs*, and *fcpsgn*. These instructions treat NaNs just like any other kind of value (e.g., the sign bit of a NaN may be altered by fneg, fabs, fnabs, and fcpsgn). These instructions do not alter the FPSCR. #### Floating Move Register FRT,FRB X-form FRT.FRB (Rc=0) (Rc=1) X-form | Г | 63 | FRT | /// | FRB | 72 | Rc | |---|----|-----|-----|-----|----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | The contents of register FRB are placed into register #### **Special Registers Altered:** CR1 (if Rc=1) #### Floating Negate X-form FRT.FRB fneg (Rc=0)fneg. FRT,FRB (Rc=1) | 63 | FRT | /// | FRB 40 | | Rc | |----|-----|-----|--------|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The contents of register FRB with bit 0 inverted are placed into register FRT. #### **Special Registers Altered:** CR1 (if Rc=1) #### Floating Absolute Value fabs FRT,FRB (Rc=0)fabs. FRT,FRB (Rc=1) | I | 63 | FRT | /// | FRB | 264 | Rc | |---|----|-----|-----|-----|-----|----| | | 0 | 6 | 11 | 16 | 21 | 31 | The contents of register FRB with bit 0 set to zero are placed into register FRT. #### **Special Registers Altered:** CR1 (if Rc=1) #### X-form Floating Copy Sign | fcpsgn | FRT, FRA, FRB | (Rc=0) | |---------|---------------|--------| | fcpsgn. | FRT, FRA, FRB | (Rc=1) | | 63 | FRT | FRA | FRB | 8 | Rc | |----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The contents of register FRB with bit 0 set to the value of bit 0 of register FRA are placed into register FRT. #### **Special Registers Altered:** CR1 (if Rc=1) #### Floating Negative Absolute Value X-form FRT,FRB fnabs (Rc=0)FRT,FRB fnabs. (Rc=1) | 63 | | | FRB | 136 | Rc | |----|---|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The contents of register FRB with bit 0 set to one are placed into register FRT. #### **Special Registers Altered:** CR1 (if Rc=1) #### 4.6.6 Floating-Point Arithmetic Instructions #### 4.6.6.1 Floating-Point Elementary Arithmetic Instructions | Floatin | g Add [Single] | A-form | Floatir | ng Subtract [Single] | A-form | |-----------------|----------------------------|------------------|-----------------|----------------------------|--------------------| | fadd<br>fadd. | FRT,FRA,FRB<br>FRT,FRA,FRB | (Rc=0)<br>(Rc=1) | fsub<br>fsub. | FRT,FRA,FRB<br>FRT,FRA,FRB | (Rc=0)<br>(Rc=1) | | 63<br>0 | FRT FRA FRB /// 6 11 16 21 | 21 Rc<br>26 31 | 63<br>0 | FRT FRA FRB 6 11 16 21 | /// 20 Rc<br>26 31 | | fadds<br>fadds. | FRT,FRA,FRB<br>FRT,FRA,FRB | (Rc=0)<br>(Rc=1) | fsubs<br>fsubs. | FRT,FRA,FRB<br>FRT,FRA,FRB | (Rc=0)<br>(Rc=1) | | 59<br>0 | FRT FRA FRB /// 6 11 16 21 | 21 Rc<br>26 31 | 59<br>0 | FRT FRA FRB 6 11 16 21 | /// 20 Rc 31 | The floating-point operand in register FRA is added to the floating-point operand in register FRB. If the most significant bit of the resultant significand is not 1, the result is normalized. The result is rounded to the target precision under control of the Floating-Point Rounding Control field RN of the FPSCR and placed into register FRT. Floating-point addition is based on exponent comparison and addition of the two significands. The exponents of the two operands are compared, and the significand accompanying the smaller exponent is shifted right, with its exponent increased by one for each bit shifted, until the two exponents are equal. The two significands are then added or subtracted as appropriate, depending on the signs of the operands, to form an intermediate sum. All 53 bits of the significand as well as all three guard bits (G, R, and X) enter into the computation. If a carry occurs, the sum's significand is shifted right one bit position and the exponent is increased by one. $\label{eq:FPSCR} \begin{tabular}{ll} FPSCR_{FPRF} is set to the class and sign of the result, except for Invalid Operation Exceptions when FPSCR_{VE}=1. \end{tabular}$ #### Special Registers Altered: FPRF FR FI FX OX UX XX VXSNAN VXISI CR1 (if Rc=1) The floating-point operand in register FRB is subtracted from the floating-point operand in register FRA. If the most significant bit of the resultant significand is not 1, the result is normalized. The result is rounded to the target precision under control of the Floating-Point Rounding Control field RN of the FPSCR and placed into register FRT. The execution of the Floating Subtract instruction is identical to that of Floating Add, except that the contents of FRB participate in the operation with the sign bit (bit 0) inverted. $\label{eq:FPSCR} \begin{tabular}{ll} FPSCR_{FPRF} is set to the class and sign of the result, except for Invalid Operation Exceptions when FPSCR_{VE}=1. \end{tabular}$ #### **Special Registers Altered:** FPRF FR FI FX OX UX XX VXSNAN VXISI CR1 (if Rc=1) | Floatii | Floating Multiply [Single] | | | | A-fo | rm | FI | oatir | ng Divid | le [Sin | gle] | A-forn | | | |-----------------|----------------------------|----------------------|-----------|-----------|----------|--------------|------------|-------|----------|--------------------|-----------|-----------|----------|--------------| | fmul<br>fmul. | | ,FRA,FR(<br>,FRA,FR( | | | ` | c=0)<br>c=1) | fdi<br>fdi | | | FRA,FRI<br>FRA,FRI | | | | c=0)<br>c=1) | | 63<br>0 | FRT<br>6 | FRA<br>11 | ///<br>16 | FRC<br>21 | 25<br>26 | Rc<br>31 | 0 | 63 | FRT<br>6 | FRA<br>11 | FRB<br>16 | ///<br>21 | 18<br>26 | Rc<br>31 | | fmuls<br>fmuls. | | ,FRA,FR0<br>,FRA,FR0 | | | • | c=0)<br>c=1) | fdi<br>fdi | - | , | FRA,FRI<br>FRA,FRI | | | | c=0)<br>c=1) | | 59 | FRT | FRA | /// | FRC | 25 | Rc | | 59 | FRT | FRA | FRB | /// | 18 | Rc | | 0 | 6 | 11 | 16 | 21 | 26 | 31 | 0 | | 6 | 11 | 16 | 21 | 26 | 31 | The floating-point operand in register FRA is multiplied by the floating-point operand in register FRC. If the most significant bit of the resultant significand is not 1, the result is normalized. The result is rounded to the target precision under control of the Floating-Point Rounding Control field RN of the FPSCR and placed into register FRT. Floating-point multiplication is based on exponent addition and multiplication of the significands. $\label{eq:FPSCR} \begin{tabular}{ll} FPSCR_{FPRF} is set to the class and sign of the result, except for Invalid Operation Exceptions when FPSCR_{VF}=1. \end{tabular}$ #### **Special Registers Altered:** FPRF FR FI FX OX UX XX VXSNAN VXIMZ CR1 (if Rc=1) The floating-point operand in register FRA is divided by the floating-point operand in register FRB. The remainder is not supplied as a result. If the most significant bit of the resultant significand is not 1, the result is normalized. The result is rounded to the target precision under control of the Floating-Point Rounding Control field RN of the FPSCR and placed into register FRT. Floating-point division is based on exponent subtraction and division of the significands. #### **Special Registers Altered:** FPRF FR FI FX OX UX ZX XX VXSNAN VXIDI VXZDZ CR1 (if Rc=1) #### Floating Square Root [Single] A-form | fsqrt<br>fsqrt. | FRT,FRB (Rc= | | | | | | |-------------------|--------------|-----|-----|-----|-----|------------| | 63 | FRT | /// | FRB | /// | 22 | Rc | | 0 | 6 | 11 | 16 | 21 | 26 | 31 | | fsqrts<br>fsqrts. | | | | | (Ro | ≔0)<br>≔1) | | 59 | FRT | /// | FRB | /// | 22 | Rc | | 0 | 6 | 11 | 16 | 21 | 26 | 31 | The square root of the floating-point operand in register FRB is placed into register FRT. If the most significant bit of the resultant significand is not 1, the result is normalized. The result is rounded to the target precision under control of the Floating-Point Rounding Control field RN of the FPSCR and placed into register FRT. Operation with various special values of the operand is summarized below. | Operand | Result | Exception | |----------|--------------------------|-----------| | -∞ | QNaN <sup>1</sup> | VXSQRT | | < 0 | QNaN <sup>1</sup> | VXSQRT | | -0 | -0 | None | | +∞ | +∞ | None | | SNaN | QNaN <sup>1</sup> | VXSNAN | | QNaN | QNaN | None | | No resul | t if FPSCR <sub>VE</sub> | = = 1 | #### **Special Registers Altered:** ### Floating Reciprocal Estimate [Single] A-form | fre FRT,FRB,L (Rc=0) fre. FRT,FRB,L (Rc=1) [Category: Floating-Point.Phased-In (sV2.05)] | | | | | | | |------------------------------------------------------------------------------------------|-------|---------------------------|------------|-----------|----|--------------| | 63 | FRT | /// | L FRB | /// | 24 | Rc | | ° | 6 | 11 | | 21 | 26 | 31 | | fres<br>fres.<br>[Category | FRT,F | FRB,L<br>FRB,L<br>g-Point | .Phased-Iı | n (sV2.05 | | c=0)<br>c=1) | | 59 | FRT | /// | L FRB | /// | 24 | Rc | | ° | 6 | 11 | | 21 | 26 | 31 | An estimate of the reciprocal of the floating-point operand in register FRB is placed into register FRT. The estimate placed into register FRT is correct to a precision of one part in 256 of the reciprocal of (FRB), i.e., $$ABS(\frac{estimate - 1/x}{1/x}) \le \frac{1}{256}$$ where x is the initial value in FRB. Operation with various special values of the operand is summarized below. | Operand | Result | Exception | |-----------|--------------------------|-----------------| | - ∞ | -0 | None | | -0 | -∞1 | ZX | | +0 | <sub>+∞</sub> 1 | ZX | | +∞ | +0 | None | | SNaN | QNaN <sup>2</sup> | VXSNAN | | QNaN | QNaN | None | | No result | if FPSCR <sub>7F</sub> = | <del>-</del> 1. | | | if FPSCR <sub>VE</sub> = | | If L=1 [Category: Phased-Out], an operand may be treated as if it were zero having the same sign as the operand in the following cases. - The operand is a denormalized number. - The operand would be a denormalized number in single format and was produced by a Load Floating-Point Single instruction, a single-precision arithmetic instruction, or frsp, or by a sequence of one or more Floating-Point Move instructions for which the input to the sequence was produced by such an instruction. $\label{eq:FPSCR} \begin{array}{lll} \text{FPSCR}_{\text{FPRF}} \text{ is set to the class and sign of the result,} \\ \text{except} & \text{for Invalid Operation Exceptions when} \\ \text{FPSCR}_{\text{VE}} = 1 & \text{and Zero Divide Exceptions when} \\ \text{FPSCR}_{\text{ZE}} = 1. \end{array}$ The results of executing this instruction may vary between implementations, and between different executions on the same implementation. #### **Special Registers Altered:** FPRF FR (undefined) FI (undefined) FX OX UX ZX XX (undefined) **VXSNAN** CR1 (if Rc=1) #### **Programming Note** fre and fres serve as both basic and extended mnemonics. The Assembler will recognize a fre or fres mnemonic with three operands as the basic form, and a fre or fres mnemonic with two operands as the extended form. In the extended form the L operand is omitted and assumed to be 0. #### Programming Note For the Floating-Point Estimate instructions, some implementations might implement a precision higher than the minimum architected precision. Thus, a program may take advantage of the higher precision instructions to increase performance by decreasing the iterations needed for software emulation of floating-point instructions. However, there is no guarantee given about the precision which may vary (up or down) between implementations. Only programs targeted at a specific implementation (i.e., the program will not be migrated to another implementation) should take advantage of the higher precision of the instructions. All other programs should rely on the minimum architected precision, which will guarantee the program to run properly across different implementations. #### Programming Note In some implementations execution of fre[s]. with L=1 may have a shorter latency than execution with #### Floating Reciprocal Square Root Estimate [Single] A-form frsgrte FRT,FRB,L (Rc=0)frsgrte. FRT.FRB.L (Rc=1)[Category: Floating-Point.Phased-In (sV2.05)] | 63 | FRT | /// | L FRB | /// | 26 | Rc | |----|-----|-----|-------|-----|----|----| | 0 | 6 | 11 | 15 16 | 21 | 26 | 31 | frsartes FRT.FRB.L (Rc=0)frsgrtes. FRT,FRB,L (Rc=1)[Category: Floating-Point.Phased-In (sV2.05)] | 59 | FRT | /// | L | FRB | /// | 26 | Rc | |----|-----|-----|----|-----|-----|----|----| | 0 | 6 | 11 | 15 | 16 | 21 | 26 | 31 | An estimate of the reciprocal of the square root of the floating-point operand in register FRB is placed into register FRT. The estimate placed into register FRT is correct to a precision of one part in 32 of the reciprocal of the square root of (FRB), i.e., $$ABS(\frac{estimate-1/(\sqrt{x})}{1/(\sqrt{x})}) \leq \frac{1}{32}$$ where x is the initial value in FRB. Operation with various special values of the operand is summarized below. | Operand | Result | Exception | |-----------|--------------------------|-----------| | -∞ | QNaN <sup>2</sup> | VXSQRT | | < 0 | QNaN <sup>2</sup> | VXSQRT | | -0 | _∞1 | ZX | | +0 | <sub>+∞</sub> 1 | ZX | | $+\infty$ | +0 | None | | SNaN | QNaN <sup>2</sup> | VXSNAN | | QNaN | QNaN | None | | No result | t if FPSCR <sub>7F</sub> | = 1. | | | | | <sup>2</sup> No result if $FPSCR_{VF}^{--} = 1$ . If L=1 [Category: Phased-Out], an operand may be treated as if it were zero having the same sign as the operand in the following cases. - The operand is a denormalized number. - The operand would be a denormalized number in single format and was produced by a Load Floating-Point Single instruction, a single-precision arithmetic instruction, or frsp, or by a sequence of one or more Floating-Point Move instructions for which the input to the sequence was produced by such an instruction. FPSCR<sub>FPRF</sub> is set to the class and sign of the result, except for Invalid Operation Exceptions when FPSCR<sub>VF</sub>=1 and Zero Divide Exceptions when FPSCR<sub>ZE</sub>=1. The results of executing this instruction may vary between implementations, and between different executions on the same implementation. # Special Registers Altered: FPRF FR (undefined) FI (undefined) FX ZX XX (undefined) VXSNAN VXSQRT CR1 (if Rc=1) Note See the Notes that appear with fre[s]. #### 4.6.6.2 Floating-Point Multiply-Add Instructions These instructions combine a multiply and an add operation without an intermediate rounding operation. The fraction part of the intermediate product is 106 bits wide (L bit, FRACTION), and all 106 bits take part in the add/subtract portion of the instruction. Status bits are set as follows. Overflow, Underflow, and Inexact Exception bits, the FR and FI bits, and the FPRF field are set - based on the final result of the operation, and not on the result of the multiplication. - Invalid Operation Exception bits are set as if the multiplication and the addition were performed using two separate instructions (fmul[s], followed by fadd[s] or fsub[s]). That is, multiplication of infinity by 0 or of anything by an SNaN, and/or addition of an SNaN, cause the corresponding exception bits to be set. #### Floating Multiply-Add [Single] A-form | fmadd | FRT,FRA,FRC,FRB | (Rc=0) | |--------|-----------------|--------| | fmadd. | FRT,FRA,FRC,FRB | (Rc=1) | | | 63 | FRT | FRA | FRB | FRC | 29 | Rc | |---|----|-----|-----|-----|-----|----|----| | ( | ) | 6 | 11 | 16 | 21 | 26 | 31 | | fmadds | FRT,FRA,FRC,FRB | (Rc=0) | |---------|-----------------|--------| | fmadds. | FRT,FRA,FRC,FRB | (Rc=1) | | 59 | FRT | FRA | FRB | FRC | 29 | Rc | |----|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | The operation $$FRT \leftarrow [(FRA) \times (FRC)] + (FRB)$$ is performed. The floating-point operand in register FRA is multiplied by the floating-point operand in register FRC. The floating-point operand in register FRB is added to this intermediate result. If the most significant bit of the resultant significand is not 1, the result is normalized. The result is rounded to the target precision under control of the Floating-Point Rounding Control field RN of the FPSCR and placed into register FRT. $\label{eq:FPSCR} \begin{tabular}{ll} FPSCR_{FPRF} is set to the class and sign of the result, except for Invalid Operation Exceptions when $FPSCR_{VE}=1$. \end{tabular}$ #### Special Registers Altered: FPRF FR FI FX OX UX XX VXSNAN VXISI VXIMZ CR1 (if Rc=1) #### Floating Multiply-Subtract [Single] A-form | fmsub | FRT,FRA,FRC,FRB | (Rc=0) | |--------|-----------------|--------| | fmsub. | FRT,FRA,FRC,FRB | (Rc=1) | | 63 | FRT | FRA | FRB | FRC | 28 | Rc | |----|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | | fmsubs | FRT,FRA,FRC,FRB | (Rc=0) | |---------|-----------------|--------| | fmsubs. | FRT,FRA,FRC,FRB | (Rc=1) | | 59 | FRT | FRA | FRB | FRC | 28 | Rc | |----|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | The operation $$FRT \leftarrow [(FRA) \times (FRC)] - (FRB)$$ is performed. The floating-point operand in register FRA is multiplied by the floating-point operand in register FRC. The floating-point operand in register FRB is subtracted from this intermediate result. If the most significant bit of the resultant significand is not 1, the result is normalized. The result is rounded to the target precision under control of the Floating-Point Rounding Control field RN of the FPSCR and placed into register FRT. $\label{eq:FPSCR} \begin{tabular}{ll} FPSCR_{FPRF} is set to the class and sign of the result, except for Invalid Operation Exceptions when $FPSCR_{VF}=1$. \end{tabular}$ #### **Special Registers Altered:** FPRF FR FI FX OX UX XX VXSNAN VXISI VXIMZ CR1 (if Rc=1) #### Floating Negative Multiply-Add [Single] A-form | fnmadd | FRT,FRA,FRC,FRB | | | | | ≔0) | |------------------------------|-----------------|-----|-----|-----|---------------|-----| | fnmadd. | FRT,FRA,FRC,FRB | | | | | ≔1) | | 63 | FRT | FRA | FRB | FRC | 31 | Rc | | 0 | 6 | 11 | 16 | 21 | <sub>26</sub> | 31 | | fnmadds FRT,FRA,FRC,FRB (Rc= | | | | | | | | 59 | FRT | FRA | FRB | FRC | 31 | Rc | |----|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | (Rc=1) The operation FRT $\leftarrow$ - ( [(FRA)×(FRC)] + (FRB) ) is performed. FRT,FRA,FRC,FRB fnmadds. The floating-point operand in register FRA is multiplied by the floating-point operand in register FRC. The floating-point operand in register FRB is added to this intermediate result. If the most significant bit of the resultant significand is not 1, the result is normalized. The result is rounded to the target precision under control of the Floating-Point Rounding Control field RN of the FPSCR, then negated and placed into register FRT. This instruction produces the same result as would be obtained by using the *Floating Multiply-Add* instruction and then negating the result, with the following exceptions. - QNaNs propagate with no effect on their "sign" bit. - QNaNs that are generated as the result of a disabled Invalid Operation Exception have a "sign" bit of 0. - SNaNs that are converted to QNaNs as the result of a disabled Invalid Operation Exception retain the "sign" bit of the SNaN. FPSCR<sub>FPRF</sub> is set to the class and sign of the result, except for Invalid Operation Exceptions when FPSCR<sub>VE</sub>=1. #### **Special Registers Altered:** FPRF FR FI FX OX UX XX VXSNAN VXISI VXIMZ CR1 (if Rc=1) ### Floating Negative Multiply-Subtract [Single] A-form | fnmsub<br>fnmsub. | FRT,FRA,FRC,FRB<br>FRT,FRA,FRC,FRB | | | | (Rc=0<br>(Rc=1 | | | |---------------------|------------------------------------|-----|-----|-----|----------------|----|--| | 63 | FRT | FRA | FRB | FRC | 30 | Rc | | | 0 | 6 | 11 | 16 | 21 | 26 | 31 | | | fnmsubs<br>fnmsubs. | | (Ro | | | | | | | 59 | FRT | FRA | FRB | FRC | 30 | Rc | | | 0 | 6 | 11 | 16 | 21 | 26 | 31 | | The operation FRT $\leftarrow$ - ( [(FRA)×(FRC)] - (FRB) ) is performed. The floating-point operand in register FRA is multiplied by the floating-point operand in register FRC. The floating-point operand in register FRB is subtracted from this intermediate result. If the most significant bit of the resultant significand is not 1, the result is normalized. The result is rounded to the target precision under control of the Floating-Point Rounding Control field RN of the FPSCR, then negated and placed into register FRT. This instruction produces the same result as would be obtained by using the *Floating Multiply-Subtract* instruction and then negating the result, with the following exceptions. - QNaNs propagate with no effect on their "sign" bit. - QNaNs that are generated as the result of a disabled Invalid Operation Exception have a "sign" bit of 0. - SNaNs that are converted to QNaNs as the result of a disabled Invalid Operation Exception retain the "sign" bit of the SNaN. FPSCR<sub>FPRF</sub> is set to the class and sign of the result, except for Invalid Operation Exceptions when FPSCR<sub>VF</sub>=1. #### **Special Registers Altered:** FPRF FR FI FX OX UX XX VXSNAN VXISI VXIMZ CR1 (if Rc=1) #### 4.6.7 Floating-Point Rounding and Conversion Instructions #### Programming Note Examples of uses of these instructions to perform various conversions can be found in Section E.2, "Floating-Point Conversions [Category: Floating-Point]" on page 400. ### 4.6.7.1 Floating-Point Rounding Instruction ### Floating Round to Single-Precision X-form frsp FRT,FRB (Rc=0) frsp. FRT,FRB (Rc=1) | 63 | FRT | /// | FRB | 12 | Rc | |----|-----|-----|-----|----|----| | О | 6 | 11 | 16 | 21 | 31 | The floating-point operand in register FRB is rounded to single-precision, using the rounding mode specified by FPSCR<sub>RN</sub>, and placed into register FRT. The rounding is described fully in Section A.1, "Floating-Point Round to Single-Precision Model" on page 361. FPSCR<sub>FPRF</sub> is set to the class and sign of the result, except for Invalid Operation Exceptions when FPSCR<sub>VF</sub>=1. #### **Special Registers Altered:** FPRF FR FI FX OX UX XX VXSNAN CR1 (if Rc=1) ### 4.6.7.2 Floating-Point Convert To/From Integer Instructions ### Floating Convert To Integer Doubleword X-form | fctid | FRT,FRB | (Rc=0) | |--------|---------|--------| | fctid. | FRT,FRB | (Rc=1) | | 63 | FRT | /// | FRB | 814 | Rc | |----|-----|-----|-----|-----|----| | О | 6 | 11 | 16 | 21 | 31 | The floating-point operand in register FRB is converted to a 64-bit signed fixed-point integer, using the rounding mode specified by FPSCR<sub>RN</sub>, and placed into register FRT. If the operand in FRB is greater than $2^{63}$ - 1, then FRT is set to $0x7FFF\_FFFF\_FFFF$ . If the operand in FRB is less than $-2^{63}$ , then FRT is set to $0x8000\_0000\_0000\_0000$ . The conversion is described fully in Section A.2, "Floating-Point Convert to Integer Model" on page 365. Except for enabled Invalid Operation Exceptions, FPSCR $_{FPRF}$ is undefined. FPSCR $_{FR}$ is set if the result is incremented when rounded. FPSCR $_{FI}$ is set if the result is inexact. #### **Special Registers Altered:** FPRF (undefined) FR FI FX XX VXSNAN VXCVI CR1 (if Rc=1) #### Programming Note - The Floating Convert From Integer Word function can be performed by loading the desired word into an FPR using **Ifiwax** (see Section 4.6.2), and then converting the contents of that FPR to a floating-point integer using **fcfid**. ### Floating Convert To Integer Doubleword with round toward Zero X-form | fctidz | FRT,FRB | (Rc=0) | |---------|---------|--------| | fctidz. | FRT,FRB | (Rc=1) | | | 63 | FRT | /// | FRB | 815 | Rc | |---|----|-----|-----|-----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | The floating-point operand in register FRB is converted to a 64-bit signed fixed-point integer, using the rounding mode Round toward Zero, and placed into register FRT. If the operand in FRB is greater than $2^{63}$ - 1, then FRT is set to $0x7FFF\_FFFF\_FFFF$ . If the operand in FRB is less than $-2^{63}$ , then FRT is set to $0x8000\_0000\_0000\_0000$ . The conversion is described fully in Section A.2, "Floating-Point Convert to Integer Model" on page 365. Except for enabled Invalid Operation Exceptions, FPSCR<sub>FPRF</sub> is undefined. FPSCR<sub>FR</sub> is set if the result is incremented when rounded. FPSCR<sub>FI</sub> is set if the result is inexact. #### **Special Registers Altered:** FPRF (undefined) FR FI FX XX VXSNAN VXCVI CR1 #### Floating Convert To Integer Word X-form | fctiw | FRT,FRB | (Rc=0) | |--------|---------|--------| | fctiw. | FRT,FRB | (Rc=1) | | 63 | FRT | /// | FRB | 14 | Rc | |----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The floating-point operand in register FRB is converted to a 32-bit signed fixed-point integer, using the rounding mode specified by FPSCR<sub>RN</sub>, and placed into FRT $_{32:63}$ . The contents of FRT $_{0:31}$ are undefined. If the operand in FRB is greater than $2^{31}$ - 1, then bits 32:63 of FRT are set to 0x7FFF\_FFFF. If the operand in FRB is less than -2<sup>31</sup>, then bits 32:63 of FRT are set to 0x8000\_0000. The conversion is described fully in Section A.2, "Floating-Point Convert to Integer Model" on page 365. Except for enabled Invalid Operation Exceptions, FPSCR<sub>FPRF</sub> is undefined. FPSCR<sub>FR</sub> is set if the result is incremented when rounded. FPSCR<sub>FI</sub> is set if the result is inexact. #### **Special Registers Altered:** (if Rc=1) FPRF (undefined) FR FI FX XX VXSNAN VXCVI CR1 (if Rc=1) ### Floating Convert To Integer Word with round toward Zero X-form | 63 | FRT | /// | FRB | 15 | Rc | |----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The floating-point operand in register FRB is converted to a 32-bit signed fixed-point integer, using the rounding mode Round toward Zero, and placed into $FRT_{32:63}$ . The contents of $FRT_{0:31}$ are undefined. If the operand in FRB is greater than $2^{31}$ - 1, then bits 32:63 of FRT are set to 0x7FFF\_FFFF. If the operand in FRB is less than -2<sup>31</sup>, then bits 32:63 of FRT are set to 0x8000\_0000. The conversion is described fully in Section A.2, "Floating-Point Convert to Integer Model". Except for enabled Invalid Operation Exceptions, FPSCR<sub>FPRF</sub> is undefined. FPSCR<sub>FR</sub> is set if the result is incremented when rounded. FPSCR<sub>FI</sub> is set if the result is inexact. #### **Special Registers Altered:** FPRF (undefined) FR FI FX XX VXSNAN VXCVI CR1 (if Rc=1) ### Floating Convert From Integer Doubleword X-form | fcfid | FRT,FRB | (Rc=0) | |--------|---------|--------| | fcfid. | FRT,FRB | (Rc=1) | | | 63 | FRT | /// | FRR | 846 | Rc | |---|----|-----|-----|------|------|----| | | 00 | | ,,, | 1110 | 0.10 | | | 0 | | 6 | 11 | 16 | 21 | 31 | The 64-bit signed fixed-point operand in register FRB is converted to an infinitely precise floating-point integer. The result of the conversion is rounded to double-precision, using the rounding mode specified by FPSCR<sub>RN</sub>, and placed into register FRT. The conversion is described fully in Section A.3, "Floating-Point Convert from Integer Model". FPSCR<sub>FPRF</sub> is set to the class and sign of the result. FPSCR<sub>FR</sub> is set if the result is incremented when rounded. FPSCR<sub>FI</sub> is set if the result is inexact. #### **Special Registers Altered:** FPRF FR FI FX XX CR1 (if Rc=1) # 4.6.7.3 Floating Round to Integer Instructions [Category: Floating-Point.Phased-In (sV2.05)] The Floating Round to Integer instructions provide direct support for rounding functions found in high level languages. For example, *frin*, *friz*, *frip*, and *frim* implement C++ round(), trunc(), ceil(), and floor(), respectively. Note that *frin* does not implement the IEEE Round to Nearest function, which is often further described as "ties to even." The rounding performed by these instructions is described fully in Section A.4, "Floating-Point Round to Integer Model" on page 369. #### Programming Note – These instructions set FPSCR<sub>FR FI</sub> to 0b00 regardless of whether the result is inexact or rounded because there is a desire to preserve the value of FPSCR<sub>XX</sub>. Furthermore, it is believed that most programs do not need to know whether these rounding operations produce inexact or rounded results. If it is necessary to determine whether the result is inexact or rounded, software must compare the result with the original source operand. #### Floating Round to Integer Nearest X-form | frin | FRT,FRB | (Rc=0) | |-------|---------|--------| | frin. | FRT,FRB | (Rc=1) | | 63 | FRT | /// | FRB | 392 | Rc | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The floating-point operand in register FRB is rounded to an integral value as follows, with the result placed into register FRT. If the sign of the operand is positive, (FRB) + 0.5 is truncated to an integral value, otherwise (FRB) - 0.5 is truncated to an integral value. $\label{eq:FPSCR} \begin{tabular}{ll} FPSCR_{FPRF} is set to the class and sign of the result, except for Invalid Operation Exceptions when FPSCR_{VE} = 1. \end{tabular}$ #### **Special Registers Altered:** FPRF FR (set to 0) FI (set to 0) FX VXSNAN CR1 (if Rc = 1) ### Floating Round to Integer Toward Zero X-form | friz | FRT,FRB | (Rc=0) | |-------|---------|--------| | friz. | FRT,FRB | (Rc=1) | | | 63 | FRT | /// | FRB | 424 | Rc | |---|----|-----|-----|-----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | The floating-point operand in register FRB is rounded to an integral value using the rounding mode round toward zero, and the result is placed into register FRT. #### **Special Registers Altered:** FPRF FR (set to 0) FI (set to 0) FX $VXSNAN \\ CR1 \qquad \qquad (if Rc = 1)$ #### Floating Round to Integer Plus X-form | frip | FRT,FRB | (Rc=0) | |-------|---------|--------| | frip. | FRT,FRB | (Rc=1) | | | 63 | FRT | /// | FRB | 456 | Rc | |---|----|-----|-----|-----|-----|----| | C | ) | 6 | 11 | 16 | 21 | 31 | The floating-point operand in register FRB is rounded to an integral value using the rounding mode round toward +infinity, and the result is placed into register FRT. FPSCR<sub>FPRF</sub> is set to the class and sign of the result, except for Invalid Operation Exceptions when $FPSCR_{VE} = 1$ . #### **Special Registers Altered:** FPRF FR (set to 0) FI (set to 0) FX VXSNAN CR1 (if Rc = 1) #### Floating Round to Integer Minus X-form | frim | FRT,FRB | (Rc=0) | |-------|---------|--------| | frim. | FRT,FRB | (Rc=1) | | 63 | FRT | /// | FRB | 488 | Rc | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The floating-point operand in register FRB is rounded to an integral value using the rounding mode round toward -infinity, and the result is placed into register FRT. FPSCR<sub>FPRF</sub> is set to the class and sign of the result, except for Invalid Operation Exceptions when $FPSCR_{VE} = 1$ . #### Special Registers Altered: FPRF FR (set to 0) FI (set to 0) FX VXSNAN CR1 (if Rc = 1) #### 4.6.8 Floating-Point Compare Instructions The floating-point Compare instructions compare the contents of two floating-point registers. Comparison ignores the sign of zero (i.e., regards +0 as equal to -0). The comparison can be ordered or unordered. The comparison sets one bit in the designated CR field to 1 and the other three to 0. The FPCC is set in the same way. The CR field and the FPCC are set as follows. | Bit | Name | Description | |-----|------|---------------------------| | 0 | FL | (FRA) < (FRB) | | 1 | FG | (FRA) > (FRB) | | 2 | FE | (FRA) = (FRB) | | 3 | FU | (FRA) ? (FRB) (unordered) | #### Floating Compare Unordered #### X-form fcmpo #### Floating Compare Ordered BF,FRA,FRB X-form fcmpu BF,FRA,FRB | 63 | BF | // | FRA | FRB | 0 | / | |----|----|----|-----|-----|----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | ``` if (FRA) is a NaN or (FRB) is a NaN then c \leftarrow 0b0001 else if (FRA) < (FRB) then c \leftarrow 0b1000 else if (FRA) > (FRB) then c \leftarrow 0b0100 else c ← 0b0010 FPCC ← c CR_{4 \times BF: 4 \times BF+3} \leftarrow c if (FRA) is an SNaN or (FRB) is an SNaN then VXSNAN ← 1 ``` The floating-point operand in register FRA is compared to the floating-point operand in register FRB. The result of the compare is placed into CR field BF and the FPCC. If either of the operands is a NaN, either quiet or signaling, then CR field BF and the FPCC are set to reflect unordered. If either of the operands is a Signaling NaN, then VXSNAN is set. #### **Special Registers Altered:** CR field BF **FPCC** FX **VXSNAN** | 63 | BF | // | FRA | FRB | 32 | / | |----|----|----|-----|-----|----|----| | О | 6 | 9 | 11 | 16 | 21 | 31 | ``` if (FRA) is a NaN or (FRB) is a NaN then c ← 0b0001 else if (FRA) < (FRB) then c \leftarrow 0b1000 else if (FRA) > (FRB) then c \leftarrow 0b0100 else c ← 0b0010 FPCC ← c CR_{4 \times BF: 4 \times BF+3} \leftarrow c if (FRA) is an SNaN or (FRB) is an SNaN then VXSNAN ← 1 if VE = 0 then VXVC \leftarrow 1 else if (FRA) is a QNaN or (FRB) is a QNaN then VXVC \leftarrow 1 ``` The floating-point operand in register FRA is compared to the floating-point operand in register FRB. The result of the compare is placed into CR field BF and the FPCC. If either of the operands is a NaN, either guiet or signaling, then CR field BF and the FPCC are set to reflect unordered. If either of the operands is a Signaling NaN, then VXSNAN is set and, if Invalid Operation is disabled (VE=0), VXVC is set. If neither operand is a Signaling NaN but at least one operand is a Quiet NaN, then VXVC is set. #### **Special Registers Altered:** CR field BF **FPCC** FX VXSNAN VXVC #### 4.6.9 Floating-Point Select Instruction #### Floating Select #### A-form | fsel | FRT,FRA,FRC,FRB | (Rc=0) | |-------|-----------------|--------| | fsel. | FRT,FRA,FRC,FRB | (Rc=1) | | 63 | FRT | FRA | FRB | FRC | 23 | Rc | |----|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | if (FRA) $$\geq$$ 0.0 then FRT $\leftarrow$ (FRC) else FRT $\leftarrow$ (FRB) The floating-point operand in register FRA is compared to the value zero. If the operand is greater than or equal to zero, register FRT is set to the contents of register FRC. If the operand is less than zero or is a NaN, register FRT is set to the contents of register FRB. The comparison ignores the sign of zero (i.e., regards +0 as equal to -0). #### **Special Registers Altered:** CR1 (if Rc=1) #### Programming Note Examples of uses of this instruction can be found in Sections E.2, "Floating-Point Conversions [Category: Floating-Point]" on page 400 and E.3, "Floating-Point Selection [Category: Floating-Point]" on page 402. **Warning:** Care must be taken in using *fsel* if IEEE compatibility is required, or if the values being tested can be NaNs or infinities; see Section E.3.4, "Notes" on page 402. #### 4.6.10 Floating-Point Status and Control Register Instructions Every Floating-Point Status and Control Register instruction synchronizes the effects of all floating-point instructions executed by a given processor. Executing a Floating-Point Status and Control Register instruction ensures that all floating-point instructions previously initiated by the given processor have completed before the Floating-Point Status and Control Register instruction is initiated, and that no subsequent floating-point instructions are initiated by the given processor until the Floating-Point Status and Control Register instruction has completed. In particular: - All exceptions that will be caused by the previously initiated instructions are recorded in the FPSCR before the Floating-Point Status and Control Register instruction is initiated. - All invocations of the system floating-point enabled exception error handler that will be caused by the previously initiated instructions have occurred before the Floating-Point Status and Control Register instruction is initiated. - No subsequent floating-point instruction that depends on or alters the settings of any FPSCR bits is initiated until the Floating-Point Status and Control Register instruction has completed. (Floating-point Storage Access instructions are not affected.) The instruction descriptions in this section refer to "FPSCR fields," where FPSCR field k is FPSCR bits 4xk:4xk+3. | Move From FPSCR | X-form | |-----------------|--------| | | | | 63 | FRT | /// | /// | 583 | Rc | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The contents of the FPSCR are placed into register FRT. #### **Special Registers Altered:** CR1 (if Rc=1) #### Move to Condition Register from FPSCR X-form mcrfs BF,BFA | 63 | BF | // | BFA | // | /// | 64 | / | |----|----|----|-----|----|-----|----|----| | 0 | 6 | 9 | 11 | 14 | 16 | 21 | 31 | The contents of FPSCR<sub>32:63</sub> field BFA are copied to Condition Register field BF. All exception bits copied are set to 0 in the FPSCR. If the FX bit is copied, it is set to 0 in the FPSCR. #### **Special Registers Altered:** CR field BF | OR IICIG DI | | |-------------------------|------------| | FX OX | (if BFA=0) | | UX ZX XX VXSNAN | (if BFA=1) | | VXISI VXIDI VXZDZ VXIMZ | (if BFA=2) | | VXVC | (if BFA=3) | | VXSOFT VXSQRT VXCVI | (if BFA=5) | #### Move To FPSCR Field Immediate X-form | mtfsfi | BF,U,W | (Rc=0) | |---------|--------|--------| | mtfsfi. | BF,U,W | (Rc=1) | | 63 | BF | // | /// | W | U | / | 134 | Rc | |----|----|----|-----|----|----|----|-----|----| | 0 | 6 | 9 | 11 | 15 | 16 | 20 | 21 | 31 | The value of the U field is placed into FPSCR field $BF+8\times(1-W)$ . $FPSCR_{FX}$ is altered only if BF = 0 and W = 0. #### **Special Registers Altered:** FPSCR field BF + $8\times(1-W)$ CR1 (if Rc=1) #### Programming Note - **mtfsfi** serves as both a basic and an extended mnemonic. The Assembler will recognize a **mtfsfi** mnemonic with three operands as the basic form, and a **mtfsfi** mnemonic with two operands as the extended form. In the extended form the W operand is omitted and assumed to be 0. #### Programming Note When FPSCR $_{32:35}$ is specified, bits 32 (FX) and 35 (OX) are set to the values of U $_0$ and U $_3$ (i.e., even if this instruction causes OX to change from 0 to 1, FX is set from U $_0$ and not by the usual rule that FX is set to 1 when an exception bit changes from 0 to 1). Bits 33 and 34 (FEX and VX) are set according to the usual rule, given on page 101, and not from U $_{1:2}$ . #### Move To FPSCR Fields XFL-form | mtfsf | FLM,FRB,L,W | (Rc=0) | |--------|-------------|--------| | mtfsf. | FLM,FRB,L,W | (Rc=1) | | 63 | L | FLM | W | FRB | 711 | Rc | |----|---|-----|----|-----|-----|----| | 0 | 6 | 7 | 15 | 16 | 21 | 31 | The FPSCR is modified as specified by the FLM, L, and W fields. L = 0 The contents of register FRB are placed into the FPSCR under control of the W field and the field mask specified by FLM. W and the field mask identify the 4-bit fields affected. Let i be an integer in the range 0-7. If $FLM_{i=1}$ then FPSCR field k is set to the contents of the corresponding field of register FRB, where $k = i+8\times(1-W)$ . L = 1 The contents of register FRB are placed into the FPSCR. FPSCR<sub>FX</sub> is not altered implicitly by this instruction. #### **Special Registers Altered:** FPSCR fields selected by mask, L, and W CR1 (if Rc=1) #### Programming Note - mtfsf serves as both a basic and an extended mnemonic. The Assembler will recognize a mtfsf mnemonic with four operands as the basic form, and a mtfsf mnemonic with two operands as the extended form. In the extended form the W and L operands are omitted and both are assumed to be 0. #### Programming Note - Updating fewer than eight fields of the FPSCR may have substantially poorer performance on some implementations than updating eight fields or all of the fields. #### Programming Note If L=1 or if L=0 and FPSCR $_{32:35}$ is specified, bits 32 (FX) and 35 (OX) are set to the values of (FRB) $_{32}$ and (FRB) $_{35}$ (i.e., even if this instruction causes OX to change from 0 to 1, FX is set from (FRB) $_{32}$ and not by the usual rule that FX is set to 1 when an exception bit changes from 0 to 1). Bits 33 and 34 (FEX and VX) are set according to the usual rule, given on page 101, and not from (FRB) $_{33:34}$ . # Move To FPSCR Bit 0 mtfsb0 BT mtfsb0. BT | 63 | BT | /// | /// | 70 | Rc | |----|----|-----|-----|----|----| | o | 6 | 11 | 16 | 21 | 31 | X-form (Rc=0) (Rc=1) Bit BT+32 of the FPSCR is set to 0. #### **Special Registers Altered:** FPSCR bit BT+32 CR1 (if Rc=1) #### Programming Note — Bits 33 and 34 (FEX and VX) cannot be explicitly reset. #### Move To FPSCR Bit 1 X-form $\begin{array}{ll} \text{mtfsb1} & \text{BT} & \text{(Rc=0)} \\ \text{mtfsb1}. & \text{BT} & \text{(Rc=1)} \end{array}$ | Ī | 63 | BT | /// | /// | 38 | Rc | |---|----|----|-----|-----|----|----| | | 0 | 6 | 11 | 16 | 21 | 31 | Bit BT+32 of the FPSCR is set to 1. #### **Special Registers Altered:** FPSCR bits BT+32 and FX R1 (if Rc=1) #### Programming Note — Bits 32 and 34 (FEX and VX) cannot be explicitly set. # Chapter 5. Decimal Floating-Point [Category: Decimal Floating-Point] | 5.1 Decimal Floating-Point (DFP) Proces- | 5.5.8.1 Data-Format Conversion153 | |-------------------------------------------|--------------------------------------------| | sor Overview | 5.5.8.2 Data-Type Conversion 154 | | 5.2 DFP Register Handling 144 | 5.5.9 Format Operations 154 | | 5.2.1 DFP Usage of Floating-Point Regis- | 5.5.10 DFP Exceptions 154 | | ters | 5.5.10.1 Invalid Operation Exception . 156 | | 5.3 DFP Support for Non-DFP Data Types | 5.5.10.2 Zero Divide Exception 157 | | 146 | 5.5.10.3 Overflow Exception 157 | | 5.4 DFP Number Representation 147 | 5.5.10.4 Underflow Exception 158 | | 5.4.1 DFP Data Format | 5.5.10.5 Inexact Exception 159 | | 5.4.1.1 Fields Within the Data Format 148 | 5.5.11 Summary of Normal Rounding And | | 5.4.1.2 Summary of DFP Data Formats | Range Actions | | 149 | 5.6 DFP Instruction Descriptions 162 | | 5.4.1.3 Preferred DPD Encoding 149 | 5.6.1 DFP Arithmetic Instructions 163 | | 5.4.2 Classes of DFP Data 149 | 5.6.2 DFP Compare Instructions 167 | | 5.5 DFP Execution Model 150 | 5.6.3 DFP Test Instructions 170 | | 5.5.1 Rounding 150 | 5.6.4 DFP Quantum Adjustment Instruc- | | 5.5.2 Rounding Mode Specification 151 | tions | | 5.5.3 Formation of Final Result 152 | 5.6.5 DFP Conversion Instructions 182 | | 5.5.3.1 Use of Ideal Exponent 152 | 5.6.5.1 DFP Data-Format Conversion | | 5.5.4 Arithmetic Operations 152 | Instructions | | 5.5.4.1 Sign of Arithmetic Result 152 | 5.6.5.2 DFP Data-Type Conversion | | 5.5.5 Compare Operations 153 | Instructions | | 5.5.6 Test Operations 153 | 5.6.6 DFP Format Instructions 187 | | 5.5.7 Quantum Adjustment Operations 153 | 5.6.7 DFP Instruction Summary 191 | | 5.5.8 Conversion Operations 153 | | # 5.1 Decimal Floating-Point (DFP) Processor Overview This chapter describes the behavior of the decimal floating-point processor, the supported data types, formats, and classes, and the usage of registers. Also included are the execution model, exceptions, and instructions supported by the decimal floating-point processor. The decimal floating-point (DFP) processor shares the 32 floating-point registers (FPRs) and the Floating-Point Status and Control Register (FPSCR) with the binary floating-point (BFP) processor. However, the interpretation of data formats in the FPRs, and the meaning of some control and status bits in the FPSCR are different between the BFP and DFP processors. The DFP processor also shares the Condition Register (CR) with the fixed-point processor, the BFP processor, and the vector processor. The DFP processor supports three DFP data formats: DFP Short (single precision), DFP Long (double precision), and DFP Extended (quad precision). Most operations are performed on DFP Long or DFP Extended format directly. Support for DFP Short is limited to conversion to and from DFP Long. Some DFP instructions operate on other data types, including signed or unsigned binary fixed-point data, and signed or unsigned decimal data. DFP instructions are provided to perform arithmetic, compare, test, quantum-adjustment, conversion, and format operations on operands held in FPRs or FPR pairs. #### Arithmetic instructions These instructions perform addition, subtraction, multiplication, and division operations. #### Compare instructions These instructions perform a comparison operation on the numerical value of two DFP operands. #### Test instructions These instructions test the data class, the data group, the exponent, or the number of significant digits of a DFP operand. #### Quantum-adjustment instructions These instructions convert a DFP number to a result in the form that has the designated exponent, which may be explicitly or implicitly specified. #### Conversion instructions These instructions perform conversion between different data formats or data types. #### ■ Format instructions These instructions facilitate composing or decomposing a DFP operand. These instructions are described in Section 5.6 "DFP Instruction Descriptions" on page 162. The three DFP data formats allow finite numbers to be represented with different precision and ranges. Special codes are also provided to represent +Infinity, -Infinity, Quiet NaN (Not-a-Number), and Signaling NaN. Operations involving infinities produce results obeying traditional mathematical conventions. NaNs have no mathematical interpretation. The encoding of NaNs provides a diagnostic information field. This diagnostic field may be used to indicate such things as the source of an uninitialized variable or the reason an invalid result was produced. The DFP processor recognizes a set of DFP exceptions which are indicated via bits set in the FPSCR. Additionally, the DFP exception actions depend on the setting of the various exception enable bits in the FPSCR. The following DFP exceptions are detected by the DFP processor. The exception status bits in the FPSCR are indicated in parentheses. | <ul> <li>Invalid Operation Exception</li> </ul> | (VX) | |-------------------------------------------------|----------| | SNaN | (VXSNAN) | | ∞ - ∞ | (VXISI) | | $\infty \div \infty$ | (VXIDI) | | 0 ÷ 0 | (VXZDZ) | | $\infty \times 0$ | (VXIMZ) | | Invalid Compare | (VXVC) | | Invalid conversion | (VXCVI) | |-------------------------------------------|---------| | <ul> <li>Zero Divide Exception</li> </ul> | (ZX) | | <ul> <li>Overflow Exception</li> </ul> | (OX) | | <ul><li>Underflow Exception</li></ul> | (UX) | | ■ Inexact Exception | (XX) | Each DFP exception and each category of Invalid Operation Exception has an exception status bit in the FPSCR. In addition, each of the five DFP exceptions has a corresponding enable bit in the FPSCR. These enable bits enable or disable the invocation of the system floating-point enabled exception error handler, and may affect the setting of some exception status bits in the FPSCR. The usage of these bits by the DFP processor differs from the usage by the BFP processor. Section 5.5.10 "DFP Exceptions" on page 154 provides a detailed discussion of DFP exceptions, including the effects of the enable bits. #### 5.2 DFP Register Handling The following sections describe first how the floating-point registers are utilized by the DFP processor. The subsequent section covers the DFP usage of CR and FPSCR. # 5.2.1 DFP Usage of Floating-Point Registers The DFP processor shares the same 32 64-bit FPRs with the BFP processor. Like the BFP instructions, DFP instructions also use 5-bit fields for designating the FPRs to hold the source or target operands. When data in DFP Short format is held in a FPR, it occupies the rightmost 32 bits of the FPR. The *Load Floating-Point as Integer Word Algebraic* instruction is provided to load the rightmost 32 bits of a FPR with a single-word data from storage. The *Store Floating-Point as Integer Word* instruction is available to store the rightmost 32 bits of a FPR to a storage location. Data in DFP Long format, 64-bit binary fixed-point values, or 64-bit BCD values is held in a FPR using all 64 bits. Data of 64 bits may be loaded from storage via any of the *Load Floating-Point Double* instructions and stored via any of the *Store Floating-Point Double* instructions. Data in DFP Extended format or 128-bit BCD values is held in an even-odd FPR pair using all 128 bits. Data of 128 bits must be loaded into the desired even-odd pair of floating-point registers using an appropriate sequence of the *Load Floating-Point Double* instructions and stored using an appropriate sequence of the *Store Floating-Point Double* instructions. Data used as a source operand by any *Decimal Float-ing-Point* instruction that was produced, either directly or indirectly, by a *Load Floating-Point Single* instruction, a *Floating Round to Single-Precision* instruction, or a binary floating-point single-precision arithmetic instruction is boundedly undefined. When an even-odd FPR pair is used to hold a 128-bit operand, the even-numbered FPR is used to hold the leftmost doubleword of the operand and the next higher-numbered FPR is used to hold the rightmost doubleword. A DFP instruction designating an odd-numbered FPR for a 128-bit operand is an invalid instruction form. #### Programming Note - The *Floating-Point Move* instructions can be used to move operands between FPRs. The bit definitions for the FPSCR are as follows. #### Bit(s) Description 0:28 Reserved #### 29:31 **DFP Rounding Control** (DRN) See Section 5.5.2, "Rounding Mode Specification" on page 151. - 000 Round to Nearest, Ties to Even - 001 Round toward Zero - 010 Round toward +Infinity - 011 Round toward -Infinity - 100 Round to Nearest, Ties away from 0 - 101 Round to Nearest, Ties toward 0 - 110 Round to away from Zero - 111 Round to Prepare for Shorter Precision #### Programming Note FPSCR<sub>28</sub> is reserved for extension of the DRN field, therefore DRN may be set using the *mtfsfi* instruction to set the rounding mode. 32 *Floating-Point Exception Summary* (FX) Every floating-point instruction, except *mtfsfi* and *mtfsf*, implicitly sets FPSCR<sub>FX</sub> to 1 if that instruction causes any of the floating-point exception bits in the FPSCR to change from 0 to 1. *mcrfs*, *mtfsfi*, *mtfsf*, *mtfsb0*, and *mtfsb1* can alter FPSCR<sub>FX</sub> explicitly. 33 Floating-Point Enabled Exception Summary (FEX) This bit is the OR of all the floating-point exception bits masked by their respective enable bits. *mcrfs*, *mtfsfi*, *mtfsf*, *mtfsb0*, and *mtfsb1* cannot alter FPSCR<sub>FEX</sub> explicitly. 34 Floating-Point Invalid Operation Exception Summary (VX) This bit is the OR of all the Invalid Operation exception bits. *mcrfs*, *mtfsfi*, *mtfsf*, *mtfsb0*, and *mtfsb1* cannot alter FPSCR<sub>VX</sub> explicitly. - 35 Floating-Point Overflow Exception (OX) See Section 5.5.10.3, "Overflow Exception" on page 157. - 36 Floating-Point Underflow Exception (UX) See Section 5.5.10.4, "Underflow Exception" on page 158. - 37 Floating-Point Zero Divide Exception (ZX) See Section 5.5.10.2, "Zero Divide Exception" on page 157. - 38 Floating-Point Inexact Exception (XX) See Section 5.5.10.5, "Inexact Exception" on page 159. $\mathsf{FPSCR}_\mathsf{XX}$ is a sticky version of $\mathsf{FPSCR}_\mathsf{FI}$ (see below). Thus the following rules completely describe how $\mathsf{FPSCR}_\mathsf{XX}$ is set by a given instruction. - If the instruction affects FPSCR<sub>FI</sub>, the new value of FPSCR<sub>XX</sub> is obtained by ORing the old value of FPSCR<sub>XX</sub> with the new value of FPSCR<sub>FI</sub>. - If the instruction does not affect FPSCR<sub>FI</sub>, the value of FPSCR<sub>XX</sub> is unchanged. - 39 Floating-Point Invalid Operation Exception (SNaN) (VXSNAN) See Section 5.5.10.1, "Invalid Operation Exception" on page 156. - 40 Floating-Point Invalid Operation Exception (∞ ∞) (VXISI) See Section 5.5.10.1. - 41 Floating-Point Invalid Operation Exception (∞ ÷ ∞) (VXIDI) See Section 5.5.10.1. - 442 Floating-Point Invalid Operation Exception (0÷0) (VXZDZ) See Section 5.5.10.1. - 43 Floating-Point Invalid Operation Exception (∞x0) (VXIMZ) See Section 5.5.10.1. - 44 Floating-Point Invalid Operation Exception (Invalid Compare) (VXVC) See Section 5.5.10.1. - 45 Floating-Point Fraction Rounded (FR) The last Arithmetic or Rounding and Conver- sion instruction incremented the fraction during rounding. See Section 5.5.1, "Rounding" on page 150. This bit is not sticky. 46 Floating-Point Fraction Inexact (FI) The last *Arithmetic* or *Rounding and Conversion* instruction either produced an inexact result during rounding or caused a disabled Overflow Exception. See Section 5.5.1. This bit is not sticky. See the definition of $FPSCR_{XX}$ , above, regarding the relationship between $FPSCR_{FI}$ and $FPSCR_{XX}$ . #### 47:51 Floating-Point Result Flags (FPRF) This field is set as described below. For arithmetic, rounding, and conversion instructions, the field is set based on the result placed into the target register, except that if any portion of the result is undefined then the value placed into FPRF is undefined. - 47 Floating-Point Result Class Descriptor (C) Arithmetic, rounding, and conversion instructions may set this bit with the FPCC bits, to indicate the class of the result as shown in Figure 59 on page 146. - 48:51 Floating-Point Condition Code (FPCC) Floating-point Compare and DFP Test instructions set one of the FPCC bits to 1 and the other three FPCC bits to 0. Arithmetic, rounding, and conversion instructions may set the FPCC bits with the C bit, to indicate the class of the result as shown in Figure 59 on page 146. Note that in this case the high-order three bits of the FPCC retain their relational significance indicating that the value is less than, greater than, or equal to zero. - 48 Floating-Point Less Than or Negative (FL or <) - 49 Floating-Point Greater Than or Positive (FG or >) - 50 Floating-Point Equal or Zero (FE or =) - 51 Floating-Point Unordered or NaN (FU or ?) - 52 Reserved - Floating-Point Invalid Operation Exception (Software Request) (VXSOFT) This bit can be altered only by mcrfs, mtfsfi, mtfsf, mtfsb0, or mtfsb1. See Section 5.5.10.1, "Invalid Operation Exception" on page 156. - Neither used nor changed by DFP. #### Programming Note - Although the architecture does not provide a DFP square root instruction, if software simulates such an instruction, it should set bit 54 whenever the source operand of the square root function is invalid. - 55 Floating-Point Invalid Operation Exception (Invalid Conversion) (VXCVI) See Section 5.5.10.1. - 56 Floating-Point Invalid Operation Exception Enable (VE) See Section 5.5.10.1. 57 Floating-Point Overflow Exception Enable (OE) See Section 5.5.10.3, "Overflow Exception" on page 157. 58 Floating-Point Underflow Exception Enable (UE) See Section 5.5.10.4, "Underflow Exception" on page 158. 59 Floating-Point Zero Divide Exception Enable (ZE) See Section 5.5.10.2, "Zero Divide Exception" on page 157. 60 Floating-Point Inexact Exception Enable (XE) See Section 5.5.10.5, "Inexact Exception" on page 159 - Reserved (not used by DFP) - 62:63 **Binary Floating-Point Rounding Control** (RN) See Section 5.5.1, "Rounding" on page 150. - 00 Round to Nearest - 01 Round toward Zero - 10 Round toward +Infinity - 11 Round toward -Infinity | Result<br>Flags | Result Value Class | |-----------------|--------------------------| | C < > = ? | | | 0 0 0 0 1 | Signaling NaN (DFP only) | | 1 0 0 0 1 | Quiet NaN | | 0 1 0 0 1 | - Infinity | | 0 1 0 0 0 | - Normal Number | | 1 1 0 0 0 | - Subnormal Number | | 10010 | - Zero | | 0 0 0 1 0 | + Zero | | 10100 | + Subnormal Number | | 0 0 1 0 0 | + Normal Number | | 0 0 1 0 1 | + Infinity | Figure 59. Floating-Point Result Flags # 5.3 DFP Support for Non-DFP Data Types In addition to the DFP data types, the DFP processor provides limited support for the following non-DFP data types: signed or unsigned binary fixed-point data, and signed or unsigned decimal data. In unsigned binary fixed-point data, all bits are used to express the absolute value of the number. For signed binary fixed-point data, the leftmost bit represents the sign, which is followed by the numeric field. Positive numbers are represented in true binary notation with the sign bit set to zero. When the value is zero, all bits are zeros, including the sign bit. Negative numbers are represented in two's complement binary notation with a one in the sign-bit position. For decimal data, each byte contains a pair of four-bit nibbles; each four-bit nibble contains a binary-coded-decimal (BCD) code. There are two kinds of BCD codes: digit code and sign code. For unsigned decimal data, all nibbles contain a digit code (D) as shown in Figure 60 Figure 60. Format for Unsigned Decimal Data For signed decimal data, the rightmost nibble contains a sign code (S) and all other nibbles contain a digit code as shown in Figure 61. Figure 61. Format for Signed Decimal Data The decimal digits 0-9 have the binary encoding 0000-1001. The preferred plus-sign codes are 1100 and 1111. The preferred minus sign code is 1101. These are the sign codes generated for the results of the *Decode DPD To BCD* instruction. A selection is provided by this instruction to specify which of the two preferred plus sign codes is to be generated. Alternate sign codes are also recognized as valid in the sign position: 1010 and 1110 are alternate sign codes for plus, and 1011 is an alternate sign code for minus. Alternate sign codes are accepted for any source operand, but are not generated as a result by the instruction. When an invalid digit or sign code is detected by the *Encode BCD To DPD* instruction, an invalid-operation exception occurs. A summary of digit and sign codes are provided in Figure 62. | Binary | Recognized As | | | |--------|---------------|----------------------------|--| | Code | Digit | Sign | | | 0000 | 0 | Invalid | | | 0001 | 1 | Invalid | | | 0010 | 2 | Invalid | | | 0011 | 3 | Invalid | | | 0100 | 4 | Invalid | | | 0101 | 5 | Invalid | | | 0110 | 6 | Invalid | | | 0111 | 7 | Invalid | | | 1000 | 8 | Invalid | | | 1001 | 9 | Invalid | | | 1010 | Invalid | Plus | | | 1011 | Invalid | Minus | | | 1100 | Invalid | Plus (preferred; option 1) | | | 1101 | Invalid | Minus (preferred) | | | 1110 | Invalid | Plus | | | 1111 | Invalid | Plus (preferred; option 2) | | Figure 62. Summary of BCD Digit and Sign Codes ### 5.4 DFP Number Representation A DFP finite number consists of three components: a sign bit, a signed exponent, and a significand. The signed exponent is a signed binary integer. The significand consists of a number of decimal digits, which are to the left of the implied decimal point. The rightmost digit of the significand is called the units digit. The numerical value of a DFP finite number is represented as $(-1)^{sign} \times significand \times 10^{exponent}$ and the unit value of this number is $(1 \times 10^{exponent})$ , which is called the quantum. DFP finite numbers are not normalized. This allows leading zeros and trailing zeros to exist in the significand. This unnormalized DFP number representation allows some values to have redundant forms; each form represents the DFP number with a different combination of the significand value and the exponent value. For example, $1000000 \times 10^5$ and $10 \times 10^{10}$ are two different forms of the same numerical value. A *form* of this number representation carries information about both the numerical value and the quantum of a DFP finite number. The *significant digits* of a DFP finite number are the digits in the significand beginning with the leftmost nonzero digit and ending with the units digit. #### 5.4.1 DFP Data Format DFP numbers and NaNs may be represented in FPRs in any of the three data formats: DFP Short, DFP Long, or DFP Extended. The contents of each data format represent encoded information. Special codes are assigned to NaNs and infinities. Different formats support different sizes in both significand and exponent. Arithmetic, compare, test, quantum-adjustment, and format instructions are provided for DFP Long and DFP Extended formats only. The sign is encoded as a one bit binary value. Significand is encoded as an unsigned decimal integer in two distinct parts. The leftmost digit (LMD) of the significand is encoded as part of the combination field; the remaining digits of the significand are encoded in the trailing significand field. The exponent is contained in the combination field in two parts. However, prior to encoding, the exponent is converted to an unsigned binary value called the biased exponent by adding a bias value which is a constant for each format. The two leftmost bits of the biased exponent are encoded with the leftmost digit of the significand in the leftmost bits of the combination field. The rest of the biased exponent occupies the remaining portion of the combination field. #### 5.4.1.1 Fields Within the Data Format The DFP data representation comprises three fields, as diagrammed below for each of the three formats: Figure 63. DFP Short format Figure 64. DFP Long format Figure 65. DFP Extended format The fields are defined as follows: #### Sign bit (S) The sign bit is in bit 0 of each format, and is zero for plus and one for minus. #### Combination field (G) As the name implies, this field provides a combination of the exponent and the left-most digit (LMD) of the significand, for finite numbers, or provides a special code for denoting the value as either a Not-a-Number or an Infinity. The first 5 bits of the combination field contain the encoding of NaN or infinity, or the two leftmost bits of the biased exponent and the leftmost digit (LMD) of the significand. The following tables show the encoding: | G <sub>0:4</sub> | Description | | |------------------|-------------------------------|--| | 11111 | NaN | | | 11110 | Infinity | | | All others | Finite Number (see Figure 67) | | Figure 66. Encoding of the G field for Special Symbols | LMD | Leftmost 2-bits of biased exponent | | | | | |-------|------------------------------------|-------|-------|--|--| | LIVID | 00 | 01 | 10 | | | | 0 | 00000 | 01000 | 10000 | | | | 1 | 00001 | 01001 | 10001 | | | | 2 | 00010 | 01010 | 10010 | | | | 3 | 00011 | 01011 | 10011 | | | | 4 | 00100 | 01100 | 10100 | | | | 5 | 00101 | 01101 | 10101 | | | | 6 | 00110 | 01110 | 10110 | | | | 7 | 00111 | 01111 | 10111 | | | | 8 | 11000 | 11010 | 11100 | | | | 9 | 11001 | 11011 | 11101 | | | Figure 67. Encoding of bits 0:4 of the G field for Finite Numbers For DFP finite numbers, the rightmost N-5 bits of the N-bit combination field contain the remaining bits of the biased exponent. For NaNs, bit 5 of the combination field is used to distinguish a Quiet NaN from a Signaling NaN; the remaining bits in a source operand are ignored and they are set to zeros in a target operand by most operations. For infinities, the rightmost N-5 bits of the N-bit combination field of a source operand are ignored and they are set to zeros in a target operand by most operations. #### Trailing Significand field (T) For DFP finite numbers, this field contains the remaining *significand* digits. For NaNs, this field may be used to contain diagnostic information. For infinities, contents in this field of a source operand are ignored and they are set to zeros in a target operand by most operations. The trailing significand field is a multiple of 10-bit blocks. The multiple depends on the format. Each 10-bit block is called a declet and represents three decimal digits, using the Densely Packed Decimal (DPD) encoding defined in Appendix A. #### 5.4.1.2 Summary of DFP Data Formats The properties of the three DFP formats are summarized in the following table:. | | Format | | | |----------------------------------------------|-----------------------------|--------------------------------------------|---------------------------------------------| | | DFP Short | DFP Long | DFP Extended | | Widths (bits): | | | | | Format | 32 | 64 | 128 | | Sign (S) | 1 | 1 | 1 | | Combination (G) | 11 | 13 | 17 | | Trailing Significand (T) | 20 | 50 | 110 | | Exponent: | | | | | Maximum biased | 191 | 767 | 12,287 | | Maximum (X <sub>max</sub> ) | 90 | 369 | 6111 | | Minimum (X <sub>min</sub> ) | -101 | -398 | -6176 | | Bias | 101 | 398 | 6176 | | Precision (p) (digits) | 7 | 16 | 34 | | Magnitude: | | | | | Maximum normal number (N <sub>max</sub> ) | $(10^7 - 1) \times 10^{90}$ | (10 <sup>16</sup> - 1) x 10 <sup>369</sup> | (10 <sup>34</sup> - 1) x 10 <sup>6111</sup> | | Minimum normal number (N <sub>min</sub> ) | 1 x 10 <sup>-95</sup> | 1 x 10 <sup>-383</sup> | 1 x 10 <sup>-6143</sup> | | Minimum subnormal number (D <sub>min</sub> ) | 1 x 10 <sup>-101</sup> | 1 x 10 <sup>-398</sup> | 1 x 10 <sup>-6176</sup> | Figure 68. Summary of DFP Formats #### 5.4.1.3 Preferred DPD Encoding Execution of DFP instructions decodes source operands from DFP data formats to an internal format for processing, and encodes the operation result before the final result is returned as the target operand. As part of the decoding process, declets in the trailing significand field of source operands are decoded to their corresponding BCD digit codes using the DPD-to-BCD decoding algorithm. As part of the encoding process, BCD digit codes to be stored into the trailing significand field of the target operand are encoded into declets using the BCD-to-DPD encoding algorithm. Both the decoding and encoding algorithms are defined in Appendix A. As explained in Appendix A, there are eight 3-digit decimal values that have redundant DPD codes and one preferred DPD code. All redundant DPD codes are recognized in source operands for the associated 3-digit decimal number. DFP operations will always generate the preferred DPD codes for the trailing significand field of the target operand. #### 5.4.2 Classes of DFP Data There are six classes of DFP data, which include numerical and nonnumeric entities. The numerical entities include zero, subnormal number, normal number, and infinity data classes. The nonnumeric entities include quiet and signaling NaNs data classes. The value of a DFP finite number, including zero, subnormal number, and normal number, is a quantization of the real number based on the data format. The *Test Data Class* instruction may be used to determine the class of a DFP operand. In general, an operation that returns a DFP result sets the FPSCR<sub>FPRF</sub> field to indicate the data class of the result. The following tables show the value ranges for finitenumber data classes, and the codes for NaNs and infinities. | ı | | |---|--| | Data Class | Sign | Magnitude | |------------|------|-------------------------------| | Zero | ± | 0* | | Subnormal | ± | $D_{min} \le X < N_{min}$ | | Normal | ± | $N_{min} \le Y \le N_{max}$ | The significand is zero and the exponent is any representable value Figure 69. Value Ranges for Finite Number Data Classes | Data Class | s | G | T | |---------------|---|---------------|---------| | +Infinity | 0 | 11110xxx xxx | xxx xxx | | -Infinity | 1 | 11110xxx xxx | xxx xxx | | Quiet NaN | х | 111110xx xxx | xxx xxx | | Signaling NaN | х | 1111111xx xxx | xxx xxx | | x Don't care | | | | Figure 70. Encoding of NaN and Infinity Data Classes #### Zeros Zeros have a zero significand and any representable value in the exponent. A +0 is distinct from -0, and zeros with different exponents are distinct, except that comparison treats them as equal. #### Subnormal Numbers Subnormal numbers have values that are smaller than $N_{\text{min}}$ and greater than zero in magnitude. #### Normal Numbers Normal numbers are nonzero finite numbers whose magnitude is between $N_{\text{min}}$ and $N_{\text{max}}$ inclusively. #### Infinities Infinities are represented by 0b11110 in the leftmost 5 bits of the combination field. When an operation is defined to generate an infinity as the result, a default infinity is sometimes supplied. A default infinity has all remaining bits in the combination field and trailing significand field set to zeros. When infinities are used as source operands, only the leftmost 5 bits of the combination field are interpreted (i.e., 0b11110 indicates the value is an infinity). The trailing significand field of infinities is usually ignored. For generated infinities, the leftmost 5 bits of the combination field are set to 0b11110 and all remaining combination bits are set to zero. Infinities can participate in most arithmetic operations and give a consistent result. In comparisons, any +Infinity compares greater than any finite number, and any -Infinity compares less than any finite number. All +Infinity are compared equal and all -Infinity are compared equal. #### Signaling and Quiet NaNs There are two types of Not-a-Numbers (NaNs), Signaling (SNaN) and Quiet (QNaN). 0b111110 in the leftmost 6 bits of the combination field indicates a Quiet NaN, whereas 0b111111 indicates a Signaling NaN. A special QNaN is sometimes supplied as the *default QNaN* for a disabled invalid-operation exception; it has a plus sign, the leftmost 6 bits of the combination field set to 0b111110 and remaining bits in the combination field and the trailing significand field set to zero. Normally, source QNaNs are *propagated* during operations so that they will remain visible at the end. When a QNaN is propagated, the sign is preserved, the decimal value of the trailing significand field is preserved but reencoded using the preferred DPD codes, and the contents in the rightmost N-6 bits of the combination field set to zero, where N is the width of the combination field for the format. A source SNaN generally causes an invalid-operation exception. If the exception is disabled, the SNaN is converted to the corresponding QNaN and propagated. The primary encoding difference between an SNaN and a QNaN is that bit 5 of an SNaN is 1 and bit 5 of a QNaN is 0. When an SNaN is propagated as a QNaN, bit 5 is set to 0, and, just as with QNaN proagation, the sign is preserved, the decimal value of the trailing significand field is preserved but reencoded using the preferred DPD codes, and the contents in the rightmost N-6 bits of the combination field set to zero, where N is the width of the combination field for the format. For some format-conversion instructions, a source SNaN does not cause an invalid-operation exception, and an SNaN is returned as the target operand. For instructions with two source NaNs and a NaN is to be propagated as the result, do the following. - If there is a QNaN in FRA and an SNaN in FRB, the SNaN in FRB is propagated. - Otherwise, propagate the NaN is FRA. #### 5.5 DFP Execution Model DFP operations are performed as if they first produce an intermediate result correct to infinite precision and with unbounded range. The intermediate result is then rounded to the destination's precision according to one of the eight DFP rounding modes. If the rounded result has only one form, it is delivered as the final result; if the rounded result has redundant forms, then an *ideal exponent* is used to select the form of the final result. The ideal exponent determines the form, not the value, of the final result. (See Section 5.5.3 "Formation of Final Result" on page 152.) #### 5.5.1 Rounding Rounding takes a number regarded as infinitely precise and, if necessary, modifies it to fit the destination's precision. The destination's precision of an operation defines the set of permissible resultant values. For most operations, the destination's precision is the target-format precision and the permissible resultant values are those values representable in the target format. For some special operations, the destination precision is constrained by both the target format and some additional restrictions, and the permissible resultant values are a subset of the values representable in the target format. Rounding sets FPSCR bits FR and FI. When an inexact exception occurs, FI is set to one; otherwise, FI is set to zero. When an inexact exception occurs and if the rounded result is greater in magnitude than the intermediate result, then FR is set to one; otherwise, FR is set to zero. The exception is the *Round to FP Integer Without Inexact* instruction, which always sets FR and FI to zero. Rounding may cause an overflow exception or underflow exception; it may also cause an inexact exception. Refer to Figure 71 below for rounding. Let Z be the intermediate result of a DFP operation. Z may or may not fit in the destination's precision. If Z is exactly one of the permissible representable resultant values, then the final result in all rounding modes is Z. Otherwise, either Z1 or Z2 is chosen to approximate the result, where Z1 and Z2 are the next larger and smaller permissible resultant values, respectively. Figure 71. Rounding #### Round to Nearest, Ties to Even Choose the value that is closer to Z (Z1 or Z2). In case of a tie, choose the one whose units digit would have been even in the form with the largest common quantum of the two permissible resultant values. However, an infinitely precise result with magnitude at least ( $N_{max} + 0.5Q(N_{max})$ ) is rounded to infinity with no change in sign; where $Q(N_{max})$ is the quantum of $N_{max}$ . #### Round toward 0 Choose the smaller in magnitude (Z1 or Z2). #### Round toward +∞ Choose Z1. #### Round toward -∞ Choose Z2. #### Round to Nearest, Ties away from 0 Choose the value that is closer to Z (Z1 or Z2). In case of a tie, choose the larger in magnitude (Z1 or Z2). However, an infinitely precise result with magnitude at least ( $N_{max} + 0.5Q(N_{max})$ ) is rounded to infinity with no change in sign; where $Q(N_{max})$ is the quantum of $N_{max}$ . #### Round to Nearest, Ties toward 0 Choose the value that is closer to Z (Z1 or Z2). In case of a tie, choose the smaller in magnitude (Z1 or Z2). However, an infinitely precise result with magnitude greater than $(N_{max} + 0.5Q(N_{max}))$ is rounded to infinity with no change in sign; where $Q(N_{max})$ is the quantum of $N_{max}$ . #### Round away from 0 Choose the larger in magnitude (Z1 or Z2). #### Round to prepare for shorter precision Choose the smaller in magnitude (Z1 or Z2). If the selected value is inexact and the units digit of the selected value is either 0 or 5, then the digit is incremented by one and the incremented result is delivered. In all other cases, the selected value is delivered. When a value has redundant forms, the units digit is determined by using the form that has the smallest exponent. # 5.5.2 Rounding Mode Specification Unless otherwise specified in the instruction definition, the rounding mode used by an operation is specified in the DFP rounding control (DRN) field of the FPSCR. The eight DFP rounding modes are encoded in the DRN field as specified in the table below. | DRN | Rounding Mode | |-----|----------------------------------------| | 000 | Round to Nearest, Ties to Even | | 001 | Round toward 0 | | 010 | Round toward +Infinity | | 011 | Round toward -Infinity | | 100 | Round to Nearest, Ties away from 0 | | 101 | Round to Nearest, Ties toward 0 | | 110 | Round away from 0 | | 111 | Round to Prepare for Shorter Precision | Figure 72. Encoding of DFP Rounding-Mode Control (DRN) For the quantum-adjustment, a 2-bit immediate field, called RMC (*Rounding Mode Control*), in the instruction specifies the rounding mode used. The RMC field may contain a primary encoding or a secondary encoding. For *Quantize*, *Quantize Immediate*, and *Reround*, the RMC field contains the primary encoding. For *Round to FP Integer* the field contains either encoding, depending on the setting of a RMC-encoding-selection bit. The following tables define the primary encoding and the secondary encoding. | Primary RMC | Rounding Mode | |-------------|-----------------------------------------| | 00 | Round to nearest, ties to even | | 01 | Round toward 0 | | 10 | Round to nearest, ties away from 0 | | 11 | Round according to FPSCR <sub>DRN</sub> | Figure 73. Primary Encoding of Rounding-Mode Control | Secondary<br>RMC | Rounding Mode | | |------------------|---------------------------------|--| | 00 | Round to +∞ | | | 01 | Round to - ∞ | | | 10 | Round away from 0 | | | 11 | Round to nearest, ties toward 0 | | Figure 74. Secondary Encoding of Rounding-Mode Control #### 5.5.3 Formation of Final Result An ideal exponent is defined for each DFP instruction that returns a DFP data operand. #### 5.5.3.1 Use of Ideal Exponent For all DFP operations, - if the rounded intermediate result has only one form, then that form is delivered as the final result. - if the rounded intermediate result has redundant. forms and is exact, then the form with the exponent closest to the ideal exponent is delivered. - if the rounded intermediate result has redundant forms and is inexact, then the form with the smallest exponent is delivered. The following table specifies the ideal exponent for each instruction. | Operations | Ideal Exponent | | |-------------------------------------------------------------|-----------------------------|--| | Add | min(E(FRA), E(FRB)) | | | Subtract | min(E(FRA), E(FRB)) | | | Multiply | E(FRA) + E(FRB) | | | Divide | E(FRA) - E(FRB) | | | Quantize-Immediate | See Instruction Description | | | Quantize | E(FRA) | | | Reround | See Instruction Description | | | Round to FP Integer | max(0, E(FRA)) | | | Convert to DFP Long | E(FRA) | | | Convert to DFP<br>Extended | E(FRA) | | | Round to DFP Short | E(FRA) | | | Round to DFP Long | E(FRA) | | | Convert from Fixed | 0 | | | Encode BCD to DPD | 0 | | | Insert Biased Expo-<br>nent | E(FRA) | | | Notes:<br>E(x) - exponent of the DFP operand in register x. | | | Figure 75. Summary of Ideal Exponents #### 5.5.4 Arithmetic Operations Four arithmetic operations are provided: Add, Subtract, Multiply, and Divide. #### 5.5.4.1 Sign of Arithmetic Result The following rules govern the sign of an arithmetic operation when the operation does not yield an exception. They apply even when the operands or results are zeros or infinities. - The sign of the result of an add operation is the sign of the source operand having the larger absolute value. If both source operands have the same sign, the sign of the result of an add operation is the same as the sign of the source operands. When the sum of two operands with opposite signs is exactly zero, the sign of the result is positive in all rounding modes except Round toward -∞, in which case the sign is negative. - The sign of the result of the subtract operation x y is the same as the sign of the result of the add operation x + (-y). - The sign of the result of a multiply or divide operation is the exclusive-OR of the signs of the source operands. #### 5.5.5 Compare Operations Two sets of instructions are provided for comparing numerical values: *Compare Ordered* and *Compare Unordered*. In the absence of NaNs, these instructions work the same. These instructions work differently when either of the followings is true: - At least one source operand of the instruction is an SNaN and the invalid-operation exception is disabled. - When there is no SNaN in any source operand, at least one source operand of the instruction is a QNaN In case 1, Compare Unordered recognizes an invalidoperation exception and sets the FPSCR<sub>VXSNAN</sub> flag, but Compare Ordered recognizes the exception and sets both the FPSCR<sub>VXSNAN</sub> and FPSCR<sub>VXVC</sub> flags. In case 2, Compare Unordered does not recognize an exception, but Compare Ordered recognizes an invalidoperation exception and sets the FPSCR<sub>VXVC</sub> flag. For finite numbers, comparisons are performed on values, that is, all redundant forms of a DFP number are treated equal. Comparisons are always exact and cannot cause an inexact exception. Comparison ignores the sign of zero, that is, +0 equals -0. Infinities with like sign compare equal, that is, $+\infty$ equals $+\infty$ , and $-\infty$ equals $-\infty$ . A NaN compares as unordered with any other operand, whether a finite number, an infinity, or another NaN, including itself. Execution of a compare instruction always completes, regardless of whether any DFP exception occurs or not, and whether the exception is enabled or not. #### 5.5.6 Test Operations Four kinds of test operations are provided: *Test Data Class, Test Data Group, Test Exponent,* and *Test Significance.* The *Test Data Class* instruction examines the contents of a source operand and determines if the operand is one of the specified data classes. The test result and the sign of the source operand are indicated in the FPSCR<sub>FPCC</sub> field and CR field BF. The *Test Data Group* instruction examines the contents of a source operand and determines if the operand is one of the specified data groups. The test result and the sign of the source operand are indicated in the FPSCR<sub>EPCC</sub> field and CR field BF. The *Test Exponent* instruction compares the exponent of the two source operands. The test operation ignores the sign and significand of operands. Infinities compare equal, and NaNs compare equal. The test result is indicated in the FPSCR<sub>FPCC</sub> field and CR field BF. The *Test Significance* instruction compares the number of significant digits of one source operand with the referenced number of significant digits in another source operand. The test result is indicated in the FPSCR<sub>FPCC</sub> field and CR field BF. Execution of a test instruction does not cause any DFP exception. ## 5.5.7 Quantum Adjustment Operations Four kinds of quantum-adjustment operations are provided: *Quantize, Quantize Immediate, Reround,* and *Round To FP Integer.* Each of them has an immediate field which specifies whether the rounding mode in FPSCR or a different one is to be used. The *Quantize* instruction is used to adjust a DFP number to the form that has the specified target exponent. The *Quantize Immediate* instruction is similar to the *Quantize* instruction, except that the target exponent is specified in a 5-bit immediate field as a signed binary integer and has a limited range. The *Reround* instruction is used to simulate a DFP operation of a precision other than that of DFP Long or DFP Extended. For the *Reround* instruction to produce a result which accurately reflects that which would have resulted from a DFP operation of the desired precision *d* in the range {1: 33} inclusively, the following conditions must be met: - The precision of the preceding DFP operation must be at least one digit larger than *d*. - The rounding mode used by the preceding DFP operation must be round-to-prepare-for-shorter-precision. The Round To FP Integer instruction is used to round a DFP number to an integer value of the same format. The target exponent is implicitly specified, and is greater than or equal to zero. #### 5.5.8 Conversion Operations There are two kinds of conversion operations: data-format conversion and data-type conversion. #### 5.5.8.1 Data-Format Conversion The instructions *Convert To DFP Long* and *Convert To DFP Extended* convert DFP operands to wider formats; the instructions *Round To DFP Short* and *Round To DFP Long* convert DFP operands to narrower formats. When converting a finite number to a wider format, the result is exact. When converting a finite number to a narrower format, the source operand is rounded to the target-format precision, which is specified by the instruction, not by the target register size. When converting a finite number, the ideal exponent of the result is the source exponent. Conversion of an infinity or NaN to a different format does not preserve the source combination field. Let N be the width of the target format's combination field. - When the result is an infinity or a QNaN, the contents of the rightmost N-5 bits of the N-bit target combination field are set to zero. - When the result is an SNaN, bit 5 of the target format's combination field is set to one and the rightmost N-6 bits of the N-bit target combination field are set to zero. When converting a NaN to a wider format or when converting an infinity from DFP Short to DFP Long, digits in the source trailing significand field are reencoded using the preferred DPD codes with sufficient zeros appended on the left to form the target trailing significand field. When converting a NaN to a narrower format or when converting an infinity from DFP Long to DFP Short, the appropriate number of leftmost digits of the source trailing significand field are removed and the remaining digits of the field are reencoded using the preferred DPD codes to form the target trailing significand field. When converting an infinity between DFP Long and DFP Extended, a default infinity with the same sign is produced. When converting an SNaN between DFP Short and DFP Long, it is converted to an SNaN without causing an invalid-operation exception. When converting an SNaN between DFP Long and DFP Extended, the invalid-operation exception occurs; if the invalid-operation exception is disabled, the result is converted to the corresponding QNaN. #### 5.5.8.2 Data-Type Conversion The instructions *Convert From Fixed* and *Convert To Fixed* are provided to convert a number between the DFP data type and the signed 64-bit binary-integer data type. Conversion of a signed 64-bit binary integer to a DFP Extended number is always exact. Conversion of a DFP number to a signed 64-bit binary integer results in an invalid-operation exception when the converted value does not fit into the target format, or when the source operand is an infinity or NaN. When the exception is disabled, the most positive integer is returned if the source operand is a positive number or $+\infty$ , and the most negative integer is returned if the source operand is a negative number, $-\infty$ , or NaN. #### 5.5.9 Format Operations The format instructions are provided to facilitate composing or decomposing a DFP number, and consist of Encode BCD To DPD, Decode DPD To BCD, Extract Biased Exponent, Insert Biased Exponent, Shift Significand Left Immediate, and Shift Significand Right Immediate. A source operand of SNaN does not cause an invalid-operation exception, and an SNaN may be produced as the target operand. #### 5.5.10 DFP Exceptions This architecture defines the following DFP exceptions: Invalid Operation Exception **SNaN** ∞ - ∞ $\infty \div \infty$ 0 ÷ 0 ∞ X 0 Invalid Compare Invalid Conversion - Zero Divide Exception - Overflow Exception - Underflow Exception - Inexact Exception These exceptions may occur during execution of a DFP instruction. Each DFP exception, and each category of the Invalid Operation Exception, has an exception status bit in the FPSCR. In addition, each DFP exception has a corresponding enable bit in the FPSCR. The exception status bit indicates occurrence of the corresponding exception. If an exception occurs, the corresponding enable bit governs the result produced by the instruction and, in conjunction with the FE0 and FE1 bits (see the discussion of FE0 and FE1 below), whether and how the system floating-point enabled exception error handler is invoked. (In general, the enabling specified by the enable bit is of invoking the system error handler, not of permitting the exception to occur. The occurrence of an exception depends only on the instruction and its source operands, not on the setting of any control bits. The only deviation from this general rule is that the occurrence of an Underflow Exception may depend on the setting of the enable bit.) A single instruction, other than *mtfsfi* or *mtfsf*, may set more than one exception bit only in the following cases: - Inexact Exception may be set with Overflow Exception. - Inexact Exception may be set with Underflow Exception. - Invalid Operation Exception (SNaN) may be set with Invalid Operation Exception (Invalid Compare) for Compare Ordered instructions Invalid Operation Exception (SNaN) may be set with Invalid Operation Exception (Invalid Conversion) for Convert To Fixed instructions. When an exception occurs the instruction execution may be completed or partially completed, depending on the exception and the operation. For all instructions, except for the Compare and Test instructions, the following exceptions cause the instruction execution to be partially completed. That is, setting of CR field 1(when Rc=1) and exception status flags is performed, but no result is stored into the target FPR or FPR pair. For Compare and Test instructions, instruction execution is always completed, regardless of whether any DFP exception occurs or not, and whether the exception is enabled or not. - Enabled Invalid Operation - Enabled Zero Divide For the remaining kinds of exceptions, instruction execution is completed, a result, if specified by the instruction, is generated and stored into the target FPR or FPR pair, and appropriate status flags are set. The result may be a different value for the enabled and disabled conditions for some of these exceptions. The kinds of exceptions that deliver a result in target FPR are the following: - Disabled Invalid Operation - Disabled Zero Divide - Disabled Overflow - Disabled Underflow - Disabled Inexact - Enabled Overflow - Enabled Underflow - Enabled Inexact Subsequent sections define each of the DFP exceptions and specify the action that is taken when they are detected. The IEEE standard specifies the handling of exceptional conditions in terms of "traps" and "trap handlers". In this architecture, a FPSCR exception enable bit of 1 causes generation of the result value specified in the IEEE standard for the "trap enabled" case: the expectation is that the exception will be detected by software, which will revise the result. A FPSCR exception enable bit of 0 causes generation of the "default result" value specified for the "trap disabled" (or "no trap occurs" or "trap is not implemented") case: the expectation is that the exception will not be detected by software, which will simply use the default result. The result to be delivered in each case for each exception is described in the sections below. The IEEE default behavior when an exception occurs is to generate a default value and not to notify software. In this architecture, if the IEEE default behavior when an exception occurs is desired for all exceptions, all FPSCR exception enable bits should be set to zero and Ignore Exceptions Mode (see below) should be used. In this case the system floating-point enabled exception error handler is not invoked, even if DFP exceptions occur: software can inspect the FPSCR exception bits if necessary, to determine whether exceptions have occurred. In this architecture, if software is to be notified that a given kind of exception has occurred, the corresponding FPSCR exception enable bit must be set to one and a mode other than Ignore Exceptions Mode must be used. In this case the system floating-point enabled exception error handler is invoked if an enabled DFP exception occurs. The system floating-point enabled exception error handler is also invoked if a *Move To FPSCR* instruction causes an exception bit and the corresponding enable bit both to be 1; the *Move To FPSCR* instruction is considered to cause the enabled exception. The FE0 and FE1 bits control whether and how the system floating-point enabled exception error handler is invoked if an enabled DFP exception occurs. The location of these bits and the requirements for altering them are described in Book III, *PowerPC AS Operating Environment Architecture*. (The system floating-point enabled exception error handler is never invoked because of a disabled DFP exception.) The effects of the four possible settings of these bits are as follows. #### FE0 FE1 Description #### 0 0 Ignore Exceptions Mode DFP exceptions do not cause the system floating-point enabled exception error handler to be invoked. #### 0 1 Imprecise Nonrecoverable Mode The system floating-point enabled exception error handler is invoked at some point at or beyond the instruction that caused the enabled exception. It may not be possible to identify the excepting instruction or the data that caused the exception. Results produced by the excepting instruction may have been used by or may have affected subsequent instructions that are executed before the error handler is invoked. #### 1 0 Imprecise Recoverable Mode The system floating-point enabled exception error handler is invoked at some point at or beyond the instruction that caused the enabled exception. Sufficient information is provided to the error handler that it can identify the excepting instruction and the operands, and correct the result. No results produced by the excepting instruction have been used by or have affected subsequent instructions that are executed before the error handler is invoked. #### FE0 FE1 Description #### 1 1 Precise Mode The system floating-point enabled exception error handler is invoked precisely at the instruction that caused the enabled exception. In all cases, the question of whether a DFP result is stored, and what value is stored, is governed by the FPSCR exception enable bits, as described in subsequent sections, and is not affected by the value of the FE0 and FE1 bits. In all cases in which the system floating-point enabled exception error handler is invoked, all instructions before the instruction at which the system floating-point enabled exception error handler is invoked have completed, and no instruction after the instruction at which the system floating-point enabled exception error handler is invoked has begun execution. (Recall that, for the two Imprecise modes, the instruction at which the system floating-point enabled exception error handler is invoked need not be the instruction that caused the exception.) The instruction at which the system floating-point enabled exception error handler is invoked has not been executed unless it is the excepting instruction, in which case it has been executed if the exception is not among those listed on page 154 as suppressed. #### Programming Note In the ignore and both imprecise modes, a *Floating-Point Status and Control Register* instruction can be used to force any exceptions, due to instructions initiated before the *Floating-Point Status and Control Register* instruction, to be recorded in the FPSCR. (This forcing is superfluous for Precise Mode.) In either of the Imprecise modes, a *Floating-Point Status and Control Register* instruction can be used to force any invocations of the system floating-point enabled exception error handler, due to instructions initiated before the *Floating-Point Status and Control Register* instruction, to occur. (This forcing has no effect in Ignore Exceptions Mode, and is superfluous for Precise Mode.) In order to obtain the best performance across the widest range of implementations, the programmer should obey the following guidelines. - If the IEEE default results are acceptable to the application, Ignore Exceptions Mode should be used with all FPSCR exception enable bits set to zero. - If the IEEE default results are not acceptable to the application, Imprecise Nonrecoverable Mode should be used, or Imprecise Recoverable Mode if recoverability is needed, with FPSCR exception - enable bits set to one for those exceptions for which the system floating-point enabled exception error handler is to be invoked. - Ignore Exceptions Mode should not, in general, be used when any FPSCR exception enable bits are set to one. - Precise Mode may degrade performance in some implementations, perhaps substantially, and therefore should be used only for debugging and other specialized applications. # 5.5.10.1 Invalid Operation Exception #### **Definition** An Invalid Operation Exception occurs when an operand is invalid for the specified DFP operation. The invalid DFP operations are: - Any DFP operation on a signaling NaN (SNaN), except for Test, Round To DFP Short, Convert To DFP Long, Decode DPD To BCD, Extract Biased Exponent, Insert Biased Exponent, Shift Significand Left Immediate, and Shift Significand Right Immediate - For add or subtract operations, magnitude subtraction of infinities (+∞) + (-∞) - Division of infinity by infinity $(\infty \div \infty)$ - Division of zero by zero (0 ÷ 0) - Multiplication of infinity by zero (∞ × 0) - Ordered comparison involving a NaN (Invalid Compare) - The Quantize operation detects that the significand associated with the specified target exponent would have more significant digits than the targetformat precision - For the Quantize operation, when one source operand specifies an infinity and the other specifies a finite number - The Reround operation detects that the target exponent associated with the specified target significance would be greater than X<sub>max</sub> - The Encode BCD To DPD operation detects an invalid BCD digit or sign code - The Convert To Fixed operation involving a number too large in magnitude to be represented in the target format, or involving a NaN. #### Programming Note - In addition, an Invalid Operation Exception occurs if software explicitly requests this by executing an *mtfsfi*, *mtfsf*, or *mtfsb1* instruction that sets FPSCR<sub>VXSOFT</sub> to 1 (Software Request). The purpose of FPSCR<sub>VXSOFT</sub> is to allow software to cause an Invalid Operation Exception for a condition that is not necessarily associated with the execution of a DFP instruction. For example, it might be set by a program that computes a square root, if the source operand is negative. #### **Action** The action to be taken depends on the setting of the Invalid Operation Exception Enable bit of the FPSCR. When Invalid Operation Exception is enabled (FPSCR $_{VE}$ =1) and Invalid Operation occurs, the following actions are taken: 1. One or two Invalid Operation Exceptions are set: $\begin{array}{lll} \text{FPSCR}_{\text{VXSNAN}} & \text{ (if SNaN)} \\ \text{FPSCR}_{\text{VXIDI}} & \text{ (if $\infty$ - $\infty$)} \\ \text{FPSCR}_{\text{VXIDI}} & \text{ (if $\infty$ + $\infty$)} \\ \text{FPSCR}_{\text{VXZDZ}} & \text{ (if $0$ + $0$)} \\ \text{FPSCR}_{\text{VXIMZ}} & \text{ (if $\infty$ \times $0$)} \\ \text{FPSCR}_{\text{VXVC}} & \text{ (if invalid comp)} \\ \text{FPSCR}_{\text{VXCVI}} & \text{ (if invalid conversion)} \end{array}$ If the operation is an arithmetic, quantum-adjustment, conversion, or format, > the target FPR is unchanged, FPSCR<sub>FR FI</sub> are set to zero, and FPSCR<sub>FPRF</sub> is unchanged. 3. If the operation is a compare, $\mathsf{FPSCR}_\mathsf{FR}\,\mathsf{FI}\,\mathsf{C}$ are unchanged, and $\mathsf{FPSCR}_\mathsf{FPCC}$ is set to reflect unordered. When Invalid Operation Exception is disabled (FPSCR $_{\text{VE}}$ =0) and Invalid Operation occurs, the following actions are taken: 1. One or two Invalid Operation Exceptions are set: $\begin{array}{lll} \text{FPSCR}_{\text{VXSNAN}} & \text{ (if SNaN)} \\ \text{FPSCR}_{\text{VXIDI}} & \text{ (if $\infty$-$\infty$)} \\ \text{FPSCR}_{\text{VXIDI}} & \text{ (if $\infty$-$\infty$)} \\ \text{FPSCR}_{\text{VXZDZ}} & \text{ (if $0$-$0$)} \\ \text{FPSCR}_{\text{VXIMZ}} & \text{ (if $\infty$\times$0$)} \\ \text{FPSCR}_{\text{VXVC}} & \text{ (if invalid comp)} \\ \text{FPSCR}_{\text{VXCVI}} & \text{ (if invalid conversion)} \end{array}$ 2. If the operation is an arithmetic, quantum-adjustment, Round to DFP Long, Convert to DFP Extended, or format the target FPR is set to a Quiet NaN FPSCR<sub>FR FI</sub> are set to zero FPSCR<sub>FPRF</sub> is set to indicate the class of the result (Quiet NaN) 3. If the operation is a Convert To Fixed the target FPR is set as follows: FRT is set to the most positive 64-bit binary integer if the operand in FRB is a positive or $+\infty$ , and to the most negative 64-bit binary integer if the operand in FRB is a negative number, $-\infty$ , or NaN. FPSCR<sub>FR FI</sub> are set to zero FPSCR<sub>FPRF</sub> is unchanged 4. If the operation is a compare, FPSCR<sub>FR FI C</sub> are unchanged FPSCR<sub>FPCC</sub> is set to reflect unordered #### 5.5.10.2 Zero Divide Exception #### **Definition** A Zero Divide Exception occurs when a Divide instruction is executed with a zero divisor value and a finite nonzero dividend value. #### **Action** The action to be taken depends on the setting of the Zero Divide Exception Enable bit of the FPSCR. When Zero Divide Exception is enabled (FPSCR<sub>ZE</sub>=1) and Zero Divide occurs, the following actions are taken: - 1. Zero Divide Exception is set - FPSCR<sub>ZX</sub> ← 1 - 2. The target FPR is unchanged - 3. FPSCR<sub>FR FI</sub> are set to zero - 4. FPSCR<sub>FPRF</sub> is unchanged When Zero Divide Exception is disabled (FPSCR<sub>ZE</sub>=0) and Zero Divide occurs, the following actions are taken: - Zero Divide Exception is set FPSCR<sub>7X</sub> ← 1 - The target FPR is set to ±∞, where the sign is determined by the XOR of the signs of the operands - 3. $FPSCR_{FR\ FI}$ are set to zero - 4. FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result $(\pm \infty)$ #### 5.5.10.3 Overflow Exception #### **Definition** An overflow exception occurs whenever the target format's largest finite number is exceeded in magnitude by what would have been the rounded result if the exponent range were unbounded. #### Action Except for *Reround*, the following describes the handling of the IEEE overflow exception condition. The *Reround* operation does not recognize an overflow exception condition. The action to be taken depends on the setting of the Overflow Exception Enable bit of the FPSCR. When Overflow Exception is enabled (FPSCR $_{OE}$ =1) and overflow occurs, the following actions are taken: - 1. Overflow Exception is set $FPSCR_{OX} \leftarrow 1$ - 2. The infinitely precise result is divided by $10^{\alpha}$ . That is, the exponent adjustment $\alpha$ is subtracted from the exponent. This is called the *wrapped result*. The exponent adjustment for all operations, except for *Round To DFP Short* and *Round To DFP Long*, is 576 for DFP Long and 9216 for DFP Extended. For *Round To DFP Short* and *Round To DFP Long*, the exponent adjustment is 192 for the source format of DFP Long and 3072 for the source format of DFP Extended. - The wrapped result is rounded to the target-format precision. This is called the wrapped rounded result. - 4. If the wrapped rounded result has only one form, it is the delivered result. If the wrapped rounded result has redundant forms and is exact, the result of the form that has the exponent closest to the wrapped ideal exponent is returned. If the wrapped rounded result has redundant forms and is inexact, the result of the form that has the smallest exponent is returned. The wrapped ideal exponent is the result of subtracting the exponent adjustment from the ideal exponent. - FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result (± Normal Number) When Overflow Exception is disabled (FPSCR $_{OE}$ =0) and overflow occurs, the following actions are taken: - 1. Overflow Exception is set $FPSCR_{OX} \leftarrow 1$ - 2. Inexact Exception is set FPSCR<sub>XX</sub> ← 1 - 3. The result is determined by the rounding mode and the sign of the intermediate result as follows. | | Sign of inter-<br>mediate result | | | |----------------------------------------|----------------------------------|-------------------|--| | Rounding Mode | Plus | Minus | | | Round to Nearest, Ties to Even | +∞ | -∞ | | | Round toward 0 | +N <sub>max</sub> | -N <sub>max</sub> | | | Round toward +∞ | +∞ | -N <sub>max</sub> | | | Round toward - ∞ | +N <sub>max</sub> | -∞ | | | Round to Nearest, Ties away from 0 | +∞ | -∞ | | | Round to Nearest, Ties toward 0 | +∞ | -∞ | | | Round away from 0 | +∞ | -∞ | | | Round to prepare for shorter precision | +N <sub>max</sub> | -N <sub>max</sub> | | Figure 76. Overflow Results When Exception Is Disabled - 4. The result is placed into the target FPR - FPSCR<sub>FR</sub> is set to one if the returned result is ± ∞, and is set to zero if the returned result is ±N<sub>max</sub> - 6. FPSCR<sub>FI</sub> is set to one - FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result (± ∞ or ± Normal number) #### 5.5.10.4 Underflow Exception #### Definition Except for *Reround*, the following describes the handling of the IEEE underflow exception condition. The *Reround* operation does not recognize an underflow exception condition. The Underflow Exception is defined differently for the enabled and disabled states. However, a tininess condition is recognized in both states when a result computed as though both the precision and exponent range were unbounded would be nonzero and less than the target format's smallest normal number, N<sub>min</sub>, in magnitude. Unless otherwise defined in the instruction description, an underflow exception occurs as follows: - Enabled: When the tininess condition is recognized. - Disabled: When the tininess condition is recognized and when the delivered result value differs from what would have been computed were both the precision and the exponent range unbounded. #### **Action** The action to be taken depends on the setting of the Underflow Exception Enable bit of the FPSCR. When Underflow Exception is enabled (FPSCR<sub>UE</sub>=1) and underflow occurs, the following actions are taken: - Underflow Exception is set FPSCR<sub>UX</sub> ← 1 - 2. The infinitely precise result is multiplied by $10^{\alpha}$ . That is, the exponent adjustment $\alpha$ is added to the exponent. This is called the *wrapped result*. The exponent adjustment for all operations, except for *Round To DFP Short* and *Round To DFP Long*, is 576 for DFP Long and 9216 for DFP Extended. For *Round To DFP Short* and *Round To DFP Long*, the exponent adjustment is 192 for the source format of DFP Long and 3072 for the source format of DFP Extended. - The wrapped result is rounded to the target-format precision. This is called the wrapped rounded result. - 4. If the wrapped rounded result has only one form, it is the delivered result. If the wrapped rounded result has redundant forms and is exact, the result of the form that has the exponent closest to the wrapped ideal exponent is returned. If the wrapped rounded result has redundant forms and is inexact, the result of the form that has the smallest exponent is returned. The wrapped ideal exponent is the result of adding the exponent adjustment to the ideal exponent. FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result (± Normal number) When Underflow Exception is disabled (FPSCR<sub>UE</sub>=0) and underflow occurs, the following actions are taken: - Underflow Exception is set FPSCR<sub>IIX</sub> ← 1 - The infinitely precise result is rounded to the target-format precision. - The rounded result is returned. If this result has redundant forms, the result of the form that is closest to the ideal exponent is returned. - FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result (± Normal number, ± Subnormal Number, or ± Zero) #### 5.5.10.5 Inexact Exception #### **Definition** Except for Round to FP Integer Without Inexact, the following describes the handling of the IEEE inexact exception condition. The Round to FP Integer Without Inexact does not recognize an inexact exception condition. An Inexact Exception occurs when either of two conditions occur during rounding: - The delivered result differs from what would have been computed were both the precision and exponent range unbounded. - The rounded result overflows and Overflow Exception is disabled. #### Action The action to be taken does not depend on the setting of the Inexact Exception Enable bit of the FPSCR. When Inexact Exception occurs, the following actions are taken: - Inexact Exception is set FPSCR<sub>XX</sub> ← 1 - The rounded or overflowed result is placed into the target FPR - FPSCR<sub>FPRF</sub> is set to indicate the class and sign of the result #### Programming Note - In some implementations, enabling Inexact Exceptions may degrade performance more than does enabling other types of floating-point exception. # 5.5.11 Summary of Normal Rounding And Range Actions Figure 77 and Figure 78 summarize rounding and range actions, with the following exceptions: - The *Reround* operation recognizes neither an underflow nor an overflow exception. - The Round to FP Integer Without Inexact operation does not recognize the inexact operation exception. | | | | | | | | ult (r) | | | | |----------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|-------|-------|-------|-----------|-------|-------|-------| | | | | | | | | ding Mode | | | | | | nge of v | Case | RNE | RNTZ | RNAZ | RAFZ | RTMI | RFSP | RTPI | RTZ | | | x, q < -Nmax | Overflow | -∞1 | -∞1 | -∞1 | -∞1 | -∞1 | -Nmax | -Nmax | -Nmax | | | x, q = -Nmax | Normal | -Nmax | -Nmax | -Nmax | _ | _ | -Nmax | -Nmax | -Nmax | | -Nmax | ≤ v ≤ -Nmin | Normal | b | b | b | b | b | b | b | b | | -Nmin | -Nmin $<$ v $≤$ -Dmin | | b* | b* | b* | b* | b* | b* | b | b | | | v < -Dmin/2 | Tiny | -Dmin | -Dmin | -Dmin | -Dmin | -Dmin | -Dmin | -0 | -0 | | | -Dmin/2 | Tiny | -0 | -0 | -Dmin | -Dmin | -Dmin | -Dmin | -0 | -0 | | | 1/2 < v < 0 | Tiny | -0 | -0 | -0 | -Dmin | -Dmin | -Dmin | -0 | -0 | | , | v = 0 | EZD | +0 | +0 | +0 | +0 | -0 | +0 | +0 | +0 | | | < +Dmin/2 | Tiny | +0 | +0 | +0 | +Dmin | +0 | +Dmin | +Dmin | +0 | | V = - | +Dmin/2 | Tiny | +0 | +0 | +Dmin | +Dmin | +0 | +Dmin | +Dmin | +0 | | +Dmin/2 | < v < +Dmin | Tiny | +Dmin | +Dmin | +Dmin | +Dmin | +0 | +Dmin | +Dmin | +0 | | +Dmin : | ≤ v < +Nmin | Tiny | b* | b* | b* | b* | b | b* | b* | b | | +Nmin ≤ | ≤ v ≤ +Nmax | Normal | b | b | b | b | b | b | b | b | | +Nmax < | v, q = +Nmax | Normal | +Nmax | +Nmax | +Nmax | _ | +Nmax | +Nmax | _ | +Nmax | | +Nmax < | v, q > +Nmax | Overflow | +∞1 | +∞1 | +∞1 | +∞1 | +Nmax | +Nmax | +∞1 | +Nmax | | b q r v Dmin EZD Nmax Nmin RAFZ RFSP RNAZ RNE RNTZ RTPI | The rounded value, in the extreme case, may be Nmin. In this case, the exception conditions are underflow, inexact, and incremented. b The value derived when the precise result v is rounded to the destination's precision, including both bounded precision and bounded exponent range. q The value derived when the precise result v is rounded to the destination's precision, but assuming an unbounded exponent range. r This is the returned value when neither overflow nor underflow is enabled. v Precise result before rounding, assuming unbounded precision and an unbounded exponent range. For data-format conversion operations, v is the source value. Dmin Smallest (in magnitude) representable subnormal number in the target format. EZD The result r of the exact-zero-difference case applies only to ADD and SUBTRACT with both source operands having opposite signs. (For ADD and SUBTRACT, when both source operands have the same sign, the sign of the zero result is the same sign as the sign of the source operands.) Nmax Largest (in magnitude) representable finite number in the target format. Nmin Smallest (in magnitude) representable normalized number in the target format. RAFZ Round away from 0. RFSP Round to Prepare for Shorter Precision. RNAZ Round to Nearest, Ties away from 0. RNE Round to Nearest, Ties to even. | | | | | | | | | | | RTMI<br>RTZ | Round toward -« Round toward 0. | ·. | | | | | | | | | Figure 77. Rounding and Range Actions (Part 1) | | ls r | | | | Is r Incre- | Is q | Is q Incre- | | |----------|------------------|------|------|------|-------------|-----------------|-----------------|--------------------------------------------------------------------------------------------------------------| | | inexact | | | | mented | inexact | mented | | | Case | (r≠v) | OE=1 | UE=1 | XE=1 | ( r > v ) | (q≠v) | ( q > v ) | Returned Results and Status Setting* | | Overflow | Yes <sup>1</sup> | No | _ | No | No | _ | _ | $T(r)$ , OX $\leftarrow$ 1, FI $\leftarrow$ 1, FR $\leftarrow$ 0, XX $\leftarrow$ 1 | | Overflow | Yes <sup>1</sup> | No | _ | No | Yes | _ | _ | $T(r)$ , OX $\leftarrow$ 1, FI $\leftarrow$ 1, FR $\leftarrow$ 1, XX $\leftarrow$ 1 | | Overflow | Yes <sup>1</sup> | No | _ | Yes | No | _ | _ | $T(r)$ , $OX \leftarrow 1$ , $FI \leftarrow 1$ , $FR \leftarrow 0$ , $XX \leftarrow 1$ , $TX$ | | Overflow | Yes <sup>1</sup> | No | _ | Yes | Yes | _ | _ | $T(r)$ , OX $\leftarrow$ 1, FI $\leftarrow$ 1, FR $\leftarrow$ 1, XX $\leftarrow$ 1, TX | | Overflow | Yes <sup>1</sup> | Yes | _ | _ | _ | No | No <sup>1</sup> | $Tw(q \div \beta)$ , OX $\leftarrow$ 1, FI $\leftarrow$ 0, FR $\leftarrow$ 0, TO | | Overflow | Yes <sup>1</sup> | Yes | _ | _ | _ | Yes | No | Tw(q÷ $\beta$ ), OX $\leftarrow$ 1, FI $\leftarrow$ 1, FR $\leftarrow$ 0, XX $\leftarrow$ 1,TO | | Overflow | Yes <sup>1</sup> | Yes | _ | _ | _ | Yes | Yes | $Tw(q \div \beta)$ , OX $\leftarrow$ 1, FI $\leftarrow$ 1, FR $\leftarrow$ 1, XX $\leftarrow$ 1,TO | | Normal | No | _ | _ | _ | _ | _ | _ | $T(r)$ , $FI \leftarrow 0$ , $FR \leftarrow 0$ | | Normal | Yes | _ | _ | No | No | _ | _ | T(r), FI← 1, FR← 0, XX ← 1 | | Normal | Yes | _ | _ | No | Yes | _ | _ | $T(r)$ , $FI \leftarrow 1$ , $FR \leftarrow 1$ , $XX \leftarrow 1$ | | Normal | Yes | _ | _ | Yes | No | _ | _ | $T(r)$ , $FI \leftarrow 1$ , $FR \leftarrow 0$ , $XX \leftarrow 1$ , $TX$ | | Normal | Yes | _ | _ | Yes | Yes | _ | _ | $T(r)$ , $FI \leftarrow 1$ , $FR \leftarrow 1$ , $XX \leftarrow 1$ , $TX$ | | Tiny | No | _ | No | _ | _ | _ | _ | $T(r)$ , $FI \leftarrow 0$ , $FR \leftarrow 0$ | | Tiny | No | _ | Yes | _ | _ | No <sup>1</sup> | No <sup>1</sup> | $Tw(q \bullet \beta)$ , $UX \leftarrow 1$ , $FI \leftarrow 0$ , $FR \leftarrow 0$ , $TU$ | | Tiny | Yes | _ | No | No | No | _ | _ | $T(r)$ , $UX \leftarrow 1$ , $FI \leftarrow 1$ , $FR \leftarrow 0$ , $XX \leftarrow 1$ | | Tiny | Yes | _ | No | No | Yes | _ | _ | T(r), UX← 1, FI← 1, FR← 1, XX ← 1 | | Tiny | Yes | _ | No | Yes | No | _ | _ | $T(r)$ , $UX \leftarrow 1$ , $FI \leftarrow 1$ , $FR \leftarrow 0$ , $XX \leftarrow 1$ , $TX$ | | Tiny | Yes | _ | No | Yes | Yes | | | $T(r)$ , $UX \leftarrow 1$ , $FI \leftarrow 1$ , $FR \leftarrow 1$ , $XX \leftarrow 1$ , $TX$ | | Tiny | Yes | _ | Yes | _ | _ | No | No <sup>1</sup> | Tw( $q \bullet \beta$ ), UX $\leftarrow$ 1, FI $\leftarrow$ 0, FR $\leftarrow$ 0, TU | | Tiny | Yes | | Yes | | _ | Yes | No | Tw( $q \bullet \beta$ ), UX $\leftarrow$ 1, FI $\leftarrow$ 1, FR $\leftarrow$ 0, XX $\leftarrow$ 1,TU | | Tiny | Yes | _ | Yes | _ | _ | Yes | Yes | $Tw(q \bullet \beta)$ , $UX \leftarrow 1$ , $FI \leftarrow 1$ , $FR \leftarrow 1$ , $XX \leftarrow 1$ , $TU$ | #### Explanation: - The results do not depend on this condition. - This condition is true by virtue of the state of some condition to the left of this column. - \* Rounding sets only the FI and FR status flags. Setting of the OX, XX, or UX flag is part of the exception actions. They are listed here for reference. - Wrap adjust, which depends on the type of operation and operand format. For all operations except *Round to DFP* Short and *Round to DFP Long*, the wrap adjust depends on the target format: $\beta = 10^{\alpha}$ , where $\alpha$ is 576 for DFP Long, and 9216 for DFP Extended. For *Round to DFP Short* and *Round to DFP Long*, the wrap adjust depends on the source format: $\beta = 10^{\kappa}$ where $\kappa$ is 192 for DFP Long and 3072 for DFP Extended. - q The value derived when the precise result v is rounded to destination's precision, but assuming an unbounded exponent range. - r The result as defined in Part 1 of this figure. - v Precise result before rounding, assuming unbounded precision and unbounded exponent range. - $\label{eq:Floating-Point-Fraction-Inexact status flag, FPSCR_{Fl}. This status flag is non-sticky. \\$ - $\label{eq:FR} {\sf FR} \qquad {\sf Floating-Point-Fraction-Rounded\ status\ flag,\ FPSCR}_{\sf FR}.$ - OX Floating-Point Overflow Exception status flag, FPSCR<sub>oX</sub>. - TO The system floating-point enabled exception error handler is invoked for the overflow exception if the FE0 and FE1 bits in the machine-state register are set to any mode other than the ignore-exception mode. - TU The system floating-point enabled exception error handler is invoked for the underflow exception if the FE0 and FE1 bits in the machine-state register are set to any mode other than the ignore-exception mode. - TX The system floating-point enabled exception error handler is invoked for the inexact exception if the FE0 and FE1 bits in the machine-state register are set to any mode other than the ignore-exception mode. - T(x) The value x is placed at the target operand location. - Tw(x) The wrapped rounded result x is placed at the target operand location. For all operations except data format conversions, the wrapped rounded result is in the same format and length as normal results at the target location. For data format conversions, the wrapped rounded result is in the same format and length as the source, but rounded to the target-format precision. - UX Floating-Point-Underflow-Exception status flag, FPSCR<sub>UX</sub> - XX Float-Point-Inexact-Exception Status flag, FPSCR<sub>XX</sub>. The flag is a sticky version of FPSCR<sub>FI</sub>. When FPSCR<sub>FI</sub> is set to a new value, the new value of FPSCR<sub>XX</sub> is set to the result of ORing the old value of FPSCR<sub>XX</sub> with the new value of FPSCR<sub>FI</sub>. Figure 78. Rounding and Range Actions (Part 2) # **5.6 DFP Instruction Descriptions** The following sections describe the DFP instructions. When a 128-bit operand is used, it is held in a FPR pair and the instruction mnemonic uses a letter "q" to mean the quad-precision operation. Note that in the following descriptions, FPXp denotes a FPR pair and must address an even-odd pair. If the FPXp field specifies an odd-numbered register, then the instruction form is invalid. The notation FPX[p] means either a FPR, FPX, or a FPR pair, FPXp. For DFP instructions, if a DFP operand is returned, the trailing significand field of the target operand is encoded using preferred DPD codes. ## 5.6.1 DFP Arithmetic Instructions All DFP arithmetic instructions are X-form instructions. They all set the FI and FR status flags, and also set the FPSCR<sub>FPRF</sub> field. Furthermore, they all have an ideal exponent assigned and employ the record bit (Rc). The arithmetic instructions consist of Add, Divide, Multiply, and Subtract. | DFP Ac | DFP Add [Quad] | | | | | | | | | | |---------------|----------------|----------|------------------|----|---|------------------|--|--|--|--| | dadd<br>dadd. | FRT,<br>FRT, | | (Rc=0)<br>(Rc=1) | | | | | | | | | 59 | FRT | FRA | FRB | 21 | 2 | Rc | | | | | | o . | 0 | 1 | 10 | - | | 101 | | | | | | daddq | FRT | p,FRAp,F | FRBp | | | (Rc=0) | | | | | | daddq. | FRT | o,FRAp,F | RBp | | | (Rc=0)<br>(Rc=1) | | | | | | 00 | LEDT. | EDA. | EDD. | T | | D- | | | | | | 63 | FRTp | FRAp | FRBp | L | 2 | Rc | | | | | | 0 | 6 | 11 | 16 | 21 | | 31 | | | | | The DFP operand in FRA[p] is added to the DFP operand in FRB[p]. The result is rounded to the target-format precision under control of the DRN (bits 29:31) of the FPSCR. An appropriate form of the rounded result is selected based on the ideal exponent and is placed in FRT[p]. The ideal exponent is the smaller exponent of the two source operands. Figure 79 summarizes the actions for Add. Figure 79 does not include the setting of the FPSCR<sub>FPRF</sub> field. The FPSCR<sub>FPRF</sub> field is always set to the class and sign of the result, except for an enabled invalid-operation exception, in which case the field remains unchanged. #### Special Registers Altered: FPRF FR FI FX OX UX XX VXSNAN VXISI CR1 (if Rc=1) | DFP Su | DFP Subtract [Quad] | | | | | | | | | | |-----------------|---------------------|----------------------|------|------------------|------------------|----|--|--|--|--| | dsub<br>dsub. | | FRA,FR<br>FRA,FR | | (Rc=0)<br>(Rc=1) | | | | | | | | 59 | FRT | FRA | FRB | | 514 | Rc | | | | | | 0 | 6 | 11 | 16 | 21 | | 31 | | | | | | dsubq<br>dsubq. | | o,FRAp,F<br>o,FRAp,F | | | (Rc=0)<br>(Rc=1) | | | | | | | 63 | FRTp | FRAp | FRBp | | 514 | Rc | | | | | | 0 | 6 | 11 | 16 | 21 | | 31 | | | | | The DFP operand in FRB[p] is subtracted from the DFP operand in FRA[p]. The result is rounded to the target-format precision under control of the DRN (bits 29:31) of the FPSCR. An appropriate form of the rounded result is selected based on the ideal exponent and is placed in FRT[p]. The ideal exponent is the smaller exponent of the two source operands. The execution of Subtract is identical to that of Add, except that the operand in FRB participates in the operation with its sign bit inverted. See Figure 79. The table does not include the setting of the FPSCR<sub>FPRF</sub> field. The FPSCR<sub>FPRF</sub> field is always set to the class and sign of the result, except for an enabled invalid-operation exception, in which case the field remains unchanged. #### **Special Registers Altered:** FPRF FR FI FX OX UX XX VXSNAN VXISI CR1 (if Rc=1) | Operand a | | | | | | | | | | |--------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|------------------------------------------------------------------------------------------------|-------------------------------------------|----------------------------------------------|--|--|--|--| | in FRA[p] is | -∞ | F | +∞ | QNaN | SNaN | | | | | | =∞ | T(-dINF) | T(-dINF) | V <sub>XISI</sub> : T(dNaN) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | F | T(-dINF) | S(a + b) | T(+dINF) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | +∞ | V <sub>XISI</sub> : T(dNaN) | T(+dINF) | T(+dINF) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | QNaN | P(a) | P(a) | P(a) | P(a) | V <sub>XSNAN</sub> : U(b) | | | | | | SNaN | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | | | | | | Explanation: | | | | | | | | | | | a + b | | to b, rounded to the on 5.5.11 on page 1 | e target-format precis<br>60) | sion and returned in | the appropriate | | | | | | +dINF | Default plus infinity | | | | | | | | | | - dINF | Default minus infini | ty. | | | | | | | | | dNaN | Default quiet NaN. | | | | | | | | | | F | All finite numbers, including zeros. | | | | | | | | | | P(x) | The QNaN of operand x is propagated and placed in FRT[p]. | | | | | | | | | | S(x) | ands have the sa<br>ing the case whe | ame sign, the sign on the result is zero. | e sign set by the rule<br>f the result is the sar<br>When the operands<br>es, except round tow | me as the sign of the have opposite signs | e operands, includ-<br>s, the sign of a zero | | | | | | T(x) | The value x is plac | ed in FRT[p]. | | | | | | | | | U(x) | The SNaN of opera | and x is converted to | the corresponding | QNaN and placed in | FRT[p]. | | | | | | V <sub>XISI</sub> | The Invalid-Operation Exception (VXISI) occurs. The result is produced only when the exception is disabled. (See Section 5.5.10.1 "Invalid Operation Exception" on page 156 for the exception actions.) | | | | | | | | | | V <sub>XSNAN</sub> | • | See Section 5.5.10 | NAN) occurs. The re<br>.1 "Invalid Operati | • | • | | | | | Figure 79. Actions: Add #### DFP Multiply [Quad] X-form dmul FRT,FRA,FRB (Rc=0)dmul. FRT,FRA,FRB (Rc=1)59 FRT **FRA FRB** 34 Rc 16 31 dmulq FRTp,FRAp,FRBp (Rc=0)dmulq. FRTp,FRAp,FRBp (Rc=1)63 FRTp FRAp FRBp 34 Rc 16 31 The DFP operand in FRA[p] is multiplied by the DFP operand in FRB[p]. The result is rounded to the target-format precision under control of the DRN (bits 29:31) of the FPSCR. An appropriate form of the rounded result is selected based on the ideal exponent and is placed in FRT[p]. The ideal exponent is the sum of the two exponents of the source operands. Figure 80 summarizes the actions for Multiply. Figure 80 does not include the setting of the $FPSCR_{F-PRF}$ field. The $FPSCR_{F-PRF}$ field is always set to the class and sign of the result, except for an enabled invalid-operation exception, in which case the field remains unchanged. # Special Registers Altered: FPRF FR FI FX OX UX XX VXSNAN VXIMZ CR1 (if Rc=1) | Operand a | Actions for Multiply (a*b) when operand b in FRB[p] is | | | | | | | | | |--------------|--------------------------------------------------------|---------------------------|-----------------------------|---------------------------|---------------------------|--|--|--|--| | in FRA[p] is | 0 | Fn | ∞ | QNaN | SNaN | | | | | | 0 | S(a * b) | S(a * b) | V <sub>XIMZ</sub> : T(dNaN) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | Fn | S(a * b) | S(a * b) | S(dINF) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | ∞ | V <sub>XIMZ</sub> : T(dNaN) | S(dINF) | S(dINF) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | QNaN | P(a) | P(a) | P(a) | P(a) | V <sub>XSNAN</sub> : U(b) | | | | | | SNaN | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | | | | | #### Explanation: a \* b The value a multiplied by b, rounded to the target-format precision and returned in the appropriate form. (See Section 5.5.11 on page 160) dINF Default infinity. dNaN Default quiet NaN. Fn Finite nonzero number (includes both normal and subnormal numbers). P(x) The QNaN of operand x is propagated and placed in FRT[p]. S(x) The value x is placed in FRT[p] with the sign set to the exclusive-OR of the source-operand signs. T(x) The value x is placed in FRT[p]. U(x) The SNaN of operand x is converted to the corresponding QNaN and placed in FRT[p]. V<sub>XIMZ</sub>: The Invalid-Operation Exception (VXIMZ) occurs. The result is produced only when the exception is disabled. (See Section 5.5.10.1 "Invalid Operation Exception" on page 156 for the exception actions.) V<sub>XSNAN:</sub> The Invalid-Operation Exception (VXSNAN) occurs. The result is produced only when the exception is disabled. (See Section 5.5.10.1. "Invalid Operation Exception" on page 156 for the exception is disabled. (See Section 5.5.10.1 "Invalid Operation Exception" on page 156 for the exception actions.) Figure 80. Actions: Multiply Operand a in FRA[p] is | DFP Di | vide [G | | <b>X</b> -i | form | | | |-----------------|------------------|------------------|-------------|------------------|------------------|----| | ddiv<br>ddiv. | | FRA,FR<br>FRA,FR | | ( | (Rc=0)<br>(Rc=1) | | | 59 | FRT | FRA | FRB | | 546 | Rc | | 0 | 6 | 11 | 16 | 21 | | 31 | | ddivq<br>ddivq. | FRT <sub>I</sub> | | | (Rc=0)<br>(Rc=1) | | | | 63 | FRTp | FRAp | FRBp | | 546 | Rc | | 0 | 6 | 11 | 16 | 21 | | 31 | The DFP operand in FRA[p] is divided by the DFP operand in FRB[p]. The result is rounded to the target-format precision under control of the DRN (bits 29:31) of the FPSCR. An appropriate form of the rounded result is selected based on the ideal exponent and is placed in FRT[p]. The ideal exponent is the difference of subtracting the exponent of the divisor from the exponent of the dividend. 0 Figure 81 summarizes the actions for Divide. Figure 81 does not include the setting of the FPSCR<sub>FPRF</sub> field. The FPSCR<sub>FPRF</sub> field is always set to the class and sign of the result, except for an enabled invalid-operation and enabled zero-divide exceptions, in which cases the field remains unchanged. #### **Special Registers Altered:** FPRF FR FI FX OX UX ZX XX VXSNAN VXIDI VXZDZ (if Rc=1) CR1 **QNaN** **SNaN** | 0 | V <sub>XZDZ</sub> : T(dNaN) | S(a ÷ b) | S(zt) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | |---------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|----------------------------------------------|---------------------------|---------------------------|--|--|--| | Fn | Zx: S(dINF) | S(a ÷ b) | S(zt) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | ∞ | S(dINF) | S(dINF) | V <sub>XIDI</sub> : T(dNaN) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | QNaN | P(a) | P(a) | P(a) | P(a) | V <sub>XSNAN</sub> : U(b) | | | | | SNaN | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | | | | | Explanation: | | | | | | | | | | a ÷ b | | by b, rounded to the bn 5.5.11 on page 16 | e target-format preci<br>60.) | sion and returned in | the appropriate | | | | | dINF | Default infinity. | | | | | | | | | dNaN | Default quiet NaN. | | | | | | | | | Fn | Finite nonzero num | ber (includes both n | ormal and subnorma | al numbers). | | | | | | P(x) | The QNaN of opera | and x is propagated | and placed in FRT[p] | ]. | | | | | | S(x) | The value x is placed in FRT[p] with the sign set to the exclusive-OR of the source-operand signs. | | | | | | | | | T(x) | The value x is placed in FRT[p]. | | | | | | | | | U(x) | The SNaN of operand x is converted to the corresponding QNaN and placed in FRT[p]. | | | | | | | | | V <sub>XIDI</sub> : | | | ) occurs. The result valid Operation Exce | | | | | | | V <sub>XSNAN:</sub> | The Invalid-Operati is disabled. (See actions.) | | IAN) occurs. The res<br>Invalid Operation Ex | | • | | | | | V <sub>XZDZ:</sub> | The Invalid-Operation Exception (VXZDZ) occurs. The result is produced only when the exception is disabled. (See Section 5.5.10.1 "Invalid Operation Exception" on page 156 for the exception actions.) | | | | | | | | | zt | True zero (zero sig | nificand and most ne | egative exponent). | | | | | | | Zx | The Zero-Divide Exception occurs. The result is produced only when the exception is disabled (See Section 5.5.10.2 "Zero Divide Exception" on page 157 for the exception actions.) | | | | | | | | | igure 81. Act | ions: Divide | | | | | | | | Actions for Divide ( $a \div b$ ) when operand b in FRB[p] is # **5.6.2 DFP Compare Instructions** The DFP compare instructions consist of the *Compare Ordered* and *Compare Unordered* instructions. The compare instructions do not provide the record bit. The comparison sets the designated CR field to indicate the result. The FPSCR<sub>FPCC</sub> is set in the same way The codes in the CR field BF and $FPSCR_{FPCC}$ are defined for the DFP compare operations as follows. #### **Bit Name Description** 0 FL (FRA[p]) < (FRB[p])1 FG (FRA[p]) > (FRB[p]) 2 FE (FRA[p]) = (FRB[p]) 3 FU (FRA[p]) ? (FRB[p]) # DFP Compare Unordered [Quad] X-form dcmpu BF,FRA,FRB | | 59 | BF | // | FRA | FRB | 642 | / | |---|----|----|----|-----|-----|-----|----| | 0 | | 6 | 9 | 11 | 16 | 21 | 31 | dcmpuq BF,FRAp,FRBp | ( | 63 | BF | // | FRAp | FRBp | 642 | / | |---|----|----|----|------|------|-----|----| | 0 | | 6 | 9 | 11 | 16 | 21 | 31 | The DFP operand in FRA[p] is compared to the DFP operand in FRB[p]. The result of the compare is placed into CR field BF and the FPSCR<sub>FPCC</sub>. ## **Special Registers Altered:** CR field BF FPCC FX VXSNAN | Operand a in | Action | s for Compare Un | ordered (a:b) whe | n operand b in FR | B[p] is | | | | |--------------------|------------------------|----------------------------------------------------------|------------------------|------------------------|------------------------|--|--|--| | FRA[p] is | -∞ | F | +∞ | QNaN | SNaN | | | | | -∞ | AeqB | AltB | AltB | AuoB | Fu, V <sub>XSNAN</sub> | | | | | F | AgtB | C(a:b) | AltB | AuoB | Fu, V <sub>XSNAN</sub> | | | | | +∞ | AgtB | AgtB | AeqB | AuoB | Fu, V <sub>XSNAN</sub> | | | | | QNaN | AuoB | AuoB | AuoB | AuoB | Fu, V <sub>XSNAN</sub> | | | | | SNaN | Fu, V <sub>XSNAN</sub> | Fu, V <sub>XSNAN</sub> | Fu, V <sub>XSNAN</sub> | Fu, V <sub>XSNAN</sub> | Fu, V <sub>XSNAN</sub> | | | | | Explanation: | | | | | | | | | | C(a:b) | Algebraic comparis | son. See the table | below. | | | | | | | F | All finite numbers, | including zeros. | | | | | | | | AeqB | CR field BF and FF | PSCR <sub>FPCC</sub> are set | to 0b0010. | | | | | | | AgtB | CR field BF and FF | PSCR <sub>FPCC</sub> are set | to 0b0100. | | | | | | | AltB | CR field BF and FF | PSCR <sub>FPCC</sub> are set | to 0b1000. | | | | | | | AuoB | CR field BF and FF | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b0001. | | | | | | | | V <sub>XSNAN</sub> | The invalid-operat | ion exception (VXS | NAN) occurs. See | Section 5.5.10.1 for | actions. | | | | | Relation of Value a to Value b | Action for C(a:b) | |--------------------------------|-------------------| | a = b | AeqB | | a < b | AltB | | a > b | AgtB | Figure 82. Actions: Compare Unordered # DFP Compare Ordered [Quad] X-form dcmpo BF,FRA,FRB | | 59 | BF | // | FRA | FRB | 130 | / | |---|----|----|----|-----|-----|-----|----| | 0 | | 6 | 9 | 11 | 16 | 21 | 31 | dcmpoq BF,FRAp,FRBp | 63 | BF | // | FRAp | FRBp | 130 | / | |----|----|----|------|------|-----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | The DFP operand in FRA[p] is compared to the DFP operand in FRB[p]. The result of the compare is placed into CR field BF and the FPSCR<sub>FPCC</sub>. #### **Special Registers Altered:** CR field BF FPCC FX VXSNAN VXVC | Operand a in | Actions for Compare ordered (a:b) when operand b in FRB[p] is | | | | | | | | |--------------|---------------------------------------------------------------|----------------------------------------------------------|------------------------|-----------------------------------------------|------------------------|--|--|--| | FRA[p] is | -∞ | F | +∞ | QNaN | SNaN | | | | | -∞ | AeqB | AltB | AltB | AuoB, V <sub>XVC</sub> | AuoB, V <sub>XSV</sub> | | | | | F | AgtB | C(a:b) | AltB | AuoB, V <sub>XVC</sub> | AuoB, V <sub>XSV</sub> | | | | | +∞ | AgtB | AgtB | AeqB | AuoB, V <sub>XVC</sub> | AuoB, V <sub>XSV</sub> | | | | | QNaN | AuoB, V <sub>XVC</sub> | AuoB, V <sub>XVC</sub> | AuoB, V <sub>XVC</sub> | AuoB, V <sub>XVC</sub> | AuoB, V <sub>XSV</sub> | | | | | SNaN | AuoB, V <sub>XSV</sub> | AuoB, V <sub>XSV</sub> | AuoB, V <sub>XSV</sub> | AuoB, V <sub>XSV</sub> | AuoB, V <sub>XSV</sub> | | | | | Explanation: | | | | | | | | | | C(a:b) | Algebraic comparis | son. See the table b | oelow | | | | | | | F | All finite numbers, | including zeros | | | | | | | | AeqB | CR field BF and FI | PSCR <sub>FPCC</sub> are set | to 0b0010. | | | | | | | AgtB | CR field BF and FI | PSCR <sub>FPCC</sub> are set | to 0b0100. | | | | | | | AltB | CR field BF and FI | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b1000. | | | | | | | | AuoB | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b0001. | | | | | | | | | $V_{XSV}$ | • | • | • | onally, if the except<br>e Section 5.5.10.1 f | | | | | | Relation of Value a to Value b | Action for C(a:b) | |--------------------------------|-------------------| | a = b | AeqB | | a < b | AltB | | a > b | AgtB | The invalid-operation exception (VXVC) occurs. See Section 5.5.10.1 for actions. Figure 83. Actions: Compare Ordered $V_{XVC}$ # 5.6.3 DFP Test Instructions The DFP test instructions consist of the *Test Data Class*, *Test Data Group*, *Test Exponent*, and *Test Significance* instructions, and they do not provide the record bit. The test instructions set the designated CR field to indicate the result. The $\mathsf{FPSCR}_\mathsf{FPCC}$ is set in the same way. #### DFP Test Data Class [Quad] Z22-form dtstdc BF,FRA,DCM | 59 | BF | // | FRA | DCM | 194 | / | |----|----|----|-----|-----|-----|----| | 0 | 6 | 9 | 11 | 16 | 22 | 31 | dtstdcq BF,FRAp,DCM | 63 | BF | // | FRAp | DCM | 194 | / | |----|----|----|------|-----|-----|----| | 0 | 6 | 9 | 11 | 16 | 22 | 31 | Let the DCM (Data Class Mask) field specify one or more of the 6 possible data classes, where each bit corresponds to a specific data class. # DCM Bit Data Class 0 Zero 1 Subnormal 2 Normal 3 Infinity 4 Quiet NaN 5 Signaling NaN CR field BF and FPSCR<sub>FPCC</sub> are set to indicate the sign of the DFP operand in FRA[p] and whether the data class of the DFP operand in FRA[p] matches any of the data classes specified by DCM. | Field | Meaning | |-------|--------------------------------| | 0000 | Operand positive with no match | | 0010 | Operand positive with match | | 1000 | Operand negative with no match | | 1010 | Operand negative with match | #### **Special Registers Altered:** CR field BF FPCC ## DFP Test Data Group [Quad] Z22-form dtstdg BF,FRA,DGM | | | .,, | | | | | |----|--------|-----|-----|-----|-----|----| | 59 | ) BF | // | FRA | DGM | 226 | / | | | | | | | 00 | | | 0 | 6 | 9 | 11 | 16 | 22 | 31 | dtstdgq BF,FRAp,DGM | | 63 | BF | // | FRAp | DGM | 226 | / | |---|----|----|----|------|-----|-----|----| | C | ) | 6 | 9 | 11 | 16 | 22 | 31 | Let the DGM (Data Group Mask) field specify one or more of the 6 possible data groups, where each bit corresponds to a specific data group. The term extreme exponent means either the maximum exponent, $X_{\text{max}}$ , or the minimum exponent, $X_{\text{min}}$ . #### **DGM Bit** Data Group | _ | | |---|----------------------------------------------------------------------------| | 0 | Zero with non-extreme exponent | | 1 | Zero with extreme exponent | | 2 | Subnormal or (Normal with extreme exponent) | | 3 | Normal with non-extreme exponent and leftmost zero digit in significand | | 4 | Normal with non-extreme exponent and leftmost nonzero digit in significand | | 5 | Special symbol (Infinity, QNaN, or SNaN) | | | | CR field BF and FPSCR<sub>FPCC</sub> are set to indicate the sign of the DFP operand in FRA[p] and whether the data group of the DFP operand in FRA[p] matches any of the data groups specified by DGM. | Field | Meaning | |-------|--------------------------------| | 0000 | Operand positive with no match | | 0010 | Operand positive with match | | 1000 | Operand negative with no match | | 1010 | Operand negative with match | #### **Special Registers Altered:** CR field BF FPCC # DFP Test Exponent [Quad] X-form dtstex BF,FRA,FRB | 59 | BF | // | FRA | FRB | 162 | / | |----|----|----|-----|-----|-----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | dtstexq BF,FRAp,FRBp | | 63 | BF | // | FRAp | FRBp | 162 | / | |---|----|----|----|------|------|-----|----| | 0 | | 6 | 9 | 11 | 16 | 21 | 31 | The exponent value (Ea) of the DFP operand in FRA[p] is compared to the exponent value (Eb) of the DFP operand in FRB [p]. The result of the compare is placed into CR field BF and the $\mathsf{FPSCR}_{\mathsf{FPCC}}$ . The codes in the CR field BF and $FPSCR_{FPCC}$ are defined for the *DFP Test Exponent* operations as follows. | Bit | Description | |-----|-------------| | 0 | Ea < Eb | | 4 | Fa. Fh | 1 Ea > Eb 2 Ea = Eb 3 Ea? Eb #### **Special Registers Altered:** CR field BF FPCC | Operand a in | Actions for Te | est Exponent (Ea:E | Eb) when operand | b in FRB[p] is | | |--------------|----------------------------------------------------------|--------------------------------|------------------|----------------|--| | FRA[p] is | F | ∞ | QNaN | SNaN | | | F | C(Ea:Eb) | AuoB | AuoB | AuoB | | | ∞ | AuoB | AeqB | AuoB | AuoB | | | QNaN | AuoB | AuoB | AeqB | AeqB | | | SNaN | AuoB | AuoB | AeqB | AeqB | | | Explanation: | • | | | | | | C(Ea:Eb) | Algebraic compari | son. See the table | below. | | | | F | All finite numbers, | including zeros | | | | | AeqB | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b0010. | | | | | | AgtB | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b0100. | | | | | | AltB | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b1000. | | | | | | AuoB | | PSCR <sub>EPCC</sub> are set t | | | | | Relation of Value Ea to Value Eb | Action for C(Ea:Eb) | |----------------------------------|---------------------| | Ea = Eb | AeqB | | Ea < Eb | AltB | | Ea > Eb | AgtB | Figure 84. Actions: Test Exponent # DFP Test Significance [Quad] X-form dtstsf BF,FRA,FRB | | 59 | BF | // | FRA | FRB | 674 | / | |---|----|----|----|-----|-----|-----|----| | 0 | | 6 | 9 | 11 | 16 | 21 | 31 | dtstsfq BF,FRA,FRBp | | 63 | BF | // | FRA | FRBp | 674 | / | |---|----|----|----|-----|------|-----|----| | 0 | | 6 | 9 | 11 | 16 | 21 | 31 | Let k be the contents of bits 58:63 of FRA that specifies the reference significance. The number of significant digits of the DFP operand in FRB[p], NSDb, is compared to the reference significance, k. For this instruction, the number of significant digits of the value 0 is considered to be zero. The result of the compare is placed into CR field BF and the FPSCR<sub>EPCC</sub> as follows. #### Bit Description - 0 $k \neq 0$ and k < NSDb - 1 $k \neq 0$ and k > NSDb, or k = 0 - 2 $k \neq 0$ and k = NSDb - 3 k? NSDb #### **Special Registers Altered:** CR field BF FPCC #### Programming Note The reference significance can be loaded into a FPR using a Load Float as Integer Word Algebraic instruction | Actions for Test Significance when the operand in FRB[p] is | | | | | | |-------------------------------------------------------------|----------------------------------------------------------|---------------|-------------|--|--| | F | ∞ | QNaN | SNaN | | | | C(k: NSDb) | AuoB | AuoB | AuoB | | | | Explanation: | | | | | | | C(k: NSDb) Algebraic comparison. See the table below. | | | | | | | F | All finite nu | mbers, includ | ding zeros. | | | | AeqB | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b0010. | | | | | | AgtB | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b0100. | | | | | | AltB | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b1000. | | | | | | AuoB | CR field BF and FPSCR <sub>FPCC</sub> are set to 0b0001. | | | | | | Relation of Value NSDb to Value k | Action for C(k:NSDb) | |----------------------------------------|----------------------| | $k \neq 0$ and $k = NSDb$ | AeqB | | k ≠ 0 and k < NSDb | AltB | | $k \neq 0$ and $k > NSDb$ , or $k = 0$ | AgtB | Figure 85. Actions: Test Significance # 5.6.4 DFP Quantum Adjustment Instructions The Quantum Adjustment operations consist of the Quantize, Quantize Immediate, Reround, and Round To FP Integer operations. The *Quantum Adjustment* instructions are Z23-form instructions and have an immediate RMC (Rounding-Mode-Control) field, which specifies the rounding mode used. For *Quantize*, *Quantize Immediate*, and *Reround*, the RMC field contains the primary encoding. For *Round to FP Integer*, the field contains either pri- mary or secondary encoding, depending on the setting of a RMC-encoding-selection bit. See Section 5.5.2 "Rounding Mode Specification" on page 151 for the definition of RMC encoding. All *Quantum Adjustment* instructions set the FI and FR status flags, and also set the FPSCR<sub>FPRF</sub> field. The record bit is provided to each of these instructions. They return the target operand in a form with the ideal exponent. ## DFP Quantize Immediate [Quad] Z23-form | dquai | TE,FRT,FRB,RMC | (Rc=0) | |--------|----------------|--------| | dquai. | TE,FRT,FRB,RMC | (Rc=1) | | 59 | FRT | TE | FRB | RMC | 67 | Rc | |----|-----|----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 23 | 31 | | dquaiq | TE,FRTp,FRBp,RMC | (Rc=0) | |---------|------------------|--------| | dquaiq. | TE,FRTp,FRBp,RMC | (Rc=1) | | 63 | FRTp | TE | FRBp | RMC | 67 | Rc | |----|------|----|------|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 23 | 31 | The DFP operand in FRB[p] is converted and rounded to the form with the exponent specified by TE based on the rounding mode specified in the RMC field. TE is a 5-bit signed binary integer. The result of that form is placed in FRT[p]. The sign of the result is the same as the sign of the operand in FRB[p]. The ideal exponent is the exponent specified by TE. When the value of the operand in FRB[p] is greater than $(10^p-1) \times 10^{TE}$ , where p is the format precision, an invalid operation exception is recognized. When the delivered result differs in value from the operand in FRB[p], an inexact exception is recognized. No underflow exception is recognized by this operation, regardless of the value of the operand in FRB[p]. The FPSCR<sub>FPRF</sub> field is always set to the class and sign of the result, except for an enabled invalid-operation exception, in which case the field remains unchanged. #### **Special Registers Altered:** FPRF FR FI FX XX VXSNAN VXCVI CR1 (if Rc=1) #### Programming Note - *DFP Quantize Immediate* can be used to adjust values to a form having the specified exponent in the range -16 to 15. If the adjustment requires the significand to be shifted left, then: - if the result would cause overflow from the most significant digit, the result is a default QNaN.; - otherwise the result is the adjusted value (left shifted with matching exponent). If the adjustment requires the significand to be shifted right, the result is rounded based on the value of the RMC field. DFP Quantize Immediate can round a value to a specific number of fractional digits. Consider the computation of sales tax. Values expressed in U.S. dollars have 2 fractional digits, and sales tax rates typically have 3 fractional digits. The product of value and rate will yield 5 fractional digits. For example: This result needs to be rounded to the penny to compute the correct tax of \$3.00. The following sequence computes the sales tax assuming the pre-tax total is in FRA and the tax rate is in FRB. The *DFP Quantize Immediate* instruction rounds the product (FRA \* FRB) to 2 fractional digits (TE field = -2) using Round to nearest, ties away from 0 (RMC field = 2). The quantized and rounded result is placed in FRT. dmul f0,FRA,FRB dquai -2,FRT,f0,2 | DFP Quantize [Quad] | | | | | | <b>Z</b> 23 | 8-form | |---------------------|------------------------------------|--------|---------|------|----|-------------|------------------| | dqua<br>dqua. | FRT,FRA,FRB,RMC<br>FRT,FRA,FRB,RMC | | | | | | (Rc=0)<br>(Rc=1) | | 59 | FRT | FRA | FRB | RMC | | 3 | Rc | | 0 | 6 | 11 | 16 | 21 | 23 | | 31 | | dquaq | FR | Tp,FRA | p,FRBp | ,RMC | | | (Rc=0) | | dquaq. | FR | Tp,FRA | .p,FRBp | ,RMC | | | (Rc=1) | | 63 | FRTp | FRAp | FRBp | RMC | | 3 | Rc | | 0 | 6 | 11 | 16 | 21 | 23 | | 31 | The DFP operand in register FRB[p] is converted and rounded to the form with the same exponent as that of the DFP operand in FRA[p] based on the rounding mode specified in the RMC field. The result of that form is placed in FRT[p]. The sign of the result is the same as the sign of the operand in FRB[p]. The ideal exponent is the exponent specified in FRA[p]. When the value of the operand in FRB[p] is greater than $(10^p-1) \times 10^{Ea}$ , where p is the format precision and Ea is the exponent of the operand in FRA[p], an invalid operation exception is recognized. When the delivered result differs in value from the operand in FRB[p], an inexact exception is recognized. No underflow exception is recognized by this operation, regardless of the value of the operand in FRB[p]. Figure 87 and Figure 88 summarize the actions. The tables do not include the setting of the $FPSCR_{FPRF}$ field. The $FPSCR_{FPRF}$ field is always set to the class and sign of the result, except for an enabled invalid-operation exception, in which case the field remains unchanged. #### **Special Register Altered:** FPRF FR FI FX XX VXSNAN VXCVI CR1 (if Rc=1) #### **Programming Note** - *DFP Quantize* can be used to adjust one DFP value (FRB[p]) to a form having the same exponent as a second DFP value (FRA[p]). If the adjustment requires the significand to be shifted left, then: - if the result would cause overflow from the most significant digit, the result is a default QNaN.; - otherwise the result is the adjusted value (left shifted with matching exponent). If the adjustment requires the significand to be shifted right, the result is rounded based on the value of the RMC field. Figure 86 shows examples of these adjustments. | FRA | FRB | FRT when RMC=1 | FRT when RMC=2 | |--------------------------------|----------------------------------------|----------------------------------------|----------------------------------------| | 1 (1 x 10 <sup>0</sup> ) | 9. (9 x 10 <sup>0</sup> ) | 9 (9 x 10 <sup>0</sup> ) | 9 (9 x 10 <sup>0</sup> ) | | 1.00 (100 x 10 <sup>-2</sup> ) | 9. (9 x 10 <sup>0</sup> ) | 9.00 (900 x 10 <sup>-2</sup> ) | 9.00 (900 x 10 <sup>-2</sup> ) | | 1 (1 x 10 <sup>0</sup> ) | 49.1234 (491234 x 10 <sup>-4</sup> ) | 49 (49 x 10 <sup>0</sup> ) | 49 (49 x 10 <sup>0</sup> ) | | 1.00 (100 x 10 <sup>-2</sup> ) | 49.1234 (491234 x 10 <sup>-4</sup> ) | 49.12 (4912 x 10 <sup>-2</sup> ) | 49.12 (4912 x 10 <sup>-2</sup> ) | | 1 (1 x 10 <sup>0</sup> ) | 49.9876 (499876 x 10 <sup>-4</sup> ) | 49 (49 x 10 <sup>0</sup> ) | 50 (50 x 10 <sup>0</sup> ) | | 1.00 (100 x 10 <sup>-2</sup> ) | 49.9876 (499876 x 10 <sup>-4</sup> ) | 49.98 (4998 x 10 <sup>-2</sup> ) | 49.99 (4999 x 10 <sup>-2</sup> ) | | 0.01 (1 x 10 <sup>-2</sup> ) | 49.9876 (499876 x 10 <sup>-4</sup> ) | 49.98 (4998 x 10 <sup>-2</sup> ) | 49.99 (4999 x 10 <sup>-2</sup> ) | | 1 (1 x 10 <sup>0</sup> ) | 99999999999999999999999999999999999999 | 99999999999999999999999999999999999999 | 99999999999999999999999999999999999999 | | 1.0 (10 x 10 <sup>-1</sup> ) | 99999999999999999999999999999999999999 | QNaN | QNaN | Figure 86. DFP Quantize examples | Operand a | Actions for Quantize when operand b in FRB[p] is | | | | | | | | | |--------------------|------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------|-------------------------------|---------------------------|---------------------------|--|--|--|--| | in FRA[p] is | 0 | Fn | ∞ | QNaN | SNaN | | | | | | 0 | * | * | V <sub>XCVI</sub> : T(dNaN) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | Fn | * | * | V <sub>XCVI</sub> : T(dNaN) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | • | V <sub>XCVI</sub> : T(dNaN) | V <sub>XCVI</sub> : T(dNaN) | T(dINF) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | QNaN | P(a) | P(a) | P(a) | P(a) | V <sub>XSNAN</sub> : U(b) | | | | | | SNaN | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | V <sub>XSNAN</sub> : U(a) | | | | | | Explanation: | | | | | | | | | | | * | See next table. | | | | | | | | | | dINF | Default infinity | | | | | | | | | | dNaN | Default quiet NaN | | | | | | | | | | Fn | Finite nonzero numb | bers (includes both s | ubnormal and normal | numbers) | | | | | | | P(x) | The QNaN of opera | nd x is propagated a | nd placed in FRT[p] | | | | | | | | T(x) | The value x is place | ed in FRT[p] | | | | | | | | | U(x) | The SNaN of opera | nd x is converted to t | he corresponding QN | laN and placed in F | RT[p]. | | | | | | V <sub>XCVI</sub> | The Invalid-Operation Exception (VXCVI) occurs. The result is produced only when the exception is disabled. (See Section 5.5.10.1 for actions) | | | | | | | | | | V <sub>XSNAN</sub> | | on Exception (VXSNA<br>Section 5.5.10.1 for a | AN) occurs. The resulactions) | t is produced only v | when the exception | | | | | ## Figure 87. Actions (part 1) Quantize | | Actions for Quantize when operand b in FRB[p] is | | | | | | | |---------------------|--------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------|--|--|--|--| | | | 0 | Fn | | | | | | Te < Se | $V_b > (10^p - 1) \times 10^{Te}$ | E(0) | V <sub>XCVI</sub> : T(dNaN) | | | | | | | $V_b \le (10^p - 1) \times 10^{Te}$ | E(0) | L(b) | | | | | | Te = Se | | E(0) | W(b) | | | | | | Te > Se | | E(0) | QR(b) | | | | | | Explanation | n: | | | | | | | | dNaN | Default quiet NaN | | | | | | | | E(0) | The value of zero with th | e exponent value Te is place | d in FRT[p]. | | | | | | L(x) | The operand x is converted to the form with the exponent value Te. | | | | | | | | р | The precision of the form | at. | | | | | | | QR(x) | • | d to the result of the form with esult of that form is placed in | the exponent value Te based on the specified FRT[p]. | | | | | | Se | The exponent of the ope | rand in FRB[p]. | | | | | | | Te | The target exponent; FR | A[p] for <i>dqua</i> [ <i>q</i> ], or TE, a 5-b | it signed binary integer for <i>dquai</i> [ <i>q</i> ]. | | | | | | T(x) | The value x is placed in I | FRT[p]. | | | | | | | $V_b$ | The value of the operand | l in FRB[p]. | | | | | | | W(x) | The value and the form of | of operand x is placed in FRT | [p]. | | | | | | V <sub>XCVI</sub> : | • | The Invalid-Operation Exception (VXCVI) occurs. The result is produced only when the exception is disabled. (See Section 5.5.10.1 for actions.) | | | | | | ## Figure 88. Actions (part2) Quantize #### DFP Reround [Quad] Z23-form | drrnd<br>drrnd. | FRT,FRA,FRB,RMC<br>FRT,FRA,FRB,RMC | | | | | | (Ro | :=0)<br>c=1 | |-------------------|------------------------------------|-----|------------------|-----|----|----|-----|--------------| | 59 | FRT | FRA | FRB | RMC | | 35 | | Rc | | 0 | 6 | 11 | 16 | 21 | 23 | | | 31 | | drrndq<br>drrndq. | | | ,FRBp,<br>,FRBp, | | | | (Ro | :=0)<br>:=1) | | Į | 0 | 6 | 11 | 16 | 21 | 23 | 31 | |---|---|------|-----|----|----|----|----| | ı | | FRIP | FKA | | | 35 | RC | Let k be the contents of bits 58:63 of FRA that specifies the reference significance. When the DFP operand in FRB[p] is a finite number, and if the reference significance is zero, or if the reference significance is nonzero and the number of significant digits of the source operand is less than or equal to the reference significance, then the value and the form of the source operand is placed in FRT[p]. If the reference significance is nonzero and the number of significant digits of the source operand is greater than the reference significance, then the source operand is converted and rounded to the number of significant digits specified in the reference significance based on the rounding mode specified in the RMC field. The result of the form with the specified number of significant digits is placed in FRT[p]. The sign of the result is the same as the sign of the operand in FRB[p]. For this instruction, the number of significant digits of the value 0 is considered to be zero. The ideal exponent is the greater value of the exponent of the operand in FRB[p] and the referenced exponent. The referenced exponent is the resultant exponent if the operand in FRB[p] would have been converted and rounded to the number of significant digits specified in the reference significance based on the rounding mode specified in the RMC field. If the exponent of the rounded result of the form that has the specified number of significant digits would be greater than $X_{max}$ , an invalid operation exception (VXCVI) occurs. When the invalid-operation exception occurs, and if the exception is disabled, a default QNaN is returned. When an invalid-operation exception occurs, no inexact exception is recognized. In the absence of an invalid-operation exception, if the result differs in value from the operand in FRB[p], an inexact exception is recognized. This operation causes neither an overflow nor an underflow exception. Figure 90 summarizes the actions for *Reround*. The table does not include the setting of the FPSCR<sub>FPRF</sub> field. The FPSCR<sub>FPRF</sub> field is always set to the class and sign of the result, except for an enabled invalid- operation exception, in which case the field remains unchanged. #### **Special Registers Altered:** ``` FPRF FR FI FX XX VXSNAN VXCVI CR1 (if Rc=1) ``` #### Programming Note - DFP Reround can be used to adjust a DFP value (FRB[p]) to have no more than a specified number (FRA[p]58:63) of significant digits. The result (FRT[p]) is right-justified leaving the specified number of digits and rounded as specified by the RMC field. If rounding increases the number of significant digits, the result is adjusted again (the significand is shifted right 1 digit and the exponent is incremented by 1). Figure 89 has example results from DFP Reround for 1, 2, and 10 significant digits. #### Programming Note DFP Reround is primarily used to round a DFP value to a specific number of digits before conversion to string format for printing or display. Another use for DFP Reround is to obtain the effective exponent of the most significant digit by specifying a reference significance of 1. The exponent can be extracted and used to compute the number of significant digits or to left-justify a value. For example, the following sequence computes the number of significant digits and returns it as an integer. FRB is the DFP value for which we want the number of significant digits; f13 contains the reference significance value 0x000000000000001; and r1 is the stack pointer, with free space for doublewords at offsets -8 and -16. These doublewords are used to transfer the biased exponents from the FPRs to GPRs for integer computation. R3 contains the result of E(reround(1,FRA)) - E(FRA) + 1, where E(x) represents the biased exponent of x. ``` dxex f0,FRB stfd f0,-16(r1) drrnd f1,f13,FRB,1 # reround 1 digit toward 0 dxex f1,f1 stfd f1,-8(r1) lfd r11,-16(r1) lfd r3,-8(r1) subf r3,r11,r3 addi r3,r3,1 ``` Given the value 412.34 the result is $E(4 \times 102) - E(41234 \times 10-2) + 1 = (398+2) - (398-2) + 1 = 400 - 396 + 1 = 5$ . Additional code is required to detect and handle special values like Subnormal, Infinity, and NAN. | ı | FRA <sub>58:63</sub> (binary) | FRB | FRT when RMC=1 | FRT when RMC=2 | |---|-------------------------------|----------------------------------------|----------------------------------------------------|---------------------------------------------------| | ı | 1 | 0.41234 (41234 × 10 <sup>-5</sup> ) | 0.4 (4 × 10 <sup>-1</sup> ) | 0.4 (4 × 10 <sup>-1</sup> ) | | ı | 1 | 4.1234 (41234 × 10 <sup>-4</sup> ) | 4 (4 × 10 <sup>0</sup> ) | $4 (4 \times 10^{0})$ | | ı | 1 | 41.234 (41234 × 10 <sup>-3</sup> ) | 4 (4 × 10 <sup>1</sup> ) | 4 (4×10 <sup>1</sup> ) | | ı | 1 | 412.34 (41234 × 10 <sup>-2</sup> ) | $4 (4 \times 10^2)$ | $4 (4 \times 10^2)$ | | ı | 2 | 0.491234 (491234 × 10 <sup>-6</sup> ) | 0.49 (49 × 10 <sup>-2</sup> ) | 0.49 (49 × 10 <sup>-2</sup> ) | | I | 2 | 0.499876 (499876 × 10 <sup>-6</sup> ) | 0.49 (49 × 10 <sup>-2</sup> ) | 0.50 (50 × 10 <sup>-2</sup> ) | | ı | 2 | 0.999876 (999876 × 10 <sup>-6</sup> ) | 0.99 (99 × 10 <sup>-2</sup> ) | 1.0 (10 × 10 <sup>-1</sup> ) | | ı | 10 | 0.491234 (491234 × 10 <sup>-6</sup> ) | 0.491234 (491234 × 10 <sup>-6</sup> ) | 0.491234 (491234 × 10 <sup>-6</sup> ) | | I | 10 | 999.999 (999999 × 10 <sup>-3</sup> ) | 999.999 (999999 × 10 <sup>-3</sup> ) | 999.999 (999999 × 10 <sup>-3</sup> ) | | | 10 | 99999999999999999999999999999999999999 | 9.999999999E+14<br>(9999999999 × 10 <sup>5</sup> ) | 1.00000000E+15<br>(1000000000 × 10 <sup>6</sup> ) | Figure 89. DFP Reround examples #### Programming Note - DFP Reround combined with DFP Quantize can be used to left justify a value (as needed by the frexp function). FRB is the DFP value for which we want to left justify; f13 contains the reference significance value 0x000000000000000001; and r1 is the stack pointer, with free space for a doubleword at offset -8. This doubleword is used to transfer the biased exponents from the FPR to a GPR, for integer computation. The adjusted biased exponent (+ format precision - 1) is transferred back into an FPR so it can be inserted into the rerounded value. The adjusted rerounded value becomes the quantize reference value. The quantize instruction returns the left justified result in FRT. ``` drrnd f1,f13,FRB,1 # reround 1 digit toward 0 dxex f0,f1 stfd f0, -8(r1) lfd r11,-8(r1) addi r11,r11,15 # biased exp + precision - 1 lfd r11,-8(r1) stfd f0,-8(r1) diex f1,f0,f1 # adjust exponent FRT,f1,f0,1 # quantize to adjusted dqua exponent ``` | | | Actions for Reround when operand b in FRB[p] is | | | | | | | | |----------------------------------------|-------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------|-------------------|--------------------|---------------------------|--|--|--|--| | | 0* | Fn | ∞ | QNaN | SNaN | | | | | | k ≠ 0, k < m | - | RR(b) or<br>V <sub>XCVI</sub> : T(dNaN) | T(dINF) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | k ≠ 0, k = m | - | W(b) | T(dINF) | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | $k \neq 0$ and $k > m$ ,<br>or $k = 0$ | W(b) | | | P(b) | V <sub>XSNAN</sub> : U(b) | | | | | | Explanation: | 1 | <del>'</del> | | 1 | - | | | | | | * | The number of s | ignificant digits of the | value 0 is consi | dered to be zero f | or this instruction. | | | | | | - | Not applicable. | | | | | | | | | | dINF | Default infinity. | | | | | | | | | | Fn | Finite nonzero n | Finite nonzero numbers (includes both subnormal and normal numbers). | | | | | | | | | k | Reference significance, which specifies the number of significant digits in the target operand. | | | | | | | | | | m | Number of signif | icant digits in the ope | erand in FRB[p]. | | | | | | | | P(x) | The QNaN of op | erand x is propagate | d and placed in F | FRT[p]. | | | | | | | RR(x) | | unded to the form the $(-1) \times 10^{X_{max}}$ , then R zed. | | | | | | | | | T(x) | The value x is plant | aced in FRT[p]. | | | | | | | | | U(x) | The SNaN of ope | erand x is converted | to the correspon | ding QNaN and pl | aced in FRT[p]. | | | | | | $V_{XCVI}$ | • | The Invalid-Operation Exception (VXCVI) occurs. The result is produced only when the exception is disabled. (See Section 5.5.10.1 for actions.) | | | | | | | | | V <sub>XSNAN</sub> : | • | ration Exception (VXS sabled. See Section | • | • | ced only when the | | | | | | W(x) | The value and th | e form of x is placed | in FRT[p]. | | | | | | | Figure 90. Actions: Reround # DFP Round To FP Integer With Inexact [Quad] Z23-form | drintx | R,FRT,FRB,RMC | (Rc=0) | |---------|---------------|--------| | drintx. | R,FRT,FRB,RMC | (Rc=1) | | | | | | 59 | FRT | /// | R | FRB | RMC | 99 | Rc | |----|-----|-----|----|-----|-----|----|----| | 0 | 6 | 11 | 15 | 16 | 21 | 23 | 31 | | drintxq | R,FRTp,FRBp,RMC | (Rc=0) | |----------|-----------------|--------| | drintxq. | R,FRTp,FRBp,RMC | (Rc=1) | | | 63 | FRTp | /// | R | FRBp | RMC | 99 | Rc | |---|----|------|-----|----|------|-----|----|----| | 0 | | 6 | 11 | 15 | 16 | 21 | 23 | 31 | The DFP operand in FRB[p] is rounded to a floating-point integer and placed into FRT[p]. The sign of the result is the same as the sign of the operand in FRB[p]. The ideal exponent is the larger value of zero and the exponent of the operand in FRB[p]. The rounding mode used is specified in the RMC field. When the RMC-encoding-selection (R) bit is zero, the RMC field contains the primary encoding; when the bit is one, the field contains the secondary encoding. In addition to coercion of the converted value to fit the target format, the special rounding used by *Round To FP Integer* also coerces the target exponent to the ideal exponent. When the operand in FRB[p] is a finite number and the exponent is less than zero, the operand is rounded to the result with an exponent of zero. When the exponent is greater than or equal to zero, the result is set to the numerical value and the form of the operand in FRB[p]. When the result differs in value from the operand in FRB[p], an inexact exception is recognized. No underflow exception is recognized by this operation, regardless of the value of the operand in FRB[p]. Figure 91 summarizes the actions for *Round To FP Integer With Inexact*. The table does not include the setting of the FPSCR<sub>FPRF</sub> field. The FPSCR<sub>FPRF</sub> field is always set to the class and sign of the result, except for an enabled invalid-operation, in which case the field remains unchanged. #### **Special Registers Altered:** FPRF FR FI FX XX VXSNAN CR1 (if Rc=1) #### Programming Note – The DFP Round To FP Integer With Inexact and DFP Round To FP Integer With Inexact Quad instructions can be used to implement the decimal equivalent of the C99 rint function by specifying the primary RMC encoding for round according to FPSCR<sub>DRN</sub> (R=0, RMC=11). The specification for rint requires the inexact exception be raised if detected. | Operand b | Is n not pre- | InvOp. | Inexact | Is n Incre- | | | |--------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------|----------------------|--------------------------|---------------------------------------------------------------------------------------------------------------|--| | in FRB is | cise (n ≠ b) | Exception<br>Enabled | Exception<br>Enabled | mented<br>( n > b ) | Actions* | | | -∞ | No <sup>1</sup> | - | - | - | $T(-dINF)$ , $FI \leftarrow 0$ , $FR \leftarrow 0$ | | | F | No | - | - | - | W(n), FI ← 0, FR ← 0 | | | F | Yes | - | No | No | $W(n)$ , $FI \leftarrow 1$ , $FR \leftarrow 0$ , $XX \leftarrow 1$ | | | F | Yes | - | No | Yes | W(n), FI ← 1, FR ← 1, XX ← 1 | | | F | Yes | - | Yes | No | W(n), FI ← 1, FR ← 0, XX ← 1, TX | | | F | Yes | - | Yes | Yes | W(n), FI ← 1, FR ← 1, XX ← 1, TX | | | +∞ | No <sup>1</sup> | - | - | - | T(+dINF), FI ← 0, FR ← 0 | | | QNaN | No <sup>1</sup> | - | - | - | P(b), FI $\leftarrow$ 0, FR $\leftarrow$ 0 | | | SNaN | No <sup>1</sup> | No | - | - | U(b), FI ← 0, FR ← 0, VXSNAN ← 1 | | | SNaN | No <sup>1</sup> | Yes | - | - | VXSNAN ← 1, TV | | | Explanation: | Setting of XX a operation ex | ception occurs | s, setting of FI a | and FR is part | exception actions. Also, when an invalid-<br>of the exception actions.(See the sections,<br>or more details.) | | | - | The actions do | not depend o | n this condition | ٦. | | | | 1 | This condition | is true by virtu | e of the state of | of some conditi | ion to the left of this column. | | | dINF | Default infinity | | | | | | | F | All finite numb | ers, including a | zeros. | | | | | FI | Floating-Point | | • | • • | | | | FR | Floating-Point | -Fraction-Rour | nded status flaç | g, FPSCR <sub>FR</sub> . | | | | n | | ved when the so <i>FP Integer</i> . | source operan | d, b, is rounde | d to an integer using the special rounding | | | P(x) | The QNaN of | operand x is pr | opagated and | placed in FRT | [p]. | | | T(x) | The value x is | placed in FRT | [p]. | | | | | TV | The system floating-point enabled exception error handler is invoked for the invalid-operation exception if the FE0 and FE1 bits in the machine-state register are set to any mode other than the ignore-exception mode. | | | | | | | TX | The system floating-point enabled exception error handler is invoked for the inexact exception if the FE0 and FE1 bits in the machine-state register are set to any mode other than the ignore-exception mode. | | | | | | | U(x) | The SNaN of | operand x is co | nverted to the | corresponding | g QNaN and placed in FPT[p]. | | | W(x) | The value x in | the form of ze | ro exponent or | the source ex | ponent is placed in FRT[p]. | | | XX | Floating-Point | -Inexact-Excep | otion status flag | g, FPSCR <sub>XX</sub> . | | | Figure 91. Actions: Round to FP Integer With Inexact # DFP Round To FP Integer Without Inexact [Quad] Z23-form $\begin{array}{lll} \mbox{drintn} & \mbox{R,FRT,FRB,RMC} & \mbox{(Rc=0)} \\ \mbox{drintn.} & \mbox{R,FRT,FRB,RMC} & \mbox{(Rc=1)} \end{array}$ | 59 | FRT | /// | R | FRB | RMC | 227 | Rc | |----|-----|-----|----|-----|-----|-----|----| | 0 | 6 | 11 | 15 | 16 | 21 | 23 | 31 | $\begin{array}{ll} \mbox{drintnq} & \mbox{R,FRTp,FRBp,RMC} & \mbox{(Rc=0)} \\ \mbox{drintnq.} & \mbox{R,FRTp,FRBp,RMC} & \mbox{(Rc=1)} \end{array}$ | | 63 | FRTp | /// | R | FRBp | RMC | 227 | Rc | |---|----|------|-----|----|------|-----|-----|----| | 0 | | 6 | 11 | 15 | 16 | 21 | 23 | 31 | This operation is the same as the *Round To FP Integer With Inexact* operation, except that this operation does not recognize an inexact exception. Figure 92 summarizes the actions for *Round To FP Integer Without Inexact*. The table does not include the setting of the FPSCR<sub>FPRF</sub> field. The FPSCR<sub>FPRF</sub> field is always set to the class and sign of the result, except for an enabled invalid-operation, in which case the field remains unchanged. #### **Special Registers Altered:** FPRF FR (set to 0) FI (set to 0) FX VXSNAN CR1 (if Rc=1) #### **Programming Note -** The DFP Round To FP Integer Without Inexact and DFP Round To FP Integer Without Inexact Quad instructions can be used to implement decimal equivalents of several C99 rounding functions by specifying the appropriate R and RMC field values. | Function | R | RMC | |-----------|---|------| | Ceil | 1 | 0b00 | | Floor | 1 | 0b01 | | Nearbyint | 0 | 0b11 | | Round | 0 | 0b10 | | Trunc | 0 | 0b01 | Note that nearbyint is similar to the rint function but without raising the inexact exception. Similarly ceil, floor, round, and trunc do not require the inexact exception. | Operand b in FRB is | InvOp. Exception<br>Enabled | Actions* | |---------------------|-----------------------------|----------------------------------------------------| | -∞ | - | $T(-dINF)$ , $FI \leftarrow 0$ , $FR \leftarrow 0$ | | F | - | $W(n)$ , $FI \leftarrow 0$ , $FR \leftarrow 0$ | | +∞ | - | $T(+dINF)$ , $FI \leftarrow 0$ , $FR \leftarrow 0$ | | QNaN | - | $P(b)$ , $FI \leftarrow 0$ , $FR \leftarrow 0$ | | SNaN | No | U(b), FI ← 0, FR ← 0, VXSNAN←1 | | SNaN | Yes | VXSNAN ← 1, TV | #### Explanation: - \* Setting of VXSNAN is part of the corresponding exception actions. Also, when an invalid-operation exception occurs, setting of FI and FR bits is part of the exception actions. (See the sections, "Invalid Operation Exception" for more details.) - The actions do not depend on this condition. dINF Default infinity. F All finite numbers, including zeros. FI Floating-Point-Fraction-Inexact status flag, FPSCR<sub>FI</sub>. FR Floating-Point-Fraction-Rounded status flag, FPSCR<sub>FR</sub>. n The value derived when the source operand, b, is rounded to an integer using the special rounding for Round-To-FP-Integer. P(x) The QNaN of operand x is propagated and placed in FRT[p]. T(x) The value x is placed in FRT[p]. TV The system floating-point enabled exception error handler is invoked for the invalid-operation exception if the FE0 and FE1 bits in the machine-state register are set to any mode other than the ignore-exception mode. U(x) The SNaN of operand x is converted to the corresponding QNaN and placed in FPT[p]. W(x) The value x in the form of zero exponent or the source exponent is placed in FRT[p]. Figure 92. Actions: Round to FP Integer Without Inexact ## 5.6.5 DFP Conversion Instructions The DFP conversion instructions consist of data-format conversion instructions and data-type conversion instructions. They are all X-form instructions and employ the record bit (Rc). # 5.6.5.1 DFP Data-Format Conversion Instructions The data-format conversion instructions consist of *Convert To DFP Long*, *Convert To DFP Extended*, *Round To DFP Short*, and *Round To DFP Long*. Figure 93 summarizes the actions for these instructions. #### Programming Note DFP does not provide operations on short operands, so they must be converted to long format, and then converted back to be stored. Preserving correct signaling NaN semantics requires that signaling NaNs be propagated from the source to the result without recognizing an exception during widening from short to long or narrowing from long to short. Because DFP does not provide equivalents to the BFP Load Floating-Point Single and Store Floating-Point Single functions, the widening is performed by loading the DFP short value with a Load Floating as Integer Word Indexed followed by a DFP Convert to DFP Long, and narrowing is performed by a DFP Round to DFP Short followed by a Store Floating-Point as Integer Word Indexed. If the SNaN or infinity in DFP short format uses the preferred DPD encoding, then converting this operand to DFP long format and back to DFP short will result in the original bit pattern. | Instruction | | | Actions when ope | rand b in FRB[p] | is | | | |--------------------|---------------------------------------------------------------------------|---------------------------------------------------------------|--------------------------|---------------------|------------------------------------------|--|--| | | | F | ∞ | QNaN | SNaN | | | | Convert To DF | P Long | T(b) <sup>1</sup> | P(b) <sup>2,4</sup> | P(b) <sup>2,4</sup> | P(b) <sup>3,4</sup> | | | | Convert To DF | P Extended | T(b) <sup>1</sup> | T(dINF) | P(b) <sup>2,4</sup> | V <sub>XSNAN</sub> : U(b) <sup>2,4</sup> | | | | Round To DFF | P Short | R(b) <sup>1</sup> | P(b) <sup>2,5</sup> | P(b) <sup>2,5</sup> | P(b) <sup>3,5</sup> | | | | Round To DFF | P Long | R(b) <sup>1</sup> | T(dINF) | P(b) <sup>2,5</sup> | V <sub>XSNAN</sub> : U(b) <sup>2,5</sup> | | | | Explanation: | | | | | <u>.</u> | | | | 1 | The ideal expor | nent is the exponent | of the source operan | d. | | | | | 2 | Bits 5:N-1 of the | e N-bit combination | field are set to zero. | | | | | | 3 | Bit 5 of the N-bi | t combination field is | s set to one. Bits 6:N-1 | of the combination | on field are set to zero. | | | | 4 | The trailing sign | ificand field is padd | ed on the left with zer | os. | | | | | 5 | Leftmost digits i | n the trailing signific | cand field are removed | d. | | | | | dINF | Default infinity. | | | | | | | | F | All finite numbe | rs, including zeros. | | | | | | | P(x) | The special syn | The special symbol in operand x is propagated into FRT[p]. | | | | | | | R(x) | The value x is rounded to the target-format precision; see Section 5.5.11 | | | | | | | | T(x) | The value x is placed in FRT[p]. | | | | | | | | U(x) | The SNaN of or | The SNaN of operand x is converted to the corresponding QNaN. | | | | | | | V <sub>XSNAN</sub> | • | eration Exception (V<br>See Section 5.5.10 | • | result is produced | only when the excep- | | | Figure 93. Actions: Data-Format Conversion Instructions #### DFP Convert To DFP Long X-form | dctdp | FRT,FRB | (Rc=0) | |--------|---------|--------| | dctdp. | FRT,FRB | (Rc=1) | | 59 | FRT | /// | FRB | 258 | Rc | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The DFP short operand in bits 32-63 of FRB is converted to DFP long format and the converted result is placed into FRT. The sign of the result is the same as the sign of the source operand. The ideal exponent is the exponent of the source operand. If the operand in FRB is an SNaN, it is converted to an SNaN in DFP long format and does not cause an invalid-operation exception. #### **Special Registers Altered:** #### Programming Note Note that DFP short format is a storage-only format, Therefore, conversion of a short SNaN to long format will not cause an exception and the SNaN is preserved. Subsequent operation on that SNaN in long format will cause an exception. #### DFP Convert To DFP Extended X-form | dctqpq | FRTp,FRB | (Rc=0) | |---------|----------|--------| | dctqpq. | FRTp,FRB | (Rc=1) | | 63 | FRTp | /// | FRB | 258 | Rc | |----|------|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The DFP long operand in the FRB is converted to DFP extended format and placed into FRTp. The sign of the result is the same as the sign of the operand in FRB. The ideal exponent is the exponent of the operand in FRB. If the operand in FRB is an SNaN, an invalid-operation exception is recognized. If the exception is disabled, the SNaN is converted to the corresponding QNaN in DFP extended format. #### Special Registers Altered: ``` FPRF FR (set to 0) FI (set to 0) FX VXSNAN CR1 (if Rc=1) ``` #### DFP Round To DFP Short X-form | drsp | FRT,FRB | (Rc=0) | |-------|---------|--------| | drsp. | FRT,FRB | (Rc=1) | | | 59 | FRT | /// | FRB | 770 | Rc | |---|----|-----|-----|-----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | The DFP long operand in FRB is converted and rounded to DFP short format. The DFP short value is extended on the left with zeros to form a 64-bit entity and placed into FRT. The sign of the result is the same as the sign of the source operand. The ideal exponent is the exponent of the source operand. If the operand in FRB is an SNaN, it is converted to an SNaN in DFP short format and does not cause an invalid-operation exception. Normally, the result is in the format and length of the target. However, when an overflow or underflow exception occurs and if the exception is enabled, the operation is completed by producing a wrapped rounded result in the same format and length as the source but rounded to the target-format precision. #### **Special Registers Altered:** #### **Programming Note** Note that DFP short format is a storage-only format, Therefore, conversion of a long SNaN to short format will not cause an exception. Converting a long format SNaN to short format is an implied move operation. #### DFP Round To DFP Long X-form | drdpq | FRTp,FRBp | (Rc=0) | |--------|-----------|--------| | drdpq. | FRTp,FRBp | (Rc=1) | | 63 | FRTp | /// | FRBp | 770 | Rc | |----|------|-----|------|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The DFP extended operand in FRBp is converted and rounded to DFP long format. The result concatenated with 64 0s is placed in FRTp. The sign of the result is the same as the sign of the source operand. The ideal exponent is the exponent of the operand in FRBp. If the operand in FRBp is an SNaN, an invalid-operation exception is recognized. If the exception is disabled, the SNaN is converted to the corresponding QNaN in DFP long format. Normally, the result is in the format and length of the target. However, when an overflow or underflow exception occurs and if the exception is enabled, the operation is completed by producing a wrapped rounded result in the same format and length as the source but rounded to the target-format precision. #### **Special Registers Altered:** #### Programming Note - Note that DFP Round to DFP Long, while producing a result in DFP long format, actually targets a register pair, writing 64 0s in FRTp+1. # 5.6.5.2 DFP Data-Type Conversion Instructions The DFP data-type conversion instructions are used to convert data type between DFP and fixed. The data-type conversion instructions consist of *Convert From Fixed* and *Convert To Fixed*. #### DFP Convert From Fixed Quad X-form | dcffixq | FRTp,FRB | (Rc=0) | |----------|----------|--------| | dcffixq. | FRTp,FRB | (Rc=1) | | 63 | FRTp | /// | FRB | 802 | Rc | |----|------|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The 64-bit signed binary integer in FRB is converted and rounded to a DFP Extended value and placed into FRTp. The sign of the result is the same as the sign of the source operand. The ideal exponent is zero. If the source operand is a zero, then a plus zero with a zero exponent is returned. The following table summarizes the actions for *Convert From Fixed*. The table does not include the setting of the FPSCR<sub>FPRF</sub> field. The FPSCR<sub>FPRF</sub> field is always set to the class and sign of the result. #### Special Registers Altered: #### DFP Convert To Fixed [Quad] X-form | dctfix | FRT,FRB | (Rc=0) | |---------|---------|--------| | dctfix. | FRT,FRB | (Rc=1) | | 59 | FRT | /// | FRB | 290 | Rc | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | dctfixq FRT,FRBp (Rc=0) dctfixq. FRT,FRBp (Rc=1) | 63 | FRT | /// | FRBp | 290 | Rc | |----|-----|-----|------|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The DFP operand in FRB[p] is rounded to an integer value and is placed into FRT in the 64-bit signed binary integer format. The sign of the result is the same as the sign of the source operand, except when the source operand is a NaN or a zero. Figure 94 summarizes the actions for *Convert To Fixed*. #### **Special Registers Altered:** FPRF (undefined) FR FI FX XX VXSNAN VXCVI CR1 (if Rc=1) #### **Programming Note** It is recommended that software pre-round the operand to a floating-point integral using *drintx*[*q*] or *drintn*[*q*] is a rounding mode other than the current rounding mode specified by FPSCR<sub>DRN</sub> is needed. Saving, modifying and restoring the FPSCR just to temporarily change the rounding mode is less efficient than just employing drintx[p] or drint[p] which override the current rounding mode using an immediate control field. For example if the desired function rounding is Round to Nearest, Ties away from 0 but the default rounding (from FPSCR<sub>DRN</sub>) is Round to Nearest, Ties to Even then following is preferred. drintn 0,f1,f1,2 dctfix f1,f1 | Operand b in FRB[p] is | q is | Is n not precise (n ≠ b) | InvOp.<br>Except.<br>Enabled | Inexact<br>Except.<br>Enabled | Is n Incre-<br>mented<br>( n > b ) | Actions * | | | |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|--------------------------|------------------------------|-------------------------------|--------------------------------------|----------------------------------------------------------------------------|--|--| | -∞ ≤ b < MN | < MN | - | No | - | - | T(MN), FI ← 0, FR ← 0, VXCVI ← 1 | | | | -∞ ≤ b < MN | < MN | - | Yes | - | - | VXCVI ← 1, TV | | | | -∞ < b < MN | = MN | - | - | No | - | T(MN), FI ← 1, FR ← 0, XX ← 1 | | | | -∞ < b < MN | = MN | - | - | Yes | - | T(MN), FI $\leftarrow$ 1, FR $\leftarrow$ 0, XX $\leftarrow$ 1,TX | | | | $MN \le b < 0$ | - | No | - | - | - | T(n), FI ← 0, FR ← 0 | | | | $MN \le b < 0$ | - | Yes | - | No | No | T(n), FI ← 1, FR ← 0, XX ← 1 | | | | $MN \le b < 0$ | - | Yes | - | No | Yes | T(n), FI ← 1, FR ← 1, XX ← 1 | | | | $MN \le b < 0$ | - | Yes | - | Yes | No | T(n), FI $\leftarrow$ 1, FR $\leftarrow$ 0, XX $\leftarrow$ 1, TX | | | | $MN \le b < 0$ | - | Yes | - | Yes | Yes | T(n), FI ← 1, FR ← 1, XX ← 1, TX | | | | ±0 | - | No | - | - | - | T(0), FI ← 0, FR ← 0 | | | | 0 < b ≤ MP | - | No | - | - | - | T(n), FI ← 0, FR ← 0 | | | | 0 < b ≤ MP | - | Yes | - | No | No | T(n), FI ← 1, FR ← 0, XX ← 1 | | | | 0 < b ≤ MP | - | Yes | - | No | Yes | T(n), FI ← 1, FR ← 1, XX ← 1 | | | | $0 < b \le MP$ | - | Yes | - | Yes | No | T(n), FI $\leftarrow$ 1, FR $\leftarrow$ 0, XX $\leftarrow$ 1, TX | | | | 0 < b ≤ MP | - | Yes | - | Yes | Yes | T(n), FI ← 1, FR ← 1, XX ← 1, TX | | | | MP < b < +∞ | = MP | - | - | No | - | T(MP), FI ← 1, FR ← 0, XX ← 1 | | | | MP < b < +∞ | = MP | - | - | Yes | - | $T(MP)$ , $FI \leftarrow 1$ , $FR \leftarrow 0$ , $XX \leftarrow 1$ , $TX$ | | | | $MP < b \le +\infty$ | > MP | - | No | - | - | $T(MP)$ , $FI \leftarrow 0$ , $FR \leftarrow 0$ , $VXCVI \leftarrow 1$ | | | | MP < b ≤ +∞ | > MP | - | Yes | - | - | VXCVI ← 1, TV | | | | QNaN | - | - | No | - | - | T(MN), FI←0, FR←0, VXCVI←1 | | | | QNaN | - | - | Yes | - | - | VXCVI←1, TV | | | | SNaN | - | - | No | - | - | T(MN),FI←0, FR←0, VXCVI←1,VXSNAN ←1 | | | | SNaN | - | - | Yes | - | - | VXCVI←1,VXSNAN ← 1, TV | | | | Explanation: * Setting of XX, VXCVI, and VXSNAN is part of the corresponding exception actions. Also, when an invalid-operation exception occurs, setting of FI and FR bits is part of the exception actions. (See the sections, "Inexact Exception" and "Invalid Operation Exception" for more details.) - The actions do not depend on this condition. | | | | | | | | | - FI Floating-Point-Fraction-Inexact status flag, FPSCR<sub>FI</sub>. - FR Floating-Point-Fraction-Rounded status flag, FPSCR<sub>FR</sub>. - MN Maximum negative number representable by the 64-bit binary integer format - MP Maximum positive number representable by the 64-bit binary integer format. - n The value g converted to a fixed-point result. - q The value derived when the source value b is rounded to an integer using the specified rounding mode - T(x) The value x is placed in FRT[p]. - TV The system floating-point enabled exception error handler is invoked for the invalid-operation exception if the FE0 and FE1 bits in the machine-state register are set to any mode other than the ignore-exception mode. - TX The system floating-point enabled exception error handler is invoked for the inexact exception if the FE0 and FE1 bits in the machine-state register are set to any mode other than the ignore-exception mode. - VXCVI The FPSCR<sub>VXCVI</sub> invalid operation exception status bit. - VXSNAN The FPSCR<sub>VXSNAN</sub> invalid operation exception status bit. - XX Floating-Point-Inexact-Exception status flag, FPSCR<sub>XX</sub> Figure 94. Actions: Convert To Fixed ## 5.6.6 DFP Format Instructions The DFP format instructions are used to compose or decompose a DFP operand. A source operand of SNaN does not cause an invalid-operation exception. All format instructions employ the record bit (Rc). The format instructions consist of Decode DPD To BCD, Encode BCD To DPD, Extract Biased Exponent, Insert Biased Exponent, Shift Significand Left Immediate, and Shift Significand Right Immediate. #### DFP Decode DPD To BCD [Quad] X-form | ddedpd | SP,FRT,FRB | (Rc=0) | |---------|------------|--------| | ddedpd. | SP,FRT,FRB | (Rc=1) | | 59 | FRT | SP | /// | FRB | 322 | Rc | |----|-----|----|-----|-----|-----|----| | 0 | 6 | 11 | 13 | 16 | 21 | 31 | $\begin{array}{ll} \mbox{ddedpdq} & \mbox{SP,FRTp,FRBp} & (\mbox{Rc=0}) \\ \mbox{ddedpdq}. & \mbox{SP,FRTp,FRBp} & (\mbox{Rc=1}) \end{array}$ | 63 | FRTp | SP | /// | FRBp | 322 | Rc | |----|------|----|-----|------|-----|----| | 0 | 6 | 11 | 13 | 16 | 21 | 31 | A portion of the significand of the DFP operand in FRB[p] is converted to a signed or unsigned BCD number depending on the SP field. For infinity and NaN, the significand is considered to be the contents in the trailing significand field padded on the left by a zero digit. #### $SP_0 = 0$ (unsigned conversion) The rightmost 16 digits of the significand (32 digits for *ddedpdq*) is converted to an unsigned BCD number and the result is placed into FRT[p]. #### $SP_0 = 1$ (signed conversion) The rightmost 15 digits of the significand (31 digits for ddedpdq) is converted to a signed BCD number with the same sign as the DFP operand, and the result is placed into FRT[p]. If the DFP operand is negative, the sign is encoded as 0b1101. If the DFP operand is positive, SP<sub>1</sub> indicates which preferred plus sign encoding is used. If SP<sub>1</sub> = 0, the plus sign is encoded as 0b1100 (the option-1 preferred sign code), otherwise the plus sign is encoded as 0b1111(the option-2 preferred sign code). #### **Special Registers Altered:** CR1 (if Rc=1) # DFP Encode BCD To DPD [Quad] X-form | denbcd | S,FRT,FRB | (Rc=0) | |---------|-----------|--------| | denbcd. | S,FRT,FRB | (Rc=1) | | 59 | FRT | S | /// | FRB | 834 | Rc | |----|-----|----|-----|-----|-----|----| | 0 | 6 | 11 | 12 | 16 | 21 | 31 | | denbcdq | S,FRTp,FRBp | (Rc=0) | |----------|-------------|--------| | denbcdq. | S,FRTp,FRBp | (Rc=1) | | 63 | FRTp | S | /// | FRBp | 834 | Rc | |----|------|----|-----|------|-----|----| | 0 | 6 | 11 | 12 | 16 | 21 | 31 | The signed or unsigned BCD operand, depending on the S field, in FRB[p] is converted to a DFP number. The ideal exponent is zero. #### S = 0 (unsigned BCD operand) The unsigned BCD operand in FRB[p] is converted to a positive DFP number of the same magnitude and the result is placed into FRT[p]. #### S = 1 (signed BCD operand) The signed BCD operand in FRB[p] is converted to the corresponding DFP number and the result is placed into FRT[p]. If an invalid BCD digit or sign code is detected in the source operand, an invalid-operation exception (VXCVI) occurs. $\label{eq:FPSCR} \begin{tabular}{ll} FPSCR_{FPRF} is set to the class and sign of the result, except for Invalid Operation Exception when $FPSCR_{VF}=1$. \end{tabular}$ #### **Special Registers Altered:** FPRF FR (set to 0) FI (set to 0) FX VXCVI CR1 # DFP Extract Biased Exponent [Quad] X-form | dxex | FRT,FRB | (Rc=0) | |-------|---------|--------| | dxex. | FRT,FRB | (Rc=1) | | Ī | 59 | FRT | /// | FRB | 354 | Rc | |---|----|-----|-----|-----|-----|----| | C | ) | 6 | 11 | 16 | 21 | 31 | | dxexq | FRT,FRBp | (Rc=0) | |--------|----------|--------| | dxexq. | FRT,FRBp | (Rc=1) | | 63 | FRT | /// | FRBp | 354 | Rc | |----|-----|-----|------|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The biased exponent of the operand in FRB[p] is extracted and placed into FRT in the 64-bit signed binary integer format. When the operand in FRB is an infinity, QNaN, or SNaN, a special code is returned. #### Operand Result Finite Number biased exponent value Infinity -1 QNaN -2 SNaN -3 #### **Special Registers Altered:** CR1 (if Rc=1) #### Programming Note The exponent bias value is 101 for DFP Short, 398 for DFP Long, and 6176 for DFP Extended. # DFP Insert Biased Exponent [Quad] X-form | diex | FRT,FRA,FRB | (Rc=0) | |-------|-------------|--------| | diex. | FRT,FRA,FRB | (Rc=1) | | 59 | FRT | FRA | FRB | 866 | Rc | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | | diexq FRTp,FRA,FRBp (Rc=0) diexq. FRTp,FRA,FRBp (Rc=0) | |--------------------------------------------------------| |--------------------------------------------------------| | 63 | FRTp | FRA | FRBp | 866 | Rc | |----|------|-----|------|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Let *a* be the value of the 64-bit signed binary integer in FRA. | a | Result | |----------------------|--------------------------------------| | a > MBE <sup>1</sup> | QNaN | | $MBE \geq a \geq 0$ | Finite number with biased exponent a | | a = -1 | Infinity | | a = -2 | QNaN | | a = -3 | SNaN | a < -3 QNaN Maximum biased exponent for the target format When $0 \le a \le MBE$ , a is the biased target exponent that is combined with the sign bit and the significand value of the DFP operand in FRB[p] to form the DFP result in FRT[p]. The ideal exponent is the specified target exponent. When a specifies a special code (a < 0 or a > MBE), an infinity, QNaN, or SNaN is formed in FRT[p] with the trailing significand field containing the value from the trailing significand field of the source operand in FRB[p], and with an N-bit combination field set as follows. - For an Infinity result, - the leftmost 5 bits are set to 0b11110, and - the rightmost N-5 bits are set to zero. - For a QNaN result. - the leftmost 5 bits are set to 0b11111, - bit 5 is set to zero, and - the rightmost N-5 bits are set to zero. - For an SNaN result. - the leftmost 5 bits are set to 0b11111, - bit 5 is set to one, and - the rightmost N-5 bits are set to zero. #### **Special Registers Altered:** CR1 (if Rc=1) #### Programming Note - The exponent bias value is 101 for DFP Short, 398 for DFP Long, and 6176 for DFP Extended. | Operand a in | Actions for Insert Biased Exponent when operand b in FRB[p] specifies | | | | | | |------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|-------|-------|--|--| | FRA[p] specifies | F | ∞ | QNaN | SNaN | | | | F | N, Rb | Z, Rb | Z, Rb | Z, Rb | | | | ∞ | I, Rb | I, Rb | I, Rb | I, Rb | | | | QNaN | Q, Rb | Q, Rb | Q, Rb | Q, Rb | | | | SNaN | S, Rb | S, Rb | S, Rb | S, Rb | | | | Explanation: | | | | | | | | F | All finite numbers, including zeros | | | | | | | I | The combination field in FRT[p] is set to indicate a default Infinity. | | | | | | | N | The combination field in FRT[p] is set to the specified biased exponent in FRA and the leftmost significand digit in FRB[p]. | | | | | | | Q | The combination field in FRT[p] is set to indicate a default QNaN. | | | | | | | S | The combination field in FRT[p] is set to indicate a default SNaN. | | | | | | | Z | The combination field in FRT[p] is set to indicate the specific biased exponent in FRA and a leftmost coefficient digit of zero. | | | | | | | Rb | The contents of the trailing significand field in FRB[p] are reencoded using preferred DPD encodings and the reencoded result is placed in the same field in FRT[p]. The sign bit of FRB[p] is copied into the sign bit in FRT[p]. | | | | | | Figure 95. Actions: Insert Biased Exponent # DFP Shift Significand Left Immediate [Quad] Z22-form | | dscli<br>dscli. | , ,- | | | | | Rc=0)<br>Rc=1) | |---------------------------------------------|-----------------|------|------|----------|----------------|----|----------------| | | 59 | FRT | FRA | SH | | 66 | Rc | | | 0 | 6 | 11 | 16 | 22 | | 31 | | dscliq FRTp,FRAp,SH<br>dscliq. FRTp,FRAp,SH | | | | (F<br>(F | Rc=0)<br>Rc=1) | | | | | 63 | FRTp | FRAp | SH | | 66 | Rc | | | ام | _ | 144 | 4.0 | 00 | | 0.4 | The significand of the DFP operand in FRA[p] is shifted left SH digits. For a NaN or infinity, all significand digits are in the trailing significand field. SH is a 6-bit unsigned binary integer. Digits shifted out of the leftmost digit are lost. Zeros are supplied to the vacated positions on the right. The result is placed into FRT[p]. The sign of the result is the same as the sign of the source operand in FRA[p]. If the source operand in FRA[p] is a finite number, the exponent of the result is the same as the exponent of the source operand. For an Infinity, QNaN or SNaN result, the target format's N-bit combination field is set as follows. - For an Infinity result, - the leftmost 5 bits are set to 0b11110, and - the rightmost N-5 bits are set to zero. - For a QNaN result, - the leftmost 5 bits are set to 0b11111, - bit 5 is set to zero, and - the rightmost N-6 bits are set to zero. - For an SNaN result, - the leftmost 5 bits are set to 0b11111, - bit 5 is set to one, and - the rightmost N-6 bits are set to zero. #### **Special Registers Altered:** CR1 (if Rc=1) # DFP Shift Significand Right Immediate [Quad] Z22-form | dscri<br>dscri. | FRT,FRA,SH<br>FRT,FRA,SH | | | | | Rc=0)<br>Rc=1) | |-------------------|------------------------------|------|----|----|----|----------------| | 59 | FRT | FRA | SH | | 98 | Rc | | 0 | 6 | 11 | 16 | 22 | | 31 | | dscriq<br>dscriq. | FRTp,FRAp,SH<br>FRTp,FRAp,SH | | | | ( | Rc=0)<br>Rc=1) | | 63 | FRTp | FRAp | SH | | 98 | Rc | | 0 | 6 | 11 | 16 | 22 | | 31 | The significand of the DFP operand in FRA[p] is shifted right SH digits. For a NaN or infinity, all significand digits are in the trailing significand field. SH is a 6-bit unsigned binary integer. Digits shifted out of the units digit are lost. Zeros are supplied to the vacated positions on the left. The result is placed into FRT[p]. The sign of the result is the same as the sign of the source operand in FRA[p]. If the source operand in FRA[p] is a finite number, the exponent of the result is the same as the exponent of the source operand. For an Infinity, QNaN or SNaN result, the target format's N-bit combination field is set as follows. - For an Infinity result, - the leftmost 5 bits are set to 0b11110, and - the rightmost N-5 bits are set to zero. - For a QNaN result, - the leftmost 5 bits are set to 0b11111, - bit 5 is set to zero, and - the rightmost N-6 bits are set to zero. - For an SNaN result, - the leftmost 5 bits are set to 0b11111, - bit 5 is set to one, and - the rightmost N-6 bits are set to zero. #### **Special Registers Altered:** CR1 (if Rc=1) ## 5.6.7 DFP Instruction Summary | nic | | | | | | ng | FP | RF | | | | 1 | |----------|----------------------------------------------|------|---------------------------|----------|---------|----------|----|----------------|------------------------------|-------|----|---| | Mnemonic | Full Name | FORM | Operands | SN<br>Vs | aN<br>G | Encoding | ပ | FPCC | FP<br>Exception<br>V Z O U X | FR\FI | ΙE | 0 | | dadd | DFP Add | Х | FRT, FRA, FRB | Υ | Ν | RE | Υ | Υ | V О U X | Υ | Υ | , | | daddq | DFP Add Quad | Х | FRTp, FRAp, FRBp | Υ | Ν | RE | Υ | Υ | V О U X | Υ | Υ | | | dsub | DFP Subtract | Х | FRT, FRA, FRB | Υ | Ν | RE | Υ | Υ | V О U X | Υ | Υ | | | dsubq | DFP Subtract Quad | Х | FRTp, FRAp, FRBp | Υ | N | RE | Υ | Υ | v оих | Υ | Υ | | | dmul | DFP Multiply | Х | FRT, FRA, FRB | Υ | N | RE | Υ | Υ | V | Υ | Υ | | | dmulq | DFP Multiply Quad | Х | FRTp, FRAp, FRBp | Υ | Ν | RE | Υ | Υ | v оих | Υ | Υ | | | ddiv | DFP Divide | Х | FRT, FRA, FRB | Υ | Ν | RE | Υ | Υ | VZOUX | Υ | Υ | | | ddivq | DFP Divide Quad | Х | FRTp, FRAp, FRBp | Υ | N | RE | Υ | Υ | VZOUX | Υ | Υ | | | dcmpo | DFP Compare Ordered | Х | BF, FRA, FRB | Υ | - | 1 | Z | Υ | V | - | - | | | dcmpoq | DFP Compare Ordered Quad | Х | BF, FRAp, FRBp | Υ | - | 1 | Z | Υ | V | - | - | | | dcmpu | DFP Compare Unordered | Х | BF, FRA, FRB | Υ | - | - | Z | Υ | V | - | - | | | dcmpuq | DFP Compare Unordered Quad | Х | BF, FRAp, FRBp | Υ | - | 1 | Ν | Υ | V | - | - | | | dtstdc | DFP Test Data Class | Z22 | BF, FRA, DCM | Ν | - | - | Ν | Y <sup>1</sup> | | - | - | | | dtstdcq | DFP Test Data Class Quad | Z22 | BF, FRAp, DCM | N | - | - | N | Y <sup>1</sup> | | - | - | | | dtstdg | DFP Test Data Group | Z22 | BF, FRA,DGM | N | - | - | N | Y <sup>1</sup> | | - | - | | | dtstdgq | DFP Test Data Group Quad | Z22 | BF, FRAp, DGM | Ν | - | - | Ν | Y <sup>1</sup> | | - | - | | | dtstex | DFP Test Exponent | Х | BF, FRA, FRB | Ν | - | - | N | Υ | | - | - | Ī | | dtstexq | DFP Test Exponent Quad | Х | BF, FRAp, FRBp | Ν | - | - | Ν | Υ | | - | - | Ī | | dtstsf | DFP Test Significance | Х | BF, FRA(FIX), FRB | Ν | - | - | N | Υ | | - | - | Ī | | dtstsfq | DFP Test Significance Quad | Х | BF, FRA(FIX), FRBp | Ν | - | - | Ν | Υ | | - | - | Ī | | dquai | DFP Quantize Immediate | Z23 | TE, FRT, FRB, RMC | Υ | Ν | RE | Υ | Υ | V X | Υ | Υ | Ī | | dquaiq | DFP Quantize Immediate Quad | Z23 | TE, FRTp, FRBp, RMC | Υ | N | RE | Υ | Υ | V X | Υ | Υ | Ī | | dqua | DFP Quantize | Z23 | FRT,FRA,FRB,RMC | Υ | N | RE | Υ | Υ | V X | Υ | Υ | Ī | | dquaq | DFP Quantize Quad | Z23 | FRTp,FRAp,FRBp, RMC | Υ | Ν | RE | Υ | Υ | V X | Υ | Υ | Ī | | drrnd | DFP Reround | Z23 | FRT,FRA(FIX),FRB,RMC | Υ | N | RE | Υ | Υ | V X | Υ | Υ | Ī | | drrndq | DFP Reround Quad | Z23 | FRTp, FRA(FIX), FRBp, RMC | Υ | N | RE | Υ | Υ | V X | Υ | Υ | Ī | | drintx | DFP Round To FP Integer With Inexact | Z23 | R,FRT, FRB,RMC | Υ | N | RE | Υ | Υ | v x | Υ | Υ | | | drintxq | DFP Round To FP Integer With Inexact Quad | Z23 | R,FRTp,FRBp,RMC | Υ | N | RE | Υ | Υ | v x | Υ | Υ | | | drintn | DFP Round To FP Integer Without Inexact | Z23 | R,FRT, FRB,RMC | Υ | N | RE | Υ | Υ | V | Y# | Υ | | | drintnq | DFP Round To FP Integer Without Inexact Quad | Z23 | R,FRTp, FRBp,RMC | Υ | N | RE | Υ | Υ | V | Y# | Υ | | | dctdp | DFP Convert To DFP Long | Х | FRT, FRB (DFP Short) | N | Υ | RE | Υ | Y <sup>2</sup> | | U | Υ | | | dctqpq | DFP Convert To DFP Extended | Х | FRTp, FRB | Υ | N | RE | Υ | Υ | V | Y# | Υ | | | drsp | DFP Round To DFP Short | Х | FRT (DFP Short), FRB | N | Υ | RE | Υ | Y <sup>2</sup> | оих | Υ | Υ | Ī | | drdpq | DFP Round To DFP Long | Х | FRTp, FRBp | Υ | Ν | RE | Υ | Υ | v оих | Υ | Υ | T | | dcffixq | DFP Convert From Fixed Quad | Х | FRTp, FRB (FIX) | - | N | RE | Υ | Υ | | U | Υ | T | | dctfix | DFP Convert To Fixed | Х | FRT (FIX), FRB | Υ | N | - | U | U | V X | Υ | - | Ť | | dctfixq | DFP Convert To Fixed Quad | Х | FRT (FIX), FRBp | Υ | N | - | U | U | V X | Υ | - | t | | ddedpd | DFP Decode DPD To BCD | Х | SP, FRT(BCD), FRB | N | - | - | N | N | | - | - | t | | ddedpdq | DFP Decode DPD To BCD Quad | Х | SP, FRTp(BCD), FRBp | N | - | - | N | N | | - | - | T | Figure 96. Decimal Floating-Point Instructions Summary | )nic | | | | | | | FPRF | | | | | | |------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|----------|----------|------|-------------------|------------------------------|-------|------------------|----| | Mnemonic | Full Name | FORM | Operands | | laN<br>G | Encoding | ပ | FPCC | FP<br>Exception<br>V Z O U X | FR\FI | ΙE | | | denbcd | DFP Encode BCD To DPD | Х | S, FRT, FRB (BCD) | - | Ν | RE | Υ | Υ | V | Υ# | Υ | Ī | | denbcdq | DFP Encode BCD To DPD Quad | Х | S, FRTp, FRBp (BCD) | - | N | RE | Υ | Υ | V | Υ# | Υ | Ī | | dxex | DFP Extract Biased Exponent | Х | FRT (FIX), FRB | N | N | - | N | N | | - | - | | | dxexq | DFP Extract Biased Exponent Quad | Х | FRT (FIX), FRBp | N | N | - | N | N | | - | - | 1 | | diex | DFP Insert Biased Exponent | Х | FRT, FRA(FIX), FRB | N | Υ | RE | N | N | | - | Υ | | | diexq | DFP Insert Biased Exponent Quad | Х | FRTp, FRA(FIX), FRBp | N | Υ | RE | N | N | | - | Υ | | | dscli | DFP Shift Significand Left Immediate | Z22 | FRT,FRA,SH | N | Υ | RE | Ν | Ν | | - | - | | | dscliq | DFP Shift Significand Left Immediate Quad | Z22 | FRTp,FRAp,SH | N | Υ | RE | Z | Z | | - | - | | | dscri | DFP Shift Significand Right Immediate | Z22 | FRT,FRA,SH | N | Υ | RE | Z | Z | | - | - | | | dscriq | DFP Shift Significand Right Immediate Quad | Z22 | FRTp,FRAp,SH | N | Υ | RE | Ν | Ν | | - | - | | | Explanat | tion: | | | | | | | | | | | | | # | FI and FR are set to zeros for these i | instru | uctions. | | | | | | | | | | | _ | Not applicable. | | | | | | | | | | | | | - | | | | | | | | | | | | | | 1 | A unique definition of the $FPSCR_FPC$ | <sub>C</sub> fie | ld is provided for the instruction | | | | | | | | | | | 1 2 | A unique definition of the FPSCR <sub>FPC</sub><br>These are the only instructions that n<br>does not include a code for SNaN, th | nay ( | generate an SNaN and also set | the | | | | | | | PRF 1 | fi | | | These are the only instructions that n | nay (<br>iese | generate an SNaN and also set instructions cause the need for | the | | | | | | | PRF <sup>1</sup> | fi | | 2 | These are the only instructions that n does not include a code for SNaN, the | nay (<br>lese<br>g the | generate an SNaN and also set instructions cause the need for data-class mask. | the | | | | | | | PRF <sup>1</sup> | fi | | 2<br>DCM | These are the only instructions that n does not include a code for SNaN, the A 6-bit immediate operand specifying | nay giese githe githe | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. | the | | | | | | - | PRF <sup>1</sup> | fi | | 2<br>DCM<br>DGM | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying | may onese of the of the original the original the original the original through | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. | the | | | | | | - | PRF <sup>1</sup> | fi | | DCM<br>DGM<br>G | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tax | may onese of the of the original the original the original the original through | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. | the | | | | | | - | PRF <sup>1</sup> | fi | | DCM<br>DGM<br>G<br>IE | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the talk An ideal exponent is defined for the incomparison. | may onese of the of the original the original the original the original through | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. | the | | | | | | - | PRF <sup>1</sup> | fi | | DCM DGM G IE FI | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tath An ideal exponent is defined for the inserting of the FPSCR <sub>FI</sub> flag. | may onese of the of the original the original the original the original through | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. | the | | | | | | - | PRF <sup>1</sup> | fi | | DCM<br>DGM<br>G<br>IE<br>FI<br>FR | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the talk An ideal exponent is defined for the inserting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. | may giese githe githe irget nstru | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. | the | | | | | | - | PRF <sup>1</sup> | fi | | DCM DGM G IE FI FR N | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tath An ideal exponent is defined for the inserting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. | may giese githe githe urget nstru | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. | the | | | | | | - | PRF <sup>1</sup> | fi | | DCM DGM G IE FI FR N O | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tat An ideal exponent is defined for the inserting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. An overflow exception may be recognized. | nay (nese of the property t | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. | the rede | finin | g the I | FPSC | :R <sub>FPF</sub> | <sub>RF</sub> field for DFF | P. | | | | DCM DGM G IE FI R N O Rc | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tat An ideal exponent is defined for the inserting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. An overflow exception may be recognitive record bit, Rc, is provided to record the trailing significand field is reenced. | nay ( nese of the result | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. TPSCR <sub>0:3</sub> in CR field 1. using preferred DPD encodings ities. | the rede | finin | g the I | FPSC | :R <sub>FPF</sub> | <sub>RF</sub> field for DFF | P. | | | | DCM DGM G IE FI FR N O RC | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the talk An ideal exponent is defined for the insertion of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. An overflow exception may be recognitive record bit, Rc, is provided to recognitive trailing significand field is reenced gated NaNs, or converted NaNs and | may ( plese plant the plant mizec ord F oded infin the | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. TPSCR <sub>0:3</sub> in CR field 1. using preferred DPD encodings ities. rounding-mode control. | the rede | finin | g the I | FPSC | :R <sub>FPF</sub> | <sub>RF</sub> field for DFF | P. | | | | DCM DGM G IE FI RR N O RC RE RMC | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tat An ideal exponent is defined for the inserting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. An overflow exception may be recognized to recognize the trailing significant field is reenced gated NaNs, or converted NaNs and A 2-bit immediate operand specifying | may ( ma | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. d. FPSCR <sub>0:3</sub> in CR field 1. using preferred DPD encodings ities. rounding-mode control. if the operation is signed or uncifies if the operation is signed | the rede | finin | g the I | DPD | encc | <sub>RF</sub> field for DFF | o. | for pi | rc | | DCM DGM G IE FI FR N O RC RE RMC S | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tat An ideal exponent is defined for the inserting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. An overflow exception may be recognized to recognize the record bit, Rc, is provided to recognized NaNs, or converted NaNs and A 2-bit immediate operand specifying An one-bit immediate operand: one bit o | may (green and the second representation of t | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. d. FPSCR <sub>0:3</sub> in CR field 1. using preferred DPD encodings ities. rounding-mode control. if the operation is signed or uncifies if the operation is signed s generated. | the rede | finin | g the I | DPD | encc | <sub>RF</sub> field for DFF | o. | for pi | rc | | DCM DGM G IE FI R O R C RE RMC S SP | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the talk An ideal exponent is defined for the inserting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. An overflow exception may be recognized to recognize the record bit, Rc, is provided to recognize the trailing significant field is reened gated NaNs, or converted NaNs and A 2-bit immediate operand specifying An one-bit immediate operand: one bit specifies which preferred plus sign or | may greese grees | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. d. PSCR <sub>0:3</sub> in CR field 1. using preferred DPD encodingsities. rounding-mode control. if the operation is signed or uncifies if the operation is signed s generated. | the rede | finin | g the I | DPD | encc | <sub>RF</sub> field for DFF | o. | for pi | rc | | DCM DGM G IE FI FR N O RC RE RMC S SP U | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tax An ideal exponent is defined for the insertion of the FPSCRFI flag. Setting of the FPSCRFI flag. Setting of the FPSCRFI flag. No. An overflow exception may be recognized to recognize the record bit, Rc, is provided to recognize the trailing significant field is reencognized NaNs, or converted NaNs and A 2-bit immediate operand specifying An one-bit immediate operand specifies which preferred plus significant and underflow exception may be recognized. | nay ( | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. d. FPSCR <sub>0:3</sub> in CR field 1. using preferred DPD encodings ities. rounding-mode control. if the operation is signed or uncifies if the operation is signed s generated. ed. cognized. | the rede | finin | g the I | DPD | encc | <sub>RF</sub> field for DFF | o. | for pi | rc | | DCM DGM G IE FI RC RC RE RMC S SP U | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tat An ideal exponent is defined for the inserting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. An overflow exception may be recognized to recognize the record bit, Rc, is provided to recognize the recognized NaNs, or converted NaNs and A 2-bit immediate operand specifying An one-bit immediate operand specification and the specifies which preferred plus sign of the An underflow exception may be recognized. | may greese grees | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. d. FPSCR <sub>0:3</sub> in CR field 1. using preferred DPD encodings ities. rounding-mode control. if the operation is signed or uncifies if the operation is signed s generated. ed. cognized. alid-operation exception. | the rede | finin | g the I | DPD | encc | <sub>RF</sub> field for DFF | o. | for pi | rc | | DCM DGM G IE FI R O R C RE RMC S SP U V Vs | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tat An ideal exponent is defined for the insetting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. An overflow exception may be recognized NaNs, or converted NaNs and A 2-bit immediate operand specifying An one-bit immediate operand specifies which preferred plus sign of An underflow exception may be recognized an invalid-operation exception may be recognized. | may greese grees | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. d. FPSCR <sub>0:3</sub> in CR field 1. using preferred DPD encodings ities. rounding-mode control. if the operation is signed or uncifies if the operation is signed s generated. ed. cognized. alid-operation exception. | the rede | finin | g the I | DPD | encc | <sub>RF</sub> field for DFF | o. | for pi | rc | | DCM DGM G IE FI FR N O RC RE RMC S SP U V VS X | These are the only instructions that in does not include a code for SNaN, the A 6-bit immediate operand specifying A 6-bit immediate operand specifying An SNaN can be generated as the tax An ideal exponent is defined for the insertion of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FI</sub> flag. Setting of the FPSCR <sub>FR</sub> flag. No. An overflow exception may be recognized NaNs, or converted NaNs and A 2-bit immediate operand specifying An one-bit immediate operand specifies which preferred plus sign of the A two-bit immediate operand: one bit specifies which preferred plus sign of the An invalid-operation exception may be recognized an input operand of SNaN causes and An inexact exception may be recognized. | may greese grees | generate an SNaN and also set instructions cause the need for data-class mask. data-group mask. operand. action. d. FPSCR <sub>0:3</sub> in CR field 1. using preferred DPD encodings ities. rounding-mode control. if the operation is signed or uncifies if the operation is signed s generated. ed. cognized. alid-operation exception. | the rede | finin | g the I | DPD | encc | <sub>RF</sub> field for DFF | o. | for pi | rc | Figure 96. Decimal Floating-Point Instructions Summary (Continued) ## **Chapter 6. Vector Processor [Category: Vector]** | 6.1 Vector Processor Overview 194<br>6.2 Chapter Conventions 194 | 6.8.6 Vector Shift Instructions 218 6.9 Vector Integer Instructions 220 | |-----------------------------------------------------------------------------|-------------------------------------------------------------------------| | 6.2.1 Description of Instruction Operation 194 | 6.9.1 Vector Integer Arithmetic Instructions 220 | | 6.3 Vector Processor Registers 195 | 6.9.1.1 Vector Integer Add Instructions 220 | | 6.3.1 Vector Registers 195 | 6.9.1.2 Vector Integer Subtract Instruc- | | 6.3.2 Vector Status and Control Register. 195 | tions | | 6.3.3 VR Save Register 196 | 226 | | 6.4 Vector Storage Access Operations 196 | 6.9.1.4 Vector Integer Multiply-Add/Sum | | 6.4.1 Accessing Unaligned Storage Oper- | Instructions | | ands | 6.9.1.5 Vector Integer Sum-Across Instructions | | 6.5.1 Integer Saturation 199 | 6.9.1.6 Vector Integer Average Instruc- | | 6.6 Vector Floating-Point Operations . 200 | tions | | 6.6.1 Floating-Point Overview 200 | 6.9.1.7 Vector Integer Maximum and Mini- | | 6.6.2 Floating-Point Exceptions 200 6.6.2.1 NaN Operand Exception 201 | mum Instructions | | 6.6.2.2 Invalid Operation Exception 201 | 241 | | 6.6.2.3 Zero Divide Exception 201 | 6.9.3 Vector Logical Instructions 244 | | 6.6.2.4 Log of Zero Exception 201 | 6.9.4 Vector Integer Rotate and Shift | | 6.6.2.5 Overflow Exception 201<br>6.6.2.6 Underflow Exception 202 | Instructions | | 6.7 Vector Storage Access Instructions | 249 | | 202 | 6.10.1 Vector Floating-Point Arithmetic | | 6.7.1 Storage Access Exceptions 202 | Instructions | | 6.7.2 Vector Load Instructions 203<br>6.7.3 Vector Store Instructions 206 | 6.10.2 Vector Floating-Point Maximum and Minimum Instructions251 | | 6.7.4 Vector Alignment Support Instruc- | 6.10.3 Vector Floating-Point Rounding and | | tions | Conversion Instructions | | 6.8 Vector Permute and Formatting | 6.10.4 Vector Floating-Point Compare | | Instructions | Instructions | | 6.8.1 Vector Pack and Unpack Instructions 209 | 6.10.5 Vector Floating-Point Estimate Instructions | | 6.8.2 Vector Merge Instructions 214 | 6.11 Vector Status and Control Register | | 6.8.3 Vector Splat Instructions 216 | Instructions | | 6.8.4 Vector Permute Instruction 217<br>6.8.5 Vector Select Instruction 217 | | | 6.8.5 Vector Select Instruction 217 | | ## 6.1 Vector Processor Overview This chapter describes the registers and instructions that make up the Vector Processor facility. ## 6.2 Chapter Conventions # **6.2.1 Description of Instruction Operation** The following notation, in addition to that described in Section 1.3.2, is used in this chapter. Additional RTL functions are described in Appendix B. | idilotiono di | o accombatin Apponaix B. | |--------------------------------------------|------------------------------------------------------------------------------------------------------------| | Notation | Meaning | | x?y:z | if the value of x is true, then the value of y, | | , | otherwise the value z. | | + <sub>int</sub> | Integer addition. | | + <sub>fp</sub> | Floating-point addition. | | _tp | Floating-point subtraction. | | × <sub>sui</sub> | Multiplication of a signed-integer (first | | - Sui | operand) by an unsigned-integer (second | | | operand). | | × <sub>fp</sub> | Floating-point multiplication. | | = <sub>int</sub> | Integer equals relation. | | = <sub>fp</sub> | Floating-point equals relation. | | <ui, ≤<sub="">ui, &gt;<sub>ui</sub>,</ui,> | $\geq_{ui}$ | | | Unsigned-integer comparison relations. | | $<_{Si}$ , $\leq_{Si}$ , $>_{Si}$ , $\geq$ | <sup>≥</sup> si | | | Signed-integer comparison relations. | | $<_{fp}, \le_{fp}, >_{fp},$ | <u>≥</u> fp | | | Floating-point comparison relations. | | LENGTH( x | ) Length of x, in bits. If x is the word "ele- | | | ment", LENGTH( $x$ ) is the length, in bits, | | | of the element implied by the instruction | | | mnemonic. | | x << y | Result of shifting x left by y bits, filling | | | vacated bits with zeros. | | | $b \leftarrow LENGTH(x)$ | | | result $\leftarrow$ (y < b) ? (x <sub>y:b-1</sub> $ ^{y}$ 0) : $^{b}$ 0 | | x >> <sub>ui</sub> y | Result of shifting x right by y bits, filling | | | vacated bits with zeros. | | | b ← LENGTH(x) | | | result $\leftarrow$ (y < b) ? ( $^{y}$ 0 $x_{0:(b-y)-1}$ ) : $^{b}$ 0 | | x >> y | Result of shifting x right by y bits, filling | | | vacated bits with copies of bit 0 (sign bit) | | | of x. | | | b ← LENGTH(x) | | | result $\leftarrow$ (y <b) (<math="" ?="">^{y}x_{0} \parallel x_{0:(b-y)-1}) : <math>^{b}x_{0}</math></b)> | | x <<< y | Result of rotating x left by y bits. | | | b ← LENGTH(x) | | Chop(x, y) | result $\leftarrow x_{y:b-1} \parallel x_{0:y-1}$<br>Result of extending the right-most y bits of | | Chop(x, y) | x on the left with zeros. | | | | | EVT7() | result $\leftarrow$ x & ((1< <y)-1)< td=""></y)-1)<> | #### Clamp(x, y, z) x is interpreted as a signed integer. If the value of x is less than y, then the value y is returned, else if the value of x is greater than z, the value z is returned, else the value x is returned. ``` \begin{array}{l} \text{if } (x < y) \text{ then} \\ \text{result} \leftarrow y \\ \text{VSCR}_{SAT} \leftarrow 1 \\ \text{else if } (x > z) \text{ then} \\ \text{result} \leftarrow z \\ \text{VSCR}_{SAT} \leftarrow 1 \\ \text{else result} \leftarrow x \end{array} ``` ## RoundToSPIntCeil(x) The value x if x is a single-precision floating-point integer; otherwise the smallest single-precision floating-point integer that is greater than x. ## RoundToSPIntFloor(x) The value x if x is a single-precision floating-point integer; otherwise the largest single-precision floating-point integer that is less than x. #### RoundToSPIntNear(x) The value x if x is a single-precision floating-point integer; otherwise the single-precision floating-point integer that is nearest in value to x (in case of a tie, the even single-precision floating-point integer is used). ## RoundToSPIntTrunc(x) The value x if x is a single-precision floating-point integer; otherwise the largest single-precision floating-point integer that is less than x if x>0, or the smallest single-precision floating-point integer that is greater than x if x<0. ### RoundToNearSP(x) The single-precision floating-point number that is nearest in value to the infinitely-precise floating-point intermediate result x (in case of a tie, the single-precision floating-point value with the least-significant bit equal to 0 is used). ## ReciprocalEstimateSP(x) A single-precision floating-point estimate of the reciprocal of the single-precision floating-point number x. ## ReciprocalSquareRootEstimateSP(x) A single-precision floating-point estimate of the reciprocal of the square root of the single-precision floating-point number x. ## LogBase2EstimateSP(x) A single-precision floating-point estimate of the base 2 logarithm of the single-precision floating-point number x. #### Power2EstimateSP(x) A single-precision floating-point estimate of the 2 raised to the power of the single-precision floating-point number x. $b \leftarrow LENGTH(x)$ result $\leftarrow$ x & ((1<<b)-1) Result of extending x on the left with EXTZ(x) | | | Quadword | | | | | | | | | | | | | | | |---|-------|----------|-------|--------|-------|--------|-------|--------|-------|--------|-------|--------|-------|--------|-------|---------| | | | Wo | rd 0 | | | Wo | rd 1 | | | Wo | ord 2 | | | Wo | rd 3 | | | | Halfw | ord 0 | Halfv | word 1 | Halfv | vord 2 | Halfv | vord 3 | Halfv | vord 4 | Halfv | vord 5 | Halfv | vord 6 | Halfv | vord 7 | | | В0 | B1 | B2 | В3 | B4 | B5 | B6 | B7 | B8 | B9 | B10 | B11 | B12 | B13 | B14 | B15 | | 0 | | 8 | 16 | 24 | 32 | 40 | 48 | 56 | 64 | 72 | 80 | 88 | 96 | 104 | 112 | 120 127 | Figure 97. Vector Register elements ## **6.3 Vector Processor Registers** ## 6.3.1 Vector Registers There are 32 Vector Registers (VRs), each containing 128 bits. See Figure 98. All computations and other data manipulation are performed on data residing in Vector Registers, and results are placed into a VR. | | VR0 | | |---|------|-----| | | VR1 | | | | ••• | | | | | | | | VR30 | | | | VR31 | | | 0 | | 127 | Figure 98. Vector Registers Depending on the instruction, the contents of a Vector Register are interpreted as a sequence of equal-length elements (bytes, halfwords, or words) or as a quadword. Each of the elements is aligned at its natural boundary within the Vector Register, as shown in Figure 97. Many instructions perform a given operation in parallel on all elements in a Vector Register. Depending on the instruction, a byte, halfword, or word element can be interpreted as a signed-integer, an unsigned-integer, or a logical value; a word element can also be interpreted as a single-precision floating-point value. In the instruction descriptions, phrases like "signed-integer word element" are used as shorthand for "word element, interpreted as a signed-integer". Load and Store instructions are provided that transfer a byte, halfword, word, or quadword between storage and a Vector Register. # 6.3.2 Vector Status and Control Register The Vector Status and Control Register (VSCR) is a special 32-bit register (not an SPR) that is read and written in a manner similar to the FPSCR in the Power ISA scalar floating-point unit. Special instructions (*mfvscr* and *mtvscr*) are provided to move the VSCR from and to a vector register. When moved to or from a vector register, the 32-bit VSCR is right justified in the 128-bit vector register. When moved to a vector register, bits 0-95 of the vector register are cleared (set to 0). Figure 99. Vector Status and Control Register The bit definitions for the VSCR are as follows. ## Bit(s) Description 96:110 Reserved #### 111 **Vector Non-Java Mode** (NJ) This bit controls how denormalized values are handled by *Vector Floating-Point* instructions. - 0 Denormalized values are handled as specified by Java and the IEEE standard; see Section 6.6.1. - 1 If an element in a source VR contains a denormalized value, the value 0 is used instead. If an instruction causes an Underflow Exception, the corresponding element in the target VR is set to 0. In both cases the 0 has the same sign as the denormalized or underflowing value. #### 112:126 Reserved #### 127 Vector Saturation (SAT) Every vector instruction having "Saturate" in its name implicitly sets this bit to 1 if any result of that instruction "saturates"; see Section 6.8. *mtvscr* can alter this bit explicitly. This bit is sticky; that is, once set to 1 it remains set to 1 until it is set to 0 by an *mtvscr* instruction. After the *mfvscr* instruction executes, the result in the target vector register will be architecturally precise. That is, it will reflect all updates to the SAT bit that could have been made by vector instructions logically preceding it in the program flow, and further, it will not reflect any SAT updates that may be made to it by vector instructions logically following it in the program flow. To implement this, processors may choose to make the *mfvscr* instruction execution serializing within the vector unit, meaning that it will stall vector instruction execution until all preceding vector instructions are complete and have updated the architectural machine state. This is permitted in order to simplify implementation of the sticky status bit (SAT) which would otherwise be difficult to implement in an out-of-order execution machine. The implication of this is that reading the VSCR can be much slower than typical Vector instructions, and therefore care must be taken in reading it, as advised in Section 6.5.1, to avoid performance problems. The *mtvscr* is context synchronizing. This implies that all Vector instructions logically preceding an *mtvscr* in the program flow will execute in the architectural context (NJ mode) that existed prior to completion of the *mtvscr*, and that all instructions logically following the *mtvscr* will execute in the new context (NJ mode) established by the *mtvscr*. ## 6.3.3 VR Save Register The VR Save Register (VRSAVE) is a 32-bit register provided for application and operating system use. Figure 100.VR Save Register ### Programming Note The VRSAVE register can be used to indicate which VRs are currently being used by a program. If this is done, the operating system could save only those VRs when an "interrupt" occurs (see Book III), and could restore only those VRs when resuming the interrupted program. If this approach is taken it must be applied rigorously; if a program fails to indicate that a given VR is in use, software errors may occur that will be difficult to detect and correct because they are timing-dependent. Some operating systems save and restore VRSAVE only for programs that also use other vector registers. # 6.4 Vector Storage Access Operations The Vector Storage Access instructions provide the means by which data can be copied from storage to a Vector Register or from a Vector Register to storage. Instructions are provided that access byte, halfword, word, and quadword storage operands. These instructions differ from the fixed-point and floating-point Storage Access instructions in that vector storage operands are assumed to be aligned, and vector storage accesses are performed as if the appropriate number of low-order bits of the specified effective address (EA) were zero. For example, the low-order bit of EA is ignored for halfword Vector Storage Access instructions, and the low-order four bits of EA are ignored for quadword Vector Storage Access instructions. The effect is to load or store the storage operand of the specified length that contains the byte addressed by EA. If a storage operand is unaligned, additional instructions must be used to ensure that the operand is correctly placed in a Vector Register or in storage. Instructions are provided that shift and merge the contents of two Vector Registers, such that an unaligned quadword storage operand can be copied between storage and the Vector Registers in a relatively efficient manner. As shown in Figure 97, the elements in Vector Registers are numbered; the high-order (or most significant) byte element is numbered 0 and the low-order (or least significant) byte element is numbered 15. The numbering affects the values that must be placed into the permute control vector for the *Vector Permute* instruction in order for that instruction to achieve the desired effects, as illustrated by the examples in the following subsections. A vector quadword *Load* instruction for which the effective address (EA) is quadword-aligned places the byte in storage addressed by EA into byte element 0 of the target Vector Register, the byte in storage addressed by EA+1 into byte element 1 of the target Vector Register, etc. Similarly, a vector quadword *Store* instruction for which the EA is quadword-aligned places the contents of byte element 0 of the source Vector Register into the byte in storage addressed by EA, the contents of byte element 1 of the source Vector Register into the byte in storage addressed by EA+1, etc. Figure 101 shows an aligned quadword in storage. Figure 102 shows the result of loading that quadword into a Vector Register or, equivalently, shows the contents that must be in a Vector Register if storing that Vector Register is to produce the storage contents shown in Figure 101. When an aligned byte, halfword, or word storage operand is loaded into a Vector Register, the element (byte, halfword, or word respectively) that receives the data is the element that would have received the data had the entire aligned quadword containing the storage operand addressed by EA been loaded. Similarly, when a byte, halfword, or word element in a Vector Register is stored into an aligned storage operand (byte, halfword, or word respectively), the element selected to be stored is the element that would have been stored into the storage operand addressed by EA had the entire Vector Register been stored to the aligned quadword containing the storage operand addressed by EA. (Byte storage operands are always aligned.) For aligned byte, halfword, and word storage operands, if the corresponding element number is known when the program is written, the appropriate *Vector Splat* and *Vector Permute* instructions can be used to copy or replicate the data contained in the storage operand after loading the operand into a Vector Register. An example of this is given in the Programming Note for *Vector Splat*; see page 216. Another example is to replicate the element across an entire Vector Register before storing it into an arbitrary aligned storage operand of the same length; the replication ensures that the correct data are stored regardless of the offset of the storage operand in its aligned quadword in storage. Figure 101. Aligned quadword storage operand Figure 102. Vector Register contents for aligned quadword Load or Store Figure 103. Unaligned quadword storage operand Figure 104. Vector Register contents ## 6.4.1 Accessing Unaligned Storage Operands Figure 103 shows an unaligned quadword storage operand that spans two aligned quadwords. In the remainder of this section, the aligned quadword that contains the most significant bytes of the unaligned quadword is called the most significant quadword (MSQ) and the aligned quadword that contains the least significant bytes of the unaligned quadword is called the least significant quadword (LSQ). Because the Vector Storage Access instructions ignore the low-order bits of the effective address, the unaligned quadword cannot be transferred between storage and a Vector Register using a single instruction. The remainder of this section gives examples of accessing unaligned quadword storage operands. Similar sequences can be used to access unaligned halfword and word storage operands. ## **Programming Note** The sequence of instructions given below is one approach that can be used to load the unaligned quadword shown in Figure 103 into a Vector Register. In Figure 104 Vhi and Vlo are the Vector Registers that will receive the most significant quadword and least significant quadword respectively. VRT is the target Vector Register. After the two quadwords have been loaded into Vhi and Vlo, using Load Vector Indexed instructions, the alignment is performed by shifting the 32-byte quantity Vhi || Vlo left by an amount determined by the address of the first byte of the desired data. The shifting is done using a Vector Permute instruction for which the permute control vector is generated by a Load Vector for Shift Left instruction uses the same address specification as the Load Vector Indexed instruction that loads the Vhi register; this is the address of the desired unaligned quadword. The following sequence of instructions copies the unaligned quadword storage operand into register Vt. ``` # Assumptions: # Rb != 0 and contents of Rb = 0xB lvx Vhi,0,Rb # load MSQ lvsl Vp,0,Rb # set permute control vector addi Rb,Rb,16 # address of LSQ lvx Vlo,0,Rb # load LSQ vperm Vt,Vhi,Vlo,Vp# align the data ``` The procedure for storing an unaligned quadword is essentially the reverse of the procedure for loading one. However, a read-modify-write sequence is required that inserts the source quadword into two aligned quadwords in storage. The quadword to be stored is assumed to be in Vs; see Figure 104 The contents of Vs are shifted right and split into two parts, each of which is merged (using a Vector Select instruction) with the current contents of the two aligned quadwords (MSQ and LSQ) that will contain the most significant bytes and least significant bytes, respectively, of the unaligned quadword. The resulting two quadwords are stored using Store Vector Indexed instructions. A Load Vector for Shift Right instruction is used to generate the permute control vector that is used for the shifting. A single register is used for the "shifted" contents; this is possible because the "shifting" is done by means of a right rotation. The rotation is accomplished by specifying Vs for both components of the Vector Permute instruction. In addition, the same permute control vector is used on a sequence of 1s and Os to generate the mask used by the Vector Select instructions that do the merging. The following sequence of instructions copies the contents of Vs into an unaligned quadword in storage. ``` # Assumptions: # Rb != 0 and contents of Rb = 0xB lvx Vhi,0,Rb # load current MSQ lvsr Vp,0,Rb # set permute control vector addi Rb, Rb, 16 # address of LSQ # load current LSQ lvx Vlo,0,Rb V1s,-1 # generate the select mask bits vspltisb vspltisb V0s,0 vperm Vmask, V0s, V1s, Vp # generate the select mask vperm Vs, Vs, Vs, Vp # right rotate the data Vlo, Vs, Vlo, Vmask # insert LSQ component vsel Vhi, Vhi, Vs, Vmask # insert MSQ component vsel Vlo,0,Rb # store LSQ stvx addi Rb, Rb, -16 # address of MSQ Vhi,0,Rb # store MSQ stvx ``` ## **6.5 Vector Integer Operations** Many of the instructions that produce fixed-point integer results have the potential to compute a result value that cannot be represented in the target format. When this occurs, this unrepresentable intermediate value is converted to a representable result value using one of the following methods. - The high-order bits of the intermediate result that do not fit in the target format are discarded. This method is used by instructions having names that include the word "Modulo". - 2. The intermediate result is converted to the nearest value that is representable in the target format (i.e., to the minimum or maximum representable value, as appropriate). This method is used by instructions having names that include the word "Saturate". An intermediate result that is forced to the minimum or maximum representable value as just described is said to "saturate". - An instruction for which an intermediate result saturates causes VSCR<sub>SAT</sub> to be set to 1; see Section 6.3.2. - If the intermediate result includes non-zero fraction bits it is rounded up to the nearest fixed-point integer value. This method is used by the six Vector Average Integer instructions and by the Vector Multiply-High-Round-Add Signed Halfword Saturate instruction. The latter instruction then uses method 2, if necessary. ## **Programming Note** Because VSCR<sub>SAT</sub> is sticky, it can be used to detect whether any instruction in a sequence of "Saturate"-type instructions produced an inexact result due to saturation. For example, the contents of the VSCR can be copied to a VR (*mfvscr*), bits other than the SAT bit can be cleared in the VR (*vand* with a constant), the result can be compared to zero setting CR6 (*vcmpequb*.), and a branch can be taken according to whether VSCR<sub>SAT</sub> was set to 1 (*Branch Conditional* that tests CR field 6). Testing VSCR<sub>SAT</sub> after each "Saturate"-type instruction would degrade performance considerably. Alternative techniques include the following: - Retain sufficient information at "checkpoints" that the sequence of computations performed between one checkpoint and the next can be redone (more slowly) in a manner that detects exactly when saturation occurs. Test VSCR<sub>SAT</sub> only at checkpoints, or when redoing a sequence of computations that saturated. - Perform intermediate computations using an element length sufficient to prevent saturation, and then use a *Vector Pack Integer Saturate* instruction to pack the final result to the desired length. (Vector Pack Integer Saturate causes results to saturate if necessary, and sets VSCR<sub>SAT</sub> to 1 if any result saturates.) ## 6.5.1 Integer Saturation Saturation occurs whenever the result of a saturating instruction does not fit in the result field. Unsigned saturation clamps results to zero (0) on underflow and to the maximum positive integer value (2<sup>n-1</sup>, e.g. 255 for byte fields) on overflow. Signed saturation clamps results to the smallest representable negative number (-2<sup>n-1</sup>, e.g. -128 for byte fields) on underflow, and to the largest representable positive number (2<sup>n-1</sup>-1, e.g. +127 for byte fields) on overflow. In most cases, the simple maximum/minimum saturation performed by the vector instructions is adequate. However, sometimes, e.g. in the creation of very high quality images, more complex saturation functions must be applied. To support this, the Vector facility provides a mechanism for detecting that saturation has occurred. The VSCR has a bit, the SAT bit, which is set to a one (1) anytime any field in a saturating instruction saturates. The SAT bit can only be cleared by explicitly writing zero to it. Thus SAT accumulates a summary result of any integer overflow or underflow that occurs on a saturating instruction. Borderline cases that generate results equal to saturation values, for example unsigned 0+0=0 and unsigned byte 1+254=255, are not considered saturation conditions and do not cause SAT to be set. The SAT bit can be set by the following types of instructions: - Move To VSCR - Vector Add Integer with Saturation - Vector Subtract Integer with Saturation - Vector Multiply-Add Integer with Saturation - Vector Multiply-Sum with Saturation - Vector Sum-Across with Saturation - Vector Pack with Saturation - Vector Convert to Fixed-point with Saturation Note that only instructions that explicitly call for "saturation" can set SAT. "Modulo" integer instructions and floating-point arithmetic instructions never set SAT. ## **Programming Note** The SAT state can be tested and used to alter program flow by moving the VSCR to a vector register (with *mfvscr*), then masking out bits 0:126 (to clear undefined and reserved bits) and performing a vector compare equal-to unsigned byte w/record (*vcmpequb*.) with zero to get a testable value into the condition register for consumption by a subsequent branch. Since *mfvscr* will be slow compared to other Vector instructions, reading and testing SAT after each instruction would be prohibitively expensive. Therefore, software is advised to employ strategies that minimize checking SAT. For example: checking SAT periodically and backtracking to the last checkpoint to identify exactly which field in which instruction saturated; or, working in an element size sufficient to prevent any overflow or underflow during intermediate calculations, then packing down to the desired element size as the final operation (the vector pack instruction saturates the results and updates SAT when a loss of significance is detected). # 6.6 Vector Floating-Point Operations ## 6.6.1 Floating-Point Overview Unless VSCR<sub>NJ</sub>=1 (see Section 6.3.2), the floating-point model provided by the Vector Processor conforms to The Java Language Specification (hereafter referred to as "Java"), which is a subset of the default environment specified by the IEEE standard (i.e., by ANSI/IEEE Standard 754-1985, "IEEE Standard for Binary Floating-Point Arithmetic"). For aspects of floating-point behavior that are not defined by Java but are defined by the IEEE standard, vector floating-point conforms to the IEEE standard. For aspects of floating-point behavior that are defined neither by Java nor by the IEEE standard but are defined by the "C9X Floating-Point Proposal" (hereafter referred to as "C9X"), vector floating-point conforms to C9X. The single-precision floating-point data format, value representations, and computational models defined in Chapter 4. "Floating-Point Processor [Category: Floating-Point]" on page 99 apply to vector floating-point except as follows. - In general, no status bits are set to reflect the results of floating-point operations. The only exception is that VSCR<sub>SAT</sub> may be set by the Vector Convert To Fixed-Point Word instructions. - With the exception of the two *Vector Convert To Fixed-Point Word* instructions and three of the four *Vector Round to Floating-Point Integer* instructions, all vector floating-point instructions that round use the rounding mode Round to Nearest. - Floating-point exceptions (see Section 6.6.2) cannot cause the system error handler to be invoked. #### Programming Note - If a function is required that is specified by the IEEE standard, is not supported by the Vector Processor, and cannot be emulated satisfactorily using the functions that are supported by the Vector Processor, the functions provided by the Floating-Point Processor should be used; see Chapter 4. ## 6.6.2 Floating-Point Exceptions The following floating-point exceptions may occur during execution of vector floating-point instructions. - NaN Operand Exception - Invalid Operation Exception - Zero Divide Exception - Log of Zero Exception - Overflow Exception - Underflow Exception If an exception occurs, a result is placed into the corresponding target element as described in the following subsections. This result is the default result specified by Java, the IEEE standard, or C9X, as applicable. Recall that denormalized source values are treated as if they were zero when VSCR<sub>NJ</sub>=1. This has the following consequences regarding exceptions. - Exceptions that can be caused by a zero source value can be caused by a denormalized source value when VSCR<sub>N,I</sub>=1. - Exceptions that can be caused by a nonzero source value cannot be caused by a denormalized source value when VSCR<sub>N.I</sub>=1. ## 6.6.2.1 NaN Operand Exception A NaN Operand Exception occurs when a source value for any of the following instructions is a NaN. - A vector instruction that would normally produce floating-point results - Either of the two Vector Convert To Fixed-Point Word instructions - Any of the four Vector Floating-Point Compare instructions The following actions are taken: If the vector instruction would normally produce floating-point results, the corresponding result is a source NaN selected as follows. In all cases, if the selected source NaN is a Signaling NaN it is converted to the corresponding Quiet NaN (by setting the high-order bit of the fraction field to 1) before being placed into the target element. if the element in VRA is a NaN then the result is that NaN else if the element in VRB is a NaN then the result is that NaN else if the element in VRC is a NaN then the result is that NaN else if Invalid Operation exception (Section 6.6.2.2) then the result is the QNaN 0x7FC0\_0000 If the instruction is either of the two *Vector Convert To Fixed-Point Word* instructions, the corresponding result is 0x0000\_0000. VSCR<sub>SAT</sub> is not affected. If the instruction is *Vector Compare Bounds Floating-Point*, the corresponding result is 0xC000\_0000. If the instruction is one of the other *Vector Floating-Point Compare* instructions, the corresponding result is $0x0000\_0000$ . ## 6.6.2.2 Invalid Operation Exception An Invalid Operation Exception occurs when a source value or set of source values is invalid for the specified operation. The invalid operations are: - Magnitude subtraction of infinities - Multiplication of infinity by zero - Reciprocal square root estimate of a negative, nonzero number or -infinity. - Log base 2 estimate of a negative, nonzero number or -infinity. The corresponding result is the QNaN 0x7FC0\_0000. ## 6.6.2.3 Zero Divide Exception A Zero Divide Exception occurs when a Vector Reciprocal Estimate Floating-Point or Vector Reciprocal Square Root Estimate Floating-Point instruction is executed with a source value of zero. The corresponding result is an infinity, where the sign is the sign of the source value. ## 6.6.2.4 Log of Zero Exception A Log of Zero Exception occurs when a *Vector Log Base 2 Estimate Floating-Point* instruction is executed with a source value of zero. The corresponding result is -Infinity. ## 6.6.2.5 Overflow Exception An Overflow Exception occurs under either of the following conditions. - For a vector instruction that would normally produce floating-point results, the magnitude of what would have been the result if the exponent range were unbounded exceeds that of the largest finite floating-point number for the target floating-point format. - For either of the two *Vector Convert To Fixed-Point Word* instructions, either a source value is an infinity or the product of a source value and 2<sup>UIM</sup> is a number too large in magnitude to be represented in the target fixed-point format. The following actions are taken: - If the vector instruction would normally produce floating-point results, the corresponding result is an infinity, where the sign is the sign of the intermediate result. - If the instruction is Vector Convert To Unsigned Fixed-Point Word Saturate, the corresponding result is 0xFFFF\_FFFF if the source value is a positive number or +infinity, and is 0x0000\_0000 if the source value is a negative number or -infinity. VSCR<sub>SAT</sub> is set to 1. If the instruction is Vector Convert To Signed Fixed-Point Word Saturate, the corresponding result is 0x7FFF\_FFFF if the source value is a positive number or +infinity., and is 0x8000\_0000 if the source value is a negative number or -infinity. VSCR<sub>SAT</sub> is set to 1. ## 6.6.2.6 Underflow Exception An Underflow Exception can occur only for vector instructions that would normally produce floating-point results. It is detected before rounding. It occurs when a nonzero intermediate result computed as though both the precision and the exponent range were unbounded is less in magnitude than the smallest normalized floating-point number for the target floating-point format. The following actions are taken: - If VSCR<sub>NJ</sub>=0, the corresponding result is the value produced by denormalizing and rounding the intermediate result. - If VSCR<sub>NJ</sub>=1, the corresponding result is a zero, where the sign is the sign of the intermediate result # 6.7 Vector Storage Access Instructions The Storage Access instructions compute the effective address (EA) of the storage to be accessed as described in Section 1.10.3, "Effective Address Calculation" on page 26. The low-order bits of the EA that would correspond to an unaligned storage operand are ignored. The Load Vector Element Indexed and Store Vector Element Indexed instructions transfer a byte, halfword, or word element between storage and a Vector Register. The Load Vector Indexed and Store Vector Indexed instructions transfer an aligned quadword between storage and a Vector Register. ## 6.7.1 Storage Access Exceptions Storage accesses will cause the system data storage error handler to be invoked if the program is not allowed to modify the target storage (*Store* only), or if the program attempts to access storage that is unavailable. ## 6.7.2 Vector Load Instructions The aligned byte, halfword, word, or quadword in storage addressed by EA is loaded into register VRT. #### Programming Note - The Load Vector Element instructions load the specified element into the same location in the target register as the location into which it would be loaded using the Load Vector instruction. ## Load Vector Element Byte Indexed X-form Ivebx VRT,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) eb \leftarrow EA<sub>60:63</sub> VRT \leftarrow undefined if Big-Endian byte ordering then VRT_{8\times eb:8\times eb+7} \leftarrow \text{MEM}(EA,1) else VRT_{120-(8\times eb):127-(8\times eb)} \leftarrow \text{MEM}(EA,1) ``` Let the effective address (EA) be the sum (RA|0)+(RB). Let eb be bits 60:63 of EA. If Big-Endian byte ordering is used for the storage access, the contents of the byte in storage at address EA are placed into byte eb of register VRT. The remaining bytes in register VRT are set to undefined values. If Category: Vector.Little-Endian is supported, then if Little-Endian byte ordering is used for the storage access, the contents of the byte in storage at address EA are placed into byte 15-eb of register VRT. The remaining bytes in register VRT are set to undefined values. ## **Special Registers Altered:** None ## Load Vector Element Halfword Indexed X-form Ivehx VRT,RA,RB Let the effective address (EA) be the result of ANDing 0xFFFF\_FFFF\_FFFF with the sum (RA|0)+(RB). Let eb be bits 60:63 of EA. If Big-Endian byte ordering is used for the storage access, - the contents of the byte in storage at address EA are placed into byte eb of register VRT, - the contents of the byte in storage at address EA+1 are placed into byte eb+1 of register VRT, and - the remaining bytes in register VRT are set to undefined values. If Category: Vector.Little-Endian is supported, then if Little-Endian byte ordering is used for the storage access, - the contents of the byte in storage at address EA are placed into byte 15-eb of register VRT, - the contents of the byte in storage at address EA+1 are placed into byte 14-eb of register VRT, and - the remaining bytes in register VRT are set to undefined values. ## **Special Registers Altered:** Ivewx ## Load Vector Element Word Indexed X-form 31 VRT RA RB 71 / ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow (b + (RB)) & 0xFFFF_FFFF_FFFC eb \leftarrow EA<sub>60:63</sub> VRT \leftarrow undefined if Big-Endian byte ordering then VRT<sub>8×eb:8×eb+31</sub> \leftarrow MEM(EA,4) else VRT<sub>96-(8×eb):127-(8×eb)</sub> \leftarrow MEM(EA,4) ``` VRT,RA,RB Let the effective address (EA) be the result of ANDing 0xFFFF\_FFFF\_FFFC with the sum (RA|0)+(RB). Let eb be bits 60:63 of EA. If Big-Endian byte ordering is used for the storage access, - the contents of the byte in storage at address EA are placed into byte eb of register VRT, - the contents of the byte in storage at address EA+1 are placed into byte eb+1 of register VRT. - the contents of the byte in storage at address EA+2 are placed into byte eb+2 of register - the contents of the byte in storage at address EA+3 are placed into byte eb+3 of register VRT, and - the remaining bytes in register VRT are set to undefined values. If Category: Vector.Little-Endian is supported, then if Little-Endian byte ordering is used for the storage access, - the contents of the byte in storage at address EA are placed into byte 15-eb of register VRT, - the contents of the byte in storage at address EA+1 are placed into byte 14-eb of register VRT, - the contents of the byte in storage at address EA+2 are placed into byte 13-eb of register VRT - the contents of the byte in storage at address EA+3 are placed into byte 12-eb of register VRT, and - the remaining bytes in register VRT are set to undefined values. #### **Special Registers Altered:** None ## Load Vector Indexed X-form lvx VRT,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) VRT \leftarrow MEM(EA & 0xFFFF FFFF FFFF FFF0, 16) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The quadword in storage addressed by the result of EA ANDed with 0xFFFF\_FFFF\_FFFF is loaded into VRT. ## **Special Registers Altered:** None ### Load Vector Indexed LRU X-form IvxI VRT,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) VRT \leftarrow MEM(EA & 0xFFFF_FFFF_FFFF, 16) mark_as_not_likely_to_be_needed_again_anytime_soon (EA) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The quadword in storage addressed by the result of EA ANDed with 0xFFFF\_FFFF\_FFFF0 is loaded into VRT. *IvxI* provides a hint that the quadword in storage addressed by EA will probably not be needed again by the program in the near future. #### **Special Registers Altered:** ## - Programming Note On some implementations, the hint provided by the IvxI instruction and the corresponding hint provided by the stvxl, Ivepxl, and stvepxl instructions are applied to the entire cache block containing the specified quadword. On such implementations, the effect of the hint may be to cause that cache block to be considered a likely candidate for replacement when space is needed in the cache for a new block. Thus, on such implementations, the hint should be used with caution if the cache block containing the quadword also contains data that may be needed by the program in the near future. Also, the hint may be used before the last reference in a sequence of references to the quadword if the subsequent references are likely to occur sufficiently soon that the cache block containing the quadword is not likely to be displaced from the cache before the last reference. ## 6.7.3 Vector Store Instructions Some portion or all of the contents of VRS are stored into the aligned byte, halfword, word, or quadword in storage addressed by EA. #### Programming Note - The Store Vector Element instructions store the specified element into the same storage location as the location into which it would be stored using the Store Vector instruction. ## Store Vector Element Byte Indexed X-form stvebx VRS,RA,RB | 31 | VRS | RA | RB | 135 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) eb \leftarrow EA<sub>60:63</sub> if Big-Endian byte ordering then MEM(EA,1) \leftarrow VRS<sub>8×eb:8×eb+7</sub> else MEM(EA,1) \leftarrow VRS<sub>120-(8×eb):127-(8×eb)</sub> ``` Let the effective address (EA) be the sum (RA|0)+(RB). Let eb be bits 60:63 of EA. If Big-Endian byte ordering is used for the storage access, the contents of byte eb of register VRS are placed in the byte in storage at address EA. If Category: Vector.Little-Endian is supported, then if Little-Endian byte ordering is used for the storage access, the contents of byte 15-eb of register VRS are placed in the byte in storage at address EA. #### **Special Registers Altered:** None #### Programming Note Unless bits 60:63 of the address are known to match the byte offset of the subject byte element in register VRS, software should use *Vector Splat* to splat the subject byte element before performing the store. ## Store Vector Element Halfword Indexed X-form stvehx VRS,RA,RB | 31 | VRS | RA | RB | 167 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow (b + (RB)) & 0xFFFF_FFFF_FFFF_FFFE eb \leftarrow EA<sub>60:63</sub> if Big-Endian byte ordering then MEM(EA,2) \leftarrow VRS<sub>8×eb:8×eb+15</sub> else MEM(EA,2) \leftarrow VRS<sub>112-(8×eb):127-(8×eb)</sub> ``` Let the effective address (EA) be the result of ANDing 0xFFFF\_FFFF\_FFFF with the sum (RA|0)+(RB). Let eb be bits 60:63 of EA. If Big-Endian byte ordering is used for the storage access, - the contents of byte eb of register VRS are placed in the byte in storage at address EA, and - the contents of byte eb+1 of register VRS are placed in the byte in storage at address EA+1. If Category: Vector.Little-Endian is supported, then if Little-Endian byte ordering is used for the storage access, - the contents of byte 15-eb of register VRS are placed in the byte in storage at address EA, - the contents of byte 14-eb of register VRS are placed in the byte in storage at address EA+1. ## **Special Registers Altered:** None ### Programming Note - Unless bits 60:62 of the address are known to match the halfword offset of the subject halfword element in register VRS software should use *Vector Splat* to splat the subject halfword element before performing the store. ## Store Vector Element Word Indexed X-form stvewx VRS,RA,RB | 31 | VRS | RA | RB | 199 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow (b + (RB)) & 0xFFFF_FFFF_FFFC eb \leftarrow EA<sub>60:63</sub> if Big-Endian byte ordering then MEM(EA,4) \leftarrow VRS<sub>8×eb:8×eb+31</sub> else MEM(EA,4) \leftarrow VRS<sub>96-(8×eb):127-(8×eb)</sub> ``` Let the effective address (EA) be the result of ANDing 0xFFFF\_FFFF\_FFFC with the sum (RA|0)+(RB). Let eb be bits 60:63 of EA. If Big-Endian byte ordering is used for the storage access, - the contents of byte eb of register VRS are placed in the byte in storage at address EA, - the contents of byte eb+1 of register VRS are placed in the byte in storage at address EA+1, - the contents of byte eb+2 of register VRS are placed in the byte in storage at address EA+2, and - the contents of byte eb+3 of register VRS are placed in the byte in storage at address EA+3. If Category: Vector.Little-Endian is supported, then if Little-Endian byte ordering is used for the storage access. - the contents of byte 15-eb of register VRS are placed in the byte in storage at address EA, - the contents of byte 14-eb of register VRS are placed in the byte in storage at address EA+1, - the contents of byte 13-eb of register VRS are placed in the byte in storage at address EA+2, and - the contents of byte 12-eb of register VRS are placed in the byte in storage at address EA+3. ### **Special Registers Altered:** None #### Programming Note Unless bits 60:61 of the address are known to match the word offset of the subject word element in register VRS, software should use *Vector Splat* to splat the subject word element before performing the store. ## Store Vector Indexed X-form stvx VRS,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA & 0xFFFF FFFF FFFF FFF0, 16) \leftarrow (VRS) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The contents of VRS are stored into the quadword in storage addressed by the result of EA ANDed with 0xFFFF\_FFFF\_FFFF\_FFFO. ## **Special Registers Altered:** None #### Store Vector Indexed LRU X-form stvxl VRS,RA,RB Let the effective address (EA) be the sum (RA|0)+(RB). The contents of VRS are stored into the quadword in storage addressed by the result of EA ANDed with 0xFFFF\_FFFF\_FFFF\_FFF0. **stvxI** provides a hint that the quadword in storage addressed by EA will probably not be needed again by the program in the near future. #### **Special Registers Altered:** None #### Programming Note See the Programming Note for the *IvxI* instruction on page 204. # 6.7.4 Vector Alignment Support Instructions #### Programming Note The *IvsI* and *Ivsr* instructions can be used to create the permute control vector to be used by a subsequent *vperm* instruction (see page 217). Let X and Y be the contents of register VRA and VRB specified by the *vperm*. The control vector created by *IvsI* causes the *vperm* to select the high-order 16 bytes of the result of shifting the 32-byte value X || Y left by sh bytes. The control vector created by *Ivsr* causes the *vperm* to select the low-order 16 bytes of the result of shifting X || Y right by sh bytes. #### Programming Note Examples of uses of *IvsI*, *Ivsr*, and *vperm* to load and store unaligned data are given in Section 6.4.1. These instructions can also be used to rotate or shift the contents of a Vector Register left (*IvsI*) or right (*IvsI*) by sh bytes. For rotating, the Vector Register to be rotated should be specified as both register VRA and VRB for *vperm*. For shifting left, VRB for *vperm* should be a register containing all zeros and VRA should contain the value to be shifted, and vice versa for shifting right. ## Load Vector for Shift Left Indexed X-form #### Ivsl VRT,RA,RB ``` if RA = 0 then b \leftarrow 0 b \leftarrow (RA) sh \leftarrow (b + (RB))_{60:63} switch(sh) case(0x0): VRT←0x000102030405060708090A0B0C0D0E0F case(0x1): VRT←0x0102030405060708090A0B0C0D0E0F10 case(0x2): VRT←0x02030405060708090A0B0C0D0E0F1011 case(0x3): VRT←0x030405060708090A0B0C0D0E0F101112 case(0x4): VRT←0x0405060708090A0B0C0D0E0F10111213 case(0x5): VRT←0x05060708090A0B0C0D0E0F1011121314 case(0x6): VRT←0x060708090A0B0C0D0E0F101112131415 case(0x7): VRT←0x0708090A0B0C0D0E0F10111213141516 case(0x8): VRT←0x08090A0B0C0D0E0F1011121314151617 case(0x9): VRT←0x090A0B0C0D0E0F101112131415161718 case(0xA): VRT←0x0A0B0C0D0E0F10111213141516171819 case(0xB): VRT←0x0B0C0D0E0F101112131415161718191A case(0xC): VRT←0x0C0D0E0F101112131415161718191A1B case(0xD): VRT←0x0D0E0F101112131415161718191A1B1C case(0xE): VRT←0x0E0F101112131415161718191A1B1C1D case(0xF): VRT \leftarrow 0x0F101112131415161718191A1B1C1D1E ``` Let sh be bits 60:63 of the sum (RA|0)+(RB). Let X be the 32 byte value $0x00 \parallel 0x01 \parallel 0x02 \parallel ... \parallel 0x1E \parallel 0x1F$ . Bytes sh to sh+15 of X are placed into VRT. #### **Special Registers Altered:** None ## Load Vector for Shift Right Indexed X-form lvsr VRT,RA,RB ``` if RA = 0 then b \leftarrow 0 b \leftarrow (RA) sh \leftarrow (b + (RB))_{60:63} switch(sh) case(0x0): VRT \leftarrow 0x101112131415161718191A1B1C1D1E1F case(0x1): VRT←0x0F101112131415161718191A1B1C1D1E case (0x2): VRT\leftarrow 0x0E0F101112131415161718191A1B1C1D case(0x3): VRT←0x0D0E0F101112131415161718191A1B1C case(0x4): VRT←0x0C0D0E0F101112131415161718191A1B case(0x5): VRT←0x0B0C0D0E0F101112131415161718191A case(0x6): VRT←0x0A0B0C0D0E0F10111213141516171819 case(0x7): VRT←0x090A0B0C0D0E0F101112131415161718 case(0x8): VRT←0x08090A0B0C0D0E0F1011121314151617 case(0x9): VRT←0x0708090A0B0C0D0E0F10111213141516 case(0xA): VRT←0x060708090A0B0C0D0E0F101112131415 case(0xB): VRT←0x05060708090A0B0C0D0E0F1011121314 case(0xC): VRT←0x0405060708090A0B0C0D0E0F10111213 case(0xD): VRT←0x030405060708090A0B0C0D0E0F101112 case(0xE): VRT←0x02030405060708090A0B0C0D0E0F1011 case(0xF): VRT←0x0102030405060708090A0B0C0D0E0F10 ``` Let sh be bits 60:63 of the sum (RA|0)+(RB). Let X be the 32-byte value $0x00 \parallel 0x01 \parallel 0x02 \parallel ... \parallel 0x1E \parallel 0x1F$ . Bytes 16-sh to 31-sh of X are placed into VRT. #### **Special Registers Altered:** ## 6.8 Vector Permute and Formatting Instructions ## 6.8.1 Vector Pack and Unpack Instructions #### Vector Pack Pixel VX-form vpkpx VRT,VRA,VRB | 4 | VRT | VRA | VRB | 782 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i \leftarrow 0 to 63 by 16 VRT_i \leftarrow (VRA)_{i\times 2+7} VRT_{i+1:i+5} \leftarrow (VRA)_{i\times 2+8:i\times 2+12} VRT_{i+6:i+10} \leftarrow (VRA)_{i\times 2+16:i\times 2+20} VRT_{i+11:i+15} \leftarrow (VRA)_{i\times 2+24:i\times 2+28} VRT_{i+64} \leftarrow (VRB)_{i\times 2+7} VRT_{i+65:i+69} \leftarrow (VRB)_{i\times 2+8:i\times 2+12} VRT_{i+70:i+74} \leftarrow (VRB)_{i\times 2+16:i\times 2+20} VRT_{i+75:i+79} \leftarrow (VRB)_{i\times 2+24:i\times 2+28} ``` Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element *i* from 0 to 7, do the following. Word element *i* in the source vector is packed to produce a 16-bit value as described below. - bit 7 of the first byte (bit 7 of the word) - bits 0:4 of the second byte (bits 8:12 of the word) - bits 0:4 of the third byte (bits 16:20 of the word) - bits 0:4 of the fourth byte (bits 24:28 of the word) The result is placed into halfword element *i* of VRT. ## **Special Registers Altered:** None ## **Programming Note** Each source word can be considered to be a 32-bit "pixel", consisting of four 8-bit "channels". Each target halfword can be considered to be a 16-bit pixel, consisting of one 1-bit channel and three 5-bit channels. A channel can be used to specify the intensity of a particular color, such as red, green, or blue, or to provide other information needed by the application. ## Vector Pack Signed Halfword Signed Saturate VX-form vpkshss VRT,VRA,VRB | 4 | VRT | VRA | VRB | 398 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` do i=0 to 63 by 8 \begin{array}{l} \text{VRT}_{\text{i}:\text{i+7}} \leftarrow \\ \text{Clamp}\left(\text{EXTS}\left(\left(\text{VRA}\right)_{\text{i}\times2:\text{i}\times2+15}\right), -128, 127\right)_{24:31} \\ \text{VRT}_{\text{i+64}:\text{i+71}} \leftarrow \\ \text{Clamp}\left(\text{EXTS}\left(\left(\text{VRB}\right)_{\text{i}\times2:\text{i}\times2+15}\right), -128, 127\right)_{24:31} \end{array} ``` Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element i from 0 to 15, do the following. Signed-integer halfword element *i* in the source vector is converted to an signed-integer byte. - If the value of the element is greater than 127 the result saturates to 127 - If the value of the element is less than -128 the result saturates to -128. The low-order 8 bits of the result is placed into byte element *i* of VRT. ## **Special Registers Altered:** SAT ## Vector Pack Signed Word Signed Saturate VX-form vpkswss VRT,VRA,VRB | 4 | VRT | VRA | VRB | 462 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` do i=0 to 63 by 16 \begin{array}{l} \text{VRT}_{i:i+15} \\ \leftarrow \text{ Clamp}\left(\text{EXTS}\left(\left(\text{VRA}\right)_{i\times 2:i\times 2+31}, \ -2^{15}, \ 2^{15}-1\right)_{16:31} \\ \text{VRT}_{i+64:i+79} \\ \leftarrow \text{ Clamp}\left(\text{EXTS}\left(\left(\text{VRB}\right)_{i\times 2:i\times 2+31}, \ -2^{15}, \ 2^{15}-1\right)_{16:31} \end{array} ``` Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element *i* from 0 to 7, do the following. Signed-integer word element *i* in the source vector is converted to an signed-integer halfword. - If the value of the element is greater than 2<sup>15</sup>-1 the result saturates to 2<sup>15</sup>-1 - If the value of the element is less than -2<sup>15</sup> the result saturates to -2<sup>15</sup>. The low-order 16 bits of the result is placed into halfword element *i* of VRT. ## **Special Registers Altered:** SAT ## Vector Pack Signed Halfword Unsigned Saturate VX-form vpkshus VRT,VRA,VRB | 4 | VRT | VRA | VRB | 270 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` do i=0 to 63 by 8 \begin{array}{c} \text{VRT}_{\text{i:i+7}} \\ \leftarrow \text{Clamp}\left(\text{EXTS}\left(\left(\text{VRA}\right)_{\text{i}\times2:\text{i}\times2+15}\right), \ 0, \ 255\right)_{24:31} \\ \text{VRT}_{\text{i+64:i+71}} \\ \leftarrow \text{Clamp}\left(\text{EXTS}\left(\left(\text{VRB}\right)_{\text{i}\times2:\text{i}\times2+15}\right), \ 0, \ 255\right)_{24:31} \end{array} ``` Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element i from 0 to 15, do the following. Signed-integer halfword element *i* in the source vector is converted to an unsigned-integer byte. - If the value of the element is greater than 255 the result saturates to 255 - If the value of the element is less than 0 the result saturates to 0. The low-order 8 bits of the result is placed into byte element *i* of VRT. #### **Special Registers Altered:** SAT ## Vector Pack Signed Word Unsigned Saturate VX-form vpkswus VRT,VRA,VRB ``` \begin{array}{l} \text{do i=0 to 63 by 16} \\ \text{VRT}_{i:i+15} \\ \leftarrow \text{Clamp}\left(\text{EXTS}\left(\left(\text{VRA}\right)_{i\times 2:i\times 2+31}\right), \ 0, \ 2^{16}\text{--}1\right)_{16:31} \\ \text{VRT}_{i+64:i+79} \\ \leftarrow \text{Clamp}\left(\text{EXTS}\left(\left(\text{VRB}\right)_{i\times 2:i\times 2+31}\right), \ 0, \ 2^{16}\text{--}1\ \right)_{16:31} \end{array} ``` Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element *i* from 0 to 7, do the following. Signed-integer word element *i* in the source vector is converted to an unsigned-integer halfword. - If the value of the element is greater than 2<sup>16</sup>-1 the result saturates to 2<sup>16</sup>-1 - If the value of the element is less than 0 the result saturates to 0. The low-order 16 bits of the result is placed into halfword element *i* of VRT. ### **Special Registers Altered:** SAT ## Vector Pack Unsigned Halfword Unsigned Modulo VX-form vpkuhum VRT,VRA,VRB | 4 | VRT | VRA | VRB | 14 | |---|-----|-----|-----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` do i=0 to 63 by 8 VRT_{i:i+7} \leftarrow (VRA)_{i\times 2+8: i\times 2+15} VRT_{i+64:i+71} \leftarrow (VRB)_{i\times 2+8: i\times 2+15} ``` Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element *i* from 0 to 15, do the following. The contents of bits 8:15 of halfword element *i* in the source vector is placed into byte element *i* of VRT ## **Special Registers Altered:** None ## Vector Pack Unsigned Word Unsigned Modulo VX-form vpkuwum VRT,VRA,VRB | 4 | VRT | VRA | VRB | 78 | | |---|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 63 by 16 VRT_{i:i+15} \leftarrow (VRA)_{i\times2+16:i\times2+31} VRT_{i+64:i+79} \leftarrow (VRB)_{i\times2+16:i\times2+31} ``` Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element *i* from 0 to 7, do the following. The contents of bits 16:31 of word element *i* in the source vector is placed into halfword element *i* of VRT. ## **Special Registers Altered:** None ## Vector Pack Unsigned Halfword Unsigned Saturate VX-form vpkuhus VRT,VRA,VRB | 4 | VRT | VRA | VRB | 142 | 1 | |---|-----|-----|-----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | ``` do i=0 to 63 by 8 \begin{array}{l} VRT_{i:i+7} \\ \leftarrow \mbox{Clamp( EXTZ((VRA)}_{i\times 2:i\times 2+15)}, \ 0, \ 255 \ )_{24:31} \\ VRT_{i+64:i+71} \\ \leftarrow \mbox{Clamp( EXTZ((VRB)}_{i\times 2:i\times 2+15}), \ 0, \ 255 \ )_{24:31} \end{array} ``` Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element *i* from 0 to 15, do the following. Unsigned-integer halfword element *i* in the source vector is converted to an unsigned-integer byte. - If the value of the element is greater than 255 the result saturates to 255. The low-order 8 bits of the result is placed into byte element *i* of VRT. ## **Special Registers Altered:** SAT ## Vector Pack Unsigned Word Unsigned Saturate VX-form vpkuwus VRT,VRA,VRB ``` do i=0 to 63 by 16 \begin{array}{l} {\rm VRT_{i:i+15}} \\ \leftarrow {\rm Clamp(\ EXTZ((VRA)_{i\times 2:i\times 2+31}\ ),\ 0,\ 2^{16}\text{--}1\ )_{16:31}} \\ {\rm VRT_{i+64:i+79}} \\ \leftarrow {\rm Clamp(\ EXTZ((VRB)_{i\times 2:i\times 2+31}\ ),\ 0,\ 2^{16}\text{--}1\ )_{16:31}} \end{array} ``` Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element *i* from 0 to 7, do the following. Unsigned-integer word element *i* in the source vector is converted to an unsigned-integer halfword. If the value of the element is greater than 2<sup>16</sup>-1 the result saturates to 2<sup>16</sup>-1. The low-order 16 bits of the result is placed into halfword element *i* of VRT. #### **Special Registers Altered:** SAT ## Vector Unpack High Pixel VX-form #### vupkhpx VRT,VRB | 4 | VRT | /// | VRB | 846 | |---|-----|-----|-----|-------| | 0 | 6 | 11 | | 21 31 | ``` \begin{array}{lll} \text{do i=0 to 63 by 16} \\ & \text{VRT}_{\text{i}\times2:\text{i}\times2+7} & \leftarrow & \text{EXTS((VRB)}_{\text{i}} & ) \\ & \text{VRT}_{\text{i}\times2+8:\text{i}\times2+15} & \leftarrow & \text{EXTZ((VRB)}_{\text{i}+1:\text{i}+5} & ) \\ & \text{VRT}_{\text{i}\times2+16:\text{i}\times2+23} & \leftarrow & \text{EXTZ((VRB)}_{\text{i}+6:\text{i}+10} & ) \\ & \text{VRT}_{\text{i}\times2+24:\text{i}\times2+31} & \leftarrow & \text{EXTZ((VRB)}_{\text{i}+11:\text{i}+15}) \end{array} ``` For each vector element *i* from 0 to 3, do the following. Halfword element *i* in VRB is unpacked as follows. - sign-extend bit 0 of the halfword to 8 bits - zero-extend bits 1:5 of the halfword to 8 bits - zero-extend bits 6:10 of the halfword to 8 hits - zero-extend bits 11:15 of the halfword to 8 bits The result is placed in word element *i* of VRT. ## **Special Registers Altered:** None ## **Programming Note** The source and target elements can be considered to be 16-bit and 32-bit "pixels" respectively, having the formats described in the Programming Note for the *Vector Pack Pixel* instruction on page 209. ## - Programming Note Notice that the unpacking done by the *Vector Unpack Pixel* instructions does not reverse the packing done by the *Vector Pack Pixel* instruction. Specifically, if a 16-bit pixel is unpacked to a 32-bit pixel which is then packed to a 16-bit pixel, the resulting 16-bit pixel will not, in general, be equal to the original 16-bit pixel (because, for each channel except the first, *Vector Unpack Pixel* inserts high-order bits while *Vector Pack Pixel* discards low-order bits). ## Vector Unpack High Signed Byte VX-form vupkhsb VRT,VRB | 4 | VRT | /// | VRB | 526 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 3 | 31 | ``` do i=0 to 63 by 8 \text{VRT}_{\texttt{i} \times 2: \texttt{i} \times 2+15} \leftarrow \texttt{EXTS} \left( \left( \texttt{VRB} \right)_{\texttt{i}: \texttt{i}+7} \right) ``` For each vector element *i* from 0 to 7, do the following. Signed-integer byte element *i* in VRB is sign-extended to produce a signed-integer half-word and placed into halfword element *i* in VRT. ## **Special Registers Altered:** None ## Vector Unpack High Signed Halfword VX-form vupkhsh VRT,VRB | 4 | VRT | /// | VRB | 590 | 1 | |---|-----|-----|-----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | ``` do i=0 to 63 by 16 \text{VRT}_{\text{i}\times 2: \text{i}\times 2+31} \leftarrow \text{EXTS}\left(\left(\text{VRB}\right)_{\text{i}: \text{i}+15}\right) ``` For each vector element *i* from 0 to 3, do the following. Signed-integer halfword element *i* in VRB is sign-extended to produce a signed-integer word and placed into word element *i* in VRT. #### **Special Registers Altered:** ## Vector Unpack Low Pixel VX-form ## vupklpx VRT,VRB | 4 | VRT | /// | VRB | 974 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{lll} \text{do i=0 to 63 by 16} \\ & \text{VRT}_{\texttt{i}\times2:\texttt{i}\times2+7} & \leftarrow & \text{EXTS((VRB)}_{\texttt{i}+64} & ) \\ & \text{VRT}_{\texttt{i}\times2+8:\texttt{i}\times2+15} & \leftarrow & \text{EXTZ((VRB)}_{\texttt{i}+65:\texttt{i}+69}) \\ & \text{VRT}_{\texttt{i}\times2+16:\texttt{i}\times2+23} & \leftarrow & \text{EXTZ((VRB)}_{\texttt{i}+70:\texttt{i}+74}) \\ & \text{VRT}_{\texttt{i}\times2+24:\texttt{i}\times2+31} & \leftarrow & \text{EXTZ((VRB)}_{\texttt{i}+75:\texttt{i}+79}) \end{array} ``` For each vector element i from 0 to 3, do the following. Halfword element *i*+4 in VRB is unpacked as follows. - sign-extend bit 0 of the halfword to 8 bits - zero-extend bits 1:5 of the halfword to 8 bits - zero-extend bits 6:10 of the halfword to 8 bits - zero-extend bits 11:15 of the halfword to 8 bits The result is placed in word element *i* of VRT. ## **Special Registers Altered:** None ## Vector Unpack Low Signed Byte VX-form vupklsb VRT,VRB | ĺ | 4 | VRT | /// | VRB | 654 | 1 | |---|---|-----|-----|-----|-------|---| | | 0 | 6 | 11 | 16 | 21 3° | 1 | ``` do i=0 to 63 by 8 \text{VRT}_{\text{i} \times 2: \text{i} \times 2+15} \leftarrow \text{EXTS} \left( \left( \text{VRB} \right)_{\text{i}+64: \text{i}+71} \right) ``` For each vector element *i* from 0 to 7, do the following. Signed-integer byte element *i*+8 in VRB is sign-extended to produce a signed-integer halfword and placed into halfword element *i* in VRT. ## **Special Registers Altered:** None ## Vector Unpack Low Signed Halfword VX-form vupklsh VRT,VRB | 4 | VRT | /// | VRB | 718 | | |---|-----|-----|-----|-----|----| | o | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 63 by 16 VRT_{i \times 2: i \times 2+31} \leftarrow EXTS((VRB)_{i+64: i+79}) ``` For each vector element *i* from 0 to 3, do the following. Signed-integer halfword element *i*+4 in VRB is sign-extended to produce a signed-integer word and placed into word element *i* in VRT. ## **Special Registers Altered:** ## **6.8.2 Vector Merge Instructions** ## Vector Merge High Byte ## VX-form ## Vector Merge High Halfword VX-form **VRB** 76 vmrghb VRT,VRA,VRB | 4 | VRT | VRA | VRB | 12 | | |---|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | do i=0 to 63 by 8 $$VRT_{i\times 2: i\times 2+7} \leftarrow (VRA)_{i:i+7}$$ $VRT_{i\times 2+8: i\times 2+15} \leftarrow (VRB)_{i:i+7}$ For each vector element *i* from 0 to 7, do the following. Byte element *i* in VRA is placed into byte element 2xi in VRT. Byte element *i* in VRB is placed into byte element 2xi+1 in VRT. ## **Special Registers Altered:** None ## do i=0 to 63 by 16 $VRT_{i\times 2: i\times 2+15} \leftarrow (VRA)_{i:i+15}$ $VRT_{i\times 2+16:i\times 2+31} \leftarrow (VRB)_{i:i+15}$ VRT,VRA,VRB VRA For each vector element *i* from 0 to 3, do the following. Halfword element *i* in VRA is placed into halfword element 2xi in VRT. Halfword element *i* in VRB is placed into halfword element 2xi+1 in VRT. ## **Special Registers Altered:** None vmrghh ## Vector Merge High Word VX-form vmrghw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 140 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 63 by 32 VRT_{i\times 2: i\times 2+31} \leftarrow (VRA)_{i:i+31} VRT_{i\times 2+32: i\times 2+63} \leftarrow (VRB)_{i:i+31} ``` For each vector element *i* from 0 to 1, do the following. Word element *i* in VRA is placed into word element 2xi in VRT. Word element *i* in VRB is placed into word element 2xi+1 in VRT. The word elements in the high-order half of VRA are placed, in the same order, into the even-numbered word elements of VRT. The word elements in the high-order half of VRB are placed, in the same order, into the odd-numbered word elements of VRT. #### **Special Registers Altered:** 332 ## Vector Merge Low Byte VX-form ## Vector Merge Low Halfword VX-form VRB VRT,VRA,VRB $VRT_{i\times 2+16:i\times 2+31} \leftarrow (VRB)_{i+64:i+79}$ **VRA** VRT vmrglb VRT,VRA,VRB | 4 | VRT | VRA | VRB | 268 | |---|-----|-----|-----|-------| | 0 | 6 | 11 | 16 | 21 31 | For each vector element *i* from 0 to 7, do the following. Byte element i+8 in VRA is placed into byte element 2xi in VRT. Byte element i+8 in VRB is placed into byte element 2xi+1 in VRT. ## **Special Registers Altered:** None ## For each vector element *i* from 0 to 3, do the following. $\leftarrow$ (VRA)<sub>i+64:i+79</sub> Halfword element i+4 in VRA is placed into halfword element 2xi in VRT. Halfword element i+4 in VRB is placed into halfword element 2xi+1 in VRT. ## **Special Registers Altered:** None vmrglh 4 do i=0 to 63 by 16 VRT<sub>i×2:i×2+15</sub> ## Vector Merge Low Word VX-form vmrglw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 396 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 63 by 32 VRT_{i\times 2: i\times 2+31} \leftarrow (VRA)_{i+64: i+95} VRT_{i\times 2+32: i\times 2+63} \leftarrow (VRB)_{i+64: i+95} ``` For each vector element *i* from 0 to 1, do the following. Word element i+2 in VRA is placed into word element 2xi in VRT. Word element i+2 in VRB is placed into word element 2xi+1 in VRT. ## **Special Registers Altered:** ## 6.8.3 Vector Splat Instructions ## Programming Note The *Vector Splat* instructions can be used in preparation for performing arithmetic for which one source vector is to consist of elements that all have the same value (e.g., multiplying all elements of a Vector Register by a constant). ## Vector Splat Byte ## VX-form ## Vector Splat Immediate Signed Byte VX-form vspltb VRT,VRB,UIM | 4 | VRT | / | UIM | VRB | 524 | | |---|-----|----|-----|-----|-----|----| | 0 | 6 | 11 | 12 | 16 | 21 | 31 | b $$\leftarrow$$ UIM $\parallel$ 0b000 do i=0 to 127 by 8 $\text{VRT}_{\text{i:i+7}} \leftarrow \text{(VRB)}_{\text{b:b+7}}$ For each vector element *i* from 0 to 15, do the following. The contents of byte element UIM in VRB are placed into byte element *i* of VRT. ## **Special Registers Altered:** None ## Vector Splat Halfword VX-form vsplth VRT,VRB,UIM | 4 | VRT | // | UIM | VRB | 588 | | |---|-----|----|-----|-----|-----|----| | • | * | ′′ | 0 | * | 000 | | | 0 | 6 | 11 | 13 | 16 | 21 | 31 | ``` b \leftarrow UIM \parallel 0b0000 do i=0 to 127 by 16 _{\mathrm{VRT}_{i:i+15}} \leftarrow (VRB) _{b:b+15} ``` For each vector element *i* from 0 to 7, do the following. The contents of halfword element UIM in VRB are placed into halfword element *i* of VRT. ## **Special Registers Altered:** None ## Vector Splat Word VX-form vspltw VRT,VRB,UIM | 4 | VRT | /// | UIM | VRB | 652 | | |---|-----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 14 | 16 | 21 | 31 | ``` b \leftarrow UIM || 0b00000 do i=0 to 127 by 32 VRT_{i:i+31} \leftarrow (VRB)_{b:b+31} ``` For each vector element *i* from 0 to 3, do the following. The contents of word element UIM in VRB are placed into word element *i* of VRT. ## Special Registers Altered: None vspltisb VRT,SIM | 4 | VRT | SIM | /// | 780 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 8 VRT_{i:i+7} \leftarrow EXTS(SIM, 8) ``` For each vector element *i* from 0 to 15, do the following. The value of the SIM field, sign-extended to 8 bits, is placed into byte element *i* of VRT. ## **Special Registers Altered:** None ### Vector Splat Immediate Signed Halfword VX-form vspltish VRT,SIM | 4 | VRT | SIM | /// | 844 | | |---|-----|-----|-----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | do i=0 to 127 by 16 $$\text{VRT}_{\text{i:i+15}} \leftarrow \text{EXTS}(\text{SIM, 16})$$ For each vector element *i* from 0 to 7, do the following. The value of the SIM field, sign-extended to 16 bits, is placed into halfword element *i* of VRT. ## **Special Registers Altered:** None ## Vector Splat Immediate Signed Word VX-form vspltisw VRT,SIM | 4 | VRT | SIM | /// | 908 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | do i=0 to 127 by 32 $$VRT_{i:i+31} \leftarrow EXTS(SIM, 32)$$ For each vector element *i* from 0 to 3, do the following. The value of the SIM field, sign-extended to 32 bits, is placed into word element *i* of VRT. #### **Special Registers Altered:** ## 6.8.4 Vector Permute Instruction The *Vector Permute* instruction allows any byte in two source Vector Registers to be copied to any byte in the target Vector Register. The bytes in a third source Vector Register specify from which byte in the first two source Vector Registers the corresponding target byte is to be copied. The contents of the third source Vector Register are sometimes referred to as the "permute control vector". #### **Vector Permute** VA-form vperm VRT,VRA,VRB,VRC | 4 | VRT | VRA | VRB | VRC | 43 | | |---|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. For each vector element *i* from 0 to 15, do the following. The contents of the byte element in the source vector specified by bits 3:7 of byte element *i* of VRC are placed into byte element *i* of VRT. #### **Special Registers Altered:** None ## **Programming Note** See the Programming Notes with the Load Vector for Shift Left and Load Vector for Shift Right instructions on page 208 for examples of uses of **vperm**. ## 6.8.5 Vector Select Instruction #### Vector Select VA-form vsel VRT,VRA,VRB,VRC do i=0 to 127 $$VRT_i \leftarrow ((VRC)_i=0)$$ ? $(VRA)_i$ : $(VRB)_i$ For each bit in VRC that contains the value 0, the corresponding bit in VRA is placed into the corresponding bit of VRT. Otherwise, the corresponding bit in VRB is placed into the corresponding bit of VRT. ## **Special Registers Altered:** ## 6.8.6 Vector Shift Instructions The *Vector Shift* instructions rotate or shift the contents of a Vector Register or a pair of Vector Registers left or right by a specified number of bytes (vslo, vsro, vsldoi) or bits (vsl, vsr). Depending on the instruction, this "shift count" is specified either by the contents of a Vector Register or by an immediate field in the instruction. In the former case, 7 bits of the shift count register give the shift count in bits ( $0 \le \text{count} \le 127$ ). Of these 7 bits, the high-order 4 bits give the number of complete bytes by which to shift and are used by vslo and vsro; the low-order 3 bits give the number of remaining bits by which to shift and are used by vsl and vsr. ## **Programming Note** A pair of these instructions, specifying the same shift count register, can be used to shift the contents of a Vector Register left or right by the number of bits (0-127) specified in the shift count register. The following example shifts the contents of register Vx left by the number of bits specified in register Vy and places the result into register Vz. vslo Vz,Vx,Vy vsl Vz,Vz,Vy #### Vector Shift Left VX-form vsl VRT,VRA,VRB | | 4 | VRT | VRA | VRB | 452 | | |---|---|-----|-----|-----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \mathtt{sh} \;\leftarrow\; (\mathtt{VRB})_{\;125:127} \\ \mathtt{t} \;\leftarrow\; \mathtt{1} \\ \mathtt{do} \;\; \mathtt{i=0} \;\; \mathtt{to} \;\; \mathtt{127} \;\; \mathtt{by} \;\; \mathtt{8} \\ \mathtt{t} \;\; \leftarrow\; \mathtt{t} \;\; \&\; ((\mathtt{VRB})_{\;\mathtt{i+5}:\mathtt{i+7}}\mathtt{=}\mathtt{sh}) \\ \mathtt{if} \;\; \mathtt{t=1} \;\; \mathtt{then} \;\; \mathtt{VRT} \;\; \leftarrow\; (\mathtt{VRA}) \;\; <<\; \mathtt{sh} \\ \mathtt{else} \qquad \qquad \mathtt{VRT} \;\; \leftarrow\; \mathtt{undefined} \end{array} ``` The contents of VRA are shifted left by the number of bits specified in (VRB)<sub>125:127</sub>. - Bits shifted out of bit 0 are lost. - Zeros are supplied to the vacated bits on the right. The result is place into VRT, except if, for any byte element in register VRB, the low-order 3 bits are not equal to the shift amount, then VRT is undefined. ## **Special Registers Altered:** None ## Vector Shift Left Double by Octet Immediate VA-form vsldoi VRT,VRA,VRB,SHB | | 4 | VRT | VRA | VRB | / | SHB | | 44 | | |---|---|-----|-----|-----|----|-----|----|----|----| | 0 | | 6 | 11 | 16 | 21 | 22 | 26 | | 31 | $VRT \leftarrow ((VRA) \parallel (VRB))_{8 \times SHB: 8 \times SHB+127}$ Let the source vector be the concatenation of the contents of VRA followed by the contents of VRB. Bytes SHB:SHB+15 of the source vector are placed into VRT. #### **Special Registers Altered:** None ## Vector Shift Left by Octet VX-form vslo VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1036 | | |---|-----|-----|-----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | ``` \begin{array}{lll} \mathtt{shb} \; \leftarrow \; \mathtt{(VRB)}_{121:124} \\ \mathtt{VRT} \; \leftarrow \; \mathtt{(VRA)} \; << \; \mathtt{(shb} \; || \; \mathtt{0b0000} \; \mathtt{)} \end{array} ``` The contents of VRA are shifted left by the number of bytes specified in $(VRB)_{121:124}$ . - Bytes shifted out of byte 0 are lost. - Zeros are supplied to the vacated bytes on the right. The result is placed into VRT. ## **Special Registers Altered:** ## **Vector Shift Right** #### VX-form ## Vector Shift Right by Octet VX-form vsr VRT,VRA,VRB | 4 | VRT | VRA | VRB | 708 | ٦ | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \text{sh} \leftarrow \text{(VRB)}_{125:127} \\ \text{t} \leftarrow 1 \\ \text{do i=0 to 127 by 8} \\ \text{t} \leftarrow \text{t & ((VRB)}_{\text{i+5:i+7}}\text{=sh)} \\ \text{if t=1 then VRT} \leftarrow \text{(VRA)} >>_{\text{ui}} \text{ sh} \\ \text{else} \qquad \text{VRT} \leftarrow \text{undefined} \end{array} ``` The contents of VRA are shifted right by the number of bits specified in (VRB)<sub>125:127</sub>. - Bits shifted out of bit 127 are lost. - Zeros are supplied to the vacated bits on the left. The result is place into VRT, except if, for any byte element in register VRB, the low-order 3 bits are not equal to the shift amount, then VRT is undefined. ## **Special Registers Altered:** None ## **Programming Note** A double-register shift by a dynamically specified number of bits (0-127) can be performed in six instructions. The following example shifts Vw || Vx left by the number of bits specified in Vy and places the high-order 128 bits of the result into Vz. ``` vslo Vt1,Vw,Vy #shift high-order reg left vsl Vt1,Vt1,Vy vsububm Vt3,V0,Vy #adjust shift count ((V0)=0) vsro Vt2,Vx,Vt3 #shift low-order reg right vsr Vt2,Vt2,Vt3 vor Vz,Vt1,Vt2 #merge to get final result ``` | | | ,. | , | | | | |---|---|-----|-----|-----|------|----| | Ī | 4 | VRT | VRA | VRB | 1100 | | | c | ) | 6 | 11 | 16 | 21 | 31 | ``` shb \leftarrow (VRB)<sub>121:124</sub> VRT \leftarrow (VRA) >>_{\rm ui} ( shb || 0b000 ) ``` VRT.VRA.VRB The contents of VRA are shifted right by the number of bytes specified in (VRB)<sub>121:124</sub>. - Bytes shifted out of byte 15 are lost. - Zeros are supplied to the vacated bytes on the left. The result is placed into VRT. ## **Special Registers Altered:** None vsro ## 6.9 Vector Integer Instructions ## **6.9.1 Vector Integer Arithmetic Instructions** ## 6.9.1.1 Vector Integer Add Instructions ## Vector Add and Write Carry-Out Unsigned Word VX-form vaddcuw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 384 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 aop \leftarrow EXTZ((VRA)<sub>i:i+31</sub>) bop \leftarrow EXTZ((VRB)<sub>i:i+31</sub>) VRT<sub>i:i+31</sub> \leftarrow Chop((aop +<sub>int</sub> bop) >><sub>ui</sub> 32,1) ``` For each vector element i from 0 to 3, do the following. Unsigned-integer word element i in VRA is added to unsigned-integer word element i in VRB. The carry out of the 32-bit sum is zero-extended to 32 bits and placed into word element i of VRT. ## **Special Registers Altered:** None ## Vector Add Signed Halfword Saturate VX-form vaddshs VRT,VRA,VRB | 4 | VRT | VRA | VRB | 832 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 aop \leftarrow EXTS((VRA)_{i:i+15}) bop \leftarrow EXTS((VRB)_{i:i+15}) VRT_{i:i+15} \leftarrow Clamp(aop +_{int} bop, -2^{15}, 2^{15}-1)_{16:31} ``` For each vector element *i* from 0 to 7, do the following. Signed-integer halfword element *i* in VRA is added to signed-integer halfword element *i* in VRB. - If the sum is greater than 2<sup>15</sup>-1 the result saturates to 2<sup>15</sup>-1 - If the sum is less than -2<sup>15</sup> the result saturates to -2<sup>15</sup>. The low-order 16 bits of the result are placed into halfword element *i* of VRT. #### **Special Registers Altered:** SAT ## Vector Add Signed Byte Saturate VX-form vaddsbs VRT,VRA,VRB | 4 | VRT | VRA | VRB | 768 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 8 aop \leftarrow EXTS(VRA_{i:i+7}) \\ bop \leftarrow EXTS(VRB_{i:i+7}) \\ VRT_{i:i+7} \leftarrow Clamp(aop +_{int}bop, -128, 127)_{24:31} ``` For each vector element i from 0 to 15, do the following. Signed-integer byte element *i* in VRA is added to signed-integer byte element *i* in VRB. - If the sum is greater than 127 the result saturates to 127. - If the sum is less than -128 the result saturates to -128. The low-order 8 bits of the result are placed into byte element *i* of VRT. ### **Special Registers Altered:** SAT ### Vector Add Signed Word Saturate VX-form vaddsws VRT.VRA.VRB | 4 | VRT | VRA | VRB | 896 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | For each vector element *i* from 0 to 3, do the following. Signed-integer word element *i* in VRA is added to signed-integer word element *i* in VRB. - If the sum is greater than 2<sup>31</sup>-1 the result saturates to 2<sup>31</sup>-1. - If the sum is less than -2<sup>31</sup> the result saturates to -2<sup>31</sup>. The low-order 32 bits of the result are placed into word element *i* of VRT. #### **Special Registers Altered:** SAT ## Vector Add Unsigned Byte Modulo VX-form vaddubm VRT,VRA,VRB | 4 | VRT | VRA | VRB | 0 | | |---|-----|-----|-----|-------------------|---| | 0 | 6 | 11 | 16 | 21 3 <sup>-</sup> | ı | For each vector element i from 0 to 15, do the following. Unsigned-integer byte element i in VRA is added to unsigned-integer byte element i in VRB. The low-order 8 bits of the result are placed into byte element *i* of VRT. ### **Special Registers Altered:** None ## Programming Note vaddubm can be used for unsigned or signed-integers. ## Vector Add Unsigned Word Modulo VX-form vadduwm VRT,VRA,VRB | 4 | VRT | V/RA | VRR | 128 | | |-----|------|-------|------|-----|----| | T T | VIXI | V1\/\ | VIND | 120 | | | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 \begin{array}{lll} \text{aop} & \leftarrow & \text{EXTZ}\left(\left(\text{VRA}\right)_{\text{i:i+31}}\right) \\ \text{bop} & \leftarrow & \text{EXTZ}\left(\left(\text{VRB}\right)_{\text{i:i+31}}\right) \\ \text{temp} & \leftarrow & \text{aop} & +_{\text{int}} & \text{bop} \\ \text{VRT}_{\text{i:i+31}} & \leftarrow & \text{Chop}\left(\left(\text{aop} +_{\text{int}} \right) & \text{bop}, & 32\right) \end{array} \right) ``` For each vector element *i* from 0 to 3, do the following. Unsigned-integer word element *i* in VRA is added to unsigned-integer word element *i* in VRB. The low-order 32 bits of the result are placed into word element *i* of VRT. ## **Special Registers Altered:** None ## **Programming Note** **vadduwm** can be used for unsigned or signed-integers. ## Vector Add Unsigned Halfword Modulo VX-form vadduhm VRT,VRA,VRB | 4 | VRT | VRA | VRB | 64 | |---|-----|-----|-----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` do i=0 to 127 by 16 aop \leftarrow EXTZ((VRA)<sub>i:i+15</sub>) bop \leftarrow EXTZ((VRB)<sub>i:i+15</sub>) VRT<sub>i:i+15</sub> \leftarrow Chop( aop +<sub>int</sub> bop, 16 ) ``` For each vector element *i* from 0 to 7, do the following. Unsigned-integer halfword element i in VRA is added to unsigned-integer halfword element i in VRB. The low-order 16 bits of the result are placed into halfword element *i* of VRT. #### **Special Registers Altered:** None #### Programming Note **vadduhm** can be used for unsigned or signed-integers. ## Vector Add Unsigned Byte Saturate VX-form vaddubs VRT,VRA,VRB | 4 | VRT | VRA | VRB | 512 | |---|-----|-----|-----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` do i=0 to 127 by 8 aop \leftarrow EXTZ((VRA)_{i:i+7}) \\ bop \leftarrow EXTZ((VRB)_{i:i+7}) \\ VRT_{i:i+7} \leftarrow Clamp(aop +_{int}bop, 0, 255)_{24:31} ``` For each vector element i from 0 to 15, do the following. Unsigned-integer byte element i in VRA is added to unsigned-integer byte element i in VRB. - If the sum is greater than 255 the result saturates to 255. The low-order 8 bits of the result are placed into byte element *i* of VRT. #### **Special Registers Altered:** SAT ## Vector Add Unsigned Word Saturate VX-form vadduws VRT,VRA,VRB | 4 | VRT | VRA | VRB | 640 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 aop \leftarrow EXTZ((VRA)<sub>i:i+31</sub>) bop \leftarrow EXTZ((VRB)<sub>i:i+31</sub>) VRT<sub>i:i+31</sub> \leftarrow Clamp(aop +<sub>int</sub> bop, 0, 2<sup>32</sup>-1) ``` For each vector element *i* from 0 to 3, do the following. Unsigned-integer word element *i* in VRA is added to unsigned-integer word element *i* in VRB. If the sum is greater than 2<sup>32</sup>-1 the result saturates to 2<sup>32</sup>-1. The low-order 32 bits of the result are placed into word element *i* of VRT. ## Special Registers Altered: SAT ## Vector Add Unsigned Halfword Saturate VX-form vadduhs VRT,VRA,VRB ``` do i=0 to 127 by 16 aop \leftarrow EXTZ((VRA)_{i:i+15}) bop \leftarrow EXTZ((VRB)_{i:i+15}) VRT_{i:i+15} \leftarrow Clamp(aop +_{int} bop, 0, 2^{16}-1)_{16:31} ``` For each vector element *i* from 0 to 7, do the following. Unsigned-integer halfword element *i* in VRA is added to unsigned-integer halfword element *i* in VRB. If the sum is greater than 2<sup>16</sup>-1 the result saturates to 2<sup>16</sup>-1. The low-order 16 bits of the result are placed into halfword element *i* of VRT. ## Special Registers Altered: SAT ## 6.9.1.2 Vector Integer Subtract Instructions #### Vector Subtract and Write Carry-Out **Unsigned Word** VX-form #### vsubcuw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1408 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` do i=0 to 127 by 32 aop \leftarrow (VRA)_{i:i+31} bop \leftarrow (VRB)<sub>i:i+31</sub> temp \leftarrow (EXTZ(aop) +<sub>int</sub> EXTZ(¬bop) +<sub>int</sub> 1) >> 32 VRT_{i:i+31} \leftarrow temp \& 0x0000\_0001 ``` For each vector element *i* from 0 to 3, do the following. Unsigned-integer word element i in VRB is subtracted from unsigned-integer word element i in VRA. The complement of the borrow out of bit 0 of the 32-bit difference is zero-extended to 32 bits and placed into word element i of VRT. #### **Special Registers Altered:** None ## Vector Subtract Signed Byte Saturate VX-form VRT,VRA,VRB vsubsbs | 4 | VRT | VRA | VRB | 1792 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 8 aop \leftarrow EXTS((VRA)_{i:i+7}) bop \leftarrow \text{EXTS}((\text{VRB})_{i:i+7}^{---}) VRT<sub>i:i+7</sub> ← Clamp(aop +_{int} ¬bop +_{int} 1, -128, 127)<sub>24:31</sub> ``` For each vector element i from 0 to 15, do the following. Signed-integer byte element i in VRB is subtracted from signed-integer byte element *i* in VRA. - If the intermediate result is greater than 127 the result saturates to 127. - If the intermediate result is less than -128 the result saturates to -128. The low-order 8 bits of the result are placed into byte element *i* of VRT. ## **Special Registers Altered:** SAT ## Vector Subtract Signed Halfword Saturate VX-form vsubshs VRT,VRA,VRB | 4 | VRT | VRA | VRR | 1856 | | |---|------|----------|-----|------|----| | | VIXI | V 1 (/ ( | VKD | 1000 | | | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 aop \leftarrow EXTS((VRA)_{i:i+15}) bop \leftarrow EXTS((VRB)_{i:i+15}) VRT_{i:i+15} \leftarrow Clamp(aop +<sub>int</sub> ¬bop +<sub>int</sub> 1, -2<sup>15</sup>, 2<sup>15</sup>-1)<sub>16:31</sub> ``` For each vector element *i* from 0 to 7, do the following. Signed-integer halfword element i in VRB is subtracted from signed-integer halfword element i in VRA. - If the intermediate result is greater than $2^{15}$ -1 the result saturates to $2^{15}$ -1. - If the intermediate result is less than -2<sup>15</sup> the result saturates to -215. The low-order 16 bits of the result are placed into halfword element *i* of VRT. ## **Special Registers Altered:** SAT ## Vector Subtract Signed Word Saturate VX-form vsubsws VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1920 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 aop \leftarrow EXTS((VRA)_{i:i+31}) bop \leftarrow \text{EXTS}((\text{VRB})_{i:i+31}) \mathtt{VRT}_{\texttt{i:i+31}} \leftarrow \mathtt{Clamp}(\mathtt{aop} +_{\texttt{int}} \neg \mathtt{bop} +_{\texttt{int}} 1, -2^{31}, 2^{31} - 1) ``` For each vector element *i* from 0 to 3, do the following. Signed-integer word element i in VRB is subtracted from signed-integer word element *i* in VRA. - If the intermediate result is greater than $2^{31}$ -1 the result saturates to $2^{31}$ -1. - If the intermediate result is less than -231 the result saturates to -231. The low-order 32 bits of the result are placed into word element *i* of VRT. ### **Special Registers Altered:** SAT ## Vector Subtract Unsigned Byte Modulo VX-form vsububm VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1024 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 8 \begin{array}{l} \text{aop} \leftarrow \text{EXTZ}\left(\left(\text{VRA}\right)_{\text{i:i+7}}\right) \\ \text{bop} \leftarrow \text{EXTZ}\left(\left(\text{VRB}\right)_{\text{i:i+7}}\right) \\ \text{VRT}_{\text{i:i+7}} \leftarrow \text{Chop}\left(\text{ aop } +_{\text{int}} \neg \text{bop } +_{\text{int}} 1, \ 8 \ \right) \end{array} ``` For each vector element i from 0 to 15, do the following. Unsigned-integer byte element *i* in VRB is subtracted from unsigned-integer byte element *i* in VRA. The low-order 8 bits of the result are placed into byte element *i* of VRT. ## **Special Registers Altered:** None ## Vector Subtract Unsigned Word Modulo VX-form vsubuwm VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1152 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 aop \leftarrow EXTZ((VRA)<sub>i:i+31</sub>) bop \leftarrow EXTZ((VRB)<sub>i:i+31</sub>) VRT<sub>i:i+31</sub> \leftarrow Chop( aop +<sub>int</sub> ¬bop +<sub>int</sub> 1, 32 ) ``` For each vector element *i* from 0 to 3, do the following. Unsigned-integer word element *i* in VRB is subtracted from unsigned-integer word element *i* in VRA. The low-order 32 bits of the result are placed into word element *i* of VRT. ## **Special Registers Altered:** None ## Vector Subtract Unsigned Halfword Modulo VX-form vsubuhm VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1088 | ٦ | |---|-----|-----|-----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | ``` do i=0 to 127 by 16 \begin{array}{lll} & \text{aop} & \leftarrow \text{EXTZ}\left(\left(\text{VRA}\right)_{\text{i:i+15}}\right) \\ & \text{bop} & \leftarrow \text{EXTZ}\left(\left(\text{VRB}\right)_{\text{i:i+15}}\right) \\ & \text{VRT}_{\text{i:i+16}} & \leftarrow \text{Chop}\left(\text{ aop } +_{\text{int}} \neg \text{bop } +_{\text{int}} \ 1, \ 16 \ \right) \end{array} ``` For each vector element *i* from 0 to 7, do the following. Unsigned-integer halfword element *i* in VRB is subtracted from unsigned-integer halfword element *i* in VRA. The low-order 16 bits of the result are placed into halfword element *i* of VRT. ## Special Registers Altered: ## Vector Subtract Unsigned Byte Saturate VX-form #### vsububs VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1536 | |---|-----|-----|-----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` do i=0 to 127 by 8 \begin{array}{l} \text{aop} \leftarrow \text{EXTZ}((\text{VRA})_{\text{i:i+7}}) \\ \text{bop} \leftarrow \text{EXTZ}((\text{VRB})_{\text{i:i+7}}) \\ \text{VRT}_{\text{i:i+7}} \leftarrow \text{Clamp}(\text{aop} +_{\text{int}} \neg \text{bop} +_{\text{int}} 1, 0, 255)_{24:31} \end{array} ``` For each vector element i from 0 to 15, do the following. Unsigned-integer byte element *i* in VRB is subtracted from unsigned-integer byte element *i* in VRA. If the intermediate result is less than 0 the result saturates to 0. The low-order 8 bits of the result are placed into byte element *i* of VRT. #### **Special Registers Altered:** SAT ## Vector Subtract Unsigned Word Saturate VX-form vsubuws VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1664 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` do i=0 to 127 by 32 aop \leftarrow EXTZ((VRA)<sub>i:i+31</sub>) bop \leftarrow EXTZ((VRB)<sub>i:i+31</sub>) VRT<sub>i:i+31</sub> \leftarrow Clamp(aop +<sub>int</sub> ¬bop +<sub>int</sub> 1, 0, 2<sup>32</sup>-1) ``` For each vector element i from 0 to 7, do the following. Unsigned-integer word element i in VRB is subtracted from unsigned-integer word element i in VRA. If the intermediate result is less than 0 the result saturates to 0. The low-order 32 bits of the result are placed into word element *i* of VRT. #### **Special Registers Altered:** SAT ## Vector Subtract Unsigned Halfword Saturate VX-form vsubuhs VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1600 | ٦ | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 aop \leftarrow \text{EXTZ}((\text{VRA})_{i:i+15}) bop \leftarrow \text{EXTZ}((\text{VRB})_{i:i+15}) \text{VRT}_{i:i+15} \leftarrow \text{Clamp}(\text{aop} +_{\text{int}} \neg \text{bop} +_{\text{int}} 1,0,2^{16}-1)_{16:31} ``` For each vector element *i* from 0 to 7, do the following. Unsigned-integer halfword element *i* in VRB is subtracted from unsigned-integer halfword element *i* in VRA. If the intermediate result is less than 0 the result saturates to 0. The low-order 16 bits of the result are placed into halfword element *i* of VRT. ## **Special Registers Altered:** SAT ## 6.9.1.3 Vector Integer Multiply Instructions ## Vector Multiply Even Signed Byte VX-form vmulesb VRT,VRA,VRB | 4 | VRT | VRA | VRB | 776 | |---|-----|-----|-----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` do i=0 to 127 by 16 prod \leftarrow EXTS((VRA)_{i:i+7}) \times_{si} EXTS((VRB)_{i:i+7}) VRT_{i:i+15} \leftarrow Chop( prod, 16 ) ``` For each vector element i from 0 to 7, do the following. Signed-integer byte element ix2 in VRA is multiplied by signed-integer byte element ix2 in VRB. The low-order 16 bits of the product are placed into halfword element i VRT. ### **Special Registers Altered:** None ## Vector Multiply Even Unsigned Byte VX-form vmuleub VRT,VRA,VRB | I | 4 | VRT | VRA | VRB | 520 | | |---|---|-----|-----|-----|-----|----| | | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 prod \leftarrow EXTZ((VRA)_{i:i+7}) \times_{ui} EXTZ((VRB)_{i:i+7}) VRT_{i:i+15} \leftarrow Chop(prod, 16) ``` For each vector element i from 0 to 7, do the following. Unsigned-integer byte element ix2 in VRA is multiplied by unsigned-integer byte element ix2 in VRB. The low-order 16 bits of the product are placed into halfword element i VRT. #### **Special Registers Altered:** None ## Vector Multiply Even Signed Halfword VX-form vmulesh VRT,VRA,VRB | 4 | VRT | VRA | VRB | 840 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | ``` do i=0 to 127 by 32 prod \leftarrow EXTS((VRA)_{i:i+15}) \times_{si} EXTS((VRB)_{i:i+15}) VRT_{i:i+31} \leftarrow Chop(prod, 32) ``` For each vector element i from 0 to 3, do the following. Signed-integer halfword element ix2 in VRA is multiplied by signed-integer halfword element ix2 in VRB. The low-order 32 bits of the product are placed into halfword element i VRT. ## **Special Registers Altered:** None ## Vector Multiply Even Unsigned Halfword VX-form vmuleuh VRT,VRA,VRB | 4 | VRT | VRA | VRB | 584 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 prod \leftarrow EXTZ((VRA)_{i:i+15}) \times_{ui} EXTZ((VRB)_{i:i+15}) VRT_{i:i+31} \leftarrow Chop(prod, 32) ``` For each vector element i from 0 to 3, do the following. Unsigned-integer halfword element ix2 in VRA is multiplied by unsigned-integer halfword element ix2 in VRB. The low-order 32 bits of the product are placed into halfword element i VRT. #### **Special Registers Altered:** #### Vector Multiply Odd Signed Byte VX-form vmulosb VRT,VRA,VRB | 4 | VRT | VRA | VRB | 264 | |---|-----|-----|-----|-------------------| | 0 | 6 | 11 | 16 | 21 3 <sup>-</sup> | ``` do i=0 to 127 by 16 \begin{array}{lll} & & & \\ & & \text{prod} \leftarrow \text{EXTS((VRA)}_{\text{i+8:i+15}}) & \times_{\text{si}} \text{ EXTS((VRB)}_{\text{i+8:i+15}}) \\ & & & \text{VRT}_{\text{i:i+15}} \leftarrow \text{Chop(prod, 16)} \end{array} ``` For each vector element i from 0 to 7, do the following. Signed-integer byte element ix2+1 in VRA is multiplied by signed-integer byte element ix2+1 in VRB. The low-order 16 bits of the product are placed into halfword element i VRT. #### **Special Registers Altered:** None #### Vector Multiply Odd Unsigned Byte VX-form vmuloub VRT,VRA,VRB | 4 | VRT | VRA | VRB | 8 | | |---|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 prod \leftarrow EXTZ((VRA)_{i+8:i+15}) \times_{ui} EXTZ((VRB)_{i+8:i+15}) VRT_{i:i+15} \leftarrow Chop(prod, 16) ``` For each vector element i from 0 to 7, do the following. Unsigned-integer byte element ix2+1 in VRA is multiplied by unsigned-integer byte element ix2+1 in VRB. The low-order 16 bits of the product are placed into halfword element i VRT. #### **Special Registers Altered:** None #### Vector Multiply Odd Signed Halfword VX-form vmulosh VRT,VRA,VRB | 4 | VRT | VRA | VRB | 328 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` do i=0 to 127 by 32 prod \leftarrow EXTS((VRA)_{i+16:i+31}) \times_{si} EXTS((VRB)_{i+16:i+31}) VRT_{i:i+31} \leftarrow Chop( prod, 32 ) ``` For each vector element i from 0 to 3, do the following. Signed-integer halfword element ix2+1 in VRA is multiplied by signed-integer halfword element ix2+1 in VRB. The low-order 32 bits of the product are placed into halfword element i VRT. #### **Special Registers Altered:** None #### Vector Multiply Odd Unsigned Halfword VX-form vmulouh VRT,VRA,VRB ``` do i=0 to 127 by 32 prod \leftarrow EXTZ((VRA)_{i+16:i+31}) \times_{ui} EXTZ((VRB)_{i+16:i+31}) VRT_{i:i+31} \leftarrow Chop(prod, 32) ``` For each vector element i from 0 to 3, do the following. Unsigned-integer halfword element ix2+1 in VRA is multiplied by unsigned-integer halfword element ix2+1 in VRB. The low-order 32 bits of the product are placed into halfword element i VRT. #### Special Registers Altered: #### 6.9.1.4 Vector Integer Multiply-Add/Sum Instructions #### Vector Multiply-High-Add Signed Halfword Saturate VA-form vmhaddshs VRT,VRA,VRB,VRC | ſ | 4 | VRT | VRA | VRB | VRC | 32 | | |---|---|-----|-----|-----|-----|----|----| | C | ) | 6 | 11 | 16 | 21 | 26 | 31 | ``` do i=0 to 127 by 16 prod \leftarrow EXTS((VRA)_{i:i+15}) \times_{si} EXTS((VRB)_{i:i+15}) sum \leftarrow (prod >>_{si} 15) +_{int} EXTS((VRC)_{i:i+15} VRT_{i:i+15} \leftarrow Clamp(sum, -2^{15}, 2^{15}-1)_{16:31} ``` For each vector element i from 0 to 7, do the following. Signed-integer halfword element i in VRA is multiplied by signed-integer halfword element i in VRB, producing a 32-bit signed-integer product. Bits 0:16 of the product are added to signed-integer halfword element i in VRC. - If the intermediate result is greater than 2<sup>15</sup>-1 the result saturates to 2<sup>15</sup>-1. - If the intermediate result is less than -2<sup>15</sup> the result saturates to -2<sup>15</sup>. The low-order 16 bits of the result are placed into halfword element i of VRT. #### **Special Registers Altered:** SAT ### Vector Multiply-High-Round-Add Signed Halfword Saturate VA-form vmhraddshs VRT,VRA,VRB,VRC | 4 | VRT | VRA | VRB | VRC | 33 | | |---|-----|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 26 : | 31 | ``` do i=0 to 127 by 16 prod ← EXTS((VRA)<sub>i:i+15</sub>) \times_{si} EXTS((VRB)<sub>i:i+15</sub>) sum ← ((prod +<sub>int</sub> 0x0000_4000) >><sub>si</sub> 15) +int EXTS((VRC)<sub>i:i+15</sub>) VRT<sub>i:i+15</sub> ← Clamp(sum, -2<sup>15</sup>, 2<sup>15</sup>-1)<sub>16:31</sub> ``` For each vector element i from 0 to 7, do the following. Signed-integer halfword element i in VRA is multiplied by signed-integer halfword element i in VRB, producing a 32-bit signed-integer product. The value 0x0000\_4000 is added to the product, producing a 32-bit signed-integer sum. Bits 0:16 of the sum are added to signed-integer halfword element i in VRC. - If the intermediate result is greater than 2<sup>15</sup>-1 the result saturates to 2<sup>15</sup>-1. - If the intermediate result is less than -2<sup>15</sup> the result saturates to -2<sup>15</sup>. The low-order 16 bits of the result are placed into halfword element i of VRT. #### **Special Registers Altered:** #### Vector Multiply-Low-Add Unsigned Halfword Modulo VA-form vmladduhm VRT,VRA,VRB,VRC | 4 | VRT | VRA | VRB | VRC | 34 | | |---|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | For each vector element i from 0 to 3, do the following. Unsigned-integer halfword element i in VRA is multiplied by unsigned-integer halfword element i in VRB, producing a 32-bit unsigned-integer product. The low-order 16 bits of the product are added to unsigned-integer halfword element i in VRC. The low-order 16 bits of the sum are placed into halfword element i of VRT. #### **Special Registers Altered:** None #### **Programming Note** **vmladduhm** can be used for unsigned or signed-integers. #### Vector Multiply-Sum Unsigned Byte Modulo VA-form vmsumubm VRT,VRA,VRB,VRC | 4 | VRT | VRA | VRB | VRC | 36 | | |---|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | ``` do i=0 to 127 by 32 temp \leftarrow EXTZ((VRC)_{i:i+31}) do j=0 to 31 by 8 prod \leftarrow EXTZ((VRA)_{i+j:i+j+7}) \times_{ui} EXTZ((VRB)_{i+j:i+j+7}) temp \leftarrow temp +_{int} prod VRT_{i:i+31} \leftarrow Chop(temp, 32) ``` For each word element in VRT the following operations are performed, in the order shown. - Each of the four unsigned-integer byte elements contained in the corresponding word element of VRA is multiplied by the corresponding unsigned-integer byte element in VRB, producing an unsigned-integer halfword product. - The sum of these four unsigned-integer halfword products is added to the unsigned-integer word element in VRC. - The unsigned-integer word result is placed into the corresponding word element of VRT. #### **Special Registers Altered:** #### Vector Multiply-Sum Mixed Byte Modulo VA-form vmsummbm VRT,VRA,VRB,VRC ``` do i=0 to 127 by 32 temp \leftarrow (VRC)_{i:i+31} do j=0 to 31 by 8 prod_{0:15} \leftarrow (VRA)_{i+j:i+j+7} \times_{sui} (VRB)_{i+j:i+j+7} temp \leftarrow temp +_{int} EXTS (prod) VRT_{i:i+31} \leftarrow temp ``` For each word element in VRT the following operations are performed, in the order shown. - Each of the four signed-integer byte elements contained in the corresponding word element of VRA is multiplied by the corresponding unsigned-integer byte element in VRB, producing a signed-integer product. - The sum of these four signed-integer halfword products is added to the signed-integer word element in VRC. - The signed-integer result is placed into the corresponding word element of VRT. #### **Special Registers Altered:** None #### Vector Multiply-Sum Signed Halfword Modulo VA-form vmsumshm VRT,VRA,VRB,VRC ``` do i=0 to 127 by 32 \begin{array}{lll} \text{temp} \leftarrow & (\text{VRC})_{\text{i}:\text{i}+31} \\ \text{do j=0 to 31 by 16} \\ & \text{prod}_{0:31} \leftarrow & (\text{VRA})_{\text{i}+\text{j}:\text{i}+\text{j}+15} \times_{\text{si}} & (\text{VRB})_{\text{i}+\text{j}:\text{i}+\text{j}+15} \\ & \text{temp} & \leftarrow & \text{temp} +_{\text{int}} & \text{prod} \\ & \text{VRT}_{\text{i}:\text{i}+31} \leftarrow & \text{temp} \end{array} ``` For each word element in VRT the following operations are performed, in the order shown. - Each of the two signed-integer halfword elements contained in the corresponding word element of VRA is multiplied by the corresponding signed-integer halfword element in VRB, producing a signed-integer product. - The sum of these two signed-integer word products is added to the signed-integer word element in VRC. - The signed-integer word result is placed into the corresponding word element of VRT. #### **Special Registers Altered:** #### Vector Multiply-Sum Signed Halfword Saturate VA-form vmsumshs VRT,VRA,VRB,VRC ``` do i=0 to 127 by 32 temp ← EXTS((VRC)<sub>i:i+31</sub>) do j=0 to 31 by 16 prod ← EXTS((VRA)<sub>i+j:i+j+15</sub>) \times_{si} EXTS((VRB)<sub>i+j:i+j+15</sub>) temp ← temp +<sub>int</sub> prod VRT<sub>i:i+31</sub> ← Clamp(temp, -2<sup>31</sup>, 2<sup>31</sup>-1) ``` For each word element in VRT the following operations are performed, in the order shown. - Each of the two signed-integer halfword elements contained in the corresponding word element of VRA is multiplied by the corresponding signed-integer halfword element in VRB, producing a signed-integer product. - The sum of these two signed-integer word products is added to the signed-integer word element in VRC. - If the intermediate result is greater than 2<sup>31</sup>-1 the result saturates to 2<sup>31</sup>-1 and if it is less than -2<sup>31</sup> it saturates to -2<sup>31</sup>. - The result is placed into the corresponding word element of VRT. #### **Special Registers Altered:** SAT #### Vector Multiply-Sum Unsigned Halfword Modulo VA-form vmsumuhm VRT,VRA,VRB,VRC ``` do i=0 to 127 by 32 temp \leftarrow EXTZ((VRC)_{i:i+31}) do j=0 to 31 by 16 prod \leftarrow EXTZ((VRA)_{i+j:i+j+15}) \times_{ui} EXTZ((VRB)_{i+j:i+j+15}) temp \leftarrow temp +_{int} prod \\ VRT_{i:i+31} \leftarrow Chop(temp, 32) ``` For each word element in VRT the following operations are performed, in the order shown. - Each of the two unsigned-integer halfword elements contained in the corresponding word element of VRA is multiplied by the corresponding unsigned-integer halfword element in VRB, producing an unsigned-integer word product. - The sum of these two unsigned-integer word products is added to the unsigned-integer word element in VRC. - The unsigned-integer result is placed into the corresponding word element of VRT. #### **Special Registers Altered:** #### Vector Multiply-Sum Unsigned Halfword Saturate VA-form vmsumuhs VRT,VRA,VRB,VRC | 4 | VRT | VRA | VRB | VRC | 39 | | |---|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | ``` do i=0 to 127 by 32 temp \leftarrow EXTZ((VRC)_{i:i+31}) do j=0 to 31 by 16 prod \leftarrow EXTZ((VRA)_{i+j:i+j+15}) \times_{ui} EXTZ((VRB)_{i+j:i+j+15}) temp \leftarrow temp +_{int} prod \\ VRT_{i:i+31} \leftarrow Clamp(temp, 0, 2^{32}-1) ``` For each word element in VRT the following operations are performed, in the order shown. - Each of the two unsigned-integer halfword elements contained in the corresponding word element of VRA is multiplied by the corresponding unsigned-integer halfword element in VRB, producing an unsigned-integer product. - The sum of these two unsigned-integer word products is added to the unsigned-integer word element in VRC. - If the intermediate result is greater than 2<sup>32</sup>-1 the result saturates to 2<sup>32</sup>-1. - The result is placed into the corresponding word element of VRT. #### **Special Registers Altered:** #### 6.9.1.5 Vector Integer Sum-Across Instructions #### Vector Sum across Signed Word Saturate VX-form vsumsws VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1928 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | | 31 | ``` \begin{array}{l} \text{temp} \leftarrow \text{EXTS}\,((\text{VRB})_{\,96:127}) \\ \text{do i=0 to 127 by } 32 \\ \qquad \text{temp} \leftarrow \text{temp +int EXTS}\,((\text{VRA})_{\,1:1+31}) \\ \text{VRT}_{0:31} \qquad \leftarrow 0\text{x}0000\_0000 \\ \text{VRT}_{32:63} \qquad \leftarrow 0\text{x}0000\_0000 \\ \text{VRT}_{64:95} \qquad \leftarrow 0\text{x}0000\_0000 \\ \text{VRT}_{96:127} \leftarrow \text{Clamp}\,(\text{temp}, -2^{31}, 2^{31}-1) \end{array} ``` The sum of the four signed-integer word elements in VRA is added to signed-integer word element 3 of VRB. - If the intermediate result is greater than 2<sup>31</sup>-1 the result saturates to 2<sup>31</sup>-1. - If the intermediate result is less than -2<sup>31</sup> the result saturates to -2<sup>31</sup>. The low-end 32 bits of the result are placed into word element 3 of VRT. Word elements 0 to 2 of VRT are set to 0. #### Special Registers Altered: SAT #### Vector Sum across Half Signed Word Saturate VX-form vsum2sws VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1672 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 64 temp \leftarrow EXTS((VRB)_{i+32:i+63}) \\ do j=0 to 63 by 32 \\ temp \leftarrow temp +_{int} EXTS((VRA)_{i+j:i+j+31}) \\ VRT_{i:i+63} \leftarrow 0x0000\_0000 \parallel Clamp(temp, -2^{31}, 2^{31}-1) ``` Word elements 0 and 2 of VRT are set to 0. The sum of the signed-integer word elements 0 and 1 in VRA is added to the signed-integer word element in bits 32:63 of VRB. - If the intermediate result is greater than 2<sup>31</sup>-1 the result saturates to 2<sup>31</sup>-1. - If the intermediate result is less than -2<sup>31</sup> the result saturates to -2<sup>31</sup>. The low-order 32 bits of the result are placed into word element 1 of VRT. The sum of signed-integer word elements 2 and 3 in VRA is added to the signed-integer word element in bits 96:127 of VRB. - If the intermediate result is greater than 2<sup>31</sup>-1 the result saturates to 2<sup>31</sup>-1. - If the intermediate result is less than -2<sup>31</sup> the result saturates to -2<sup>31</sup>. The low-order 32 bits of the result are placed into word element 3 of VRT. #### **Special Registers Altered:** ### Vector Sum across Quarter Signed Byte Saturate VX-form vsum4sbs VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1800 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 temp \leftarrow EXTS((VRB)_{i:i+31}) do j=0 to 31 by 8 temp \leftarrow temp +_{int} EXTS((VRA)_{i+j:i+j+7}) VRT_{i:i+31} \leftarrow Clamp(temp, -2^{31}, 2^{31}-1) ``` For each vector element i from 0 to 3, do the following. The sum of the four signed-integer byte elements contained in word element i of VRA is added to signed-integer word element i in VRB. - If the intermediate result is greater than 2<sup>31</sup>-1 the result saturates to 2<sup>31</sup>-1. - If the intermediate result is less than -2<sup>31</sup> the result saturates to -2<sup>31</sup>. The low-order 32 bits of the result are placed into word element i of VRT. #### Special Registers Altered: SAT #### Vector Sum across Quarter Unsigned Byte Saturate VX-form vsum4ubs VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1544 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 temp \leftarrow EXTZ((VRB)_{i:i+31}) do j=0 to 31 by 8 temp \leftarrow temp +_{int} EXTZ((VRA)_{i+j:i+j+7}) VRT_{i:i+31} \leftarrow Clamp(temp, 0, 2^{32}-1) ``` For each vector element i from 0 to 3, do the following. The sum of the four unsigned-integer byte elements contained in word element i of VRA is added to unsigned-integer word element i in VRB. - If the intermediate result is greater than 2<sup>32</sup>-1 it saturates to 2<sup>32</sup>-1. The low-order 32 bits of the result are placed into word element i of VRT. #### **Special Registers Altered:** SAT #### Vector Sum across Quarter Signed Halfword Saturate VX-form vsum4shs VRT,VRA,VRB ``` do i=0 to 127 by 32 temp ← EXTS((VRB)<sub>i:i+31</sub>) do j=0 to 31 by 16 temp ← temp +<sub>int</sub> EXTS((VRA)<sub>i+j:i+j+15</sub>) VRT<sub>i:i+31</sub> ← Clamp(temp, -2<sup>31</sup>, 2<sup>31</sup>-1) ``` For each vector element i from 0 to 3, do the following. The sum of the two signed-integer halfword elements contained in word element i of VRA is added to signed-integer word element i in VRB. - If the intermediate result is greater than 2<sup>31</sup>-1 the result saturates to 2<sup>31</sup>-1. - If the intermediate result is less than -2<sup>31</sup> the result saturates to -2<sup>31</sup>. The low-order 32 bits of the result are placed into the corresponding word element of VRT. #### **Special Registers Altered:** #### 6.9.1.6 Vector Integer Average Instructions #### Vector Average Signed Byte VX-form #### vavgsb VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1282 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | ``` do i=0 to 127 by 8 \begin{array}{l} \text{aop} \leftarrow \text{EXTS}\left(\left(\text{VRA}\right)_{\text{i:i+7}}\right) \\ \text{bop} \leftarrow \text{EXTS}\left(\left(\text{VRB}\right)_{\text{i:i+7}}\right) \\ \text{VRT}_{\text{i:i+7}} \leftarrow \text{Chop}\left(\left(\text{aop} +_{\text{int}} \text{bop} +_{\text{int}} 1\right.\right) >> 1, \ 8\right) \end{array} ``` For each vector element *i* from 0 to 15, do the following. Signed-integer byte element *i* in VRA is added to signed-integer byte element *i* in VRB. The sum is incremented by 1 and then shifted right 1 bit. The low-order 8 bits of the result are placed into byte element *i* of VRT. #### **Special Registers Altered:** None #### Vector Average Signed Word VX-form vavgsw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1410 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | For each vector element *i* from 0 to 3, do the following. Signed-integer word element *i* in VRA is added to signed-integer word element *i* in VRB. The sum is incremented by 1 and then shifted right 1 bit. The low-order 32 bits of the result are placed into word element *i* of VRT. #### **Special Registers Altered:** None #### Vector Average Signed Halfword VX-form vavgsh VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1346 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 aop \leftarrow EXTS((VRA)_{i:i+15}) bop \leftarrow EXTS((VRB)_{i:i+15}) VRT_{i:i+15} \leftarrow Chop((aop +_{int}bop +_{int}1) >> 1, 16) ``` For each vector element *i* from 0 to 7, do the following. Signed-integer halfword element *i* in VRA is added to signed-integer halfword element *i* in VRB. The sum is incremented by 1 and then shifted right 1 bit. The low-order 16 bits of the result are placed into halfword element *i* of VRT. #### **Special Registers Altered:** #### Vector Average Unsigned Byte VX-form vavgub VRT,VRA,VRB | 1 | 4 | VRT | VRA | VRB | 1026 | | |---|---|-----|-----|-----|------|----| | | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 8 aop \leftarrow EXTZ((VRA)_{i:i+7}) bop \leftarrow EXTZ((VRB)_{i:i+7}) VRT_{i:i+7} \leftarrow Chop((aop +_{int} bop +_{int} 1) >>_{ui} 1, 8) ``` For each vector element i from 0 to 15, do the following. Unsigned-integer byte element i in VRA is added to unsigned-integer byte element i in VRB. The sum is incremented by 1 and then shifted right 1 bit. The low-order 8 bits of the result are placed into byte element *i* of VRT. #### **Special Registers Altered:** None #### Vector Average Unsigned Word VX-form vavguw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1154 | П | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 aop \leftarrow EXTZ((VRA)_{i:i+31}) bop \leftarrow EXTZ((VRB)_{i:i+31}) VRT_{i:i+31} \leftarrow Chop((aop +_{int} bop +_{int} 1) >>_{ui} 1, 32) ``` For each vector element *i* from 0 to 3, do the following. Unsigned-integer word element *i* in VRA is added to unsigned-integer word element *i* in VRB. The sum is incremented by 1 and then shifted right 1 bit. The low-order 32 bits of the result are placed into word element *i* of VRT. #### **Special Registers Altered:** None ### Vector Average Unsigned Halfword VX-form vavguh VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1090 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 \substack{\text{aop} \leftarrow \text{EXTZ}((\text{VRA})_{\text{i}:i+15})\\ \text{bop} \leftarrow \text{EXTZ}((\text{VRB})_{\text{i}:i+15})\\ \text{VRT}_{\text{i}:i+15} \leftarrow \text{Chop}((\text{aop} +_{\text{int}} \text{bop} +_{\text{int}} 1) >>_{\text{ui}} 1, 16) } ``` For each vector element *i* from 0 to 7, do the following. Unsigned-integer halfword element i in VRA is added to unsigned-integer halfword element i in VRB. The sum is incremented by 1 and then shifted right 1 bit. The low-order 16 bits of the result are placed into halfword element *i* of VRT. #### **Special Registers Altered:** #### 6.9.1.7 Vector Integer Maximum and Minimum Instructions #### Vector Maximum Signed Byte VX-form #### vmaxsb VRT,VRA,VRB | 4 | VRT | VRA | VRB | 258 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 8 \substack{\text{aop} \leftarrow \text{EXTS}((\text{VRA})_{\text{i:i+7}})\\ \text{bop} \leftarrow \text{EXTS}((\text{VRB})_{\text{i:i+7}})\\ \text{VRT}_{\text{i:i+7}} \leftarrow (\text{aop} >_{\text{si}} \text{bop})\\ ? (\text{VRA})_{\text{i:i+7}} : (\text{VRB})_{\text{i:i+7}} ``` For each vector element i from 0 to 15, do the following. Signed-integer byte element i in VRA is compared to signed-integer byte element i in VRB. The larger of the two values is placed into byte element i of VRT. #### **Special Registers Altered:** None #### Vector Maximum Signed Word VX-form vmaxsw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 386 | | |---|-----|----------|------|-----|----| | • | * | V 1 (/ ) | **** | 000 | | | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 aop \leftarrow EXTS((VRA)_{i:i+31}) bop \leftarrow EXTS((VRB)_{i:i+31}) VRT_{i:i+31} \leftarrow (aop >_{si} bop) ? (VRA)_{i:i+31} : (VRB)_{i:i+31} ``` For each vector element *i* from 0 to 3, do the following. Signed-integer word element *i* in VRA is compared to signed-integer word element *i* in VRB. The larger of the two values is placed into word element *i* of VRT. #### **Special Registers Altered:** None #### Vector Maximum Signed Halfword VX-form vmaxsh VRT,VRA,VRB | 4 | VRT | VRA | VRB | 322 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 aop \leftarrow EXTS((VRA)_{i:i+15}) bop \leftarrow EXTS((VRB)_{i:i+15}) VRT_{i:i+15} \leftarrow (aop >_{si} bop) ? (VRA)_{i:i+15} : (VRB)_{i:i+15} ``` For each vector element *i* from 0 to 7, do the following. Signed-integer halfword element *i* in VRA is compared to signed-integer halfword element *i* in VRB. The larger of the two values is placed into halfword element *i* of VRT. #### **Special Registers Altered:** #### Vector Maximum Unsigned Byte VX-form vmaxub VRT,VRA,VRB | 4 | VRT | VRA | VRB | 2 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 3 | 31 | ``` do i=0 to 127 by 8 \begin{array}{l} \text{aop} \leftarrow \text{EXTZ}\left(\left(\text{VRA}\right)_{\text{i:i+7}}\right) \\ \text{bop} \leftarrow \text{EXTZ}\left(\left(\text{VRB}\right)_{\text{i:i+7}}\right) \\ \text{VRT}_{\text{i:i+7}} \leftarrow \text{(aop} >_{\text{ui}} \text{bop)} ? \left(\text{VRA}\right)_{\text{i:i+7}} : \left(\text{VRB}\right)_{\text{i:i+7}} \end{array} ``` For each vector element i from 0 to 15, do the following. Unsigned-integer byte element i in VRA is compared to unsigned-integer byte element i in VRB. The larger of the two values is placed into byte element i of VRT. #### **Special Registers Altered:** None #### Vector Maximum Unsigned Word VX-form vmaxuw VRT,VRA,VRB | Ī | 4 | VRT | VRA | VRB | 130 | | |---|---|-----|-----|-----|------|----| | ١ | 0 | 6 | 11 | 16 | 21 3 | 31 | ``` do i=0 to 127 by 32 aop \leftarrow EXTZ((VRA)_{1:i+31}) bop \leftarrow EXTZ((VRB)_{i:i+31}) VRT_{i:i+31} \leftarrow (aop >_{ui} bop) ? (VRA)_{i:i+31} : (VRB)_{i:i+31} ``` For each vector element i from 0 to 3, do the following. Unsigned-integer word element i in VRA is compared to unsigned-integer word element i in VRB. The larger of the two values is placed into word element i of VRT. #### **Special Registers Altered:** None ### Vector Maximum Unsigned Halfword VX-form vmaxuh VRT,VRA,VRB | 4 | VRT | VRA | VRB | 66 | | |---|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 aop \leftarrow EXTZ((VRA)_{i:i+15}) bop \leftarrow EXTZ((VRB)_{i:i+15}) VRT_{i:i+15} \leftarrow (aop >_{ui} bop) ? (VRA)_{i:i+15} : (VRB)_{i:i+15} ``` For each vector element *i* from 0 to 7, do the following. Unsigned-integer halfword element *i* in VRA is compared to unsigned-integer halfword element *i* in VRB. The larger of the two values is placed into halfword element *i* of VRT. #### **Special Registers Altered:** #### Vector Minimum Signed Byte VX-form vminsb VRT,VRA,VRB | 4 | VRT | VRA | VRB | 770 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | For each vector element i from 0 to 15, do the following. Signed-integer byte element *i* in VRA is compared to signed-integer byte element *i* in VRB. The smaller of the two values is placed into byte element *i* of VRT. #### **Special Registers Altered:** None #### Vector Minimum Signed Word VX-form vminsw VRT.VRA.VRB | 4 | VRT | VRA | VRB | 898 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` do i=0 to 127 by 32 \begin{array}{lll} & \text{aop} \leftarrow \text{EXTS}\left(\left(\text{VRA}\right)_{i:i+31}\right) \\ & \text{bop} \leftarrow \text{EXTS}\left(\left(\text{VRB}\right)_{i:i+31}\right) \\ & \text{VRT}_{i:i+31} \leftarrow \left(\left(\text{aop} <_{\text{si}}\right) \text{bop}\right) \\ & ? & \left(\text{VRA}\right)_{i:i+31} : \left(\text{VRB}\right)_{i:i+31} \end{array} ``` For each vector element *i* from 0 to 3, do the following. Signed-integer word element *i* in VRA is compared to signed-integer word element *i* in VRB. The smaller of the two values is placed into word element *i* of VRT. #### **Special Registers Altered:** None #### Vector Minimum Signed Halfword VX-form vminsh VRT,VRA,VRB | | 4 | VRT | VRA | VRB | 834 | ٦ | |---|---|-----|-----|-----|------|----| | 0 | | 6 | 11 | 16 | 21 3 | 31 | ``` \begin{array}{lll} \text{do i=0 to 127 by 16} \\ & \text{aop} \leftarrow \text{EXTS}((\text{VRA})_{\text{i:i+15}}) \\ & \text{bop} \leftarrow \text{EXTS}((\text{VRB})_{\text{i:i+15}}) \\ & \text{VRT}_{\text{i:i+15}} \leftarrow (\text{ aop} <_{\text{si}} \text{ bop}) \\ & ? & (\text{VRA})_{\text{i:i+15}} : (\text{VRB})_{\text{i:i+15}} \end{array} ``` For each vector element *i* from 0 to 7, do the following. Signed-integer halfword element *i* in VRA is compared to signed-integer halfword element *i* in VRB. The smaller of the two values is placed into halfword element *i* of VRT. #### **Special Registers Altered:** #### Vector Minimum Unsigned Byte VX-form vminub VRT,VRA,VRB | 4 | VRT | VRA | VRB | 514 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` do i=0 to 127 by 8 \substack{\text{aop} \leftarrow \text{EXTZ}((\text{VRA})_{\text{i:i+7}})\\ \text{bop} \leftarrow \text{EXTZ}((\text{VRB})_{\text{i:i+7}}\\ \text{VRT}_{\text{i:i+7}} \leftarrow (\text{aop} <_{\text{ui}} \text{bop})\\ ? (\text{VRA})_{\text{i:i+7}} : (\text{VRB})_{\text{i:i+7}} ``` For each vector element i from 0 to 15, do the following. Unsigned-integer byte element *i* in VRA is compared to unsigned-integer byte element *i* in VRB. The smaller of the two values is placed into byte element *i* of VRT. #### **Special Registers Altered:** None #### Vector Minimum Unsigned Word VX-form vminuw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 642 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 aop \leftarrow EXTZ((VRA)_{i:i+31}) bop \leftarrow EXTZ((VRB)_{i:i+31}) VRT_{i:i+31} \leftarrow (aop <_{ui} bop) ? (VRA)_{i:i+31} : (VRB)_{i:i+31} ``` For each vector element i from 0 to 3, do the following. Unsigned-integer word element *i* in VRA is compared to unsigned-integer word element *i* in VRB. The smaller of the two values is placed into word element *i* of VRT. #### **Special Registers Altered:** None ### Vector Minimum Unsigned Halfword VX-form vminuh VRT,VRA,VRB | 4 | VRT | VRA | VRB | 578 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{lll} \text{do i=0 to 127 by 16} \\ & \text{aop} \leftarrow \text{EXTZ}((\text{VRA})_{\text{i:i+15}}) \\ & \text{bop} \leftarrow \text{EXTZ}((\text{VRB})_{\text{i:i+15}}) \\ & \text{VRT}_{\text{i:i+15}} \leftarrow (\text{ aop} <_{\text{ui}} \text{ bop}) \\ & \text{? (VRA)}_{\text{i:i+15}} : (\text{VRB})_{\text{i:i+15}} \end{array} ``` For each vector element *i* from 0 to 7, do the following. Unsigned-integer halfword element *i* in VRA is compared to unsigned-integer halfword element *i* in VRB. The smaller of the two values is placed into halfword element *i* of VRT. #### **Special Registers Altered:** #### 6.9.2 Vector Integer Compare Instructions The Vector Integer Compare instructions compare two Vector Registers element by element, interpreting the elements as unsigned or signed-integers depending on the instruction, and set the corresponding element of the target Vector Register to all 1s if the relation being tested is true and to all 0s if the relation being tested is false. If Rc=1 CR Field 6 is set to reflect the result of the comparison, as follows. | Bit | Description | ı | |-----|-------------|---| - The relation is true for all element pairs (i.e., VRT is set to all 1s) - 1 Ò - The relation is false for all element pairs (i.e., VRT is set to all 0s) - 3 0 #### Programming Note - vcmpequb[.], vcmpequh[.] and vcmpequw[.] can be used for unsigned or signed-integers. ### Vector Compare Equal To Unsigned Byte VC-form vcmpequb VRT,VRA,VRB (Rc=0) vcmpequb. VRT,VRA,VRB (Rc=1) | | 4 | VRT | VRA | VRB | Rc | | 6 | | |---|---|-----|-----|-----|----|----|---|----| | 0 | | 6 | 11 | 16 | 21 | 22 | | 31 | For each vector element i from 0 to 15, do the following. Unsigned-integer byte element *i* in VRA is compared to unsigned-integer byte element *i* in VRB. Byte element *i* in VRT is set to all 1s if unsigned-integer byte element *i* in VRA is equal to unsigned-integer byte element *i* in VRB, and is set to all 0s otherwise. #### **Special Registers Altered:** CR6 (if Rc=1) #### Vector Compare Equal To Unsigned Halfword VC-form vcmpequh VRT,VRA,VRB (Rc=0) vcmpequh. VRT,VRA,VRB (Rc=1) | 4 | VRT | VRA | VRB | Rc | 70 | | |---|-----|-----|-----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` do i=0 to 127 by 16 VRT_{i:i+15} \leftarrow ((VRA)_{i:i+15} =_{int} (VRB)_{i:i+15}) ? ^{16}1 : ^{16}0 if Rc=1 then do t \leftarrow (VRT_{i}^{-128}1) f \leftarrow (VRT_{i}^{-128}0) CR6 \leftarrow t \parallel 0b0 \parallel f \parallel 0b0 ``` For each vector element *i* from 0 to 7, do the following. Unsigned-integer halfword element *i* in VRA is compared to unsigned-integer halfword element element *i* in VRB. Halfword element *i* in VRT is set to all 1s if unsigned-integer halfword element *i* in VRA is equal to unsigned-integer halfword element *i* in VRB, and is set to all 0s otherwise. #### **Special Registers Altered:** #### Vector Compare Equal To Unsigned Word VC-form | vcmpequw | VRT,VRA,VRB | (Rc=0) | |-----------|-------------|--------| | vcmpequw. | VRT,VRA,VRB | (Rc=1) | | 1 | 4 | VRT | VRA | VRB | Rc | 134 | | |---|---|-----|-----|-----|----|-----|----| | | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` do i=0 to 127 by 32 VRT_{i:i+31} \leftarrow ((VRA)_{i:i+31} =_{int} (VRB)_{i:i+31}) ? ^{32}1 : ^{32}0 if Rc=1 then do t \leftarrow (VRT=^{128}1) f \leftarrow (VRT=^{128}0) CR6 \leftarrow t \parallel 0b0 \parallel f \parallel 0b0 ``` For each vector element *i* from 0 to 3, do the following. Unsigned-integer word element *i* in VRA is compared to unsigned-integer word element *i* in VRB. Word element *i* in VRT is set to all 1s if unsigned-integer word element *i* in VRA is equal to unsigned-integer word element *i* in VRB, and is set to all 0s otherwise. #### **Special Registers Altered:** CR6 (if Rc=1) ### Vector Compare Greater Than Signed Halfword VC-form vcmpgtsh VRT,VRA,VRB (Rc=0) vcmpgtsh. VRT,VRA,VRB (Rc=1) | 4 | VRT | VRA | VRB | Rc | 838 | |---|-----|-----|-----|----|------| | 0 | 6 | 11 | 16 | 21 | 22 3 | ``` do i=0 to 127 by 16 VRT_{i:i+15} \leftarrow ((VRA)_{i:i+15} >_{si} (VRB)_{i:i+15}) ? <sup>16</sup>1 : <sup>16</sup>0 if Rc=1 then do t \leftarrow (VRT=^{128}1) f \leftarrow (VRT=^{128}0) CR6 \leftarrow t \parallel 0b0 \parallel f \parallel 0b0 ``` For each vector element *i* from 0 to 7, do the following. Signed-integer halfword element *i* in VRA is compared to signed-integer halfword element *i* in VRB. Halfword element *i* in VRT is set to all 1s if signed-integer halfword element *i* in VRA is greater than signed-integer halfword element *i* in VRB, and is set to all 0s otherwise. #### **Special Registers Altered:** CR6 (if Rc=1) #### Vector Compare Greater Than Signed Byte VC-form vcmpgtsb VRT,VRA,VRB (Rc=0) vcmpgtsb. VRT,VRA,VRB (Rc=1) | 4 | VRT | VRA | VRB | Rc | 774 | | |---|-----|-----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` do i=0 to 127 by 8 VRT_{i:i+7} \leftarrow ((VRA)_{i:i+7} >_{si} (VRB)_{i:i+7}) ? <sup>8</sup>1 : <sup>8</sup>0 if Rc=1 then do t \leftarrow (VRT_{i}^{=128}1) f \leftarrow (VRT_{i}^{=128}0) CR6 \leftarrow t \parallel 0b0 \parallel f \parallel 0b0 ``` For each vector element i from 0 to 15, do the following. Signed-integer byte element *i* in VRA is compared to signed-integer byte element *i* in VRB. Byte element *i* in VRT is set to all 1s if signed-integer byte element *i* in VRA is greater than to signed-integer byte element *i* in VRB, and is set to all 0s otherwise. #### **Special Registers Altered:** CR6 (if Rc=1) #### Vector Compare Greater Than Signed Word VC-form vcmpgtsw VRT,VRA,VRB (Rc=0) vcmpgtsw. VRT,VRA,VRB (Rc=1) | 4 | VRT | VRA | VRB | Rc | 902 | | |---|-----|-----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` do i=0 to 127 by 32 \begin{array}{c} \text{VRT}_{i:i+31} \leftarrow ((\text{VRA})_{i:i+31} >_{\text{si}} (\text{VRB})_{i:i+31}) ? \ ^{32}1 : \ ^{32}0 \\ \text{if Rc=1 then do} \\ \text{t} \leftarrow (\text{VRT}=^{128}1) \\ \text{f} \leftarrow (\text{VRT}=^{128}0) \\ \text{CR6} \leftarrow \text{t} \parallel \text{Ob0} \parallel \text{f} \parallel \text{Ob0} \\ \end{array} ``` For each vector element *i* from 0 to 3, do the following. Signed-integer word element *i* in VRA is compared to signed-integer word element *i* in VRB. Word element *i* in VRT is set to all 1s if signed-integer word element *i* in VRA is greater than signed-integer word element *i* in VRB, and is set to all 0s otherwise. #### **Special Registers Altered:** #### Vector Compare Greater Than Unsigned Byte VC-form $\begin{array}{ccc} vcmpgtub & VRT,VRA,VRB & (Rc=0) \\ vcmpgtub. & VRT,VRA,VRB & (Rc=1) \end{array}$ | 4 | VRT | VRA | VRB | Rc | 518 | | |---|-----|-----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` do i=0 to 127 by 8 VRT_{i:i+7} \leftarrow ((VRA)_{i:i+7} >_{ui} (VRB)_{i:i+7}) ? ^81 : ^80 if Rc=1 then do t \leftarrow (VRT=^{128}1) f \leftarrow (VRT=^{128}0) CR6 \leftarrow t \parallel 0b0 \parallel f \parallel 0b0 ``` For each vector element i from 0 to 15, do the following. Unsigned-integer byte element *i* in VRA is compared to unsigned-integer byte element *i* in VRB. Byte element *i* in VRT is set to all 1s if unsigned-integer byte element *i* in VRA is greater than to unsigned-integer byte element *i* in VRB, and is set to all 0s otherwise. #### **Special Registers Altered:** CR6 (if Rc=1) ### Vector Compare Greater Than Unsigned Halfword VC-form $\begin{array}{lll} \text{vcmpgtuh} & \text{VRT,VRA,VRB} & \text{(Rc=0)} \\ \text{vcmpgtuh.} & \text{VRT,VRA,VRB} & \text{(Rc=1)} \end{array}$ | 4 | VRT | VRA | VRB | Rc | 582 | | |---|-----|-----|-----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 22 3 | 31 | ``` do i=0 to 127 by 16 _{\rm VRT_{i:i+15}} \leftarrow (({\rm VRA})_{i:i+15}>_{\rm ui} ({\rm VRB})_{i:i+15}) ? ^{16}{\rm 1} : ^{16}{\rm 0} if Rc=1 then do t \leftarrow ({\rm VRT}_{=}^{=128}{\rm 1}) f \leftarrow ({\rm VRT}_{=}^{=128}{\rm 0}) CR6 \leftarrow t \parallel 0b0 \parallel f \parallel 0b0 ``` For each vector element *i* from 0 to 7, do the following. Unsigned-integer halfword element *i* in VRA is compared to unsigned-integer halfword element *i* in VRB. Halfword element *i* in VRT is set to all 1s if unsigned-integer halfword element *i* in VRA is greater than to unsigned-integer halfword element *i* in VRB, and is set to all 0s otherwise. #### **Special Registers Altered:** CR6 (if Rc=1) #### Vector Compare Greater Than Unsigned Word VC-form vcmpgtuw VRT,VRA,VRB (Rc=0) vcmpgtuw. VRT,VRA,VRB (Rc=1) | 4 | VRT | VRA | VRB | Rc | 646 | | |---|-----|-----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` do i=0 to 127 by 32 _{\rm VRT_{i:i+31}} \leftarrow (({\rm VRA})_{i:i+31}>_{\rm ui} ({\rm VRB})_{i:i+31}) ? ^{32}{\rm 1} : ^{32}{\rm 0} if Rc=1 then do _{\rm t} \leftarrow ({\rm VRT}=^{128}{\rm 1}) f \leftarrow ({\rm VRT}=^{128}{\rm 0}) CR6 \leftarrow t \parallel 0b0 \parallel f \parallel 0b0 ``` For each vector element *i* from 0 to 3, do the following. Unsigned-integer word element *i* in VRA is compared to unsigned-integer word element *i* in VRB. Word element *i* in VRT is set to all 1s if unsigned-integer word element *i* in VRA is greater than to unsigned-integer word element *i* in VRB, and is set to all 0s otherwise. #### **Special Registers Altered:** #### 6.9.3 Vector Logical Instructions ### Extended mnemonics for vector logical operations Extended mnemonics are provided that use the Vector OR and Vector NOR instructions to copy the contents of one Vector Register to another, with and without complementing. These are shown as examples with the two instructions. #### Vector Move Register Several vector instructions can be coded in a way such that they simply copy the contents of one Vector Register to another. An extended mnemonic is provided to convey the idea that no computation is being performed but merely data movement (from one register to another). The following instruction copies the contents of register Vy to register Vx. vmr Vx,Vy (equivalent to: vor Vx,Vy,Vy) #### Vector Complement Register The *Vector NOR* instruction can be coded in a way such that it complements the contents of one Vector Register and places the result into another Vector Register. An extended mnemonic is provided that allows this operation to be coded easily. The following instruction complements the contents of register Vy and places the result into register Vx. vnot Vx,Vy (equivalent to: vnor Vx,Vy,Vy) #### Vector Logical AND VX-form | vand | VRT,VRA,VRB | |------|-------------| |------|-------------| | ſ | 4 | VRT | VRA | VRB | 1028 | | |---|---|-----|-----|-----|------|----| | ŀ | 0 | 6 | 11 | 16 | 21 | 31 | VRT ← (VRA) & (VRB) The contents of VRA are ANDed with the contents of VRB and the result is placed into VRT. #### **Special Registers Altered:** None ### Vector Logical AND with Complement VX-form vandc VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1092 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | VRT ← (VRA) & ¬(VRB) The contents of VRA are ANDed with the complement of the contents of VRB and the result is placed into VRT #### **Special Registers Altered:** None #### Vector Logical NOR VX-form vnor VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1284 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | VRT ← ¬( (VRA) | (VRB) ) The contents of VRA are ORed with the contents of VRB and the complemented result is placed into VRT. #### **Special Registers Altered:** None #### Vector Logical OR VX-form vor VRT,VRA,VRB | 4 | \/RT | V/RA | VRR | 1156 | | |---|---------|-----------|------|------|----| | | V 1 C 1 | V 1 1/7 1 | VILL | 1100 | | | 0 | 6 | 11 | 16 | 21 | 31 | VRT ← (VRA) | (VRB) The contents of VRA are ORed with the contents of VRB and the result is placed into VRT. #### **Special Registers Altered:** None #### Vector Logical XOR VX-form vxor VRT,VRA,VRB VRT ← (VRA) ⊕ (VRB) The contents of VRA are XORed with the contents of VRB and the result is placed into VRT. #### **Special Registers Altered:** #### 6.9.4 Vector Integer Rotate and Shift Instructions # Vector Rotate Left Byte VX-form vrlb VRT,VRA,VRB 4 VRT VRA VRB 4 0 6 11 16 21 31 do i=0 to 127 by 8 $$sh \leftarrow (VRB)_{i+5:i+7} \ VRT_{i:i+7} \leftarrow (VRA)_{i:i+7} <<< sh$$ For each vector element i from 0 to 15, do the following. Byte element *i* in VRA is rotated left by the number of bits specified in the low-order 3 bits of the corresponding byte element *i* in VRB. The result is placed into byte element *i* in VRT. #### **Special Registers Altered:** None #### Vector Rotate Left Word VX-form vrlw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 132 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 sh \leftarrow (VRB)_{i+27:i+31} VRT_{i:i+31} \leftarrow (VRA)_{i:i+31} <<< sh ``` For each vector element *i* from 0 to 3, do the following. Word element *i* in VRA is rotated left by the number of bits specified in the low-order 5 bits of the corresponding word element *i* in VRB. The result is placed into word element *i* in VRT. #### **Special Registers Altered:** None #### Vector Rotate Left Halfword VX-form vrlh VRT,VRA,VRB 4 VRT VRA VRB 68 $$\begin{array}{ll} \mbox{do i=0 to 127 by 16} \\ \mbox{sh} \leftarrow \mbox{(VRB)}_{\mbox{i+12:i+15}} \\ \mbox{VRT}_{\mbox{i:i+15}} \leftarrow \mbox{(VRA)}_{\mbox{i:i+15}} <<< \mbox{sh} \end{array}$$ For each vector element *i* from 0 to 7, do the following. Halfword element *i* in VRA is rotated left by the number of bits specified in the low-order 4 bits of the corresponding halfword element *i* in VRB. The result is placed into halfword element *i* in VRT. #### **Special Registers Altered:** #### Vector Shift Left Byte #### VX-form vslh #### Vector Shift Left Halfword VRT,VRA,VRB #### VX-form vslb VRT,VRA,VRB | 4 | | VRT | VRA | VRB | 260 | | |---|---|-----|-----|-----|-----|----| | 0 | 6 | | 11 | 16 | 21 | 31 | do i=0 to 127 by 8 $$\begin{array}{l} \text{sh} \leftarrow \text{(VRB)}_{\text{i+5:i+7}} \\ \text{VRT}_{\text{i:i+7}} \leftarrow \text{(VRA)}_{\text{i:i+7}} << \text{sh} \end{array}$$ For each vector element i from 0 to 15, do the following. Byte element *i* in VRA is shifted left by the number of bits specified in the low-order 3 bits of byte element *i* in VRB. - Bits shifted out of bit 0 are lost. - Zeros are supplied to the vacated bits on the right. The result is placed into byte element *i* of VRT. #### **Special Registers Altered:** None #### **Vector Shift Left Word** VX-form vslw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 388 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{lll} \mbox{do i=0 to 127 by 32} \\ \mbox{sh} \leftarrow \mbox{(VRB)}_{\mbox{$i+27:i+31$}} \\ \mbox{VRT}_{\mbox{$i:i+31$}} \leftarrow \mbox{(VRA)}_{\mbox{$i:i+31$}} << \mbox{sh} \end{array}$$ For each vector element *i* from 0 to 3, do the following. Word element *i* in VRA is shifted left by the number of bits specified in the low-order 5 bits of word element *i* in VRB. - Bits shifted out of bit 0 are lost. - Zeros are supplied to the vacated bits on the right. The result is placed into word element *i* of VRT. #### **Special Registers Altered:** None do i=0 to 127 by 16 sh $$\leftarrow$$ (VRB) $_{i+12:i+15}$ VRT $_{i:i+15} \leftarrow$ (VRA) $_{i:i+15} <<$ sh For each vector element *i* from 0 to 7, do the following. Halfword element *i* in VRA is shifted left by the number of bits specified in the low-order 4 bits of halfword element *i* in VRB. - Bits shifted out of bit 0 are lost. - Zeros are supplied to the vacated bits on the right. The result is placed into halfword element *i* of VRT. #### **Special Registers Altered:** #### Vector Shift Right Byte #### VX-form #### Vector Shift Right Halfword VX-form VRT.VRA.VRB vsrb VRT,VRA,VRB | 4 | VRT | VRA | VRB | 516 | 1 | |---|-----|-----|-----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | do i=0 to 127 by 8 $$\begin{array}{l} \text{sh} \leftarrow \text{(VRB)}_{\text{i+5:i+7}} \\ \text{VRT}_{\text{i:i+7}} \leftarrow \text{(VRA)}_{\text{i:i+7}} >>_{\text{ui}} \text{sh} \end{array}$$ For each vector element i from 0 to 15, do the following. Byte element *i* in VRA is shifted right by the number of bits specified in the low-order 3 bits of byte element *i* in VRB. Bits shifted out of the least-significant bit are lost. Zeros are supplied to the vacated bits on the left. The result is placed into byte element *i* of VRT. #### **Special Registers Altered:** None #### Vector Shift Right Word VX-form vsrw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 644 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 sh \leftarrow (VRB)_{i+27:i+31} VRT_{i:i+31} \leftarrow (VRA)_{i:i+31} >>_{ui} sh ``` For each vector element *i* from 0 to 3, do the following. Word element *i* in VRA is shifted right by the number of bits specified in the low-order 5 bits of word element *i* in VRB. Bits shifted out of the least-significant bit are lost. Zeros are supplied to the vacated bits on the left. The result is placed into word element *i* of VRT. #### **Special Registers Altered:** None | 4 | VRT | VRA | VRB | 580 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 16 \begin{array}{l} \text{sh} \leftarrow \text{(VRB)}_{\text{i+12:i+15}} \\ \text{VRT}_{\text{i:i+15}} \leftarrow \text{(VRA)}_{\text{i:i+15}} >>_{\text{ui}} \text{sh} \end{array} ``` For each vector element *i* from 0 to 7, do the following. Halfword element *i* in VRA is shifted right by the number of bits specified in the low-order 4 bits of halfword element *i* in VRB. Bits shifted out of the least-significant bit are lost. Zeros are supplied to the vacated bits on the left. The result is placed into halfword element *i* of VRT. #### **Special Registers Altered:** None vsrh #### Vector Shift Right Algebraic Byte VX-form vsrab VRT,VRA,VRB | 4 | VRT | VRA | VRB | 772 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | do i=0 to 127 by 8 $$\text{sh} \leftarrow (\text{VRB})_{\text{i+5}:\text{i+7}} \\ \text{VRT}_{\text{i:i+7}} \leftarrow (\text{VRA})_{\text{i:i+7}} >>_{\text{si}} \text{sh}$$ For each vector element i from 0 to 15, do the following. Byte element *i* in VRA is shifted right by the number of bits specified in the low-order 3 bits of the corresponding byte element *i* in VRB. Bits shifted out of bit 7 of the byte element are lost. Bit 0 of the byte element is replicated to fill the vacated bits on the left. The result is placed into byte element *i* of VRT. #### **Special Registers Altered:** None #### Vector Shift Right Algebraic Word VX-form vsraw VRT,VRA,VRB | 4 | VRT | VRA | VRB | 900 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 \begin{array}{l} \text{sh} \leftarrow \text{(VRB)}_{\text{i+27:i+31}} \\ \text{VRT}_{\text{i:i+31}} \leftarrow \text{(VRA)}_{\text{i:i+31}} >>_{\text{si}} \text{sh} \end{array} ``` For each vector element *i* from 0 to 3, do the following. Word element *i* in VRA is shifted right by the number of bits specified in the low-order 5 bits of the corresponding word element *i* in VRB. Bits shifted out of bit 31 of the word are lost. Bit 0 of the word is replicated to fill the vacated bits on the left. The result is placed into word element *i* of VRT. #### **Special Registers Altered:** None #### Vector Shift Right Algebraic Halfword VX-form vsrah VRT,VRA,VRB | Ī | 4 | VRT | VRA | VRB | 836 | | |---|---|-----|-----|-----|-----|----| | ı | 0 | 6 | 11 | 16 | 21 | 31 | do i=0 to 127 by 16 $$\begin{array}{l} \text{sh} \leftarrow \text{(VRB)}_{\text{i+12:i+15}} \\ \text{VRT}_{\text{i:i+15}} \leftarrow \text{(VRA)}_{\text{i:i+15}} >>_{\text{si}} \text{sh} \end{array}$$ For each vector element *i* from 0 to 7, do the following. Halfword element *i* in VRA is shifted right by the number of bits specified in the low-order 4 bits of the corresponding halfword element *i* in VRB. Bits shifted out of bit 15 of the halfword are lost. Bit 0 of the halfword is replicated to fill the vacated bits on the left. The result is placed into halfword element *i* of VRT. #### **Special Registers Altered:** #### 6.10 Vector Floating-Point Instruction Set #### **6.10.1 Vector Floating-Point Arithmetic Instructions** #### **Vector Add Single-Precision** VX-form #### vsubfp VRT,VRA,VRB vaddfp VRT,VRA,VRB | 4 | VRT | VRA | VRB | 10 | | |---|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | do i=0 to 127 by 32 $$\begin{array}{c} \text{VRT}_{\text{i:i+31}} \leftarrow \\ \text{RoundToNearSP((VRA)}_{\text{i:i+31}} +_{\text{fp}} \text{(VRB)}_{\text{i:i+31}}) \end{array}$$ For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element i in VRA is added to single-precision floating-point element i in VRB. The intermediate result is rounded to the nearest single-precision floating-point number and placed into word element i of VRT. #### **Special Registers Altered:** None | 4 | VRT | VRA | VRB | | 74 | | |---|-----|-----|-----|----|----|----| | 0 | 6 | 11 | 16 | 21 | | 31 | Vector Subtract Single-Precision VX-form ``` do i=0 to 127 by 32 \mathtt{VRT}_{\mathtt{i}:\mathtt{i+31}} \leftarrow RoundToNearSP((VRA)_{i:i+31}-_{fp}(VRB)_{i:i+31}) ``` For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRB is subtracted from single-precision floating-point element i in VRA. The intermediate result is rounded to the nearest single-precision floating-point number and placed into word element i of VRT. #### **Special Registers Altered:** #### Vector Multiply-Add Single-Precision VA-form vmaddfp VRT,VRA,VRC,VRB | 4 | VRT | VRA | VRB | VRC | 46 | | |---|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | ``` do i=0 to 127 by 32 prod \leftarrow (VRA)_{i:i+31} \times_{fp} (VRC)_{i:i+31} VRT_{i:i+31} \leftarrow RoundToNearSP(prod +_{fp} (VRB)_{i:i+31}) ``` For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRA is multiplied by single-precision floating-point element *i* in VRC. Single-precision floating-point element *i* in VRB is added to the infinitely-precise product. The intermediate result is rounded to the nearest single-precision floating-point number and placed into word element *i* of VRT. #### **Special Registers Altered:** None #### - Programming Note To use a multiply-add to perform an IEEE or Java compliant multiply, the addend must be -0.0. This is necessary to insure that the sign of a zero result will be correct when the product is -0.0 (+0.0 + -0.0 $\geq$ +0.0, and -0.0 + -0.0 $\geq$ -0.0). When the sign of a resulting 0.0 is not important, then +0.0 can be used as an addend which may, in some cases, avoid the need for a second register to hold a -0.0 in addition to the integer 0/floating-point +0.0 that may already be available. #### Vector Negative Multiply-Subtract Single-Precision VA-form vnmsubfp VRT,VRA,VRC,VRB | 4 | VRT | VRA | VRB | VRC | 47 | | |---|-----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRA is multiplied by single-precision floating-point element *i* in VRC. Single-precision floating-point element *i* in VRB is subtracted from the infinitely-precise product. The intermediate result is rounded to the nearest single-precision floating-point number, then negated and placed into word element *i* of VRT. #### **Special Registers Altered:** #### 6.10.2 Vector Floating-Point Maximum and Minimum Instructions #### Vector Maximum Single-Precision VX-form ### vmaxfp VRT,VRA,VRB | 4 | VRI | VRA | VRB | 1034 | | |-----|-----|-----|-----|------|----| | 0 6 | | 11 | 16 | 21 | 31 | do i=0 to 127 by 32 $$\begin{array}{c} \text{VRT}_{\texttt{i}:\texttt{i}+31} \leftarrow ( \text{ (VRA)}_{\texttt{i}:\texttt{i}+31} >_{\texttt{fp}} (\text{VRB})_{\texttt{i}:\texttt{i}+31} ) \\ ? \text{ (VRA)}_{\texttt{i}:\texttt{i}+31} : \text{ (VRB)}_{\texttt{i}:\texttt{i}+31} \end{array}$$ For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRA is compared to single-precision floating-point element *i* in VRB. The larger of the two values is placed into word element *i* of VRT. The maximum of +0 and -0 is +0. The maximum of any value and a NaN is a QNaN. #### **Special Registers Altered:** None #### Vector Minimum Single-Precision VX-form vminfp VRT,VRA,VRB | 4 | VRT | VRA | VRB | 1098 | | |---|-----|-----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | do i=0 to 127 by 32 $$\begin{array}{c} \text{VRT}_{\text{i:i+31}} \leftarrow (\text{ (VRA)}_{\text{i:i+31}} <_{\text{fp}} \text{ (VRB)}_{\text{i:i+31}} ) \\ ? \text{ (VRA)}_{\text{i:i+31}} : \text{ (VRB)}_{\text{i:i+31}} \end{array}$$ For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element i in VRA is compared to single-precision floating-point element i in VRB. The smaller of the two values is placed into word element i of VRT. The minimum of +0 and -0 is -0. The minimum of any value and a NaN is a QNaN. #### **Special Registers Altered:** #### 6.10.3 Vector Floating-Point Rounding and Conversion Instructions See Appendix B, "Vector RTL Functions [Category: Vector]" on page 375, for RTL function descriptions. #### Vector Convert To Signed Fixed-Point Word Saturate VX-form vctsxs VRT,VRB,UIM | 4 | VRT | UIM | VRB | 970 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | do i=0 to 127 by 32 $$\begin{array}{c} \text{VRT}_{\text{i:i+31}} \leftarrow \\ \text{ConvertSPtoSXWsaturate((VRB)}_{\text{i:i+31}}, \text{ UIM)} \end{array}$$ For each vector element *i* from 0 to 3, do the following. Single-precision floating-point word element i in VRB is multiplied by $2^{\text{UIM}}$ . The product is converted to a 32-bit signed fixed-point integer using the rounding mode Round toward Zero. - If the intermediate result is greater than 2<sup>31</sup>-1 the result saturates to 2<sup>31</sup>-1. - If the intermediate result is less than -2<sup>31</sup> the result saturates to -2<sup>31</sup>. The result is placed into word element *i* of VRT. #### **Special Registers Altered:** SAT #### **Extended Mnemonics:** Example of an extended mnemonics for *Vector Convert* to Signed Fixed-Point Word Saturate: ### Extended: Equivalent to: vcfpsxws VRT,VRB,UIM vctsxs VRT,VRB,UIM #### Vector Convert To Unsigned Fixed-Point Word Saturate VX-form vctuxs VRT,VRB,UIM | 4 | VRT | UIM | VRB | 906 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | do i=0 to 127 by 32 $$\begin{array}{c} \text{VRT}_{\text{i:i+31}} \leftarrow \\ \text{ConvertSPtoUXWsaturate((VRB)}_{\text{i:i+31}}, \text{ UIM)} \end{array}$$ For each vector element *i* from 0 to 3, do the following. Single-precision floating-point word element i in VRB is multiplied by $2^{\text{UIM}}$ . The product is converted to a 32-bit unsigned fixed-point integer using the rounding mode Round toward Zero. - If the intermediate result is greater than 2<sup>32</sup>-1 the result saturates to 2<sup>32</sup>-1. The result is placed into word element *i* of VRT. #### **Special Registers Altered:** SAT #### **Extended Mnemonics:** Example of an extended mnemonics for *Vector Convert to Unsigned Fixed-Point Word Saturate*: **Extended: Equivalent to:** vcfpuxws VRT,VRB,UIM vctuxs VRT,VRB,UIM #### Vector Convert From Signed Fixed-Point Word VX-form vcfsx VRT,VRB,UIM | 4 | VRT | UIM | VRB | 842 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | do i=0 to 127 by 32 $$\begin{array}{c} \text{VRT}_{\text{i:i+31}} \leftarrow \\ \text{ConvertSXWtoSP( (VRB)}_{\text{i:i+31}} ) \div_{\text{fp}} 2^{\text{UIM}} \end{array}$$ For each vector element *i* from 0 to 3, do the following. Signed fixed-point word element i in VRB is converted to the nearest single-precision floating-point value. Each result is divided by $2^{\text{UIM}}$ and placed into word element i of VRT. #### **Special Registers Altered:** None #### **Extended Mnemonics:** Examples of extended mnemonics for Vector Convert from Signed Fixed-Point Word **Extended: Equivalent to:** vcsxwfp VRT,VRB,UIM vcfsx VRT,VRB,UIM #### Vector Convert From Unsigned Fixed-Point Word VX-form vcfux VRT,VRB,UIM | 4 | VRT | UIM | VRB | 778 | ٦ | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | do i=0 to 127 by 32 $$\begin{array}{c} \text{VRT}_{\text{i:i+31}} \leftarrow \\ \text{ConvertUXWtoSP( (VRB)}_{\text{i:i+31}} ) \div_{\text{fp}} 2^{\text{UIM}} \end{array}$$ For each vector element *i* from 0 to 3, do the following. Unsigned fixed-point word element *i* in VRB is converted to the nearest single-precision floating-point value. The result is divided by 2<sup>UIM</sup> and placed into word element *i* of VRT. #### **Special Registers Altered:** None #### **Extended Mnemonics:** Examples of extended mnemonics for Vector Convert from Unsigned Fixed-Point Word Extended: Equivalent to: vcuxwfp VRT,VRB,UIM vcfux VRT,VRB,UIM ### Vector Round to Single-Precision Integer toward -Infinity VX-form vrfim VRT,VRB | 4 | VRT | /// | VRB | 714 | ٦ | |---|-----|-----|-----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | do i=0 to 127 by 32 $$\text{VRT}_{0:31} \leftarrow \text{RoundToSPIntFloor((VRB)}_{0:31})$$ For each vector element i from 0 to 3, do the following. Single-precision floating-point element *i* in VRB is rounded to a single-precision floating-point integer using the rounding mode Round toward -Infinity. The result is placed into the corresponding word element *i* of VRT. #### **Special Registers Altered:** None #### Programming Note - The Vector Convert To Fixed-Point Word instructions support only the rounding mode Round toward Zero. A floating-point number can be converted to a fixed-point integer using any of the other three rounding modes by executing the appropriate Vector Round to Floating-Point Integer instruction before the Vector Convert To Fixed-Point Word instruction. #### Programming Note The fixed-point integers used by the *Vector Convert* instructions can be interpreted as consisting of 32-UIM integer bits followed by UIM fraction bits. ### Vector Round to Single-Precision Integer toward +Infinity VX-form vrfip VRT,VRB | 4 | VRT | /// | VRB | 650 | |---|-----|-----|-----|------| | 0 | 6 | 11 | 16 | 21 3 | For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRB is rounded to a single-precision floating-point integer using the rounding mode Round toward +Infinity. The result is placed into the corresponding word element *i* of VRT. #### **Special Registers Altered:** None #### Vector Round to Single-Precision Integer Nearest VX-form vrfin VRT,VRB | 4 | VRT | /// | VRB | 522 | |---|-----|-----|-----|-------------------| | 0 | 6 | 11 | 16 | 21 3 <sup>-</sup> | ``` do i=0 to 127 by 32 \text{VRT}_{0:31} \leftarrow \text{RoundToSPIntNear( (VRB)}_{0:31} ) ``` For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRB is rounded to a single-precision floating-point integer using the rounding mode Round to Nearest. The result is placed into the corresponding word element *i* of VRT. #### **Special Registers Altered:** None ### Vector Round to Single-Precision Integer toward Zero VX-form vrfiz VRT,VRB | 4 | VRT | /// | VRB | 586 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 VRT<sub>0:31</sub> ← RoundToSPIntTrunc( (VRB)<sub>0:31</sub> ) ``` For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRB is rounded to a single-precision floating-point integer using the rounding mode Round toward Zero. The result is placed into the corresponding word element *i* of VRT. #### **Special Registers Altered:** #### 6.10.4 Vector Floating-Point Compare Instructions The Vector Floating-Point Compare instructions compare two Vector Registers word element by word element, interpreting the elements as single-precision floating-point numbers. With the exception of the Vector Compare Bounds Floating-Point instruction, they set the target Vector Register, and CR Field 6 if Rc=1, in the same manner as do the Vector Integer Compare instructions; see Section 6.9.2. The Vector Compare Bounds Floating-Point instruction sets the target Vector Register, and CR Field 6 if Rc=1, to indicate whether the elements in VRA are within the bounds specified by the corresponding element in VRB, as explained in the instruction description. A single-precision floating-point value x is said to be "within the bounds" specified by a single-precision floating-point value y if $-y \le x \le y$ . ### Vector Compare Bounds Single-Precision VC-form $\begin{array}{lll} \text{vcmpbfp} & \text{VRT,VRA,VRB} & \text{(Rc=0)} \\ \text{vcmpbfp.} & \text{VRT,VRA,VRB} & \text{(Rc=1)} \end{array}$ | 4 | VRT | VRA | VRB | Rc | ç | 966 | |---|-----|-----|-----|----|----|-----| | 0 | 6 | 11 | 16 | 21 | 22 | 3 | ``` do i=0 to 127 by 32 le \leftarrow ( (VRA)_{i:i+31} \leq_{fp} (VRB)_{i:i+31} ) \\ ge \leftarrow ( (VRA)_{i:i+31} \geq_{fp} -(VRB)_{i:i+31} ) \\ VRT_{i:i+31} \leftarrow \neg le \parallel \neg ge \parallel^{-30} 0 if Rc=1 then do lb \leftarrow (VRT^{=128}_{-100}) \\ CR6 \leftarrow 0b00 \parallel lb \parallel 0b0 ``` For each vector element *i* from 0 to 3, do the following. Single-precision floating-point word element i in VRA is compared to single-precision floating-point word element i in VRB. A 2-bit value is formed that indicates whether the element in VRA is within the bounds specified by the element in VRB, as follows. - Bit 0 of the 2-bit value is set to 0 if the element in VRA is less than or equal to the element in VRB, and is set to 1 otherwise. - Bit 1 of the 2-bit value is set to 0 if the element in VRA is greater than or equal to the negation of the element in VRB, and is set to 1 otherwise. The 2-bit value is placed into the high-order two bits of word element *i* of VRT and the remaining bits of element *i* are set to 0. If Rc=1, CR field 6 is set as follows. #### Bit Description - 0 Set to 0 - 1 Set to 0 - 2 Set to indicate whether all four elements in VRA are within the bounds specified by the corresponding element in VRB, otherwise set to 0. - 3 Set to 0 #### **Special Registers Altered:** CR6 (if Rc=1) #### **Programming Note** Each single-precision floating-point word element in VRB should be non-negative; if it is negative, the corresponding element in VRA will necessarily be out of bounds. One exception to this is when the value of an element in VRB is -0.0 and the value of the corresponding element in VRA is either +0.0 or -0.0. +0.0 and -0.0 compare equal to -0.0. ### Vector Compare Equal To Single-Precision VC-form vcmpeqfp VRT,VRA,VRB (Rc=0) vcmpeqfp. VRT,VRA,VRB (Rc=1) | | 4 | VRT | VRA | VRB | Rc | 198 | | |---|---|-----|-----|-----|----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 22 | 31 | ``` do i=0 to 127 by 32 \begin{array}{c} \text{VRT}_{\text{i:i+31}} \leftarrow ((\text{VRA})_{\text{i:i+31}} =_{\text{fp}} (\text{VRB})_{\text{i:i+31}}) ? ^{32} 1 : ^{32} 0 \\ \text{if Rc=1 then do} \\ \text{t} \leftarrow (\text{VRT} = ^{128} 1) \\ \text{f} \leftarrow (\text{VRT} = ^{128} 0) \\ \text{CR6} \leftarrow \text{t} \parallel \text{Ob0} \parallel \text{f} \parallel \text{Ob0} \\ \end{array} ``` For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRA is compared to single-precision floating-point element *i* in VRB. Word element *i* in VRT is set to all 1s if single-precision floating-point element *i* in VRA is equal to single-precision floating-point element *i* in VRB, and is set to all 0s otherwise. If the source element i in VRA or the source element i in VRB is a NaN, VRT is set to all 0s, indicating "not equal to". If the source element i in VRA and the source element i in VRB are both infinity with the same sign, VRT is set to all 1s, indicating "equal to". #### **Special Registers Altered:** #### Vector Compare Greater Than or Equal To Single-Precision VC-form | vcmpgefp | VRT,VRA,VRB | (Rc=0) | |-----------|-------------|--------| | vcmpgefp. | VRT,VRA,VRB | (Rc=1) | | 4 | VRT | VRA | VRB | Rc | 454 | | |---|-----|-----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` do i=0 to 127 by 32 \begin{array}{c} \text{VRT}_{\text{i}:\text{i}+31} \leftarrow ((\text{VRA})_{\text{i}:\text{i}+31} \geq_{\text{fp}} (\text{VRB})_{\text{i}:\text{i}+31}) ? \ ^{32}\text{1} : \ ^{32}\text{0} \\ \text{if Rc=1 then do} \\ \text{t} \leftarrow (\text{VRT}=^{128}\text{1}) \\ \text{f} \leftarrow (\text{VRT}=^{128}\text{0}) \\ \text{CR6} \leftarrow \text{t} \parallel \text{0b0} \parallel \text{f} \parallel \text{0b0} \\ \end{array} ``` For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRA is compared to single-precision floating-point element *i* in VRB. Word element *i* in VRT is set to all 1s if single-precision floating-point element *i* in VRA is greater than or equal to single-precision floating-point element *i* in VRB, and is set to all 0s otherwise. If the source element i in VRA or the source element i in VRB is a NaN, VRT is set to all 0s, indicating "not greater than or equal to". If the source element i in VRA and the source element i in VRB are both infinity with the same sign, VRT is set to all 1s, indicating "greater than or equal to". #### **Special Registers Altered:** CR6 (if Rc=1) ### Vector Compare Greater Than Single-Precision VC-form $\begin{array}{lll} \text{vcmpgtfp} & \text{VRT,VRA,VRB} & \text{(Rc=0)} \\ \text{vcmpgtfp.} & \text{VRT,VRA,VRB} & \text{(Rc=1)} \end{array}$ | 4 | VRT | VRA | VRB | Rc | 710 | | |---|-----|-----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` do i=0 to 127 by 32 \begin{array}{c} \text{VRT}_{i:i+31} \leftarrow ((\text{VRA})_{i:i+31} >_{fp} (\text{VRB})_{i:i+31}) ? \ ^{32}1 : \ ^{32}0 \\ \text{if Rc=1 then do} \\ \text{t} \leftarrow (\text{VRT}=^{128}1) \\ \text{f} \leftarrow (\text{VRT}=^{128}0) \\ \text{CR6} \leftarrow \text{t} \parallel \text{ObO} \parallel \text{f} \parallel \text{ObO} \\ \end{array} ``` For each vector element *i* from 0 to 3, do the following. Single-precision floating-point element *i* in VRA is compared to single-precision floating-point element *i* in VRB. Word element *i* in VRT is set to all 1s if single-precision floating-point element *i* in VRA is greater than single-precision floating-point element *i* in VRB, and is set to all 0s otherwise. If the source element i in VRA or the source element i in VRB is a NaN, VRT is set to all 0s, indicating "not greater than". If the source element i in VRA and the source element i in VRB are both infinity with the same sign, VRT is set to all 0s, indicating "not greater than". #### **Special Registers Altered:** #### 6.10.5 Vector Floating-Point Estimate Instructions ### Vector 2 Raised to the Exponent Estimate Floating-Point VX-form vexptefp VRT,VRB | 4 | VRT | /// | VRB | 394 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | For each vector element *i* from 0 to 3, do the following. The single-precision floating-point estimate of 2 raised to the power of single-precision floating-point element *i* in VRB is placed into word element *i* of VRT. Let x be any single-precision floating-point input value. Unless x < -146 or the single-precision floating-point result of computing 2 raised to the power x would be a zero, an infinity, or a QNaN, the estimate has a relative error in precision no greater than one part in 16. The most significant 12 bits of the estimate's significand are monotonic. An integral input value returns an integral value when the result is representable. The result for various special cases of the source value is given below. | Value | Result | |------------|-----------| | - Infinity | +0 | | -0 | +1 | | +0 | +1 | | +Infinity | +Infinity | | NaN | QNaN | #### **Special Registers Altered:** None ### Vector Log Base 2 Estimate Floating-Point VRT,VRB vlogefp VX-form | 4 | VRT | /// | VRB | 458 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 VRT_{i:i+31} \leftarrow LogBase2EstimateSP((VRB)_{i:i+31}) ``` For each vector element *i* from 0 to 3, do the following. The single-precision floating-point estimate of the base 2 logarithm of single-precision floating-point element *i* in VRB is placed into the corresponding word element of VRT. Let x be any single-precision floating-point input value. Unless | x-1 | is less than or equal to 0.125 or the single-precision floating-point result of computing the base 2 logarithm of x would be an infinity or a QNaN, the estimate has an absolute error in precision (absolute value of the difference between the estimate and the infinitely precise value) no greater than $2^{-5}$ . Under the same conditions, the estimate has a relative error in precision no greater than one part in 8. The most significant 12 bits of the estimate's significand are monotonic. The estimate is exact if $x=2^y$ , where y is an integer between -149 and +127 inclusive. Otherwise the value placed into the element of register VRT may vary between implementations, and between different executions on the same implementation. The result for various special cases of the source value is given below. | Value | Result | |------------|------------| | - Infinity | QNaN | | < 0 | QNaN | | - 0 | - Infinity | | +0 | - Infinity | | +Infinity | +Infinity | | NaN | QNaN | #### **Special Registers Altered:** ### Vector Reciprocal Estimate Single-Precision #### VX-form vrefp VRT,VRB | 4 | VRT | /// | VRB | 266 | |---|-----|-----|-----|-------------------| | 0 | 6 | 11 | 16 | 21 3 <sup>-</sup> | do i=0 to 127 by 32 $$\text{VRT}_{i:i+31} \leftarrow \text{ReciprocalEstimateSP((VRB)}_{i:i+31})$$ For each vector element i from 0 to 3, do the following. The single-precision floating-point estimate of the reciprocal of single-precision floating-point element *i* in VRB is placed into word element *i* of VRT. Unless the single-precision floating-point result of computing the reciprocal of a value would be a zero, an infinity, or a QNaN, the estimate has a relative error in precision no greater than one part in 4096. Note that results may vary between implementations, and between different executions on the same implementation. The result for various special cases of the source value is given below. | Value | Result | |------------|------------| | - Infinity | -0 | | - 0 | - Infinity | | +0 | + Infinity | | +Infinity | +0 | | NaN | QNaN | #### **Special Registers Altered:** None #### Vector Reciprocal Square Root Estimate Single-Precision VX-form vrsqrtefp VRT,VRB | 4 | VRT | /// | VRB | 330 | | |---|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` do i=0 to 127 by 32 VRT_{i:i+31} \leftarrow ReciprocalSquareRootEstimateSP((VRB)_{i:i+31}) ``` For each vector element *i* from 0 to 3, do the following. The single-precision floating-point estimate of the reciprocal of the square root of single-precision floating-point element *i* in VRB is placed into word element *i* of VRT. Let *x* be any single-precision floating-point value. Unless the single-precision floating-point result of computing the reciprocal of the square root of *x* would be a zero, an infinity, or a QNaN, the estimate has a relative error in precision no greater than one part in 4096. Note that results may vary between implementations, and between different executions on the same implementation. The result for various special cases of the source value is given below. | Value | Result | |------------|------------| | - Infinity | QNaN | | < 0 | QNaN | | - 0 | - Infinity | | +0 | + Infinity | | +Infinity | +0 | | NaN | QNaN | #### **Special Registers Altered:** ### **6.11 Vector Status and Control Register Instructions** #### Move To Vector Status and Control Register VX-form mtvscr VRB VSCR ← (VRB)<sub>96:127</sub> The contents of word element 3 of VRB are placed into the VSCR. #### **Special Registers Altered:** None #### Move From Vector Status and Control Register VX-form mfvscr VRT | 4 | VRT | /// | 1540 | | |---|-----|-----|------|----| | 0 | 6 | 11 | 21 : | 31 | $VRT \leftarrow ^{96}0 \parallel (VSCR)$ The contents of the VSCR are placed into word element 3 of VRT. The remaining word elements in VRT are set to 0. #### **Special Registers Altered:** ## Chapter 7. Signal Processing Engine (SPE) [Category: Signal Processing Engine] | 7.1 Overview | 7.3.5.2 Fractional Format 265 | |-----------------------------------------|------------------------------------------| | 7.2 Nomenclature and Conventions 261 | 7.3.6 Computational Operations 266 | | 7.3 Programming Model 262 | 7.3.7 SPE Instructions 267 | | 7.3.1 General Operation 262 | 7.3.8 Saturation, Shift, and Bit Reverse | | 7.3.2 GPR Registers 262 | Models | | 7.3.3 Accumulator Register 262 | 7.3.8.1 Saturation 267 | | 7.3.4 Signal Processing Embedded Float- | 7.3.8.2 Shift Left 267 | | ing-Point Status and Control Register | 7.3.8.3 Bit Reverse 267 | | (SPEFSCR) | 7.3.9 SPE Instruction Set 268 | | 7.3.5 Data Formats 265 | | | 7 3 5 1 Integer Format 265 | | #### 7.1 Overview The Signal Processing Engine (SPE) accelerates signal processing applications normally suited to DSP operation. This is accomplished using short vectors (two element) within 64-bit GPRs and using single instruction multiple data (SIMD) operations to perform the requisite computations. SPE also architects an Accumulator register to allow for back to back operations without loop unrolling. ### 7.2 Nomenclature and Conventions Several conventions regarding nomenclature are used for SPE: - The Signal Processing Engine category is abbreviated as SPE. - Bits 0 to 31 of a 64-bit register are referenced as upper word, even word or high word element of the register. Bits 32:63 are referred to as lower word, odd word or low word element of the register. Each half is an element of a 64-bit GPR. - Bits 0 to 15 and bits 32 to 47 are referenced as even halfwords. Bits 16 to 31 and bits 48 to 63 are referenced as odd halfwords. - Mnemonics for SPE instructions generally begin with the letters 'ev' (embedded vector). The RTL conventions in described below are used in addition to those described in Section 1.3:Additional RTL functions are described in Appendix C. #### **Notation Meaning** ×sf Signed fractional multiplication. Result of multiplying 2 signed fractional quantities having bit length n taking the least significant 2n-1 bits of the sign extended product and concatenating a 0 to the least significant bit forming a signed fractional result of 2n bits. Two 16-bit signed fractional quantities, a and b are multiplied, as shown below: $ea_{0:31} = EXTS(a)$ $eb_{0:31} = EXTS(b)$ $prod_{0:63} = ea X eb$ $eprod_{0:63} = EXTS(prod_{32:63})$ $result_{0:31} = eprod_{33:63} || 0b0$ ×gsf Guarded signed fractional multiplication. Result of multiplying 2 signed fractional quantities having bit length 16 taking the least significant 31 bits of the sign extended product and concatenating a 0 to the least significant bit forming a guarded signed fractional result of 64 bits. Since guarded signed fractional multiplication produces a 64-bit result, fractional input quantities of -1 and -1 can produce +1 in the intermediate product. Two 16-bit fractional quantities, a and b are multiplied, as shown below: ea<sub>0:31</sub> = EXTS(a) eb<sub>0:31</sub> = EXTS(b) prod<sub>0:63</sub> = ea X eb eprod<sub>0:63</sub> = EXTS(prod<sub>32:63</sub>) result<sub>0:63</sub> = eprod<sub>1:63</sub> || 0b0 << Logical shift left. x << y shifts value x left by y bits, leaving zeros in the vacated bits. >> Logical shift right. x >> y shifts value x right by y bits, leaving zeros in the vacated bits. #### 7.3 Programming Model #### 7.3.1 General Operation SPE instructions generally take elements from one source register and operate on them with the corresponding elements of a second source register (and/or the accumulator) to produce results. Results are placed in the destination register and/or the accumulator. Instructions that are vector in nature (i.e. produce results of more than one element) provide results for each element that are independent of the computation of the other elements. These instructions can also be used to perform scalar DSP operations by ignoring the results of the upper 32-bit half of the register file. There are no record forms of *SPE* instructions. As a result, the meaning of bits in the CR is different than for other categories. *SPE Compare* instructions specify a CR field, two source registers, and the type of compare: greater than, less than, or equal. Two bits of the CR field are written with the result of the vector compare, one for each element. The remaining two bits reflect the ANDing and ORing of the vector compare results. #### 7.3.2 GPR Registers The SPE requires a GPR register file with thirty-two 64-bit registers. For 32-bit implementations, instructions that normally operate on a 32-bit register file access and change only the least significant 32-bits of the GPRs leaving the most significant 32-bits unchanged. For 64-bit implementations, operation of these instructions is unchanged, i.e. those instructions continue to operate on the 64-bit registers as they would if the SPE was not implemented. Most *SPE* instructions view the 64-bit register as being composed of a vector of two elements, each of which is 32 bits wide (some instructions read or write 16-bit elements). The most significant 32-bits are called the upper word, high word or even word. The least significant 32-bits are called the lower word, low word or odd word. Unless otherwise specified, *SPE* instructions write all 64-bits of the destination register. | | GPR Upper Word | GPR Lower Word | | |---|----------------|----------------|----| | 0 | | 32 | 63 | Figure 105.GPR #### 7.3.3 Accumulator Register A partially visible accumulator register (ACC) is provided for some SPE instructions. The accumulator is a 64-bit register that holds the results of the *Multiply Accumulate* (*MAC*) forms of *SPE Fixed-Point* instructions. The accumulator allows the back-to-back execution of dependent *MAC* instructions, something that is found in the inner loops of DSP code such as FIR and FFT filters. The accumulator is partially visible to the programmer in the sense that its results do not have to be explicitly read to use them. Instead they are always copied into a 64-bit destination GPR which is specified as part of the instruction. Based upon the type of instruction, the accumulator can hold either a single 64-bit value or a vector of two 32-bit elements. | | ACC Upper Word | | ACC Lower Word | | |---|----------------|----|----------------|----| | 0 | | 32 | | 63 | Figure 106.Accumulator # 7.3.4 Signal Processing Embedded Floating-Point Status and Control Register (SPEFSCR) Status and control for SPE uses the SPEFSCR register. This register is also used by the SPE.Embedded Float Scalar Double, SPE.Embedded Float Scalar Single, and SPE.Embedded Float Vector categories. Status and control bits are shared with these categories. The SPEFSCR register is implemented as special purpose register (SPR) number 512 and is read and written by the *mfspr* and *mtspr* instructions. *SPE* instructions affect both the high element (bits 32:33) and low element status flags (bits 48:49) of the SPEFSCR. Figure 107. Signal Processing and Embedded Floating-Point Status and Control Register The SPEFSCR bits are defined as shown below. Bit Description 32 Summary Integer Overflow High (SOVH) SOVH is set to 1 when an SPE instruction sets OVH. This is a sticky bit. #### 33 Integer Overflow High (OVH) OVH is set to 1 to indicate that an overflow has occurred in the upper element during execution of an *SPE* instruction. The bit is set to 1 if a result of an operation performed by the instruction cannot be represented in the number of bits into which the result is to be placed, and is set to 0 otherwise. The OVH bit is not altered by *Modulo* instructions, nor by other instructions that cannot overflow. 34 Embedded Floating-Point Guard Bit High (FGH) [Category: SP.FV] FGH is supplied for use by the Embedded Floating-Point Round interrupt handler. FGH is an extension of the low-order bits of the fractional result produced from an SPE.Embedded Float Vector instruction on the high word. FGH is zeroed if an overflow, underflow, or invalid input error is detected on the high element of an SPE.Embedded Float Vector instruction. Execution of an SPE.Embedded Float Scalar instruction leaves FGH undefined. 35 Embedded Floating-Point Inexact Bit High (FXH) [Category: SP.FV] FXH is supplied for use by the Embedded Floating-Point Round interrupt handler. FXH is an extension of the low-order bits of the fractional result produced from an *SPE.Embedded Float Vector* instruction on the high word. FXH represents the logical 'or' of all the bits shifted right from the Guard bit when the fractional result is normalized. FXH is zeroed if an overflow, underflow, or invalid input error is detected on the high element of an *SPE.Embedded Float Vector* instruction. Execution of an SPE.Embedded Float Scalar instruction leaves FXH undefined. 36 Embedded Floating-Point Invalid Operation/Input Error High (FINVH) [Category: SP.FV] The FINVH bit is set to 1 if any high word operand of an *SPE.Embedded Float Vector* instruction is infinity, NaN, or a denormalized value, or if the instruction is a divide and the dividend and divisor are both 0, or if a conversion to integer or fractional value overflows. Execution of an SPE.Embedded Float Scalar instruction leaves FINVH undefined. 37 Embedded Floating-Point Divide By Zero High (FDBZH) [Category: SP.FV] The FDBZH bit is set to 1 when an SPE.Embedded Vector Floating-Point Divide instruction is executed with a divisor of 0 in the high word operand, and the dividend is a finite nonzero number. Execution of an SPE.Embedded Float Scalar instruction leaves FDBZH undefined. 38 Embedded Floating-Point Underflow High (FUNFH) [Category: SP.FV] The FUNFH bit is set to 1 when the execution of an *SPE.Embedded Float Vector* instruction results in an underflow on the high word operation. Execution of an SPE.Embedded Float Scalar instruction leaves FUNFH undefined. 39 Embedded Floating-Point Overflow High (FOVFH) [Category: SP.FV] The FOVFH bit is set to 1 when the execution of an *SPE.Embedded Float Vector* instruction results in an overflow on the high word operation Execution of an SPE.Embedded Float Scalar instruction leaves FOVFH undefined. 40:41 Reserved 42 **Embedded Floating-Point Inexact Sticky Flag** (FINXS) [Categories: SP.FV, SP.FD, SP.FS] The FINXS bit is set to 1 whenever the execution of an *Embedded Floating-Point* instruction delivers an inexact result for either the low or high element and no Embedded Floating-Point Data interrupt is taken for either element, or if an *Embedded Floating-Point* instruction results in overflow (FOVF=1 or FOVFH=1), but Embedded Floating-Point Overflow exceptions are disabled (FOVFE=0), or if an *Embedded Floating-Point* instruction results in underflow (FUNF=1 or FUNFH=1), but Embedded Floating-Point Underflow exceptions are disabled (FUNFE=0), and no Embedded Floating-Point Data interrupt occurs. This is a sticky bit. 43 Embedded Floating-Point Invalid Operation/Input Sticky Flag (FINVS) [Categories: SPFV, SPFD, SPFS] The FINVS bit is defined to be the sticky result of any *Embedded Floating-Point* instruction that causes FINVH or FINV to be set to 1. That is, FINVS ← FINVS | FINV | FINVH. This is a sticky bit. 44 **Embedded Floating-Point Divide By Zero Sticky Flag** (FDBZS) [Categories: SP.FV, SP.FD, SP.FS] The FDBZS bit is set to 1 when an *Embedded Floating-Point Divide* instruction sets FDBZH or FDBZ to 1. That is, FDBZS ← FDBZS | FDBZ | FDBZH. This is a sticky bit. 45 **Embedded Floating-Point Underflow Sticky Flag** (FUNFS) [Categories: SP.FV, SP.FD, SP.FS] The FUNFS bit is defined to be the sticky result of any *Embedded Floating-Point* instruction that causes FUNFH or FUNF to be set to 1. That is, FUNFS ← FUNFS | FUNF | FUNFH. This is a sticky bit. 46 **Embedded Floating-Point Overflow Sticky Flag** (FOVFS) [Categories: SP.FV, SP.FD, SP.FS] The FOVFS bit is defined to be the sticky result of any *Embedded Floating-Point* instruction that causes FOVH or FOVF to be set to 1. That is, FOVFS ← FOVFS | FOVF | FOVFH. This is a sticky bit. - 47 Reserved - 48 **Summary Integer Overflow** (SOV) SOV is set to 1 when an *SPE* instruction sets OV to 1. This is a sticky bit. - 49 Integer Overflow (OV) OV is set to 1 to indicate that an overflow has occurred in the lower element during execution of an SPE instruction. The bit is set to 1 if a result of an operation performed by the instruction cannot be represented in the number of bits into which the result is to be placed, and is set to 0 otherwise. The OV bit is not 50 Embedded Floating-Point Guard Bit (Low/ scalar) (FG) [Categories: SP.FV, SP.FD, SP.FS] instructions that cannot overflow. FG is supplied for use by the Embedded Floating-Point Round interrupt handler. FG is an extension of the low-order bits of the fractional result produced from an *Embedded Floating-Point* instruction on the low word. FG is zeroed if an overflow, underflow, or invalid input error is detected on the low element of an *Embedded Floating-Point* instruction. altered by Modulo instructions, nor by other 51 Embedded Floating-Point Inexact Bit (Low/scalar) (FX) [Categories: SP.FV, SP.FD, SP.FS] FX is supplied for use by the Embedded Floating-Point Round interrupt handler. FX is an extension of the low-order bits of the fractional result produced from an *Embedded Floating-Point* instruction on the low word. FX represents the logical 'or' of all the bits shifted right from the Guard bit when the fractional result is normalized. FX is zeroed if an overflow, underflow, or invalid input error is detected on *Embedded Floating-Point* instruction 52 Embedded Floating-Point Invalid Operation/Input Error (Low/scalar) (FINV) [Categories: SP.FV, SP.FD, SP.FS] The FINV bit is set to 1 if any low word operand of an *Embedded Floating-Point* instruction is infinity, NaN, or a denormalized value, or if the operation is a divide and the dividend and divisor are both 0, or if a conversion to integer or fractional value overflows. 53 Embedded Floating-Point Divide By Zero (Low/scalar) (FDBZ) [Categories: SP.FV, SP.FD, SP.FS] The FDBZ bit is set to 1 when an *Embedded Floating-Point Divide* instruction is executed with a divisor of 0 in the low word operand, and the dividend is a finite nonzero number. 54 Embedded Floating-Point Underflow (Low/ scalar) (FUNF) [Categories: SP.FV, SP.FD, SP.FS] The FUNF bit is set to 1 when the execution of an *Embedded Floating-Point* instruction results in an underflow on the low word operation. 55 **Embedded Floating-Point Overflow (Low/ scalar)** (FOVF) [Categories: SP.FV, SP.FD, SP.FS] The FOVF bit is set to 1 when the execution of an *Embedded Floating-Point* instruction results in an overflow on the low word operation. - 56 Reserved - 57 Embedded Floating-Point Round (Inexact) Exception Enable (FINXE) [Categories: SP.FV, SP.FD, SP.FS] - 0 Exception disabled - 1 Exception enabled The Embedded Floating-Point Round interrupt is taken if the exception is enabled and if FG | FGH | FX | FXH (signifying an inexact result) is set to 1 as a result of an *Embedded Floating-Point* instruction. If an *Embedded Floating-Point* instruction results in overflow or underflow and the corresponding Embedded Floating-Point Underflow or Embedded Floating-Point Overflow exception is disabled then the Embedded Floating-Point Round interrupt is taken. - 58 Embedded Floating-Point Invalid Operation/Input Error Exception Enable (FINVE) [Categories: SP.FV, SP.FD, SP.FS] - 0 Exception disabled - Exception enabled If the exception is enabled, an Embedded Floating-Point Data interrupt is taken if the FINV or FINVH bit is set to 1 by an *Embedded Floating-Point* instruction. 59 Embedded Floating-Point Divide By Zero Exception Enable (FDBZE) [Categories: SPFV, SPFD, SPFS] 0 Exception disabled #### 1 Exception enabled If the exception is enabled, an Embedded Floating-Point Data interrupt is taken if the FDBZ or FDBZH bit is set to 1 by an *Embedded Floating-Point* instruction. - 60 Embedded Floating-Point Underflow Exception Enable (FUNFE) [Categories: SP.FV, SP.FD, SP.FS] - 0 Exception disabled - 1 Exception enabled If the exception is enabled, an Embedded Floating-Point Data interrupt is taken if the FUNF or FUNFH bit is set to 1 by an *Embedded Floating-Point* instruction. - 61 Embedded Floating-Point Overflow Exception Enable (FOVFE) [Categories: SP.FV, SP.FD, SP.FS] - 0 Exception disabled - 1 Exception enabled If the exception is enabled, an Embedded Floating-Point Data interrupt is taken if the FOVF or FOVFH bit is set to 1 by an *Embedded Floating-Point* instruction. - 62:63 Embedded Floating-Point Rounding Mode Control (FRMC) [Categories: SP.FV, SP.FD, SP.FS] - 00 Round to Nearest - 01 Round toward Zero - 10 Round toward +Infinity - 11 Round toward -Infinity #### **Programming Note** Rounding modes 0b10 (+Infinity) and 0b11 (-Infinity) may not be supported by some implementations. If an implementation does not support these, Embedded Floating-Point Round interrupts are generated for every *Embedded Floating-Point* instruction for which rounding is required when +Infinity or -Infinity modes are set and software is required to produce the correctly rounded result #### 7.3.5 Data Formats The SPE provides two different data formats, integer and fractional. Both data formats can be treated as signed or unsigned quantities. #### 7.3.5.1 Integer Format Unsigned integers consist of 16, 32, or 64-bit binary integer values. The largest representable value is 2<sup>n</sup>-1 where n represents the number of bits in the value. The smallest representable value is 0. Computations that produce values larger than 2<sup>n</sup>-1 or smaller than 0 may set OV or OVH in the SPEFSCR. Signed integers consist of 16, 32, or 64-bit binary values in two's complement form. The largest representable value is 2<sup>n-1</sup>-1 where n represents the number of bits in the value. The smallest representable value is -2<sup>n-1</sup>. Computations that produce values larger than 2<sup>n-1</sup>-1 or smaller than -2<sup>n-1</sup> may set OV or OVH in the SPEFSCR. #### 7.3.5.2 Fractional Format Fractional data format is conventionally used for DSP fractional arithmetic. Fractional data is useful for representing data converted from analog devices. Unsigned fractions consist of 16, 32, or 64-bit binary fractional values that range from 0 to less than 1. Unsigned fractions place the radix point immediately to the left of the most significant bit. The most significant bit of the value represents the value 2<sup>-1</sup>, the next most significant bit represents the value 2<sup>-2</sup> and so on. The largest representable value is 1-2<sup>-n</sup> where n represents the number of bits in the value. The smallest representable value is 0. Computations that produce values larger than 1-2<sup>-n</sup> or smaller than 0 may set OV or OVH in the SPEFSCR. The SPE category does not define unsigned fractional forms of instructions to manipulate unsigned fractional data since the unsigned integer forms of the instructions produce the same results as would the unsigned fractional forms. Guarded unsigned fractions are 64-bit binary fractional values. Guarded unsigned fractions place the decimal point immediately to the left of bit 32. The largest representable value is $2^{32}$ - $2^{-32}$ . The smallest representable value is 0. Guarded unsigned fractional computations are always modulo and do not set OV or OVH in the SPEFSCR. Signed fractions consist of 16, 32, or 64-bit binary fractional values in two's-complement form that range from -1 to less than 1. Signed fractions place the decimal point immediately to the right of the most significant bit. The largest representable value is 1-2<sup>-(n-1)</sup> where n represents the number of bits in the value. The smallest representable value is -1. Computations that produce values larger than 1-2<sup>-(n-1)</sup>or smaller than -1 may set OV or OVH in the SPEFSCR. Multiplication of two signed fractional values causes the result to be shifted left one bit to remove the resultant redundant sign bit in the product. In this case, a 0 bit is concatenated as the least significant bit of the shifted result. Guarded signed fractions are 64-bit binary fractional values. Guarded signed fractions place the decimal point immediately to the left of bit 33. The largest representable value is 2<sup>32</sup>-2<sup>-31</sup>. The smallest representable value is -2<sup>32</sup>-1+2<sup>-31</sup>. Guarded signed fractional computations are always modulo and do not set OV or OVH in the SPEFSCR. ### 7.3.6 Computational Operations The SPE category supports several different computational capabilities. Both modulo and saturation results can be performed. Modulo results produce truncation of the overflow bits in a calculation, therefore overflow does not occur and no saturation is performed. For instructions for which overflow occurs, saturation provides a maximum or minimum representable value (for the data type) in the case of overflow. Instructions are provided for a wide range of computational capability. The operation types can be divided into 4 basic categories: Simple Vector instructions. These instructions use the corresponding low and high word elements of the operands to produce a vector result that is placed in the destination register, the accumulator, or both. - Multiply and Accumulate instructions. These instructions perform multiply operations, optionally add the result to the accumulator, and place the result into the destination register and optionally into the accumulator. These instructions are composed of different multiply forms, data formats and data accumulate options. The mnemonics for these instructions indicate their various characteristics. These are shown in Table 2. - Load and Store instructions. These instructions provide load and store capabilities for moving data to and from memory. A variety of forms are provided that position data for efficient computation. - Compare and miscellaneous instructions. These instructions perform miscellaneous functions such as field manipulation, bit reversed incrementing, and vector compares. | Table 2: Mn | emonic Extensions for Multiply Accumu | late Instructions | | | |-------------|---------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--| | Extension | Meaning | Comments | | | | | Multiply | y Form | | | | he | halfword even | 16 X 16 → 32 | | | | heg | halfword even guarded | 16 X 16 $\rightarrow$ 32, 64-bit final accumulate result | | | | ho | halfword odd | 16 X 16 → 32 | | | | hog | halfword odd guarded | 16 X 16 $\rightarrow$ 32, 64-bit final accumulate result | | | | W | word | 32 X 32 → 64 | | | | wh | word high | 32 X 32 → 32 (high-order 32 bits of product) | | | | w/ | word low | $32 \times 32 \rightarrow 32$ (low-order 32 bits of product) | | | | | Data F | Format | | | | smf | signed modulo fractional | modulo, no saturation or overflow | | | | smi | signed modulo integer | modulo, no saturation or overflow | | | | ssf | signed saturate fractional | saturation on product and accumulate | | | | ssi | signed saturate integer | saturation on product and accumulate | | | | umi | unsigned modulo integer | modulo, no saturation or overflow | | | | usi | unsigned saturate integer | saturation on product and accumulate | | | | | Accumula | te Option | | | | а | place in accumulator | $result \to accumulator$ | | | | aa | add to accumulator | accumulator + result → accumulator | | | | aaw | add to accumulator as word elements | $\begin{array}{l} \text{accumulator}_{0:31} + \text{result}_{0:31} \rightarrow \text{accumulator}_{0:31} \\ \text{accumulator}_{32:63} + \text{result}_{32:63} \rightarrow \text{accumulator}_{32:63} \end{array}$ | | | | an | add negated to accumulator | accumulator - result → accumulator | | | | anw | add negated to accumulator as word elements | $\begin{array}{l} \text{accumulator}_{0:31} \text{ - result}_{0:31} \rightarrow \text{accumulator}_{0:31} \\ \text{accumulator}_{32:63} \text{ - result}_{32:63} \rightarrow \text{accumulator}_{32:63} \end{array}$ | | | #### 7.3.7 SPE Instructions ## 7.3.8 Saturation, Shift, and Bit Reverse Models For saturation, left shifts, and bit reversal, the pseudo RTL is provided here to more accurately describe those functions that are referenced in the instruction pseudo RTL. #### 7.3.8.1 Saturation ``` SATURATE(ov, carry, sat_ovn, sat_ov, val) if ov then if carry then return sat_ovn else return sat_ov else return val ``` #### 7.3.8.2 Shift Left ``` SL(value, cnt) if cnt > 31 then return 0 else return (value << cnt)</pre> ``` #### **7.3.8.3** Bit Reverse #### BITREVERSE (value) ``` result ← 0 mask ← 1 shift ← 31 cnt ← 32 while cnt > 0 then do t ← value & mask if shift >= 0 then result ← (t << shift) | result else result ← (t >> -shift) | result cnt ← cnt - 1 shift ← shift - 2 mask ← mask << 1 return result</pre> ``` #### 7.3.9 SPE Instruction Set #### Bit Reversed Increment **EVX-form** brinc RT,RA,RB | 4 | RT | RA | RB | 527 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` n ← implementation-dependent number of mask bits mask ← (RB)_{64-n:63} a ← (RA)_{64-n:63} d ← BITREVERSE(1 + BITREVERSE(a | (\neg mask))) RT ← (RA)_{0:63-n} || (d & mask) ``` **brinc** computes a bit-reverse index based on the contents of RA and a mask specified in RB. The new index is written to RT. The number of bits in the mask is implementation-dependent but may not exceed 32. #### **Special Registers Altered:** None #### Programming Note **brinc** provides a way for software to access FFT data in a bit-reversed manner. RA contains the index into a buffer that contains data on which FFT is to be performed. RB contains a mask that allows the index to be updated with bit-reversed addressing. Typically this instruction precedes a load with index instruction; for example, ``` brinc r2, r3, r4 lhax r8, r5, r2 ``` RB contains a bit-mask that is based on the number of points in an FFT. To access a buffer containing n byte sized data that is to be accessed with bit-reversed addressing, the mask has $\log_2 n$ 1s in the least significant bit positions and 0s in the remaining most significant bit positions. If, however, the data size is a multiple of a halfword or a word, the mask is constructed so that the 1s are shifted left by $\log_2$ (size of the data) and 0s are placed in the least significant bit positions. #### Programming Note This instruction only modifies the lower 32 bits of the destination register in 32-bit implementations. For 64-bit implementations in 32-bit mode, the contents of the upper 32-bits of the destination register are undefined. #### Programming Note Execution of **brinc** does not cause SPE Unavailable exceptions regardless of MSR<sub>SPV</sub>. #### Vector Absolute Value **EVX-form** evabs RT,RA | 4 | RT | RA | /// | 520 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` RT_{0:31} \leftarrow ABS((RA)_{0:31}) RT_{32:63} \leftarrow ABS((RA)_{32:63}) ``` The absolute value of each element of RA is placed in the corresponding elements of RT. An absolute value of 0x8000\_0000 (most negative number) returns 0x8000\_0000. #### **Special Registers Altered:** None #### Vector Add Immediate Word EVX-form evaddiw RT,RB,UI | 4 | RT | U | RB | 514 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | $$\begin{array}{l} \operatorname{RT}_{0:31} \leftarrow \left( \operatorname{RB} \right)_{0:31} + \operatorname{EXTZ} \left( \operatorname{UI} \right) \\ \operatorname{RT}_{32:63} \leftarrow \left( \operatorname{RB} \right)_{32:63} + \operatorname{EXTZ} \left( \operatorname{UI} \right) \end{array}$$ UI is zero-extended and added to both the high and low elements of RB and the results are placed in RT. Note that the same value is added to both elements of the register. #### **Special Registers Altered:** None #### Vector Add Signed, Modulo, Integer to Accumulator Word EVX-form evaddsmiaaw RT,RA | 4 | RT | RA | /// | 1225 | | |---|----|----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:31} \leftarrow (ACC)_{0:31} + (RA)_{0:31}$$ $RT_{32:63} \leftarrow (ACC)_{32:63} + (RA)_{32:63}$ $ACC_{0:63} \leftarrow (RT)_{0:63}$ Each word element in RA is added to the corresponding element in the accumulator and the results are placed in RT and into the accumulator. #### **Special Registers Altered:** #### Vector Add Signed, Saturate, Integer to Accumulator Word EVX-form evaddssiaaw RT,RA ``` \begin{array}{l} \mathsf{temp}_{0:63} \leftarrow \mathsf{EXTS}\,((\mathsf{ACC})_{0:31}) \, + \, \mathsf{EXTS}\,((\mathsf{RA})_{0:31}) \\ \mathsf{ovh} \leftarrow \mathsf{temp}_{31} \oplus \mathsf{temp}_{32} \\ \mathsf{RT}_{0:31} \leftarrow \mathsf{SATURATE}\,(\mathsf{ovh}, \, \, \mathsf{temp}_{31}, \, \, \mathsf{0x8000\_0000}, \\ \mathsf{0x7FFF\_FFFF}, \, \, \mathsf{temp}_{32:63}) \\ \\ \mathsf{temp}_{0:63} \leftarrow \mathsf{EXTS}\,((\mathsf{ACC})_{32:63}) \, + \, \mathsf{EXTS}\,((\mathsf{RA})_{32:63}) \\ \mathsf{ovl} \leftarrow \mathsf{temp}_{31} \oplus \mathsf{temp}_{32} \\ \mathsf{RT}_{32:63} \leftarrow \mathsf{SATURATE}\,(\mathsf{ovl}, \, \, \mathsf{temp}_{31}, \, \, \mathsf{0x8000\_0000}, \\ \mathsf{0x7FFF\_FFFF}, \, \, \mathsf{temp}_{32:63}) \\ \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \\ \mathsf{SPEFSCR}_{\mathsf{OVH}} \leftarrow \mathsf{ovh} \\ \mathsf{SPEFSCR}_{\mathsf{SOVH}} \leftarrow \mathsf{ovl} \\ \mathsf{SPEFSCR}_{\mathsf{SOVH}} \leftarrow \mathsf{SPEFSCR}_{\mathsf{SOVH}} \mid \, \mathsf{ovh} \\ \end{array} ``` Each signed-integer word element in RA is sign-extended and added to the corresponding sign-extended element in the accumulator saturating if overflow occurs, and the results are placed in RT and the accumulator. #### **Special Registers Altered:** $SPEFSCR_{SOV} \leftarrow SPEFSCR_{SOV} \mid ovl$ ACC OV OVH SOV SOVH #### Vector Add Unsigned, Modulo, Integer to Accumulator Word EVX-form evaddumiaaw RT,RA $$RT_{0:31} \leftarrow (ACC)_{0:31} + (RA)_{0:31}$$ $RT_{32:63} \leftarrow (ACC)_{32:63} + (RA)_{32:63}$ $ACC_{0:63} \leftarrow (RT)_{0:63}$ Each unsigned-integer word element in RA is added to the corresponding element in the accumulator and the results are placed in RT and the accumulator. #### **Special Registers Altered:** ACC #### Vector Add Unsigned, Saturate, Integer to Accumulator Word EVX-form evaddusiaaw RT,RA ``` \begin{array}{l} \operatorname{temp}_{0:63} \leftarrow \operatorname{EXTZ}\left(\left(\operatorname{ACC}\right)_{0:31}\right) + \operatorname{EXTZ}\left(\left(\operatorname{RA}\right)_{0:31}\right) \\ \operatorname{ovh} \leftarrow \operatorname{temp}_{31} \\ \operatorname{RT}_{0:31} \leftarrow \operatorname{SATURATE}\left(\operatorname{ovh}, \operatorname{temp}_{31}, \operatorname{0xFFFF}_{FFFF}, \operatorname{temp}_{32:63}\right) \\ \operatorname{temp}_{0:63} \leftarrow \operatorname{EXTZ}\left(\left(\operatorname{ACC}\right)_{32:63}\right) + \operatorname{EXTZ}\left(\left(\operatorname{RA}\right)_{32:63}\right) \\ \operatorname{ovl} \leftarrow \operatorname{temp}_{31} \\ \operatorname{RT}_{32:63} \leftarrow \operatorname{SATURATE}\left(\operatorname{ovl}, \operatorname{temp}_{31}, \operatorname{0xFFFF}_{FFFF}, \operatorname{temp}_{32:63}\right) \\ \operatorname{ACC}_{0:63} \leftarrow \left(\operatorname{RT}\right)_{0:63} \\ \operatorname{SPEFSCR}_{\text{OV}} \leftarrow \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\text{SOVH}} \leftarrow \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\text{SOVH}} \leftarrow \operatorname{SPEFSCR}_{\text{SOVH}} \mid \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\text{SOV}} \leftarrow \operatorname{SPEFSCR}_{\text{SOV}} \mid \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\text{SOV}} \leftarrow \operatorname{SPEFSCR}_{\text{SOV}} \mid \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\text{SOV}} \leftarrow \operatorname{SPEFSCR}_{\text{SOV}} \mid \operatorname{ovl} \end{array} ``` Each unsigned-integer word element in RA is zero-extended and added to the corresponding zero-extended element in the accumulator saturating if overflow occurs, and the results are placed in RT and the accumulator. #### **Special Registers Altered:** ACC OV OVH SOV SOVH #### **Vector Add Word** **EVX-form** evaddw RT,RA,RB | 4 | RT | RA | RB | 512 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{(RA)_{0:31}} + \mathtt{(RB)_{0:31}} \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{(RA)_{32:63}} + \mathtt{(RB)_{32:63}} \end{array}$$ The corresponding elements of RA and RB are added and the results are placed in RT. The sum is a modulo sum #### **Special Registers Altered:** #### **Vector AND** #### **EVX-form** ### Vector AND with Complement EVX-form RA RT,RA,RB evand RT,RA,RB | 4 RT | | RA RB | | 529 | | |------|---|-------|----|------|--| | 0 | 6 | 11 | 16 | 21 3 | | $$RT_{0:31} \leftarrow (RA)_{0:31} \& (RB)_{0:31}$$ $RT_{32:63} \leftarrow (RA)_{32:63} \& (RB)_{32:63}$ The corresponding elements of RA and RB are ANDed bitwise and the results are placed in the corresponding element of RT. #### **Special Registers Altered:** None #### $RT_{0:31} \leftarrow (RA)_{0:31} \& (\neg (RB)_{0:31})$ $RT_{32:63} \leftarrow (RA)_{32:63} \& (\neg (RB)_{32:63})$ RT The word elements of RA are ANDed bitwise with the complement of the corresponding elements of RB. The results are placed in the corresponding element of RT. RB 530 #### **Special Registers Altered:** None evandc 4 ### Vector Compare Equal #### EVX-form evcmpeq BF,RA,RB | 4 | BF | // | RA | RB | 564 | П | |---|----|----|----|----|-----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | ``` ah \leftarrow (RA)<sub>0:31</sub> al \leftarrow (RA)<sub>32:63</sub> bh \leftarrow (RB)<sub>0:31</sub> bl \leftarrow (RB)<sub>32:63</sub> if (ah = bh) then ch \leftarrow 1 else ch \leftarrow 0 if (al = bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4×BF+32:4×BF+35</sub> \leftarrow ch || cl || (ch | cl) || (ch & cl) ``` The most significant bit in BF is set if the high-order element of RA is equal to the high-order element of RB; it is cleared otherwise. The next bit in BF is set if the low-order element of RA is equal to the low-order element of RB and cleared otherwise. The last two bits of BF are set to the OR and AND of the result of the compare of the high and low elements. #### **Special Registers Altered:** CR field BF #### Vector Compare Greater Than Signed EVX-form evcmpgts BF,RA,RB | Ī | 4 | BF | // | RA | RB | 561 | 1 | |---|---|----|----|----|----|------|---| | ( | 0 | 6 | 9 | 11 | 16 | 21 3 | 1 | ``` ah ← (RA) _{0:31} al ← (RA) _{32:63} bh ← (RB) _{0:31} bl ← (RB) _{32:63} if (ah > bh) then ch ← 1 else ch ← 0 if (al > bl) then cl ← 1 else cl ← 0 CR<sub>4×BF+32:4×BF+35</sub> ← ch || cl || (ch | cl) || (ch & cl) ``` The most significant bit in BF is set if the high-order element of RA is greater than the high-order element of RB; it is cleared otherwise. The next bit in BF is set if the low-order element of RA is greater than the low-order element of RB and cleared otherwise. The last two bits of BF are set to the OR and AND of the result of the compare of the high and low elements. #### **Special Registers Altered:** CR field BF #### Vector Compare Greater Than Unsigned EVX-form evcmpgtu BF,RA,RB | 4 | BF | // | RA | RB | 560 | |---|----|----|----|----|-------------------| | 0 | 6 | 9 | 11 | 16 | 21 3 <sup>-</sup> | ``` \begin{array}{l} \text{ah} \leftarrow (\text{RA})_{\,0\,:\,3\,1} \\ \text{al} \leftarrow (\text{RA})_{\,3\,2\,:\,6\,3} \\ \text{bh} \leftarrow (\text{RB})_{\,0\,:\,3\,1} \\ \text{bl} \leftarrow (\text{RB})_{\,3\,2\,:\,6\,3} \\ \text{if (ah} >^\text{u} \text{bh) then ch} \leftarrow 1 \\ \text{else ch} \leftarrow 0 \\ \text{if (al} >^\text{u} \text{bl) then cl} \leftarrow 1 \\ \text{else cl} \leftarrow 0 \\ \text{CR}_{4\,\times\,\mathrm{BF}+3\,2\,:\,4\,\times\,\mathrm{BF}+3\,5} \leftarrow \text{ch} \mid \mid \text{cl} \mid \mid \text{(ch} \mid \text{cl)} \mid \mid \text{(ch} \,\,\&\,\,\text{cl)} \end{array} ``` The most significant bit in BF is set if the high-order element of RA is greater than the high-order element of RB; it is cleared otherwise. The next bit in BF is set if the low-order element of RA is greater than the low-order element of RB and cleared otherwise. The last two bits of BF are set to the OR and AND of the result of the compare of the high and low elements. #### **Special Registers Altered:** CR field BF ## Vector Compare Less Than Unsigned EVX-form evcmpltu BF,RA,RB | 4 | BF | // | RA | RB | 562 | | |---|----|----|----|----|-----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \text{ah} \leftarrow (\text{RA})_{\,0:31} \\ \text{al} \leftarrow (\text{RA})_{\,32:63} \\ \text{bh} \leftarrow (\text{RB})_{\,0:31} \\ \text{bl} \leftarrow (\text{RB})_{\,32:63} \\ \text{if (ah} <^u \text{bh) then ch} \leftarrow 1 \\ \text{else ch} \leftarrow 0 \\ \text{if (al} <^u \text{bl) then cl} \leftarrow 1 \\ \text{else cl} \leftarrow 0 \\ \text{CR}_{4\times \text{BF}+32:4\times \text{BF}+35} \leftarrow \text{ch} \mid \mid \text{cl} \mid \mid \text{(ch} \mid \text{cl)} \mid \mid \text{(ch} \& \text{cl)} \end{array} ``` The most significant bit in BF is set if the high-order element of RA is less than the high-order element of RB; it is cleared otherwise. The next bit in BF is set if the low-order element of RA is less than the low-order element of RB and cleared otherwise. The last two bits of BF are set to the OR and AND of the result of the compare of the high and low elements. #### **Special Registers Altered:** CR field BF #### Vector Compare Less Than Signed EVX-form evcmplts BF,RA,RB | 4 | BF | // | RA | RB | 563 | ٦ | |---|----|----|----|----|-----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | ``` ah ← (RA) _{0:31} al ← (RA) _{32:63} bh ← (RB) _{0:31} bl ← (RB) _{32:63} if (ah < bh) then ch ← 1 else ch ← 0 if (al < bl) then cl ← 1 else cl ← 0 CR_{4\times BF+32:4\times BF+35} ← ch | cl | (ch | cl) | (ch & cl) ``` The most significant bit in BF is set if the high-order element of RA is less than the high-order element of RB; it is cleared otherwise. The next bit in BF is set if the low-order element of RA is less than the low-order element of RB and cleared otherwise. The last two bits of BF are set to the OR and AND of the result of the compare of the high and low elements. #### **Special Registers Altered:** CR field BF #### Vector Count Leading Signed Bits Word EVX-form evcntlsw RT,RA | 4 | RT | RA | /// | 526 | | |---|----|----|-----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | ``` \begin{array}{l} n \leftarrow 0 \\ s \leftarrow (RA)_n \\ \text{do while } n < 32 \\ \quad \text{if } (RA)_n \neq s \text{ then leave} \\ \quad n \leftarrow n+1 \\ RT_{0:31} \leftarrow n \\ \quad n \leftarrow 0 \\ s \leftarrow (RA)_{n+32} \\ \text{do while } n < 32 \\ \quad \text{if } (RA)_{n+32} \neq s \text{ then leave} \\ \quad n \leftarrow n+1 \\ RT_{32:63} \leftarrow n \end{array} ``` The leading sign bits in each element of RA are counted, and the respective count is placed into each element of RT. #### **Special Registers Altered:** None #### **Programming Note** evcntlzw is used for unsigned operands; evcntlsw is used for signed operands. #### Vector Count Leading Zeros Word EVX-form evcntlzw RT,RA ``` n \leftarrow 0 do while n < 32 if (RA)_n = 1 then leave n \leftarrow n + 1 RT_{0:31} \leftarrow n n \leftarrow 0 do while n < 32 if (RA)_{n+32} = 1 then leave n \leftarrow n + 1 RT_{32:63} \leftarrow n ``` The leading zero bits in each element of RA are counted, and the respective count is placed into each element of RT. #### Special Registers Altered: None #### Vector Divide Word Signed EVX-form evdivws RT,RA,RB ``` ddh \leftarrow (RA)_{0:31} ddl \leftarrow (RA)_{32:63} dvh \leftarrow (RB)_{0:31} dvl \leftarrow (RB)_{32:63} RT_{0:31} \leftarrow ddh \div dvh RT_{32:63} \leftarrow ddl \div dvl ovh \leftarrow 0 ovl \leftarrow 0 if ((ddh < 0) & (dvh = 0)) then RT_{0:31} \leftarrow 0x8000_0000 ovh \leftarrow 1 else if ((ddh >= 0) & (dvh = 0)) then RT_{0:31} \leftarrow 0x7FFFFFFF ovh \leftarrow 1 else if (ddh = 0x8000_0000)&(dvh = 0xFFFF_FFFF) RT_{0:31} \leftarrow 0x7FFFFFFF ovh \leftarrow 1 if ((ddl < 0) & (dvl = 0)) then RT_{32:63} \leftarrow 0x8000_0000 ovl \leftarrow 1 else if ((ddl >= 0) & (dvl = 0)) then RT_{32:63} \leftarrow 0x7FFFFFFF ovl ← 1 else if (ddl = 0x8000_0000)&(dvl = 0xFFFF_FFFF) RT_{32:63} \leftarrow 0x7FFFFFFF ovl \leftarrow 1 \texttt{SPEFSCR}_{\texttt{OVH}} \leftarrow \texttt{ovh} SPEFSCR<sub>OV</sub> ← ovl SPEFSCR<sub>SOVH</sub> ← SPEFSCR<sub>SOVH</sub> | ovh SPEFSCR_{SOV} \leftarrow SPEFSCR_{SOV} \mid ovl ``` The two dividends are the two elements of the contents of RA. The two divisors are the two elements of the contents of RB. The resulting two 32-bit quotients on each element are placed into RT. The remainders are not supplied. The operands and quotients are interpreted as signed integers. #### **Special Registers Altered:** OV OVH SOV SOVH #### Programming Note Note that any overflow indication is always set as a side effect of this instruction. No form is defined that disables the setting of the overflow bits. In case of overflow, a saturated value is delivered into the destination register. #### Vector Divide Word Unsigned EVX-form evdivwu RT,RA,RB | 4 | RT | RA | RB | 1223 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` ddh \leftarrow (RA)_{0:31} ddl \leftarrow (RA)_{32:63} dvh \leftarrow (RB)_{0:31} dvl \leftarrow (RB)_{32:63} RT_{0:31} \leftarrow ddh \div dvh RT_{32:63} \leftarrow ddl \div dvl ovh \leftarrow 0 ovl \leftarrow 0 if (dvh = 0) then RT_{0:31} \leftarrow OxFFFFFFFF ovh ← 1 if (dvl = 0) then RT_{32:63} \leftarrow Oxffffffff ovl \leftarrow 1 \texttt{SPEFSCR}_{\texttt{OVH}} \; \leftarrow \; \texttt{ovh} \texttt{SPEFSCR}_{\texttt{OV}} \; \leftarrow \; \texttt{ovl} \texttt{SPEFSCR}_{\texttt{SOVH}} \leftarrow \texttt{SPEFSCR}_{\texttt{SOVH}} \mid \texttt{ovh} SPEFSCR_{SOV} \leftarrow SPEFSCR_{SOV} \mid ovl ``` The two dividends are the two elements of the contents of RA. The two divisors are the two elements of the contents of RB. Two 32-bit quotients are formed as a result of the division on each of the high and low elements and the quotients are placed into RT. Remainders are not supplied. Operands and quotients are interpreted as unsigned integers. #### **Special Registers Altered:** OV OVH SOV SOVH #### - Programming Note Note that any overflow indication is always set as a side effect of this instruction. No form is defined that disables the setting of the overflow bits. In case of overflow, a saturated value is delivered into the destination register. #### Vector Equivalent **EVX-form** eveqv RT,RA,RB | 4 | RT | RA | RB | 537 | | |---|----|-----|------|-----|----| | ' | 1 | 100 | IND. | 007 | | | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:31} \leftarrow (RA)_{0:31} \equiv (RB)_{0:31}$$ $RT_{32:63} \leftarrow (RA)_{32:63} \equiv (RB)_{32:63}$ The corresponding elements of RA and RB are XORed bitwise, and the complemented results are placed in RT #### **Special Registers Altered:** None #### Vector Extend Sign Byte EVX-form evextsb RT,RA | 4 | RT | RA | /// | 522 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{EXTS}\left(\left(\mathtt{RA}\right)_{24:31}\right) \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{EXTS}\left(\left(\mathtt{RA}\right)_{56:63}\right) \end{array}$$ The signs of the low-order byte in each of the elements in RA are extended, and the results are placed in RT. #### **Special Registers Altered:** None #### Vector Extend Sign Halfword EVX-form evextsh RT,RA | | 4 | RT | RA | /// | 523 | | |---|---|----|----|-----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{EXTS} \left( \left( \mathtt{RA} \right)_{16:31} \right) \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{EXTS} \left( \left( \mathtt{RA} \right)_{48:63} \right) \end{array}$$ The signs of the odd halfwords in each of the elements in RA are extended, and the results are placed in RT. #### **Special Registers Altered:** ### Vector Load Double Word into Double Word EVX-form evldd RT,D(RA) | 4 | RT | RA | UI | 769 | | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI \times 8) RT \leftarrow MEM(EA, 8) ``` D in the instruction mnemonic is UI $\times$ 8. The doubleword addressed by EA is loaded from memory and placed in RT. #### **Special Registers Altered:** None #### Vector Load Double into Four Halfwords EVX-form evldh RT,D(RA) | 4 | RT | RA | UI | 773 | 3 | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI \times 8) RT_{0:15} \leftarrow MEM(EA, 2) RT_{16:31} \leftarrow MEM(EA+2,2) RT_{32:47} \leftarrow MEM(EA+4,2) RT_{48:63} \leftarrow MEM(EA+6,2) ``` D in the instruction mnemonic is UI $\times$ 8. The doubleword addressed by EA is loaded from memory and placed in RT. #### **Special Registers Altered:** None ### Vector Load Double Word into Double Word Indexed EVX-form evlddx RT,RA,RB | 4 | RT | RA | RB | 768 | | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow MEM(EA, 8) ``` The doubleword addressed by EA is loaded from memory and placed in RT. #### **Special Registers Altered:** None #### Vector Load Double into Four Halfwords Indexed EVX-form evldhx RT,RA,RB | 4 | RT | RA | RB | 772 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | if (RA = 0) then $b \leftarrow 0$ else $b \leftarrow (RA)$ $EA \leftarrow b + (RB)$ $RT_{0:15} \leftarrow MEM(EA, 2)$ $RT_{16:31} \leftarrow MEM(EA+2,2)$ $RT_{32:47} \leftarrow MEM(EA+4,2)$ $RT_{48:63} \leftarrow MEM(EA+6,2)$ The doubleword addressed by EA is loaded from memory and placed in RT. #### **Special Registers Altered:** #### Vector Load Double into Two Words EVX-form evldw RT,D(RA) | Ī | 4 | RT | RA | UI | 771 | |---|---|----|----|----|-------| | | 0 | 6 | 11 | 16 | 21 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI \times 8) RT_{0:31} \leftarrow MEM(EA, 4) RT_{32:63} \leftarrow MEM(EA+4, 4) ``` D in the instruction mnemonic is UI $\times$ 8. The doubleword addressed by EA is loaded from memory and placed in RT. #### **Special Registers Altered:** None #### Vector Load Halfword into Halfwords Even and Splat EVX-form evlhhesplat RT,D(RA) | 4 | RT | RA | UI | 777 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI \times 2) RT_{0:15} \leftarrow MEM(EA, 2) RT_{16:31} \leftarrow 0x0000 RT_{32:47} \leftarrow MEM(EA, 2) RT_{48:63} \leftarrow 0x0000 ``` D in the instruction mnemonic is UI $\times$ 2. The halfword addressed by EA is loaded from memory and placed in the even halfwords of each element of RT. The odd halfwords of each element of RT are set to 0. #### **Special Registers Altered:** None #### Vector Load Double into Two Words Indexed EVX-form evldwx RT,RA,RB | 4 | RT | RA | RB | 770 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT_{0:31} \leftarrow MEM(EA, 4) RT_{32:63} \leftarrow MEM(EA+4, 4) ``` The doubleword addressed by EA is loaded from memory and placed in RT. #### **Special Registers Altered:** None #### Vector Load Halfword into Halfwords Even and Splat Indexed EVX-form evlhhesplatx RT,RA,RB | 4 | RT | RA | RB | 776 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT_{0:15} \leftarrow MEM(EA, 2) RT_{16:31} \leftarrow 0x0000 RT_{32:47} \leftarrow MEM(EA, 2) RT_{48:63} \leftarrow 0x0000 ``` The halfword addressed by EA is loaded from memory and placed in the even halfwords of each element of RT. The odd halfwords of each element of RT are set to 0. #### **Special Registers Altered:** #### Vector Load Halfword into Halfword Odd Signed and Splat EVX-form evlhhossplat RT,D(RA) ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI \times 2) RT_{0:31} \leftarrow EXTS(MEM(EA, 2)) RT_{32:63} \leftarrow EXTS(MEM(EA, 2)) ``` D in the instruction mnemonic is UI $\times$ 2. The halfword addressed by EA is loaded from memory and placed in the odd halfwords sign extended in each element of RT. #### **Special Registers Altered:** None #### Vector Load Halfword into Halfword Odd Unsigned and Splat EVX-form evlhhousplat RT,D(RA) | 4 | RT | RA | UI | 781 | |---|----|----|----|-------| | o | 6 | 11 | 16 | 21 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI\times2) RT<sub>0:31</sub> \leftarrow EXTZ(MEM(EA,2)) RT<sub>32:63</sub> \leftarrow EXTZ(MEM(EA,2)) ``` D in the instruction mnemonic is UI $\times$ 2. The halfword addressed by EA is loaded from memory and placed in the odd halfwords zero-extended in each element of RT. #### **Special Registers Altered:** None #### Vector Load Halfword into Halfword Odd Signed and Splat Indexed EVX-form evlhhossplatx RT,RA,RB | 4 | RT | RA | RB | 782 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT<sub>0:31</sub> \leftarrow EXTS(MEM(EA,2)) RT<sub>32:63</sub> \leftarrow EXTS(MEM(EA,2)) ``` The halfword addressed by EA is loaded from memory and placed in the odd halfwords sign extended in each element of RT. #### **Special Registers Altered:** None #### Vector Load Halfword into Halfword Odd Unsigned and Splat Indexed EVX-form evIhhousplatx RT,RA,RB | 4 | RT | RA | RB | 780 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT_{0:31} \leftarrow EXTZ(MEM(EA, 2)) RT_{32:63} \leftarrow EXTZ(MEM(EA, 2)) ``` The halfword addressed by EA is loaded from memory and placed in the odd halfwords zero-extended in each element of RT. #### **Special Registers Altered:** #### Vector Load Word into Two Halfwords Even EVX-form evlwhe RT,D(RA) | 1 | 4 | RT | RA | UI | 785 | |---|---|----|----|----|-------| | | 0 | 6 | 11 | 16 | 21 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI \times 4) RT_{0:15} \leftarrow MEM(EA, 2) RT_{16:31} \leftarrow 0x0000 RT_{32:47} \leftarrow MEM(EA+2, 2) RT_{48:63} \leftarrow 0x0000 ``` D in the instruction mnemonic is UI $\times$ 4. The word addressed by EA is loaded from memory and placed in the even halfwords of each element of RT. The odd halfwords of each element of RT are set to 0. #### **Special Registers Altered:** None #### Vector Load Word into Two Halfwords Odd Signed (with sign extension) EVX-form evlwhos RT,D(RA) | | 1 | DT | DΛ | 111 | 701 | 1 | |---|---|----|------|-----|-----|----| | | 4 | N | I KA | UI | 79 | l | | | | | | | | | | 0 | | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI\times4) RT<sub>0:31</sub> \leftarrow EXTS(MEM(EA,2)) RT<sub>32:63</sub> \leftarrow EXTS(MEM(EA+2,2)) ``` D in the instruction mnemonic is UI $\times$ 4. The word addressed by EA is loaded from memory and placed in the odd halfwords sign extended in each element of RT. #### **Special Registers Altered:** None #### Vector Load Word into Two Halfwords Even Indexed EVX-form evlwhex RT,RA,RB | 4 | RT | RA | RB | 784 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT_{0:15} \leftarrow MEM(EA, 2) RT_{16:31} \leftarrow 0x0000 RT_{32:47} \leftarrow MEM(EA+2, 2) RT_{48:63} \leftarrow 0x0000 ``` The word addressed by EA is loaded from memory and placed in the even halfwords in each element of RT. The odd halfwords of each element of RT are set to 0. #### **Special Registers Altered:** None #### Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) EVX-form evlwhosx RT,RA,RB | 4 | RT | RA | RB | 790 | | |---|----|----|----|-----|----| | О | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT<sub>0:31</sub> \leftarrow EXTS(MEM(EA,2)) RT<sub>32:63</sub> \leftarrow EXTS(MEM(EA+2,2)) ``` The word addressed by EA is loaded from memory and placed in the odd halfwords sign extended in each element of RT. #### **Special Registers Altered:** #### Vector Load Word into Two Halfwords Odd Unsigned (zero-extended) EVX-form evlwhou RT,D(RA) ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI\times4) RT<sub>0:31</sub> \leftarrow EXTZ(MEM(EA,2)) RT<sub>32:63</sub> \leftarrow EXTZ(MEM(EA+2,2)) ``` D in the instruction mnemonic is UI $\times$ 4. The word addressed by EA is loaded from memory and placed in the odd halfwords zero-extended in each element of RT. #### **Special Registers Altered:** None ## Vector Load Word into Two Halfwords and Splat EVX-form evlwhsplat RT,D(RA) | ſ | 4 | RT | RA | UI | 797 | |---|---|----|----|----|-------| | ١ | 0 | 6 | 11 | 16 | 21 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UIX4) RT_{0:15} \leftarrow MEM(EA, 2) RT_{16:31} \leftarrow MEM(EA, 2) RT_{32:47} \leftarrow MEM(EA+2, 2) RT_{48:63} \leftarrow MEM(EA+2, 2) ``` D in the instruction mnemonic is UI $\times$ 4. The word addressed by EA is loaded from memory and placed in both the even and odd halfwords in each element of RT. #### **Special Registers Altered:** None #### Vector Load Word into Two Halfwords Odd Unsigned Indexed (zero-extended) EVX-form evlwhoux RT,RA,RB | 4 | RT | RA | RB | 788 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT<sub>0:31</sub> \leftarrow EXTZ(MEM(EA,2)) RT<sub>32:63</sub> \leftarrow EXTZ(MEM(EA+2,2)) ``` The word addressed by EA is loaded from memory and placed in the odd halfwords zero-extended in each element of RT. #### **Special Registers Altered:** None ### Vector Load Word into Two Halfwords and Splat Indexed EVX-form evlwhsplatx RT,RA,RB | 4 | RT | RA | RB | 796 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 3° | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT_{0:15} \leftarrow MEM(EA, 2) RT_{16:31} \leftarrow MEM(EA, 2) RT_{32:47} \leftarrow MEM(EA+2, 2) RT_{48:63} \leftarrow MEM(EA+2, 2) ``` The word addressed by EA is loaded from memory and placed in both the even and odd halfwords in each element of RT. #### **Special Registers Altered:** #### Vector Load Word into Word and Splat EVX-form evlwwsplat RT,D(RA) | 4 | RT | RA | UI | 793 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI \times 4) RT_{0:31} \leftarrow MEM(EA, 4) RT_{32:63} \leftarrow MEM(EA, 4) ``` D in the instruction mnemonic is UI $\times$ 4. The word addressed by EA is loaded from memory and placed in both elements of RT. #### **Special Registers Altered:** None #### Vector Merge High EVX-form evmergehi RT,RA,RB | 4 | RT | RA | RB | 556 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | $$\begin{array}{l} \operatorname{RT}_{0:31} \leftarrow \left( \operatorname{RA} \right)_{0:31} \\ \operatorname{RT}_{32:63} \leftarrow \left( \operatorname{RB} \right)_{0:31} \end{array}$$ The high-order elements of RA and RB are merged and placed in RT. #### **Special Registers Altered:** None #### - Programming Note A vector splat high can be performed by specifying the same register in RA and RB. #### Vector Load Word into Word and Splat Indexed EVX-form evlwwsplatx RT,RA,RB | 4 | RT | RA | RB | 792 | 1 | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT_{0:31} \leftarrow MEM(EA,4) RT_{32:63} \leftarrow MEM(EA,4) ``` The word addressed by EA is loaded from memory and placed in both elements of RT. #### **Special Registers Altered:** None ### Vector Merge Low EVX-form evmergelo RT,RA,RB | 4 | RT | RA | RB | 557 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:31} \leftarrow (RA)_{32:63} RT_{32:63} \leftarrow (RB)_{32:63}$$ The low-order elements of RA and RB are merged and placed in RT. #### **Special Registers Altered:** None #### Programming Note A vector splat low can be performed by specifying the same register in RA and RB. #### Vector Merge High/Low #### **EVX-form** evmergehilo RT,RA,RB | 4 | RT | RA | RB | 558 | | | |---|----|----|----|------|----|--| | 0 | 6 | 11 | 16 | 21 : | 31 | | $$RT_{0:31} \leftarrow (RA)_{0:31}$$ $RT_{32:63} \leftarrow (RB)_{32:63}$ The high-order element of RA and the low-order element of RB are merged and placed in RT. #### **Special Registers Altered:** None #### **Programming Note** With appropriate specification of RA and RB, evmergehi, evmergelo, evmergehilo, evmergelohi provide a full 32-bit permute of two source operands. ### Vector Merge Low/High EVX-form evmergelohi RT,RA,RB | 4 | RT | RA | RB | 559 | 1 | |---|----|----|----|------|----| | o | 6 | 11 | 16 | 21 : | 31 | $$RT_{0:31} \leftarrow (RA)_{32:63}$$ $RT_{32:63} \leftarrow (RB)_{0:31}$ The low-order element of RA and the high-order element of RB are merged and placed in RT. #### **Special Registers Altered:** None #### Programming Note - A vector swap can be performed by specifying the same register in RA and RB. #### Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, Fractional and Accumulate **EVX-form** evmhegsmfaa RT,RA,RB | 4 | RT | RA | RB | 1323 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{32:47} \times_{\mathsf{gsf}} (\mathsf{RB})_{32:47} \\ \mathsf{RT}_{0:63} \leftarrow (\mathsf{ACC})_{0:63} + \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array}$$ The corresponding low even-numbered, halfword signed fractional elements in RA and RB are multiplied using guarded signed fractional multiplication producing a sign extended 64-bit fractional product with the decimal between bits 32 and 33. The product is added to the contents of the 64-bit accumulator and the result is placed in RT and the accumulator #### **Special Registers Altered:** ACC If the two input operands are both -1.0, the intermediate product is represented as +1.0. #### Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, Fractional and Accumulate Negative **EVX-form** evmhegsmfan RT,RA,RB | | 4 | RT | RA | RB | 1451 | | |---|---|----|----|----|------|----| | 0 | | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{32:47} \times_{\mathsf{gsf}} (\mathsf{RB})_{32:47} \\ \mathsf{RT}_{0:63} \leftarrow (\mathsf{ACC})_{0:63} - \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array}$$ The corresponding low even-numbered, halfword signed fractional elements in RA and RB are multiplied using guarded signed fractional multiplication producing a sign extended 64-bit fractional product with the decimal between bits 32 and 33. The product is subtracted from the contents of the 64-bit accumulator and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC If the two input operands are both -1.0, the intermediate product is represented as +1.0. #### Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, Integer and Accumulate EVX-form evmhegsmiaa RT,RA,RB | 4 | RT | RA | RB | 1321 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{lll} \mathsf{temp}_{0:31} \leftarrow & (\mathsf{RA})_{32:47} \times_{\mathsf{si}} & (\mathsf{RB})_{32:47} \\ \mathsf{temp}_{0:63} \leftarrow & \mathsf{EXTS} (\mathsf{temp}_{0:31}) \\ \mathsf{RT}_{0:63} \leftarrow & (\mathsf{ACC})_{0:63} + \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow & (\mathsf{RT})_{0:63} \end{array} ``` The corresponding low even-numbered halfword signed-integer elements in RA and RB are multiplied. The intermediate product is sign-extended and added to the contents of the 64-bit accumulator, and the resulting sum is placed in RT and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Even, Guarded, Unsigned, Modulo, Integer and Accumulate EVX-form evmhegumiaa RT,RA,RB | 4 | RT | RA | RB | 1320 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{32:47} \times_{\mathsf{ui}} (\mathsf{RB})_{32:47} \\ \mathsf{temp}_{0:63} \leftarrow \mathsf{EXTZ}(\mathsf{temp}_{0:31}) \\ \mathsf{RT}_{0:63} \leftarrow (\mathsf{ACC})_{0:63} + \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array} ``` The corresponding low even-numbered halfword unsigned-integer elements in RA and RB are multiplied. The intermediate product is zero-extended and added to the contents of the 64-bit accumulator. The resulting sum is placed in RT and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, Integer and Accumulate Negative EVX-form evmhegsmian RT,RA,RB | 4 | RT | RA | RB | 1449 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{lll} \mathsf{temp}_{0:31} & \leftarrow & (\mathsf{RA})_{32:47} \; \mathsf{X_{si}} & (\mathsf{RB})_{32:47} \\ \mathsf{temp}_{0:63} & \leftarrow & \mathsf{EXTS} \, (\mathsf{temp}_{0:31}) \\ \mathsf{RT}_{0:63} & \leftarrow & (\mathsf{ACC})_{0:63} \; - \; \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} & \leftarrow & (\mathsf{RT})_{0:63} \end{array} ``` The corresponding low even-numbered halfword signed-integer elements in RA and RB are multiplied. The intermediate product is sign-extended and subtracted from the contents of the 64-bit accumulator, and the result is placed in RT and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Even, Guarded, Unsigned, Modulo, Integer and Accumulate Negative EVX-form evmhegumian RT,RA,RB | 4 | RT | RA | RB | 1448 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{lll} \mathsf{temp}_{0:31} & \leftarrow & (\mathsf{RA})_{32:47} \; \mathsf{X_{ui}} & (\mathsf{RB})_{32:47} \\ \mathsf{temp}_{0:63} & \leftarrow & \mathsf{EXTZ} \, (\mathsf{temp}_{0:31}) \\ \mathsf{RT}_{0:63} & \leftarrow & (\mathsf{ACC})_{0:63} \; - \; \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} & \leftarrow & (\mathsf{RT})_{0:63} \end{array} ``` The corresponding low even-numbered unsigned-integer elements in RA and RB are multiplied. The intermediate product is zero-extended and subtracted from the contents of the 64-bit accumulator. The result is placed in RT and into the accumulator. #### **Special Registers Altered:** ## Vector Multiply Halfwords, Even, Signed, Modulo, Fractional EVX-form evmhesmf RT,RA,RB $$\begin{array}{l} \text{RT}_{0:31} \leftarrow (\text{RA})_{0:15} \times_{\text{sf}} (\text{RB})_{0:15} \\ \text{RT}_{32:63} \leftarrow (\text{RA})_{32:47} \times_{\text{sf}} (\text{RB})_{32:47} \end{array}$$ The corresponding even-numbered halfword signed fractional elements in RA and RB are multiplied then placed into the corresponding words of RT. #### **Special Registers Altered:** None #### Vector Multiply Halfwords, Even, Signed, Modulo, Fractional and Accumulate into Words EVX-form evmhesmfaaw RT,RA,RB | 4 | RT | RA | RB | 1291 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | For each word element in the accumulator, the corresponding even-numbered halfword signed fractional elements in RA and RB are multiplied. The 32 bits of each intermediate product are added to the contents of the accumulator words to form intermediate sums, which are placed into the corresponding RT words and into the accumulator. #### Special Registers Altered: ACC #### Vector Multiply Halfwords, Even, Signed, Modulo, Fractional to Accumulator EVX-form evmhesmfa RT,RA,RB | 4 | RT | RA | RB | 1067 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\mathrm{RT}_{0:31} \leftarrow \left(\mathrm{RA}\right)_{0:15} \times_{\mathrm{sf}} \left(\mathrm{RB}\right)_{0:15}$$ $$RT_{32:63} \leftarrow (RA)_{32:47} \times_{sf} (RB)_{32:47}$$ $ACC_{0:63} \leftarrow (RT)_{0:63}$ The corresponding even-numbered halfword signed fractional elements in RA and RB are multiplied then placed into the corresponding words of RT and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Even, Signed, Modulo, Fractional and Accumulate Negative into Words EVX-form evmhesmfanw RT,RA,RB $$\begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathtt{RA})_{\,0:15} \,\, \mathsf{X}_{\mathtt{sf}} \,\, (\mathtt{RB})_{\,0:15} \\ \mathtt{RT}_{0:31} \leftarrow (\mathtt{ACC})_{\,0:31} \,\, \text{-} \,\, \mathsf{temp}_{0:31} \end{array}$$ $$\begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{32:47} \times_{\mathsf{sf}} (\mathsf{RB})_{32:47} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} - \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array}$$ For each word element in the accumulator, the corresponding even-numbered halfword signed fractional elements in RA and RB are multiplied. The 32-bit intermediate products are subtracted from the contents of the accumulator words to form intermediate differences, which are placed into the corresponding RT words and into the accumulator. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Even, Signed, Modulo, Integer EVX-form evmhesmi RT,RA,RB | 4 | RT | RA | RB | 1033 | 1 | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | $$\begin{array}{l} {\rm RT_{0:31}} \leftarrow {\rm (RA)_{0:15}} \times_{\rm si} {\rm (RB)_{0:15}} \\ {\rm RT_{32:63}} \leftarrow {\rm (RA)_{32:47}} \times_{\rm si} {\rm (RB)_{32:47}} \end{array}$$ The corresponding even-numbered halfword signed-integer elements in RA and RB are multiplied. The two 32-bit products are placed into the corresponding words of RT. #### **Special Registers Altered:** None # Vector Multiply Halfwords, Even, Signed, Modulo, Integer and Accumulate into Words EVX-form evmhesmiaaw RT,RA,RB | 4 | RT | RA | RB | 1289 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | For each word element in the accumulator, the corresponding even-numbered halfword signed-integer elements in RA and RB are multiplied. Each intermediate 32-bit product is added to the contents of the accumulator words to form intermediate sums, which are placed into the corresponding RT words and into the accumulator. #### **Special Registers Altered:** ACC ### Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX-form evmhesmia RT,RA,RB | 4 | RT | RA | RB | 1065 | 1 | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | $$RT_{0,31} \leftarrow (RA)_{0,15} \times_{si} (RB)_{0,15}$$ $RT_{32:63} \leftarrow (RA)_{32:47} \times_{si} (RB)_{32:47}$ $ACC_{0,63} \leftarrow (RT)_{0:63}$ The corresponding even-numbered halfword signed-integer elements in RA and RB are multiplied. The two 32-bit products are placed into the corresponding words of RT and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Even, Signed, Modulo, Integer and Accumulate Negative into Words EVX-form evmhesmianw RT,RA,RB | 4 | RT | RA | RB | 1417 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` \begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{0:15} \; \mathsf{X_{si}} \; \; (\mathsf{RB})_{0:15} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{0:31} \; - \; \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{32:47} \; \mathsf{X_{si}} \; \; (\mathsf{RB})_{32:47} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} \; - \; \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array} ``` For each word element in the accumulator, the corresponding even-numbered halfword signed-integer elements in RA and RB are multiplied. Each intermediate 32-bit product is subtracted from the contents of the accumulator words to form intermediate differences, which are placed into the corresponding RT words and into the accumulator. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Even, Signed, Saturate, Fractional EVX-form evmhessf RT,RA,RB ``` \begin{array}{l} \operatorname{temp}_{0:31} \leftarrow (\operatorname{RA})_{0:15} \times_{\operatorname{sf}} (\operatorname{RB})_{0:15} \\ \operatorname{if} \ ((\operatorname{RA})_{0:15} = \operatorname{0x8000}) \ \& \ ((\operatorname{RB})_{0:15} = \operatorname{0x8000}) \ \operatorname{then} \\ \operatorname{RT}_{0:31} \leftarrow \operatorname{0x7FFF\_FFFF} \\ \operatorname{movh} \leftarrow 1 \\ \operatorname{else} \\ \operatorname{RT}_{0:31} \leftarrow \operatorname{temp}_{0:31} \\ \operatorname{movh} \leftarrow 0 \\ \operatorname{temp}_{0:31} \leftarrow (\operatorname{RA})_{32:47} \times_{\operatorname{sf}} (\operatorname{RB})_{32:47} \\ \operatorname{if} \ ((\operatorname{RA})_{32:47} = \operatorname{0x8000}) \ \& \ ((\operatorname{RB})_{32:47} = \operatorname{0x8000}) \ \operatorname{then} \\ \operatorname{RT}_{32:63} \leftarrow \operatorname{0x7FFF\_FFFF} \\ \operatorname{movl} \leftarrow 1 \\ \operatorname{else} \\ \operatorname{RT}_{32:63} \leftarrow \operatorname{temp}_{0:31} \\ \operatorname{movl} \leftarrow 0 \\ \operatorname{SPEFSCR}_{\operatorname{OVH}} \leftarrow \operatorname{movh} \\ \operatorname{SPEFSCR}_{\operatorname{SOVH}} \leftarrow \operatorname{speFSCR}_{\operatorname{SOVH}} \mid \operatorname{movh} \\ \operatorname{SPEFSCR}_{\operatorname{SOVH}} \leftarrow \operatorname{SPEFSCR}_{\operatorname{SOVH}} \mid \operatorname{movh} \\ \operatorname{SPEFSCR}_{\operatorname{SOV}} \leftarrow \operatorname{SPEFSCR}_{\operatorname{SOV}} \mid \operatorname{movl} \end{array} ``` The corresponding even-numbered halfword signed fractional elements in RA and RB are multiplied. The 32 bits of each product are placed into the corresponding words of RT. If both inputs are -1.0, the result saturates to the largest positive signed fraction. #### **Special Registers Altered:** OV OVH SOV SOVH #### Vector Multiply Halfwords, Even, Signed, Saturate, Fractional to Accumulator EVX-form evmhessfa RT,RA,RB ``` \begin{array}{l} {\rm temp_{0:31}} \leftarrow {\rm (RA)_{\,0:15}} \,\, {\rm X_{sf}} \,\, {\rm (RB)_{\,0:15}} \\ {\rm if} \,\, {\rm ((RA)_{\,0:15}} \,=\, 0x8000) \,\,\, \& \,\, {\rm ((RB)_{\,0:15}} \,=\, 0x8000) \,\,\, {\rm then} \end{array} RT_{0:31} \leftarrow 0x7FFF\_FFFF movh ← 1 else \begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{temp_{0:31}} \\ \mathtt{movh} \leftarrow \mathtt{0} \end{array} temp_{0:31} \leftarrow (RA)_{32:47} \times_{sf} (RB)_{32:47} if ((RA)<sub>32:47</sub> = 0x8000) & ((RB)<sub>32:47</sub> = 0x8000) then RT_{32:63} \leftarrow 0x7FFF\_FFFF movl ←1 else \mathtt{RT}_{32:63} \leftarrow \mathtt{temp}_{0:31} movl \leftarrow 0 ACC_{0:63} \leftarrow (RT)_{0:63} \texttt{SPEFSCR}_{\texttt{OVH}} \; \leftarrow \texttt{movh} SPEFSCR<sub>OV</sub> ← movl SPEFSCR_{SOVH} \leftarrow SPEFSCR_{SOVH} \mid movh \texttt{SPEFSCR}_{\texttt{SOV}} \leftarrow \texttt{SPEFSCR}_{\texttt{SOV}} \mid \ \texttt{movl} ``` The corresponding even-numbered halfword signed fractional elements in RA and RB are multiplied. The 32 bits of each product are placed into the corresponding words of RT and into the accumulator. If both inputs are -1.0, the result saturates to the largest positive signed fraction. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Even, Signed, Saturate, Fractional and Accumulate into Words EVX-form evmhessfaaw RT,RA,RB | 4 | RT | RA | RB | 1283 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` temp_{0:31} \leftarrow (RA)_{0:15} \times_{sf} (RB)_{0:15} if ((RA)_{0:15} = 0x8000) & ((RB)_{0:15} = 0x8000) then temp_{0:31} \leftarrow 0x7FFF_FFFF movh \leftarrow 1 else movh \leftarrow 0 temp_{0:63} \leftarrow EXTS((ACC)_{0:31}) + EXTS(temp_{0:31}) ovh \leftarrow (temp<sub>31</sub> \oplus temp<sub>32</sub>) RT_{0:31} \leftarrow SATURATE(ovh, temp_{31}, 0x8000_0000, 0x7FFF_FFFF, temp<sub>32:63</sub>) temp_{0:31} \leftarrow (RA)_{32:47} \times_{sf} (RB)_{32:47} if ((RA)_{32:47} = 0x8000) & ((RB)_{32:47} = 0x8000) then temp_{0:31} \leftarrow 0x7FFF\_FFFF movl \leftarrow 1 else mov1 \leftarrow 0 \texttt{temp}_{0:63} \, \leftarrow \texttt{EXTS} \, (\, \texttt{(ACC)}_{\, 32:63}) \, \, + \, \texttt{EXTS} \, (\texttt{temp}_{0:31}) \texttt{ovl} \; \gets (\texttt{temp}_{\texttt{31}} \; \oplus \; \texttt{temp}_{\texttt{32}}) RT_{32:63} \leftarrow SATURATE(ovl, temp_{31}, 0x8000_0000, 0x7FFF_FFFF, temp<sub>32:63</sub>) ACC_{0:63} \leftarrow (RT)_{0:63} SPEFSCR_{OVH} \leftarrow ovh \mid movh \texttt{SPEFSCR}_{\texttt{OV}} \; \leftarrow \texttt{ovl} \; | \; \texttt{movl} \texttt{SPEFSCR}_{\texttt{SOVH}} \, \leftarrow \texttt{SPEFSCR}_{\texttt{SOVH}} \, \mid \, \texttt{ovh} \, \mid \, \texttt{movh} \texttt{SPEFSCR}_{\texttt{SOV}} \leftarrow \texttt{SPEFSCR}_{\texttt{SOV}} \ | \ \texttt{ovl} \ | \ \texttt{movl} ``` The corresponding even-numbered halfword signed fractional elements in RA and RB are multiplied producing a 32-bit product. If both inputs are -1.0, the result saturates to 0x7FFF\_FFFF. Each 32-bit product is then added to the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC OV OVH SOV SOVH # Vector Multiply Halfwords, Even, Signed, Saturate, Fractional and Accumulate Negative into Words EVX-form evmhessfanw RT,RA,RB | 4 | RT | RA | RB | 1411 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{split} & \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{\,0:15} \,\, \mathsf{x_{sf}} \,\, (\mathsf{RB})_{\,0:15} \\ & \mathsf{if} \,\, ((\mathsf{RA})_{\,0:15} \,=\, 0x8000) \,\, \& \,\, ((\mathsf{RB})_{\,0:15} \,=\, 0x8000) \,\, \mathsf{then} \end{split} temp_{0:31} \leftarrow 0x7FFF\_FFFF movh ←1 else movh \leftarrow 0 temp_{0:63} \leftarrow EXTS((ACC)_{0:31}) - EXTS(temp_{0:31}) ovh \leftarrow (temp<sub>31</sub> \oplus temp<sub>32</sub>) RT_{0:31} \leftarrow SATURATE(ovh, temp_{31}, 0x8000_0000, 0x7FFF_FFFF, temp<sub>32:63</sub>) temp_{0:31} \leftarrow (RA)_{32:47} \times_{sf} (RB)_{32:47} if ((RA)_{32:47} = 0x8000) & ((RB)_{32:47} = 0x8000) then \texttt{temp}_{\texttt{0:31}} \leftarrow \texttt{0x7FFF\_FFFF} movl \leftarrow 1 else mov1 \leftarrow 0 temp_{0:63} \leftarrow EXTS((ACC)_{32:63}) - EXTS(temp_{0:31}) ovl \leftarrow (temp<sub>31</sub> \oplus temp<sub>32</sub>) RT_{32:63} \leftarrow SATURATE(ovl, temp_{31}, 0x8000_0000, 0x7FFF FFFF, temp<sub>32:63</sub>) ACC_{0:63} \leftarrow (RT)_{0:63} SPEFSCR<sub>OVH</sub> ← ovh | movh SPEFSCR<sub>OV</sub> ← ovl | movl SPEFSCR_{SOVH} \leftarrow SPEFSCR_{SOVH} \mid ovh \mid movh \texttt{SPEFSCR}_{\texttt{SOV}} \leftarrow \texttt{SPEFSCR}_{\texttt{SOV}} \ | \ \texttt{ovl} | \ \texttt{movl} ``` The corresponding even-numbered halfword signed fractional elements in RA and RB are multiplied producing a 32-bit product. If both inputs are -1.0, the result saturates to 0x7FFF\_FFFF. Each 32-bit product is then subtracted from the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Even, Signed, Saturate, Integer and Accumulate into Words EVX-form evmhessiaaw RT,RA,RB ``` \begin{array}{l} \mathsf{temp_{0:31}} \leftarrow (\mathsf{RA})_{0:15} \times_{\mathsf{si}} (\mathsf{RB})_{0:15} \\ \mathsf{temp_{0:63}} \leftarrow \mathsf{EXTS}((\mathsf{ACC})_{0:31}) + \mathsf{EXTS}(\mathsf{temp_{0:31}}) \\ \mathsf{ovh} \leftarrow (\mathsf{temp_{31}} \oplus \mathsf{temp_{32}}) \\ \mathsf{RT_{0:31}} \leftarrow \mathsf{SATURATE}(\mathsf{ovh}, \, \mathsf{temp_{31}}, \, \mathsf{0x8000\_0000}, \\ & & \mathsf{0x7FFF\_FFFF}, \, \mathsf{temp_{32:63}}) \\ \\ \mathsf{temp_{0:31}} \leftarrow (\mathsf{RA})_{32:47} \times_{\mathsf{si}} (\mathsf{RB})_{32:47} \\ \mathsf{temp_{0:63}} \leftarrow \mathsf{EXTS}((\mathsf{ACC})_{32:63}) + \mathsf{EXTS}(\mathsf{temp_{0:31}}) \\ \mathsf{ovl} \leftarrow (\mathsf{temp_{31}} \oplus \mathsf{temp_{32}}) \\ \mathsf{RT_{32:63}} \leftarrow \mathsf{SATURATE}(\mathsf{ovl}, \, \mathsf{temp_{31}}, \, \mathsf{0x8000\_0000}, \\ & & \mathsf{0x7FFF\_FFFF}, \, \mathsf{temp_{32:63}}) \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \\ \mathsf{SPEFSCR_{OVH}} \leftarrow \mathsf{ovh} \\ \mathsf{SPEFSCR_{OV}} \leftarrow \mathsf{ovl} \\ \mathsf{SPEFSCR_{SOVH}} \leftarrow \mathsf{SPEFSCR_{SOVH}} \mid \mathsf{ovh} \\ \mathsf{SPEFSCR_{SOV}} \leftarrow \mathsf{SPEFSCR_{SOVH}} \mid \mathsf{ovh} \\ \mathsf{SPEFSCR_{SOV}} \leftarrow \mathsf{SPEFSCR_{SOV}} \mid \mathsf{ovl} \\ \end{array} ``` The corresponding even-numbered halfword signed-integer elements in RA and RB are multiplied producing a 32-bit product. Each 32-bit product is then added to the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC OV OVH SOV SOVH # Vector Multiply Halfwords, Even, Signed, Saturate, Integer and Accumulate Negative into Words EVX-form evmhessianw RT,RA,RB | ſ | 4 | RT | RA | RB | 1409 | | |---|---|----|----|----|------|----| | | 0 | 6 | 11 | 16 | 21 | 31 | The corresponding even-numbered halfword signed-integer elements in RA and RB are multiplied producing a 32-bit product. Each 32-bit product is then subtracted from the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer EVX-form evmheumi RT,RA,RB $$\begin{array}{l} \text{RT}_{0:31} \leftarrow (\text{RA})_{0:15} \times_{\text{ui}} (\text{RB})_{0:15} \\ \text{RT}_{32:63} \leftarrow (\text{RA})_{32:47} \times_{\text{ui}} (\text{RB})_{32:47} \end{array}$$ The corresponding even-numbered halfword unsigned-integer elements in RA and RB are multiplied. The two 32-bit products are placed into the corresponding words of RT. #### **Special Registers Altered:** None #### Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer and Accumulate into Words EVX-form evmheumiaaw RT,RA,RB | 4 | RT | RA | RB | 1288 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{0:15} \; \mathsf{X}_{ui} \; (\mathsf{RB})_{0:15} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{0:31} + \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{32:47} \; \mathsf{X}_{ui} \; (\mathsf{RB})_{32:47} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} + \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array}$$ For each word element in the accumulator, the corresponding even-numbered halfword unsigned-integer elements in RA and RB are multiplied. Each intermediate product is added to the contents of the corresponding accumulator words and the sums are placed into the corresponding RT and accumulator words. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer to Accumulator EVX-form evmheumia RT,RA,RB | 4 | RT | RA | RB | 1064 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:31} \leftarrow (RA)_{0:15} \times_{ui} (RB)_{0:15}$$ $RT_{32:63} \leftarrow (RA)_{32:47} \times_{ui} (RB)_{32:47}$ $ACC_{0:63} \leftarrow (RT)_{0:63}$ The corresponding even-numbered halfword unsigned-integer elements in RA and RB are multiplied. The two 32-bit products are placed into RT and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer and Accumulate Negative into Words EVX-form evmheumianw RT,RA,RB | 4 | RT | RA | RB | 1416 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{0:15} \; \mathsf{X}_{ui} \; (\mathsf{RB})_{0:15} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{0:31} \; - \; \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{32:47} \; \mathsf{X}_{ui} \; (\mathsf{RB})_{32:47} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} \; - \; \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array} ``` For each word element in the accumulator, the corresponding even-numbered halfword unsigned-integer elements in RA and RB are multiplied. Each intermediate product is subtracted from the contents of the corresponding accumulator words. The differences are placed into the corresponding RT and accumulator words. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Even, Unsigned, Saturate, Integer and Accumulate into Words EVX-form evmheusiaaw RT,RA,RB For each word element in the accumulator, corresponding even-numbered halfword unsigned-integer elements in RA and RB are multiplied producing a 32-bit product. Each 32-bit product is then added to the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator #### **Special Registers Altered:** ACC OV OVH SOV SOVH #### Vector Multiply Halfwords, Even, Unsigned, Saturate, Integer and Accumulate Negative into Words EVX-form evmheusianw RT,RA,RB | Ī | 4 | RT | RA | RB | 1408 | | |---|---|----|----|----|------|----| | ( | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{llll} \text{temp}_{0:31} \leftarrow & (\text{RA})_{0:15} \times_{ui} & (\text{RB})_{0:15} \\ \text{temp}_{0:63} \leftarrow & \text{EXTZ}((\text{ACC})_{0:31}) - & \text{EXTZ}(\text{temp}_{0:31}) \\ \text{ovh} \leftarrow & \text{temp}_{31} \\ \text{RT}_{0:31} \leftarrow & \text{SATURATE}(\text{ovh}, 0, 0\text{x}0000\_0000, 0\text{x}00000\_0000, \\ & & \text{temp}_{32:63}) \\ \text{temp}_{0:31} \leftarrow & (\text{RA})_{32:47} \times_{ui} & (\text{RB})_{32:47} \\ \text{temp}_{0:63} \leftarrow & \text{EXTZ}((\text{ACC})_{32:63}) - & \text{EXTZ}(\text{temp}_{0:31}) \\ \text{ovl} \leftarrow & \text{temp}_{31} \\ \text{RT}_{32:63} \leftarrow & \text{SATURATE}(\text{ovl}, 0, 0\text{x}0000\_0000, \\ & & 0\text{x}0000\_0000, \text{temp}_{32:63}) \\ \text{ACC}_{0:63} \leftarrow & (\text{RT})_{0:63} \\ \text{SPEFSCR}_{\text{OVH}} \leftarrow & \text{ovh} \\ \text{SPEFSCR}_{\text{SOVH}} \leftarrow & \text{spefsCR}_{\text{SOVH}} \mid \text{ovh} \\ \text{SPEFSCR}_{\text{SOV}} \leftarrow & \text{SPEFSCR}_{\text{SOV}} \mid \text{ovl} \end{array} ``` For each word element in the accumulator, corresponding even-numbered halfword unsigned-integer elements in RA and RB are multiplied producing a 32-bit product. Each 32-bit product is then subtracted from the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Fractional and Accumulate EVX-form evmhogsmfaa RT,RA,RB | 4 | RT | RA | RB | 1327 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{48:63} \times_{\mathsf{gsf}} (\mathsf{RB})_{48:63} \\ \mathsf{RT}_{0:63} \leftarrow (\mathsf{ACC})_{0:63} + \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array}$$ The corresponding low odd-numbered, halfword signed fractional elements in RA and RB are multiplied using guarded signed fractional multiplication producing a sign extended 64-bit fractional product with the decimal between bits 32 and 33. The product is added to the contents of the 64-bit accumulator and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC #### Note If the two input operands are both -1.0, the intermediate product is represented as +1.0. #### Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Integer and Accumulate EVX-form evmhogsmiaa RT,RA,RB | 4 | RT | RA | RB | 1325 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:31} \leftarrow & (\mathsf{RA})_{48:63} \times_{\mathsf{si}} & (\mathsf{RB})_{48:63} \\ \mathsf{temp}_{0:63} \leftarrow & \mathsf{EXTS} (\mathsf{temp}_{0:31}) \\ \mathsf{RT}_{0:63} \leftarrow & (\mathsf{ACC})_{0:63} + \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow & (\mathsf{RT})_{0:63} \end{array}$$ The corresponding low odd-numbered halfword signed-integer elements in RA and RB are multiplied. The intermediate product is sign-extended to 64 bits then added to the contents of the 64-bit accumulator, and the result is placed in RT and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Fractional and Accumulate Negative EVX-form evmhogsmfan RT,RA,RB | 4 | RT | RA | RB | 1455 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 3 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{48:63} \; \mathsf{X}_{\mathsf{gsf}} \; (\mathsf{RB})_{48:63} \\ \mathsf{RT}_{0:63} \leftarrow (\mathsf{ACC})_{0:63} - \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array}$$ The corresponding low odd-numbered, halfword signed fractional elements in RA and RB are multiplied using guarded signed fractional multiplication producing a sign extended 64-bit fractional product with the decimal between bits 32 and 33. The product is subtracted from the contents of the 64-bit accumulator and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC #### - Note - If the two input operands are both -1.0, the intermediate product is represented as +1.0. #### Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Integer and Accumulate Negative EVX-form evmhogsmian RT,RA,RB | | 4 | RT | RA | RB | 1453 | | |---|---|----|----|----|------|----| | 0 | | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:31} \leftarrow & (\mathsf{RA})_{48:63} \times_{\mathsf{si}} & (\mathsf{RB})_{48:63} \\ \mathsf{temp}_{0:63} \leftarrow & \mathsf{EXTS} \, (\mathsf{temp}_{0:31}) \\ \mathsf{RT}_{0:63} \leftarrow & (\mathsf{ACC})_{0:63} - \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow & (\mathsf{RT})_{0:63} \end{array}$$ The corresponding low odd-numbered halfword signed-integer elements in RA and RB are multiplied. The intermediate product is sign-extended to 64 bits then subtracted from the contents of the 64-bit accumulator, and the result is placed in RT and into the accumulator. #### Special Registers Altered: #### Vector Multiply Halfwords, Odd, Guarded, Unsigned, Modulo, Integer and Accumulate EVX-form evmhogumiaa RT,RA,RB | 4 | RT | RA | RB | 1324 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | The corresponding low odd-numbered halfword unsigned-integer elements in RA and RB are multiplied. The intermediate product is zero-extended to 64 bits then added to the contents of the 64-bit accumulator, and the result is placed in RT and into the accumulator. #### **Special Registers Altered:** ACC ### Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional EVX-form evmhosmf RT,RA,RB | 4 | RT | RA | RB | 1039 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} {\rm RT_{0:31}} \leftarrow {\rm (RA)_{16:31}} \ {\rm X_{sf}} \ {\rm (RB)_{16:31}} \\ {\rm RT_{32:63}} \leftarrow {\rm (RA)_{48:63}} \ {\rm X_{sf}} \ {\rm (RB)_{48:63}} \end{array}$$ The corresponding odd-numbered, halfword signed fractional elements in RA and RB are multiplied. Each product is placed into the corresponding words of RT. #### **Special Registers Altered:** None #### Vector Multiply Halfwords, Odd, Guarded, Unsigned, Modulo, Integer and Accumulate Negative EVX-form evmhogumian RT,RA,RB | 4 | RT | RA | RB | 1452 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | The corresponding low odd-numbered halfword unsigned-integer elements in RA and RB are multiplied. The intermediate product is zero-extended to 64 bits then subtracted from the contents of the 64-bit accumulator, and the result is placed in RT and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional to Accumulator EVX-form evmhosmfa RT,RA,RB | 4 | RT | RA | RB | 1071 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \text{RT}_{0:31} \leftarrow (\text{RA})_{16:31} \times_{\text{sf}} (\text{RB})_{16:31} \\ \text{RT}_{32:63} \leftarrow (\text{RA})_{48:63} \times_{\text{sf}} (\text{RB})_{48:63} \\ \text{ACC}_{0:63} \leftarrow (\text{RT})_{0:63} \end{array} ``` The corresponding odd-numbered, halfword signed fractional elements in RA and RB are multiplied. Each product is placed into the corresponding words of RT. and into the accumulator. #### Special Registers Altered: #### Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate into Words EVX-form evmhosmfaaw RT,RA,RB | 4 | RT | RA | RB | 1295 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{16:31} \; \mathsf{X}_{\mathsf{sf}} \; (\mathsf{RB})_{16:31} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{0:31} + \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{48:63} \; \mathsf{X}_{\mathsf{sf}} \; (\mathsf{RB})_{48:63} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} + \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array} ``` For each word element in the accumulator, the corresponding odd-numbered halfword signed fractional elements in RA and RB are multiplied. The 32 bits of each intermediate product are added to the contents of the corresponding accumulator word and the results are placed into the corresponding RT words and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Odd, Signed, Modulo, Integer EVX-form evmhosmi RT,RA,RB | 4 | RT | RA | RB | 1037 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathrm{RT}_{0:31} \leftarrow \left(\mathrm{RA}\right)_{16:31} \times_{\mathrm{si}} \left(\mathrm{RB}\right)_{16:31} \\ \mathrm{RT}_{32:63} \leftarrow \left(\mathrm{RA}\right)_{48:63} \times_{\mathrm{si}} \left(\mathrm{RB}\right)_{48:63} \end{array}$$ The corresponding odd-numbered halfword signed-integer elements in RA and RB are multiplied. The two 32-bit products are placed into the corresponding words of RT. #### **Special Registers Altered:** None #### Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate Negative into Words EVX-form evmhosmfanw RT,RA,RB | 4 | RT | RA | RB | 1423 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{16:31} \times_{\mathsf{sf}} (\mathsf{RB})_{16:31} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{0:31} - \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{48:63} \times_{\mathsf{sf}} (\mathsf{RB})_{48:63} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} - \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array} ``` For each word element in the accumulator, the corresponding odd-numbered halfword signed fractional elements in RA and RB are multiplied. The 32 bits of each intermediate product are subtracted from the contents of the corresponding accumulator word and the results are placed into the corresponding RT words and into the accumulator. #### **Special Registers Altered:** ACC ### Vector Multiply Halfwords, Odd, Signed, Modulo, Integer to Accumulator EVX-form evmhosmia RT,RA,RB | 4 | RT | RA | RB | 1069 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \text{RT}_{0:31} \leftarrow (\text{RA})_{16:31} \times_{\text{si}} (\text{RB})_{16:31} \\ \text{RT}_{32:63} \leftarrow (\text{RA})_{48:63} \times_{\text{si}} (\text{RB})_{48:63} \\ \text{ACC}_{0:63} \leftarrow (\text{RT})_{0:63} \end{array}$$ The corresponding odd-numbered halfword signed-integer elements in RA and RB are multiplied. The two 32-bit products are placed into the corresponding words of RT and into the accumulator. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Odd, Signed, Modulo, Integer and Accumulate into Words EVX-form evmhosmiaaw RT,RA,RB | 4 | RT | RA | RB | 1293 | | |---|----|----|----|------|----| | 0 | 6 | 11 | | 21 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{16:31} \times_{\mathtt{si}} (\mathsf{RB})_{16:31} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{0:31} + \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{48:63} \times_{\mathtt{si}} (\mathsf{RB})_{48:63} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} + \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array}$$ For each word element in the accumulator, the corresponding odd-numbered halfword signed-integer elements in RA and RB are multiplied. Each intermediate 32-bit product is added to the contents of the corresponding accumulator word and the results are placed into the corresponding RT words and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Odd, Signed, Modulo, Integer and Accumulate Negative into Words EVX-form evmhosmianw RT,RA,RB | | 4 | RT | RA | RB | 1421 | | |---|---|----|----|----|------|----| | 0 | | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{16:31} \times_{\mathbf{si}} (\mathsf{RB})_{16:31} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{0:31} - \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{48:63} \times_{\mathbf{si}} (\mathsf{RB})_{48:63} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} - \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array}$$ For each word element in the accumulator, the corresponding odd-numbered halfword signed-integer elements in RA and RB are multiplied. Each intermediate 32-bit product is subtracted from the contents of the corresponding accumulator word and the results are placed into the corresponding RT words and into the accumulator. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional EVX-form evmhossf RT,RA,RB | 4 | RT | RA | RB | 1031 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` \begin{array}{l} \text{temp}_{0:31} \leftarrow (\text{RA})_{16:31} \times_{\text{sf}} (\text{RB})_{16:31} \\ \text{if } ((\text{RA})_{16:31} = 0\text{x8000}) & & ((\text{RB})_{16:31} = 0\text{x8000}) \text{ then} \\ \text{RT}_{0:31} \leftarrow 0\text{x7FFF\_FFFF} \\ \text{movh} \leftarrow 1 \\ \text{else} \\ \text{RT}_{0:31} \leftarrow \text{temp}_{0:31} \\ \text{movh} \leftarrow 0 \\ \text{temp}_{0:31} \leftarrow (\text{RA})_{48:63} \times_{\text{sf}} (\text{RB})_{48:63} \\ \text{if } ((\text{RA})_{48:63} = 0\text{x8000}) & & ((\text{RB})_{48:63} = 0\text{x8000}) \text{ then} \\ \text{RT}_{32:63} \leftarrow 0\text{x7FFF\_FFFF} \\ \text{movl} \leftarrow 1 \\ \text{else} \\ \text{RT}_{32:63} \leftarrow \text{temp}_{0:31} \\ \text{movl} \leftarrow 0 \\ \text{SPEFSCR}_{\text{OV}} \leftarrow \text{movh} \\ \text{SPEFSCR}_{\text{SOVH}} \leftarrow \text{movh} \\ \text{SPEFSCR}_{\text{SOVH}} \leftarrow \text{SPEFSCR}_{\text{SOVH}} \mid \text{movh} \\ \text{SPEFSCR}_{\text{SOV}} \leftarrow \text{SPEFSCR}_{\text{SOV}} \mid \text{movh} \\ \text{SPEFSCR}_{\text{SOV}} \leftarrow \text{SPEFSCR}_{\text{SOV}} \mid \text{movh} \\ \text{SPEFSCR}_{\text{SOV}} \leftarrow \text{SPEFSCR}_{\text{SOV}} \mid \text{movh} \\ \text{SPEFSCR}_{\text{SOV}} \leftarrow \text{SPEFSCR}_{\text{SOV}} \mid \text{movh} \\ \text{SPEFSCR}_{\text{SOV}} \leftarrow \text{SPEFSCR}_{\text{SOV}} \mid \text{movh} \\ \end{array} ``` The corresponding odd-numbered halfword signed fractional elements in RA and RB are multiplied. The 32 bits of each product are placed into the corresponding words of RT. If both inputs are -1.0, the result saturates to the largest positive signed fraction. #### **Special Registers Altered:** OV OVH SOV SOVH #### Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional to Accumulator EVX-form evmhossfa RT,RA,RB ``` \mathsf{temp}_{0:31} \leftarrow (\mathtt{RA})_{16:31} \times_{\mathtt{sf}} (\mathtt{RB})_{16:31} if ((RA)_{16:31} = 0x8000) & ((RB)_{16:31} = 0x8000) then RT_{0:31} \leftarrow 0x7FFF FFFF movh ← 1 else RT_{0:31} \leftarrow temp_{0:31} movh \leftarrow 0 temp_{0:31} \leftarrow (RA)_{48:63} \times_{sf} (RB)_{48:63} if ((RA)_{48:63} = 0x8000) & ((RB)_{48:63} = 0x8000) then RT_{32:63} \leftarrow 0x7FFF\_FFFF movl ← 1 else RT_{32:63} \leftarrow temp_{0:31} movl \leftarrow 0 ACC_{0:63} \leftarrow (RT)_{0:63} \texttt{SPEFSCR}_{\texttt{OVH}} \; \leftarrow \texttt{movh} SPEFSCR_{OV} \leftarrow movl \texttt{SPEFSCR}_{\texttt{SOVH}} \leftarrow \texttt{SPEFSCR}_{\texttt{SOVH}} \hspace{0.1cm} | \hspace{0.1cm} \texttt{movh} \texttt{SPEFSCR}_{\texttt{SOV}} \leftarrow \texttt{SPEFSCR}_{\texttt{SOV}} \mid \ \texttt{movl} ``` The corresponding odd-numbered halfword signed fractional elements in RA and RB are multiplied. The 32 bits of each product are placed into the corresponding words of RT and into the accumulator. If both inputs are -1.0, the result saturates to the largest positive signed fraction. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate into Words EVX-form evmhossfaaw RT,RA,RB ``` \begin{array}{l} {\rm temp_{0:31}} \leftarrow {\rm (RA)_{16:31}} \ {\rm X_{sf}} \ {\rm (RB)_{16:31}} \\ {\rm if} \ ({\rm (RA)_{16:31}} = {\rm 0x8000}) \ \& \ ({\rm (RB)_{16:31}} = {\rm 0x8000}) \ {\rm then} \end{array} temp_{0:31} \leftarrow 0x7FFF_FFFF movh ← 1 movh ← 0 temp_{0:63} \leftarrow EXTS((ACC)_{0:31}) + EXTS(temp_{0:31}) \texttt{ovh} \; \gets (\texttt{temp}_{\texttt{31}} \; \oplus \; \texttt{temp}_{\texttt{32}}) RT_{0:31} \leftarrow SATURATE(ovh, temp_{31}, 0x8000_0000, 0x7FFF_FFFF, temp<sub>32:63</sub>) temp_{0:31} \leftarrow (RA)_{48:63} \times_{sf} (RB)_{48:63} if ((RA)_{48:63} = 0x8000) & ((RB)_{48:63} = 0x8000) then \texttt{temp}_{\texttt{0:31}} \leftarrow \texttt{0x7FFF\_FFFF} movl \leftarrow 1 else temp_{0:63} \leftarrow EXTS((ACC)_{32:63}) + EXTS(temp_{0:31}) ovl \leftarrow (temp<sub>31</sub> \oplus temp<sub>32</sub>) RT_{32:63} \leftarrow SATURATE(ovl, temp_{31}, 0x8000_0000, 0x7FFF FFFF, temp<sub>32:63</sub>) ACC_{0:63} \leftarrow (RT)_{0:63} \texttt{SPEFSCR}_{\texttt{OVH}} \; \leftarrow \texttt{ovh} \; \mid \; \texttt{movh} SPEFSCR<sub>OV</sub> ← ovl | movl \texttt{SPEFSCR}_{\texttt{SOVH}} \; \leftarrow \texttt{SPEFSCR}_{\texttt{SOVH}} \; \mid \; \texttt{ovh} \; \mid \; \texttt{movh} SPEFSCR_{SOV} \leftarrow SPEFSCR_{SOV} \mid ovl \mid movl ``` The corresponding odd-numbered halfword signed fractional elements in RA and RB are multiplied producing a 32-bit product. If both inputs are -1.0, the result saturates to 0x7FFF\_FFFF. Each 32-bit product is then added to the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC OV OVH SOV SOVH #### Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate Negative into Words EVX-form evmhossfanw RT,RA,RB | 4 | RT | RA | RB | 1415 | ī | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} {\rm temp_{0:31}} \leftarrow {\rm (RA)_{16:31}} \ {\rm X_{sf}} \ {\rm (RB)_{16:31}} \\ {\rm if} \ {\rm ((RA)_{16:31}} = {\rm 0x8000)} \ \& \ {\rm ((RB)_{16:31}} = {\rm 0x8000)} \ {\rm then} \end{array} temp_{0:31} \leftarrow 0x7FFF\_FFFF movh \leftarrow 1 movh \leftarrow 0 temp_{0:63} \leftarrow EXTS((ACC)_{0:31}) - EXTS(temp_{0:31}) ovh \leftarrow (temp<sub>31</sub> \oplus temp<sub>32</sub>) \mathtt{RT_{0:31}} \leftarrow \mathtt{SATURATE}(\mathtt{ovh, temp_{31}, 0x8000\_0000}, 0x7FFF_FFFFF, temp<sub>32:63</sub>) temp_{0:31} \leftarrow (RA)_{48:63} \times_{sf} (RB)_{48:63} if ((RA)_{48:63} = 0x8000) & ((RB)_{48:63} = 0x8000) then \texttt{temp}_{\texttt{0:31}} \leftarrow \texttt{0x7FFF\_FFFF} movl \leftarrow 1 else mov1 \leftarrow 0 temp_{0:63} \leftarrow EXTS((ACC)_{32:63}) - EXTS(temp_{0:31}) ovl \leftarrow (temp_{31} \oplus temp_{32}) \mathtt{RT}_{32:63} \leftarrow \mathtt{SATURATE} \, (\mathtt{ovl}, \ \mathtt{temp}_{31}, \ \mathtt{0x8000\_0000}, 0x7FFF FFFF, temp<sub>32:63</sub>) \begin{array}{l} {\rm ACC_{0:63}} \leftarrow {\rm (RT)_{0:63}} \\ {\rm SPEFSCR_{OVH}} \leftarrow {\rm ovh} \mid {\rm movh} \end{array} SPEFSCR<sub>OV</sub> ← ovl | movl \texttt{SPEFSCR}_{\texttt{SOVH}} \, \leftarrow \texttt{SPEFSCR}_{\texttt{SOVH}} \, \mid \, \texttt{ovh} \, \mid \, \texttt{movh} SPEFSCR_{SOV} \leftarrow SPEFSCR_{SOV} \mid ovl \mid movl ``` The corresponding odd-numbered halfword signed fractional elements in RA and RB are multiplied producing a 32-bit product. If both inputs are -1.0, the result saturates to 0x7FFF\_FFFF. Each 32-bit product is then subtracted from the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Odd, Signed, Saturate, Integer and Accumulate into Words EVX-form evmhossiaaw RT,RA,RB | 4 | RT | RA | RB | 1285 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | The corresponding odd-numbered halfword signed-integer elements in RA and RB are multiplied producing a 32-bit product. Each 32-bit product is then added to the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC OV OVH SOV SOVH #### Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer EVX-form evmhoumi RT,RA,RB | 4 | RT | RA | RB | 1036 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` RT_{0:31} \leftarrow (RA)_{16:31} \times_{ui} (RB)_{16:31} RT_{32:63} \leftarrow (RA)_{48:63} \times_{ui} (RB)_{48:63} ``` The corresponding odd-numbered halfword unsigned-integer elements in RA and RB are multiplied. The two 32-bit products are placed into the corresponding words of RT. #### Special Registers Altered: None #### Vector Multiply Halfwords, Odd, Signed, Saturate, Integer and Accumulate Negative into Words EVX-form evmhossianw RT,RA,RB | 4 | RT | RA | RB | 1413 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{lll} \operatorname{temp}_{0:31} \leftarrow (\operatorname{RA})_{16:31} \times_{\operatorname{si}} & (\operatorname{RB})_{16:31} \\ \operatorname{temp}_{0:63} \leftarrow \operatorname{EXTS}((\operatorname{ACC})_{0:31}) & - \operatorname{EXTS}(\operatorname{temp}_{0:31}) \\ \operatorname{ovh} \leftarrow (\operatorname{temp}_{31} \oplus \operatorname{temp}_{32}) \\ \operatorname{RT}_{0:31} \leftarrow \operatorname{SATURATE}(\operatorname{ovh}, \operatorname{temp}_{31}, \operatorname{0x8000\_0000}, \\ & \operatorname{0x7FFF\_FFFF}, \operatorname{temp}_{32:63}) \\ \operatorname{temp}_{0:31} \leftarrow (\operatorname{RA})_{48:63} \times_{\operatorname{si}} & (\operatorname{RB})_{48:63} \\ \operatorname{temp}_{0:63} \leftarrow \operatorname{EXTS}((\operatorname{ACC})_{32:63}) & - \operatorname{EXTS}(\operatorname{temp}_{0:31}) \\ \operatorname{ovl} \leftarrow (\operatorname{temp}_{31} \oplus \operatorname{temp}_{32}) \\ \operatorname{RT}_{32:63} \leftarrow \operatorname{SATURATE}(\operatorname{ovl}, \operatorname{temp}_{31}, \operatorname{0x8000\_0000}, \\ & \operatorname{0x7FFF\_FFFF}, \operatorname{temp}_{32:63}) \\ \operatorname{ACC}_{0:63} \leftarrow & (\operatorname{RT})_{0:63} \\ \operatorname{SPEFSCR}_{\operatorname{OVH}} \leftarrow \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\operatorname{SOVH}} \leftarrow \operatorname{ovl} \\ \operatorname{SPEFSCR}_{\operatorname{SOVH}} \leftarrow \operatorname{SPEFSCR}_{\operatorname{SOVH}} \mid \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\operatorname{SOV}} \leftarrow \operatorname{SPEFSCR}_{\operatorname{SOV}} \mid \operatorname{ovl} \\ \end{array} ``` The corresponding odd-numbered halfword signed-integer elements in RA and RB are multiplied producing a 32-bit product. Each 32-bit product is then subtracted from the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC OV OVH SOV SOVH #### Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer to Accumulator EVX-form evmhoumia RT,RA,RB ``` RT_{0;31} \leftarrow (RA)_{16:31} \times_{ui} (RB)_{16:31} RT_{32:63} \leftarrow (RA)_{48:63} \times_{ui} (RB)_{48:63} ACC_{0:63} \leftarrow (RT)_{0:63} ``` The corresponding odd-numbered halfword unsigned-integer elements in RA and RB are multiplied. The two 32-bit products are placed into RT and into the accumulator. #### **Special Registers Altered:** #### Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer and Accumulate into Words EVX-form evmhoumiaaw RT,RA,RB ``` \begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{16:31} \times_{\mathsf{ui}} (\mathsf{RB})_{16:31} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{0:31} + \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{48:63} \times_{\mathsf{ui}} (\mathsf{RB})_{48:63} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} + \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array} ``` For each word element in the accumulator, the corresponding odd-numbered halfword unsigned-integer elements in RA and RB are multiplied. Each intermediate product is added to the contents of the corresponding accumulator word. The sums are placed into the corresponding RT and accumulator words. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer and Accumulate into Words EVX-form evmhousiaaw RT,RA,RB For each word element in the accumulator, corresponding odd-numbered halfword unsigned-integer elements in RA and RB are multiplied producing a 32-bit product. Each 32-bit product is then added to the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC OV OVH SOV SOVH #### Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer and Accumulate Negative into Words EVX-form evmhoumianw RT,RA,RB | 4 | 1 | RT | RA | RB | 1420 | | |---|---|----|----|----|------|----| | 0 | 6 | 1 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{16:31} \times_{\mathsf{ui}} (\mathsf{RB})_{16:31} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{0:31} - \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:31} \leftarrow (\mathsf{RA})_{48:63} \times_{\mathsf{ui}} (\mathsf{RB})_{48:63} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{32:63} - \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array} ``` For each word element in the accumulator, the corresponding odd-numbered halfword unsigned-integer elements in RA and RB are multiplied. Each intermediate product is subtracted from the contents of the corresponding accumulator word. The results are placed into the corresponding RT and accumulator words. #### **Special Registers Altered:** ACC #### Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer and Accumulate Negative into Words **EVX-form** evmhousianw RT,RA,RB | 4 | RT | RA | RB | 1412 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | For each word element in the accumulator, corresponding odd-numbered halfword unsigned-integer elements in RA and RB are multiplied producing a 32-bit product. Each 32-bit product is then subtracted from the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### Special Registers Altered: #### Initialize Accumulator #### EVX-form evmra RT,RA | 4 | RT | RA | /// | 1220 | Ī | |---|----|----|-----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | $$ACC_{0:63} \leftarrow (RA)_{0:63}$$ $$RT_{0:63} \leftarrow (RA)_{0:63}$$ The contents of RA are placed into the accumulator and RT. This is the method for initializing the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Word High Signed, Modulo, Fractional EVX-form evmwhsmf RT,RA,RB $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,0:31} \,\, \mathsf{X}_{\mathsf{sf}} \,\, \left(\mathsf{RB}\right)_{\,0:31} \\ \mathsf{RT}_{0:31} \leftarrow \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,32:63} \,\, \mathsf{X}_{\mathsf{sf}} \,\, \left(\mathsf{RB}\right)_{\,32:63} \\ \mathsf{RT}_{32:63} \leftarrow \mathsf{temp}_{0:31} \end{array}$$ The corresponding word signed fractional elements in RA and RB are multiplied and bits 0:31 of the two products are placed into the two corresponding words of RT. #### **Special Registers Altered:** None #### Vector Multiply Word High Signed, Modulo, Integer EVX-form evmwhsmi RT,RA,RB | 4 | RT | RA | RB | 1101 | - | |---|----|----|----|------|----| | | | 44 | 10 | 04 | 24 | | U | ь | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,0:31} \,\, \mathsf{X_{si}} \,\, (\mathsf{RB})_{\,0:31} \\ \mathsf{RT}_{0:31} \leftarrow \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,32:63} \,\, \mathsf{X_{si}} \,\, (\mathsf{RB})_{\,32:63} \\ \mathsf{RT}_{32:63} \leftarrow \mathsf{temp}_{0:31} \end{array}$$ The corresponding word signed-integer elements in RA and RB are multiplied. Bits 0:31 of the two 64-bit products are placed into the two corresponding words of RT. #### **Special Registers Altered:** None #### Vector Multiply Word High Signed, Modulo, Fractional to Accumulator EVX-form evmwhsmfa RT,RA,RB | | | | | | _ | |---|-------|-----|-----|------|----| | 1 | RT | RΔ | RB | 1135 | | | - | 1 \ 1 | 11/ | IND | 1133 | | | 0 | 6 | 11 | 16 | 21 : | 31 | | U | U | | 10 | ۷ ، | J | ``` \begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,0:31} \ \times_{\mathsf{sf}} \ (\mathsf{RB})_{\,0:31} \\ \mathsf{RT}_{0:31} \leftarrow \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,32:63} \ \times_{\mathsf{sf}} \ (\mathsf{RB})_{\,32:63} \\ \mathsf{RT}_{32:63} \leftarrow \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{\,0:63} \end{array} ``` The corresponding word signed fractional elements in RA and RB are multiplied and bits 0:31 of the two products are placed into the two corresponding words of RT and into the accumulator. #### **Special Registers Altered:** ACC #### Vector Multiply Word High Signed, Modulo, Integer to Accumulator EVX-form evmwhsmia RT,RA,RB | 4 | RT | RA | RB | 1133 | | |---|----|----|----|------|----| | = | | | | | | | 0 | 6 | 11 | 16 | 21 | 31 | The corresponding word signed-integer elements in RA and RB are multiplied. Bits 0:31 of the two 64-bit products are placed into the two corresponding words of RT and into the accumulator. #### **Special Registers Altered:** #### Vector Multiply Word High Signed, Saturate, Fractional EVX-form evmwhssf RT,RA,RB ``` \mathsf{temp}_{\text{0:63}} \, \leftarrow \, (\text{RA})_{\,\text{0:31}} \, \times_{\text{sf}} \, (\text{RB})_{\,\text{0:31}} if ((RA)_{0:31} = 0x8000_0000) & ((RB)_{0:31} = 0x8000_0000) RT<sub>0:31</sub> ← 0x7FFF_FFFF movh \leftarrow 1 else RT_{0:31} \leftarrow temp_{0:31} movh \leftarrow 0 \mathsf{temp}_{0:63} \leftarrow (\mathtt{RA})_{32:63} \times_{\mathtt{sf}} (\mathtt{RB})_{32:63} if ((RA)_{32:63} = 0x8000_0000 & (RB)_{32:63} = 0x8000_0000) RT_{32:63} \leftarrow 0x7FFF\_FFFF movl ←1 else \mathtt{RT}_{32:63} \leftarrow \mathtt{temp}_{0:31} movl \leftarrow 0 SPEFSCR<sub>OVH</sub> ← movh \texttt{SPEFSCR}_{\texttt{OV}} \; \leftarrow \texttt{movl} SPEFSCR_{SOVH} \leftarrow SPEFSCR_{SOVH} \mid movh SPEFSCR_{SOV} \leftarrow SPEFSCR_{SOV} \mid movl ``` The corresponding word signed fractional elements in RA and RB are multiplied. Bits 0:31 of each product are placed into the corresponding words of RT. If both inputs are -1.0, the result saturates to the largest positive signed fraction. #### **Special Registers Altered:** OV OVH SOV SOVH #### Vector Multiply Word High Unsigned, Modulo, Integer EVX-form evmwhumi RT,RA,RB | 4 | RT | RA | RB | 1100 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,0:31} \,\, \mathsf{X_{ui}} \,\, (\mathsf{RB})_{\,0:31} \\ \mathsf{RT_{0:31}} \leftarrow \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,32:63} \,\, \mathsf{X_{ui}} \,\, (\mathsf{RB})_{\,32:63} \\ \mathsf{RT_{32:63}} \leftarrow \mathsf{temp}_{0:31} \end{array} ``` The corresponding word unsigned-integer elements in RA and RB are multiplied. Bits 0:31 of the two products are placed into the two corresponding words of RT. #### **Special Registers Altered:** None #### Vector Multiply Word High Signed, Saturate, Fractional to Accumulator EVX-form evmwhssfa RT,RA,RB ``` temp_{0:63} \leftarrow (RA)_{0:31} \times_{sf} (RB)_{0:31} if ((RA)_{0:31} = 0x8000\_0000) & ((RB)_{0:31} = 0x8000\_0000) then RT_{0:31} \leftarrow 0x7FFF_FFFF movh ← 1 else RT_{0:31} \leftarrow temp_{0:31} movh \leftarrow 0 temp_{0:63} \leftarrow (RA)_{32:63} \times_{sf} (RB)_{32:63} if ((RA)_{32:63}=0x8000\_0000)&((RB)_{32:63}=0x8000\_0000) RT_{32:63} \leftarrow 0x7FFF\_FFFF movl \leftarrow 1 else \mathtt{RT}_{32:63} \leftarrow \mathtt{temp}_{0:31} movl ← 0 ACC_{0:63} \leftarrow (RT)_{0:63} SPEFSCR<sub>OVH</sub> ← movh \texttt{SPEFSCR}_{\texttt{OV}} \; \leftarrow \texttt{movl} SPEFSCR<sub>SOVH</sub> ← SPEFSCR<sub>SOVH</sub> | movh SPEFSCR_{SOV} \leftarrow SPEFSCR_{SOV} movl ``` The corresponding word signed fractional elements in RA and RB are multiplied. Bits 0:31 of each product are placed into the corresponding words of RT and into the accumulator. If both inputs are -1.0, the result saturates to the largest positive signed fraction. #### **Special Registers Altered:** ACC OV OVH SOV SOVH #### Vector Multiply Word High Unsigned, Modulo, Integer to AccumulatorEVX-form evmwhumia RT,RA,RB ``` \begin{array}{lll} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{0:31} \; \mathsf{X}_{ui} \; & (\mathsf{RB})_{0:31} \\ \mathsf{RT}_{0:31} \leftarrow \mathsf{temp}_{0:31} \\ \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{32:63} \; \mathsf{X}_{ui} \; & (\mathsf{RB})_{32:63} \\ \mathsf{RT}_{32:63} \leftarrow \mathsf{temp}_{0:31} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \end{array} ``` The corresponding word unsigned-integer elements in RA and RB are multiplied. Bits 0:31 of the two products are placed into the two corresponding words of RT and into the accumulator. #### **Special Registers Altered:** ### Vector Multiply Word Low Signed, Modulo, Integer and Accumulate into Words EVX-form evmwlsmiaaw RT,RA,RB | 4 | RT | RA | RB | 1353 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | For each word element in the accumulator, the corresponding word signed-integer elements in RA and RB are multiplied. The least significant 32 bits of each intermediate product are added to the contents of the corresponding accumulator words, and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ACC ### Vector Multiply Word Low Signed, Saturate, Integer and Accumulate into Words EVX-form evmwlssiaaw RT,RA,RB | 4 | RT | RA | RB | 1345 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | The corresponding word signed-integer elements in RA and RB are multiplied producing a 64-bit product. The least significant 32 bits of each product are then added to the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ACC OV OVH SOV SOVH ### Vector Multiply Word Low Signed, Modulo, Integer and Accumulate Negative in Words EVX-form evmwlsmianw RT,RA,RB | 4 | RT | RA | RB | 1481 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | ``` \begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,0:31} \,\, \mathsf{x_{si}} \,\, (\mathsf{RB})_{\,0:31} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{\,0:31} \,\, - \,\, \mathsf{temp}_{32:63} \\ \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,32:63} \,\, \mathsf{x_{si}} \,\, (\mathsf{RB})_{\,32:63} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{\,32:63} \,\, - \,\, \mathsf{temp}_{32:63} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{\,0:63} \end{array} ``` For each word element in the accumulator, the corresponding word elements in RA and RB are multiplied. The least significant 32 bits of each intermediate product are subtracted from the contents of the corresponding accumulator words and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ACC ### Vector Multiply Word Low Signed, Saturate, Integer and Accumulate Negative in Words EVX-form evmwlssianw RT,RA,RB | 4 | RT | RA | RB | 1473 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | The corresponding word signed-integer elements in RA and RB are multiplied producing a 64-bit product. The least significant 32 bits of each product are then subtracted from the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ACC OV OVH SOV SOVH ### Vector Multiply Word Low Unsigned, Modulo, Integer EVX-form evmwlumi RT,RA,RB | 4 | RT | RA | RB | 1096 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | The corresponding word unsigned-integer elements in RA and RB are multiplied. The least significant 32 bits of each product are placed into the two corresponding words of RT. ### **Special Registers Altered:** None ### Programming Note - The least significant 32 bits of the product are independent of whether the word elements in RA and RB are treated as signed or unsigned 32-bit integers. Note that **evmwlumi** can be used for signed or unsigned integers. ### Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words EVX-form evmwlumiaaw RT,RA,RB | Ī | 4 | RT | RA | RB | 1352 | | |---|---|----|----|----|------|----| | ١ | 0 | 6 | 11 | 16 | 21 3 | 31 | $$\begin{array}{l} {\sf temp_{0:63}} \leftarrow {\sf (RA)_{0:31}} \times_{\tt ui} {\sf (RB)_{0:31}} \\ {\sf RT_{0:31}} \leftarrow {\sf (ACC)_{0:31}} + {\sf temp_{32:63}} \end{array}$$ $$temp_{0.63} \leftarrow (RA)_{32.63} \times_{ui} (RB)_{32.63}$$ $RT_{32.63} \leftarrow (ACC)_{32.63} + temp_{32.63}$ $ACC_{0.63} \leftarrow (RT)_{0.63}$ For each word element in the accumulator, the corresponding word unsigned-integer elements in RA and RB are multiplied. The least significant 32 bits of each product are added to the contents of the corresponding accumulator word and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC ### Vector Multiply Word Low Unsigned, Modulo, Integer to AccumulatorEVX-form evmwlumia RT,RA,RB ``` \begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,0:31} \; \mathsf{X}_{\mathsf{ui}} \; (\mathsf{RB})_{\,0:31} \\ \mathsf{RT}_{0:31} \leftarrow \mathsf{temp}_{32:63} \\ \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,32:63} \; \mathsf{X}_{\mathsf{ui}} \; (\mathsf{RB})_{\,32:63} \\ \mathsf{RT}_{32:63} \leftarrow \mathsf{temp}_{32:63} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{\,0:63} \end{array} ``` The corresponding word unsigned-integer elements in RA and RB are multiplied. The least significant 32 bits of each product are placed into the two corresponding words of RT and into the accumulator. ### **Special Registers Altered:** ACC ### Programming Note - The least significant 32 bits of the product are independent of whether the word elements in RA and RB are treated as signed or unsigned 32-bit integers. Note that **evmwlumia** can be used for signed or unsigned integers. ### Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate Negative in Words EVX-form evmwlumianw RT,RA,RB | | 4 | RT | RA | RB | 1480 | | |---|---|----|----|----|------|----| | 0 | | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,0:31} \; \mathsf{X}_{ui} \; (\mathsf{RB})_{\,0:31} \\ \mathsf{RT}_{0:31} \leftarrow (\mathsf{ACC})_{\,0:31} \; - \; \mathsf{temp}_{32:63} \\ \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{\,32:63} \; \mathsf{X}_{ui} \; (\mathsf{RB})_{\,32:63} \\ \mathsf{RT}_{32:63} \leftarrow (\mathsf{ACC})_{\,32:63} \; - \; \mathsf{temp}_{32:63} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{\,0:63} \end{array} ``` For each word element in the accumulator, the corresponding word unsigned-integer elements in RA and RB are multiplied. The least significant 32 bits of each product are subtracted from the contents of the corresponding accumulator word and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ### Vector Multiply Word Low Unsigned, Saturate, Integer and Accumulate into Words EVX-form evmwlusiaaw RT,RA,RB | 4 | RT | RA | RB | 1344 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` \begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{0:31} \; \mathsf{X}_{ui} \; (\mathsf{RB})_{0:31} \\ \mathsf{temp}_{0:63} \leftarrow \mathsf{EXTZ}((\mathsf{ACC})_{0:31}) \; + \; \mathsf{EXTZ}(\mathsf{temp}_{32:63}) \\ \mathsf{ovh} \leftarrow \mathsf{temp}_{31} \\ \mathsf{RT}_{0:31} \leftarrow \mathsf{SATURATE}(\mathsf{ovh}, \; \mathsf{0}, \; \mathsf{0xffff}_\mathsf{FFF}_\mathsf{FFF}, \; \mathsf{0xffff}_\mathsf{FFFF}, \\ \mathsf{temp}_{32:63}) \\ \mathsf{temp}_{0:63} \leftarrow (\mathsf{RA})_{32:63} \; \mathsf{X}_{ui} \; (\mathsf{RB})_{32:63} \\ \mathsf{temp}_{0:63} \leftarrow \mathsf{EXTZ}((\mathsf{ACC})_{32:63}) \; + \; \mathsf{EXTZ}(\mathsf{temp}_{32:63}) \\ \mathsf{ovl} \leftarrow \mathsf{temp}_{31} \\ \mathsf{RT}_{32:63} \leftarrow \mathsf{SATURATE}(\mathsf{ovl}, \; \mathsf{0}, \; \mathsf{0xffff}_\mathsf{FFF}, \\ \mathsf{0xffff}_\mathsf{Ffff}, \; \mathsf{temp}_{32:63}) \\ \mathsf{ACC}_{0:63} \leftarrow (\mathsf{RT})_{0:63} \\ \mathsf{SPEFSCR}_{\mathsf{OVH}} \leftarrow \mathsf{ovh} \\ \mathsf{SPEFSCR}_{\mathsf{SOVH}} \leftarrow \mathsf{ovl} \\ \mathsf{SPEFSCR}_{\mathsf{SOVH}} \leftarrow \mathsf{SPEFSCR}_{\mathsf{SOVH}} \mid \mathsf{ovh} \\ \mathsf{SPEFSCR}_{\mathsf{SOV}} \leftarrow \mathsf{SPEFSCR}_{\mathsf{SOV}} \mid \mathsf{ovl} \end{array} ``` For each word element in the accumulator, corresponding word unsigned-integer elements in RA and RB are multiplied producing a 64-bit product. The least significant 32 bits of each product are then added to the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC OV OVH SOV SOVH ### Vector Multiply Word Signed, Modulo, Fractional EVX-form evmwsmf RT,RA,RB | Ī | 4 | RT | RA | RB | 1115 | | |---|---|----|----|----|------|----| | | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:63} \leftarrow (RA)_{32:63} \times_{sf} (RB)_{32:63}$$ The corresponding low word signed fractional elements in RA and RB are multiplied. The product is placed in RT ### **Special Registers Altered:** None ### Vector Multiply Word Low Unsigned, Saturate, Integer and Accumulate Negative in Words EVX-form evmwlusianw RT,RA,RB | 4 | RT | RA | RB | 1472 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | For each word element in the accumulator, corresponding word unsigned-integer elements in RA and RB are multiplied producing a 64-bit product. The least significant 32 bits of each product are then subtracted from the corresponding word in the accumulator saturating if overflow occurs, and the result is placed in RT and the accumulator. #### **Special Registers Altered:** ACC OV OVH SOV SOVH ### Vector Multiply Word Signed, Modulo, Fractional to Accumulator EVX-form evmwsmfa RT,RA,RB | 4 | RT | RA | RB | 1147 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:63} \leftarrow (RA)_{32:63} \times_{sf} (RB)_{32:63} ACC_{0:63} \leftarrow (RT)_{0:63}$$ The corresponding low word signed fractional elements in RA and RB are multiplied. The product is placed in RT and into the accumulator. ### **Special Registers Altered:** ### Vector Multiply Word Signed, Modulo, Fractional and Accumulate EVX-form evmwsmfaa RT,RA,RB $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow \mathsf{(RA)}_{32:63} \ \mathsf{X}_{\mathsf{sf}} \ \mathsf{(RB)}_{32:63} \\ \mathsf{RT}_{0:63} \leftarrow \mathsf{(ACC)}_{0:63} + \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow \mathsf{(RT)}_{0:63} \end{array}$$ The corresponding low word signed fractional elements in RA and RB are multiplied. The intermediate product is added to the contents of the 64-bit accumulator and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ACC ### Vector Multiply Word Signed, Modulo, Integer EVX-form evmwsmi RT,RA,RB | 4 | RT | RA | RB | 1113 | | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 | 3 | $$RT_{0:63} \leftarrow (RA)_{32:63} \times_{si} (RB)_{32:63}$$ The low word signed-integer elements in RA and RB are multiplied. The product is placed in RT. ### **Special Registers Altered:** None ### Vector Multiply Word Signed, Modulo, Integer and Accumulate EVX-form evmwsmiaa RT,RA,RB | 4 | RT | RA | RB | 1369 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | temp<sub>0.63</sub> $$\leftarrow$$ (RA)<sub>32:63</sub> $\times$ <sub>si</sub> (RB)<sub>32:63</sub> RT<sub>0.63</sub> $\leftarrow$ (ACC)<sub>0.63</sub> + temp<sub>0.63</sub> ACC<sub>0.63</sub> $\leftarrow$ (RT)<sub>0.63</sub> The low word signed-integer elements in RA and RB are multiplied. The intermediate product is added to the contents of the 64-bit accumulator and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ACC ### Vector Multiply Word Signed, Modulo, Fractional and Accumulate Negative EVX-form evmwsmfan RT,RA,RB | 4 | RT | RA | RB | 1499 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathtt{RA})_{32:63} \times_{\mathrm{sf}} (\mathtt{RB})_{32:63} \\ \mathtt{RT}_{0:63} \leftarrow (\mathtt{ACC})_{0:63} - \mathtt{temp}_{0:63} \\ \mathtt{ACC}_{0:63} \leftarrow (\mathtt{RT})_{0:63} \end{array}$$ The corresponding low word signed fractional elements in RA and RB are multiplied. The intermediate product is subtracted from the contents of the accumulator and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ACC ### Vector Multiply Word Signed, Modulo, Integer to Accumulator EVX-form evmwsmia RT,RA,RB | 4 | RT | RA | RB | 1145 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:63} \leftarrow (RA)_{32:63} \times_{si} (RB)_{32:63}$$ $ACC_{0:63} \leftarrow (RT)_{0:63}$ The low word signed-integer elements in RA and RB are multiplied. The product is placed in RT and the accumulator. ### **Special Registers Altered:** ACC ### Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative EVX-form evmwsmian RT,RA,RB $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathtt{RA})_{32:63} \times_{\mathtt{si}} (\mathtt{RB})_{32:63} \\ \mathtt{RT}_{0:63} \leftarrow (\mathtt{ACC})_{0:63} - \mathtt{temp}_{0:63} \\ \mathtt{ACC}_{0:63} \leftarrow (\mathtt{RT})_{0:63} \end{array}$$ The low word signed-integer elements in RA and RB are multiplied. The intermediate product is subtracted from the contents of the 64-bit accumulator and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ### Vector Multiply Word Signed, Saturate, Fractional EVX-form evmwssf RT,RA,RB | 4 | RT | RA | RB | 1107 | |---|----|----|----|-------------------| | 0 | 6 | 11 | 16 | 21 3 <sup>-</sup> | ``` \begin{array}{l} \text{temp}_{0.63} \leftarrow \text{(RA)}_{32:63} \times_{\text{sf}} \text{(RB)}_{32:63} \\ \text{if ((RA)}_{32:63} = 0\text{x80000}\_0000) & \text{(RB}_{32:63} = 0\text{x80000}\_0000) \\ \text{then} \\ \text{RT}_{0.63} \leftarrow 0\text{x7FFF}\_\text{FFFF}\_\text{FFFF}\_\text{FFFF} \\ \text{mov} \leftarrow 1 \\ \text{else} \\ \text{RT}_{0:63} \leftarrow \text{temp}_{0:63} \\ \text{mov} \leftarrow 0 \\ \text{SPEFSCR}_{\text{OVH}} \leftarrow 0 \\ \text{SPEFSCR}_{\text{OV}} \leftarrow \text{mov} \\ \text{SPEFSCR}_{\text{SOV}} \leftarrow \text{spefscr}_{\text{SOV}} \mid \text{mov} \end{array} ``` The low word signed fractional elements in RA and RB are multiplied. The 64-bit product is placed in RT. If both inputs are -1.0, the result saturates to the largest positive signed fraction. ### **Special Registers Altered:** OV OVH SOV ### Vector Multiply Word Signed, Saturate, Fractional to Accumulator EVX-form evmwssfa RT,RA,RB | 4 | RT | RA | RB | 1139 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \text{temp}_{0.63} \leftarrow \text{(RA)}_{32:63} \times_{\text{sf}} \text{ (RB)}_{32:63} \\ \text{if } \text{ ((RA)}_{32:63} = 0 \times 8000\_0000) \& \text{((RB)}_{32:63} = 0 \times 8000\_0000) \\ \text{then} \\ \text{RT}_{0:63} \leftarrow 0 \times 7 \text{FFF\_FFFF\_FFFF\_FFFF} \\ \text{mov} \leftarrow 1 \\ \text{else} \\ \text{RT}_{0:63} \leftarrow \text{temp}_{0:63} \\ \text{mov} \leftarrow 0 \\ \text{ACC}_{0:63} \leftarrow \text{(RT)}_{0:63} \\ \text{SPEFSCR}_{\text{OVH}} \leftarrow 0 \\ \text{SPEFSCR}_{\text{OV}} \leftarrow \text{mov} \\ \text{SPEFSCR}_{\text{SOV}} \leftarrow \text{SPEFSCR}_{\text{SOV}} \mid \text{mov} \end{array} ``` The low word signed fractional elements in RA and RB are multiplied. The 64-bit product is placed in RT and into the accumulator. If both inputs are -1.0, the result saturates to the largest positive signed fraction. ### **Special Registers Altered:** ACC OV OVH SOV ### Vector Multiply Word Signed, Saturate, Fractional and Accumulate EVX-form evmwssfaa RT,RA,RB ``` \begin{array}{l} \text{temp}_{0:63} \leftarrow (\text{RA})_{32:63} \times_{\text{sf}} (\text{RB})_{32:63} \\ \text{if } ((\text{RA})_{32:63} = 0 \times 8000\_0000) \& ((\text{RB})_{32:63} = 0 \times 8000\_0000) \\ \text{then} \\ \text{temp}_{0:63} \leftarrow 0 \times 7 \text{FFF\_FFFF\_FFFF\_FFFF} \\ \text{mov} \leftarrow 1 \\ \text{else} \\ \text{mov} \leftarrow 0 \\ \text{temp}_{0:64} \leftarrow \text{EXTS} ((\text{ACC})_{0:63}) + \text{EXTS} (\text{temp}_{0:63}) \\ \text{ov} \leftarrow (\text{temp}_0 \oplus \text{temp}_1) \\ \text{RT}_{0:63} \leftarrow \text{temp}_{1:64} \\ \\ \text{ACC}_{0:63} \leftarrow (\text{RT})_{0:63} \\ \text{SPEFSCR}_{\text{OV}} \leftarrow \text{ov} \mid \text{mov} \\ \text{SPEFSCR}_{\text{SOV}} \leftarrow \text{SPEFSCR}_{\text{SOV}} \mid \text{ov} \mid \text{mov} \end{array} ``` The low word signed fractional elements in RA and RB are multiplied producing a 64-bit product. If both inputs are -1.0, the product saturates to the largest positive signed fraction. The 64-bit product is then added to the accumulator and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ACC OV OVH SOV ### Vector Multiply Word Unsigned, Modulo, Integer EVX-form evmwumi RT,RA,RB | ı | 4 | RT | RA | RB | 1112 | ٦ | |---|---|----|----|----|------|----| | | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:63} \leftarrow (RA)_{32:63} \times_{ui} (RB)_{32:63}$$ The low word unsigned-integer elements in RA and RB are multiplied to form a 64-bit product that is placed in RT ### **Special Registers Altered:** None ### Vector Multiply Word Signed, Saturate, Fractional and Accumulate Negative EVX-form evmwssfan RT,RA,RB ``` \begin{array}{llll} \text{temp}_{0:63} &\leftarrow (\text{RA})_{32:63} \times_{\text{sf}} &(\text{RB})_{32:63} \\ \text{if } &((\text{RA})_{32:63} \!=\! 0\text{x8000}\_0000) \& ((\text{RB})_{32:63} \!=\! 0\text{x8000}\_0000) \\ \text{then} &&& \text{temp}_{0:63} &\leftarrow 0\text{x7FFF}\_\text{FFFF}\_\text{FFFF}\_\text{FFFF} \\ &&\text{mov} &\leftarrow 1 \\ \text{else} &&& \text{mov} &\leftarrow 0 \\ \text{temp}_{0:64} &\leftarrow \text{EXTS} ((\text{ACC})_{0:63}) &- \text{EXTS} (\text{temp}_{0:63}) \\ \text{ov} &\leftarrow (\text{temp}_0 &\oplus \text{temp}_1) \\ \text{RT}_{0:63} &\leftarrow \text{temp}_{1:64} \\ \text{ACC}_{0:63} &\leftarrow &(\text{RT})_{0:63} \\ \text{SPEFSCR}_{\text{OV}} &\leftarrow \text{ov} &|& \text{mov} \\ \\ \text{SPEFSCR}_{\text{SOV}} &\leftarrow \text{ov} &|& \text{mov} \\ \end{array} ``` The low word signed fractional elements in RA and RB are multiplied producing a 64-bit product. If both inputs are -1.0, the product saturates to the largest positive signed fraction. The 64-bit product is then subtracted from the accumulator and the result is placed in RT and the accumulator. ### **Special Registers Altered:** ACC OV OVH SOV ### Vector Multiply Word Unsigned, Modulo, Integer to Accumulator EVX-form evmwumia RT,RA,RB $$\begin{array}{l} \mathtt{RT_{0:63}} \leftarrow \mathtt{(RA)_{32:63}} \ \mathtt{X_{ui}} \ \mathtt{(RB)_{32:63}} \\ \mathtt{ACC_{0:63}} \leftarrow \mathtt{(RT)_{0:63}} \end{array}$$ The low word unsigned-integer elements in RA and RB are multiplied to form a 64-bit product that is placed in RT and into the accumulator. ### **Special Registers Altered:** ### Vector Multiply Word Unsigned, Modulo, Integer and Accumulate EVX-form evmwumiaa RT,RA,RB $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathtt{RA})_{32:63} \ \mathsf{X_{ui}} \ (\mathtt{RB})_{32:63} \\ \mathsf{RT}_{0:63} \leftarrow (\mathtt{ACC})_{0:63} + \mathsf{temp}_{0:63} \\ \mathsf{ACC}_{0:63} \leftarrow (\mathtt{RT})_{0:63} \end{array}$$ The low word unsigned-integer elements in RA and RB are multiplied. The intermediate product is added to the contents of the 64-bit accumulator, and the resulting value is placed into the accumulator and in RT. ### **Special Registers Altered:** **ACC** ### **Vector NAND** **EVX-form** evnand RT,RA,RB | 4 | RT | RA | RB | 542 | ٦ | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | $$RT_{0:31} \leftarrow \neg ((RA)_{0:31} \& (RB)_{0:31})$$ $RT_{32:63} \leftarrow \neg ((RA)_{32:63} \& (RB)_{32:63})$ Each element of RA and RB is bitwise NANDed. The result is placed in the corresponding element of RT. ### **Special Registers Altered:** None ### Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Negative EVX-form evmwumian RT,RA,RB | 4 | RT | RA | RB | 1496 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathsf{temp}_{0:63} \leftarrow (\mathtt{RA})_{32:63} \times_{\mathrm{ui}} (\mathtt{RB})_{32:63} \\ \mathtt{RT}_{0:63} \leftarrow (\mathtt{ACC})_{0:63} - \mathtt{temp}_{0:63} \\ \mathtt{ACC}_{0:63} \leftarrow (\mathtt{RT})_{0:63} \end{array}$$ The low word unsigned-integer elements in RA and RB are multiplied. The intermediate product is subtracted from the contents of the 64-bit accumulator, and the resulting value is placed into the accumulator and in RT. ### **Special Registers Altered:** ACC ### Vector Negate **EVX-form** evneg RT,RA | 4 | RT | RA | /// | 521 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \text{RT}_{0:31} \leftarrow \text{NEG}(\text{(RA)}_{0:31}) \\ \text{RT}_{32:63} \leftarrow \text{NEG}(\text{(RA)}_{32:63}) \end{array}$$ The negative of each element of RA is placed in RT. The negative of 0x8000\_0000 (most negative number) returns 0x8000\_0000. ### **Special Registers Altered:** ### Vector NOR ### EVX-form ### Vector OR evor 4 **EVX-form** 535 evnor RT,RA,RB | 4 | RT | RA | RB | 536 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 3 | 31 | $$RT_{0:31} \leftarrow (RA)_{0:31} \mid (RB)_{0:31}$$ $RT_{32:63} \leftarrow (RA)_{32:63} \mid (RB)_{32:63}$ RT RT,RA,RB RA $RT_{0:31} \leftarrow \neg((RA)_{0:31} \mid (RB)_{0:31})$ $RT_{32:63} \leftarrow \neg ((RA)_{32:63} \mid (RB)_{32:63})$ Each element of RA and RB is bitwise NORed. The result is placed in the corresponding element of RT. Each element of RA and RB is bitwise ORed. The result is placed in the corresponding element of RT. RB ### **Special Registers Altered:** None **Extended:** evnot RT,RA ### **Extended Mnemonics:** Extended mnemonics are provided for the Vector NOR instruction to produce a vector bitwise complement Equivalent to: evnor RT,RA,RA operation. ### **Special Registers Altered:** None **Extended Mnemonics:** Extended mnemonics are provided for the Vector OR instruction to provide a 64-bit vector move instruction. Extended: Equivalent to: evmr RT,RA evor RT,RA,RA #### Vector OR with Complement **EVX-form** RT,RA,RB evorc | 4 | RT | RA | RB | 539 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \operatorname{RT}_{0:31} \leftarrow (\operatorname{RA})_{0:31} \ | \ (\neg(\operatorname{RB})_{0:31}) \\ \operatorname{RT}_{32:63} \leftarrow (\operatorname{RA})_{32:63} \ | \ (\neg(\operatorname{RB})_{32:63}) \end{array}$$ Each element of RA is bitwise ORed with the complement of RB. The result is placed in the corresponding element of RT. ### Special Registers Altered: None #### Vector Rotate Left Word EVX-form RT,RA,RB evrlw | 4 | RT | RA | RB | 552 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 3′ | ``` nh \leftarrow (RB)_{27:31} nl \leftarrow (RB)_{59:63} RT_{0:31} \leftarrow ROTL((RA)_{0:31}, nh) RT_{32:63} \leftarrow ROTL((RA)_{32:63}, nl) ``` Each of the high and low elements of RA is rotated left by an amount specified in RB. The result is placed in RT. Rotate values for each element of RA are found in bit positions RB<sub>27:31</sub> and RB<sub>59:63</sub>. ### **Special Registers Altered:** ### Vector Rotate Left Word Immediate EVX-form evrlwi RT,RA,UI ``` \begin{array}{l} n \; \leftarrow \text{UI} \\ \text{RT}_{0:31} \; \leftarrow \; \text{ROTL} \left( \; \left( \text{RA} \right)_{0:31}, \; \; n \right) \\ \text{RT}_{32:63} \; \leftarrow \; \text{ROTL} \left( \; \left( \text{RA} \right)_{32:63}, \; \; n \right) \end{array} ``` Both the high and low elements of RA are rotated left by an amount specified by UI. ### **Special Registers Altered:** None ### **Vector Select** EVS-form evsel RT,RA,RB,BFA | | 4 | RT | RA | RB | 79 | BFA | |---|---|----|----|----|----|-------| | 0 | | 6 | 11 | 16 | 21 | 29 31 | ``` \begin{array}{l} \text{ch} \leftarrow \text{CR}_{\text{BFAX4}} \\ \text{cl} \leftarrow \text{CR}_{\text{BFAX4+1}} \\ \text{if (ch = 1) then RT}_{0:31} \leftarrow (\text{RA})_{0:31} \\ \text{else RT}_{0:31} \leftarrow (\text{RB})_{0:31} \\ \text{if (cl = 1) then RT}_{32:63} \leftarrow (\text{RA})_{32:63} \\ \text{else RT}_{32:63} \leftarrow (\text{RB})_{32:63} \end{array} ``` If the most significant bit in the BFA field of CR is set to 1, the high-order element of RA is placed in the high-order element of RT; otherwise, the high-order element of RB is placed into the high-order element of RT. If the next most significant bit in the BFA field of CR is set to 1, the low-order element of RA is placed in the low-order element of RT, otherwise, the low-order element of RB is placed into the low-order element of RT. ### **Special Registers Altered:** None ### Vector Round Word **EVX-form** evrndw RT,RA | 4 | RT | RA | /// | 524 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \text{RT}_{0:31} \leftarrow ((\text{RA})_{0:31} + 0\text{x00008000}) & \text{0xFFFF0000} \\ \text{RT}_{32:63} \leftarrow ((\text{RA})_{32:63} + 0\text{x00008000}) & \text{0xFFFF0000} \end{array} ``` The 32-bit elements of RA are rounded into 16 bits. The result is placed in RT. The resulting 16 bits are placed in the most significant 16 bits of each element of RT, zeroing out the low-order 16 bits of each element. ### **Special Registers Altered:** ### Vector Shift Left Word ### **EVX-form** evslw RT,RA,RB | 4 | RT | RA | RB | 548 | | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | $$\begin{array}{l} \text{nh} \leftarrow (\text{RB})_{\,26:\,31} \\ \text{nl} \leftarrow (\text{RB})_{\,58:\,63} \\ \text{RT}_{\,0:\,31} \leftarrow \text{SL}(\,(\text{RA})_{\,0:\,31},\,\,\text{nh}) \\ \text{RT}_{\,32:\,63} \leftarrow \text{SL}(\,(\text{RA})_{\,32:\,63},\,\,\text{nl}) \end{array}$$ Each of the high and low elements of RA is shifted left by an amount specified in RB. The result is placed in RT. The separate shift amounts for each element are specified by 6 bits in RB that lie in bit positions 26:31 and 58:63. Shift amounts from 32 to 63 give a zero result. ### **Special Registers Altered:** None ### Vector Splat Fractional Immediate EVX-form evsplatfi RT,SI | 4 | RT | SI | /// | 555 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \operatorname{RT}_{0:31} \leftarrow \operatorname{SI} \mid \mid \ ^{27} \operatorname{0} \\ \operatorname{RT}_{32:63} \leftarrow \operatorname{SI} \mid \mid \ ^{27} \operatorname{0} \end{array}$$ The value specified by SI is padded with trailing zeros and placed in both elements of RT. The SI ends up in bit positions $RT_{0:4}$ and $RT_{32:36}$ . ### **Special Registers Altered:** None # Vector Shift Right Word Immediate Signed EVX-form evsrwis RT,RA,UI | 4 | RT | RA | UI | 547 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \text{n} \leftarrow \text{UI} \\ \text{RT}_{\text{0:31}} \leftarrow \text{EXTS}\left(\left(\text{RA}\right)_{\text{0:31-n}}\right) \\ \text{RT}_{\text{32:63}} \leftarrow \text{EXTS}\left(\left(\text{RA}\right)_{\text{32:63-n}}\right) \end{array}$$ Both high and low elements of RA are shifted right by the 5-bit UI value. Bits in the most significant positions vacated by the shift are filled with a copy of the sign bit. ### **Special Registers Altered:** None ## Vector Shift Left Word Immediate EVX-form evslwi RT,RA,UI | Ī | 4 | RT | RA | UI | 550 | | |---|---|----|----|----|-----|----| | | 0 | 6 | 11 | 16 | 21 | 31 | $$n \leftarrow UI$$ $RT_{0:31} \leftarrow SL((RA)_{0:31}, n)$ $RT_{32:63} \leftarrow SL((RA)_{32:63}, n)$ Both high and low elements of RA are shifted left by the 5-bit UI value and the results are placed in RT. ### **Special Registers Altered:** None ### Vector Splat Immediate EVX-form evsplati RT,SI | 4 | RT | SI | /// | 553 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \operatorname{RT}_{0:31} \leftarrow \operatorname{EXTS}\left(\operatorname{SI}\right) \\ \operatorname{RT}_{32:63} \leftarrow \operatorname{EXTS}\left(\operatorname{SI}\right) \end{array}$$ The value specified by SI is sign extended and placed in both elements of RT. ### **Special Registers Altered:** None # Vector Shift Right Word Immediate Unsigned EVX-form evsrwiu RT,RA,UI $$\begin{array}{l} n \leftarrow \text{UI} \\ \text{RT}_{0:31} \leftarrow \text{EXTZ}\left(\left(\text{RA}\right)_{0:31-n}\right) \\ \text{RT}_{32:63} \leftarrow \text{EXTZ}\left(\left(\text{RA}\right)_{32:63-n}\right) \end{array}$$ Both high and low elements of RA are shifted right by the 5-bit UI value; zeros are shifted into the most significant position. ### **Special Registers Altered:** ### Vector Shift Right Word Signed EVX-form evsrws RT,RA,RB ``` \begin{array}{l} \text{nh} \leftarrow (\text{RB})_{26:31} \\ \text{nl} \leftarrow (\text{RB})_{58:63} \\ \text{RT}_{0:31} \leftarrow \text{EXTS} ((\text{RA})_{0:31-\text{nh}}) \\ \text{RT}_{32:63} \leftarrow \text{EXTS} ((\text{RA})_{32:63-\text{nl}}) \end{array} ``` Both the high and low elements of RA are shifted right by an amount specified in RB. The result is placed in RT. The separate shift amounts for each element are specified by 6 bits in RB that lie in bit positions 26:31 and 58:63. The sign bits are shifted into the most significant position. Shift amounts from 32 to 63 give a result of 32 sign bits. ### **Special Registers Altered:** None ### Vector Store Double of Double EVX-form evstdd RS,D(RA) | 4 | RS | RA | UI | 801 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI\times8) MEM(EA,8) \leftarrow (RS)<sub>0:63</sub> ``` D in the instruction mnemonic is UI $\times$ 8. The contents of RS are stored as a doubleword in storage addressed by EA. ### **Special Registers Altered:** None ### Vector Shift Right Word Unsigned EVX-form evsrwu RT,RA,RB | 4 | RT | RA | RB | 544 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \text{nh} \; \leftarrow \; (\text{RB})_{\,26:\,31} \\ \text{nl} \; \leftarrow \; (\text{RB})_{\,58:\,63} \\ \text{RT}_{\,0:\,31} \; \leftarrow \; \text{EXTZ} \left( \; (\text{RA})_{\,0:\,31-\text{nh}} \right) \\ \text{RT}_{\,32:\,63} \; \leftarrow \; \text{EXTZ} \left( \; (\text{RA})_{\,32:\,63-\text{nl}} \right) \end{array} ``` Both the high and low elements of RA are shifted right by an amount specified in RB. The result is placed in RT. The separate shift amounts for each element are specified by 6 bits in RB that lie in bit positions 26:31 and 58:63. Zeros are shifted into the most significant position. Shift amounts from 32 to 63 give a zero result. ### **Special Registers Altered:** None ### Vector Store Double of Double Indexed EVX-form evstddx RS,RA,RB ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA, 8) \leftarrow (RS)_{0:63} ``` The contents of RS are stored as a doubleword in storage addressed by EA. ### **Special Registers Altered:** ### Vector Store Double of Four Halfwords EVX-form evstdh RS,D(RA) | 4 | RS | RA | UI | 805 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 3 | 31 | if (RA = 0) then $b \leftarrow 0$ else $b \leftarrow (RA)$ $EA \leftarrow b + EXTZ(UI \times 8)$ $MEM(EA, 2) \leftarrow (RS)_{0:15}$ $MEM(EA+2, 2) \leftarrow (RS)_{16:31}$ $MEM(EA+4, 2) \leftarrow (RS)_{32:47}$ $MEM(EA+6, 2) \leftarrow (RS)_{48:63}$ D in the instruction mnemonic is UI $\times$ 8. The contents of RS are stored as four halfwords in storage addressed by EA. ### **Special Registers Altered:** None ### Vector Store Double of Two Words EVX-form evstdw RS,D(RA) | 4 | RS | RA | UI | 803 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if (RA = 0) then $b \leftarrow 0$ else $b \leftarrow (RA)$ $EA \leftarrow b + EXTZ(UI \times 8)$ $MEM(EA, 4) \leftarrow (RS)_{0:31}$ $MEM(EA+4, 4) \leftarrow (RS)_{32:63}$ D in the instruction mnemonic is UI $\times$ 8. The contents of RS are stored as two words in storage addressed by EA. ### **Special Registers Altered:** None ### Vector Store Double of Four Halfwords Indexed EVX-form evstdhx RS,RA,RB | 4 | RS | RA | RB | 804 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | if (RA = 0) then $b \leftarrow 0$ else $b \leftarrow (RA)$ $EA \leftarrow b + (RB)$ $MEM(EA,2) \leftarrow (RS)_{0:15}$ $MEM(EA+2,2) \leftarrow (RS)_{16:31}$ $MEM(EA+4,2) \leftarrow (RS)_{32:47}$ $MEM(EA+6,2) \leftarrow (RS)_{48:63}$ The contents of RS are stored as four halfwords in storage addressed by EA. ### **Special Registers Altered:** None ### Vector Store Double of Two Words Indexed EVX-form evstdwx RS,RA,RB | 4 | RS | RA | RB | 802 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if (RA = 0) then $b \leftarrow 0$ else $b \leftarrow (RA)$ $EA \leftarrow b + (RB)$ $MEM(EA,4) \leftarrow (RS)_{0:31}$ $MEM(EA+4,4) \leftarrow (RS)_{32:63}$ The contents of RS are stored as two words in storage addressed by EA. ### **Special Registers Altered:** ### Vector Store Word of Two Halfwords from Even EVX-form evstwhe RS,D(RA) | 4 | RS | RA | UI | 817 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI\times4) MEM(EA,2) \leftarrow (RS)<sub>0:15</sub> MEM(EA+2,2) \leftarrow (RS)<sub>32:47</sub> ``` D in the instruction mnemonic is $UI \times 4$ . The even halfwords from each element of RS are stored as two halfwords in storage addressed by EA. ### **Special Registers Altered:** None # Vector Store Word of Two Halfwords from Odd EVX-form evstwho RS,D(RA) | 4 | RS | RA | UI | 821 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI\times4) MEM(EA,2) \leftarrow (RS)<sub>16:31</sub> MEM(EA+2,2) \leftarrow (RS)<sub>48:63</sub> ``` D in the instruction mnemonic is UI $\times$ 4. The odd halfwords from each element of RS are stored as two halfwords in storage addressed by EA. ### **Special Registers Altered:** None ### Vector Store Word of Word from Even EVX-form evstwwe RS,D(RA) | 4 | RS | RA | UI | 825 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI \times 4) MEM(EA, 4) \leftarrow (RS)_{0:31} ``` D in the instruction mnemonic is $UI \times 4$ . The even word of RS is stored in storage addressed by EA. ### **Special Registers Altered:** None ### Vector Store Word of Two Halfwords from Even Indexed EVX-form evstwhex RS,RA,RB | 4 | RS | RA | RB | 816 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA,2) \leftarrow (RS)_{0:15} MEM(EA+2,2) \leftarrow (RS)_{32:47} ``` The even halfwords from each element of RS are stored as two halfwords in storage addressed by EA. ### **Special Registers Altered:** None ## Vector Store Word of Two Halfwords from Odd Indexed EVX-form evstwhox RS,RA,RB | 4 | RS | RA | RB | 820 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA, 2) \leftarrow (RS)_{16:31} MEM(EA+2, 2) \leftarrow (RS)_{48:63} ``` The odd halfwords from each element of RS are stored as two halfwords in storage addressed by EA. ### **Special Registers Altered:** None ### Vector Store Word of Word from Even Indexed EVX-form evstwwex RS,RA,RB | 4 | RS | RA | RB | 824 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if (RA = 0) then $b \leftarrow 0$ else $b \leftarrow (RA)$ $EA \leftarrow b + (RB)$ $MEM(EA, 4) \leftarrow (RS)_{0:31}$ The even word of RS is stored in storage addressed by $\mathsf{F}\Delta$ ### **Special Registers Altered:** ### Vector Store Word of Word from Odd EVX-form evstwwo RS,D(RA) | 4 | RS | RA | UI | 829 | ٦ | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTZ(UI \times 4) MEM(EA, 4) \leftarrow (RS)_{32:63} ``` D in the instruction mnemonic is $UI \times 4$ . The odd word of RS is stored in storage addressed by EA. ### **Special Registers Altered:** None ### Vector Subtract Signed, Modulo, Integer to Accumulator Word EVX-form evsubfsmiaaw RT,RA | 4 | RT | RA | /// | 1227 | | |---|----|----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{lll} \text{RT}_{0:31} & \leftarrow (\text{ACC})_{0:31} & - & (\text{RA})_{0:31} \\ \text{RT}_{32:63} & \leftarrow (\text{ACC})_{32:63} & - & (\text{RA})_{32:63} \\ \text{ACC}_{0:63} & \leftarrow (\text{RT})_{0:63} \end{array}$$ Each word element in RA is subtracted from the corresponding element in the accumulator and the difference is placed into the corresponding RT word and into the accumulator. ### **Special Registers Altered:** ACC ### Vector Store Word of Word from Odd Indexed EVX-form evstwwox RS,RA,RB | 4 | RS | RA | RB | 828 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | ``` if (RA = 0) then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA, 4) \leftarrow (RS)_{32:63} ``` The odd word of RS is stored in storage addressed by FA ### **Special Registers Altered:** None ## Vector Subtract Signed, Saturate, Integer to Accumulator Word EVX-form evsubfssiaaw RT,RA | 4 | RT | RA | /// | 1219 | | |---|----|----|-----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{llll} \operatorname{temp}_{0:63} \leftarrow \operatorname{EXTS}\left(\left(\operatorname{ACC}\right)_{0:31}\right) & - & \operatorname{EXTS}\left(\left(\operatorname{RA}\right)_{0:31}\right) \\ \operatorname{ovh} & \leftarrow \operatorname{temp}_{31} \ \oplus \ \operatorname{temp}_{32} \\ \operatorname{RT}_{0:31} & \leftarrow \operatorname{SATURATE}\left(\operatorname{ovh}, \ \operatorname{temp}_{31}, \ \operatorname{0x8000\_0000}, \\ & & \operatorname{0x7FFF\_FFFF}, \ \operatorname{temp}_{32:63}\right) \\ \operatorname{temp}_{0:63} & \leftarrow \operatorname{EXTS}\left(\left(\operatorname{ACC}\right)_{32:63}\right) & - & \operatorname{EXTS}\left(\left(\operatorname{RA}\right)_{32:63}\right) \\ \operatorname{ovl} & \leftarrow \operatorname{temp}_{31} \ \oplus \ \operatorname{temp}_{32} \\ \operatorname{RT}_{32:63} & \leftarrow \operatorname{SATURATE}\left(\operatorname{ovl}, \ \operatorname{temp}_{31}, \ \operatorname{0x8000\_0000}, \\ & & \operatorname{0x7FFF\_FFFF}, \ \operatorname{temp}_{32:63}\right) \\ \operatorname{ACC}_{0:63} & \leftarrow \left(\operatorname{RT}\right)_{0:63} \\ \operatorname{SPEFSCR}_{\operatorname{OVH}} & \leftarrow \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\operatorname{SOVH}} & \leftarrow \operatorname{sPEFSCR}_{\operatorname{SOVH}} & | \ \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\operatorname{SOVH}} & \leftarrow \operatorname{SPEFSCR}_{\operatorname{SOVH}} & | \ \operatorname{ovh} \\ \operatorname{SPEFSCR}_{\operatorname{SOV}} & \leftarrow \operatorname{SPEFSCR}_{\operatorname{SOV}} & | \ \operatorname{ovl} \end{array} ``` Each signed-integer word element in RA is sign-extended and subtracted from the corresponding sign-extended element in the accumulator saturating if overflow occurs, and the results are placed in RT and the accumulator. ### **Special Registers Altered:** ACC OV OVH SOV SOVH ### Vector Subtract Unsigned, Modulo, Integer to Accumulator Word EVX-form evsubfumiaaw RT,RA ``` RT_{0:31} \leftarrow (ACC)_{0:31} - (RA)_{0:31} RT_{32:63} \leftarrow (ACC)_{32:63} - (RA)_{32:63} ACC_{0:63} \leftarrow (RT)_{0:63} ``` Each unsigned-integer word element in RA is subtracted from the corresponding element in the accumulator and the results are placed in RT and into the accumulator. ### **Special Registers Altered:** ACC ### Vector Subtract Unsigned, Saturate, Integer to Accumulator Word EVX-form evsubfusiaaw RT,RA ``` \begin{array}{l} {\rm temp}_{0:63} \leftarrow {\rm EXTZ}\,(\,({\rm ACC})_{\,0:31}) \ - \ {\rm EXTZ}\,(\,({\rm RA})_{\,0:31}) \\ {\rm ovh} \leftarrow {\rm temp}_{31} \\ {\rm RT}_{0:31} \leftarrow {\rm SATURATE}\,({\rm ovh}, \ {\rm temp}_{31}, \ 0{\rm x00000\_0000}, \\ {\rm 0x00000\_0000}, \ {\rm temp}_{32:63}) \\ {\rm temp}_{0:63} \leftarrow {\rm EXTS}\,(\,({\rm ACC})_{\,32:63}) \ - \ {\rm EXTS}\,(\,({\rm RA})_{\,32:63}) \\ {\rm ovl} \leftarrow {\rm temp}_{31} \\ {\rm RT}_{32:63} \leftarrow {\rm SATURATE}\,({\rm ovl}, \ {\rm temp}_{31}, \ 0{\rm x00000\_0000}, \\ {\rm 0x00000\_0000}, \ {\rm temp}_{32:63}) \\ {\rm ACC}_{0:63} \leftarrow ({\rm RT})_{\,0:63} \\ \\ {\rm SPEFSCR_{\rm OVH}} \leftarrow {\rm ovh} \\ {\rm SPEFSCR_{\rm SOVH}} \leftarrow {\rm sPEFSCR_{\rm SOVH}} \mid {\rm ovh} \\ {\rm SPEFSCR_{\rm SOVH}} \leftarrow {\rm SPEFSCR_{\rm SOVH}} \mid {\rm ovh} \\ {\rm SPEFSCR_{\rm SOV}} \leftarrow {\rm SPEFSCR_{\rm SOV}} \mid {\rm ovl} \\ \end{array} ``` Each unsigned-integer word element in RA is zero-extended and subtracted from the corresponding zero-extended element in the accumulator saturating if overflow occurs, and the results are placed in RT and the accumulator. ### **Special Registers Altered:** ACC OV OVH SOV SOVH ### Vector Subtract from Word EVX-form evsubfw RT,RA,RB | 4 | RT | RA | RB | 516 | ٦ | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | $$RT_{0:31} \leftarrow (RB)_{0:31} - (RA)_{0:31}$$ $RT_{32:63} \leftarrow (RB)_{32:63} - (RA)_{32:63}$ Each signed-integer element of RA is subtracted from the corresponding element of RB and the results are placed in RT. ### **Special Registers Altered:** None # Vector Subtract Immediate from Word EVX-form evsubifw RT,UI,RB | 4 | RT | UI | RB | 518 | |---|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{(RB)_{0:31}} - \mathtt{EXTZ}\mathtt{(UI)} \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{(RB)_{32:63}} - \mathtt{EXTZ}\mathtt{(UI)} \end{array}$$ UI is zero-extended and subtracted from both the high and low elements of RB. Note that the same value is subtracted from both elements of the register. ### **Special Registers Altered:** None ### Vector XOR **EVX-form** evxor RT,RA,RB | 4 | RT | RA | RB | 534 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:31} \leftarrow (RA)_{0:31} \oplus (RB)_{0:31}$$ $RT_{32:63} \leftarrow (RA)_{32:63} \oplus (RB)_{32:63}$ Each element of RA and RB is exclusive-ORed. The results are placed in RT. ### **Special Registers Altered:** **Chapter 8. Embedded Floating-Point** [Category: SPE.Embedded Float Scalar Double] [Category: SPE.Embedded Float Scalar Single] [Category: SPE.Embedded Float Vector] | <ul><li>8.1 Overview</li></ul> | 8.2.4.1 Sticky Bit Handling For Exception Conditions | |-------------------------------------------------|------------------------------------------------------------------------------------------| | (SPEFSCR) | 8.3.1 Load/Store Instructions 319 8.3.2 SPE.Embedded Float Vector Instruc- | | <ul><li>8.2.3 Exception Conditions</li></ul> | tions [Category: SPE.Embedded Float<br>Vector] | | and Underflow | Instructions [Category: SPE.Embedded Float Scalar Single] | | 8.2.3.4 Embedded Floating-Point Round (Inexact) | 8.3.4 SPE.Embedded Float Scalar Double Instructions [Category: SPE.Embedded Float Scalar | | by Zero | Double] | | , | , | ### 8.1 Overview The *Embedded Floating-Point* categories require the implementation of the Signal Processing Engine (SPE) category and consist of three distinct categories: - Embedded vector single-precision floating-point (SPE.Embedded Float Vector [SP.FV]) - Embedded scalar single-precision floating-point (SPE.Embedded Float Scalar Single [SP.FS]) - Embedded scalar double-precision floating-point (SPE.Embedded Float Scalar Double [SP.FD]) Although each of these may be implemented independently, they are defined in a single chapter because it is likely that they may be implemented together. References to *Embedded Floating-Point* categories, *Embedded Floating-Point* instructions, or *Embedded Floating-Point* operations apply to all 3 categories. Single-precision floating-point is handled by the SPE.Embedded Float Vector and SPE.Embedded Float Scalar Single categories; double-precision floating-point is handled by the SPE.Embedded Float Scalar Double category. ### 8.2 Programming Model Embedded floating-point operations are performed in the GPRs of the processor. The SPE.Embedded Float Vector and SPE.Embedded Float Scalar Double categories require a GPR register file with thirty-two 64-bit registers as required by the Signal Processing Engine category. The SPE.Embedded Float Scalar Single category requires a GPR register file with thirty-two 32-bit registers. When implemented with a 64-bit register file on a 32-bit implementation, instructions in this category only use and modify bits 32:63 of the GPR. In this case, bits 0:31 of the GPR are left unchanged by the operation. For 64-bit implementations, bits 0:31 are unchanged after the operation. Instructions in the SPE.Embedded Float Scalar Double category operate on the entire 64 bits of the GPRs. Instructions in the SPE.Embedded Float Vector category operate on the entire 64 bits of the GPRs as well, but contain two 32-bit data items that are operated on independently of each other in a SIMD fashion. The format of both data items is the same as the format of a data item in the SPE.Embedded Float Scalar Single category. The data item contained in bits 0:31 is called the 'high word'. The data item contained in bits 32:63 is called the 'low word'. There are no record forms of *Embedded Floating-Point* instructions. *Embedded Floating-Point Compare* instructions treat NaNs, Infinity, and Denorm as normalized numbers for the comparison calculation when default results are provided. ### 8.2.1 Signal Processing Embedded Floating-Point Status and Control Register (SPEFSCR) Status and control for the *Embedded Floating-Point* categories uses the SPEFSCR. This register is defined by the Signal Processing Engine category in Section 7.3.4. Status and control bits are shared for *Embedded Floating-Point* and SPE operations. Instructions in the SPE.Embedded Float Vector category affect both the high element (bits 34:39) and low element floating-point status flags (bits 50:55). Instructions in the SPE.Embedded Float Scalar Double and SPE.Embedded Float Scalar Single categories affect only the low element floating-point status flags and leave the high element floating-point status flags undefined. ### 8.2.2 Floating-Point Data Formats Single-precision floating-point data elements are 32 bits wide with 1 sign bit (s), 8 bits of biased exponent (e) and 23 bits of fraction (f). Double-precision float- ing-point data elements are 64 bits wide with 1 sign bit (s), 11 bits of biased exponent (e) and 52 bits of fraction (f). In the IEEE 754 specification, floating-point values are represented in a format consisting of three explicit fields (sign field, biased exponent field, and fraction field) and an implicit hidden bit. Figure 108.Floating-Point Data Format fraction - fractional portion of number For single-precision normalized numbers, the biased exponent value *e* lies in the range of 1 to 254 corresponding to an actual exponent value E in the range -126 to +127. For double-precision normalized numbers, the biased exponent value *e* lies in the range of 1 to 2046 corresponding to an actual exponent value E in the range -1022 to +1023. With the hidden bit implied to be '1' (for normalized numbers), the value of the number is interpreted as follows: $$(-1)^{s} \times 2^{E} \times (1.fraction)$$ where E is the unbiased exponent and 1.fraction is the mantissa (or significand) consisting of a leading '1' (the hidden bit) and a fractional part (fraction field). For the single-precision format, the maximum positive normalized number (pmax) is represented by the encoding 0x7F7FFFFF which is approximately 3.4E+38 (2<sup>128</sup>), and the minimum positive normalized value (pmin) is represented by the encoding 0x00800000 which is approximately 1.2E-38 (2<sup>-126</sup>). For the double-precision format, the maximum positive normalized number is represented the by 0x7feFFFFF FFFFFFF which is approximately 1.8E+307 (2<sup>1024</sup>), and the minimum positive normalized value (pmin) is represented by the encoding 0x00100000 00000000 which is approximately 2.2E-308 (2<sup>-1022</sup>). Two specific values of the biased exponent are reserved (0 and 255 for single-precision; 0 and 2047 for double-precision) for encoding special values of +0, -0, +infinity, -infinity, and NaNs. Zeros of both positive and negative sign are represented by a biased exponent value *e* of 0 and a fraction *f* which is 0. Infinities of both positive and negative sign are represented by a maximum exponent field value (255 for single-precision, 2047 for double-precision) and a fraction which is 0. Denormalized numbers of both positive and negative sign are represented by a biased exponent value *e* of 0 and a fraction *f*, which is nonzero. For these numbers, the hidden bit is defined by the IEEE 754 standard to be 0. This number type is not directly supported in hardware. Instead, either a software interrupt handler is invoked, or a default value is defined. Not-a-Numbers (NaNs) are represented by a maximum exponent field value (255 for single-precision, 2047 for double-precision) and a fraction *f* which is nonzero. ### 8.2.3 Exception Conditions ### 8.2.3.1 Denormalized Values on Input Any denormalized value used as an operand may be truncated by the implementation to a properly signed zero value. ### 8.2.3.2 Embedded Floating-Point Overflow and Underflow Defining *pmax* to be the most positive normalized value (farthest from zero), *pmin* the smallest positive normalized value (closest to zero), *nmax* the most negative normalized value (farthest from zero) and *nmin* the smallest normalized negative value (closest to zero), an overflow is said to have occurred if the numerically correct result (*r*) of an instruction is such that *r>pmax* or *r<nmax*. An underflow is said to have occurred if the numerically correct result of an instruction is such that 0<*r<pmin* or *nmin<r<0*. In this case, *r* may be denormalized, or may be smaller than the smallest denormalized number. The Embedded Floating-Point categories do not produce +Infinity, -Infinity, NaN, or denormalized numbers. If the result of an instruction overflows and Embedded Floating-Point Overflow exceptions are disabled (SPEFSCR<sub>FOVFE</sub>=0), pmax or nmax is generated as the result of that instruction depending upon the sign of the result. If the result of an instruction underflows and Embedded Floating-Point Underflow exceptions are disabled (SPEFSCR<sub>FUNFE</sub>=0), +0 or -0 is generated as the result of that instruction based upon the sign of the result. If an overflow occurs, SPEFSCR<sub>FOVF</sub> FOVFH are set appropriately, or if an underflow occurs, SPEFSCR-FUNF FUNFH are set appropriately. If either Embedded Floating-Point Underflow or Embedded Floating-Point Overflow exceptions are enabled and a corresponding status bit is 1, an Embedded Floating-Point Data interrupt is taken and the destination register is not updated. ### Programming Note - On some implementations, operations that result in overflow or underflow are likely to take significantly longer than operations that do not. For example, these operations may cause a system error handler to be invoked; on such implementations, the system error handler updates the overflow bits appropriately. # 8.2.3.3 Embedded Floating-Point Invalid Operation/Input Errors Embedded Floating-Point Invalid Operation/Input errors occur when an operand to an operation contains an invalid input value. If any of the input values are Infinity, Denorm, or NaN, or for an *Embedded Floating-Point Divide* instruction both operands are +/-0, SPEFSCR<sub>FINV FINVH</sub> are set to 1 appropriately, and SPEFSCR<sub>FGH FXH FG FX</sub> are set to 0 appropriately. If SPEFSCR<sub>FINVE</sub>=1, an Embedded Floating-Point Data interrupt is taken and the destination register is not updated. # 8.2.3.4 Embedded Floating-Point Round (Inexact) If any result element of an *Embedded Floating-Point* instruction is inexact, or overflows but Embedded Floating-Point Overflow exceptions are disabled, or underflows but Embedded Floating-Point Underflow exceptions are disabled, and no higher priority interrupt occurs, SPEFSCR<sub>FINXS</sub> is set to 1. If the Embedded Floating-Point Round (Inexact) exception is enabled, an Embedded Floating-Point Round interrupt occurs. In this case, the destination register is updated with the truncated result(s). The SPEFSCR<sub>FGH FXH FG FX</sub> bits are properly updated to allow rounding to be performed in the interrupt handler. SPEFSCR $_{\text{FG}}$ FX (SPEFSCR $_{\text{FGH}}$ FXH) are set to 0 if an Embedded Floating-Point Data interrupt is taken due to overflow, underflow, or if an Embedded Floating-Point Invalid Operation/Input error is signaled for the low (high) element (regardless of SPEFSCR $_{\text{FINVE}}$ ). # 8.2.3.5 Embedded Floating-Point Divide by Zero If an *Embedded Floating-Point Divide* instruction executes and an Embedded Floating-Point Invalid Operation/Input error does not occur and the instruction is executed with a +/-0 divisor value and a finite normalized nonzero dividend value, an Embedded Floating-Point Divide By Zero exception occurs and SPEFSCR<sub>FDBZ</sub> FDBZH are set appropriately. If Embedded Floating-Point Divide By Zero exceptions are enabled, an Embedded Floating-Point Data interrupt is then taken and the destination register is not updated. ### 8.2.3.6 Default Results Default results are generated when an Embedded Floating-Point Invalid Operation/Input Error, Embedded Floating-Point Overflow, Embedded Floating-Point Underflow, or Embedded Floating-Point Divide by Zero occurs on an Embedded Floating-Point operation. Default results provide a normalized value as a result of the operation. In general, Denorm results and underflows are set to 0 and overflows are saturated to the maximum representable number. Default results produced for each operation are described in Section 8.4, "Embedded Floating-Point Results Summary". ### 8.2.4 IEEE 754 Compliance The Embedded Floating-Point categories require a floating-point system as defined in the ANSI/IEEE Standard 754-1985 but may rely on software support in order to conform fully with the standard. Thus, whenever an input operand of the Embedded Floating-Point instruction has data values that are +Infinity, -Infinity, Denormalized, NaN, or when the result of an operation produces an overflow or an underflow, an Embedded Floating-Point Data interrupt may be taken and the interrupt handler is responsible for delivering IEEE 754 compliant behavior if desired. When Embedded Floating-Point Invalid Operation/ Input Error exceptions are disabled (SPEFSCR<sub>FINIVE</sub> = 0), default results are provided by the hardware when an Infinity, Denormalized, or NaN input is received, or for the operation 0/0. When Embedded Floating-Point Underflow exceptions are disabled (SPEFSCR<sub>FLINEF</sub> = 0) and the result of a floating-point operation underflows, a signed zero result is produced. The Embedded Floating-Point Round (Inexact) exception is also signaled for this condition. When Embedded Floating-Point Overflow exceptions are disabled (SPEFSCR<sub>FOVFE</sub> = 0) and the result of a floating-point operation overflows, a pmax or nmax result is produced. The Embedded Floating-Point Round (Inexact) exception is also signaled for this condition. An exception enable flag (SPEFSCR<sub>FINXE</sub>) is also provided for generating an Embedded Floating-Point Round interrupt when an inexact result is produced, to allow a software handler to conform to the IEEE 754 standard. An Embedded Floating-Point Divide By Zero exception enable flag (SPEFSCR<sub>FDBZE</sub>) is provided for generating an Embedded Floating-Point Data interrupt when a divide by zero operation is attempted to allow a software handler to conform to the IEEE 754 standard. All of these exceptions may be disabled, and the hardware will then deliver an appropriate default result. The sign of the result of an addition operation is the sign of the source operand having the larger absolute value. If both operands have the same sign, the sign of the result is the same as the sign of the operands. This includes subtraction which is addition with the negation of the sign of the second operand. The sign of the result of an addition operation with operands of differing signs for which the result is zero is positive except when rounding to negative infinity. Thus -0 + -0 = -0, and all other cases which result in a zero value give +0 unless the rounding mode is round to negative infinity. ### Programming Note Note that when exceptions are disabled and default results computed, operations having input values that are denormalized may provide different results on different implementations. An implementation may choose to use the denormalized value or a zero value for any computation. Thus a computational operation involving a denormalized value and a normal value may return different results depending on the implementation. # 8.2.4.1 Sticky Bit Handling For Exception Conditions The SPEFSCR register defines sticky bits for retaining information about exception conditions that are detected. There are 5 sticky bits (FINXS, FINVS, FDBZS, FUNFS and FOVFS) that can be used to help provide IEEE 754 compliance. The sticky bits represent the combined 'or' of all the previous status bits produced from any *Embedded Floating-Point* operation since the last time software zeroed the sticky bit. The hardware will never set a sticky bit to 0. ### 8.3 Embedded Floating-Point Instructions ### 8.3.1 Load/Store Instructions Embedded Floating-Point instructions use GPRs to hold and operate on floating-point values. The Embedded Floating-Point categories do not define Load and Store instructions to move the data to and from memory, but instead rely on existing instructions in Book I to load and store data. # 8.3.2 SPE.Embedded Float Vector Instructions [Category: SPE.Embedded Float Vector] All SPE.Embedded Float Vector instructions are single-precision. There are no vector floating-point double-precision instructions # Vector Floating-Point Single-Precision Absolute Value EVX-form evfsabs RT,RA | 4 | RT | RA | /// | 644 | | |---|----|----|-----|-----|---| | 0 | 6 | 11 | 16 | 21 | 3 | $$\begin{array}{l} {\rm RT_{0:31}} \leftarrow {\rm 0b0} \mid \mid \ ({\rm RA})_{1:31} \\ {\rm RT_{32:63}} \leftarrow {\rm 0b0} \mid \mid \ ({\rm RA})_{33:63} \end{array}$$ The sign bit of each element in register RA is set to 0 and the results are placed into register RT. Regardless of the value of register RA, no exceptions are taken during the execution of this instruction. ### **Special Registers Altered:** None ### Vector Floating-Point Single-Precision Negative Absolute Value EVX-form evfsnabs RT,RA | 4 | RT | RA | /// | 645 | ٦ | |---|----|----|-----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | $$\begin{array}{l} {\rm RT_{0:31}} \leftarrow {\rm 0b1} \mid \mid \ ({\rm RA})_{1:31} \\ {\rm RT_{32:63}} \leftarrow {\rm 0b1} \mid \mid \ ({\rm RA})_{33:63} \end{array}$$ The sign bit of each element in register RA is set to 1 and the results are placed into register RT. Regardless of the value of register RA, no exceptions are taken during the execution of this instruction. ### **Special Registers Altered:** None ### Vector Floating-Point Single-Precision Negate EVX-form evfsneg RT,RA $$RT_{0:31} \leftarrow \neg (RA)_0 \mid | (RA)_{1:31}$$ $RT_{32:63} \leftarrow \neg (RA)_{32} \mid | (RA)_{33:63}$ The sign bit of each element in register RA is complemented and the results are placed into register RT. Regardless of the value of register RA, no exceptions are taken during the execution of this instruction. ### **Special Registers Altered:** ### Vector Floating-Point Single-Precision Add EVX-form evfsadd RT,RA,RB $$RT_{0:31} \leftarrow (RA)_{0:31} +_{sp} (RB)_{0:31}$$ $RT_{32:63} \leftarrow (RA)_{32:63} +_{sp} (RB)_{32:63}$ Each single-precision floating-point element of register RA is added to the corresponding element of register RB and the results are stored in register RT. If an underflow occurs, +0 (for rounding modes RN, RZ, RP) or -0 (for rounding mode RM) is stored in the corresponding element of register RT. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS FOVF FOVFH FOVFS FUNF FUNFH FUNFS ### Vector Floating-Point Single-Precision Multiply EVX-form evfsmul RT,RA,RB | 4 | RT | RA | RB | 648 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} {\rm RT_{0:31}} \leftarrow {\rm (RA)_{0:31}} \times _{\rm sp} {\rm (RB)_{0:31}} \\ {\rm RT_{32:63}} \leftarrow {\rm (RA)_{32:63}} \times _{\rm sp} {\rm (RB)_{32:63}} \end{array}$$ Each single-precision floating-point element of register RA is multiplied with the corresponding element of register RB and the result is stored in register RT. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS FOVF FOVFH FOVFS FUNF FUNFH FUNFS ### Vector Floating-Point Single-Precision Subtract EVX-form evfssub RT,RA,RB | 4 | RT | RA | RB | 641 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | $$\begin{array}{l} \text{RT}_{0:31} \leftarrow & (\text{RA})_{0:31} \, \text{-sp} \, (\text{RB})_{0:31} \\ \text{RT}_{32:63} \leftarrow & (\text{RA})_{32:63} \, \text{-sp} \, (\text{RB})_{32:63} \end{array}$$ Each single-precision floating-point element of register RB is subtracted from the corresponding element of register RA and the results are stored in register RT. If an underflow occurs, +0 (for rounding modes RN, RZ, RP) or -0 (for rounding mode RM) is stored in the corresponding element of register RT. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS FOVF FOVFH FOVFS FUNF FUNFH FUNFS ### Vector Floating-Point Single-Precision Divide EVX-form evfsdiv RT,RA,RB | 4 | RT | RA | RB | 649 | | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{(RA)_{0:31}} \div_{sp} \mathtt{(RB)_{0:31}} \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{(RA)_{32:63}} \div_{sp} \mathtt{(RB)_{32:63}} \end{array}$$ Each single-precision floating-point element of register RA is divided by the corresponding element of register RB and the result is stored in register RT. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS FDBZ FDBZH FDBZS FOVF FOVFH FOVFS FUNF FUNFH FUNFS ### Vector Floating-Point Single-Precision Compare Greater Than EVX-form evfscmpgt BF,RA,RB | 4 | BF | // | RA | RB | 652 | |---|----|----|----|----|------| | 0 | 6 | 9 | 11 | 16 | 21 3 | ``` ah \leftarrow (RA)<sub>0:31</sub> al \leftarrow (RA)<sub>32:63</sub> bh \leftarrow (RB)<sub>0:31</sub> bl \leftarrow (RB)<sub>32:63</sub> if (ah > bh) then ch \leftarrow 1 else ch \leftarrow 0 if (al > bl) then cl \leftarrow 1 else cl \leftarrow 0 \operatorname{CR}_{4 \times \operatorname{BF}: 4 \times \operatorname{BF}+3} \leftarrow ch || cl || (ch | cl) || (ch & cl) ``` Each element of register RA is compared against the corresponding element of register RB. The results of the comparisons are placed into CR field BF. If $RA_{0:31}$ is greater than $RB_{0:31},$ bit 0 of CR field BF is set to 1, otherwise it is set to 0. If $RA_{32:63}$ is greater than $RB_{32:63},$ bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bit 2 of CR field BF is set to the OR of both result bits and Bit 3 of CR field BF is set to the AND of both result bits. Comparison ignores the sign of 0 (+0 = -0). If an input error occurs and default results are generated, NaNs, Infinities, and Denorms as treated as normalized numbers, using their values of 'e' and 'f directly. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX CR field BF # Vector Floating-Point Single-Precision Compare Less Than EVX-form evfscmplt BF,RA,RB | 4 | BF | // | RA | RB | 653 | 1 | |---|----|----|----|----|------|---| | 0 | 6 | 9 | 11 | 16 | 21 3 | 1 | ``` ah \leftarrow (RA)<sub>0:31</sub> al \leftarrow (RA)<sub>32:63</sub> bh \leftarrow (RB)<sub>0:31</sub> bl \leftarrow (RB)<sub>32:63</sub> if (ah < bh) then ch \leftarrow 1 else ch \leftarrow 0 if (al < bl) then cl \leftarrow 1 else cl \leftarrow 0 \operatorname{CR}_{4 \times BF : 4 \times BF + 3} \leftarrow ch || cl || (ch | cl) || (ch & cl) ``` Each element of register RA is compared against the corresponding element of register RB. The results of the comparisons are placed into CR field BF. If RA $_{0:31}$ is less than RB $_{0:31}$ , bit 0 of CR field BF is set to 1, otherwise it is set to 0. If RA $_{32:63}$ is less than RB $_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bit 2 of CR field BF is set to the OR of both result bits and Bit 3 of CR field BF is set to the AND of both result bits. Comparison ignores the sign of 0 (+0 = -0). If an input error occurs and default results are generated, NaNs, Infinities, and Denorms as treated as normalized numbers, using their values of 'e' and 'f directly. #### Special Registers Altered: FINV FINVH FINVS FGH FXH FG FX CR field BF ### Vector Floating-Point Single-Precision Compare Equal EVX-form evfscmpeq BF,RA,RB ``` ah \leftarrow (RA)<sub>0:31</sub> al \leftarrow (RA)<sub>32:63</sub> bh \leftarrow (RB)<sub>0:31</sub> bl \leftarrow (RB)<sub>32:63</sub> if (ah = bh) then ch \leftarrow 1 else ch \leftarrow 0 if (al = bl) then cl \leftarrow 1 else cl \leftarrow 0 \operatorname{CR}_{4\times BF:4\times BF+3} \leftarrow ch || cl || (ch | cl) || (ch & cl) ``` Each element of register RA is compared against the corresponding element of register RB. The results of the comparisons are placed into CR field BF. If RA $_{0:31}$ is equal to RB $_{0:31}$ , bit 0 of CR field BF is set to 1, otherwise it is set to 0. If RA $_{32:63}$ is equal to RB $_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bit 2 of CR field BF is set to the OR of both result bits and Bit 3 of CR field BF is set to the AND of both result bits. Comparison ignores the sign of 0 (+0 = -0). If an input error occurs and default results are generated, NaNs, Infinities, and Denorms as treated as normalized numbers, using their values of 'e' and 'f directly. #### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX CR field BF ### Vector Floating-Point Single-Precision Test Greater Than EVX-form evfststgt BF,RA,RB ``` \begin{array}{l} \text{ah} \leftarrow \text{ (RA)}_{\,0:31} \\ \text{al} \leftarrow \text{ (RA)}_{\,32:63} \\ \text{bh} \leftarrow \text{ (RB)}_{\,0:31} \\ \text{bl} \leftarrow \text{ (RB)}_{\,32:63} \\ \text{if (ah > bh) then ch} \leftarrow 1 \\ \text{else ch} \leftarrow 0 \\ \text{if (al > bl) then cl} \leftarrow 1 \\ \text{else cl} \leftarrow 0 \\ \text{CR}_{4\times BF:4\times BF+3} \leftarrow \text{ch} \mid \mid \text{cl} \mid \mid \text{(ch} \mid \text{cl)} \mid \mid \text{(ch} \& \text{cl)} \end{array} ``` Each element of register RA is compared against the corresponding element of register RB. The results of the comparisons are placed into CR field BF. If $RA_{0:31}$ is greater than $RB_{0:31}$ , bit 0 of CR field BF is set to 1, otherwise it is set to 0. If $RA_{32:63}$ is greater than $RB_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bit 2 of CR field BF is set to the OR of both result bits and Bit 3 of CR field BF is set to the AND of both result bits. Comparison ignores the sign of 0 (+0 = -0). The comparison proceeds after treating NaNs, Infinities, and Denorms as normalized numbers, using their values of 'e' and 'f' directly. No exceptions are taken during the execution of **evfststgt**. ### **Special Registers Altered:** CR field BF ### **Programming Note** In an implementation, the execution of *evfststgt* is likely to be faster than the execution of *evfscmpgt*; however, if strict IEEE 754 compliance is required, the program should use *evfscmpgt*. ### Vector Floating-Point Single-Precision Test Less Than EVX-form evfststlt BF,RA,RB | 4 | BF | // | RA | RB | 669 | | |---|----|----|----|----|------|----| | 0 | 6 | 9 | 11 | 16 | 21 : | 31 | ``` \begin{array}{l} \text{ah} \leftarrow \text{ (RA)}_{\,0:31} \\ \text{al} \leftarrow \text{ (RA)}_{\,32:63} \\ \text{bh} \leftarrow \text{ (RB)}_{\,0:31} \\ \text{bl} \leftarrow \text{ (RB)}_{\,32:63} \\ \text{if (ah < bh) then ch} \leftarrow 1 \\ \text{else ch} \leftarrow 0 \\ \text{if (al < bl) then cl} \leftarrow 1 \\ \text{else cl} \leftarrow 0 \\ \text{CR}_{4\times BF:4\times BF+3} \leftarrow \text{ch} \mid \mid \text{cl} \mid \mid \text{(ch} \mid \text{cl)} \mid \mid \text{(ch \& cl)} \end{array} ``` Each element of register RA is compared with the corresponding element of register RB. The results of the comparisons are placed into CR field BF. If $RA_{0:31}$ is less than $RB_{0:31}$ , bit 0 of CR field BF is set to 1, otherwise it is set to 0. If $RA_{32:63}$ is less than $RB_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bit 2 of CR field BF is set to the OR of both result bits and Bit 3 of CR field BF is set to the AND of both result bits. Comparison ignores the sign of 0 (+0 = -0). The comparison proceeds after treating NaNs, Infinities, and Denorms as normalized numbers, using their values of 'e' and 'f' directly. No exceptions are taken during the execution of **evfststlt**. ### **Special Registers Altered:** CR field BF ### **Programming Note** In an implementation, the execution of *evfststlt* is likely to be faster than the execution of *evfscmplt*; however, if strict IEEE 754 compliance is required, the program should use *evfscmplt*. ### Vector Floating-Point Single-Precision Test Equal EVX-form evfststeq BF,RA,RB | 4 | BF | // | RA | RB | 670 | ٦ | |---|----|----|----|----|------|---| | 0 | 6 | 9 | 11 | 16 | 21 3 | 1 | ``` ah \leftarrow (RA)<sub>0:31</sub> al \leftarrow (RA)<sub>32:63</sub> bh \leftarrow (RB)<sub>0:31</sub> bl \leftarrow (RB)<sub>32:63</sub> if (ah = bh) then ch \leftarrow 1 else ch \leftarrow 0 if (al = bl) then cl \leftarrow 1 else cl \leftarrow 0 \operatorname{CR}_{4\times BF:4\times BF+3} \leftarrow ch || cl || (ch | cl) || (ch & cl) ``` Each element of register RA is compared against the corresponding element of register RB. The results of the comparisons are placed into CR field BF. If $RA_{0:31}$ is equal to $RB_{0:31}$ , bit 0 of CR field BF is set to 1, otherwise it is set to 0. If $RA_{32:63}$ is equal to $RB_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bit 2 of CR field BF is set to the OR of both result bits and Bit 3 of CR field BF is set to the AND of both result bits. Comparison ignores the sign of 0 (+0 = -0). The comparison proceeds after treating NaNs, Infinities, and Denorms as normalized numbers, using their values of 'e' and 'f' directly. No exceptions are taken during the execution of **evfst**steq. ### **Special Registers Altered:** CR field BF ### **Programming Note** In an implementation, the execution of *evfststeq* is likely to be faster than the execution of *evfsc-mpeq*; however, if strict IEEE 754 compliance is required, the program should use *evfscmpeq*. ### Vector Convert Floating-Point Single-Precision from Signed Integer EVX-form evfscfsi RT,RB $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{CnvtI32ToFP32} \, (\,(\mathtt{RB})_{\,0:31}, \ \mathtt{S, \ HI, \ I}) \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{CnvtI32ToFP32} \, (\,(\mathtt{RB})_{\,32:63}, \ \mathtt{S, \ LO, \ I}) \end{array}$$ Each signed integer element of register RB is converted to the nearest single-precision floating-point value using the current rounding mode and the results are placed into the corresponding element of register RT ### **Special Registers Altered:** FGH FXH FG FX FINXS ### Vector Convert Floating-Point Single-Precision from Signed Fraction EVX-form evfscfsf RT,RB | ı | 4 | RT | /// | RB | 659 | |---|---|----|-----|----|------| | | 0 | 6 | 11 | 16 | 21 3 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{CnvtI32ToFP32}\left(\left(\mathtt{RB}\right)_{0:31},\ \mathtt{S},\ \mathtt{HI},\ \mathtt{F}\right) \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{CnvtI32ToFP32}\left(\left(\mathtt{RB}\right)_{32:63},\ \mathtt{S},\ \mathtt{LO},\ \mathtt{F}\right) \end{array}$$ Each signed fractional element of register RB is converted to a single-precision floating-point value using the current rounding mode and the results are placed into the corresponding elements of register RT. ### **Special Registers Altered:** FGH FXH FG FX FINXS ### Vector Convert Floating-Point Single-Precision from Unsigned Integer EVX-form evfscfui RT,RB | | 4 | RT | /// | RB | 656 | |---|---|----|-----|----|------| | 0 | | 6 | 11 | 16 | 21 3 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{CnvtI32ToFP32} \, (\,(\mathtt{RB})_{\,0:31}, \ \mathtt{U}, \ \mathtt{HI}, \ \mathtt{I}\,) \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{CnvtI32ToFP32} \, (\,(\mathtt{RB})_{\,32:63}, \ \mathtt{U}, \ \mathtt{LO}, \ \mathtt{I}\,) \end{array}$$ Each unsigned integer element of register RB is converted to the nearest single-precision floating-point value using the current rounding mode and the results are placed into the corresponding elements of register RT ### **Special Registers Altered:** FGH FXH FG FX FINXS ### Vector Convert Floating-Point Single-Precision from Unsigned Fraction EVX-form evfscfuf RT,RB | 4 | RT | /// | RB | 658 | | |---|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{CnvtI32ToFP32}\left(\left(\mathtt{RB}\right)_{0:31},\ \mathtt{U},\ \mathtt{HI},\ \mathtt{F}\right) \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{CnvtI32ToFP32}\left(\left(\mathtt{RB}\right)_{32:63},\ \mathtt{U},\ \mathtt{LO},\ \mathtt{F}\right) \end{array}$$ Each unsigned fractional element of register RB is converted to a single-precision floating-point value using the current rounding mode and the results are placed into the corresponding elements of register RT. ### **Special Registers Altered:** FGH FXH FG FX FINXS ### Vector Convert Floating-Point Single-Precision to Signed Integer EVX-form evfsctsi RT,RB | 4 | RT | /// | RB | 661 | | |---|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{0:31},\ S,\ HI,\ RND,\ I)} \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{32:63},\ S,\ LO,\ RND,\ I)} \end{array}$$ Each single-precision floating-point element in register RB is converted to a signed integer using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS ### Vector Convert Floating-Point Single-Precision to Unsigned Integer EVX-form evfsctui RT,RB | 4 | RT | /// | RB | 660 | | |---|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{0:31},\ U,\ HI,\ RND,\ I)} \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{32:63},U,\ LO,\ RND,\ I)} \end{array}$ Each single-precision floating-point element in register RB is converted to an unsigned integer using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. #### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS # Vector Convert Floating-Point Single-Precision to Signed Integer with Round toward Zero EVX-form evfsctsiz RT,RB | 4 | RT | /// | RB | 666 | ٦ | |---|----|-----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | $\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{0:31},\ S,\ HI,\ ZER,\ I)} \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{32:63},\ S,\ LO,\ ZER,\ I)} \end{array}$ Each single-precision floating-point element in register RB is converted to a signed integer using the rounding mode Round toward Zero and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS ### Vector Convert Floating-Point Single-Precision to Unsigned Integer with Round toward Zero EVX-form evfsctuiz RT,RB | 4 | RT | /// | RB | 664 | ٦ | |---|----|-----|----|------|----| | 0 | 6 | 11 | 16 | 21 3 | 31 | $$\begin{split} & \text{RT}_{0:31} \leftarrow \text{CnvtFP32ToI32Sat}\left(\left(\text{RB}\right)_{0:31}, \text{ U, HI, ZER, I}\right) \\ & \text{RT}_{32:63} \leftarrow \text{CnvtFP32ToI32Sat}\left(\left(\text{RB}\right)_{32:63}, \text{ U, LO, ZER, I}\right) \end{split}$$ Each single-precision floating-point element in register RB is converted to an unsigned integer using the rounding mode Round toward Zero and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS ### Vector Convert Floating-Point Single-Precision to Signed Fraction EVX-form evfsctsf RT,RB | 4 | RT | /// | RB | 663 | | |---|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \mathtt{RT_{0;31}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{0;31},\ S,\ HI,\ RND\ ,F)} \\ \mathtt{RT_{32;63}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{32:63},\ S,\ LO,\ RND,\ F)} \end{array}$$ Each single-precision floating-point element in register RB is converted to a signed fraction using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit signed fraction. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS ### Vector Convert Floating-Point Single-Precision to Unsigned Fraction EVX-form evfsctuf RT,RB | 4 | RT | /// | RB | 662 | |---|----|-----|----|------| | 0 | 6 | 11 | 16 | 21 3 | $$\begin{array}{l} \mathtt{RT_{0:31}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{0:31},\ U,\ HI,\ RND,\ F)} \\ \mathtt{RT_{32:63}} \leftarrow \mathtt{CnvtFP32ToI32Sat((RB)_{32:63},\ U,\ LO,\ RND,\ F)} \end{array}$$ Each single-precision floating-point element in register RB is converted to an unsigned fraction using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit fraction. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVH FINVS FGH FXH FG FX FINXS # 8.3.3 SPE.Embedded Float Scalar Single Instructions [Category: SPE.Embedded Float Scalar Single] ## Floating-Point Single-Precision Absolute Value EVX-form efsabs RT,RA $$RT_{32:63} \leftarrow 0b0 \mid \mid (RA)_{33:63}$$ The sign bit of the low element of register RA is set to 0 and the result is placed into the low element of register RT Regardless of the value of register RA, no exceptions are taken during the execution of this instruction. ### **Special Registers Altered:** None # Floating-Point Single-Precision Negate EVX-form efsneg RT,RA | 4 | RT | RA | /// | 710 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{32:63} \leftarrow \neg (RA)_{32} \mid | (RA)_{33:63}$$ The sign bit of the low element of register RA is complemented and the result is placed into the low element of register RT. Regardless of the value of register RA, no exceptions are taken during the execution of this instruction. ### Special Registers Altered: None ### Floating-Point Single-Precision Negative Absolute Value EVX-form efsnabs RT.RA | 4 | RT | RA | /// | 709 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{32:63} \leftarrow Ob1 \mid \mid (RA)_{33:63}$$ The sign bit of the low element of register RA is set to 1 and the result is placed into the low element of register RT Regardless of the value of register RA, no exceptions are taken during the execution of this instruction. ### **Special Registers Altered:** ### Floating-Point Single-Precision Add EVX-form efsadd RT,RA,RB | Ī | 4 | RT | RA | RB | 704 | |---|---|----|----|----|-------| | ١ | 0 | 6 | 11 | 16 | 21 31 | $$\mathtt{RT}_{32:63} \leftarrow \ (\mathtt{RA})_{\,32:63} +_{sp} \ (\mathtt{RB})_{\,32:63}$$ The low element of register RA is added to the low element of register RB and the result is stored in the low element of register RT. If an underflow occurs, +0 (for rounding modes RN, RZ, RP) or -0 (for rounding mode RM) is stored in register RT. ### **Special Registers Altered:** FINV FINVS FOVF FOVFS FUNF FUNFS FG FX FINXS # Floating-Point Single-Precision Multiply EVX-form efsmul RT,RA,RB | 4 | RT | RA | RB | 712 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{32:63} \leftarrow (RA)_{32:63} \times_{sp} (RB)_{32:63}$$ The low element of register RA is multiplied by the low element of register RB and the result is stored in the low element of register RT. ### **Special Registers Altered:** FINV FINVS FOVF FOVFS FUNF FUNFS FG FX FINXS ### Floating-Point Single-Precision Subtract EVX-form efssub RT,RA,RB | | 4 | RT | RA | RB | 705 | $\neg$ | |---|---|----|----|----|-----|--------| | 0 | | 6 | 11 | 16 | 21 | 31 | $$RT_{32:63} \leftarrow (RA)_{32:63} -_{sp} (RB)_{32:63}$$ The low element of register RB is subtracted from the low element of register RA and the result is stored in the low element of register RT. If an underflow occurs, +0 (for rounding modes RN, RZ, RP) or -0 (for rounding mode RM) is stored in register RT. ### **Special Registers Altered:** FINV FINVS FOVF FOVFS FUNF FUNFS FG FX FINXS ### Floating-Point Single-Precision Divide EVX-form efsdiv RT,RA,RB | 4 | RT | RA | RB | 713 | | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{32:63} \leftarrow (RA)_{32:63} \div_{sp} (RB)_{32:63}$$ The low element of register RA is divided by the low element of register RB and the result is stored in the low element of register RT. ### **Special Registers Altered:** FINV FINVS FG FX FINXS FDBZ FDBZS FOVF FOVFS FUNF FUNFS ### Floating-Point Single-Precision Compare Greater Than EVX-form efscmpgt BF,RA,RB | 4 | BF | // | RA | RB | 716 | | |---|----|----|----|----|------|---| | 0 | 6 | 9 | 11 | 16 | 21 3 | 1 | ``` al \leftarrow (RA) 32:63 bl \leftarrow (RB) 32:63 if (al > bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4\timesBF:4\timesBF+3 \leftarrow undefined || cl || undefined || undefined</sub> ``` The low element of register RA is compared against the low element of register RB. The results of the comparisons are placed into CR field BF. If $RA_{32:63}$ is greater than $RB_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). If an Input Error occurs and default results are generated, NaNs, Infinities, and Denorms are treated as normalized numbers, using their values of 'e' and 'f directly. ### **Special Registers Altered:** FINV FINVS FG FX CR field BF ### Floating-Point Single-Precision Compare Less Than EVX-form efscmplt BF,RA,RB | 4 | BF | // | RA | RB | 717 | |---|----|----|----|----|-------| | 0 | 6 | 9 | 11 | 16 | 21 31 | ``` al \leftarrow (RA) 32:63 bl \leftarrow (RB) 32:63 if (al < bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4\timesBF:4\timesBF+3 \leftarrow undefined || cl || undefined || undefined</sub> ``` The low element of register RA is compared against the low element of register RB. If $RA_{32:63}$ is less than $RB_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). If an Input Error occurs and default results are generated, NaNs, Infinities, and Denorms are treated as normalized numbers, using their values of 'e' and 'f directly. ### **Special Registers Altered:** FINV FINVS FG FX CR field BF # Floating-Point Single-Precision Compare Equal EVX-form efscmpeq BF,RA,RB | | 4 | BF | // | RA | RB | 718 | | |---|---|----|----|----|----|-------|---| | C | ) | 6 | 9 | 11 | 16 | 21 31 | í | ``` al \leftarrow (RA) 32:63 bl \leftarrow (RB) 32:63 if (al = bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4\timesBF:4\timesBF+3 \leftarrow undefined || cl || undefined || undefined</sub> ``` The low element of register RA is compared against the low element of register RB. If $RA_{32:63}$ is equal to $RB_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). If an Input Error occurs and default results are generated, NaNs, Infinities, and Denorms are treated as normalized numbers, using their values of 'e' and 'f directly. ### **Special Registers Altered:** FINV FINVS FG FX CR field BF ### Floating-Point Single-Precision Test Greater Than EVX-form efststgt BF,RA,RB | 4 | BF | // | RA | RB | 732 | ٦ | |---|----|----|----|----|-----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | ``` al \leftarrow (RA) _{32:63} bl \leftarrow (RB) _{32:63} if (al > bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4×BF:4×BF+3</sub> \leftarrow undefined || cl || undefined || undefined ``` The low element of register RA is compared against the low element of register RB. If $RA_{32:63}$ is greater than $RB_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). The comparison proceeds after treating NaNs, Infinities, and Denorms as normalized numbers, using their values of 'e' and 'f' directly. No exceptions are generated during the execution of efststgt. ### **Special Registers Altered:** CR field BF ### Programming Note In an implementation, the execution of *efststgt* is likely to be faster than the execution of *efscmpgt*; however, if strict IEEE 754 compliance is required, the program should use *efscmpgt*. ### Floating-Point Single-Precision Test Less Than EVX-form efststlt BF,RA,RB | ĺ | 4 | BF | // | RA | RB | 733 | | |---|---|----|----|----|----|------|----| | ١ | 0 | 6 | 9 | 11 | 16 | 21 3 | 31 | ``` al \leftarrow (RA) 32:63 bl \leftarrow (RB) 32:63 if (al < bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4\timesBF:4\timesBF+3 \leftarrow undefined || cl || undefined || undefined</sub> ``` The low element of register RA is compared against the low element of register RB. If $RA_{32:63}$ is less than $RB_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). The comparison proceeds after treating NaNs, Infinities, and Denorms as normalized numbers, using their values of 'e' and 'f directly. No exceptions are generated during the execution of efststlt. ### **Special Registers Altered:** CR field BF ### **Programming Note** In an implementation, the execution of *efststlt* is likely to be faster than the execution of *efscmplt*; however, if strict IEEE 754 compliance is required, the program should use *efscmplt*. ### Floating-Point Single-Precision Test Equal EVX-form efststeq BF,RA,RB | 4 | BF | // | RA | RB | 734 | | |---|----|----|----|----|------|---| | 0 | 6 | 9 | 11 | 16 | 21 3 | 1 | ``` al \leftarrow (RA) 32:63 bl \leftarrow (RB) 32:63 if (al = bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4\timesBF:4\timesBF+3 \leftarrow undefined || cl || undefined || undefined</sub> ``` The low element of register RA is compared against the low element of register RB. If $RA_{32:63}$ is equal to $RB_{32:63}$ , bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). The comparison proceeds after treating NaNs, Infinities, and Denorms as normalized numbers, using their values of 'e' and 'f directly. No exceptions are generated during the execution of *efststeq*. ### **Special Registers Altered:** CR field BF ### **Programming Note** In an implementation, the execution of **efststeq** is likely to be faster than the execution of **efscmpeq**; however, if strict IEEE 754 compliance is required, the program should use **efscmpeq**. # Convert Floating-Point Single-Precision from Signed Integer EVX-form efscfsi RT,RB | 4 | RT | /// | RB | 721 | |---|----|-----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | $\mathtt{RT}_{32:63} \leftarrow \mathtt{CnvtI32ToFP32} \left( \, (\mathtt{RB})_{\, 32:63}, \ \, \mathtt{S}, \ \, \mathtt{LO}, \ \, \mathtt{I} \right)$ The signed integer low element in register RB is converted to a single-precision floating-point value using the current rounding mode and the result is placed into the low element of register RT. ### **Special Registers Altered:** FINXS FG FX # Convert Floating-Point Single-Precision from Signed Fraction EVX-form efscfsf RT,RB | | 4 | RT | /// | RB | 723 | | |---|---|----|-----|----|-----|----| | _ | | | 44 | 10 | 24 | 24 | | U | 1 | ь | 11 | 16 | 21 | 31 | $RT_{32:63} \leftarrow CnvtI32ToFP32((RB)_{32:63}, S, LO, F)$ The signed fractional low element in register RB is converted to a single-precision floating-point value using the current rounding mode and the result is placed into the low element of register RT. ### **Special Registers Altered:** FINXS FG FX # Convert Floating-Point Single-Precision to Signed Integer EVX-form efsctsi RT,RB | 4 | RT | /// | RB | 725 | |---|----|-----|----|------| | 0 | 6 | 11 | 16 | 21 3 | $\mathtt{RT_{32:63}} \leftarrow \mathtt{CnvtFP32ToI32Sat}\left(\left(\mathtt{RB}\right)_{32:63}, \ \mathtt{S, LO, RND, I}\right)$ The single-precision floating-point low element in register RB is converted to a signed integer using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX # Convert Floating-Point Single-Precision from Unsigned Integer EVX-form efscfui RT,RB | 4 | RT | /// | RB | 720 | ٦ | |---|----|-----|----|------|----| | 0 | 6 | 11 | 16 | 21 3 | 31 | $\mathtt{RT}_{32:63} \leftarrow \mathtt{CnvtI32ToFP32} \, (\, \mathtt{(RB)}_{\, 32:63}, \ \mathtt{U}, \ \mathtt{LO}, \ \mathtt{I})$ The unsigned integer low element in register RB is converted to a single-precision floating-point value using the current rounding mode and the result is placed into the low element of register RT. ### **Special Registers Altered:** FINXS FG FX # Convert Floating-Point Single-Precision from Unsigned Fraction EVX-form efscfuf RT,RB | 4 | RT | /// | RB | 722 | | |---|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $RT_{32:63} \leftarrow CnvtI32ToFP32((RB)_{32:63}, U, LO, F)$ The unsigned fractional low element in register RB is converted to a single-precision floating-point value using the current rounding mode and the result is placed into the low element of register RT. ### **Special Registers Altered:** FINXS FG FX # Convert Floating-Point Single-Precision to Unsigned Integer EVX-form efsctui RT,RB | 4 | RT | /// | RB | 724 | | |---|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $RT_{32.63} \leftarrow CnvtFP32ToI32Sat((RB)_{32.63}, U, LO, RND, I)$ The single-precision floating-point low element in register RB is converted to an unsigned integer using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX ### Convert Floating-Point Single-Precision to Signed Integer with Round toward Zero EVX-form efsctsiz RT,RB | Ī | 4 | RT | /// | RB | 730 | |---|---|----|-----|----|------| | ١ | 0 | 6 | 11 | 16 | 21 3 | $$RT_{32:63} \leftarrow CnvtFP32ToI32Sat((RB)_{32:63}, S, LO, ZER, I)$$ The single-precision floating-point low element in register RB is converted to a signed integer using the rounding mode Round toward Zero and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX # Convert Floating-Point Single-Precision to Signed Fraction EVX-form efsctsf RT,RB $$RT_{32:63} \leftarrow CnvtFP32ToI32Sat((RB)_{32:63}, S, LO, RND, F)$$ The single-precision floating-point low element in register RB is converted to a signed fraction using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit fraction. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX # Convert Floating-Point Single-Precision to Unsigned Integer with Round toward Zero EVX-form efsctuiz RT,RB | 4 | RT | /// | RB | 728 | |---|----|-----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | $\mathtt{RT}_{32:63} \leftarrow \mathtt{CnvtFP32ToI32Sat}\left(\left(\mathtt{RB}\right)_{32:63},\ \mathtt{U},\ \mathtt{LO},\ \mathtt{ZER},\ \mathtt{I}\right)$ The single-precision floating-point low element in register RB is converted to an unsigned integer using the rounding mode Round toward Zero and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX # Convert Floating-Point Single-Precision to Unsigned Fraction EVX-form efsctuf RT,RB | 4 | RT | /// | RB | 726 | | |---|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{32:63} \leftarrow CnvtFP32ToI32Sat((RB)_{32:63}, U, LO, RND, F)$$ The single-precision floating-point low element in register RB is converted to an unsigned fraction using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit unsigned fraction. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX ## 8.3.4 SPE.Embedded Float Scalar Double Instructions [Category: SPE.Embedded Float Scalar Double] ### Floating-Point Double-Precision Absolute Value EVX-form efdabs RT,RA | 4 | RT | RA | /// | 740 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:63} \leftarrow 0b0 \mid \mid (RA)_{1:63}$$ The sign bit of register RA is set to 0 and the result is placed in register RT. Regardless of the value of register RA, no exceptions are taken during the execution of this instruction. ### **Special Registers Altered:** None ## Floating-Point Double-Precision Negate EVX-form efdneg RT,RA | 4 | RT | RA | /// | 742 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:63} \leftarrow \neg (RA)_0 \mid \mid (RA)_{1:63}$$ The sign bit of register RA is complemented and the result is placed in register RT. Regardless of the value of register RA, no exceptions are taken during the execution of this instruction. ### **Special Registers Altered:** None ### Floating-Point Double-Precision Negative Absolute Value EVX-form efdnabs RT,RA | 4 | RT | RA | /// | 741 | | |---|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:63} \leftarrow Ob1 \mid \mid (RA)_{1:63}$$ The sign bit of register RA is set to 1 and the result is placed in register RT. Regardless of the value of register RA, no exceptions are taken during the execution of this instruction. ### **Special Registers Altered:** None ### Floating-Point Double-Precision Add EVX-form efdadd RT,RA,RB | 4 | RT | RA | RB | 736 | | |---|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | $$\text{RT}_{\text{0:63}} \leftarrow \text{ (RA)}_{\text{0:63}} +_{\text{dp}} \text{(RB)}_{\text{0:63}}$$ RA is added to RB and the result is stored in register If an underflow occurs, +0 (for rounding modes RN, RZ, RP) or -0 (for rounding mode RM) is stored in register RT. ### **Special Registers Altered:** FINV FINVS FOVF FOVFS FUNF FUNFS FG FX FINXS ## Floating-Point Double-Precision Multiply EVX-form efdmul RT.RA.RB | 4 | RT | RA | RB | 744 | |---|----|----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | $$RT_{0:63} \leftarrow (RA)_{0:63} \times_{dp} (RB)_{0:63}$$ RA is multiplied by RB and the result is stored in register RT. ### **Special Registers Altered:** FINV FINVS FOVF FOVFS FUNF FUNFS FG FX FINXS ## Floating-Point Double-Precision Subtract EVX-form efdsub RT,RA,RB | | 4 | RT | RA | RB | 737 | | |---|---|----|----|----|------|----| | 0 | | 6 | 11 | 16 | 21 : | 31 | $$\text{RT}_{\text{0:63}} \leftarrow \text{(RA)}_{\text{0:63}} \text{-}_{\text{dp}} \text{(RB)}_{\text{0:63}}$$ RB is subtracted from RA and the result is stored in register RT. If an underflow occurs, +0 (for rounding modes RN, RZ, RP) or -0 (for rounding mode RM) is stored in register RT ### **Special Registers Altered:** FINV FINVS FOVF FOVFS FUNF FUNFS FG FX FINXS ## Floating-Point Double-Precision Divide EVX-form efddiv RT,RA,RB | 4 | RT | RA | RB | 745 | ٦ | |---|----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | $$RT_{0:63} \leftarrow (RA)_{0:63} \div_{dp} (RB)_{0:63}$$ RA is divided by RB and the result is stored in register RT. ### **Special Registers Altered:** FINV FINVS FG FX FINXS FDBZ FDBZS FOVF FOVFS FUNF FUNFS ### Floating-Point Double-Precision Compare Greater Than EVX-form efdcmpgt BF,RA,RB | Ī | 4 | BF | // | RA | RB | 748 | |---|---|----|----|----|----|-------| | 0 | ) | 6 | 9 | 11 | 16 | 21 31 | ``` al \leftarrow (RA) 0:63 bl \leftarrow (RB) 0:63 if (al > bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4×BF:4×BF+3</sub> \leftarrow undefined || cl || undefined || undefined ``` RA is compared against RB. If RA is greater than RB, bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). If an input error occurs and default results are generated, NaNs, Infinities, and Denorms are treated as normalized numbers, using their values of 'e' and 'f directly. ### **Special Registers Altered:** FINV FINVS FG FX CR field BF ## Floating-Point Double-Precision Compare Equal EVX-form efdcmpeq BF,RA,RB | | | | | | | _ | |---|----|----|----|----|------|---| | 4 | BF | // | RA | RB | 750 | | | • | | ′′ | | | , 00 | | | 0 | 6 | 9 | 11 | 16 | 21 3 | 1 | ``` al \leftarrow (RA) _{0:63} bl \leftarrow (RB) _{0:63} if (al = bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4×BF+3</sub> \leftarrow undefined || cl || undefined || undefined ``` RA is compared against RB. If RA is equal to RB, bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). If an input error occurs and default results are generated, NaNs, Infinities, and Denorms are treated as normalized numbers, using their values of 'e' and 'f directly. ### **Special Registers Altered:** FINV FINVS FG FX CR field BF ### Floating-Point Double-Precision Compare Less Than EVX-form efdcmplt BF,RA,RB | 4 | BF | // | RA | RB | 749 | ٦ | |---|----|----|----|----|-----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | ``` al \leftarrow (RA) _{0:63} bl \leftarrow (RB) _{0:63} if (al < bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4×BF:4×BF+3</sub> \leftarrow undefined || cl || undefined || undefined ``` RA is compared against RB. If RA is less than RB, bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). If an input error occurs and default results are generated, NaNs, Infinities, and Denorms are treated as normalized numbers, using their values of 'e' and 'f directly. ### **Special Registers Altered:** FINV FINVS FG FX CR field BF ### Floating-Point Double-Precision Test Greater Than EVX-form efdtstgt BF,RA,RB ``` al \leftarrow (RA) _{0:63} bl \leftarrow (RB) _{0:63} if (al > bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4×BF:4×BF+3</sub> \leftarrow undefined || cl || undefined || undefined ``` RA is compared against RB. If RA is greater than RB, bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). The comparison proceeds after treating NaNs, Infinities, and Denorms as normalized numbers, using their values of 'e' and 'f directly. No exceptions are generated during the execution of efdtstgt. ### **Special Registers Altered:** CR field BF ### Programming Note In an implementation, the execution of *efdtstgt* is likely to be faster than the execution of *efdcmpgt*; however, if strict IEEE 754 compliance is required, the program should use *efdcmpgt*. ### Floating-Point Double-Precision Test Less Than EVX-form efdtstlt BF,RA,RB | 4 | BF | // | RA | RB | 765 | | |---|----|----|----|----|------|----| | 0 | 6 | 9 | 11 | 16 | 21 : | 31 | ``` al \leftarrow (RA) 0:63 bl \leftarrow (RB) 0:63 if (al < bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4XBF:4XBF+3</sub> \leftarrow undefined || cl || undefined || undefined ``` RA is compared against RB. If RA is less than RB, bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). The comparison proceeds after treating NaNs, Infinities, and Denorms as normalized numbers, using their values of 'e' and 'f directly. No exceptions are generated during the execution of efdtstlt. ### **Special Registers Altered:** CR field BF ### Programming Note In an implementation, the execution of *efdtstlt* is likely to be faster than the execution of *efdcmplt*, however, if strict IEEE 754 compliance is required, the program should use *efdcmplt*. ### Floating-Point Double-Precision Test Equal EVX-form efdtsteq BF,RA,RB | 4 | BF | // | RA | RB | 766 | | |---|----|----|----|----|------|----| | 0 | 6 | 9 | 11 | 16 | 21 3 | 31 | ``` al \leftarrow (RA) 0.63 bl \leftarrow (RB) 0.63 if (al = bl) then cl \leftarrow 1 else cl \leftarrow 0 CR<sub>4×BF:4×BF+3</sub> \leftarrow undefined || cl || undefined || undefined ``` RA is compared against RB. If RA is equal to RB, bit 1 of CR field BF is set to 1, otherwise it is set to 0. Bits 0, 2, and 3 of CR field BF are undefined. Comparison ignores the sign of 0 (+0 = -0). The comparison proceeds after treating NaNs, Infinities, and Denorms as normalized numbers, using their values of 'e' and 'f directly. No exceptions are generated during the execution of efdtsteg. ### **Special Registers Altered:** CR field BF ### **Programming Note** In an implementation, the execution of *efdtsteq* is likely to be faster than the execution of *efdcmpeq*; however, if strict IEEE 754 compliance is required, the program should use *efdcmpeq*. ## Convert Floating-Point Double-Precision from Signed Integer EVX-form efdcfsi RT,RB | 4 | RT | /// | RB | 753 | |---|----|-----|----|-------| | o | 6 | 11 | 16 | 21 31 | $$\mathbf{RT_{0:63}} \leftarrow \mathtt{CnvtI32ToFP64} \, (\, \mathtt{(RB)}_{\, 32:63}, \ \mathtt{S, \ I})$$ The signed integer low element in register RB is converted to a double-precision floating-point value using the current rounding mode and the result is placed in register RT. ### **Special Registers Altered:** None ## Convert Floating-Point Double-Precision from Unsigned Integer EVX-form efdcfui RT,RB | 4 | RT | /// | RB | 752 | | |---|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\mathtt{RT_{0:63}} \leftarrow \mathtt{CnvtI32ToFP64} \left( \left( \mathtt{RB} \right)_{32:63}, \ \mathtt{U}, \ \mathtt{I} \right)$$ The unsigned integer low element in register RB is converted to a double-precision floating-point value using the current rounding mode and the result is placed in register RT. ### **Special Registers Altered:** None ### Convert Floating-Point Double-Precision from Signed Integer Doubleword EVX-form efdcfsid RT,RB $$RT_{0:63} \leftarrow CnvtI64ToFP64((RB)_{0:63}, S)$$ The signed integer doubleword in register RB is converted to a double-precision floating-point value using the current rounding mode and the result is placed in register RT. ### **Corequisite Categories:** 64-Bit ### **Special Registers Altered:** FINXS FG FX ## Convert Floating-Point Double-Precision from Signed Fraction EVX-form efdcfsf RT,RB $$RT_{0:63} \leftarrow CnvtI32ToFP64((RB)_{32:63}, S, F)$$ The signed fractional low element in register RB is converted to a double-precision floating-point value using the current rounding mode and the result is placed in register RT. ### **Special Registers Altered:** None ## Convert Floating-Point Double-Precision from Unsigned Fraction EVX-form efdcfuf RT,RB | 4 | RT | /// | RR | 754 | | |---|-------|-----|-----|-----|----| | - | 1 1 1 | /// | IND | 754 | | | 0 | 6 | 11 | 16 | 21 | 31 | $$RT_{0:63} \leftarrow CnvtI32ToFP64((RB)_{32:63}, U, F)$$ The unsigned fractional low element in register RB is converted to a double-precision floating-point value using the current rounding mode and the result is placed in register RT. ### Special Registers Altered: None ### Convert Floating-Point Double-Precision from Unsigned Integer Doubleword EVX-form efdcfuid RT,RB | | 4 | RT | /// | RB | 738 | |---|---|----|-----|----|------| | 0 | 1 | 6 | 11 | 16 | 21 3 | $$RT_{0:63} \leftarrow CnvtI64ToFP64 ((RB)_{0:63}, U)$$ The unsigned integer doubleword in register RB is converted to a double-precision floating-point value using the current rounding mode and the result is placed in register RT. ### **Corequisite Categories:** 64-Bit #### Special Registers Altered: FINXS FG FX ## Convert Floating-Point Double-Precision to Signed Integer EVX-form efdctsi RT,RB | 4 | RT | /// | RB | 757 | |---|----|-----|----|------| | 0 | 6 | 11 | 16 | 21 3 | $$RT_{32:63} \leftarrow CnvtFP64ToI32Sat((RB)_{0:63}, S, RND, I)$$ The double-precision floating-point value in register RB is converted to a signed integer using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX ## Convert Floating-Point Double-Precision to Unsigned Integer EVX-form efdctui RT,RB $RT_{32:63} \leftarrow CnvtFP64ToI32Sat((RB)_{0:63}, U, RND, I)$ The double-precision floating-point value in register RB is converted to an unsigned integer using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX # Convert Floating-Point Double-Precision to Signed Integer Doubleword with Round toward Zero EVX-form efdctsidz RT,RB $$RT_{0:63} \leftarrow CnvtFP64ToI64Sat((RB)_{0:63}, S, ZER)$$ The double-precision floating-point value in register RB is converted to a signed integer doubleword using the rounding mode Round toward Zero and the result is saturated if it cannot be represented in a 64-bit integer. NaNs are converted as though they were zero. ### **Corequisite Categories:** 64-Bit ### **Special Registers Altered:** FINV FINVS FINXS FG FX # Convert Floating-Point Double-Precision to Unsigned Integer Doubleword with Round toward Zero EVX-form efdctuidz RT,RB | 4 | RT | /// | RB | 746 | |---|----|-----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | $RT_{0:63} \leftarrow CnvtFP64ToI64Sat((RB)_{0:63}, U, ZER)$ The double-precision floating-point value in register RB is converted to an unsigned integer doubleword using the rounding mode Round toward Zero and the result is saturated if it cannot be represented in a 64-bit integer. NaNs are converted as though they were zero. ### **Corequisite Categories:** 64-Bit ### **Special Registers Altered:** FINV FINVS FINXS FG FX ### Convert Floating-Point Double-Precision to Signed Integer with Round toward Zero EVX-form efdctsiz RT,RB $$RT_{32:63} \leftarrow CnvtFP64ToI32Sat((RB)_{0:63}, S, ZER, I)$$ The double-precision floating-point value in register RB is converted to a signed integer using the rounding mode Round toward Zero and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX ## Convert Floating-Point Double-Precision to Signed Fraction EVX-form efdctsf RT,RB | 4 | RT | /// | RB | 759 | |---|----|-----|----|-------| | 0 | 6 | 11 | 16 | 21 31 | $$RT_{32:63} \leftarrow CnvtFP64ToI32Sat((RB)_{0:63}, S, RND, F)$$ The double-precision floating-point value in register RB is converted to a signed fraction using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit fraction. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX ## Convert Floating-Point Double-Precision to Unsigned Fraction EVX-form efdctuf RT,RB | Ī | 4 | RT | /// | RB | 758 | | |---|---|----|-----|----|-----|----| | ( | ) | 6 | 11 | 16 | 21 | 31 | $$\text{RT}_{32:63} \leftarrow \text{CnvtFP64ToI32Sat((RB)}_{0:63}, \text{U, RND, F)}$$ The double-precision floating-point value in register RB is converted to an unsigned fraction using the current rounding mode and the result is saturated if it cannot be represented in a 32-bit unsigned fraction. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX # Convert Floating-Point Double-Precision to Unsigned Integer with Round toward Zero EVX-form efdctuiz RT,RB | 4 | RT | /// | RB | 760 | | |---|----|-----|----|------|----| | 0 | 6 | 11 | 16 | 21 : | 31 | $$RT_{32:63} \leftarrow CnvtFP64ToI32Sat((RB)_{0:63}, U, ZER, I)$$ The double-precision floating-point value in register RB is converted to an unsigned integer using the rounding mode Round toward Zero and the result is saturated if it cannot be represented in a 32-bit integer. NaNs are converted as though they were zero. ### **Special Registers Altered:** FINV FINVS FINXS FG FX ## Floating-Point Double-Precision Convert from Single-Precision EVX-form efdcfs RT,RB ``` FP32format f; FP64format result; f \leftarrow (RB)_{32:63} if (f_{exp} = 0) \& (f_{frac} = 0)) then result \leftarrow f_{sign} \mid\mid ^{63}0 else if Isa32NaNorInfinity(f) | Isa32Denorm(f) then SPEFSCR_{FINV} \leftarrow 1 result \leftarrow f_{sign} \mid\mid 0 bb1111111111110 \mid\mid ^{52}1 else if Isa32Denorm(f) then SPEFSCR_{FINV} \leftarrow 1 result \leftarrow f_{sign} \mid\mid ^{63}0 else result \leftarrow f_{sign} \mid\mid ^{63}0 else result_{sign} \leftarrow f_{sign} result_exp \leftarrow f_{exp} - 127 + 1023 result_frac \leftarrow f_{frac} \mid\mid ^{29}0 RT<sub>0:63</sub> \leftarrow result ``` The single-precision floating-point value in the low element of register RB is converted to a double-precision floating-point value and the result is placed in register RT. ### **Corequisite Categories:** SPE.Embedded Float Scalar Single or SPE.Embedded Float Vector ### **Special Registers Altered:** FINV FINVS FG FX ## Floating-Point Single-Precision Convert from Double-Precision EVX-form efscfd RT,RB ``` FP64format f; FP32format result; f \leftarrow (RB)_{0:63} if (f_{exp} = 0) & (f_{frac} = 0)) then result \leftarrow f_{sign} \mid \mid | ^{31}0 else if Isa64NaNorInfinity(f) then \texttt{SPEFSCR}_{\texttt{FINV}} \; \leftarrow \; \texttt{1} result \leftarrow f<sub>sign</sub> || 0b111111110 || ^{23}1 else if Isa64Denorm(f) then SPEFSCR_{FINV} \leftarrow 1 result \stackrel{\text{rinv}}{\leftarrow} f_{\text{sign}} | |^{31}0 unbias \leftarrow f<sub>exp</sub> - 1023 if unbias > 127 then result \leftarrow f<sub>sign</sub> || 0b111111110 || ^{23}1 \label{eq:spefscr} \text{SPEFSCR}_{\text{FOVF}} \xleftarrow{} 1 \\ \text{else if unbias} < -126 \text{ then} result \leftarrow f<sub>sign</sub> || 0b00000001 || <sup>23</sup>0 SPEFSCR_{FUNF} \leftarrow 1 else \texttt{result}_{\texttt{sign}} \leftarrow \texttt{f}_{\texttt{sign}} result_{exp} \leftarrow unbias + 127 result<sub>frac</sub> \leftarrow f<sub>frac[0:22]</sub> guard \leftarrow f<sub>frac[23]</sub> sticky \leftarrow (f_{frac[24:51]} \neq 0) result ← Round32(result, LO, guard, sticky) \texttt{SPEFSCR}_{\texttt{FG}} \; \leftarrow \; \texttt{guard} \begin{array}{l} \mathtt{SPEFSCR}_{\mathtt{FX}} \leftarrow \mathtt{sticky} \\ \mathtt{if guard} \mid \mathtt{sticky then} \end{array} SPEFSCR<sub>FINXS</sub> ← 1 RT_{32:63} \leftarrow result ``` The double-precision floating-point value in register RB is converted to a single-precision floating-point value using the current rounding mode and the result is placed into the low element of register RT. ### **Corequisite Categories:** SPE.Embedded Float Scalar Scalar ### **Special Registers Altered:** FINV FINVS FOVF FOVFS FUNF FUNFS FG FX FINXS ### 8.4 Embedded Floating-Point Results Summary The following tables summarize the results of various types of *Embedded Floating-Point* operations on various combinations of input operands. Flag settings are performed on appropriate element flags. For all the tables the following annotation and general rules apply: - \* denotes that this status flag is set based on the results of the calculation. - \_Calc\_ denotes that the result is updated with the results of the computation. - max denotes the maximum normalized number with the sign set to the computation [sign(operand A) XOR sign(operand B)]. - amax denotes the maximum normalized number with the sign set to the sign of Operand A. - bmax denotes the maximum normalized number with the sign set to the sign of Operand B. - pmax denotes the maximum normalized positive number. The encoding for single-precision is: 0x7F7FFFF. The encoding for double-precision is: 0x7FEFFFFF FFFFFFFF. - nmax denotes the maximum normalized negative number. The encoding for single-precision is: 0xFF7FFFF. The encoding for double-precision is: 0xFFEFFFFF\_FFFFFFF. - *pmin* denotes the minimum normalized positive number. The encoding for single-precision is: 0x00800000. The encoding for double-precision is: 0x00100000\_00000000. - nmin denotes the minimum normalized negative number. The encoding for single-precision is: 0x80800000. The encoding for double-precision is: 0x80100000 000000000. - Calculations that overflow or underflow saturate. Overflow for operations that have a floating-point result force the result to *max*. Underflow for operations that have a floating-point result force the result to zero. Overflow for operations that have a signed integer result force the result to 0x7FFFFFFF (positive) or 0x80000000 (negative). Overflow for operations that have an unsigned integer result force the result to 0xFFFFFFFF (positive) or 0x000000000 (negative). - 1 (superscript) denotes that the sign of the result is positive when the sign of Operand A and the sign of Operand B are different, for all rounding modes except round to -infinity, where the sign of the result is then negative. - 2 (superscript) denotes that the sign of the result is positive when the sign of Operand A and the sign of Operand B are the same, for all rounding modes except round to -infinity, where the sign of the result is then negative. - Superscript) denotes that the sign for any multiply or divide is always the result of the operation [sign(Operand A) XOR sign(Operand B)]. - 4 (superscript) denotes that if an overflow is detected, the result may be saturated. | Table 3: En | nbedded Floa | ting-Point Re | esults Summar | y—Add | l, Sub, I | Mul, Div | | | | |-------------|--------------|---------------|------------------------|-------|-----------|----------|------|------|--| | Operation | Operand A | Operand B | Result | FINV | FOVF | FUNF | FDBZ | FINX | | | | Add | | | | | | | | | | Add | 8 | 8 | amax | 1 | 0 | 0 | 0 | 0 | | | Add | 8 | NaN | amax | 1 | 0 | 0 | 0 | 0 | | | Add | 8 | denorm | amax | 1 | 0 | 0 | 0 | 0 | | | Add | ∞ | zero | amax | 1 | 0 | 0 | 0 | 0 | | | Add | 8 | Norm | amax | 1 | 0 | 0 | 0 | 0 | | | Add | NaN | 8 | amax | 1 | 0 | 0 | 0 | 0 | | | Add | NaN | NaN | amax | 1 | 0 | 0 | 0 | 0 | | | Add | NaN | denorm | amax | 1 | 0 | 0 | 0 | 0 | | | Add | NaN | zero | amax | 1 | 0 | 0 | 0 | 0 | | | Add | NaN | norm | amax | 1 | 0 | 0 | 0 | 0 | | | Add | denorm | 8 | bmax | 1 | 0 | 0 | 0 | 0 | | | Add | denorm | NaN | bmax | 1 | 0 | 0 | 0 | 0 | | | Add | denorm | denorm | zero <sup>1</sup> | 1 | 0 | 0 | 0 | 0 | | | Add | denorm | zero | zero <sup>1</sup> | 1 | 0 | 0 | 0 | 0 | | | Add | denorm | norm | operand_b <sup>4</sup> | 1 | 0 | 0 | 0 | 0 | | | Add | zero | 8 | bmax | 1 | 0 | 0 | 0 | 0 | | | Add | zero | NaN | bmax | 1 | 0 | 0 | 0 | 0 | | | Add | zero | denorm | zero <sup>1</sup> | 1 | 0 | 0 | 0 | 0 | | | <b>FUNF</b><br>0<br>0 | <b>FDBZ</b> 0 | <b>FINX</b> 0 | |-----------------------|----------------------------------------------------------------------|---------------------------------------| | | 0 | 0 | | 0 | | | | | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | * | 0 | * | | | | | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | * | 0 | * | | | <u>.</u> ! | <u> </u> | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | 0 | 0 | 0 | | U | | | | 0 | 0 | 0 | | | 0<br>* 0<br>0<br>0<br>0<br>0<br>0<br>0<br>0<br>0<br>0<br>0<br>0<br>0 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | | | i | | sults Summar | <u> </u> | | | | | |-----------|-----------|-----------|---------------------|----------|------|------|------|------| | Operation | Operand A | Operand B | Result | FINV | FOVF | FUNF | FDBZ | FINX | | Mul | denorm | 8 | zero | 1 | 0 | 0 | 0 | 0 | | Mul | denorm | NaN | zero | 1 | 0 | 0 | 0 | 0 | | Mul | denorm | denorm | zero | 1 | 0 | 0 | 0 | 0 | | Mul | denorm | zero | zero | 1 | 0 | 0 | 0 | 0 | | Mul | denorm | norm | zero | 1 | 0 | 0 | 0 | 0 | | Mul | zero | ∞ | zero | 1 | 0 | 0 | 0 | 0 | | Mul | zero | NaN | zero | 1 | 0 | 0 | 0 | 0 | | Mul | zero | denorm | zero | 1 | 0 | 0 | 0 | 0 | | Mul | zero | zero | zero | 0 | 0 | 0 | 0 | 0 | | Mul | zero | norm | zero | 0 | 0 | 0 | 0 | 0 | | Mul | norm | 8 | max | 1 | 0 | 0 | 0 | 0 | | Mul | norm | NaN | max | 1 | 0 | 0 | 0 | 0 | | Mul | norm | denorm | zero | 1 | 0 | 0 | 0 | 0 | | Mul | norm | zero | zero | 0 | 0 | 0 | 0 | 0 | | Mul | norm | norm | _Calc_ | 0 | * | * | 0 | * | | | | | Divide <sup>3</sup> | | | | | | | Div | ∞ | 8 | zero | 1 | 0 | 0 | 0 | 0 | | Div | ∞ | NaN | zero | 1 | 0 | 0 | 0 | 0 | | Div | ∞ | denorm | max | 1 | 0 | 0 | 0 | 0 | | Div | ∞ | zero | max | 1 | 0 | 0 | 0 | 0 | | Div | ∞ | Norm | max | 1 | 0 | 0 | 0 | 0 | | Div | NaN | 8 | zero | 1 | 0 | 0 | 0 | 0 | | Div | NaN | NaN | zero | 1 | 0 | 0 | 0 | 0 | | Div | NaN | denorm | max | 1 | 0 | 0 | 0 | 0 | | Div | NaN | zero | max | 1 | 0 | 0 | 0 | 0 | | Div | NaN | norm | max | 1 | 0 | 0 | 0 | 0 | | Div | denorm | 8 | zero | 1 | 0 | 0 | 0 | 0 | | Div | denorm | NaN | zero | 1 | 0 | 0 | 0 | 0 | | Div | denorm | denorm | max | 1 | 0 | 0 | 0 | 0 | | Div | denorm | zero | max | 1 | 0 | 0 | 0 | 0 | | Div | denorm | norm | zero | 1 | 0 | 0 | 0 | 0 | | Div | zero | 8 | zero | 1 | 0 | 0 | 0 | 0 | | Div | zero | NaN | zero | 1 | 0 | 0 | 0 | 0 | | Div | zero | denorm | max | 1 | 0 | 0 | 0 | 0 | | Div | zero | zero | max | 1 | 0 | 0 | 0 | 0 | | Div | zero | norm | zero | 0 | 0 | 0 | 0 | 0 | | Div | norm | ∞ | zero | 1 | 0 | 0 | 0 | 0 | | Div | norm | NaN | zero | 1 | 0 | 0 | 0 | 0 | | Div | norm | denorm | max | 1 | 0 | 0 | 0 | 0 | | Div | norm | zero | max | 0 | 0 | 0 | 1 | 0 | | Div | norm | norm | _Calc_ | 0 | * | * | 0 | * | | Table 4: Embedded Floating-Point Results Summary—Single Convert from Double | | | | | | | |-----------------------------------------------------------------------------|---------------|------|------|------|------|------| | Operand B | efscfd result | FINV | FOVF | FUNF | FDBZ | FINX | | +∞ | pmax | 1 | 0 | 0 | 0 | 0 | | -∞ | nmax | 1 | 0 | 0 | 0 | 0 | | +NaN | pmax | 1 | 0 | 0 | 0 | 0 | | -NaN | nmax | 1 | 0 | 0 | 0 | 0 | | +denorm | +zero | 1 | 0 | 0 | 0 | 0 | | -denorm | -zero | 1 | 0 | 0 | 0 | 0 | | +zero | +zero | 0 | 0 | 0 | 0 | 0 | | -zero | -zero | 0 | 0 | 0 | 0 | 0 | | norm | _Calc_ | 0 | * | * | 0 | * | | Table 5: Embedded Floating-Point Results Summary—Double Convert from Single | | | | | | | |-----------------------------------------------------------------------------|---------------|------|------|------|------|------| | Operand B | efdcfs result | FINV | FOVF | FUNF | FDBZ | FINX | | +∞ | pmax | 1 | 0 | 0 | 0 | 0 | | -∞ | nmax | 1 | 0 | 0 | 0 | 0 | | +NaN | pmax | 1 | 0 | 0 | 0 | 0 | | -NaN | nmax | 1 | 0 | 0 | 0 | 0 | | +denorm | +zero | 1 | 0 | 0 | 0 | 0 | | -denorm | -zero | 1 | 0 | 0 | 0 | 0 | | +zero | +zero | 0 | 0 | 0 | 0 | 0 | | -zero | -zero | 0 | 0 | 0 | 0 | 0 | | norm | _Calc_ | 0 | 0 | 0 | 0 | 0 | | Table 6: Embedded Floating-Point Results Summary—Convert to Unsigned | | | | | | | | |----------------------------------------------------------------------|---------------------------------|------------------------|------|------|------|------|------| | Operand B | Integer Result<br>ctui[d][z] | Fractional Result ctuf | FINV | FOVF | FUNF | FDBZ | FINX | | +∞ | 0xFFFF_FFFF<br>0xFFFF_FFFF_FFFF | 0x7FFF_FFFF | 1 | 0 | 0 | 0 | 0 | | -∞ | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | +NaN | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | -NaN | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | denorm | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | zero | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | +norm | _Calc_ | _Calc_ | * | 0 | 0 | 0 | * | | -norm | _Calc_ | _Calc_ | * | 0 | 0 | 0 | * | | Table 7: Embedded Floating-Point Results Summary—Convert to Signed | | | | | | | | |--------------------------------------------------------------------|--------------------------------------|------------------------|------|------|------|------|------| | Operand B | Integer Result<br>ctsi[d][z] | Fractional Result ctsf | FINV | FOVF | FUNF | FDBZ | FINX | | +∞ | 0x7FFF_FFFF<br>0x7FFF_FFFF_FFFF | 0x7FFF_FFFF | 1 | 0 | 0 | 0 | 0 | | -∞ | 0x8000_0000<br>0x8000_0000_0000_0000 | 0x8000_0000 | 1 | 0 | 0 | 0 | 0 | | +NaN | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | -NaN | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | denorm | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | zero | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | +norm | _Calc_ | _Calc_ | * | 0 | 0 | 0 | * | | -norm | _Calc_ | _Calc_ | * | 0 | 0 | 0 | * | | Table 8: Embedded Floating-Point Results Summary—Convert from Unsigned | | | | | | | | |------------------------------------------------------------------------|---------------------|------------------------------------------|---|------|------|------|------| | Operand B | Integer Source cfui | teger Source Fractional Source cfui cfuf | | FOVF | FUNF | FDBZ | FINX | | zero | zero | zero | 0 | 0 | 0 | 0 | 0 | | norm | _Calc_ | _Calc_ | 0 | 0 | 0 | 0 | * | | Table 9: Embedded Floating-Point Results Summary—Convert from Signed | | | | | | | | |----------------------------------------------------------------------|--------------------------------------------|--------|------|------|------|------|------| | Operand B | Integer Source cfsi Fractional Source cfsf | | FINV | FOVF | FUNF | FDBZ | FINX | | zero | zero | zero | 0 | 0 | 0 | 0 | 0 | | norm | _Calc_ | _Calc_ | 0 | 0 | 0 | 0 | * | | Table 10:Embedded Floating-Point Results Summary—*abs, *nabs, *neg | | | | | | | | | |--------------------------------------------------------------------|-----------------|-----------------|-----------------|------|------|------|------|------| | Operand A | *abs | *nabs | *neg | FINV | FOVF | FUNF | FDBZ | FINX | | +∞ | pmax +∞ | nmax -∞ | -amax -∞ | 1 | 0 | 0 | 0 | 0 | | -∞ | pmax +∞ | nmax -∞ | -amax +∞ | 1 | 0 | 0 | 0 | 0 | | +NaN | pmax NaN | nmax -NaN | -amax -NaN | 1 | 0 | 0 | 0 | 0 | | -NaN | pmax NaN | nmax -NaN | -amax +NaN | 1 | 0 | 0 | 0 | 0 | | +denorm | +zero +denorm | -zero -denorm | -zero -denorm | 1 | 0 | 0 | 0 | 0 | | -denorm | +zero +denorm | -zero -denorm | +zero +denorm | 1 | 0 | 0 | 0 | 0 | | +zero | +zero | -zero | -zero | 0 | 0 | 0 | 0 | 0 | | -zero | +zero | -zero | +zero | 0 | 0 | 0 | 0 | 0 | | +norm | +norm | -norm | -norm | 0 | 0 | 0 | 0 | 0 | | -norm | +norm | -norm | +norm | 0 | 0 | 0 | 0 | 0 | # Chapter 9. Legacy Move Assist Instruction [Category: Legacy Move Assist] 31 ### Determine Leftmost Zero Byte X-form ### Special Registers Altered: XER<sub>57:63</sub> CR0 R0 (if Rc=1) | dlmzb<br>dlmzb. | , | S,RB<br>S,RB | | | (Rc=0)<br>(Rc=1) | |-----------------|----|--------------|----|----|------------------| | 31 | RS | RA | RB | 78 | Rc | 16 ``` d_{0:63} \leftarrow (RS)_{32:63} \mid \mid (RB)_{32:63} i \leftarrow 0 x \leftarrow 0 y ← 0 do while (x<8) & (y=0) x \leftarrow x + 1 if d_{i+32:i+39} = 0 then y ← 1 else i ← i + 8 RA \leftarrow x XER_{57:63} \leftarrow x if Rc = 1 then do CR<sub>35</sub> ← SO if y = 1 then do if x<5 then CR_{32:34} \leftarrow 0b010 else CR_{32:34} \leftarrow 0b100 else CR_{32:34} \leftarrow 0b001 ``` The contents of bits 32:63 of register RS and the contents of bits 32:63 of register RB are concatenated to form an 8-byte operand. The operand is searched for the leftmost byte in which each bit is 0 (i.e., a null byte). Bytes in the operand are numbered from left to right starting with 1. If a null byte is found, its byte number is placed into bits 57:63 of the XER and into register RA. Otherwise, the value <code>0b000\_1000</code> is placed into both bits 57:63 of the XER and register RA. If Rc is equal to 1, SO is copied into bit 35 of the CR and bits 32:34 of the CR are updated as follows: - If no null byte is found, bits 32:34 of the CR are set to 0b001. - If the leftmost null byte is in the first 4 bytes (i.e., from register RS), bits 32:34 of the CR are set to 0b010. - If the leftmost null byte is in the last 4 bytes (i.e., from register RB), bits 32:34 of the CR are set to 0b100. # Chapter 10. Legacy Integer Multiply-Accumulate Instructions [Category: Legacy Integer Multiply-Accumulate] The Legacy Integer Multiply-Accumulate instructions with Rc=1 set the first three bits of CR Field 0 based on the 32-bit result, as described in Section 3.3.7, "Other Fixed-Point Instructions". The XO-form Legacy Integer Multiply-Accumulate instructions set SO and OV when OE=1 to reflect over-flow of the 32-bit result. ### **Programming Note** Notice that CR Field 0 may not reflect the "true" (infinitely precise) result if overflow occurs. ## Multiply Accumulate Cross Halfword to Word Modulo Signed XO-form | macchw | RT,RA,RB | (OE=0 Rc=0) | |----------|----------|-------------| | macchw. | RT,RA,RB | (OE=0 Rc=1) | | macchwo | RT,RA,RB | (OE=1 Rc=0) | | macchwo. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 172 | Rc | |---|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` \begin{array}{l} \operatorname{prod}_{0:31} \leftarrow & (\operatorname{RA})_{48:63} \times_{\operatorname{si}} & (\operatorname{RB})_{32:47} \\ \operatorname{temp}_{0:32} \leftarrow & \operatorname{prod}_{0:31} + & (\operatorname{RT})_{32:63} \\ \operatorname{RT}_{32:63} \leftarrow & \operatorname{temp}_{1:32} \\ \operatorname{RT}_{0:31} \leftarrow & \operatorname{undefined} \end{array} ``` The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB. The 32-bit signed-integer product is added to the signed-integer word in bits 32:63 of register RT. The low-order 32 bits of the sum are placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** | SO OV | (if OE=1) | |-------|-----------| | CR0 | (if Rc=1) | ## Multiply Accumulate Cross Halfword to Word Saturate Signed XO-form | macchws | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | macchws. | RT,RA,RB | (OE=0 Rc=1) | | macchwso | RT,RA,RB | (OE=1 Rc=0) | | macchwso. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 236 | Rc | |---|----|----|----|-------|-----|----| | 0 | 6 | 11 | 16 | 21 22 | | 31 | The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB. The 32-bit signed-integer product is added to the signed-integer word in bits 32:63 of register RT. If the sum is less than -2 $^{31}$ , then the value $0x8000\_0000$ is placed into bits 32:63 of register RT. If the sum is greater than $2^{31}$ -1, then the value $0x7FFF_FFFF$ is placed into bits 32:63 of register RT. Otherwise, the sum is placed into bits 32:63 of register RT The contents of bits 0:31 of register RT are undefined. ## Multiply Accumulate Cross Halfword to Word Modulo Unsigned XO-form | macchwu | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | macchwu. | RT,RA,RB | (OE=0 Rc=1) | | macchwuo | RT,RA,RB | (OE=1 Rc=0) | | macchwuo. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 140 | Rc | |---|----|----|----|-------|-----|----| | 0 | 6 | 11 | 16 | 21 22 | | 31 | $$\begin{array}{l} \operatorname{prod}_{0:31} \leftarrow \left( \operatorname{RA} \right)_{48:63} \times_{\operatorname{ui}} \left( \operatorname{RB} \right)_{32:47} \\ \operatorname{temp}_{0:32} \leftarrow \operatorname{prod}_{0:31} + \left( \operatorname{RT} \right)_{32:63} \\ \operatorname{RT} \leftarrow \operatorname{temp}_{1:32} \end{array}$$ The unsigned-integer halfword in bits 48:63 of register RA is multiplied by the unsigned-integer halfword in bits 32:47 of register RB. The 32-bit unsigned-integer product is added to the unsigned-integer word in bits 32:63 of register RT. The low-order 32 bits of the sum are placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** ## Multiply Accumulate Cross Halfword to Word Saturate Unsigned XO-form | RT,RA,RB | | (OE=0 Rc=0) | |------------|------------------------------------------------|----------------------| | RT,RA,RB | | (OE=0 Rc=1) | | RT,RA,RB | | (OE=1 Rc=0) | | . RT,RA,RB | | (OE=1 Rc=1) | | | RT,RA,RB<br>RT,RA,RB<br>RT,RA,RB<br>. RT,RA,RB | RT,RA,RB<br>RT,RA,RB | | 4 | RT | RA | RB | OE | 204 | Rc | |---|----|----|----|-------|-----|----| | 0 | 6 | 11 | 16 | 21 22 | | 31 | The unsigned-integer halfword in bits 48:63 of register RA is multiplied by the unsigned-integer halfword in bits 32:47 of register RB. The 32-bit unsigned-integer product is added to the unsigned-integer word in bits 32:63 of register RT. If the sum is greater than $2^{32}$ -1, then the value 0xFFFF FFFF is placed into bits 32:63 of register RT. Otherwise, the sum is placed into bits 32:63 of register RT The contents of bits 0:31 of register RT are undefined. ## Multiply Accumulate High Halfword to Word Modulo Signed XO-form | machhw | RT,RA,RB | (OE=0 Rc=0) | |----------|----------|-------------| | machhw. | RT,RA,RB | (OE=0 Rc=1) | | machhwo | RT,RA,RB | (OE=1 Rc=0) | | machhwo. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 44 | Rc | |---|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` \begin{array}{l} \text{prod}_{0:31} \leftarrow \text{(RA)}_{32:47} \times_{\text{si}} \text{(RB)}_{32:47} \\ \text{temp}_{0:32} \leftarrow \text{prod}_{0:31} + \text{(RT)}_{32:63} \\ \text{RT}_{32:63} \leftarrow \text{temp}_{1:32} \\ \text{RT}_{0:31} \leftarrow \text{undefined} \end{array} ``` The signed-integer halfword in bits 32:47 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB. The 32-bit signed-integer product is added to the signed-integer word in bits 32:63 of register RT. The low-order 32 bits of the sum are placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** ## Multiply Accumulate High Halfword to Word Saturate Signed XO-form | machhws | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | machhws. | RT,RA,RB | (OE=0 Rc=1) | | machhwso | RT,RA,RB | (OE=1 Rc=0) | | machhwso. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 108 | Rc | |---|----|----|----|-------|-----|----| | 0 | 6 | 11 | 16 | 21 22 | | 31 | The signed-integer halfword in bits 32:47 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB. The 32-bit signed-integer product is added to the signed-integer word in bits 32:63 of register RT. If the sum is less than -2 $^{31}$ , then the value $0x8000\_0000$ is placed into bits 32:63 of register RT. If the sum is greater than $2^{31}$ -1, then the value $0x7FFF_FFFF$ is placed into bits 32:63 of register RT. Otherwise, the sum is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ## Multiply Accumulate High Halfword to Word Modulo Unsigned XO-form | machhwu | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | machhwu. | RT,RA,RB | (OE=0 Rc=1) | | machhwuo | RT,RA,RB | (OE=1 Rc=0) | | machhwuo. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 12 | Rc | |---|----|----|----|-------|----|----| | 0 | 6 | 11 | 16 | 21 22 | | 31 | $$\begin{array}{l} \texttt{prod}_{0:31} \leftarrow (\texttt{RA})_{32:47} \times_{ui} (\texttt{RB})_{32:47} \\ \texttt{temp}_{0:32} \leftarrow \texttt{prod}_{0:31} + (\texttt{RT})_{32:63} \\ \texttt{RT}_{32:63} \leftarrow \texttt{temp}_{1:32} \\ \texttt{RT}_{0:31} \leftarrow \texttt{undefined} \end{array}$$ The unsigned-integer halfword in bits 32:47 of register RA is multiplied by the unsigned-integer halfword in bits 32:47 of register RB. The 32-bit unsigned-integer product is added to the unsigned-integer word in bits 32:63 of register RT. The low-order 32 bits of the sum are placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** ## Multiply Accumulate High Halfword to Word Saturate Unsigned XO-form | machhwsu | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | machhwsu. | RT,RA,RB | (OE=0 Rc=1) | | machhwsuo | RT,RA,RB | (OE=1 Rc=0) | | machhwsuo | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 76 | Rc | l | |---|----|----|----|-------|----|----|---| | 0 | 6 | 11 | 16 | 21 22 | | 31 | l | ``` \begin{array}{lll} \texttt{prod}_{0:31} \leftarrow \texttt{(RA)}_{32:47} \ \texttt{x}_{ui} \ \texttt{(RB)}_{32:47} \\ \texttt{temp}_{0:32} \leftarrow \texttt{prod}_{0:31} + \texttt{(RT)}_{32:63} \\ \texttt{if temp} > 2^{32} \text{-1 then RT} \leftarrow \texttt{0xFFFF\_FFFF} \\ \texttt{else} & \texttt{RT} \leftarrow \texttt{temp}_{1:32} \end{array} ``` The unsigned-integer halfword in bits 32:47 of register RA is multiplied by the unsigned-integer halfword in bits 32:47 of register RB. The 32-bit unsigned-integer product is added to the unsigned-integer word in bits 32:63 of register RT. If the sum is greater than $2^{32}$ -1, then the value 0xFFFF FFFF is placed into bits 32:63 of register RT. Otherwise, the sum is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ## Multiply Accumulate Low Halfword to Word Modulo Signed XO-form | maclhw | RT,RA,RB | (OE=0 Rc=0) | |----------|----------|-------------| | maclhw. | RT,RA,RB | (OE=0 Rc=1) | | maclhwo | RT,RA,RB | (OE=1 Rc=0) | | maclhwo. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 428 | Rc | |---|----|----|----|------|-----|----| | 0 | 6 | 11 | 16 | 21 2 | 2 | 31 | ``` \begin{array}{lll} \operatorname{prod}_{0:31} \leftarrow & (\operatorname{RA})_{48:63} \times_{\operatorname{si}} & (\operatorname{RB})_{48:63} \\ \operatorname{temp}_{0:32} \leftarrow & \operatorname{prod}_{0:31} + & (\operatorname{RT})_{32:63} \\ \operatorname{RT}_{32:63} \leftarrow & \operatorname{temp}_{1:32} \\ \operatorname{RT}_{0:31} \leftarrow & \operatorname{undefined} \end{array} ``` The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 48:63 of register RB. The 32-bit signed-integer product is added to the signed-integer word in bits 32:63 of register RT. The low-order 32 bits of the sum are placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** ## Multiply Accumulate Low Halfword to Word Saturate Signed XO-form | maclhws | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | maclhws. | RT,RA,RB | (OE=0 Rc=1) | | maclhwso | RT,RA,RB | (OE=1 Rc=0) | | maclhwso. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 492 | Rc | |---|----|----|----|-------|-----|----| | 0 | 6 | 11 | 16 | 21 22 | | 31 | The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 48:63 of register RB. The 32-bit signed-integer product is added to the signed-integer word in bits 32:63 of register RT. If the sum is less than -2 $^{31}$ , then the value $0x8000\_0000$ is placed into bits 32:63 of register RT. If the sum is greater than $2^{31}$ -1, then the value $0x7FFF_FFFF$ is placed into bits 32:63 of register RT. Otherwise, the sum is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ## Multiply Accumulate Low Halfword to Word Modulo Unsigned XO-form | maclhwu | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | maclhwu. | RT,RA,RB | (OE=0 Rc=1) | | maclhwuo | RT,RA,RB | (OE=1 Rc=0) | | maclhwuo. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 396 | Rc | |---|----|----|----|------|-----|----| | 0 | 6 | 11 | 16 | 21 2 | 22 | 31 | $$prod_{0:31} \leftarrow (RA)_{48:63} \times_{ui} (RB)_{48:63}$$ $temp_{0:32} \leftarrow prod_{0:31} + (RT)_{32:63}$ $RT_{32:63} \leftarrow temp_{1:32}$ $RT_{0:31} \leftarrow undefined$ The unsigned-integer halfword in bits 48:63 of register RA is multiplied by the unsigned-integer halfword in bits 48:63 of register RB. The 32-bit unsigned-integer product is added to the unsigned-integer word in bits 32:63 of register RT. The low-order 32 bits of the sum are placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** ## Multiply Accumulate Low Halfword to Word Saturate Unsigned XO-form | maclhwsu | RT,RA,RB | (OE=0 Rc=0) | |------------|----------|-------------| | maclhwsu. | RT,RA,RB | (OE=0 Rc=1) | | maclhwsuo | RT,RA,RB | (OE=1 Rc=0) | | maclhwsuo. | RT,RA,RB | (OE=1 Rc=1) | | | | (OE=1 Ro | | 4 | RT | RA | RB | OE | 460 | Rc | |---|----|----|----|-------|-----|----| | 0 | 6 | 11 | 16 | 21 22 | 2 | 31 | $$\begin{array}{lll} {\rm prod}_{0:31} \leftarrow & {\rm (RA)}_{48:63} \times_{\rm ui} & {\rm (RB)}_{48:63} \\ {\rm temp}_{0:32} \leftarrow & {\rm prod}_{0:31} + & {\rm (RT)}_{32:63} \\ {\rm if temp} \, > \, 2^{32} \text{-1 then RT} \leftarrow & {\rm 0xFFFF\_FFFF} \\ {\rm else} & {\rm RT} \leftarrow & {\rm temp}_{1:32} \end{array}$$ The unsigned-integer halfword in bits 48:63 of register RA is multiplied by the unsigned-integer halfword in bits 48:63 of register RB. The 32-bit unsigned-integer product is added to the unsigned-integer word in bits 32:63 of register RT. If the sum is greater than $2^{32}$ -1, then the value $0xFFFF_FFFF$ is placed into bits 32:63 of register RT. Otherwise, the sum is placed into bits 32:63 of register RT The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** ## Multiply Cross Halfword to Word Signed X-form | mulchw | RT,RA,RB | (Rc=0) | |---------|----------|--------| | mulchw. | RT,RA,RB | (Rc=1) | | | 4 | RT | RA | RB | 168 | Rc | |---|---|----|----|----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{ll} \mathrm{RT_{32:63}} \leftarrow \mathrm{(RA)_{48:63}} \times_{\mathrm{si}} \mathrm{(RB)_{32:47}} \\ \mathrm{RT_{0:31}} \leftarrow \mathrm{undefined} \end{array}$$ The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB and the signed-integer word result is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** | (if Rc=1) | |-----------| | | ## Multiply Cross Halfword to Word Unsigned X-form | mulchwu | RT,RA,RB | (Rc=0) | |----------|----------|--------| | mulchwu. | RT,RA,RB | (Rc=1) | | 4 | RT | RA | RB | 136 | Rc | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{lll} \mathrm{RT_{32:63}} \leftarrow & \mathrm{(RA)_{48:63}} \times_{\mathrm{ui}} & \mathrm{(RB)_{32:47}} \\ \mathrm{RT_{0:31}} & \leftarrow & \mathrm{undefined} \end{array}$$ The unsigned-integer halfword in bits 48:63 of register RA is multiplied by the unsigned-integer halfword in bits 32:47 of register RB and the unsigned-integer word result is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ## Multiply High Halfword to Word Signed X-form ## $\begin{array}{lll} \text{mulhhw} & \text{RT,RA,RB} & \text{(Rc=0)} \\ \text{mulhhw}. & \text{RT,RA,RB} & \text{(Rc=1)} \end{array}$ | 4 | RT | RA | RB | 40 | Rc | |---|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{ll} \mathrm{RT_{32:63}} \leftarrow \mathrm{(RA)_{32:47}} \times_{\mathrm{si}} \mathrm{(RB)_{32:47}} \\ \mathrm{RT_{0:31}} \leftarrow \mathrm{undefined} \end{array}$$ The signed-integer halfword in bits 32:47 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB and the signed-integer word result is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** ## Multiply Low Halfword to Word Signed X-form | mullhw | RT,RA,RB | (Rc=0) | |---------|----------|--------| | mullhw. | RT,RA,RB | (Rc=1) | | 4 | RT | RA | RB | 424 | Rc | |---|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \text{RT}_{32:63} \leftarrow \text{(RA)}_{48:63} \times_{\text{si}} \text{(RB)}_{48:63} \\ \text{RT}_{0:31} \leftarrow \text{undefined} \end{array}$$ The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 48:63 of register RB and the signed-integer word result is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** | CR0 | (if Rc=1) | |-----|-----------| | | | ## Multiply High Halfword to Word Unsigned X-form | mulhhwu | RT,RA,RB | (Rc=0) | |----------|----------|--------| | mulhhwu. | RT,RA,RB | (Rc=1) | | 4 | RT | RA | RB | 8 | Rc | |---|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{lll} \operatorname{RT}_{32:63} \leftarrow & (\operatorname{RA})_{32:47} \times_{\operatorname{ui}} & (\operatorname{RB})_{32:47} \\ \operatorname{RT}_{0:31} & \leftarrow & \operatorname{undefined} \end{array}$$ The unsigned-integer halfword in bits 32:47 of register RA is multiplied by the unsigned-integer halfword in bits 32:47 of register RB and the unsigned-integer word result is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### **Special Registers Altered:** ## Multiply Low Halfword to Word Unsigned X-form | mullhwu | RT,RA,RB | (Rc=0) | |----------|----------|--------| | mullhwu. | RT,RA,RB | (Rc=1) | | Ī | 4 | RT | RA | RB | 392 | Rc | |---|---|----|----|----|-----|----| | C | ) | 6 | 11 | 16 | 21 | 31 | $$RT_{32:63} \leftarrow (RA)_{48:63} \times_{ui} (RB)_{48:63}$$ $RT_{0:31} \leftarrow undefined$ The unsigned-integer halfword in bits 48:63 of register RA is multiplied by the unsigned-integer halfword in bits 48:63 of register RB and the unsigned-integer word result is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ## Negative Multiply Accumulate Cross Halfword to Word Modulo Signed ### XO-form | nmacchw | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | nmacchw. | RT,RA,RB | (OE=0 Rc=1) | | nmacchwo | RT,RA,RB | (OE=1 Rc=0) | | nmacchwo. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | 9 | 174 | Rc | |---|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` \begin{array}{l} \operatorname{prod}_{0:31} \leftarrow & (\operatorname{RA})_{48:63} \times_{\operatorname{si}} & (\operatorname{RB})_{32:47} \\ \operatorname{temp}_{0:32} \leftarrow & (\operatorname{RT})_{32:63} -_{\operatorname{si}} & \operatorname{prod}_{0:31} \\ \operatorname{RT}_{32:63} \leftarrow & \operatorname{temp}_{1:32} \\ \operatorname{RT}_{0:31} \leftarrow & \operatorname{undefined} \end{array} ``` The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB. The 32-bit signed-integer product is subtracted from the signed-integer word in bits 32:63 of register RT. The low-order 32 bits of the difference are placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### Special Registers Altered: ### Negative Multiply Accumulate Cross Halfword to Word Saturate Signed XO-form | nmacchws | RT,RA,RB | (OE=0 Rc= | (0 | |------------|----------|-----------|----| | nmacchws. | RT,RA,RB | (OE=0 Rc= | 1) | | nmacchwso | RT,RA,RB | (OE=1 Rc= | 0) | | nmacchwso. | RT,RA,RB | (OE=1 Rc= | 1) | | 4 | RT | RA | RB | OE | 238 | Rc | |---|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB. The 32-bit signed-integer product is subtracted from the signed-integer word in bits 32:63 of register RT. If the difference is less than $-2^{31}$ , then the value $0 \times 8000 \ 0000$ is placed into bits 32:63 of register RT. If the difference is greater than 2<sup>31</sup>-1, then the value 0x7FFF FFFF is placed into bits 32:63 of register RT. Otherwise, the difference is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ## Negative Multiply Accumulate High Halfword to Word Modulo Signed ### XO-form | nmachhw | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | nmachhw. | RT,RA,RB | (OE=0 Rc=1) | | nmachhwo | RT,RA,RB | (OE=1 Rc=0) | | nmachhwo. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 46 | Rc | |---|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` \begin{array}{l} \operatorname{prod}_{0;31} \leftarrow & (\operatorname{RA})_{32:47} \times_{\operatorname{si}} & (\operatorname{RB})_{32:47} \\ \operatorname{temp}_{0:32} \leftarrow & (\operatorname{RT})_{32:63} -_{\operatorname{si}} & \operatorname{prod}_{0:31} \\ \operatorname{RT}_{32:63} \leftarrow & \operatorname{temp}_{1:32} \\ \operatorname{RT}_{0:31} \leftarrow & \operatorname{undefined} \end{array} ``` The signed-integer halfword in bits 32:47 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB. The 32-bit signed-integer product is subtracted from the signed-integer word in bits 32:63 of register RT. The low-order 32 bits of the difference are placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### Special Registers Altered: ### Negative Multiply Accumulate High Halfword to Word Saturate Signed XO-form | nmachhws | RT,RA,RB | ( | (OE=0 Rc=0) | |------------|----------|---|-------------| | nmachhws. | RT,RA,RB | | (OE=0 Rc=1) | | nmachhwso | RT,RA,RB | ( | (OE=1 Rc=0) | | nmachhwso. | RT,RA,RB | | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 110 | Rc | |---|----|----|----|------|-----|----| | 0 | 6 | 11 | 16 | 21 2 | 2 | 31 | The signed-integer halfword in bits 32:47 of register RA is multiplied by the signed-integer halfword in bits 32:47 of register RB. The 32-bit signed-integer product is subtracted from the signed-integer word in bits 32:63 of register RT. If the difference is less than $-2^{31}$ , then the value 0x8000 0000 is placed into bits 32:63 of register RT. If the difference is greater than $2^{31}$ -1, then the value $0x7FFF_FFFF$ is placed into bits 32:63 of register RT. Otherwise, the difference is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ## Negative Multiply Accumulate Low Halfword to Word Modulo Signed ### XO-form | nmaclhw | RT,RA,RB | (OE=0 Rc=0) | |-----------|----------|-------------| | nmaclhw. | RT,RA,RB | (OE=0 Rc=1) | | nmaclhwo | RT,RA,RB | (OE=1 Rc=0) | | nmaclhwo. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | 9 | 430 | Rc | |---|----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 31 | ``` \begin{array}{l} \operatorname{prod}_{0:31} \leftarrow & (\operatorname{RA})_{48:63} \times_{\operatorname{si}} & (\operatorname{RB})_{48:63} \\ \operatorname{temp}_{0:32} \leftarrow & (\operatorname{RT})_{32:63} -_{\operatorname{si}} & \operatorname{prod}_{0:31} \\ \operatorname{RT}_{32:63} \leftarrow & \operatorname{temp}_{1:32} \\ \operatorname{RT}_{0:31} \leftarrow & \operatorname{undefined} \end{array} ``` The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 48:63 of register RB. The 32-bit signed-integer product is subtracted from the signed-integer word in bits 32:63 of register RT. The low-order 32 bits of the difference are placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. ### Special Registers Altered: ### Negative Multiply Accumulate Low Halfword to Word Saturate Signed XO-form | nmaclhws | RT,RA,RB | (OE=0 Rc=0) | |------------|----------|-------------| | nmaclhws. | RT,RA,RB | (OE=0 Rc=1) | | nmaclhwso | RT,RA,RB | (OE=1 Rc=0) | | nmaclhwso. | RT,RA,RB | (OE=1 Rc=1) | | 4 | RT | RA | RB | OE | 494 | Rc | |---|----|----|----|-------|-----|----| | 0 | 6 | 11 | 16 | 21 22 | | 31 | The signed-integer halfword in bits 48:63 of register RA is multiplied by the signed-integer halfword in bits 48:63 of register RB. The 32-bit signed-integer product is subtracted from the signed-integer word in bits 32:63 of register RT. If the difference is less than $-2^{31}$ , then the value $0 \times 8000 \ 0000$ is placed into bits 32:63 of register RT. If the difference is greater than 2<sup>31</sup>-1, then the value 0x7FFF\_FFFF is placed into bits 32:63 of register RT. Otherwise, the difference is placed into bits 32:63 of register RT. The contents of bits 0:31 of register RT are undefined. # Appendix A. Suggested Floating-Point Models [Category: Floating-Point] ### A.1 Floating-Point Round to Single-Precision Model The following describes algorithmically the operation of the *Floating Round to Single-Precision* instruction. ``` If (FRB)_{1:11} < 897 and (FRB)_{1:63} > 0 then If FPSCR<sub>UE</sub> = 0 then goto Disabled Exponent Underflow If FPSCR<sub>UE</sub> = 1 then goto Enabled Exponent Underflow If (FRB)_{1:11} > 1150 and (FRB)_{1:11} < 2047 then If FPSCR<sub>OE</sub> = 0 then goto Disabled Exponent Overflow If FPSCR<sub>OE</sub> = 1 then goto Enabled Exponent Overflow End If (FRB)_{1:11} > 896 and (FRB)_{1:11} < 1151 then goto Normal Operand If (FRB)_{1:63} = 0 then goto Zero Operand If (FRB)_{1:11} = 2047 then If (FRB)_{12:63} = 0 then goto Infinity Operand If (FRB)_{12} = 1 then goto QNaN Operand If (FRB)_{12}^{12} = 0 and (FRB)_{13:63} > 0 then goto SNaN Operand Disabled Exponent Underflow: sign \leftarrow (FRB)_0 If (FRB)_{1:11} = 0 then Dο exp ← -1022 frac_{0:52} \leftarrow 0b0 || (FRB)_{12:63} End If (FRB)_{1:11} > 0 then Do exp \leftarrow (FRB)_{1:11} - 1023 frac_{0:52} \leftarrow 0b1 \parallel (FRB)_{12:63} End Denormalize operand: G || R || X ← 0b000 Do while exp < -126 exp \leftarrow exp + 1 frac_{0:52} \parallel G \parallel R \parallel X \leftarrow 0b0 \parallel frac_{0:52} \parallel G \parallel (R \mid X) FPSCR_{UX} \leftarrow (frac_{24:52} || G || R || X) > 0 Round Single(sign,exp,frac<sub>0:52</sub>,G,R,X) FPSCR<sub>XX</sub> ← FPSCR<sub>XX</sub> | FPSCR<sub>FI</sub> If frac_{0.52} = 0 then Do ``` ``` FRT<sub>0</sub> ← sign FRT<sub>1:63</sub> ← 0 If sign = 0 then FPSCR<sub>FPRF</sub> ← "+ zero" If sign = 1 then FPSCR<sub>FPRF</sub> ← "- zero" End If frac_{0:52} > 0 then Do If frac_0 = 1 then Do If sign = 0 then FPSCR<sub>FPRF</sub> ← "+ normal number" If sign = 1 then FPSCR_{FPRF} \leftarrow "- normal number" End If frac_0 = 0 then Do If sign = 0 then FPSCR<sub>FPRF</sub> ← "+ denormalized number" If sign = 1 then FPSCR<sub>FPRF</sub> ← "- denormalized number" Normalize operand: Do while frac_0 = 0 exp ← exp-1 frac_{0:52} \leftarrow frac_{1:52} || 0b0 End FRT<sub>0</sub> ← sign FRT_{1:11} \leftarrow exp + 1023 FRT_{12:63} \leftarrow frac_{1:52} End Done Enabled Exponent Underflow: FPSCR<sub>UX</sub> ← 1 sign ← (FRB)<sub>0</sub> If (FRB)_{1:11} = 0 then Do exp ← -1022 frac_{0:52} \leftarrow 0b0 \parallel (FRB)_{12:63} End If (FRB)_{1:11} > 0 then Do exp \leftarrow (FRB)_{1:11} - 1023 frac_{0:52} \leftarrow 0b1 \parallel (FRB)_{12:63} End Normalize operand: Do while frac_0 = 0 exp \leftarrow exp - 1 frac_{0:52} \leftarrow frac_{1:52} || 0b0 End Round Single(sign,exp,frac<sub>0:52</sub>,0,0,0) FPSCR<sub>XX</sub> ← FPSCR<sub>XX</sub> | FPSCR<sub>FI</sub> exp ← exp + 192 FRT<sub>0</sub> ← sign FRT<sub>1:11</sub> ← exp + 1023 FRT<sub>12:63</sub> ← frac<sub>1:52</sub> If sign = 0 then FPSCR<sub>FPRF</sub> ← "+ normal number" If sign = 1 then FPSCR<sub>FPRF</sub> ← "- normal number" Done Disabled Exponent Overflow: FPSCR<sub>OX</sub> ← 1 If FPSCR_{RN} = 0b00 then /* Round to Nearest */ Do If (FRB)_0 = 0 then FRT \leftarrow 0x7FF0_0000_0000_0000 If (FRB)_0^0 = 1 then FRT \leftarrow 0xFFF0_{0000}_{0000}_{0000}_{0000} If (FRB)_0^0 = 0 then FPSCR_{FPRF} \leftarrow "+ infinity" If (FRB)_0 = 1 then FPSCR_{FPRF} \leftarrow "- infinity" End ``` ``` If FPSCR_{RN} = 0b01 then /* Round toward Zero */ Do If (FRB)_0 = 0 then FRT \leftarrow 0x47EF\_FFFF\_E000\_0000 If (FRB)_0 = 1 then FRT \leftarrow 0xC7EF\_FFFF\_E000\_0000 If (FRB)_0 = 0 then FPSCR_{FPRF} \leftarrow "+ normal number" If (FRB)_0 = 1 then FPSCR_{FPRF} \leftarrow "- normal number" End If FPSCR_{RN} = 0b10 then /* Round toward +Infinity */ Do If (FRB)_0 = 0 then FRT \leftarrow 0x7FF0_0000_0000_0000 If (FRB)_0 = 1 then FRT \leftarrow 0xC7EF\_FFFF\_E000\_0000 If (FRB)_0 = 0 then FPSCR_{FPRF} \leftarrow "+ infinity" If (FRB)_0 = 1 then FPSCR_{FPRF} \leftarrow "- normal number" If FPSCR_{RN} = 0b11 then /* Round toward -Infinity */ Do If (FRB)_0 = 0 then FRT \leftarrow 0x47EF\_FFFF\_E000\_0000 If (FRB)_0 = 1 then FRT \leftarrow 0xFFF0_0000_0000_0000 If (FRB)_0 = 0 then FPSCR_{FPRF} \leftarrow "+ normal number" If (FRB)_0 = 1 then FPSCR_{FPRF} \leftarrow "- infinity" End \mathsf{FPSCR}_{\mathsf{FR}} \leftarrow \mathsf{undefined} FPSCR<sub>FI</sub> ← 1 FPSCR<sub>XX</sub> ← 1 Done Enabled Exponent Overflow: sign \leftarrow (FRB)_0 \exp \leftarrow (FRB)_{1:11} - 1023 \operatorname{frac}_{0:52} \leftarrow 0b1 \parallel (FRB)_{12:63} Round Single(sign,exp,frac<sub>0:52</sub>,0,0,0) FPSCR<sub>XX</sub> ← FPSCR<sub>XX</sub> | FPSCR<sub>FI</sub> Enabled Overflow: FPSCR_{OX} \leftarrow 1 exp ← exp - 192 FRT<sub>0</sub> ← sign FRT_{1:11} \leftarrow exp + 1023 FRT<sub>12:63</sub> ← frac<sub>1:52</sub> If sign = 0 then FPSCR<sub>FPRF</sub> ← "+ normal number" If sign = 1 then FPSCR<sub>FPRF</sub> ← "- normal number" Zero Operand: FRT ← (FRB) If (FRB)_0 = 0 then FPSCR_{FPRF} \leftarrow "+ zero" If (FRB)_0 = 1 then FPSCR_{FPRF} \leftarrow "- zero" FPSCRFR<sub>FI</sub> ← 0b00 Done Infinity Operand: FRT ← (FRB) If (FRB)_0 = 0 then FPSCR_{FPRF} \leftarrow "+ infinity" If (FRB)_0 = 1 then FPSCR_{FPRF} \leftarrow "- infinity" FPSCRFR<sub>FI</sub> ← 0b00 Done QNaN Operand: FRT ← (FRB)<sub>0:34</sub> || <sup>29</sup>0 FPSCR<sub>FPRF</sub> ← "QNaN" FPSCR<sub>FR FI</sub> ← 0b00 Done ``` ``` SNaN Operand: FPSCR<sub>VXSNAN</sub> ← 1 If FPSCR<sub>VE</sub> = 0 then Do FRT_{0:11} ← (FRB)_{0:11} FRT_{12} ← 1 FRT_{13:63}^{-} \leftarrow (FRB)_{13:34} \parallel^{29}0 FPSCR<sub>FPRF</sub> ← "QNaN" FPSCR<sub>FR FI</sub> ← 0b00 Done Normal Operand: sign \leftarrow (FRB)_0 exp ← (FRB)<sub>1:11</sub> - 1023 frac_{0.52} \leftarrow 0b1 || (FRB)_{12.63} Round Single(sign,exp,frac<sub>0:52</sub>,0,0,0) FPSCR<sub>XX</sub> ← FPSCR<sub>XX</sub> | FPSCR<sub>FI</sub> If \exp > 127 and FPSCR_{OE} = 0 then go to Disabled Exponent Overflow If exp > 127 and FPSCR<sub>OF</sub> = 1 then go to Enabled Overflow FRT<sub>0</sub> ← sign FRT_{1:11} \leftarrow exp + 1023 \label{eq:frac1:52} \begin{split} & \mathsf{FRT}_{12:63} \leftarrow \mathsf{frac}_{1:52} \\ & \mathsf{If sign} = 0 \; \mathsf{then} \; \mathsf{FPSCR}_{\mathsf{FPRF}} \leftarrow \text{``+ normal number''} \end{split} If sign = 1 then FPSCR<sub>FPRF</sub> ← "- normal number" Round Single(sign,exp,frac<sub>0:52</sub>,G,R,X): inc ← 0 lsb ← frac_{23} gbit ← frac<sub>24</sub> rbit ← frac<sub>25</sub> xbit \leftarrow (frac<sub>26:52</sub>||G||R||X)\neq0 If FPSCR_{RN} = 0b00 then /* Round to Nearest */ /* comparisons ignore u bits */ Do If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc ← 1 If sign || Isb || gbit || rbit || xbit = 0bu011u then inc ← 1 If sign || Isb || gbit || rbit || xbit = 0bu01u1 then inc ← 1 End If FPSCR_{RN} = 0b10 then /* Round toward + Infinity */ Do /* comparisons ignore u bits */ If sign || Isb || gbit || rbit || xbit = 0b0u1uu then inc ← 1 If sign || lsb || gbit || rbit || xbit = 0b0uu1u then inc ← 1 If sign || Isb || gbit || rbit || xbit = 0b0uuu1 then inc ← 1 End If FPSCR_{RN} = 0b11 then /* Round toward - Infinity */ /* comparisons ignore u bits */ If sign || Isb || gbit || rbit || xbit = 0b1u1uu then inc ← 1 If sign || lsb || gbit || rbit || xbit = 0b1uu1u then inc ← 1 If sign || Isb || gbit || rbit || xbit = 0b1uuu1 then inc ← 1 frac_{0:23} \leftarrow frac_{0:23} + inc If carry_out = 1 then Do frac_{0:23} \leftarrow 0b1 || frac_{0:22} exp \leftarrow exp + 1 End frac<sub>24:52</sub> ← <sup>29</sup>0 FPSCR<sub>FR</sub> ← inc \mathsf{FPSCR}^{\dots}_{\mathsf{FI}} \! \leftarrow \mathsf{gbit} \mid \mathsf{rbit} \mid \mathsf{xbit} Return ``` ### A.2 Floating-Point Convert to Integer Model The following describes algorithmically the operation of the Floating Convert To Integer instructions. ``` If Floating Convert To Integer Word then Do round\_mode \leftarrow FPSCR_{RN} tgt_precision ← "32-bit integer" If Floating Convert To Integer Word with round toward Zero then round_mode ← 0b01 tgt_precision ← "32-bit integer" End If Floating Convert To Integer Doubleword then round_mode ← FPSCR<sub>RN</sub> tgt_precision ← "64-bit integer" If Floating Convert To Integer Doubleword with round toward Zero then Dο round mode ← 0b01 tgt_precision ← "64-bit integer" End sign \leftarrow (FRB)_0 If (FRB)_{1:11} = 2047 and (FRB)_{12:63} = 0 then goto Infinity Operand If (FRB)_{1:11} = 2047 and (FRB)_{12} = 0 then goto SNaN Operand If (FRB)_{1:11} = 2047 and (FRB)_{12} = 1 then goto QNaN Operand If (FRB)_{1:11} > 1086 then goto Large Operand If (FRB)_{1:11} > 0 then exp \leftarrow (FRB)_{1:11} - 1023 /* exp - bias */ If (FRB)_{1:11} = 0 then exp \leftarrow -1022 If (FRB)_{1:11} > 0 then frac_{0:64} \leftarrow 0b01 \parallel (FRB)_{12:63} \parallel {}^{11}0 \quad /^* normal; need leading 0 for later complement */ If (FRB)_{1:11} = 0 then frac_{0:64} \leftarrow 0b00 \parallel (FRB)_{12:63} \parallel {}^{11}0 \quad /^* denormal */ gbit || rbit || xbit ← 0b000 Do i=1,63-exp /* do the loop 0 times if exp = 63 */ frac_{0:64} \parallel gbit \parallel rbit \parallel xbit \leftarrow 0b0 \parallel frac_{0:64} \parallel gbit \parallel (rbit \mid xbit) End Round Integer(sign,frac<sub>0:64</sub>,gbit,rbit,xbit,round_mode) /* needed leading 0 for -2^{64} < (FRB) < -2^{63} */ If sign = 1 then frac_{0.64} \leftarrow \neg frac_{0.64} + 1 If tgt_precision = "32-bit integer" and frac_{0:64} > 2^{31}-1 then goto Large Operand If tgt_precision = "64-bit integer" and frac_{0:64} > 2^{63}-1 then goto Large Operand If tgt_precision = "32-bit integer" and frac_{0:64} < -2^{31} then goto Large Operand If tgt_precision = "64-bit integer" and frac_{0:64} < -2^{63} then goto Large Operand FPSCR_{XX} \leftarrow FPSCR_{XX} \mid FPSCR_{FI} If tgt_precision = "32-bit integer" then FRT ← 0xuuuu_uuuu || frac<sub>33:64</sub> /* u is undefined hex digit */ If tgt_precision = "64-bit integer" then FRT ← frac<sub>1:64</sub> FPSCR<sub>FPRF</sub> ← undefined Done ``` ``` Round Integer(sign,frac<sub>0:64</sub>,gbit,rbit,xbit,round_mode): inc ← 0 If round mode = 0b00 then /* Round to Nearest */ Do /* comparisons ignore u bits */ If sign || frac<sub>64</sub> || gbit || rbit || xbit = 0bu11uu then inc ← 1 If sign || frac<sub>64</sub> || gbit || rbit || xbit = 0bu011u then inc ← 1 If sign || frac<sub>64</sub> || gbit || rbit || xbit = 0bu01u1 then inc \leftarrow 1 End If round mode = 0b10 then /* Round toward +Infinity */ /* comparisons ignore u bits */ Do If sign || frac_{64} || gbit || rbit || xbit = 0b0u1uu then inc \leftarrow 1 If sign || frac<sub>64</sub> || gbit || rbit || xbit = 0b0uu1u then inc \leftarrow 1 If sign || frac_{64}^{-} || gbit || rbit || xbit = 0b0uuu1 then inc \leftarrow 1 If round_mode = 0b11 then /* Round toward -Infinity */ /* comparisons ignore u bits */ Do If sign || frac_{64} || gbit || rbit || xbit = 0b1u1uu then inc \leftarrow 1 If sign || frac<sub>64</sub> || gbit || rbit || xbit = 0b1uu1u then inc \leftarrow 1 If sign || frac<sub>64</sub> || gbit || rbit || xbit = 0b1uuu1 then inc \leftarrow 1 End \begin{array}{l} \operatorname{frac}_{0:64} \leftarrow \operatorname{frac}_{0:64} + \operatorname{inc} \\ \operatorname{FPSCR}_{\operatorname{FR}} \leftarrow \operatorname{inc} \end{array} FPSCR<sub>FI</sub> ← gbit | rbit | xbit Return Infinity Operand: FPSCR<sub>FR FI VXCVI</sub> ← 0b001 If FPSCR_{VE} = 0 then Do If tgt_precision = "32-bit integer" then If sign = 0 then FRT ← 0xuuuu_uuuu_7FFF_FFFF /* u is undefined hex digit */ If sign = 1 then FRT ← 0xuuuu_uuuu_8000_0000 /* u is undefined hex digit */ End Else Do If sign = 0 then FRT ← 0x7FFF_FFFF_FFFF If sign = 1 then FRT \leftarrow 0x8000\_0000\_0000\_0000 End FPSCR<sub>FPRF</sub> ← undefined End Done SNaN Operand: FPSCR<sub>FR FI VXSNAN VXCVI</sub> ← 0b0011 If FPSCR_{VE} = 0 then Do If tgt_precision = "32-bit integer" then FRT ← 0xuuuu_uuuu_8000_0000 /* u is undefined hex digit */ If tgt_precision = "64-bit integer" then FRT ← 0x8000_0000_0000_0000 FPSCR_{FPRF} \leftarrow undefined End Done QNaN Operand: FPSCR<sub>FR FI VXCVI</sub> ← 0b001 If FPSCR_{VE} = 0 then Dο If tgt_precision = "32-bit integer" then FRT ← 0xuuuu_uuuu_8000_0000 /* u is undefined hex digit */ If tgt_precision = "64-bit integer" then FRT ← 0x8000_0000_0000_0000 FPSCR<sub>FPRF</sub> ← undefined End Done ``` ### **Large Operand:** ### A.3 Floating-Point Convert from Integer Model The following describes algorithmically the operation of the *Floating Convert From Integer Doubleword* instruction. ``` sign \leftarrow (FRB)_0 exp ← 63 frac_{0:63} \leftarrow (FRB) If frac_{0:63} = 0 then go to Zero Operand If sign = 1 then frac_{0:63} \leftarrow \neg frac_{0:63} + 1 Do while frac_0 = 0 /* do the loop 0 times if (FRB) = maximum negative integer */ frac_{0:63} \leftarrow frac_{1:63} || 0b0 exp ← exp - 1 Round Float(sign,exp,frac<sub>0:63</sub>,FPSCR<sub>RN</sub>) If sign = 0 then FPSCR<sub>FPRF</sub> ← "+normal number" If sign = 1 then FPSCR<sub>FPRF</sub> ← "-normal number" FRT<sub>0</sub> ← sign FRT_{1:11} \leftarrow exp + 1023 /* exp + bias */ FRT_{12:63} \leftarrow frac_{1:52} Done Zero Operand: \mathsf{FPSCR}_{\mathsf{FR}\;\mathsf{FI}} \leftarrow \mathsf{0b00} FPSCR<sub>FPRF</sub> ← "+ zero" FRT ← 0x0000_0000_0000_0000 Round Float(sign,exp,frac<sub>0:63</sub>,round_mode): inc ← 0 lsb ← frac<sub>52</sub> gbit ← frac<sub>53</sub> rbit ← frac<sub>54</sub> xbit \leftarrow frac_{55.63} > 0 If round_mode = 0b00 then /* Round to Nearest */ Do /* comparisons ignore u bits */ If sign || lsb || gbit || rbit || xbit = 0bu11uu then inc ← 1 If sign || lsb || gbit || rbit || xbit = 0bu011u then inc ← 1 If sign || lsb || gbit || rbit || xbit = 0bu01u1 then inc ← 1 End If round_mode = 0b10 then /* Round toward + Infinity */ Do /* comparisons ignore u bits */ If sign || lsb || gbit || rbit || xbit = 0b0u1uu then inc ← 1 If sign || Isb || gbit || rbit || xbit = 0b0uu1u then inc ← 1 If sign || lsb || gbit || rbit || xbit = 0b0uuu1 then inc ← 1 End If round_mode = 0b11 then /* Round toward - Infinity */ Do /* comparisons ignore u bits */ If sign || lsb || gbit || rbit || xbit = 0b1u1uu then inc ← 1 If sign || Isb || gbit || rbit || xbit = 0b1uu1u then inc ← 1 If sign || lsb || gbit || rbit || xbit = 0b1uuu1 then inc ← 1 frac_{0:52} \leftarrow frac_{0:52} + inc If carry\_out = 1 then exp \leftarrow exp + 1 \overline{\mathsf{FPSCR}_{\mathsf{FR}}} \leftarrow \mathsf{inc} FPSCR<sub>FI</sub> ← gbit | rbit | xbit FPSCR_{XX} \leftarrow FPSCR_{XX} \mid FPSCR_{FI} Return ``` ### A.4 Floating-Point Round to Integer Model The following describes algorithmically the operation of the Floating Round To Integer instructions. ``` If (FRB)_{1:11} = 2047 and (FRB)_{12:63} = 0, then goto Infinity Operand If (FRB)_{1:11} = 2047 and (FRB)_{12} = 0, then goto SNaN Operand If (FRB)_{1:11} = 2047 and (FRB)_{12} = 1, then goto QNaN Operand if (FRB)_{1:63} = 0 then goto Zero Operand If (FRB)_{1:11} < 1023 then goto Small Operand /* exp < 0; |value| < 1*/ If (FRB)_{1:11} > 1074 then goto Large Operand /* exp > 51; integral value */ sign \leftarrow (FRB)_0 \exp \leftarrow (FRB)_{1:11} - 1023 /* \exp - \text{bias */} \\ frac_{0:52} \leftarrow 0b1 || (FRB)_{12:63} gbit || rbit || xbit ← 0b000 Do i = 1, 52 - exp frac_{0:52} \parallel gbit \parallel rbit \parallel xbit \leftarrow 0b0 \parallel frac_{0:52} \parallel gbit \parallel (rbit \mid xbit) Round Integer (sign, frac<sub>0:52</sub>, gbit, rbit, xbit) Do i = 2, 52 - \exp frac_{0:52} \leftarrow frac_{1:52} \parallel 0b0 If frac<sub>0</sub> = 1, then \exp \leftarrow \exp + 1 Else frac<sub>0.52</sub> \leftarrow frac<sub>1.52</sub> || 0b0 FRT_0 \leftarrow sign FRT_{1:11} \leftarrow exp + 1023 FRT_{12:63} \leftarrow frac_{1:52} If (FRT)_0 = 0 then FPSCR_{FPRF} \leftarrow "+ normal number" Else FPŠCR<sub>FPRF</sub> ← "- normal number" FPSCR_{FR FI} \leftarrow 0b00 Done Round Integer(sign, frac0:52, gbit, rbit, xbit): inc ← 0 If inst = Floating Round to Integer Nearest then /* ties away from zero */ Do /* comparisons ignore u bits */ If sign || frac<sub>52</sub> || gbit || rbit || xbit = 0buu1uu then inc \leftarrow 1 If inst = Floating Round to Integer Plus then Do /* comparisons ignore u bits */ If sign || frac_{52} || gbit || rbit || xbit = 0b0u1uu then inc \leftarrow 1 If sign || frac<sub>52</sub> || gbit || rbit || xbit = 0b0uu1u then inc \leftarrow 1 If sign || frac<sub>52</sub> || gbit || rbit || xbit = 0b0uuu1 then inc \leftarrow 1 End If inst = Floating Round to Integer Minus then Do /* comparisons ignore u bits */ If sign || frac<sub>52</sub> || gbit || rbit || xbit = 0b1u1uu then inc \leftarrow 1 If sign || frac<sub>52</sub> || gbit || rbit || xbit = 0b1uu1u then inc ← 1 If sign || frac<sub>52</sub> || gbit || rbit || xbit = 0b1uuu1 then inc \leftarrow 1 End frac_{0:52} \leftarrow frac_{0:52} + inc Return ``` ``` Infinity Operand: FRT ← (FRB) If (FRB)_0 = 0 then FPSCR_{FPRF} \leftarrow "+ infinity" If (FRB)_0 = 1 then FPSCR_{FPRF} \leftarrow "- infinity" FPSCR<sub>FR FI</sub> ← 0b00 Done SNaN Operand: FPSCR<sub>VXSNAN</sub> ← 1 If FPSCR<sub>VE</sub> = 0 then Do FRT ← (FRB) FRT_{12} \leftarrow 1 FPSCR_{FPRF} \leftarrow "QNaN" \mathsf{FPSCR}_{\mathsf{FR}\;\mathsf{FI}}\; \gets \mathsf{0b00} Done QNaN Operand: FRT ← (FRB) FPSCR<sub>FPRF</sub> ← "QNaN" FPSCR<sub>FR FI</sub> ← 0b00 Done Zero Operand: If (FRB)_0 = 0 then FRT \leftarrow 0x0000\_0000\_0000\_0000 FPSCR<sub>FPRF</sub> ← "+ zero" End Else Dο FRT \leftarrow 0x8000\_0000\_0000\_0000 FPSCR<sub>FPRF</sub> ← "- zero" End FPSCR_{FR FI} \leftarrow 0b00 Done Small Operand: If inst = Floating Round to Integer Nearest and (FRB)<sub>1:11</sub> < 1022 then goto Zero Operand If inst = Floating Round to Integer Toward Zero then goto Zero Operand If inst = Floating Round to Integer Plus and (FRB)<sub>0</sub> = 1 then goto Zero Operand If inst = Floating Round to Integer Minus and (FRB)_0 = 0 then goto Zero Operand If (FRB)_0 = 0 then Do FRT ← 0x3FF0_0000_0000_0000 /* value = 1.0 */ FPSCR<sub>FPRF</sub> ← "+ normal number" End Else Do /* value = -1.0 */ FRT ← 0xBFF0_0000_0000_0000 FPSCR<sub>FPRF</sub> ← "- normal number" End FPSCR<sub>FR FI</sub> ← 0b00 Done Large Operand: FRT ← (FRB) If FRT_0 = 0 then FPSCR_{FPRF} \leftarrow "+ normal number" Else FPSCR<sub>FPRF</sub> ← "- normal number" FPSCR<sub>FR FI</sub> ← 0b00 Done ``` #### Appendix A. Densely Packed Decimal The trailing significand field of the decimal floating-point data format is encoded using Densely Packed Decimal (DPD). DPD encoding is a compression technique which supports the representation of decimal integers of arbitrary length. Translation operates on three Binary Coded Decimal (BCD) digits at a time compressing the 12 bits into 10 bits with an algorithm that can be applied or reversed using simple Boolean operations. In the following examples, a 3-digit BCD number is represented as (abcd)(efgh)(ijkm), a 10-bit DPD number is represented as (pqr)(stu)(v)(wxy), and the Boolean operations, & (AND), | (OR), and ¬ (NOT) are #### A.1 BCD-to-DPD Translation The translation from a 3-digit BCD number to a 10-bit DPD can be performed through the following Boolean operations. ``` p = (f & a & i & ¬e) | (j & a & ¬i) | (b & ¬a) q = (g & a & i & ¬e) | (k & a & ¬i) | (c & ¬a) r = d s = (j & ¬a & e & ¬i) | (f & ¬i & ¬e) | (f & ¬a & ¬e) | (e & i) t = (k & ¬a & e & ¬i) | (g & ¬i & ¬e) | (g & ¬a & ¬e) | (a & i) u = h v = a | e | i w = (¬e & j & ¬i) | (e & i) | a x = (¬a & k & ¬i) | (a & i) | e v = m ``` Alternatively, the following table can be used to perform the translation. The most significant bit of the three BCD digits (left column) is used to select a specific 10bit encoding (right column) of the DPD. | aei | pqr | stu | v | wxy | |-----|-----|-----|---|-------| | 000 | bcd | fgh | 0 | jkm | | 001 | bcd | fgh | 1 | 0 0 m | | 010 | bcd | jkh | 1 | 01m | | 011 | bcd | 10h | 1 | 11m | | 100 | jkd | fgh | 1 | 10m | | 101 | fgd | 01h | 1 | 11m | | 110 | jkd | 00h | 1 | 11m | | 111 | 00d | 11h | 1 | 11m | The full translation of a 3-digit BCD number (000 - 999) to a 10-bit DPD is shown in Table 11 on page 373, with the DPD entries shown in hexadecimal format. The BCD number is produced by replacing '\_' in the leftmost column with the corresponding digit along the top row. The table is split into two halves, with the right half being a continuation of the left half. #### A.2 DPD-to-BCD Translation The translation from a 10-bit DPD to a 3-digit BCD number can be performed through the following Boolean operations. ``` a = (\neg s \& v \& w) \mid (t \& v \& w \& s) \mid (v \& w \& \neg x) b = (p \& s \& x \& \neg t) | (p \& \neg w) | (p \& \neg v) c = (q \& s \& x \& \neg t) | (q \& \neg w) | (q \& \neg v) d = r e = (v \& \neg w \& x) | (s \& v \& w \& x) | (¬t & v & x & w) f = (p & t & v & w & x & ¬s) | (s & ¬x & v) | (s & ¬v) g = (q & t & w & v & x & ¬s) | (t & ¬x & v) | (t & ¬v) i = (t & v & w & x) | (s & v & w & x) | (x & w ~ x) j = (p & ¬s & ¬t & w & v) | (s & v & ¬w & x) | (v - 3 w 4 q) (v 3 x - 3 w 3 q) k = (q \& \neg s \& \neg t \& v \& w) | (t \& v \& \neg w \& x) | (q & v & w & ¬x) | (x ~ ¬v) ``` Alternatively, the following table can be used to perform the translation. A combination of five bits in the DPD encoding (leftmost column) are used to specify a translation to the 3-digit BCD encoding. Dashes (-) in the table are don't cares, and can be either one or zero. | vwxst | abcd | efgh | ijkm | |-------|--------------|------|------| | 0 | 0pqr | 0stu | 0wxy | | 100 | 0pqr | 0stu | 100y | | 101 | 01 0pqr 100u | | 0sty | | 110 | 100r | 0stu | 0pqy | | 11100 | 100r | 100u | 0pqy | | 11101 | 100r | 0pqu | 100y | | 11110 | 0pqr | 100u | 100y | | 11111 | 100r | 100u | 100y | The full translation of the 10-bit DPD to a 3-digit BCD number is shown in Table 12 on page 374. The 10-bit DPD index is produced by concatenating the 6-bit value shown in the left column with the 4-bit index along the top row, both represented in hexadecimal. The values in parentheses are non-preferred translations and are explained further in the following section. #### A.3 Preferred DPD encoding Translating from a 3-digit BCD number (1000 numbers) to a 10-bit DPD encoding (1024 combinations) leaves 24 redundant translations. The 24 redundant combinations are evenly assigned to eight BCD numbers and are shown in the following table, with the non-preferred encoding in parentheses. The preferred encoding is produced by translating a 3-digit BCD number with the translation table or Boolean operations shown in Section A.1. The redundant DPD encodings are all valid and will be correctly translated to their respective BCD value through the mechanisms provided in Section A.2. For decimal floating-point operations all DPD encodings are recognized as source operands. | DPD Code | BCD Value | DPD Code | BCD Value | |----------|-----------|----------|-----------| | 0x06E | | 0x0EE | | | (0x16E) | 888 | (0x1EE) | 988 | | (0x26E) | | (0x2EE) | | | (0x36E) | | (0x3EE) | | | 0x06F | | 0x0EF | | | (0x16F) | 889 | (0x1EF) | 989 | | (0x26F) | | (0x2EF) | | | (0x36F) | | (0x3EF) | | | 0x07E | | 0x0FE | | | (0x17E) | 898 | (0x1FE) | 998 | | (0x27E) | | (0x2FE) | | | (0x37E) | | (0x3FE) | | | 0x07F | | 0x0FF | | | (0x17F) | 899 | (0x1FF) | 999 | | (0x27F) | | (0x2FF) | | | (0x37F) | | (0x3FF) | | | 17 | Table | e 11:I | 3CD- | to-DF | D tra | anslat | tion | | | | | | | | | | | | | | | | |----------------------------------|------------|------------|------|-------|------------|--------|------------|------------|------------|------------|-----|-------------|-----|-----|------------|-----|-----|-----|-----|-----|------------|-----| | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | | 00_ | 000 | 001 | 002 | 003 | 004 | 005 | 006 | 007 | 800 | 009 | | 50_ | 280 | 281 | 282 | 283 | 284 | 285 | 286 | 287 | 288 | 289 | | 01_ | 010 | 011 | 012 | 013 | 014 | 015 | 016 | 017 | 018 | 019 | | 51_ | 290 | 291 | 292 | 293 | 294 | 295 | 296 | 297 | 298 | 299 | | 02_ ( | 020 | 021 | 022 | 023 | 024 | 025 | 026 | 027 | 028 | 029 | | 52_ | 2A0 | 2A1 | 2A2 | 2A3 | 2A4 | 2A5 | 2A6 | 2A7 | 2A8 | 2A9 | | 03_ | 030 | 031 | 032 | 033 | 034 | 035 | 036 | 037 | 038 | 039 | | 53_ | 2B0 | 2B1 | 2B2 | 2B3 | 2B4 | 2B5 | 2B6 | 2B7 | 2B8 | 2B9 | | 04_ | 040 | 041 | 042 | 043 | 044 | 045 | 046 | 047 | 048 | 049 | | 54_ | 2C0 | 2C1 | 2C2 | 2C3 | 2C4 | 2C5 | 2C6 | 2C7 | 2C8 | 2C9 | | 05_ ( | 050 | 051 | 052 | 053 | 054 | 055 | 056 | 057 | 058 | 059 | | 55_ | 2D0 | 2D1 | 2D2 | 2D3 | 2D4 | 2D5 | 2D6 | 2D7 | 2D8 | 2D9 | | 06_ 0 | 060 | 061 | 062 | 063 | 064 | 065 | 066 | 067 | 068 | 069 | | 56_ | 2E0 | 2E1 | 2E2 | 2E3 | 2E4 | 2E5 | 2E6 | 2E7 | 2E8 | 2E9 | | 07_ ( | 070 | 071 | 072 | 073 | 074 | 075 | 076 | 077 | 078 | 079 | | <b>57</b> _ | 2F0 | 2F1 | 2F2 | 2F3 | 2F4 | 2F5 | 2F6 | 2F7 | 2F8 | 2F9 | | 08_ | 00A | 00B | 02A | 02B | 04A | 04B | 06A | 06B | 04E | 04F | | 58_ | 28A | 28B | 2AA | 2AB | 2CA | 2CB | 2EA | 2EB | 2CE | 2CF | | 09_ ( | 01A | 01B | 03A | 03B | 05A | 05B | 07A | 07B | 05E | 05F | | <b>59</b> _ | 29A | 29B | 2BA | 2BB | 2DA | 2DB | 2FA | 2FB | 2DE | 2DF | | 10_ ( | 080 | 081 | 082 | 083 | 084 | 085 | 086 | 087 | 088 | 089 | | 60_ | 300 | 301 | 302 | 303 | 304 | 305 | 306 | 307 | 308 | 309 | | 11_ ( | 090 | 091 | 092 | 093 | 094 | 095 | 096 | 097 | 098 | 099 | | 61_ | 310 | 311 | 312 | 313 | 314 | 315 | 316 | 317 | 318 | 319 | | 12_ ( | 0A0 | 0A1 | 0A2 | 0A3 | 0A4 | 0A5 | 0A6 | 0A7 | 0A8 | 0A9 | | 62_ | 320 | 321 | 322 | 323 | 324 | 325 | 326 | 327 | 328 | 329 | | 13_ ( | 0B0 | 0B1 | 0B2 | 0B3 | 0B4 | 0B5 | 0B6 | 0B7 | 0B8 | 0B9 | | 63_ | 330 | 331 | 332 | 333 | 334 | 335 | 336 | 337 | 338 | 339 | | 14_ ( | 0C0 | 0C1 | 0C2 | 0C3 | 0C4 | 0C5 | 0C6 | 0C7 | 0C8 | 0C9 | | 64_ | 340 | 341 | 342 | 343 | 344 | 345 | 346 | 347 | 348 | 349 | | 15_ ( | 0D0 | 0D1 | 0D2 | 0D3 | 0D4 | 0D5 | 0D6 | 0D7 | 0D8 | 0D9 | ļ | 65_ | 350 | 351 | 352 | 353 | 354 | 355 | 356 | 357 | 358 | 359 | | 16_ | 0E0 | 0E1 | 0E2 | 0E3 | 0E4 | 0E5 | 0E6 | 0E7 | 0E8 | 0E9 | | 66_ | 360 | 361 | 362 | 363 | 364 | 365 | 366 | 367 | 368 | 369 | | 17_ ( | 0F0 | 0F1 | 0F2 | 0F3 | 0F4 | 0F5 | 0F6 | 0F7 | 0F8 | 0F9 | | 67_ | 370 | 371 | 372 | 373 | 374 | 375 | 376 | 377 | 378 | 379 | | 18_ ( | A80 | 08B | 0AA | 0AB | 0CA | 0CB | 0EA | 0EB | 0CE | 0CF | | 68_ | 30A | 30B | 32A | 32B | 34A | 34B | 36A | 36B | 34E | 34F | | 19_ ( | 09A | 09B | 0BA | 0BB | 0DA | 0DB | 0FA | 0FB | 0DE | 0DF | | 69_ | 31A | 31B | 33A | 33B | 35A | 35B | 37A | 37B | 35E | 35F | | 20_ | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | | 70_ | 380 | 381 | 382 | 383 | 384 | 385 | 386 | 387 | 388 | 389 | | 21_ | 110 | 111 | 112 | 113 | 114 | 115 | 116 | 117 | 118 | 119 | | 71_ | 390 | 391 | 392 | 393 | 394 | 395 | 396 | 397 | 398 | 399 | | 22_ | 120 | 121 | 122 | 123 | 124 | 125 | 126 | 127 | 128 | 129 | | 72_ | 3A0 | 3A1 | 3A2 | 3A3 | 3A4 | 3A5 | 3A6 | 3A7 | 3A8 | 3A9 | | 23_ | 130 | 131 | 132 | 133 | 134 | 135 | 136 | 137 | 138 | 139 | | 73_ | 3B0 | 3B1 | 3B2 | 3B3 | 3B4 | 3B5 | 3B6 | 3B7 | 3B8 | 3B9 | | 24_ | 140 | 141 | 142 | 143 | 144 | 145 | 146 | 147 | 148 | 149 | | 74_ | 3C0 | 3C1 | 3C2 | 3C3 | 3C4 | 3C5 | 3C6 | 3C7 | 3C8 | 3C9 | | 25_ | 150 | 151 | 152 | 153 | 154 | 155 | 156 | 157 | 158 | 159 | - | 75_ | 3D0 | 3D1 | 3D2 | 3D3 | 3D4 | 3D5 | 3D6 | 3D7 | 3D8 | 3D9 | | 26_ | 160 | 161 | 162 | 163 | 164 | 165 | 166 | 167 | 168 | 169 | | 76_ | 3E0 | 3E1 | 3E2 | 3E3 | 3E4 | 3E5 | 3E6 | 3E7 | 3E8 | 3E9 | | 27_ | 170 | 171 | 172 | 173 | 174 | 175 | 176 | 177 | 178 | 179 | | 77_ | 3F0 | 3F1 | 3F2 | 3F3 | 3F4 | 3F5 | 3F6 | 3F7 | 3F8 | 3F9 | | 28_ | 10A | 10B | 12A | 12B | 14A | 14B | 16A | 16B | 14E | 14F | | 78_ | 38A | 38B | заа | зав | 3СА | 3СВ | 3ЕА | 3EB | 3CE | 3CF | | 29_ | 11A | 11B | 13A | 13B | 15A | 15B | 17A | 17B | 15E | 15F | | 79_ | 39A | 39B | зва | 3BB | 3DA | 3DB | 3FA | 3FB | 3DE | 3DF | | 30_ | 180 | 181 | 182 | 183 | 184 | 185 | 186 | 187 | 188 | 189 | | 80_ | 00C | 00D | 10C | 10D | 20C | 20D | 30C | 30D | 02E | 02F | | 31_ | 190 | 191 | 192 | 193 | 194 | 195 | 196 | 197 | 198 | 199 | | 81_ | 01C | 01D | 11C | 11D | 21C | 21D | 31C | 31D | 03E | 03F | | 32_ ′ | 1A0 | 1A1 | 1A2 | 1A3 | 1A4 | 1A5 | 1A6 | 1A7 | 1A8 | 1A9 | | 82_ | 02C | 02D | 12C | 12D | 22C | 22D | 32C | 32D | 12E | 12F | | 33_ | 1B0 | 1B1 | 1B2 | 1B3 | 1B4 | 1B5 | 1B6 | 1B7 | 1B8 | 1B9 | | 83_ | 03C | 03D | 13C | 13D | 23C | 23D | 33C | 33D | 13E | 13F | | 34_ ′ | 1C0 | 1C1 | 1C2 | 1C3 | 1C4 | 1C5 | 1C6 | 1C7 | 1C8 | 1C9 | | 84_ | 04C | 04D | 14C | 14D | 24C | 24D | 34C | 34D | 22E | 22F | | 35_ ′ | 1D0 | 1D1 | 1D2 | 1D3 | 1D4 | 1D5 | 1D6 | 1D7 | 1D8 | 1D9 | ļ | 85_ | 05C | 05D | 15C | 15D | 25C | 25D | 35C | 35D | 23E | 23F | | 36_ | 1E0 | 1E1 | 1E2 | 1E3 | 1E4 | 1E5 | 1E6 | 1E7 | 1E8 | 1E9 | | 86_ | 06C | 06D | 16C | 16D | 26C | 26D | 36C | 36D | 32E | 32F | | 37_ | 1F0 | 1F1 | 1F2 | 1F3 | 1F4 | 1F5 | 1F6 | 1F7 | 1F8 | 1F9 | | <b>87</b> _ | 07C | 07D | 17C | 17D | 27C | 27D | 37C | 37D | 33E | 33F | | 38_ ′ | 18A | 18B | 1AA | 1AB | 1CA | 1CB | 1EA | 1EB | 1CE | 1CF | | 88_ | 00E | 00F | 10E | 10F | 20E | 20F | 30E | 30F | 06E | 06F | | 39_ | 19A | 19B | 1BA | 1BB | 1DA | 1DB | 1FA | 1FB | 1DE | 1DF | | 89_ | 01E | 01F | 11E | 11F | 21E | 21F | 31E | 31F | 07E | 07F | | 40_ 2 | 200 | 201 | 202 | | 204 | | 206 | 207 | 208 | 209 | f | 90_ | | | 18C | | 28C | | | | 0AE | | | | 210 | 211 | 212 | | 214 | | 216 | 217 | | 219 | | 91_ | | | 19C | | | 29D | | | 0BE | 0BF | | | 220 | 221 | 222 | | 224 | 225 | 226 | 227 | 228 | 229 | | 92_ | | | 1AC | | | | | | 1AE | 1AF | | | 230 | 231 | 232 | | 234 | | 236 | 237 | 238 | 239 | | 93_ | | | 1BC | | | | | | | | | 43_ 2 | 240 | 241 | 242 | | 244 | | 246 | 247 | | 249 | | 94_ | | | 1CC | | | | | | | | | | | 251 | 252 | 253 | 254 | 255 | 256 | 257 | 258 | 259 | ļ | 95_ | | | 1DC | | | | | | | | | 44_ 2 | 250 | | | | | | | i | | | | _ | | | | | | | | | | | | 44_ 2<br>45_ 2 | 250<br>260 | 261 | 262 | 263 | 264 | 265 | 266 | 267 | 268 | 269 | , , | 96_ | 0EC | 0ED | 1EC | 1ED | 2EC | 2ED | 3EC | 3ED | 3AE | SAF | | 44_ 2<br>45_ 2<br>46_ 2 | | | | | | | 266<br>276 | 267<br>277 | 268<br>278 | 269<br>279 | | 96_<br>97_ | | | 1EC<br>1FC | | | | | | 3AE<br>3BE | | | 44_ 2<br>45_ 2<br>46_ 2<br>47_ 2 | 260 | 261<br>271 | 262 | 273 | 264<br>274 | | 276 | 277 | | 279 | | _ | 0FC | 0FD | | 1FD | 2FC | 2FD | 3FC | 3FD | | 3BF | | | Table | 12: DF | PD-to-E | 3CD tra | anslatio | on . | | | | | | | | | | | |-------------------|-------------------|-------------------|-------------------|-------------------|-------------------|------------|------------|------------|------------|-------------------|------------|------------|------------|------------|----------------|----------------| | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Α | В | С | D | Е | F | | 00_ | 000 | 001 | 002 | 003 | 004 | 005 | 006 | 007 | 800 | 009 | 080 | 081 | 800 | 801 | 880 | 881 | | 01_ | 010 | 011 | 012 | 013 | 014 | 015 | 016 | 017 | 018 | 019 | 090 | 091 | 810 | 811 | 890 | 891 | | 02_ | 020 | 021 | 022 | 023 | 024 | 025 | 026 | 027 | 028 | 029 | 082 | 083 | 820 | 821 | 808 | 809 | | 03_ | 030 | 031 | 032 | 033 | 034 | 035 | 036 | 037 | 038 | 039 | 092 | 093 | 830 | 831 | 818 | 819 | | 04_ | 040 | 041 | 042 | 043 | 044 | 045 | 046 | 047 | 048 | 049 | 084 | 085 | 840 | 841 | 088 | 089 | | 05_ | 050 | 051 | 052 | 053 | 054 | 055 | 056 | 057 | 058 | 059 | 094 | 095 | 850 | 851 | 098 | 099 | | 06_ | 060 | 061 | 062 | 063 | 064 | 065 | 066 | 067 | 068 | 069 | 086 | 087 | 860 | 861 | 888 | 889 | | 07_ | 070 | 071 | 072 | 073 | 074 | 075 | 076 | 077 | 078 | 079 | 096 | 097 | 870 | 871 | 898 | 899 | | _80 | 100 | 101 | 102 | 103 | 104 | 105<br>115 | 106 | 107 | 108 | 109 | 180 | 181 | 900 | 901 | 980 | 981 | | 09_ | 110 | 111 | 112 | 113 | 114 | | 116 | 117 | 118 | 119 | 190 | 191 | 910 | 911 | 990 | 991 | | 0A_<br>0B | 120<br>130 | 121<br>131 | 122<br>132 | 123<br>133 | 124<br>134 | 125<br>135 | 126<br>136 | 127<br>137 | 128<br>138 | 129<br>139 | 182<br>192 | 183<br>193 | 920<br>930 | 921<br>931 | 908<br>918 | 909<br>919 | | 0C_ | 140 | 141 | 142 | 143 | 144 | 145 | 146 | 147 | 148 | 149 | 184 | 185 | 940 | 941 | 188 | 189 | | 0D_ | 150 | 151 | 152 | 153 | 154 | 155 | 156 | 157 | 158 | 159 | 194 | 195 | 950 | 951 | 198 | 199 | | 0E_ | 160 | 161 | 162 | 163 | 164 | 165 | 166 | 167 | 168 | 169 | 186 | 187 | 960 | 961 | 988 | 989 | | 0F | 170 | 171 | 172 | 173 | 174 | 175 | 176 | 177 | 178 | 179 | 196 | 197 | 970 | 971 | 998 | 999 | | 10_ | 200 | 201 | 202 | 203 | 204 | 205 | 206 | 207 | 208 | 209 | 280 | 281 | 802 | 803 | 882 | 883 | | 11_ | 210 | 211 | 212 | 213 | 214 | 215 | 216 | 217 | 218 | 219 | 290 | 291 | 812 | 813 | 892 | 893 | | 12_ | 220 | 221 | 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | 282 | 283 | 822 | 823 | 828 | 829 | | 13_ | 230 | 231 | 232 | 233 | 234 | 235 | 236 | 237 | 238 | 239 | 292 | 293 | 832 | 833 | 838 | 839 | | 14_ | 240 | 241 | 242 | 243 | 244 | 245 | 246 | 247 | 248 | 249 | 284 | 285 | 842 | 843 | 288 | 289 | | 15_ | 250 | 251 | 252 | 253 | 254 | 255 | 256 | 257 | 258 | 259 | 294 | 295 | 852 | 853 | 298 | 299 | | 16_ | 260 | 261 | 262 | 263 | 264 | 265 | 266 | 267 | 268 | 269 | 286 | 287 | 862 | 863 | (888) | (889) | | 17_ | 270 | 271 | 272 | 273 | 274 | 275 | 276 | 277 | 278 | 279 | 296 | 297 | 872 | 873 | (898) | (899) | | 18_ | 300 | 301 | 302 | 303 | 304 | 305 | 306 | 307 | 308 | 309 | 380 | 381 | 902 | 903 | 982 | 983 | | 19_ | 310 | 311 | 312 | 313<br>323 | 314 | 315<br>325 | 316 | 317 | 318 | 319 | 390 | 391 | 912 | 913<br>923 | 992<br>928 | 993 | | 1A_<br>1B | 320<br>330 | 321<br>331 | 322<br>332 | 333 | 324<br>334 | 335 | 326<br>336 | 327<br>337 | 328<br>338 | 329<br>339 | 382<br>392 | 383<br>393 | 922<br>932 | 933 | 938 | 929<br>939 | | 1C_ | 340 | 341 | 342 | 343 | 344 | 345 | 346 | 347 | 348 | 349 | 384 | 385 | 932 | 933 | 388 | 389 | | 1D_ | 350 | 351 | 352 | 353 | 354 | 355 | 356 | 357 | 358 | 359 | 394 | 395 | 952 | 953 | 398 | 399 | | 1E_ | 360 | 361 | 362 | 363 | 364 | 365 | 366 | 367 | 368 | 369 | 386 | 387 | 962 | 963 | (988) | (989) | | 1F | 370 | 371 | 372 | 373 | 374 | 375 | 376 | 377 | 378 | 379 | 396 | 397 | 972 | 973 | (998) | (999) | | 20_ | 400 | 401 | 402 | 403 | 404 | 405 | 406 | 407 | 408 | 409 | 480 | 481 | 804 | 805 | 884 | 885 | | 21_ | 410 | 411 | 412 | 413 | 414 | 415 | 416 | 417 | 418 | 419 | 490 | 491 | 814 | 815 | 894 | 895 | | 22_ | 420 | 421 | 422 | 423 | 424 | 425 | 426 | 427 | 428 | 429 | 482 | 483 | 824 | 825 | 848 | 849 | | 23_ | 430 | 431 | 432 | 433 | 434 | 435 | 436 | 437 | 438 | 439 | 492 | 493 | 834 | 835 | 858 | 859 | | 24_ | 440 | 441 | 442 | 443 | 444 | 445 | 446 | 447 | 448 | 449 | 484 | 485 | 844 | 845 | 488 | 489 | | 25_ | 450 | 451 | 452 | 453 | 454 | 455 | 456 | 457 | 458 | 459 | 494 | 495 | 854 | 855 | 498 | 499 | | 26_ | 460 | 461 | 462 | 463 | 464 | 465 | 466 | 467 | 468 | 469 | 486 | 487 | 864 | 865 | (888) | (889) | | 27_ | 470 | 471 | 472 | 473 | 474 | 475 | 476 | 477 | 478 | 479 | 496 | 497 | 874 | 875 | (898) | (899) | | 28_<br>29_ | 500<br>510 | 501<br>511 | 502<br>512 | 503<br>513 | 504<br>514 | 505<br>515 | 506<br>516 | 507<br>517 | 508<br>518 | 509<br>519 | 580<br>590 | 581<br>591 | 904<br>914 | 905<br>915 | 984<br>994 | 985<br>995 | | 29_<br>2A_ | 520 | 521 | 522 | 523 | 524 | 525 | 526 | 527 | 528 | 529 | 582 | 583 | 924 | 925 | 948 | 949 | | 2A_<br>2B | 530 | 531 | 532 | 533 | 534 | 535 | 536 | 537 | 538 | 539 | 592 | 593 | 934 | 935 | 958 | 959 | | 2C_ | 540 | 541 | 542 | 543 | 544 | 545 | 546 | 547 | 548 | 549 | 584 | 585 | 944 | 945 | 588 | 589 | | 2D_ | 550 | 551 | 552 | 553 | 554 | 555 | 556 | 557 | 558 | 559 | 594 | 595 | 954 | 955 | 598 | 599 | | 2E_ | 560 | 561 | 562 | 563 | 564 | 565 | 566 | 567 | 568 | 569 | 586 | 587 | 964 | 965 | (988) | (989) | | 2F_ | 570 | 571 | 572 | 573 | 574 | 575 | 576 | 577 | 578 | 579 | 596 | 597 | 974 | 975 | (998) | (999) | | 30_ | 600 | 601 | 602 | 603 | 604 | 605 | 606 | 607 | 608 | 609 | 680 | 681 | 806 | 807 | 886 | 887 | | 31_ | 610 | 611 | 612 | 613 | 614 | 615 | 616 | 617 | 618 | 619 | 690 | 691 | 816 | 817 | 896 | 897 | | 32_ | 620 | 621 | 622 | 623 | 624 | 625 | 626 | 627 | 628 | 629 | 682 | 683 | 826 | 827 | 868 | 869 | | 33_ | 630 | 631 | 632 | 633 | 634 | 635 | 636 | 637 | 638 | 639 | 692 | 693 | 836 | 837 | 878 | 879 | | 34_ | 640 | 641 | 642 | 643 | 644 | 645 | 646 | 647 | 648 | 649 | 684 | 685 | 846 | 847 | 688 | 689 | | 35_<br>36 | 650<br>660 | 651<br>661 | 652<br>662 | 653<br>663 | 654<br>664 | 655<br>665 | 656<br>666 | 657<br>667 | 658<br>668 | 659<br>669 | 694<br>686 | 695<br>687 | 856<br>866 | 857<br>867 | 698 | 699<br>(880) | | 36_<br>37_ | 670 | 671 | 672 | 663<br>673 | 674 | 675 | 676 | 677 | 678 | 669<br>679 | 696 | 687<br>697 | 876 | 877 | (888)<br>(898) | (889)<br>(899) | | 38_ | 700 | 701 | 702 | 703 | 704 | 705 | 706 | 707 | 708 | 709 | 780 | 781 | 906 | 907 | 986 | 987 | | 39_ | 710 | 711 | 712 | 713 | 714 | 715 | 716 | 717 | 718 | 719 | 790 | 791 | 916 | 917 | 996 | 997 | | 3A_ | 720 | 721 | 722 | 723 | 724 | 725 | 726 | 727 | 728 | 729 | 782 | 783 | 926 | 927 | 968 | 969 | | 3B_ | 730 | 731 | 732 | 733 | 734 | 735 | 736 | 737 | 738 | 739 | 792 | 793 | 936 | 937 | 978 | 979 | | 3C_ | 740 | 741 | 742 | 743 | 744 | 745 | 746 | 747 | 748 | 749 | 784 | 785 | 946 | 947 | 788 | 789 | | | | Ī | | 750 | 754 | 755 | 756 | 757 | 758 | 750 | 794 | 795 | 956 | 057 | | 799 | | 3D_ | 750 | 751 | 752 | 753 | 754 | 755 | 750 | 757 | 730 | 759 | 794 | 795 | 930 | 957 | 798 | 799 | | 3D_<br>3E_<br>3F_ | 750<br>760<br>770 | 751<br>761<br>771 | 752<br>762<br>772 | 753<br>763<br>773 | 754<br>764<br>774 | 765<br>775 | 766<br>776 | 767<br>777 | 768<br>778 | 769<br>769<br>779 | 786<br>796 | 787<br>797 | 966<br>976 | 967<br>977 | (988)<br>(998) | (989)<br>(999) | #### Appendix B. Vector RTL Functions [Category: Vector] ``` ConvertSPtoSXWsaturate( X, Y ) = X<sub>0</sub> sign exp_{0:7} = X_{1:8} frac_{0:30} = X_{9:31} \parallel 0b0000_0000 if((exp==255)&(frac!=0)) then return(0x0000_0000) // NaN operand if((exp==255)\&(frac==0)) then do // infinity operand VSCR_{SAT} = 1 return( (sign==1) ? 0x8000 0000 : 0x7FFF FFFF ) if((exp+Y-127)>30) then do // large operand VSCR_{SAT} = 1 return( (sign==1) ? 0x8000_0000 : 0x7FFF_FFFF ) if((exp+Y-127)<0) then return(0x0\overline{0}0\_0000) // -\overline{1}.0 < value < 1.0 (value rounds to 0) significand_{0:31} = 0b1 \parallel frac do i=1 to 31-(exp+Y-127) significand = significand >>ui 1 return((sign==0) ? significand : (¬significand + 1) ) ConvertSPtoUXWsaturate( X, Y ) sign = X_0 \exp_{0:7} = X_{1:8} frac_{0.30} = X_{9.31} \parallel 0b0000\_0000 if((exp==255)&&(frac!=0)) then return(0x0000 0000) // NaN operand if((exp==255)&&(frac==0)) then do // infinity operand return( (sign==1) ? 0x0000 0000 : 0xFFFF FFFF ) if((exp+Y-127)>31) then do // large operand VSCR_{SAT} = 1 return( (sign==1) ? 0x0000_0000 : 0xFFFF_FFFF ) if((exp+Y-127)<0) then return(0x0000_0000) // -1.0 < value < 1.0 // value rounds to 0 if ( sign==1 ) then do // negative operand VSCR_{SAT} = 1 return(0x0000_0000) significand_{0:31} = 0b1 \parallel frac do i=1 to 31-(exp+Y-127) significand = significand >>,, 1 return( significand ) ConvertSXWtoSP( X ) sign = X_0 \exp_{0:7} = 32 + 127 frac_{0:32} = X_0 \parallel X_{0:31} if( frac==0 ) return( 0x0000_0000 ) // Zero operand if ( sign==1 ) then frac = \neg frac + 1 do while (frac_0 == 0) frac = frac << 1 exp = exp - 1 lsb = frac_{23} gbit = frac_{24} xbit = frac_{25:32}! = 0 inc = ( lsb && gbit ) | ( gbit && xbit ) frac_{0:23} = frac_{0:23} + inc if( carry_out==1 ) exp = exp + 1 \texttt{return(sign} \parallel \texttt{exp} \parallel \texttt{frac}_{1:23} ) ``` #### Appendix C. Embedded Floating-Point RTL Functions [Category: SPE.Embedded Float Scalar Double] [Category: SPE.Embedded Float Scalar Single] [Category: SPE.Embedded Float Vector] #### **C.1 Common Functions** ``` // Check if 32-bit fp value is a NaN or Infinity Isa32NaNorInfinity(fp) return (fp_{exp} = 255) Isa32NaN(fp) return ((fp<sub>exp</sub> = 255) & (fp<sub>frac</sub> \neq 0)) // Check if 32-bit fp value is denormalized Isa32Denorm(fp) return ((fp_{exp} = 0) \& (fp_{frac} \neq 0)) // Check if 64-bit fp value is a NaN or Infinity Isa64NaNorInfinity(fp) return (fp_{exp} = 2047) Isa64NaN(fp) return ((fp_{exp} = 2047) \& (fp_{frac} \neq 0)) // Check if 32-bit fp value is denormalized Isa64Denorm(fp) return ((fp_{exp} = 0) & (fp_{frac} \neq 0)) // Signal an error in the SPEFSCR SignalFPError(upper_lower, bits) if (upper_lower = HI) then bits ← bits << 15 SPEFSCR ← SPEFSCR | bits bits \leftarrow (FG | FX) if (upper_lower = HI) then bits ← bits << 15 SPEFSCR ← SPEFSCR & ¬bits ``` ``` // Round a 32-bit fp result Round32(fp, guard, sticky) FP32format fp; if (SPEFSCR_{FTNXE} = 0) then // nearest if (SPEFSCR_{FRMC} = 0b00) then if (guard) then if (sticky \mid fp_{frac[22]}) then v_{0:23} \leftarrow fp_{frac} + 1 if v_0 then if (fp_{exp} >= 254) then // overflow fp \leftarrow fp<sub>sign</sub> || 0b111111110 || ^{23}1 fp_{exp} \leftarrow fp_{exp} + 1 fp_{frac} \leftarrow v_{1:23} else \texttt{fp}_{\texttt{frac}} \leftarrow \texttt{v}_{1:23} else if ((SPEFSCR_{FRMC} & 0b10) = 0b10) then // infinity modes // implementation dependent return fp // Round a 64-bit fp result Round64(fp, guard, sticky) FP32format fp; if (SPEFSCR_{FINXE} = 0) then if (SPEFSCR_{FRMC} = 0b00) then // nearest if (guard) then if (sticky \mid fp_{frac[51]}) then v_{0:52} \leftarrow fp_{frac} + 1 if v_0 then if (fp_{exp} >= 2046) then // overflow fp \leftarrow fp_{sign} \mid \mid 0b11111111110 || <sup>52</sup>1 fp_{exp} \leftarrow fp_{exp} + 1 fp_{frac} \leftarrow v_{1:52} else fp_{frac} \leftarrow v_{1:52} else if ((SPEFSCR_{FRMC} & 0b10) = 0b10) then // infinity modes // implementation dependent return fp ``` ## C.2 Convert from Single-Precision Embedded Floating-Point to Integer Word with Saturation ``` // Convert 32-bit Floating-Point to 32-bit integer // or fractional signed = S (signed) or U (unsigned) upper lower = HI (high word) or LO (low word) round = RND (round) or ZER (truncate) fractional = F (fractional) or I (integer) CnvtFP32ToI32Sat(fp, signed, upper lower, round, fractional) FP32format fp; if (Isa32NaNorInfinity(fp)) then SignalFPError(upper_lower, FINV) if (Isa32NaN(fp)) then return 0x00000000 // all NaNs if (signed = S) then if (fp_{sign} = 1) then return 0x80000000 else return 0x7fffffff else if (fp_{sign} = 1) then return 0x00000000 return 0xffffffff if (Isa32Denorm(fp)) then SignalFPError(upper_lower, FINV) // regardless of sign return 0x00000000 if ((signed = U) & (fp_{sign} = 1)) then SignalFPError(upper_lower, FOVF) // overflow return 0x00000000 if ((fp_{exp} = 0) & (fp_{frac} = 0)) then return 0x00000000 // all zero values if (fractional = I) then // convert to integer max exp ← 158 shi\overline{f}t \leftarrow 158 - fp_{exp} if (signed = S) then if ((fp_{exp}\neq158) | (fp_{frac}\neq0) | (fp_{siqn}\neq1)) then max exp ← max exp - 1 else // fractional conversion max_exp \leftarrow 126 shi\overline{f}t \leftarrow 126 - fp_{exp} if (signed = S) then shift ← shift + 1 if (fp_{exp} > max_{exp}) then SignalFPError(upper lower, FOVF) // overflow if (signed = S) then if (fp_{sign} = 1) then return 0x80000000 else return 0x7fffffff else return Oxfffffff result \leftarrow 0b1 || fp<sub>frac</sub> || 0b00000000 // add U bit quard ← 0 sticky ← 0 for (n \leftarrow 0; n < shift; n \leftarrow n + 1) do sticky ← sticky | guard ``` ``` guard ← result & 0x00000001 result ← result > 1 // Report sticky and guard bits if (upper_lower = HI) then \texttt{SPEFSCR}_{\texttt{FGH}} \leftarrow \texttt{guard} SPEFSCR_{FXH} \leftarrow sticky SPEFSCR_{FG} \leftarrow guard SPEFSCR_{FX} \leftarrow sticky if (guard | sticky) then SPEFSCR_{FINXS} \leftarrow 1 // Round the integer result if ((round = RND) & (SPEFSCR<sub>FINXE</sub> = 0)) then if (SPEFSCR<sub>FRMC</sub> = 0b00) then // nearest if (guard) then if (sticky | (result & 0x00000001)) then result ← result + 1 else if ((SPEFSCR_{FRMC} & 0b10) = 0b10) then // infinity modes // implementation dependent if (signed = S) then if (fp_{sign} = 1) then result \leftarrow \neg result + 1 return result ``` ## C.3 Convert from Double-Precision Embedded Floating-Point to Integer Word with Saturation ``` // Convert 64-bit Floating-Point to 32-bit integer // or fractional // signed = S (signed) or U (unsigned) round = RND (round) or ZER (truncate) fractional = F (fractional) or I (integer) CnvtFP64ToI32Sat(fp, signed, round, fractional) FP64format fp; if (Isa64NaNorInfinity(fp)) then SignalFPError(LO, FINV) if (Isa64NaN(fp)) then return 0x00000000 // all NaNs if (signed = S) then if (fp_{sign} = 1) then return 0x80000000 return 0x7fffffff else if (fp_{sign} = 1) then return 0x00000000 else return 0xffffffff if (Isa64Denorm(fp)) then SignalFPError(LO, FINV) return 0x00000000 // regardless of sign if ((signed = U) & (fp<sub>sign</sub> = 1)) then SignalFPError(LO, FOVF) // overflow return 0x00000000 if ((fp_{exp} = 0) & (fp_{frac} = 0)) then return 0x00000000 // all zero values if (fractional = I) then // convert to integer max exp \leftarrow 1054 shi\overline{f}t \leftarrow 1054 - fp_{exp} if (signed ← S) then if ((fp_{exp}\neq1054) | (fp_{frac}\neq0) | (fp_{siqn}\neq1)) then \max \exp \leftarrow \max \exp - 1 // fractional conversion else max_exp \leftarrow 1022 shift ← 1022 - fp<sub>exp</sub> if (signed = S) then shift \leftarrow shift + 1 if (fp_{exp} > max_{exp}) then SignalFPError(LO, FOVF) // overflow if (signed = S) then if (fp_{sign} = 1) then return 0x80000000 else return 0x7fffffff else return Oxfffffff result \leftarrow 0b1 || fp_{frac[0:30]} // add U to frac guard ← fp<sub>frac[31]</sub> sticky \leftarrow (fp<sub>frac[32:63]</sub> \neq 0) for (n \leftarrow 0; n < shift; n \leftarrow n + 1) do sticky ← sticky | quard ``` ``` guard ← result & 0x00000001 result ← result > 1 // Report sticky and guard bits SPEFSCR_{FG} \leftarrow guard SPEFSCR_{FX} \leftarrow sticky if (guard | sticky) then SPEFSCR_{FINXS} \leftarrow 1 // Round the result if ((round = RND) & (SPEFSCR<sub>FINXE</sub> = 0)) then if (SPEFSCR_{FRMC} = 0b00) then // nearest if (guard) then if (sticky | (result & 0x00000001)) then result ← result + 1 else if ((SPEFSCR_{FRMC} \& 0b10) = 0b10) then // infinity modes // implementation dependent if (signed = S) then if (fp_{sign} = 1) then result \leftarrow \neg result + 1 return result ``` # C.4 Convert from Double-Precision Embedded Floating-Point to Integer Doubleword with Saturation ``` // Convert 64-bit Floating-Point to 64-bit integer signed = S (signed) or U (unsigned) round = RND (round) or ZER (truncate) CnvtFP64ToI64Sat(fp, signed, round) FP64format fp; if (Isa64NaNorInfinity(fp)) then SignalFPError(LO, FINV) if (Isa64NaN(fp)) then // all NaNs return 0x00000000 00000000 if (signed = S) then if (fp_{sign} = 1) then return 0x80000000 00000000 else return 0x7fffffff_fffffff else if (fp_{sign} = 1) then return 0x00000000_00000000 else return Oxffffffff ffffffff if (Isa64Denorm(fp)) then SignalFPError(LO, FINV) return 0x00000000 00000000 if ((signed = U) & (fp_{sign} = 1)) then SignalFPError(LO, FOVF) // overflow return 0x00000000_00000000 if ((fp_{exp} = 0) & (fp_{frac} = 0)) then return 0x00000000 00000000 // all zero values max_exp ← 1086 shift ← 1086 - fp<sub>exp</sub> if (signed = S) then if ((fp_{exp}\neq 1086) | (fp_{frac}\neq 0) | (fp_{sign}\neq 1)) then max_exp \leftarrow max_exp - 1 if (fp_{exp} > max_{exp}) then SignalFPError(LO, FOVF) // overflow if (signed = S) then if (fp_{sign} = 1) then return 0x80000000 00000000 return 0x7fffffff ffffffff else return 0xffffffff_fffffff result \leftarrow 0b1 || fp<sub>frac</sub> || 0b00000000000 //add U bit guard ← 0 sticky ← 0 for (n \leftarrow 0; n < \text{shift}; n \leftarrow n + 1) do sticky ← sticky | guard guard ← result & 0x00000000 00000001 result ← result > 1 // Report sticky and guard bits \begin{array}{l} \mathtt{SPEFSCR}_{\mathtt{FG}} \; \leftarrow \; \mathtt{guard} \\ \mathtt{SPEFSCR}_{\mathtt{FX}} \; \leftarrow \; \mathtt{sticky} \end{array} ``` ``` if (guard | sticky) then \texttt{SPEFSCR}_{\texttt{FINXS}} \; \leftarrow \; \texttt{1} // Round the result if ((round = RND) & (SPEFSCR_{FINXE} = 0)) then if (SPEFSCR_{FRMC} = 0b00) then // nearest if (quard) then if (sticky | (result&0x00000000_00000001)) result \leftarrow result + 1 else if ((SPEFSCR_{FRMC} & 0b10) = 0b10) then // infinity modes // implementation dependent if (signed = S) then if (fp_{sign} = 1) then result \leftarrow \neg result + 1 return result ``` #### C.5 Convert to Single-Precision Embedded Floating-Point from Integer Word ``` // Convert from 32-bit integer or fractional to // 32-bit Floating-Point // signed = S (signed) or U (unsigned) // round = RND (round) or ZER (truncate) // fractional = F (fractional) or I (integer) CnvtI32ToFP32(v, signed, upper_lower, fractional) FP32format result; \begin{array}{l} \text{result}_{\text{sign}} \leftarrow \text{0} \\ \text{if (v = 0) then} \end{array} result ← 0 if (upper_lower = HI) then \texttt{SPEFSCR}_{\texttt{FGH}} \; \leftarrow \; \texttt{0} SPEFSCR<sub>FXH</sub> ← 0 SPEFSCR_{FG} \leftarrow 0 SPEFSCR_{FX} \leftarrow 0 else if (signed = S) then if (v_0 = 1) then v ← ¬v + 1 result_{sign} \leftarrow 1 // frac bit align if (fractional = F) then maxexp ← 127 if (signed = U) then maxexp ← maxexp - 1 else // integer bit alignment maxexp ← 158 sc ← 0 while (v_0 = 0) v ← v << 1 sc \leftarrow sc + 1 v_0 \leftarrow 0 // clear U bit result_{exp} \leftarrow maxexp - sc \texttt{guard} \leftarrow \mathsf{v}_{24} sticky \leftarrow (v_{25:31} \neq 0) // Report sticky and quard bits if (upper lower = HI) then SPEFSCR_{FGH} \leftarrow guard SPEFSCR_{FXH} \leftarrow sticky else SPEFSCR_{FG} \leftarrow guard SPEFSCR_{FX} \leftarrow sticky if (guard | sticky) then SPEFSCR<sub>FINXS</sub> ← 1 // Round the result result_{frac} \leftarrow v_{1:23} result ← Round32(result, guard, sticky) return result ``` #### C.6 Convert to Double-Precision Embedded Floating-Point from Integer Word ``` // Convert from integer or fractional to 64 bit // Floating-Point // signed = S (signed) or U (unsigned) // fractional = F (fractional) or I (integer) CnvtI32ToFP64(v, signed, fractional) FP64format result; result_{sign} \leftarrow 0 if (v = 0) then result ← 0 SPEFSCR_{FG} \leftarrow 0 SPEFSCR_{FX} \leftarrow 0 else if (signed = S) then if (v_0 = 1) then v \leftarrow \neg v + 1 \begin{aligned} & \text{result}_{\text{sign}} \leftarrow 1 \\ \text{if (fractional = F) then} \end{aligned} // frac bit align maxexp ← 1023 if (signed = U) then maxexp ← maxexp - 1 maxexp ← 1054 // integer bit align sc ← 0 while (v_0 = 0) v ← v << 1 sc ← sc + 1 v_0 \leftarrow 0 // clear U bit result_{exp} \leftarrow maxexp - sc // Report sticky and guard bits SPEFSCR_{FG} \leftarrow 0 SPEFSCR_{FX} \leftarrow 0 result_{frac} \leftarrow v_{1:31} \mid \mid ^{21}0 return result ``` #### C.7 Convert to Double-Precision Embedded Floating-Point from Integer Doubleword ``` // Convert from 64-bit integer to 64-bit // floating-point // signed = S (signed) or U (unsigned) CnvtI64ToFP64(v, signed) FP64format result; \begin{aligned} \text{result}_{\text{sign}} &\leftarrow \text{0} \\ \text{if } (\text{v = 0}) \text{ then} \end{aligned} result \leftarrow 0 \text{SPEFSCR}_{\text{FG}} \leftarrow \text{ 0} SPEFSCR_{FX} \leftarrow 0 else if (signed = S) then if (v_0 = 1) then v ← ¬v + 1 result_{sign} \leftarrow 1 maxexp ← 1054 sc ← 0 while (v_0 = 0) v ← v << 1 sc ← sc + 1 guard ← v<sub>53</sub> sticky \leftarrow (v_{54:63} \neq 0) // Report sticky and guard bits \texttt{SPEFSCR}_{\texttt{FG}} \leftarrow \texttt{guard} SPEFSCR<sub>FX</sub> ← sticky if (guard | sticky) then SPEFSCR_{FINXS} \leftarrow 1 // Round the result \begin{aligned} & \text{result}_{\text{frac}} \leftarrow \mathbf{v}_{\text{1:52}} \\ & \text{result} \leftarrow \text{Round64} (\text{result, guard, sticky}) \end{aligned} return result ``` #### **Appendix D. Assembler Extended Mnemonics** In order to make assembler language programs simpler to write and easier to understand, a set of extended mnemonics and symbols is provided that defines simple shorthand for the most frequently used forms of *Branch Conditional*, *Compare*, *Trap*, *Rotate and Shift*, and certain other instructions. Assemblers should provide the extended mnemonics and symbols listed here, and may provide others. #### **D.1 Symbols** The following symbols are defined for use in instructions (basic or extended mnemonics) that specify a Condition Register field or a Condition Register bit. The first five (lt, ..., un) identify a bit number within a CR field. The remainder (cr0, ..., cr7) identify a CR field. An expression in which a CR field symbol is multiplied by 4 and then added to a bit-number-within-CR-field symbol and 32 can be used to identify a CR bit. | Symbol | Value | Meaning | |--------|-------|---------------------------------------------| | lt | 0 | Less than | | gt | 1 | Greater than | | eq | 2 | Equal | | so | 3 | Summary overflow | | un | 3 | Unordered (after floating-point comparison) | | cr0 | 0 | CR Field 0 | | cr1 | 1 | CR Field 1 | | cr2 | 2 | CR Field 2 | | cr3 | 3 | CR Field 3 | | cr4 | 4 | CR Field 4 | | cr5 | 5 | CR Field 5 | | cr6 | 6 | CR Field 6 | | cr7 | 7 | CR Field 7 | The extended mnemonics in Sections D.2.2 and D.3 require identification of a CR bit: if one of the CR field symbols is used, it must be multiplied by 4 and added to a bit-number-within-CR-field (value in the range 0-3, explicit or symbolic) and 32. The extended mnemonics in Sections D.2.3 and D.5 require identification of a CR field: if one of the CR field symbols is used, it must *not* be multiplied by 4 or added to 32. (For the extended mnemonics in Section D.2.3, the bit number within the CR field is part of the extended mnemonic. The programmer identifies the CR field, and the Assembler does the multiplication and addition required to produce a CR bit number for the BI field of the underlying basic mnemonic.) #### **D.2 Branch Mnemonics** The mnemonics discussed in this section are variations of the Branch Conditional instructions. **Note:** *bcIr*, *bcIrI*, *bcctr*, and *bcctrI* each serve as both a basic and an extended mnemonic. The Assembler will recognize a *bcIr*, *bcIrI*, *bcctr*, or *bcctrI* mnemonic with three operands as the basic form, and a *bcIr*, *bcIrI*, *bcctr*, or *bcctrI* mnemonic with two operands as the extended form. In the extended form the BH operand is omitted and assumed to be 0b00. Similarly, for all the extended mnemonics described in Sections D.2.2 - D.2.4 that devolve to any of these four basic mnemonics the BH operand can either be coded or omitted. If it is omitted it is assumed to be 0b00. #### D.2.1 BO and BI Fields The 5-bit BO and BI fields control whether the branch is taken. Providing an extended mnemonic for every possible combination of these fields would be neither useful nor practical. The mnemonics described in Sections D.2.2 - D.2.4 include the most useful cases. Other cases can be coded using a basic *Branch Conditional* mnemonic (**bc[I][a]**, **bclr[I]**) with the appropriate operands. #### **D.2.2 Simple Branch Mnemonics** Instructions using one of the mnemonics in Table 13 that tests a Condition Register bit specify the corresponding bit as the first operand. The symbols defined in Section D.1 can be used in this operand. Notice that there are no extended mnemonics for relative and absolute unconditional branches. For these the basic mnemonics **b**, **ba**, **bl**, and **bla** should be used. | Table 13: Simple branch mnemonics | | | | | | | | | |--------------------------------------------------------------|----------------|------------------------|----------------------|-----------------|------------------------|-------------------------|-----------------------|------------------| | | | LR not | Set | | | LR S | et | | | Branch Semantics | bc<br>Relative | <i>bca</i><br>Absolute | <i>bclr</i><br>To LR | bcctr<br>To CTR | <i>bcl</i><br>Relative | <i>bcla</i><br>Absolute | <i>bclrl</i><br>To LR | bcctrl<br>To CTR | | Branch unconditionally | - | - | blr | bctr | - | - | blrl | bctrl | | Branch if CR <sub>BI</sub> =1 | bt | bta | btlr | btctr | btl | btla | btlrl | btctrl | | Branch if CR <sub>BI</sub> =0 | bf | bfa | bflr | bfctr | bfl | bfla | bflrl | bfctrl | | Decrement CTR, branch if CTR nonzero | bdnz | bdnza | bdnzlr | - | bdnzl | bdnzla | bdnzlrl | - | | Decrement CTR, branch if CTR nonzero and CR <sub>BI</sub> =1 | bdnzt | bdnzta | bdnztlr | - | bdnztl | bdnztla | bdnztlrl | - | | Decrement CTR, branch if CTR nonzero and CR <sub>BI</sub> =0 | bdnzf | bdnzfa | bdnzflr | - | bdnzfl | bdnzfla | bdnzflrl | - | | Decrement CTR, branch if CTR zero | bdz | bdza | bdzlr | - | bdzl | bdzla | bdzlrl | - | | Decrement CTR, branch if CTR zero and CR <sub>BI</sub> =1 | bdzt | bdzta | bdztlr | - | bdztl | bdztla | bdztlrl | - | | Decrement CTR, branch if CTR zero and CR <sub>BI</sub> =0 | bdzf | bdzfa | bdzflr | - | bdzfl | bdzfla | bdzflrl | - | #### **Examples** 1. Decrement CTR and branch if it is still nonzero (closure of a loop controlled by a count loaded into CTR). bdnz target (equivalent to: bc 16,0,target) 2. Same as (1) but branch only if CTR is nonzero and condition in CR0 is "equal". bdnzt eq,target (equivalent to: bc 8,2,target) 3. Same as (2), but "equal" condition is in CR5. bdnzt 4xcr5+eq,target (equivalent to: bc 8,22,target) 4. Branch if bit 59 of CR is 0. bf 27,target (equivalent to: bc 4,27,target) 5. Same as (4), but set the Link Register. This is a form of conditional "call". bfl 27,target (equivalent to: bcl 4,27,target) #### **D.2.3** Branch Mnemonics Incorporating Conditions In the mnemonics defined in Table 14, the test of a bit in a Condition Register field is encoded in the mnemonic. Instructions using the mnemonics in Table 14 specify the CR field as an optional first operand. One of the CR field symbols defined in Section D.1 can be used for this operand. If the CR field being tested is CR Field 0, this operand need not be specified unless the resulting basic mnemonic is **bclr[I]** and the BH operand is specified. A standard set of codes has been adopted for the most common combinations of branch conditions. | Code | Meaning | |------|-------------------------------------------------| | lt | Less than | | le | Less than or equal | | eq | Equal | | ge | Greater than or equal | | gt | Greater than | | nl | Not less than | | ne | Not equal | | ng | Not greater than | | SO | Summary overflow | | ns | Not summary overflow | | un | Unordered (after floating-point comparison) | | nu | Not unordered (after floating-point comparison) | | | | These codes are reflected in the mnemonics shown in Table 14. | Table 14: Branch mnemonics incorporating conditions | | | | | | | | | | |-----------------------------------------------------|----------------|------------------------|----------------------|-----------------|------------------------|-------------------------|-----------------------|------------------|--| | | | LR not | Set | | | LR S | et | | | | Branch Semantics | bc<br>Relative | <i>bca</i><br>Absolute | <i>bclr</i><br>To LR | bcctr<br>To CTR | <i>bcl</i><br>Relative | <i>bcla</i><br>Absolute | <i>bclrl</i><br>To LR | bcctrl<br>To CTR | | | Branch if less than | blt | blta | bltlr | bltctr | bltl | bltla | bltlrl | bltctrl | | | Branch if less than or equal | ble | blea | blelr | blectr | blel | blela | blelrl | blectrl | | | Branch if equal | beq | beqa | beqlr | beqctr | beql | beqla | beqlrl | beqctrl | | | Branch if greater than or equal | bge | bgea | bgelr | bgectr | bgel | bgela | bgelrl | bgectrl | | | Branch if greater than | bgt | bgta | bgtlr | bgtctr | bgtl | bgtla | bgtlrl | bgtctrl | | | Branch if not less than | bnl | bnla | bnllr | bnlctr | bnll | bnlla | bnllrl | bnlctrl | | | Branch if not equal | bne | bnea | bnelr | bnectr | bnel | bnela | bnelrl | bnectrl | | | Branch if not greater than | bng | bnga | bnglr | bngctr | bngl | bngla | bnglrl | bngctrl | | | Branch if summary overflow | bso | bsoa | bsolr | bsoctr | bsol | bsola | bsolrl | bsoctrl | | | Branch if not summary overflow | bns | bnsa | bnslr | bnsctr | bnsl | bnsla | bnslrl | bnsctrl | | | Branch if unordered | bun | buna | bunlr | bunctr | bunl | bunla | bunlrl | bunctrl | | | Branch if not unordered | bnu | bnua | bnulr | bnuctr | bnul | bnula | bnulrl | bnuctrl | | #### **Examples** 1. Branch if CR0 reflects condition "not equal". bne target (equivalent to: bc 4,2,target) 2. Same as (1), but condition is in CR3. #### Version 2.05 bne cr3,target (equivalent to: bc 4,14,target) 3. Branch to an absolute target if CR4 specifies "greater than", setting the Link Register. This is a form of conditional "call". bgtla cr4,target (equivalent to: bcla 12,17,target) 4. Same as (3), but target address is in the Count Register. bgtctrl cr4 (equivalent to: bcctrl 12,17,0) #### D.2.4 Branch Prediction Software can use the "at" bits of *Branch Conditional* instructions to provide a hint to the processor about the behavior of the branch. If, for a given such instruction, the branch is almost always taken or almost always not taken, a suffix can be added to the mnemonic indicating the value to be used for the "at" bits. - + Predict branch to be taken (at=0b11) - Predict branch not to be taken (at=0b10) Such a suffix can be added to any *Branch Conditional* mnemonic, either basic or extended, that tests either the Count Register or a CR bit (but not both). Assemblers should use 0b00 as the default value for the "at" bits, indicating that software has offered no prediction. #### **Examples** 1. Branch if CR0 reflects condition "less than", specifying that the branch should be predicted to be taken. blt+ target 2. Same as (1), but target address is in the Link Register and the branch should be predicted not to be taken. bltlr- #### **D.3 Condition Register Logical Mnemonics** The Condition Register Logical instructions can be used to set (to 1), clear (to 0), copy, or invert a given Condition Register bit. Extended mnemonics are provided that allow these operations to be coded easily. | Table 15: Condition Register logical mnemonics | | | | | | | | |------------------------------------------------|-------------------|----------------|--|--|--|--|--| | Operation | Extended Mnemonic | Equivalent to | | | | | | | Condition Register set | crset bx | creqv bx,bx,bx | | | | | | | Condition Register clear | crclr bx | crxor bx,bx,bx | | | | | | | Condition Register move | crmove bx,by | cror bx,by,by | | | | | | | Condition Register not | crnot bx,by | crnor bx,by,by | | | | | | The symbols defined in Section D.1 can be used to identify the Condition Register bits. #### **Examples** 1. Set CR bit 57. crset 25 (equivalent to: creqv 25,25,25) 2. Clear the SO bit of CR0. crclr so (equivalent to: crxor 3,3,3) 3. Same as (2), but SO bit to be cleared is in CR3. crclr 4xcr3+so (equivalent to: crxor 15,15,15) 4. Invert the EQ bit. crnot eq,eq (equivalent to: crnor 2,2,2) 5. Same as (4), but EQ bit to be inverted is in CR4, and the result is to be placed into the EQ bit of CR5. crnot 4xcr5+eq,4xcr4+eq (equivalent to: crnor 22,18,18) #### **D.4 Subtract Mnemonics** #### D.4.1 Subtract Immediate Although there is no "Subtract Immediate" instruction, its effect can be achieved by using an Add Immediate instruction with the immediate operand negated. Extended mnemonics are provided that include this negation, making the intent of the computation clearer. | subi | Rx,Ry,value | (equivalent to: | addi | Rx,Ry,-value) | |--------|-------------|-----------------|--------|---------------| | subis | Rx,Ry,value | (equivalent to: | addis | Rx,Ry,-value) | | subic | Rx,Ry,value | (equivalent to: | addic | Rx,Ry,-value) | | subic. | Rx,Ry,value | (equivalent to: | addic. | Rx,Ry,-value) | #### D.4.2 Subtract The Subtract From instructions subtract the second operand (RA) from the third (RB). Extended mnemonics are provided that use the more "normal" order, in which the third operand is subtracted from the second. Both these mnemonics can be coded with a final "o" and/or "." to cause the OE and/or Rc bit to be set in the underlying instruction. sub Rx,Ry,Rz (equivalent to: subf Rx,Rz,Ry) subc Rx,Ry,Rz (equivalent to: subfc Rx,Rz,Ry) #### **D.5 Compare Mnemonics** The L field in the fixed-point *Compare* instructions controls whether the operands are treated as 64-bit quantities or as 32-bit quantities. Extended mnemonics are provided that represent the L value in the mnemonic rather than requiring it to be coded as a numeric operand. The BF field can be omitted if the result of the comparison is to be placed into CR Field 0. Otherwise the target CR field must be specified as the first operand. One of the CR field symbols defined in Section D.1 can be used for this operand. **Note:** The basic *Compare* mnemonics of Power ISA are the same as those of POWER, but the POWER instructions have three operands while the Power ISA instructions have four. The Assembler will recognize a basic *Compare* mnemonic with three operands as the POWER form, and will generate the instruction with L=0. (Thus the Assembler must require that the BF field, which normally can be omitted when CR Field 0 is the target, be specified explicitly if L is.) #### **D.5.1** Doubleword Comparisons | Table 16: Doubleword compare mnemonics | | | | | | | | |----------------------------------------|-------------------|------------------|--|--|--|--|--| | Operation | Extended Mnemonic | Equivalent to | | | | | | | Compare doubleword immediate | cmpdi bf,ra,si | cmpi bf,1,ra,si | | | | | | | Compare doubleword | cmpd bf,ra,rb | cmp bf,1,ra,rb | | | | | | | Compare logical doubleword immediate | cmpldi bf,ra,ui | cmpli bf,1,ra,ui | | | | | | | Compare logical doubleword | cmpld bf,ra,rb | cmpl bf,1,ra,rb | | | | | | #### **Examples** 1. Compare register Rx and immediate value 100 as unsigned 64-bit integers and place result into CR0. cmpldi Rx,100 (equivalent to: cmpli 0,1,Rx,100) 2. Same as (1), but place result into CR4. cmpldi cr4,Rx,100 (equivalent to: cmpli 4,1,Rx,100) 3. Compare registers Rx and Ry as signed 64-bit integers and place result into CR0. cmpd Rx,Ry (equivalent to: cmp 0,1,Rx,Ry) #### **D.5.2 Word Comparisons** | Table 17: Word compare mnemonics | | | | |----------------------------------|-------------------|------------------|--| | Operation | Extended Mnemonic | Equivalent to | | | Compare word immediate | cmpwi bf,ra,si | cmpi bf,0,ra,si | | | Compare word | cmpw bf,ra,rb | cmp bf,0,ra,rb | | | Compare logical word immediate | cmplwi bf,ra,ui | cmpli bf,0,ra,ui | | | Compare logical word | cmplw bf,ra,rb | cmpl bf,0,ra,rb | | #### **Examples** 1. Compare bits 32:63 of register Rx and immediate value 100 as signed 32-bit integers and place result into CR0. cmpwi Rx,100 (equivalent to: cmpi 0,0,Rx,100) 2. Same as (1), but place result into CR4. cmpwi cr4,Rx,100 (equivalent to: cmpi 4,0,Rx,100) 3. Compare bits 32:63 of registers Rx and Ry as unsigned 32-bit integers and place result into CR0. cmplw Rx,Ry (equivalent to: cmpl 0,0,Rx,Ry) ### **D.6 Trap Mnemonics** The mnemonics defined in Table 18 are variations of the *Trap* instructions, with the most useful values of TO represented in the mnemonic rather than specified as a numeric operand. A standard set of codes has been adopted for the most common combinations of trap conditions. | Code | Meaning | TO encoding | < | > | = | <u< th=""><th>&gt;u</th></u<> | >u | |--------|---------------------------------|-------------|---|---|---|-------------------------------|----| | lt | Less than | 16 | 1 | 0 | 0 | 0 | 0 | | le | Less than or equal | 20 | 1 | 0 | 1 | 0 | 0 | | eq | Equal | 4 | 0 | 0 | 1 | 0 | 0 | | ge | Greater than or equal | 12 | 0 | 1 | 1 | 0 | 0 | | gt | Greater than | 8 | 0 | 1 | 0 | 0 | 0 | | nl | Not less than | 12 | 0 | 1 | 1 | 0 | 0 | | ne | Not equal | 24 | 1 | 1 | 0 | 0 | 0 | | ng | Not greater than | 20 | 1 | 0 | 1 | 0 | 0 | | Ilt | Logically less than | 2 | 0 | 0 | 0 | 1 | 0 | | lle | Logically less than or equal | 6 | 0 | 0 | 1 | 1 | 0 | | lge | Logically greater than or equal | 5 | 0 | 0 | 1 | 0 | 1 | | lgt | Logically greater than | 1 | 0 | 0 | 0 | 0 | 1 | | Inl | Logically not less than | 5 | 0 | 0 | 1 | 0 | 1 | | Ing | Logically not greater than | 6 | 0 | 0 | 1 | 1 | 0 | | u | Unconditionally with parameters | 31 | 1 | 1 | 1 | 1 | 1 | | (none) | Unconditional | 31 | 1 | 1 | 1 | 1 | 1 | These codes are reflected in the mnemonics shown in Table 18. | Table 18: Trap mnemonics | | | | | |-----------------------------------------|------------------|-----------------------|-------------------------|-----------------------| | | 64-bit Co | mparison | 32-bit Comparison | | | Trap Semantics | tdi<br>Immediate | <i>td</i><br>Register | <i>twi</i><br>Immediate | <i>tw</i><br>Register | | Trap unconditionally | - | - | - | trap | | Trap unconditionally with parameters | tdui | tdu | twui | twu | | Trap if less than | tdlti | tdlt | twlti | twlt | | Trap if less than or equal | tdlei | tdle | twlei | twle | | Trap if equal | tdeqi | tdeq | tweqi | tweq | | Trap if greater than or equal | tdgei | tdge | twgei | twge | | Trap if greater than | tdgti | tdgt | twgti | twgt | | Trap if not less than | tdnli | tdnl | twnli | twnl | | Trap if not equal | tdnei | tdne | twnei | twne | | Trap if not greater than | tdngi | tdng | twngi | twng | | Trap if logically less than | tdllti | tdllt | twllti | twllt | | Trap if logically less than or equal | tdllei | tdlle | twllei | twlle | | Trap if logically greater than or equal | tdlgei | tdlge | twlgei | twlge | | Trap if logically greater than | tdlgti | tdlgt | twlgti | twlgt | | Trap if logically not less than | tdlnli | tdlnl | twlnli | twlnl | | Trap if logically not greater than | tdlngi | tdlng | twlngi | twlng | #### **Examples** 1. Trap if register Rx is not 0. tdnei Rx,0 (equivalent to: tdi 24,Rx,0) 2. Same as (1), but comparison is to register Ry. tdne Rx,Ry (equivalent to: td 24,Rx,Ry) 3. Trap if bits 32:63 of register Rx, considered as a 32-bit quantity, are logically greater than 0x7FF. twlgti Rx,0x7FF (equivalent to: twi 1,Rx,0x7FF) 4. Trap unconditionally. trap (equivalent to: tw 31,0,0) 5. Trap unconditionally with immediate parameters Rx and Ry tdu Rx,Ry (equivalent to: td 31,Rx,Ry) #### **D.7 Rotate and Shift Mnemonics** The Rotate and Shift instructions provide powerful and general ways to manipulate register contents, but can be difficult to understand. Extended mnemonics are provided that allow some of the simpler operations to be coded easily. Mnemonics are provided for the following types of operation. Extract Select a field of n bits starting at bit position b in the source register; left or right justify this field in the target register; clear all other bits of the target register to 0. Insert Select a left-justified or right-justified field of n bits in the source register; insert this field starting at bit position b of the target register; leave other bits of the target register unchanged. (No extended mnemonic is provided for insertion of a left-justified field when operating on doublewords, because such an insertion requires more than one instruction.) Rotate Rotate the contents of a register right or left n bits without masking. Shift Shift the contents of a register right or left n bits, clearing vacated bits to 0 (logical shift). Clear Clear the leftmost or rightmost n bits of a register to 0. Clear left and shift left Clear the leftmost b bits of a register, then shift the register left by n bits. This operation can be used to scale a (known nonnegative) array index by the width of an element. #### **D.7.1 Operations on Doublewords** All these mnemonics can be coded with a final "." to cause the Rc bit to be set in the underlying instruction. | Table 19: Doubleword rotate and shift mnemonics | | | |-------------------------------------------------|----------------------------------|-------------------------| | Operation | Extended Mnemonic | Equivalent to | | Extract and left justify immediate | extldi ra,rs,n,b (n > 0) | rldicr ra,rs,b,n-1 | | Extract and right justify immediate | extrdi ra,rs,n,b (n > 0) | rldicl ra,rs,b+n,64-n | | Insert from right immediate | insrdi ra,rs,n,b (n > 0) | rldimi ra,rs,64-(b+n),b | | Rotate left immediate | rotldi ra,rs,n | rldicl ra,rs,n,0 | | Rotate right immediate | rotrdi ra,rs,n | rldicl ra,rs,64-n,0 | | Rotate left | rotld ra,rs,rb | rldcl ra,rs,rb,0 | | Shift left immediate | sldi ra,rs,n (n < 64) | rldicr ra,rs,n,63-n | | Shift right immediate | srdi ra,rs,n (n < 64) | rldicl ra,rs,64-n,n | | Clear left immediate | clrldi ra,rs,n (n < 64) | rldicl ra,rs,0,n | | Clear right immediate | clrrdi ra,rs,n (n < 64) | rldicr ra,rs,0,63-n | | Clear left and shift left immediate | clrlsldi ra,rs,b,n (n <= b < 64) | rldic ra,rs,n,b-n | #### **Examples** 1. Extract the sign bit (bit 0) of register Ry and place the result right-justified into register Rx. extrdi Rx,Ry,1,0 (equivalent to: rldicl Rx,Ry,1,63) 2. Insert the bit extracted in (1) into the sign bit (bit 0) of register Rz. insrdi Rz,Rx,1,0 (equivalent to: rldimi Rz,Rx,63,0) 3. Shift the contents of register Rx left 8 bits. sldi Rx,Rx,8 (equivalent to: rldicr Rx,Rx,8,55) 4. Clear the high-order 32 bits of register Ry and place the result into register Rx. clrldi Rx,Ry,32 (equivalent to: rldicl Rx,Ry,0,32) #### **D.7.2 Operations on Words** All these mnemonics can be coded with a final "." to cause the Rc bit to be set in the underlying instruction. The operations as described above apply to the low-order 32 bits of the registers, as if the registers were 32-bit registers. The Insert operations either preserve the high-order 32 bits of the target register or place rotated data there; the other operations clear these bits. | Table 20: Word rotate and shift mnemonics | | | | |-------------------------------------------|-------------------------------------|---------------------------------|--| | Operation | Extended Mnemonic | Equivalent to | | | Extract and left justify immediate | extlwi ra,rs,n,b (n > 0) | rlwinm ra,rs,b,0,n-1 | | | Extract and right justify immediate | extrwi ra,rs,n,b (n > 0) | rlwinm ra,rs,b+n,32-n,31 | | | Insert from left immediate | inslwi ra,rs,n,b (n > 0) | rlwimi ra,rs,32-b,b,(b+n)-1 | | | Insert from right immediate | insrwi ra,rs,n,b (n > 0) | rlwimi ra,rs,32-(b+n),b,(b+n)-1 | | | Rotate left immediate | rotlwi ra,rs,n | rlwinm ra,rs,n,0,31 | | | Rotate right immediate | rotrwi ra,rs,n | rlwinm ra,rs,32-n,0,31 | | | Rotate left | rotlw ra,rs,rb | rlwnm ra,rs,rb,0,31 | | | Shift left immediate | slwi ra,rs,n (n < 32) | rlwinm ra,rs,n,0,31-n | | | Shift right immediate | srwi ra,rs,n (n < 32) | rlwinm ra,rs,32-n,n,31 | | | Clear left immediate | clrlwi ra,rs,n (n < 32) | rlwinm ra,rs,0,n,31 | | | Clear right immediate | clrrwi ra,rs,n (n < 32) | rlwinm ra,rs,0,0,31-n | | | Clear left and shift left immediate | clrlslwi ra,rs,b,n $(n \le b < 32)$ | rlwinm ra,rs,n,b-n,31-n | | #### **Examples** 1. Extract the sign bit (bit 32) of register Ry and place the result right-justified into register Rx. extrwi Rx,Ry,1,0 (equivalent to: rlwinm Rx,Ry,1,31,31) 2. Insert the bit extracted in (1) into the sign bit (bit 32) of register Rz. insrwi Rz,Rx,1,0 (equivalent to: rlwimi Rz,Rx,31,0,0) 3. Shift the contents of register Rx left 8 bits, clearing the high-order 32 bits. slwi Rx,Rx,8 (equivalent to: rlwinm Rx,Rx,8,0,23) 4. Clear the high-order 16 bits of the low-order 32 bits of register Ry and place the result into register Rx, clearing the high-order 32 bits of register Rx. clrlwi Rx,Ry,16 (equivalent to: rlwinm Rx,Ry,0,16,31) #### **D.8 Move To/From Special Purpose Register Mnemonics** The *mtspr* and *mfspr* instructions specify a Special Purpose Register (SPR) as a numeric operand. Extended mnemonics are provided that represent the SPR in the mnemonic rather than requiring it to be coded as an operand. | Table 21: Extended mnemonics for moving to/from an SPR | | | | | |--------------------------------------------------------|-------------|---------------|---------------|---------------| | Special Burness Beginter | Move To SPR | | Move From SPR | | | Special Purpose Register | Extended | Equivalent to | Extended | Equivalent to | | Fixed-Point Exception Register (XER) | mtxer Rx | mtspr 1,Rx | mfxer Rx | mfspr Rx,1 | | Link Register (LR) | mtlr Rx | mtspr 8,Rx | mflr Rx | mfspr Rx,8 | | Count Register (CTR) | mtctr Rx | mtspr 9,Rx | mfctr Rx | mfspr Rx,9 | | PPR | mtppr Rx | mtspr 896,Rx | mfppr Rx | mfspr Rx,896 | #### **Examples** 1. Copy the contents of register Rx to the XER. mtxer Rx (equivalent to: mtspr 1,Rx) 2. Copy the contents of the LR to register Rx. mflr Rx (equivalent to: mfspr Rx,8) 3. Copy the contents of register Rx to the CTR. mtctr Rx (equivalent to: mtspr 9,Rx) #### **D.9 Miscellaneous Mnemonics** #### No-op Many Power ISA instructions can be coded in a way such that, effectively, no operation is performed. An extended mnemonic is provided for the preferred form of no-op. If an implementation performs any type of run-time optimization related to no-ops, the preferred form is the no-op that will trigger this. nop (equivalent to: ori 0,0,0) For some uses of a no-op instruction, optimizations related to no-ops, such as removal from the execution stream, are not desireable. An extended mnemonic is provided for the executed form of no-op. This form of no-op will still consume execution resources. xnop (equivalent to: xori 0,0,0) #### **Load Immediate** The *addi* and *addis* instructions can be used to load an immediate value into a register. Extended mnemonics are provided to convey the idea that no addition is being performed but merely data movement (from the immediate field of the instruction to a register). Load a 16-bit signed immediate value into register Rx. li Rx,value (equivalent to: addi Rx,0,value) Load a 16-bit signed immediate value, shifted left by 16 bits, into register Rx. lis Rx,value (equivalent to: addis Rx,0,value) #### **Load Address** This mnemonic permits computing the value of a base-displacement operand, using the **addi** instruction which normally requires separate register and immediate operands. la Rx,D(Ry) (equivalent to: addi Rx,Ry,D) The *la* mnemonic is useful for obtaining the address of a variable specified by name, allowing the Assembler to supply the base register number and compute the displacement. If the variable v is located at offset Dv bytes from the address in register Rv, and the Assembler has been told to use register Rv as a base for references to the data structure containing v, then the following line causes the address of v to be loaded into register Rx. la Rx,v (equivalent to: addi Rx,Rv,Dv) #### Move Register Several Power ISA instructions can be coded in a way such that they simply copy the contents of one register to another. An extended mnemonic is provided to convey the idea that no computation is being performed but merely data movement (from one register to another). The following instruction copies the contents of register Ry to register Rx. This mnemonic can be coded with a final "." to cause the Rc bit to be set in the underlying instruction. mr Rx,Ry (equivalent to: or Rx,Ry,Ry) #### **Complement Register** Several Power ISA instructions can be coded in a way such that they complement the contents of one register and place the result into another register. An extended mnemonic is provided that allows this operation to be coded easily. The following instruction complements the contents of register Ry and places the result into register Rx. This mnemonic can be coded with a final "." to cause the Rc bit to be set in the underlying instruction. not Rx,Ry (equivalent to: nor Rx,Ry,Ry) #### **Move To/From Condition Register** This mnemonic permits copying the contents of the low-order 32 bits of a GPR to the Condition Register, using the same style as the *mfcr* instruction. mtcr Rx (equivalent to: mtcrf 0xFF,Rx) The following instructions may generate either the (old) *mtcrf* or *mfcr* instructions or the (new) *mtocrf* or *mfocrf* instruction, respectively, depending on the target machine type assembler parameter. mtcrf FXM,Rx mfcr Rx All three extended mnemonics in this subsection are being phased out. In future assemblers the form "mtcr Rx" may not exist, and the mtcrf and mfcr mnemonics may generate the old form instructions (with bit 11 = 0) regardless of the target machine type assembler parameter, or may cease to exist. #### **Appendix E. Programming Examples** #### **E.1 Multiple-Precision Shifts** This section gives examples of how multiple-precision shifts can be programmed. A multiple-precision shift is defined to be a shift of an N-doubleword quantity (64-bit mode) or an N-word quantity (32-bit mode), where N>1. The quantity to be shifted is contained in N registers. The shift amount is specified either by an immediate value in the instruction, or by a value in a register. The examples shown below distinguish between the cases N=2 and N>2. If N=2, the shift amount may be in the range 0 through 127 (64-bit mode) or 0 through 63 (32-bit mode), which are the maximum ranges supported by the *Shift* instructions used. However if N>2, the shift amount must be in the range 0 through 63 (64-bit mode) or 0 through 31 (32-bit mode), in order for the examples to yield the desired result. The specific instance shown for N>2 is N=3; extending those code sequences to larger N is straightforward, as is reducing them to the case N=2 when the more stringent restriction on shift amount is met. For shifts with immediate shift amounts only the case N=3 is shown, because the more stringent restriction on shift amount is always met. In the examples it is assumed that GPRs 2 and 3 (and 4) contain the quantity to be shifted, and that the result is to be placed into the same registers, except for the immediate left shifts in 64-bit mode for which the result is placed into GPRs 3, 4, and 5. In all cases, for both input and result, the lowest-numbered register contains the highest-order part of the data and highest-numbered register contains the lowest-order part. For non-immediate shifts, the shift amount is assumed to be in GPR 6. For immediate shifts, the shift amount is assumed to be greater than 0. GPRs 0 and 31 are used as scratch registers. For N>2, the number of instructions required is 2N-1 (immediate shifts) or 3N-1 (non-immediate shifts). ### Multiple-precision shifts in 64-bit mode [Category: 64-Bit] #### Shift Left Immediate, N = 3 (shift amnt < 64) | rldicr | r5,r4,sh,63- | |--------|--------------| | rldimi | r4,r3,0,sh | | rldicl | r4,r4,sh,0 | | rldimi | r3,r2,0,sh | | rldicl | r3,r3,sh,0 | #### Shift Left, N = 2 (shift amnt < 128) | subfic | r31,r6,64 | |--------|------------| | sld | r2,r2,r6 | | srd | r0,r3,r31 | | or | r2,r2,r0 | | addi | r31,r6,-64 | | sld | r0,r3,r31 | | or | r2,r2,r0 | | sld | r3,r3,r6 | #### Shift Left, N = 3 (shift amnt < 64) | subfic | r31,r6,64 | |--------|-----------| | sld | r2,r2,r6 | | srd | r0,r3,r31 | | or | r2,r2,r0 | | sld | r3,r3,r6 | | srd | r0,r4,r31 | | or | r3,r3,r0 | | sld | r4,r4,r6 | | | | #### Shift Right Immediate, N = 3 (shift amnt < 64) | rldimi | r4,r3,0,64-sh | |--------|----------------| | rldicl | r4,r4,64-sh,0 | | rldimi | r3,r2,0,64-sh | | rldicl | r3,r3,64-sh,0 | | rldicl | r2,r2,64-sh,sh | #### Shift Right, N = 2 (shift amnt < 128) | subfic | r31,r6,64 | |--------|------------| | srd | r3,r3,r6 | | sld | r0,r2,r31 | | or | r3,r3,r0 | | addi | r31,r6,-64 | | srd | r0,r2,r31 | | or | r3,r3,r0 | | srd | r2,r2,r6 | #### Shift Right, N = 3 (shift amnt < 64) | subfic | r31,r6,64 | |--------|-----------| | srd | r4,r4,r6 | | sld | r0,r3,r31 | | or | r4,r4,r0 | | srd | r3,r3,r6 | | sld | r0,r2,r31 | | or | r3,r3,r0 | | srd | r2,r2,r6 | ### Multiple-precision shifts in 32-bit mode #### Shift Left Immediate, N = 3 (shift amnt < 32) | rlwinm | r2,r2,sh,0,31-sh | |--------|-------------------| | rlwimi | r2,r3,sh,32-sh,31 | | rlwinm | r3,r3,sh,0,31-sh | | rlwimi | r3,r4,sh,32-sh,31 | | rlwinm | r4,r4,sh,0,31-sh | | | ,, , | #### Shift Left, N = 2 (shift amnt < 64) | subfic | r31,r6,32 | |--------|------------| | slw | r2,r2,r6 | | srw | r0,r3,r31 | | or | r2,r2,r0 | | addi | r31,r6,-32 | | slw | r0,r3,r31 | | or | r2,r2,r0 | | slw | r3,r3,r6 | | | | #### Shift Left, N = 3 (shift amnt < 32) | subfic | r31,r6,32 | |--------|-----------| | slw | r2,r2,r6 | | srw | r0,r3,r31 | | or | r2,r2,r0 | | slw | r3,r3,r6 | | srw | r0,r4,r31 | | or | r3,r3,r0 | | slw | r4 r4 r6 | #### Shift Right Immediate, N = 3 (shift amnt < 32) | rlwinm | r4,r4,32-sh,sh,31 | |--------|--------------------| | rlwimi | r4,r3,32-sh,0,sh-1 | | rlwinm | r3,r3,32-sh,sh,31 | | rlwimi | r3,r2,32-sh,0,sh-1 | | rlwinm | r2,r2,32-sh,sh,31 | #### Shift Right, N = 2 (shift amnt < 64) | SUDTIC | r31,r6,32 | |--------|------------| | srw | r3,r3,r6 | | slw | r0,r2,r31 | | or | r3,r3,r0 | | addi | r31,r6,-32 | | srw | r0,r2,r31 | | or | r3,r3,r0 | | srw | r2,r2,r6 | | | | #### Shift Right, N = 3 (shift amnt < 32) | SUDTIC | r31,r6,32 | |--------|-----------| | srw | r4,r4,r6 | | slw | r0,r3,r31 | | or | r4,r4,r0 | | srw | r3,r3,r6 | | slw | r0,r2,r31 | | or | r3,r3,r0 | | srw | r2,r2,r6 | ### Multiple-precision shifts in 64-bit mode, continued [Category: 64-Bit] ### Multiple-precision shifts in 32-bit mode, continued ### Shift Right Algebraic Immediate, N = 3 (shift amnt < 64) | rldimi | r4,r3,0,64-sh | |--------|---------------| | rldicl | r4,r4,64-sh,0 | | rldimi | r3,r2,0,64-sh | | rldicl | r3,r3,64-sh,0 | | sradi | r2,r2,sh | ### Shift Right Algebraic Immediate, N = 3 (shift amnt < 32) | rlwinm | r4,r4,32-sh,sh,31 | |--------|--------------------| | rlwimi | r4,r3,32-sh,0,sh-1 | | rlwinm | r3,r3,32-sh,sh,31 | | rlwimi | r3,r2,32-sh,0,sh-1 | | srawi | r2,r2,sh | #### Shift Right Algebraic, N = 2 (shift amnt < 128) | 9 | 900: a.o, :1 - 2 | |--------|------------------| | subfic | r31,r6,64 | | srd | r3,r3,r6 | | sld | r0,r2,r31 | | or | r3,r3,r0 | | addic. | r31,r6,-64 | | srad | r0,r2,r31 | | ble | \$+8 | | ori | r3,r0,0 | | srad | r2,r2,r6 | | | | #### Shift Right Algebraic, N = 2 (shift amnt < 64) | subfic | r31,r6,32 | |--------|------------| | srw | r3,r3,r6 | | slw | r0,r2,r31 | | or | r3,r3,r0 | | addic. | r31,r6,-32 | | sraw | r0,r2,r31 | | ble | \$+8 | | ori | r3,r0,0 | | sraw | r2,r2,r6 | | | | #### Shift Right Algebraic, N = 3 (shift amnt < 64) | subfic | r31,r6,64 | |--------|-----------| | srd | r4,r4,r6 | | sld | r0,r3,r31 | | or | r4,r4,r0 | | srd | r3,r3,r6 | | sld | r0,r2,r31 | | or | r3,r3,r0 | | srad | r2,r2,r6 | | | | #### Shift Right Algebraic, N = 3 (shift amnt < 32) | subfic | r31,r6,32 | |--------|-----------| | srw | r4,r4,r6 | | slw | r0,r3,r31 | | or | r4,r4,r0 | | srw | r3,r3,r6 | | slw | r0,r2,r31 | | or | r3,r3,r0 | | sraw | r2,r2,r6 | | | | #### **E.2 Floating-Point Conversions [Category: Floating-Point]** This section gives examples of how the *Floating-Point Conversion* instructions can be used to perform various conversions. **Warning:** Some of the examples use the *fsel* instruction. Care must be taken in using *fsel* if IEEE compatibility is required, or if the values being tested can be NaNs or infinities; see Section E.3.4, "Notes" on page 402. ## E.2.1 Conversion from Floating-Point Number to Floating-Point Integer The full *convert to floating-point integer* function can be implemented with the sequence shown below, assuming the floating-point value to be converted is in FPR 1 and the result is returned in FPR 3. | mtfsb0 | 23 | #clear VXCVI | | |----------|---------|----------------------|--| | fctid[z] | f3,f1 | #convert to fx int | | | fcfid | f3,f3 | #convert back again | | | mcrfs | 7,5 | #VXCVI to CR | | | bf | 31,\$+8 | #skip if VXCVI was 0 | | | fmr | f3,f1 | #input was fp int | | ## E.2.2 Conversion from Floating-Point Number to Signed Fixed-Point Integer Doubleword The full convert to signed fixed-point integer doubleword function can be implemented with the sequence shown below, assuming the floating-point value to be converted is in FPR 1, the result is returned in GPR 3, and a doubleword at displacement "disp" from the address in GPR 1 can be used as scratch space. ``` fctid[z] f2,f1 #convert to dword int stfd f2,disp(r1) #store float ld r3,disp(r1) #load dword ``` ## E.2.3 Conversion from Floating-Point Number to Unsigned Fixed-Point Integer Doubleword The full *convert to unsigned fixed-point integer double-word* function can be implemented with the sequence shown below, assuming the floating-point value to be converted is in FPR 1, the value 0 is in FPR 0, the value 2<sup>64</sup>-2048 is in FPR 3, the value 2<sup>63</sup> is in FPR 4 and GPR 4, the result is returned in GPR 3, and a doubleword at displacement "disp" from the address in GPR 1 can be used as scratch space. ``` fsel f2,f1,f1,f0 #use 0 if < 0 fsub f5,f3,f1 #use max if > max f2, f5, f2, f3 fsel #subtract 2<sup>63</sup> fsub f5,f2,f4 #use diff if >= 2^{63} fcmpu cr2,f2,f4 f2,f5,f5,f2 fsel fctid[z] f2,f2 #convert to fx int #store float stfd f2, disp(r1) 1d r3, disp(r1) #load dword #add 2<sup>63</sup> if input b1t cr2,$+8 \# was >= 2^{63} r3, r3, r4 ``` ## E.2.4 Conversion from Floating-Point Number to Signed Fixed-Point Integer Word The full convert to signed fixed-point integer word function can be implemented with the sequence shown below, assuming the floating-point value to be converted is in FPR 1, the result is returned in GPR 3, and a doubleword at displacement "disp" from the address in GPR 1 can be used as scratch space. ``` fctiw[z] f2,f1 #convert to fx int stfd f2,disp(r1) #store float lwa r3,disp+4(r1) #load word algebraic ``` ## E.2.5 Conversion from Floating-Point Number to Unsigned Fixed-Point Integer Word The full *convert to unsigned fixed-point integer word* function can be implemented with the sequence shown below, assuming the floating-point value to be converted is in FPR 1, the value 0 is in FPR 0, the value $2^{32}$ -1 is in FPR 3, the result is returned in GPR 3, and a doubleword at displacement "disp" from the address in GPR 1 can be used as scratch space. ## E.2.6 Conversion from Signed Fixed-Point Integer Doubleword to Floating-Point Number The full convert from signed fixed-point integer double-word function, using the rounding mode specified by FPSCR<sub>RN</sub>, can be implemented with the sequence shown below, assuming the fixed-point value to be converted is in GPR 3, the result is returned in FPR 1, and a doubleword at displacement "disp" from the address in GPR 1 can be used as scratch space. ``` std r3,disp(r1) #store dword lfd f1,disp(r1) #load float fcfid f1,f1 #convert to fp int ``` ## E.2.7 Conversion from Unsigned Fixed-Point Integer Doubleword to Floating-Point Number The full convert from unsigned fixed-point integer doubleword function, using the rounding mode specified by FPSCR<sub>RN</sub>, can be implemented with the sequence shown below, assuming the fixed-point value to be converted is in GPR 3, the value 2<sup>32</sup> is in FPR 4, the result is returned in FPR 1, and two doublewords at displacement "disp" from the address in GPR 1 can be used as scratch space. ``` rldicl r2, r3, 32, 32 #isolate high half #isolate low half rldicl r0,r3,0,32 r2,disp(r1) #store dword both std r0,disp+8(r1) lfd f2, disp(r1) #load float both f1,disp+8(r1) 1fd fcfid f2,f2 #convert each half to \begin{array}{lll} \text{f1,f1} & \text{\# fp int (exact result)} \\ \text{f1,f4,f2,f1} & \text{\#}(2^{32}) \times \text{high + low} \end{array} fcfid fmadd ``` An alternative, shorter, sequence can be used if rounding according to FSCPR<sub>RN</sub> is desired and FPSCR<sub>RN</sub> specifies *Round toward +Infinity* or *Round toward -Infinity*, or if it is acceptable for the rounded answer to be either of the two representable floating-point integers nearest to the given fixed-point integer. In this case the full *convert from unsigned fixed-point integer doubleword* function can be implemented with the sequence shown below, assuming the value 2<sup>64</sup> is in FPR 2. ``` std r3,disp(r1) #store dword lfd f1,disp(r1) #load float fcfid f1,f1 #convert to fp int fadd f4,f1,f2 #add 2^{64} fsel f1,f1,f1,f4 # if r3 < 0 ``` ## E.2.8 Conversion from Signed Fixed-Point Integer Word to Floating-Point Number The full convert from signed fixed-point integer word function can be implemented with the sequence shown below, assuming the fixed-point value to be converted is in GPR 3, the result is returned in FPR 1, and a doubleword at displacement "disp" from the address in GPR 1 can be used as scratch space. (The result is exact.) ``` extsw r3,r3 #extend sign std r3,disp(r1) #store dword lfd f1,disp(r1) #load float fcfid f1,f1 #convert to fp int ``` The following sequence can be used, assuming a word at the address in GPR 1 + GPR 2 can be used as scratch space. #### E.3 Floating-Point Selection [Category: Floating-Point] This section gives examples of how the *Floating Select* instruction can be used to implement floating-point minimum and maximum functions, and certain simple forms of if-then-else constructions, without branching. The examples show program fragments in an imaginary, C-like, high-level programming language, and the corresponding program fragment using *fsel* and other Power ISA instructions. In the examples, a, b, x, y, and z are floating-point variables, which are assumed to be in FPRs fa, fb, fx, fy, and fz. FPR fs is assumed to be available for scratch space. Additional examples can be found in Section E.2, "Floating-Point Conversions [Category: Floating-Point]" on page 400. **Warning:** Care must be taken in using *fsel* if IEEE compatibility is required, or if the values being tested can be NaNs or infinities; see Section E.3.4. #### E.3.1 Comparison to Zero | High-level language: | Power ISA: | Notes | |------------------------------------------------------------|---------------------------------------------------------|-------| | if $a \ge 0.0$ then $x \leftarrow y$ else $x \leftarrow z$ | fsel fx,fa,fy,fz | (1) | | if a > 0.0 then $x \leftarrow y$ else $x \leftarrow z$ | fneg fs,fa<br>fsel fx,fs,fz,fy | (1,2) | | if $a = 0.0$ then $x \leftarrow y$ else $x \leftarrow z$ | <pre>fsel fx,fa,fy,fz fneg fs,fa fsel fx,fs,fx,fz</pre> | (1) | #### E.3.2 Minimum and Maximum | High-level language: | Power ISA: | Notes | |--------------------------|-----------------------------------|---------| | $x \leftarrow \min(a,b)$ | fsub fs,fa,fb<br>fsel fx,fs,fb,fa | (3,4,5) | | $x \leftarrow \max(a,b)$ | fsub fs,fa,fb<br>fsel fx,fs,fa,fb | (3,4,5) | ### E.3.3 Simple if-then-else Constructions | High-level language: | Power ISA: | Notes | |----------------------------------------------------------|-----------------------------------------------------------------------|---------| | if $a \ge b$ then $x \leftarrow y$ else $x \leftarrow z$ | fsub fs,fa,fb<br>fsel fx,fs,fy,fz | (4,5) | | if a > b then $x \leftarrow y$ else $x \leftarrow z$ | fsub fs,fb,fa<br>fsel fx,fs,fz,fy | (3,4,5) | | if $a = b$ then $x \leftarrow y$ else $x \leftarrow z$ | <pre>fsub fs,fa,fb fsel fx,fs,fy,fz fneg fs,fs fsel fx,fs,fx,fz</pre> | (4,5) | #### E.3.4 Notes The following Notes apply to the preceding examples and to the corresponding cases using the other three arithmetic relations (<, $\le$ , and $\ne$ ). They should also be considered when any other use of **fsel** is contemplated. In these Notes, the "optimized program" is the Power ISA program shown, and the "unoptimized program" (not shown) is the corresponding Power ISA program that uses *fcmpu* and *Branch Conditional* instructions instead of *fsel*. - The unoptimized program affects the VXSNAN bit of the FPSCR, and therefore may cause the system error handler to be invoked if the corresponding exception is enabled, while the optimized program does not affect this bit. This property of the optimized program is incompatible with the IEEE standard. - The optimized program gives the incorrect result if a is a NaN. - The optimized program gives the incorrect result if a and/or b is a NaN (except that it may give the correct result in some cases for the minimum and maximum functions, depending on how those functions are defined to operate on NaNs). - 4. The optimized program gives the incorrect result if a and b are infinities of the same sign. (Here it is assumed that Invalid Operation Exceptions are disabled, in which case the result of the subtraction is a NaN. The analysis is more complicated if Invalid Operation Exceptions are enabled, because in that case the target register of the subtraction is unchanged.) - 5. The optimized program affects the OX, UX, XX, and VXISI bits of the FPSCR, and therefore may cause the system error handler to be invoked if the corresponding exceptions are enabled, while the unoptimized program does not affect these bits. This property of the optimized program is incompatible with the IEEE standard. ### **E.4 Vector Unaligned Storage Operations [Category: Vector]** ## E.4.1 Loading a Unaligned Quadword Using Permute from Big-Endian Storage The following sequence of instructions copies the unaligned quadword storage operand into VRT. **Book II:** **Power ISA Virtual Environment Architecture** ## **Chapter 1. Storage Model** | 44 5 6 7 | | |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------| | | Length Encoded (VLE) | | 1.2 Introduction | | | 1.3 Virtual Storage 408 1.7 Shared Sto | orage413 | | 1.4 Single-copy Atomicity 409 1.7.1 Storage | Access Ordering 413 | | 1.5 Cache Model 409 1.7.2 Storage | Ordering of I/O Accesses | | 1.6 Storage Control Attributes 410 415 | _ | | 1.6.1 Write Through Required 410 1.7.3 Atomic L | Jpdate 415 | | 1.6.2 Caching Inhibited 411 1.7.3.1 Reser | vations 415 | | 1.6.3 Memory Coherence Required [Cate- 1.7.3.2 Forwards 1.6.3 Memory Coherence Required [Cate- 1.7.3.2 Forwards 1.6.3 Memory Coherence Required [Cate- 1.7.3.2 Forwards 1.6.3 Memory Coherence Required [Cate- 1.7.3.2 Forwards 1.7 | ard Progress 417 | | gory: Memory Coherence] 411 1.8 Instruction | Storage 417 | | 1.6.4 Guarded 411 1.8.1 Concurre | ent Modification and Execu- | | | ions 419 | | ded.Little-Endian] 412 | | #### 1.1 Definitions The following definitions, in addition to those specified in Book I, are used in this Book. In these definitions, "Load instruction" includes the Cache Management and other instructions that are stated in the instruction descriptions to be "treated as a Load", and similarly for "Store instruction". #### processor A hardware component that executes the instructions specified in a program. #### system A combination of processors, storage, and associated mechanisms that is capable of executing programs. Sometimes the reference to system includes services provided by the operating system. #### ■ main storage The level of storage hierarchy in which all storage state is visible to all processors and mechanisms in the system. #### primary cache The level of cache closest to the processor. #### ■ secondary cache After the primary cache, the next closest level of cache to the processor. #### instruction storage The view of storage as seen by the mechanism that fetches instructions. #### data storage The view of storage as seen by a *Load* or *Store* instruction. #### program order The execution of instructions in the order required by the sequential execution model. (See the section entitled "Instruction Execution Order" in Book I. A *dcbz* instruction that modifies storage which contains instructions has the same effect with respect to the sequential execution model as a *Store* instruction as described there.) #### storage location A contiguous sequence of one or more bytes in storage. When used in association with a specific instruction or the instruction fetching mechanism, the length of the sequence of one or more bytes is typically implied by the operation. In other uses, it may refer more abstractly to a group of bytes which share common storage attributes. #### storage access An access to a storage location. There are three (mutually exclusive) kinds of storage access. #### - data access An access to the storage location specified by a *Load* or *Store* instruction, or, if the access is performed "out-of-order" (see Book III), an access to a storage location as if it were the storage location specified by a *Load* or *Store* instruction. #### - instruction fetch An access for the purpose of fetching an instruction. #### - implicit access An access by the processor for the purpose of address translation or reference and change recording (see Book III-S). #### caused by, associated with #### - caused by A storage access is said to be caused by an instruction if the instruction is a *Load* or *Store* and the access (data access) is to the storage location specified by the instruction. #### - associated with A storage access is said to be associated with an instruction if the access is for the purpose of fetching the instruction (instruction fetch), or is a data access caused by the instruction, or is an implicit access that occurs as a side effect of fetching or executing the instruction. #### prefetched instructions Instructions for which a copy of the instruction has been fetched from instruction storage, but the instruction has not yet been executed. #### ■ uniprocessor A system that contains one processor. #### ■ multiprocessor A system that contains two or more processors. #### ■ shared storage multiprocessor A multiprocessor that contains some common storage, which all the processors in the system can access. #### performed A load or instruction fetch by a processor or mechanism (P1) is performed with respect to any processor or mechanism (P2) when the value to be returned by the load or instruction fetch can no longer be changed by a store by P2. A store by P1 is performed with respect to P2 when a load by P2 from the location accessed by the store will return the value stored (or a value stored subsequently). An instruction cache block invalidation by P1 is performed with respect to P2 when an instruction fetch by P2 will not be satisfied from the copy of the block that existed in its instruction cache when the instruction causing the invalidation was executed, and similarly for a data cache block invalidation. The preceding definitions apply regardless of whether P1 and P2 are the same entity. #### ■ page (virtual page) 2<sup>n</sup> contiguous bytes of storage aligned such that the effective address of the first byte in the page is an integral multiple of the page size for which protection and control attributes are independently specifiable and for which reference and change status <S> are independently recorded. #### ■ block The aligned unit of storage operated on by the *Cache Management* instructions. The size of an instruction cache block may differ from the size of a data cache block, and both sizes may vary between implementations. The maximum block size is equal to the minimum page size. #### aligned storage access A load or store is aligned if the address of the target storage location is a multiple of the size of the transfer effected by the instruction. #### 1.2 Introduction The Power ISA User Instruction Set Architecture, discussed in Book I, defines storage as a linear array of bytes indexed from 0 to a maximum of 2<sup>64</sup>-1. Each byte is identified by its index, called its address, and each byte contains a value. This information is sufficient to allow the programming of applications that require no special features of any particular system environment. The Power ISA Virtual Environment Architecture, described herein, expands this simple storage model to include caches, virtual storage, and shared storage multiprocessors. The Power ISA Virtual Environment Architecture, in conjunction with services based on the Power ISA Operating Environment Architecture (see Book III) and provided by the operating system, permits explicit control of this expanded storage model. A simple model for sequential execution allows at most one storage access to be performed at a time and requires that all storage accesses appear to be performed in program order. In contrast to this simple model, the Power ISA specifies a relaxed model of storage consistency. In a multiprocessor system that allows multiple copies of a storage location, aggressive implementations of the architecture can permit intervals of time during which different copies of a storage location have different values. This chapter describes features of the Power ISA that enable programmers to write correct programs for this storage model. ## 1.3 Virtual Storage The Power ISA system implements a virtual storage model for applications. This means that a combination of hardware and software can present a storage model that allows applications to exist within a "virtual" address space larger than either the effective address space or the real address space. Each program can access 2<sup>64</sup> bytes of "effective address" (EA) space, subject to limitations imposed by the operating system. In a typical Power ISA system, each program's EA space is a subset of a larger "virtual address" (VA) space managed by the operating system. Each effective address is translated to a real address (i.e., to an address of a byte in real storage or on an I/O device) before being used to access storage. The hardware accomplishes this, using the address translation mechanism described in Book III. The operating system manages the real (physical) storage resources of the system, by setting up the tables and other information used by the hardware address translation mechanism. In general, real storage may not be large enough to map all the virtual pages used by the currently active applications. With support provided by hardware, the operating system can attempt to use the available real pages to map a sufficient set of virtual pages of the applications. If a sufficient set is maintained, "paging" activity is minimized. If not, performance degradation is likely. The operating system can support restricted access to virtual pages (including read/write, read only, and no access; see Book III), based on system standards (e.g., program code might be read only) and application requests. ## 1.4 Single-copy Atomicity An access is *single-copy atomic*, or simply *atomic*, if it is always performed in its entirety with no visible fragmentation. Atomic accesses are thus serialized: each happens in its entirety in some order, even when that order is not specified in the program or enforced between processors. Vector storage accesses are not guaranteed to be atomic. The following other types of single-register accesses are always atomic: - byte accesses (all bytes are aligned on byte boundaries) - halfword accesses aligned on halfword boundaries - word accesses aligned on word boundaries - doubleword accesses aligned on doubleword boundaries (64-bit implementations only; see Section 1.2 of Book III-E<E>) No other accesses are guaranteed to be atomic. For example, the access caused by the following instructions is not guaranteed to be atomic. any Load or Store instruction for which the operand is unaligned - Imw, stmw, Iswi, Iswx, stswi, stswx - Ifdp, Ifdpx, stfdp, stfdpx - any Cache Management instruction An access that is not atomic is performed as a set of smaller disjoint atomic accesses. The number and alignment of these accesses are implementation-dependent, as is the relative order in which they are performed. Accesses that are aligned on a doubleword boundary for *Ifdp*, *Ifdpx*, *stfdp*, and *stfdpx* are performed as a pair of disjoint atomic doubleword accesses. The results for several combinations of loads and stores to the same or overlapping locations are described below. - When two processors execute atomic stores to locations that do not overlap, and no other stores are performed to those locations, the contents of those locations are the same as if the two stores were performed by a single processor. - When two processors execute atomic stores to the same storage location, and no other store is performed to that location, the contents of that location are the result stored by one of the processors. - When two processors execute stores that have the same target location and are not guaranteed to be atomic, and no other store is performed to that location, the result is some combination of the bytes stored by both processors. - 4. When two processors execute stores to overlapping locations, and no other store is performed to those locations, the result is some combination of the bytes stored by the processors to the overlapping bytes. The portions of the locations that do not overlap contain the bytes stored by the processor storing to the location. - 5. When a processor executes an atomic store to a location, a second processor executes an atomic load from that location, and no other store is performed to that location, the value returned by the load is the contents of the location before the store or the contents of the location after the store. - 6. When a load and a store with the same target location can be executed simultaneously, and no other store is performed to that location, the value returned by the load is some combination of the contents of the location before the store and the contents of the location after the store. ### 1.5 Cache Model A cache model in which there is one cache for instructions and another cache for data is called a "Harvardstyle" cache. This is the model assumed by the Power ISA, e.g., in the descriptions of the *Cache Management* instructions in Section 3.3. Alternative cache models may be implemented (e.g., a "combined cache" model, in which a single cache is used for both instructions and data, or a model in which there are several levels of caches), but they support the programming model implied by a Harvard-style cache. The processor is not required to maintain copies of storage locations in the instruction cache consistent with modifications to those storage locations (e.g., modifications caused by *Store* instructions). A location in the data cache is considered to be modified in that cache if the location has been modified (e.g., by a *Store* instruction) and the modified data have not been written to main storage. Cache Management instructions are provided so that programs can manage the caches when needed. For example, program management of the caches is needed when a program generates or modifies code that will be executed (i.e., when the program modifies data in storage and then attempts to execute the modified data as instructions). The Cache Management instructions are also useful in optimizing the use of memory bandwidth in such applications as graphics and numerically intensive computing. The functions performed by these instructions depend on the storage control attributes associated with the specified storage location (see Section 1.6, "Storage Control Attributes"). The Cache Management instructions allow the program to do the following. - invalidate the copy of storage in an instruction cache block (*icbi*) - <E> provide a hint that an instruction will probably soon be accessed from a specified instruction cache block (*icbt*) - provide a hint that the program will probably soon access a specified data cache block (dcbt, dcbtst) - <E> allocate a data cache block and set the contents of that block to zeros, but perform no operation if no write access is allowed to the data cache block (dcba) - set the contents of a data cache block to zeros (dcbz) - copy the contents of a modified data cache block to main storage (dcbst) - copy the contents of a modified data cache block to main storage and make the copy of the block in the data cache invalid (dcbf or dcbfl<S>) ## 1.6 Storage Control Attributes Some operating systems may provide a means to allow programs to specify the storage control attributes described in this section. Because the support provided for these attributes by the operating system may vary between systems, the details of the specific system being used must be known before these attributes can be used. Storage control attributes are associated with units of storage that are multiples of the page size. Each storage access is performed according to the storage control attributes of the specified storage location, as described below. The storage control attributes are the following. - Write Through Required - Caching Inhibited - Memory Coherence Required - Guarded - Endianness<E> These attributes have meaning only when an effective address is translated by the processor performing the storage access. <E> Additional storage control attributes may be defined for some implementations. See Section 4.8 of Book III-E for additional information. #### **Programming Note** The Write Through Required and Caching Inhibited attributes are mutually exclusive because, as described below, the Write Through Required attribute permits the storage location to be in the data cache while the Caching Inhibited attribute does not. Storage that is Write Through Required or Caching Inhibited is not intended to be used for general-purpose programming. For example, the *Iwarx*, *Idarx*, *stwcx*., and *stdcx*. instructions may cause the system data storage error handler to be invoked if they specify a location in storage having either of these attributes. In the remainder of this section, "Load instruction" includes the Cache Management and other instructions that are stated in the instruction descriptions to be "treated as a Load", and similarly for "Store instruction". ## 1.6.1 Write Through Required A store to a Write Through Required storage location is performed in main storage. A Store instruction that specifies a location in Write Through Required storage may cause additional locations in main storage to be accessed. If a copy of the block containing the specified location is retained in the data cache, the store is also performed in the data cache. The store does not cause the block to be considered to be modified in the data cache. In general, accesses caused by separate *Store* instructions that specify locations in Write Through Required storage may be combined into one access. Such combining does not occur if the *Store* instructions are separated by a *sync*, *eieio*<S>, or *mbar*<E> instruction. ### 1.6.2 Caching Inhibited An access to a Caching Inhibited storage location is performed in main storage. A *Load* instruction that specifies a location in Caching Inhibited storage may cause additional locations in main storage to be accessed unless the specified location is also Guarded. An instruction fetch from Caching Inhibited storage may cause additional words in main storage to be accessed. No copy of the accessed locations is placed into the caches. In general, non-overlapping accesses caused by separate *Load* instructions that specify locations in Caching Inhibited storage may be combined into one access, as may non-overlapping accesses caused by separate *Store* instructions that specify locations in Caching Inhibited storage. Such combining does not occur if the *Load* or *Store* instructions are separated by a *sync* or *mbar*<E> instruction. Combining may also occur among such accesses from multiple processors that share a common memory interface. No combining occurs if the storage is also Guarded. #### **Programming Note** None of the memory barrier instructions prevent the combining of accesses from different processors. The Guarded storage attribute must be used in combination with Caching Inhibited to prevent such combining. ## 1.6.3 Memory Coherence Required [Category: Memory Coherence] An access to a Memory Coherence Required storage location is performed coherently, as follows. Memory coherence refers to the ordering of stores to a single location. Atomic stores to a given location are coherent if they are serialized in some order, and no processor or mechanism is able to observe any subset of those stores as occurring in a conflicting order. This serialization order is an abstract sequence of values; the physical storage location need not assume each of the values written to it. For example, a processor may update a location several times before the value is written to physical storage. The result of a store operation is not available to every processor or mechanism at the same instant, and it may be that a processor or mechanism observes only some of the values that are written to a location. However, when a location is accessed atomically and coherently by all processors and mechanisms, the sequence of values loaded from the location by any processor or mechanism during any interval of time forms a subsequence of the sequence of values that the location logically held during that interval. That is, a processor or mechanism can never load a "newer" value first and then, later, load an "older" value. Memory coherence is managed in blocks called coherence *blocks*. Their size is implementation-dependent, but is larger than a word and is usually the size of a cache block. For storage that is not Memory Coherence Required, software must explicitly manage memory coherence to the extent required by program correctness. The operations required to do this may be system-dependent. Because the Memory Coherence Required attribute for a given storage location is of little use unless all processors that access the location do so coherently, in statements about Memory Coherence Required storage elsewhere in this document it is generally assumed that the storage has the Memory Coherence Required attribute for all processors that access it. #### Programming Note Operating systems that allow programs to request that storage not be Memory Coherence Required should provide services to assist in managing memory coherence for such storage, including all system-dependent aspects thereof. In most systems the default is that all storage is Memory Coherence Required. For some applications in some systems, software management of coherence may yield better performance. In such cases, a program can request that a given unit of storage not be Memory Coherence Required, and can manage the coherence of that storage by using the **sync** instruction, the *Cache Management* instructions, and services provided by the operating system. #### 1.6.4 Guarded A data access to a Guarded storage location is performed only if either (a) the access is caused by an instruction that is known to be required by the sequential execution model, or (b) the access is a load and the storage location is already in a cache. If the storage is also Caching Inhibited, only the storage location specified by the instruction is accessed; otherwise any storage location in the cache block containing the specified storage location may be accessed. For the Server environment, instructions are not fetched from virtual storage that is Guarded. If the instruction addressed by the current instruction address is in such storage, the system instruction storage error handler may be invoked (see Section 6.5.5 of Book III-S). #### **Programming Note** In some implementations, instructions may be executed before they are known to be required by the sequential execution model. Because the results of instructions executed in this manner are discarded if it is later determined that those instructions would not have been executed in the sequential execution model, this behavior does not affect most programs. This behavior does affect programs that access storage locations that are not "well-behaved" (e.g., a storage location that represents a control register on an I/O device that, when accessed, causes the device to perform an operation). To avoid unintended results, programs that access such storage locations should request that the storage be Guarded, and should prevent such storage locations from being in a cache (e.g., by requesting that the storage also be Caching Inhibited). # 1.6.5 Endianness [Category: Embedded.Little-Endian] The Endianness storage control attribute specifies the byte ordering (Big-Endian or Little-Endian) that is used when the storage location is accessed; see Section 1.10 of Book I. # 1.6.6 Variable Length Encoded (VLE) Instructions VLE storage is used to store VLE instructions. Instructions fetched from VLE storage are processed as VLE instructions. VLE storage must also be Big-Endian. Instructions fetched from VLE storage that is Little-Endian cause a Byte-ordering exception, and the system instruction storage error handler will be invoked. The VLE attribute has no effect on data accesses. See Chapter 1 of Book VLE. ## 1.7 Shared Storage This architecture supports the sharing of storage between programs, between different instances of the same program, and between processors and other mechanisms. It also supports access to a storage location by one or more programs using different effective addresses. All these cases are considered storage sharing. Storage is shared in blocks that are an integral number of pages. When the same storage location has different effective addresses, the addresses are said to be *aliases*. Each application can be granted separate access privileges to aliased pages. ## 1.7.1 Storage Access Ordering The storage model for the ordering of storage accesses is weakly consistent. This model provides an opportunity for improved performance over a model that has stronger consistency rules, but places the responsibility on the program to ensure that ordering or synchronization instructions are properly placed when storage is shared by two or more programs. The order in which the processor performs storage accesses, the order in which those accesses are performed with respect to another processor or mechanism, and the order in which those accesses are performed in main storage may all be different. Several means of enforcing an ordering of storage accesses are provided to allow programs to share storage with other programs, or with mechanisms such as I/O devices. These means are listed below. The phrase "to the extent required by the associated Memory Coherence Required attributes" refers to the Memory Coherence Required attribute, if any, associated with each access. - If two Store instructions or two Load instructions specify storage locations that are both Caching Inhibited and Guarded, the corresponding storage accesses are performed in program order with respect to any processor or mechanism. - If a Load instruction depends on the value returned by a preceding Load instruction (because the value is used to compute the effective address specified by the second Load), the corresponding storage accesses are performed in program order with respect to any processor or mechanism to the extent required by the associated Memory Coherence Required attributes. This applies even if the dependency has no effect on program logic (e.g., the value returned by the first Load is ANDed with zero and then added to the effective address specified by the second Load). - When a processor (P1) executes a Synchronize, eieio<S>, or mbar<E> instruction a memory barrier is created, which orders applicable storage accesses pairwise, as follows. Let A be a set of storage accesses that includes all storage accesses associated with instructions preceding the barrier-creating instruction, and let B be a set of storage accesses that includes all storage accesses associated with instructions following the barrier-creating instruction. For each applicable pair $a_i,b_j$ of storage accesses such that $a_i$ is in A and $b_j$ is in B, the memory barrier ensures that $a_i$ will be performed with respect to any processor or mechanism, to the extent required by the associated Memory Coherence Required attributes, before $b_j$ is performed with respect to that processor or mechanism. The ordering done by a memory barrier is said to be "cumulative" if it also orders storage accesses that are performed by processors and mechanisms other than P1, as follows. - A includes all applicable storage accesses by any such processor or mechanism that have been performed with respect to P1 before the memory barrier is created. - B includes all applicable storage accesses by any such processor or mechanism that are performed after a Load instruction executed by that processor or mechanism has returned the value stored by a store that is in B. No ordering should be assumed among the storage accesses caused by a single instruction (i.e, by an instruction for which the access is not atomic), and no means are provided for controlling that order. #### **Programming Note** Because stores cannot be performed "out-of-order" (see Book III), if a *Store* instruction depends on the value returned by a preceding *Load* instruction (because the value returned by the *Load* is used to compute either the effective address specified by the *Store* or the value to be stored), the corresponding storage accesses are performed in program order. The same applies if *whether* the *Store* instruction is executed depends on a conditional *Branch* instruction that in turn depends on the value returned by a preceding *Load* instruction. Because an *isync* instruction prevents the execution of instructions following the *isync* until instructions preceding the *isync* have completed, if an *isync* follows a conditional *Branch* instruction that depends on the value returned by a preceding *Load* instruction, the load on which the *Branch* depends is performed before any loads caused by instructions following the *isync*. This applies even if the effects of the "dependency" are independent of the value loaded (e.g., the value is compared to itself and the *Branch* tests the EQ bit in the selected CR field), and even if the branch target is the sequentially next instruction. With the exception of the cases described above and earlier in this section, data dependencies and control dependencies do not order storage accesses. Examples include the following. - If a Load instruction specifies the same storage location as a preceding Store instruction and the location is in storage that is not Caching Inhibited, the load may be satisfied from a "store queue" (a buffer into which the processor places stored values before presenting them to the storage subsystem), and not be visible to other processors and mechanisms. A consequence is that if a subsequent Store depends on the value returned by the Load, the two stores need not be performed in program order with respect to other processors and mechanisms. - Because a Store Conditional instruction may complete before its store has been performed, a conditional Branch instruction that depends on the CR0 value set by a Store Conditional instruction does not order the *Store Conditional's* store with respect to storage accesses caused by instructions that follow the *Branch*. Because processors may predict branch target addresses and branch condition resolution, control dependencies (e.g., branches) do not order storage accesses except as described above. For example, when a subroutine returns to its caller the return address may be predicted, with the result that loads caused by instructions at or after the return address may be performed before the load that obtains the return address is performed. Because processors may implement nonarchitected duplicates of architected resources (e.g., GPRs, CR fields, and the Link Register), resource dependencies (e.g., specification of the same target register for two Load instructions) do not order storage accesses. Examples of correct uses of dependencies, **sync**, **lwsync**, **eieio**<S>, and **mbar**<E> to order storage accesses can be found in Appendix B. "Programming Examples for Sharing Storage" on page 459. Because the storage model is weakly consistent, the sequential execution model as applied to instructions that cause storage accesses guarantees only that those accesses appear to be performed in program order with respect to the processor executing the instructions. For example, an instruction may complete, and subsequent instructions may be executed. before storage accesses caused by the first instruction have been performed. However, for a sequence of atomic accesses to the same storage location, if the location is in storage that is Memory Coherence Required the definition of coherence guarantees that the accesses are performed in program order with respect to any processor or mechanism that accesses the location coherently, and similarly if the location is in storage that is Caching Inhibited. Because accesses to storage that is Caching Inhibited are performed in main storage, memory barriers and dependencies on *Load* instructions order such accesses with respect to any processor or mechanism even if the storage is not Memory Coherence Required. #### Programming Note - The first example below illustrates cumulative ordering of storage accesses preceding a memory barrier, and the second illustrates cumulative ordering of storage accesses following a memory barrier. Assume that locations X, Y, and Z initially contain the value 0. #### Example 1: #### Processor A: stores the value 1 to location X #### Processor B: loads from location X obtaining the value 1, executes a **sync** instruction, then stores the value 2 to location Y #### Processor C: loads from location Y obtaining the value 2, executes a *sync* instruction, then loads from location X #### Example 2: #### Processor A: stores the value 1 to location X, executes a *sync* instruction, then stores the value 2 to location Y #### Processor B: loops loading from location Y until the value 2 is obtained, then stores the value 3 to location $\mathsf{Z}$ #### Processor C: loads from location Z obtaining the value 3, executes a *sync* instruction, then loads from location X In both cases, cumulative ordering dictates that the value loaded from location X by processor C is 1. # 1.7.2 Storage Ordering of I/O Accesses A "coherence domain" consists of all processors and all interfaces to main storage. Memory reads and writes initiated by mechanisms outside the coherence domain are performed within the coherence domain in the order in which they enter the coherence domain and are performed as coherent accesses. ## 1.7.3 Atomic Update The Load And Reserve and Store Conditional instructions together permit atomic update of a shared storage location. There are word and doubleword forms of each of these instructions. Described here is the operation of the word forms *Iwarx* and *stwcx.*; operation of the doubleword forms *Idarx* and *stdcx*. is the same except for obvious substitutions. The *Iwarx* instruction is a load from a word-aligned location that has two side effects. Both of these side effects occur at the same time that the load is performed. - A reservation for a subsequent stwcx. instruction is created. - The memory coherence mechanism is notified that a reservation exists for the storage location specified by the *Iwarx*. The **stwcx**. instruction is a store to a word-aligned location that is conditioned on the existence of the reservation created by the **Iwarx** and on whether the same storage location is specified by both instructions. To emulate an atomic operation with these instructions, it is necessary that both the **Iwarx** and the **stwcx**. specify the same storage location. A **stwcx.** performs a store to the target storage location only if the storage location specified by the **Iwarx** that established the reservation has not been stored into by another processor or mechanism since the reservation was created. If the storage locations specified by the two instructions differ, the store is not necessarily performed. A stwcx. that performs its store is said to "succeed". Examples of the use of *Iwarx* and *stwcx*. are given in Appendix B. "Programming Examples for Sharing Storage" on page 459. A successful **stwcx.** to a given location may complete before its store has been performed with respect to other processors and mechanisms. As a result, a subsequent load or **lwarx** from the given location by another processor may return a "stale" value. However, a subsequent **lwarx** from the given location by the other processor followed by a successful **stwcx.** by that processor is guaranteed to have returned the value stored by the first processor's **stwcx**. (in the absence of other stores to the given location). #### Programming Note – The store caused by a successful **stwcx.** is ordered, by a dependence on the reservation, with respect to the load caused by the **lwarx** that established the reservation, such that the two storage accesses are performed in program order with respect to any processor or mechanism. #### 1.7.3.1 Reservations The ability to emulate an atomic operation using *lwarx* and *stwcx*. is based on the conditional behavior of *stwcx*., the reservation created by *lwarx*, and the clearing of that reservation if the target location is mod- ified by another processor or mechanism before the **stwcx.** performs its store. A reservation is held on an aligned unit of real storage called a reservation granule. The size of the reservation granule is 2<sup>n</sup> bytes, where n is implementation-dependent but is always at least 4 (thus the minimum reservation granule size is a quadword). The reservation granule associated with effective address EA contains the real address to which EA maps. ("real\_addr(EA)" in the RTL for the Load And Reserve and Store Conditional instructions stands for "real address to which EA maps".) A processor has at most one reservation at any time. A reservation is established by executing a *lwarx* or *ldarx* instruction, and is lost (or may be lost, in the case of the third, fifth, sixth and seventh item) if any of the following occur. - The processor holding the reservation executes another *Iwarx* or *Idarx*: this clears the first reservation and establishes a new one. - The processor holding the reservation executes any stwcx. or stdcx., regardless of whether the specified address matches the address specified by the lwarx or ldarx that established the reservation. - The processor holding the reservation executes a dcbf or dcbfl<S> to the reservation granule: whether the reservation is lost is undefined. - 4. Some other processor executes a *Store* or *dcbz* to the same reservation granule. - Some other processor executes a dcbtst, dcbst, dcbf (but not dcbfl<S>) to the same reservation granule: whether the reservation is lost is undefined. - <E> Some other processor executes a *dcba* to the same reservation granule: the reservation is lost if the instruction causes the target block to be newly established in a data cache or to be modified; otherwise whether the reservation is lost is undefined. - Any processor modifies a Reference or Change bit (see Book III-S) in the same reservation granule: whether the reservation is lost is undefined. - 8. Some mechanism other than a processor modifies a storage location in the same reservation granule. For the Server environment, interrupts (see Book III-S) do not clear reservations (however, system software invoked by interrupts may clear reservations); for the Embedded environment, interrupts do not necessarily clear reservations (see Book III-E). #### Programming Note - One use of Iwarx and stwcx. is to emulate a "Compare and Swap" primitive like that provided by the IBM System/370 Compare and Swap instruction; see Section B.1, "Atomic Update Primitives" on page 459. A System/370-style Compare and Swap checks only that the old and current values of the word being tested are equal, with the result that programs that use such a Compare and Swap to control a shared resource can err if the word has been modified and the old value subsequently restored. The combination of Iwarx and stwcx. improves on such a Compare and Swap, because the reservation reliably binds the *Iwarx* and *stwcx*. together. The reservation is always lost if the word is modified by another processor or mechanism between the Iwarx and stwcx., so the stwcx. never succeeds unless the word has not been stored into (by another processor or mechanism) since the Iwarx. #### **Programming Note -** In general, programming conventions must ensure that Iwarx and stwcx. specify addresses that match; a stwcx. should be paired with a specific Iwarx to the same storage location. Situations in which a stwcx. may erroneously be issued after some *Iwarx* other than that with which it is intended to be paired must be scrupulously avoided. For example, there must not be a context switch in which the processor holds a reservation in behalf of the old context, and the new context resumes after a *Iwarx* and before the paired *stwcx*. The *stwcx*. in the new context might succeed, which is not what was intended by the programmer. Such a situation must be prevented by executing a stwcx. or stdcx. that specifies a dummy writable aligned location as part of the context switch; see Section 6.4.3 of Book III-S and Section 5.5 of Book III-E. #### Programming Note Because the reservation is lost if another processor stores anywhere in the reservation granule, lock words (or doublewords) should be allocated such that few such stores occur, other than perhaps to the lock word itself. (Stores by other processors to the lock word result from contention for the lock, and are an expected consequence of using locks to control access to shared storage; stores to other locations in the reservation granule can cause needless reservation loss.) Such allocation can most easily be accomplished by allocating an entire reservation granule for the lock and wasting all but one word. Because reservation granule size is implementation-dependent, portable code must do such allocation dynamically. Similar considerations apply to other data that are shared directly using *Iwarx* and *stwcx*. (e.g., pointers in certain linked lists; see Section B.3, "List Insertion" on page 463). #### 1.7.3.2 Forward Progress Forward progress in loops that use *Iwarx* and *stwcx*. is achieved by a cooperative effort among hardware, system software, and application software. The architecture guarantees that when a processor executes a *Iwarx* to obtain a reservation for location X and then a *stwcx*, to store a value to location X, either - the stwcx. succeeds and the value is written to location X, or - the stwcx. fails because some other processor or mechanism modified location X, or - the stwcx. fails because the processor's reservation was lost for some other reason. In Cases 1 and 2, the system as a whole makes progress in the sense that some processor successfully modifies location X. Case 3 covers reservation loss required for correct operation of the rest of the system. This includes cancellation caused by some other processor writing elsewhere in the reservation granule for X, as well as cancellation caused by the operating system in managing certain limited resources such as real storage. It may also include implementation-dependent causes of reservation loss. An implementation may make a forward progress guarantee, defining the conditions under which the system as a whole makes progress. Such a guarantee must specify the possible causes of reservation loss in Case 3. While the architecture alone cannot provide such a guarantee, the characteristics listed in Cases 1 and 2 are necessary conditions for any forward progress guarantee. An implementation and operating system can build on them to provide such a guarantee. #### - Programming Note The architecture does not include a "fairness guarantee". In competing for a reservation, two processors can indefinitely lock out a third. ## 1.8 Instruction Storage The instruction execution properties and requirements described in this section, including its subsections, apply only to instruction execution that is required by the sequential execution model. In this section, including its subsections, it is assumed that all instructions for which execution is attempted are in storage that is not Caching Inhibited and (unless instruction address translation is disabled; see Book III) is not Guarded, and from which instruction fetching does not cause the system error handler to be invoked (e.g., from which instruction fetching is not prohibited by the "address translation mechanism" or the "storage protection mechanism"; see Book III). #### Programming Note - The results of attempting to execute instructions from storage that does not satisfy this assumption are described in Section 1.6.2 and Section 1.6.4 of this Book and in Book III. For each instance of executing an instruction from location X, the instruction may be fetched multiple times. The instruction cache is not necessarily kept consistent with the data cache or with main storage. It is the responsibility of software to ensure that instruction storage is consistent with data storage when such consistency is required for program correctness. After one or more bytes of a storage location have been modified and before an instruction located in that storage location is executed, software must execute the appropriate sequence of instructions to make instruction storage consistent with data storage. Otherwise the result of attempting to execute the instruction is boundedly undefined except as described in Section 1.8.1, "Concurrent Modification and Execution of Instructions" on page 419. #### Programming Note Following are examples of how to make instruction storage consistent with data storage. Because the optimal instruction sequence to make instruction storage consistent with data storage may vary between systems, many operating systems will provide a system service to perform this function. **Case 1:** The given program does not modify instructions executed by another program nor does another program modify the instructions executed by the given program. Assume that location X previously contained the instruction A0; the program modified one of more bytes of that location such that, in data storage, the location contains the instruction A1; and location X is wholly contained in a single cache block. The following instruction sequence will make instruction storage consistent with data storage such that if the *isync* was in location X-4, the instruction A1 in location X would be executed immediately after the *isync*. ``` dcbst X #copy the block to main storage sync #order copy before invalidation icbi X #invalidate copy in instr cache isync #discard prefetched instructions ``` **Case 2:** One or more programs execute the instructions that are concurrently being modified by another program. Assume program A has modified the instruction at location X and other programs are waiting for program A to signal that the new instruction is ready to execute. The following instruction sequence will make instruction storage consistent with data storage and then set a flag to indicate to the waiting programs that the new instruction can be executed. ``` li r0,1 #put a 1 value in r0 dcbst X #copy the block in main storage sync #order copy before invalidation ``` ``` icbi X #invalidate copy in instr cache sync #order invalidation before store # to flag stw r0,flag #set flag indicating instruction # storage is now consistent ``` The following instruction sequence, executed by the waiting program, will prevent the waiting programs from executing the instruction at location X until location X in instruction storage is consistent with data storage, and then will cause any prefetched instructions to be discarded. In the preceding instruction sequence any context synchronizing instruction (e.g., *rfid*) can be used instead of *isync*. (For Case 1 only *isync* can be used.) For both cases, if two or more instructions in separate data cache blocks have been modified, the *dcbst* instruction in the examples must be replaced by a sequence of *dcbst* instructions such that each block containing the modified instructions is copied back to main storage. Similarly, for *icbi* the sequence must invalidate each instruction cache block containing a location of an instruction that was modified. The *sync* instruction that appears above between "*dcbst* X" and "*icbi* X" would be placed between the sequence of *dcbst* instructions and the sequence of *icbi* instructions. # 1.8.1 Concurrent Modification and Execution of Instructions The phrase "concurrent modification and execution of instructions" (CMODX) refers to the case in which a processor fetches and executes an instruction from instruction storage which is not consistent with data storage or which becomes inconsistent with data storage prior to the completion of its processing. This section describes the only case in which executing this instruction under these conditions produces defined results. In the remainder of this section the following terminology is used. - Location X is an arbitrary word-aligned storage location. - X<sub>0</sub> is the value of the contents of location X for which software has made the location X in instruction storage consistent with data storage. - X<sub>1</sub>, X<sub>2</sub>, ..., X<sub>n</sub> are the sequence of the first n values occupying location X after X<sub>0</sub>. - X<sub>n</sub> is the first value of X subsequent to X<sub>0</sub> for which software has again made instruction storage consistent with data storage. - The "patch class" of instructions consists of the I-form *Branch* instruction (**b**[I][a]) and the preferred no-op instruction (**ori** 0,0,0). If the instruction from location X is executed after the copy of location X in instruction storage is made consistent for the value $X_0$ and before it is made consistent for the value $X_n$ , the results of executing the instruction are defined if and only if the following conditions are satisfied. - The stores that place the values X<sub>1</sub>, ..., X<sub>n</sub> into location X are atomic stores that modify all four bytes of location X. - 2. Each $X_i$ , $0 \le i \le n$ , is a patch class instruction. - Location X is in storage that is Memory Coherence Required. If these conditions are satisfied, the result of each execution of an instruction from location X will be the execution of some $X_i,\ 0\leq i\leq n.$ The value of the ordinate i associated with each value executed may be different and the sequence of ordinates i associated with a sequence of values executed is not constrained, (e.g., a valid sequence of executions of the instruction at location X could be the sequence $X_i,\ X_{i+2},$ then $X_{i-1}).$ If these conditions are not satisfied, the results of each such execution of an instruction from location X are boundedly undefined, and may include causing inconsistent information to be presented to the system error handler. #### **Programming Note** - An example of how failure to satisfy the requirements given above can cause inconsistent information to be presented to the system error handler is as follows. If the value $X_0$ (an illegal instruction) is executed, causing the system illegal instruction handler to be invoked, and before the error handler can load $X_0$ into a register, $X_0$ is replaced with $X_1$ , an *Add Immediate* instruction, it will appear that a legal instruction caused an illegal instruction exception. #### **Programming Note** It is possible to apply a patch or to instrument a given program without the need to suspend or halt the program. This can be accomplished by modifying the example shown in the Programming Note at the end of Section 1.8 where one program is creating instructions to be executed by one or more other programs. In place of the Store to a flag to indicate to the other programs that the code is ready to be executed, the program that is applying the patch would replace a patch class instruction in the original program with a Branch instruction that would cause any program executing the Branch to branch to the newly created code. The first instruction in the newly created code must be an isync, which will cause any prefetched instructions to be discarded, ensuring that the execution is consistent with the newly created code. The instruction storage location containing the isync instruction in the patch area must be consistent with data storage with respect to the processor that will execute the patched code before the Store which stores the new Branch instruction is performed. #### Programming Note It is believed that all processors that comply with versions of the architecture that precede Version 2.01 support concurrent modification and execution of instructions as described in this section if the requirements given above are satisfied, and that most such processors yield boundedly undefined results if the requirements given above are not satisfied. However, in general such support has not been verified by processor testing. Also, one such processor is known to yield undefined results in certain cases if the requirements given above are not satisfied. ## Chapter 2. Effect of Operand Placement on Performance #### 2.1 Instruction Restart ..... 423 The placement (location and alignment) of operands in storage affects relative performance of storage accesses, and may affect it significantly. The best performance is guaranteed if storage operands are aligned. In order to obtain the best performance across the widest range of implementations, the programmer should assume the performance model described in Figure 1 with respect to the placement of storage operands for the Embedded environment. For the Server environment, Figure 1 applies for Big-Endian byte ordering, and Figure 2 applies for Little-Endian byte ordering. Performance of storage accesses varies depending on the following: - 1. Operand Size - 2. Operand Alignment - 3. Crossing no boundary - 4. Crossing a cache block boundary - 5. Crossing a virtual page boundary The *Move Assist* instructions have no alignment requirements. | Ope | rand | Boui | ndary Cros | ssing | |-------------------------------------|----------------|-------------------------|-------------------|------------------------------| | Size | Byte<br>Align. | None | Cache<br>Block | Virtual<br>Page <sup>2</sup> | | Inte | ger | | | | | 8 Byte | 8<br>4<br><4 | optimal<br>good<br>good | good<br>good | -<br>good<br>good | | 4 Byte | 4 <4 | optimal<br>good | -<br>good | -<br>good | | 2 Byte | 2<br><2 | optimal<br>good | -<br>good | -<br>good | | 1 Byte | 1 | optimal | - | - | | lmw,<br>stmw | 4<br><4 | good<br>poor | good<br>poor | good<br>poor | | string | | good | good | good | | Flo | oat | | | | | Ifdp,<br>Ifdpx,<br>stfdp,<br>stfdpx | 16<br><16 | optimal<br>poor | -<br>poor | -<br>poor | | 8 Byte | 8<br>4<br><4 | optimal<br>good<br>poor | -<br>good<br>poor | -<br>poor<br>poor | | 4 Byte | 4<br><4 | optimal<br>poor | -<br>poor | -<br>poor | | Ved | tor | | | | | any | any | optimal <sup>3</sup> | - | - | If an instruction causes an access that is not atomic and any portion of the operand is in storage that is Write Through Required or Caching Inhibited, performance is likely to be poor. Figure 1. Performance effects of storage operand placement | Operand | | Boundary | ry Crossing | | | |-------------------------------------|----------------|----------------------|----------------|------------------------------|--| | Size | Byte<br>Align. | None | Cache<br>Block | Virtual<br>Page <sup>2</sup> | | | Integ | ger | | | | | | 8 Byte | 8 | optimal | - | - | | | | 4 | poor | poor | poor | | | | <4 | poor | poor | poor | | | 4 Byte | 4 | optimal | - | - | | | | <4 | poor | poor | poor | | | 2 Byte | 2 | optimal | - | - | | | | <2 | poor | poor | poor | | | 1 Byte | 1 | optimal | - | - | | | Float | | | | | | | Ifdp,<br>Ifdpx,<br>stfdp,<br>stfdpx | 16<br><16 | optimal<br>poor | -<br>poor | -<br>poor | | | 8 Byte | 8 | optimal | - | - | | | | 4 | poor | poor | poor | | | | <4 | poor | poor | poor | | | 4 Byte | 4 | optimal | - | - | | | | <4 | poor | poor | poor | | | Ved | ctor | | | | | | any | any | optimal <sup>3</sup> | - | - | | If an instruction causes an access that is not atomic and any portion of the operand is in storage that is Write Through Required or Caching Inhibited, performance is likely to be poor. Figure 2. [Category: Server] Performance effects of storage operand placement, Little-Endian If the storage operand spans two virtual pages that have different storage control attributes or, in the Server environment, spans two segments, performance is likely to be poor. The storage operands for Vector instructions are all assumed to be aligned (see Section 6.4 of Book I). If the storage operand spans two virtual pages that have different storage control attributes or, in the Server environment, spans two segments, performance is likely to be poor. The storage operands for Vector instructions are all assumed to be aligned (see Section 6.4 of Book I). #### 2.1 Instruction Restart In this section, "Load instruction" includes the Cache Management and other instructions that are stated in the instruction descriptions to be "treated as a Load", and similarly for "Store instruction". The following instructions are never restarted after having accessed any portion of the storage operand (unless the instruction causes a "Data Address Breakpoint match", for which the corresponding rules are given in Book III). - A Store instruction that causes an atomic access and, for the Embedded environment, accesses storage that is Guarded - A Load instruction that causes an atomic access to storage that is Guarded and, for the Server environment, is also Caching Inhibited. Any other *Load* or *Store* instruction may be partially executed and then aborted after having accessed a portion of the storage operand, and then re-executed (i.e., restarted, by the processor or the operating system). If an instruction is partially executed, the contents of registers are preserved to the extent that the correct result will be produced when the instruction is re-executed. Additional restrictions on the partial execution of instructions are described in Section 6.6 of Book III-S and Section 5.7 of Book III-E. #### Programming Note - In order to ensure that the contents of registers are preserved to the extent that a partially executed instruction can be re-executed correctly, the registers that are preserved must satisfy the following conditions. For any given instruction, zero or more of the conditions applies. - For a fixed-point Load instruction that is not a multiple or string form, or for an eciwx instruction, if RT=RA or RT=RB then the contents of register RT are not altered. - For an update form *Load* or *Store* instruction, the contents of register RA are not altered. #### **Programming Note** There are many events that might cause a *Load* or *Store* instruction to be restarted. For example, a hardware error may cause execution of the instruction to be aborted after part of the access has been performed, and the recovery operation could then cause the aborted instruction to be re-executed. When an instruction is aborted after being partially executed, the contents of the instruction pointer indicate that the instruction has not been executed, however, the contents of some registers may have been altered and some bytes within the storage operand may have been accessed. The following are examples of an instruction being partially executed and altering the program state even though it appears that the instruction has not been executed. - Load Multiple, Load String: Some registers in the range of registers to be loaded may have been altered. - 2. Any *Store* instruction, *dcbz*: Some bytes of the storage operand may have been altered. ## **Chapter 3. Storage Control Instructions** | 3.1 Parameters Useful to Application Programs | 3.4.1 Instruction Synchronize Instruction . 440 | |--------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------| | 3.2 Data Stream Control Register (DSCR) [Category: Stream] 426 | 3.4.2 Load and Reserve and Store Conditional Instructions | | 3.3 Cache Management Instructions . 427 3.3.1 Instruction Cache Instructions 428 3.3.2 Data Cache Instructions 429 | 3.4.2.1 64-Bit Load and Reserve and Store Conditional Instructions [Category: 64-Bit] 444 | | 3.3.2.1 Obsolete Data Cache Instructions [Category: Vector Phased-Out] 437 3.4 Synchronization Instructions 440 | 3.4.3 Memory Barrier Instructions 446<br>3.4.4 Wait Instruction 449 | # 3.1 Parameters Useful to Application Programs It is suggested that the operating system provide a service that allows an application program to obtain the following information. - 1. The virtual page sizes - 2. Coherence block size - 3. Granule sizes for reservations - An indication of the cache model implemented (e.g., Harvard-style cache, combined cache) - 5. Instruction cache size - 6. Data cache size - 7. Instruction cache block size - 8. Data cache block size - 9. Instruction cache associativity - 10. Data cache associativity - Number of stream IDs supported for the stream variant of *dcbt* - 12. Factors for converting the Time Base to seconds If the caches are combined, the same value should be given for an instruction cache attribute and the corresponding data cache attribute. # 3.2 Data Stream Control Register (DSCR) [Category: Stream] The layout of the Data Stream Control Register (DSCR) is shown in Figure 3 below. See Section 3.3.2 for information on streams. Figure 3. Data Stream Control Register | Bits<br>60 | Name<br>SSE | Description Store Stream Enable This bit enables hardware detection and initiation of store streams. | |------------|-------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 61:63 | DPFD | Default Prefetch Depth | | | | This field supplies a prefetch depth for hardware-detected streams and for software-defined streams for which a depth of zero is specified or for which <i>dcbt</i> dcbtst with TH=1010 is not used in their description. Values and their meanings are as follows. 0 default (LPCR <sub>DPFD</sub> ) 1 none 2 shallowest 3 shallow 4 medium 5 deep 6 deeper 7 deepest | The contents of the DSCR affect how a processor handles hardware-detected and software-defined data streams. A move to the DSCR causes all active and nascent data streams to cease to exist. ## 3.3 Cache Management Instructions The Cache Management instructions obey the sequential execution model except as described in Section 3.3.1. In the instruction descriptions the statements "this instruction is treated as a *Load*" and "this instruction is treated as a *Store*" mean that the instruction is treated as a *Load* (*Store*) from (to) the addressed byte with respect to address translation, the definition of program order on page 407, storage protection, reference and change recording<S>, and the storage access ordering described in Section 1.7.1 and is treated as a *Read* (*Write*) from (to) the addressed byte with respect to debug events unless otherwise specified. (See Book III-E.) Some Cache Management instructions contain a CT field that is used to specify a cache level within a cache hierarchy or a portion of a cache structure to which the instruction is to be applied. The correspondence between the CT value specified and the cache level is shown below. | CT Field Value | Cache Level | | | |----------------|-----------------|--|--| | 0 | Primary Cache | | | | 2 | Secondary Cache | | | CT values not shown above may be used to specify implementation-dependent cache levels or implementation-dependent portions of a cache structure. #### 3.3.1 Instruction Cache Instructions #### Instruction Cache Block Invalidate X-form icbi RA,RB | 31 | /// | RA | RB | 982 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). If the block containing the byte addressed by EA is in storage that is Memory Coherence Required and a block containing the byte addressed by EA is in the instruction cache of any processors, the block is invalidated in those instruction caches. If the block containing the byte addressed by EA is in storage that is not Memory Coherence Required and the block is in the instruction cache of this processor, the block is invalidated in that instruction cache. The function of this instruction is independent of whether the block containing the byte addressed by EA is in storage that is Write Through Required or Caching Inhibited. This instruction is treated as a *Load* (see Section 3.3), except that reference and change recording<S> need not be done. #### **Special Registers Altered:** None #### **Programming Note** Because the instruction is treated as a *Load*, the effective address is translated using translation resources that are used for data accesses, even though the block being invalidated was copied into the instruction cache based on translation resources used for instruction fetches (see Book III). #### Programming Note The invalidation of the specified block need not have been performed with respect to the processor executing the *icbi* instruction until a subsequent *isync* instruction has been executed by that processor. No other instruction or event has the corresponding effect. #### Instruction Cache Block Touch X-form icbt CT, RA, RB [Category: Embedded] | 31 | / | CT | RA | RB | 22 | / | |----|---|----|----|----|----|----| | 0 | 6 | 7 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). If CT=0, this instruction provides a hint that the program will probably soon execute code from the addressed location. If CT=0, the operation performed by this instruction is implementation-dependent, except that the instruction is treated as a no-op for values of CT that are not implemented. The hint is ignored if the block is Caching Inhibited. This instruction treated as a *Load* (see Section 3.3), except that the system instruction storage error handler is not invoked. #### **Special Registers Altered:** None #### 3.3.2 Data Cache Instructions The Data Cache instructions control various aspects of the data cache. #### TH field in the dcbt and dcbtst instructions Described below are the TH field values for the *dcbt* and *dcbtst* instructions. For all TH field values which are not listed, the hint provided by the instruction is undefined. #### TH=0b00000 If TH=0b00000, the *dcbt/dcbtst* instruction provides a hint that the program will probably soon access the block containing the byte addressed by EA. ## TH=0b00000 - 0b00111 [Category: Cache Specification] In addition to the hint specified above for the TH field value of 0b00000, an additional hint is provided indicating that placement of the block in the cache specified by the TH field might also improve performance. The correspondence between each value of the TH field and the cache to be specified is the same as the correspondence between each value the CT field and the cache to be specified as defined in Section 3.2. The hints corresponding to values of the TH field not supported by the implementation are undefined. #### TH=0b01000 - 0b01111 [Category: Stream] The *dcbt/dcbtst* instructions provide hints regarding a sequence of accesses to data, or indicate the expected use thereof. Such a sequence is called a "data stream", and a *dcbt/dcbtst* instruction in which TH is set to one of these values is said to be a "data stream variant" of *dcbt/dcbtst*. In the remainder of this section, "data stream" may be abbreviated to "stream". A data stream to which a program may perform *Load* accesses is said to be a "load data stream", and is described using the data stream variants of the *dcbt* instruction. A data stream to which a program may perform *Store* accesses is said to be a "store data stream", and is described using the data stream variants of the *dcbtst* instruction. When, and how often, effective addresses for a data stream are translated is implementation-dependent. The address and length of such data streams is specified in terms of aligned 128-byte units of storage; in the remainder of this instruction description, "aligned 128-byte unit of storage" is abbreviated to "unit". Each such data stream is associated, by software, with a stream ID, which is a resource that the processor uses to distinguish the data stream from other such data streams. The number of stream IDs is an implementation-dependent value in the range 1:16. Stream IDs are numbered sequentially starting from 0. The encodings of the TH field and of the corresponding EA values, are as follows. In the EA layout diagrams, fields shown as "/"s are reserved. These fields, and reserved values of defined EA fields, are treated in the same manner as the corresponding cases for instruction fields (see the section entitled "Reserved Fields and Reserved Values" in Book I), except that a reserved value in a defined EA field does not make the instruction form invalid. If a defined EA field contains a reserved value, the hint provided by the instruction is undefined. #### TH Description #### 01000 The *dcbt/dcbtst* instruction provides a hint that describes certain attributes of a data stream, and may indicate that the program will probably soon access the stream. The EA is interpreted as follows. | | EATRUNC | D UG | / | ID | |---|---------|------|----|-------| | 0 | | 57 | 59 | 60 63 | #### Bit(s) Description #### 0:56 **EATRUNC** High-order 57 bits of the effective address of the first unit of the data stream. (i.e., the effective address of the first unit of the stream is EATRUNC $\parallel^{7}0$ ) #### 57 **Direction** (D) - O Subsequent units are the sequentially following units. - Subsequent units are the sequentially preceding units. #### 58 Unlimited/GO (UG) - 0 No information is provided by the UG field. - 1 The number of units in the data stream is unlimited, the program's need for each unit of the stream is not likely to be transient, and the program will probably soon access the stream. #### 59 Reserved #### 60:63 Stream ID (ID) Stream ID to use for this data stream. 01010 The *dcbt/dcbtst* instruction provides a hint that describes certain attributes of a data stream, or indicates that the program will probably soon access data streams that have been described using data stream variants of the **dcbt/dcbtst** instruction, or will probably no longer access such data streams. The EA is interpreted as follows. If GO=1 and S≠0b00 the hint provided by the instruction is undefined; the remainder of this instruction description assumes that this combination is not used. #### Bit(s) Description 0:31 Reserved 32 **GO** - No information is provided by the GO field. - 1 For *dcbt*, the program will probably soon access all nascent load and store data streams that have been completely described, and will probably no longer access all other nascent load and store data streams. All other fields of the EA are ignored. ("Nascent" and "completely described" are defined below.) For *dcbtst*, this field value holds no meaning and is treated as though it were zero. #### 33:34 **Stop** (S) - 00 No information is provided by the S field. - 01 Reserved - 10 The program will probably no longer access the data stream (if any) associated with the specified stream ID. (All other fields of the EA except the ID field are ignored.) - 11 For dcbt, the program will probably no longer access the load and store data streams associated with all stream IDs. (All other fields of the EA are ignored.) For dcbtst, this field value holds no meaning, and is treated as though it were 0b00. #### 35 Reserved #### 36:38 **Depth** (DEP) The DEP field provides a relative estimate of how many units ahead of the point of stream-use the latency-reducing actions should go. This value reflects a comparison of the rate of consumption of the units of the data stream and the latency to bring an arbi- trary unit of the stream into cache. The values are as follows. - 0 default = DSCR<sub>DPFD</sub> - 1 none - 2 shallowest - 3 shallow - 4 medium - 5 deep - 6 deeper - 7 deepest #### 39:46 Reserved #### 47:56 UNITCHT Number of units in data stream. #### 57 **Transient** (T) If T=1, the program's need for each unit of the data stream is likely to be transient (i.e., the time interval during which the program accesses the unit is likely to be short). #### 58 *Unlimited* (U) If U=1, the number of units in the data stream is unlimited (and the UNITCNT field is ignored). #### 59 Reserved #### 60:63 Stream ID (ID) Stream ID to use for this data stream (GO=0 and S=0b00), or stream ID associated with the data stream which the program will probably no longer access(S=0b10). If the specified stream ID value is greater than m-1, where m is the number of stream IDs provided by the implementation, and either (a) TH=0b01000 or (b) TH=0b01010 with GO=0 and S≠0b11, no hint is provided by the instruction. The following terminology is used to describe the state of a data stream. Except as described in the paragraph after the next paragraph, the state of a data stream at a given time is determined by the most recently provided hint for the stream. - A data stream for which only descriptive hints have been provided (by *dcbt/dcbtst* instructions with TH=0b01000 and UG=0 or with TH=0b01010 and GO=0 and S=0b00) is said to be "nascent". A nascent data stream for which both kinds of descriptive hints have been provided (by both of the *dcbt/dcbtst* usages listed in the preceding sentence) is considered to be "completely described". - A data stream for which a hint has been provided (by a dcbt/dcbtst instruction with TH=0b01000 and UG=1 or dcbt with TH=0b01010 and GO=1) - that the program will probably soon access it is said to be "active". - A data stream that is either nascent or active is considered to "exist". - A data stream for which a hint has been provided (e.g., by a *dcbt* instruction with TH=0b01010 and S≠0b00) that the program will probably no longer access it is considered no longer to exist. - The hint provided by a *dcbt/dcbtst* instruction with TH=0b01000 and UG=1 implicitly includes a hint that the program will probably no longer access the data stream (if any) previously associated with the specified stream ID. The hint provided by a *dcbt/dcbtst* instruction with TH=0b01000 and UG=0, or with TH=0b01010 and GO=0 and S=0b00 implicitly includes a hint that the program will probably no longer access the *active* data stream (if any) previously associated with the specified stream ID. If a data stream is specified without using a *dcbt/dcbtst* instruction with TH=0b01010 and GO=0 and S=0b00, then the number of units in the stream is unlimited, and the program's need for each unit of the stream is not likely to be transient. Interrupts (see Book III) cause all existing data streams to cease to exist. In addition, depending on the implementation, certain conditions and events may cause an existing data stream to cease to exist; for example, in some implementations an existing data stream ceases to exist when it comes to the end of a page. #### **Programming Note** To obtain the best performance across the widest range of implementations that support the data stream variants of **dcbt/dcbtst**, the programmer should assume the following model when using those variants. ■ The processor's response to a hint that the program will probably soon access a given data stream is to take actions that reduce the latency of accesses to the first few units of the stream. (Such actions may include prefetching cache blocks into levels of the storage hierarchy that are "near" the processor.) Thereafter, as the program accesses each successive unit of the stream, the processor takes latency-reducing actions for additional units of the stream, pacing these actions with the program's accesses (i.e., taking the actions for only a limited number of units ahead of the unit that the program is currently accessing). The processor's response to a hint that the program will probably no longer access a given data stream, or to the cessation of existence of a data stream, is to stop taking latency-reducing actions for the stream. - A data stream having finite length ceases to exist when the latency-reducing actions have been taken for all units of the stream. - If the program ceases to need a given data stream before having accessed all units of the stream (always the case for streams having unlimited length), performance may be improved if the program then provides a hint that it will no longer accesses the stream (e.g., by executing the appropriate *dcbt* instruction with TH=0b01010 and S≠0b00). - At each level of the storage hierarchy that is "near" the processor, units of a data stream that is specified as transient are most likely to be replaced. As a result, it may be desirable to stagger addresses of streams (choose addresses that map to different cache congruence classes) to reduce the likelihood that a unit of a transient stream will be replaced prior to being accessed by the program. - Processors that comply with versions of the architecture that do not support the TH field at all treat TH = 0b01000 and 0b01010 as if TH = 0b000000. - A single set of stream IDs is shared between the dcbt and dcbtst instructions. - On some implementations, data streams that are not specified by software may be detected by the processor. Such data streams are called "hardware-detected data streams". On some such implementations, data stream resources (resources that are used primarily to support data streams) are shared between software-specified data streams and hardware-detected data streams. On these latter implementations, the programming model includes the following. - Software-specified data streams take precedence over hardware-detected data streams in use of data stream resources. - The processor's response to a hint that the program will probably no longer access a given data stream, or to the cessation of existence of a data stream, includes releasing the associated data stream resources, so that they can be used by hardware-detected data streams. #### **Programming Note** This Programming Note describes several aspects of using the data stream variants of the *dcbt* and *dcbtst* instructions. - A non-transient data stream having unlimited length can be completely specified, including providing the hint that the program will probably soon access it, using one dcbt instruction. The corresponding specification for a data stream having other attributes requires two dcbt/dcbtst instructions to describe the stream and one additional dcbt instruction to start the stream. However, one dcbt instruction with TH=0b01010 and GO=1 can apply to a set of the data streams described in the preceding sentence, so the corresponding specification for n such data streams requires 2×n dcbt/ dcbtst instructions plus one dcbt instruction. (There is no need to execute a dcbt/dcbtst instruction with TH=0b01010 and S=0b10 for a given stream ID before using the stream ID for a new data stream; the implicit portion of the hint provided by dcbt/dcbtst instructions that describe data streams suffices.) - If it is desired that the hint provided by a given dcbt/dcbtst instruction be provided in program order with respect to the hint provided by another dcbt/dcbtst instruction, the two instructions must be separated by an eieio<S> (or sync) instruction. For example, if a dcbt instruction with TH=0b01010 and GO=1 is intended to indicate that the program will probably soon access nascent data streams described (completely) by preceding dcbt/dcbtst instructions, and is intended not to indicate that the program will probably soon access nascent data streams described (completely) by following dcbt/dcbtst instructions, an eieio<S> (or sync) instruction must separate - the *dcbt* instruction with GO=1 from the preceding *dcbt/dcbtst* instructions, and another *eieio*<S> (or *sync*) instruction must separate that *dcbt* instruction from the following *dcbt/dcbtst* instructions. - In practice, the second *eieio*<S> (or *sync*) described above can sometimes be omitted. For example, if the program consists of an outer loop that contains the *dcbt/dcbtst* instructions and an inner loop that contains the *Load* or *Store* instructions that access the data streams, the characteristics of the inner loop and of the implementation's branch prediction mechanisms may make it highly unlikely that hints corresponding to a given iteration of the outer loop will be provided out of program order with respect to hints corresponding to the previous iteration of the outer loop. (Also, any providing of hints out of program order affects only performance, not program correctness.) - To mitigate the effects of interrupts on data streams, it may be desirable to specify a given "logical" data stream as a sequence of shorter, component data streams. Similar considerations apply to conditions and events that, depending on the implementation, may cause an existing data stream to cease to exist; for example, in some implementations an existing data stream ceases to exist when it comes to the end of a virtual page. - If it is desired to specify data streams without regard to the number of stream IDs provided by the implementation, stream IDs should be assigned to data streams in order of decreasing stream importance (stream ID 0 to the most important stream, stream ID 1 to the next most important stream, etc.). This order ensures that the hints for the most important data streams will be provided. #### Data Cache Block Allocate X-form dcba RA,RB [Category: Embedded] Let the effective address (EA) be the sum (RA|0)+(RB). This instruction provides a hint that the program will probably soon store into a portion of the block and the contents of the rest of the block are not meaningful to the program. The contents of the block are undefined when the instruction completes. The hint is ignored if the block is Caching Inhibited. This instruction is treated as a *Store* (see Section 3.3) except that the instruction is treated as a no-op if execution of the instruction would cause the system data storage error handler to be invoked. #### **Special Registers Altered:** None #### Data Cache Block Touch #### X-form | dcbt | RA,RB,TH [Category: Server] | |------|-------------------------------| | dcbt | TH,RA,RB [Category: Embedded] | | 31 | TH | RA | RB | 278 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). The dcbt instruction provides a hint that describes a block or data stream to which the program may perform a Load access. The instruction is also used to indicate imminent access or end of access to described load and store data streams. A hint that the program will probably soon load from a given storage location is ignored if the location is Caching Inhibited or Guarded<S>. The only operation that is "caused" by the dcbt instruction is the providing of the hint. The actions (if any) taken by the processor in response to the hint are not considered to be "caused by" or "associated with" the dcbt instruction (e.g., dcbt is considered not to cause any data accesses). No means are provided by which software can synchronize these actions with the execution of the instruction stream. For example, these actions are not ordered by the memory barrier created by a **sync** instruction. The dcbt instruction may complete before the operation it causes has been performed. The nature of the hint depends, in part, on the value of the TH field, as specified at the beginning of this section. If TH≠0b01010 this instruction is treated as a Load (see Section 3.2), except that the system data storage error handler is not invoked, and reference and change recording<S> need not be done. #### **Special Registers Altered:** None #### **Extended Mnemonics:** Extended mnemonics are provided for the Data Cache Block Touch instruction so that it can be coded with the TH value as the last operand for all categories. #### **Extended:** #### Equivalent to: dcbtct RA,RB,TH dcbt for TH values of 0b00000 - 0b00111: other TH values are invalid. dcbtds RA,RB,TH dcbt for TH values of 0b00000 or 0b01000 - 0b01111: other TH values are invalid. #### Programming Notes – New programs should avoid using the dcbt and dcbtst mnemonics; one of the extended mnemonics should be used exclusively. <S> If the *dcbt* mnemonic is used with only two operands, the TH operand assumed to be 0b00000. Processors that comply with versions of the architecture that precede Version 2.01 do not necessarily ignore the hint provided by dcbt and dcbtst if the specified block is in storage that is Guarded <S> and not Caching Inhibited. #### Programming Note See the Programming Notes at the beginning of #### Data Cache Block Touch for Store X-form dcbtst RA,RB,TH [Category: Server] dcbtst TH,RA,RB [Category: Embedded] | 31 | TH | RA | RB | 246 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). The *dcbtst* instruction provides a hint that describes a block or data stream to which the program may perform a *Store* access, or indicates the expected use thereof. A hint that the program will soon store to a given storage location is ignored if the location is Caching Inhibited or Guarded<S>. The only operation that is "caused by" the *dcbtst* instruction is the providing of the hint. The actions (if any) taken by the processor in response to the hint are not considered to be "caused by" or "associated with" the *dcbtst* instruction (e.g., *dcbtst* is considered not to cause any data accesses). No means are provided by which software can synchronize these actions with the execution of the instruction stream. For example, these actions are not ordered by memory barriers. The *dcbtst* instruction may complete before the operation it causes has been performed. The nature of the hint depends, in part, on the value of the TH field, as specified at the beginning of this section. If TH≠0b01010 this instruction is treated as a *Load* (see Section 3.2), except that the system data storage error handler is not invoked, and reference and change recording<S> need not be done. #### **Special Registers Altered:** None #### **Extended Mnemonic:** An extended mnemonic is provided for the *Data Cache Block Touch for Store* instruction so that it can be coded with the TH value as the last operand for all categories. #### Extended: Equivalent to: dcbtstct RA,RB,TH dcbtst for TH values of 0b00000 or 0b00000 - 0b00111; other TH values are invalid. dcbtstds RA,RB,TH dcbtst for TH values of 0b000000 or 0b01000 - 0b01010; other TH values are invalid. #### - Programming Note See the Programming Notes at the beginning of this section. #### Data Cache Block set to Zero #### X-form #### Data Cache Block Store X-form 31 dcbz RA,RB | 31 | /// | RA | RB | 1014 | / | |----|-----|----|----|------|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) n ← block size (bytes) m \leftarrow log_2(n) ea \leftarrow EA_{0:63-m} | | ^{m}0 MEM(ea, n) \leftarrow ^{n}0x00 ``` Let the effective address (EA) be the sum (RA|0)+(RB). All bytes in the block containing the byte addressed by EA are set to zero. This instruction is treated as a Store (see Section 3.3). #### Special Registers Altered: None #### **Programming Note** - dcbz does not cause the block to exist in the data cache if the block is in storage that is Caching Inhibited. For storage that is neither Write Through Required nor Caching Inhibited, dcbz provides an efficient means of setting blocks of storage to zero. It can be used to initialize large areas of such storage, in a manner that is likely to consume less memory bandwidth than an equivalent sequence of Store instructions. For storage that is either Write Through Required or Caching Inhibited, dcbz is likely to take significantly longer to execute than an equivalent sequence of Store instructions. For example, on some implementations dcbz for such storage may cause the system alignment error handler to be invoked; on such implementations the system alignment error handler sets the specified block to zero using Store instructions. See Section 5.9.1 of Book III-S and Section 4.9.1 of Book III-E. for additional information about dcbz. | dcbst | RA,R | В | | | |-------|------|----|----|----| | | | | | | | 31 | /// | RA | RB | 54 | Let the effective address (EA) be the sum (RA|0)+(RB). 16 If the block containing the byte addressed by EA is in storage that is Memory Coherence Required and a block containing the byte addressed by EA is in the data cache of any processor and any locations in the block are considered to be modified there, those locations are written to main storage, additional locations in the block may be written to main storage, and the block ceases to be considered to be modified in that data cache. If the block containing the byte addressed by EA is in storage that is not Memory Coherence Required and the block is in the data cache of this processor and any locations in the block are considered to be modified there, those locations are written to main storage, additional locations in the block may be written to main storage, and the block ceases to be considered to be modified in that data cache. The function of this instruction is independent of whether the block containing the byte addressed by EA is in storage that is Write Through Required or Caching Inhibited. This instruction is treated as a Load (see Section 3.3), except that reference and change recording<S> need not be done, and it is treated as a Write with respect to debug events. #### **Special Registers Altered:** None #### Data Cache Block Flush X-form dcbf RA,RB,L | Ī | 31 | /// | L | RA | RB | 86 | / | |---|----|-----|---|----|----|----|----| | | 0 | 6 | 9 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). #### L=0 If the block containing the byte addressed by EA is in storage that is Memory Coherence Required and a block containing the byte addressed by EA is in the data cache of any processor and any locations in the block are considered to be modified there, those locations are written to main storage and additional locations in the block may be written to main storage. The block is invalidated in the data caches of all processors. If the block containing the byte addressed by EA is in storage that is not Memory Coherence Required and the block is in the data cache of this processor and any locations in the block are considered to be modified there, those locations are written to main storage and additional locations in the block may be written to main storage. The block is invalidated in the data cache of this processor. #### L=1 ("dcbf local") [Category: Server ] The L=1 form of the *dcbf* instruction permits a program to limit the scope of the "flush" operation to the data cache of this processor. If the block containing the byte addressed by EA is in the data cache of this processor, it is removed from this cache. The coherence of the block is maintained to the extent required by the Memory Coherence Required storage attribute. #### L = 3 ("dcbf local primary") [Category: Server] The L=3 form of the *dcbf* instruction permits a program to limit the scope of the "flush" operation to the primary data cache of this processor. If the block containing the byte addressed by EA is in the primary data cache of this processor, it is removed from this cache. The coherence of the block is maintained to the extent required by the Memory Coherence Required storage attribute. For the L operand, the value 2 is reserved. The results of executing a *dcbf* instruction with L=2 are boundedly undefined. The function of this instruction is independent of whether the block containing the byte addressed by EA is in storage that is Write Through Required or Caching Inhibited. This instruction is treated as a *Load* (see Section 3.3), except that reference and change recording<S> need not be done, and it is treated as a Write with respect to debug events. #### **Special Registers Altered:** None #### **Extended Mnemonics:** Extended mnemonics are provided for the *Data Cache Block Flush* instruction so that it can be coded with the L value as part of the mnemonic rather than as a numeric operand. These are shown as examples with the instruction. See Appendix A. "Assembler Extended Mnemonics" on page 457. The extended mnemonics are shown below. | Extended: | Equivalent to: | |----------------------|----------------| | dcbf RA,RB | dcbf RA,RB,0 | | dcbfl RA,RB <s></s> | dcbf RA,RB,1 | | dcbflp RA,RB <s></s> | dcbf RA,RB,3 | Except in the *dcbf* instruction description in this section, references to "*dcbf*" in Books I-III imply L=0 unless otherwise stated or obvious from context; "*dcbfl*<S>" is used for L=1 and "*dcbflp*<S>" is used for L=3. #### Programming Note - **dcbf** serves as both a basic and an extended mnemonic. The Assembler will recognize a **dcbf** mnemonic with three operands as the basic form, and a **dcbf** mnemonic with two operands as the extended form. In the extended form the L operand is omitted and assumed to be 0. #### Programming Note [Category: Server] - **dcbf** with L=1 can be used to provide a hint that a block in this processor's data cache will not be reused soon. **dcbf** with L=3 can be used to flush a block from the processor's primary data cache but reduce the latency of a subsequent access. For example, the block may be evicted from the primary data cache but a copy retained in a lower level of the cache hierarchy. Programs which manage coherence in software must use *dcbf* with L=0. ## 3.3.2.1 Obsolete Data Cache Instructions [Category: Vector.Phased-Out] The Data Stream Touch (dst), Data Stream Touch for Store (dstst), and Data Stream Stop (dss) instructions (primary opcode 31, extended opcodes 342, 374, and 822 respectively), which were proposed for addition to the Power ISA and were implemented by some processors, must be treated as no-ops (rather than as illegal instructions). The treatment of these instructions is independent of whether other Vector instructions are available (i.e., is independent of the contents of MSR<sub>VEC</sub><S> (see Book III-S) or MSR<sub>SPV</sub> (see Book III-E). #### **Programming Note -** These instructions merely provided hints, and thus were permitted to be treated as no-ops even on processors that implemented them. The treatment of these instructions is independent of whether other Vector instructions are available because, on processors that implemented the instructions, the instructions were available even when other Vector instructions were not. The extended mnemonics for these instructions were *dstt, dststt,* and *dssall.* ## 3.4 Synchronization Instructions The synchronization instructions are used to ensure that certain instructions have completed before other instructions are initiated, or to control storage access ordering, or to support debug operations. # 3.4.1 Instruction Synchronize Instruction #### Instruction Synchronize XL-form isync | 19 | /// | /// | /// | 150 | / | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Executing an *isync* instruction ensures that all instructions preceding the *isync* instruction have completed before the *isync* instruction completes, and that no subsequent instructions are initiated until after the *isync* instruction completes. It also ensures that all instruction cache block invalidations caused by *icbi* instructions preceding the *isync* instruction have been performed with respect to the processor executing the *isync* instruction, and then causes any prefetched instructions to be discarded. Except as described in the preceding sentence, the **isync** instruction may complete before storage accesses associated with instructions preceding the **isync** instruction have been performed. This instruction is context synchronizing (see Book III). #### **Special Registers Altered:** None ## 3.4.2 Load and Reserve and Store Conditional Instructions The Load And Reserve and Store Conditional instructions can be used to construct a sequence of instructions that appears to perform an atomic update operation on an aligned storage location. See Section 1.7.3, "Atomic Update" for additional information about these instructions. The Load And Reserve and Store Conditional instructions are fixed-point Storage Access instructions; see Section 3.3.1, "Fixed-Point Storage Access Instructions", in Book I. The storage location specified by the *Load And Reserve* and *Store Conditional* instructions must be in storage that is Memory Coherence Required if the location may be modified by other processors or mechanisms. If the specified location is in storage that is Write Through Required or Caching Inhibited, the system data storage error handler or the system alignment error handler is invoked for the Server environment and may be invoked for the Embedded environment. The Load and Reserve instructions include an Exclusive Access hint (EH), which can be used to indicate that the instruction sequence being executed is implementing one of two types of algorithms: #### Atomic Update (EH=0) This hint indicates that the program is using a fetch and operate (e.g., fetch and add) or some similar algorithm and that all programs accessing the shared variable are likely to use a similar operation to access the shared variable for some time. #### Exclusive Access (EH=1) This hint indicates that the program is attempting to acquire a lock and if it succeeds, will perform another store to the lock variable (releasing the lock) before another program attempts to modify the lock variable. #### Programming Note \_ The Memory Coherence Required attribute on other processors and mechanisms ensures that their stores to the reservation granule will cause the reservation created by the Load And Reserve instruction to be lost. #### Programming Note - Because the Load And Reserve and Store Conditional instructions have implementation dependencies (e.g., the granularity at which reservations are managed), they must be used with care. The operating system should provide system library programs that use these instructions to implement the high-level synchronization functions (Test and Set, Compare and Swap, locking, etc.; see Appendix B) that are needed by application programs. Application programs should use these library programs, rather than use the Load And Reserve and Store Conditional instructions directly. #### Programming Note - EH = 1 should be used when the program is obtaining a lock variable which it will subsequently release before another program attempts to perform a store to it. When contention for a lock is significant, using this hint may reduce the number of times a cache block is transferred between processor caches. EH = 0 should be used when all accesses to a mutex variable are performed using an instruction sequence with *Load and Reserve* followed by *Store Conditional* (e.g., emulating atomic update primitives such as "Fetch and Add"; see Appendix B). The processor may use this hint to optimize the cache to cache transfer of the block containing the mutex variable, thus reducing the latency of performing an operation such as 'Fetch and Add'. #### Programming Note - **Warning**: On some processors that comply with versions of the architecture that precede Version 2.00, executing a *Load And Reserve* instruction in which EH = 1 will cause the illegal instruction error handler to be invoked. #### Load Word And Reserve Indexed X-form #### lwarx RT,RA,RB,EH | | 31 | RT | RA | RB | 20 | EΗ | |---|----|----|----|----|----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RESERVE \leftarrow 1 RESERVE_ADDR \leftarrow real_addr(EA) RT \leftarrow 320 | MEM(EA, 4) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. This instruction creates a reservation for use by a *Store Word Conditional* instruction. An address computed from the EA as described in Section 1.7.3.1 is associated with the reservation, and replaces any address previously associated with the reservation. The value of EH provides a hint as to whether the program will perform a subsequent store to the word in storage addressed by EA before some other processor attempts to modify it. - Other programs might attempt to modify the doubleword in storage addressed by EA regardless of the result of the corresponding Store Word/Doubleword Conditional instruction. - Other programs will not attempt to modify the word in storage addressed by EA until the program that has acquired the lock performs a subsequent store releasing the lock. EA must be a multiple of 4. If it is not, either the system alignment error handler is invoked or the results are boundedly undefined. #### **Special Registers Altered:** None #### **Programming Note** **Iwarx** serves as both a basic and an extended mnemonic. The Assembler will recognize a **Iwarx** mnemonic with four operands as the basic form, and a **Iwarx** mnemonic with three operands as the extended form. In the extended form the EH operand is omitted and assumed to be 0. #### Store Word Conditional Indexed X-form stwcx. RS,RA,RB | 31 | RS | RA | RB | 150 | 1 | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) if RESERVE then if RESERVE ADDR = real_addr(EA) then MEM(EA, 4) \leftarrow (RS)<sub>32:63</sub> CR0 \leftarrow 0b00 || 0b1 || XER<sub>SO</sub> else u1 \leftarrow undefined 1-bit value if u1 then MEM(EA, 4) \leftarrow (RS)<sub>32:63</sub> u2 \leftarrow undefined 1-bit value CR0 \leftarrow 0b00 || u2 || XER<sub>SO</sub> RESERVE \leftarrow 0 else CR0 \leftarrow 0b00 || 0b0 || XER<sub>SO</sub> ``` Let the effective address (EA) be the sum (RA|0)+(RB). If a reservation exists and the storage location specified by the **stwcx**. is the same as the location specified by the **Load And Reserve** instruction that established the reservation, (RS)<sub>32:63</sub> are stored into the word in storage addressed by EA and the reservation is cleared. If a reservation exists but the storage location specified by the **stwcx.** is not the same as the location specified by the **Load And Reserve** instruction that established the reservation, the reservation is cleared, and it is undefined whether (RS)<sub>32:63</sub> are stored into the word in storage addressed by EA. If a reservation does not exist, the instruction completes without altering storage. CR Field 0 is set as follows. n is a 1-bit value that indicates whether the store was performed, except that if a reservation exists but the storage location specified by the **stwcx**. is not the same as the location specified by the **Load And Reserve** instruction that established the reservation the value of n is undefined. EA must be a multiple of 4. If it is not, either the system alignment error handler is invoked or the results are boundedly undefined. #### **Special Registers Altered:** CR0 ## 3.4.2.1 64-Bit Load and Reserve and Store Conditional Instructions [Category: 64-Bit] Idarx # Store Doubleword Conditional Indexed X-form # Load Doubleword And Reserve Indexed X-form RT,RA,RB,EH stdcx. RS,RA,RB | 31 | RS | RA | RB | 214 | 1 | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 b ← (RA) else EA \leftarrow b + (RB) if RESERVE then if RESERVE ADDR = real addr(EA) then MEM(EA, 8) \leftarrow (RS) CR0 ← 0b00 || 0b1 || XER<sub>SO</sub> else u1 ← undefined 1-bit value if u1 then MEM(EA, 8) \leftarrow (RS) u2 ← undefined 1-bit value CR0 \leftarrow 0b00 \mid \mid u2 \mid \mid XER_{SO} RESERVE ← 0 else CR0 ← 0b00 || 0b0 || XER<sub>SO</sub> ``` Let the effective address (EA) be the sum (RA|0)+(RB). If a reservation exists and the storage location specified by the **stdcx**. is the same as the location specified by the **Load And Reserve** instruction that established the reservation, (RS) is stored into the doubleword in storage addressed by EA and the reservation is cleared. If a reservation exists but the storage location specified by the *stdcx*. is not the same as the location specified by the *Load And Reserve* instruction that established the reservation, the reservation is cleared, and it is undefined whether (RS) is stored into the doubleword in storage addressed by EA. If a reservation does not exist, the instruction completes without altering storage. CR Field 0 is set as follows. n is a 1-bit value that indicates whether the store was performed, except that if a reservation exists but the storage location specified by the *stdcx*. is not the same as the location specified by the *Load And Reserve* instruction that established the reservation the value of n is undefined. $$CR0_{LT\ GT\ EQ\ SO} = 0b00 \parallel n \parallel XER_{SO}$$ EA must be a multiple of 8. If it is not, either the system alignment error handler is invoked or the results are boundedly undefined. ## **Special Registers Altered:** CR0 | 31 | RT | RA | RB | 84 | EΗ | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b +(RB) RESERVE \leftarrow 1 RESERVE_ADDR \leftarrow real_addr(EA) RT \leftarrow MEM(EA, 8) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The doubleword in storage addressed by EA is loaded into RT. This instruction creates a reservation for use by a *Store Doubleword Conditional* instruction. An address computed from the EA as described in Section 1.7.3.1 is associated with the reservation, and replaces any address previously associated with the reservation. The value of EH provides a hint as to whether the program will perform a subsequent store to the doubleword in storage addressed by EA before some other processor attempts to modify it. - Other programs might attempt to modify the doubleword in storage addressed by EA regardless of the result of the corresponding Store Word/Doubleword Conditional instruction. - Other programs will not attempt to modify the doubleword in storage addressed by EA until the program that has acquired the lock performs a subsequent store releasing the lock. EA must be a multiple of 8. If it is not, either the system alignment error handler is invoked or the results are boundedly undefined.s ## **Special Registers Altered:** None ### **Programming Note** *Idarx* serves as both a basic and an extended mnemonic. The Assembler will recognize a *Idarx* mnemonic with four operands as the basic form, and a *Idarx* mnemonic with three operands as the extended form. In the extended form the EH operand is omitted and assumed to be 0. ## 3.4.3 Memory Barrier Instructions The *Memory Barrier* instructions can be used to control the order in which storage accesses are performed. Additional information about these instructions and about related aspects of storage management can be found in Book III. ## **Extended mnemonics for Synchronize** Extended mnemonics are provided for the *Synchronize* instruction so that it can be supported by assemblers that recognize only the *msync*<E> mnemonic and so that it can be coded with the L value as part of the mnemonic rather than as a numeric operand. These are shown as examples with the instruction. See Appendix A. "Assembler Extended Mnemonics" on page 457. ## Synchronize X-form sync L | 31 | /// | L | /// | /// | 598 | / | |----|-----|---|-----|-----|-----|----| | 0 | 6 | 9 | 11 | 16 | 21 | 31 | The **sync** instruction creates a memory barrier (see Section 1.7.1). The set of storage accesses that is ordered by the memory barrier depends on the value of the L field. ## L=0 ("heavyweight sync") The memory barrier provides an ordering function for the storage accesses associated with all instructions that are executed by the processor executing the sync instruction. The applicable pairs are all pairs $a_i,b_j$ in which $b_j$ is a data access, except that if $a_i$ is the storage access caused by an icbi instruction then $b_j$ may be performed with respect to the processor executing the sync instruction before $a_i$ is performed with respect to that processor. ## L=1 ("lightweight sync") The memory barrier provides an ordering function for the storage accesses caused by *Load*, *Store*, and *dcbz* instructions that are executed by the processor executing the *sync* instruction and for which the specified storage location is in storage that is Memory Coherence Required and is neither Write Through Required nor Caching Inhibited. The applicable pairs are all pairs $a_i, b_j$ of such accesses except those in which $a_i$ is an access caused by a *Store* or *dcbz* instruction and $b_j$ is an access caused by a *Load* instruction. ## L=2<S> The set of storage accesses that is ordered by the memory barrier is described in Section 5.9.2 of Book III-S, as are additional properties of the *sync* instruction with L=2. The ordering done by the memory barrier is cumulative. If L=0 (or L=2<S>), the **sync** instruction has the following additional properties. - Executing the sync instruction ensures that all instructions preceding the sync instruction have completed before the sync instruction completes, and that no subsequent instructions are initiated until after the sync instruction completes. - The **sync** instruction is execution synchronizing (see Book III). However, address translation and reference and change recording<S> (see Book III) associated with subsequent instructions may be performed before the **sync** instruction completes. - The memory barrier provides the additional ordering function such that if a given instruction that is the result of a store in set B is executed, all applicable storage accesses in set A have been performed with respect to the processor executing the instruction to the extent required by the associated memory coherence properties. The single exception is that any storage access in set A that is caused by an *icbi* instruction executed by the processor executing the *sync* instruction (P1) may not have been performed with respect to P1 (see the description of the *icbi* instruction on page 428). The cumulative properties of the barrier apply to the execution of the given instruction as they would to a load that returned a value that was the result of a store in set B. The sync instruction provides an ordering function for the operations caused by the stream variants of the dcbt and dcbtst instructions (i.e. the providing of hints). The value L=3 is reserved. The **sync** instruction may complete before storage accesses associated with instructions preceding the **sync** instruction have been performed. The **sync** instruction may complete before operations caused by **dcbt** instructions with $TH_0=1$ preceding the **sync** instruction have been performed. See Section 4.9.3 of Book III-E for additional information related to the sync instruction for the Embedded environment. ## **Special Registers Altered:** None #### **Extended Mnemonics:** Extended mnemonics for Synchronize: | Extended: | Equivalent to: | |-----------------|----------------| | sync | sync 0 | | msync <e></e> | sync 0 | | lwsync | sync 1 | | ptesync <s></s> | sync 2 | Except in the *sync* instruction description in this section, references to "*sync*" in Books I-III imply L=0 unless otherwise stated or obvious from context; the appropriate extended mnemonics are used when other L values are intended. #### Programming Note - Section 1.8 contains a detailed description of how to modify instructions such that a well-defined result is obtained. ## **Programming Note** **sync** serves as both a basic and an extended mnemonic. The Assembler will recognize a **sync** mnemonic with one operand as the basic form, and a **sync** mnemonic with no operand as the extended form. In the extended form the L operand is omitted and assumed to be 0. ## **Programming Note** - The *sync* instruction can be used to ensure that all stores into a data structure, caused by *Store* instructions executed in a "critical section" of a program, will be performed with respect to another processor before the store that releases the lock is performed with respect to that processor; see Section B.2, "Lock Acquisition and Release, and Related Techniques" on page 461. The memory barrier created by a **sync** instruction with L=0 or L=1 does not order implicit storage accesses. The memory barrier created by a **sync** instruction with any L value does not order instruction fetches. (The memory barrier created by a **sync** instruction with L=0 – or L=2<S>; see Book III – appears to order instruction fetches for instructions preceding the **sync** instruction with respect to data accesses caused by instructions following the **sync** instruction. However, this ordering is a consequence of the first "additional property" of **sync** with L=0, not a property of the memory barrier.) In order to obtain the best performance across the widest range of implementations, the programmer should use the *sync* instruction with L=1, or the *eieio*<S> or *mbar*<E> instruction, if any of these is sufficient for his needs; otherwise he should use *sync* with L=0. *sync* with L=2<S> should not be used by application programs. ## Programming Note The functions provided by **sync** with L=1 are a strict subset of those provided by **sync** with L=0. (The functions provided by **sync** with L=2<S> are a strict superset of those provided by **sync** with L=0; see Book III.) ## Enforce In-order Execution of I/O X-form eieio [Category: Server] | | 31 | /// | /// | /// | 854 | / | |---|----|-----|-----|-----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | The *eieio* instruction creates a memory barrier (see Section 1.7.1, "Storage Access Ordering"), which provides an ordering function for the storage accesses caused by *Load*, *Store*, *dcbz*, *eciwx*, and *ecowx* instructions executed by the processor executing the *eieio* instruction. These storage accesses are divided into the two sets listed below. The storage access caused by an *eciwx* instruction is ordered as a load, and the storage access caused by a *dcbz* or *ecowx* instruction is ordered as a store. Loads and stores to storage that is both Caching Inhibited and Guarded, and stores to main storage caused by stores to storage that is Write Through Required. The applicable pairs are all pairs $a_i,b_j$ of such accesses. Stores to storage that is Memory Coherence Required and is neither Write Through Required nor Caching Inhibited. The applicable pairs are all pairs $a_i,b_j$ of such accesses. The operations caused by the stream variants of the *dcbt* and *dcbtst* instructions (i.e. the providing of hints) are ordered by *eieio* as a third set of operations, and the operations caused by *tlbie*<S> and *tlbsync* instructions (see Book III-S) are ordered by *eieio* as a fourth set of operations. Each of the four sets of storage accesses or operations is ordered independently of the other three sets. The ordering done by *eieio*'s memory barrier for the second set is cumulative; the ordering done by *eieio*'s memory barrier for the other three sets is not cumulative. The **eieio** instruction may complete before storage accesses or operations associated with instructions preceding the **eieio** instruction have been performed. ## **Special Registers Altered:** None ## Memory Barrier X-form mbar MO [Category: Embedded] | 31 | MO | /// | /// | 854 | / | |----|----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | When MO=0, the *mbar* instruction creates a cumulative memory barrier (see Section 1.7.1, "Storage Access Ordering"), which provides an ordering function for the storage accesses executed by the processor executing the *mbar* instruction. When MO≠0, an implementation may support the *mbar* instruction ordering a particular subset of storage accesses. An implementation may also support multiple, non-zero values of MO that each specify a different subset of storage accesses that are ordered by the *mbar* instruction. Which subsets of storage accesses that are ordered and which values of MO that specify these subsets is implementation-dependent. The *mbar* instruction may complete before storage accesses associated with instructions preceding the *mbar* instruction have been performed. The *mbar* instruction may complete before operations caused by *dcbt* instructions having TH<sub>0</sub>=1 preceding the *mbar* instruction have been performed. ## **Special Registers Altered:** None #### Programming Note The *eieio*<S> and *mbar*<E> instructions are intended for use in doing memory-mapped I/O). Because loads, and separately stores, to storage that is both Caching Inhibited and Guarded are performed in program order (see Section 1.7.1, "Storage Access Ordering" on page 413), *eieio*<S> or *mbar*<E> is needed for such storage only when loads must be ordered with respect to stores. For the **eieio**<S> instruction, accesses in set 1, $a_i$ and $b_j$ need not be the same kind of access or be to storage having the same storage control attributes. For example, $a_i$ can be a load to Caching Inhibited, Guarded storage, and $b_j$ a store to Write Through Required storage. If stronger ordering is desired than that provided by *eieio*<\$> or *mbar*<E>, the *sync* instruction must be used, with the appropriate value in the L field. ## Programming Note The functions provided by **eieio**<S> and **mbar**<E> are a strict subset of those provided by **sync** with L=0. The functions provided by **eieio**<S> for its second set are a strict subset of those provided by **sync** with L=1. Since *eieio*<S> and *mbar*<E>share the same opcode, software designed for both server and embedded environments must assume that only the *eieio*<S> functionality applies since the functions provided by *eieio* are a subset of those provided by *mbar*. ## 3.4.4 Wait Instruction Wait X-form wait [Category: Wait] The wait instruction provides an ordering function for the effects of all instructions executed by the processor executing the wait instruction. Executing a wait instruction ensures that all instructions have completed before the wait instruction completes, and that no subsequent instructions are initiated until an interrupt occurs. The wait instruction also causes any prefetched instructions to be discarded and instruction fetching is suspended until an interrupt occurs. Once the *wait* instruction has completed, the NIA will point to the next sequential instruction. ## **Special Registers Altered:** None #### Programming Note - The *wait* instruction can be used in verification test cases to signal the end of a test case. The encoding for the instruction is the same in both Big-Endian and Little-Endian modes. ## **Programming Note** The *wait* instruction may be useful as the primary instruction of an "idle process" or the completion of processing for a cooperative thread. Note that *wait* updates the NIA so that an interrupt that awakens a *wait* instruction will return to the instruction after the *wait*. ## **Chapter 4. Time Base** | 4.1 Time Base Overview 451 | 4.3 Alternate Time Base [Category: Alter- | |----------------------------------|-------------------------------------------| | 4.2 Time Base 451 | nate Time Base] | | 4.2.1 Time Base Instructions 451 | | ## 4.1 Time Base Overview The time base facilities include a Time Base and an Alternate Time Base which is category: Alternate Time Base. The Alternate Time Base is analogous to the Time Base except that it may count at a different frequency and is not writable. ## 4.2 Time Base The Time Base (TB) is a 64-bit register (see Figure 4) containing a 64-bit unsigned integer that is incremented periodically as described below. | Field | Description | |-------|-------------| | | | | | | TBU Upper 32 bits of Time Base TBL Lower 32 bits of Time Base #### Figure 4. Time Base The Time Base bits 0:59 increment until their value becomes 0xFFF\_FFFF\_FFFF\_FFFF (2<sup>59</sup> - 1), at the next increment their value becomes 0x000\_0000\_0000\_0000. There is no interrupt or other indication when this occurs. Time base bits 60:63 may increment at a variable rate. When the value of bit 59 changes, bits 60:63 are set to zero; if bits 60:63 increment to 0xF before the value of bit 59 changes, they remain at 0xF until the value of bit 59 changes. $$T_{\text{TB}} = \frac{2^{64} \times 32}{1 \,\text{GHz}} = 5.90 \,\text{x} \, 10^{11} \,\text{seconds}$$ which is approximately 18,700 years. The Power ISA AS does not specify a relationship between the frequency at which the Time Base is updated and other frequencies, such as the CPU clock or bus clock. The Time Base update frequency is not required to be constant. What *is* required, so that system software can keep time of day and operate interval timers, is one of the following. - The system provides an (implementation-dependent) interrupt to software whenever the update frequency of the Time Base bits 0:59 changes, and a means to determine what the current update frequency is. - The update frequency of the Time Base bits 0:59 is under the control of the system software. #### Programming Note If the operating system initializes the Time Base on power-on to some reasonable value and the update frequency of the Time Base is constant, the Time Base can be used as a source of values that increase at a constant rate, such as for time stamps in trace entries. Even if the update frequency is not constant, values read from the Time Base are monotonically increasing (except when the Time Base wraps from $2^{64}$ -1 to 0). If a trace entry is recorded each time the update frequency changes, the sequence of Time Base values can be post-processed to become actual time values. Successive readings of the Time Base may return identical values. ## 4.2.1 Time Base Instructions ## Move From Time Base XFX-form mftb RT,TBR [Category: Server Phased-Out] | 31 | RT | tbr | 371 | / | |----|----|-----|-----|---| This instruction behaves as if it were an $\it mfspr$ instruction; see the $\it mfspr$ instruction description in Section 3.3.14 of Book I. ## **Special Registers Altered:** None #### **Extended Mnemonics:** Extended mnemonics for Move From Time Base: | Extend | led: | Equivalent to: | |--------|------|-----------------------------| | mftb | | mftb Rx,268<br>mfspr Rx,268 | | mftbu | | mftb Rx,269<br>mfspr Rx,269 | ## Programming Note New programs should use *mfspr* instead of *mftb* to access the Time Base. ## Programming Note - **mftb** serves as both a basic and an extended mnemonic. The Assembler will recognize an **mftb** mnemonic with two operands as the basic form, and an **mftb** mnemonic with one operand as the extended form. In the extended form the TBR operand is omitted and assumed to be 268 (the value that corresponds to TB). ## Programming Note - The *mfspr* instruction can be used to read the Time Base on all processors that comply with Version 2.01 of the architecture or with any subsequent version. It is believed that the *mfspr* instruction can be used to read the Time Base on most processors that comply with versions of the architecture that precede Version 2.01. Processors for which *mfspr* cannot be used to read the Time Base include the following. - 601 - POWER3 (601 implements neither the Time Base nor *mftb*, but depends on software using *mftb* to read the Time Base, so that the attempt causes the Illegal Instruction error handler to be invoked and thereby permits the operating system to emulate the Time Base.) ## **Programming Note** Since the update frequency of the Time Base is implementation-dependent, the algorithm for converting the current value in the Time Base to time of day is also implementation-dependent. As an example, assume that the Time Base increments at the constant rate of 512 MHz. (Note, however, that programs should allow for the possibility that some implementations may not increment the least-significant 4 bits of the Time Base at a constant rate.) What is wanted is the pair of 32-bit values comprising a POSIX standard clock: 1 the number of whole seconds that have passed since 00:00:00 January 1, 1970, UTC, and the remaining fraction of a second expressed as a number of nanoseconds. #### Assume that: - The value 0 in the Time Base represents the start time of the POSIX clock (if this is not true, a simple 64-bit subtraction will make it so). - The integer constant *ticks\_per\_sec* contains the value 512,000,000, which is the number of times the Time Base is updated each second. - The integer constant *ns\_adj* contains the value $$\frac{1,000,000,000}{512,000,000} \times 2^{32} / 2 = 4194304000$$ which is the number of nanoseconds per tick of the Time Base, multiplied by $2^{32}$ for use in *mulhwu* (see below), and then divided by 2 in order to fit, as an unsigned integer, into 32 bits. When the processor is in 64-bit mode, The POSIX clock can be computed with an instruction sequence such as this: ``` mfspr Ry,268 # Ry = Time Base lwz Rx,ticks_per_sec divdu Rz,Ry,Rx # Rz = whole seconds stw Rz,posix sec ``` ``` mulld Rz,Rz,Rx # Rz = quotient * divisor sub Rz,Ry,Rz # Rz = excess ticks lwz Rx,ns_adj slwi Rz,Rz,1 # Rz = 2 * excess ticks mulhwu Rz,Rz,Rx # mul by (ns/tick)/2 * 2<sup>32</sup> stw Rz,posix ns# product[0:31] = excess ns ``` For the Embedded environment when the processor is in 32-bit mode, it is not possible to read the Time Base using a single instruction. Instead, two instructions must be used, one of which reads TBL and the other of which reads TBU. Because of the possibility of a carry from TBL to TBU occurring between the two reads, a sequence such as the following must be used to read the Time Base. ``` loop: mfspr Rx,TBU # load from TBU mfspr Ry,TB # load from TB mfspr Rz,TBU # load from TBU cmp cr0,0,Rx,Rz# check if 'old'='new' bne loop #branch if carry occurred ``` ## Non-constant update frequency In a system in which the update frequency of the Time Base may change over time, it is not possible to convert an isolated Time Base value into time of day. Instead, a Time Base value has meaning only with respect to the current update frequency and the time of day that the update frequency was last changed. Each time the update frequency changes, either the system software is notified of the change via an interrupt (see Book III), or the change was instigated by the system software itself. At each such change, the system software must compute the current time of day using the old update frequency, compute a new value of ticks\_per\_sec for the new frequency, and save the time of day, Time Base value, and tick rate. Subsequent calls to compute Time of Day use the current Time Base Value and the saved value. Described in POSIX Draft Standard P1003.4/D12, Draft Standard for Information Technology -- Portable Operating System Interface (POSIX) -Part 1: System Application Program Interface (API) - Amendment 1: Real-time Extension [C Language]. Institute of Electrical and Electronics Engineers, Inc., Feb. 1992. # 4.3 Alternate Time Base [Category: Alternate Time Base] The Alternate Time Base (ATB) is a 64-bit register (see Figure 4) containing a 64-bit unsigned integer that is incremented periodically. The frequency at which the integer is updated is implementation-dependent. Figure 5. Alternate Time Base The ATBL register is an aliased name for the ATB. The Alternate Time Base increments until its value becomes 0xFFFF\_FFFF\_FFFF\_FFFF (2<sup>64</sup> - 1). At the next increment, its value becomes 0x0000\_0000\_0000\_0000. There is no explicit indication (such as an interrupt; see Book III) that this has occurred. The Alternate Time Base is accessible in both user and supervisor mode. The counter can be read by executing a *mfspr* instruction specifying the ATB (or ATBL) register, but cannot be written. A second SPR register ATBU, is defined that accesses only the upper 32 bits of the counter. Thus the upper 32 bits of the counter may be read into a register by reading the ATBU register. The effect of entering a power-savings mode or of processor frequency changes on counting in the Alternate Time Base is implementation-dependent. ## Chapter 5. External Control [Category: External Control] The External Control category of facilities and instructions permits a program to communicate with a special-purpose device. Two instructions are provided, both of which must be implemented if the facility is provided. - External Control In Word Indexed (eciwx), which does the following: - Computes an effective address (EA) like most X-form instructions - Validates the EA as would be done for a load from that address - Translates the EA to a real address - Transmits the real address to the device - Accepts a word of data from the device and places it into a General Purpose Register - External Control Out Word Indexed (ecowx), which does the following: - Computes an effective address (EA) like most X-form instructions - Validates the EA as would be done for a store to that address - Translates the EA to a real address - Transmits the real address and a word of data from a General Purpose Register to the device Permission to execute these instructions and identification of the target device are controlled by two fields, called the E bit and the RID field respectively. If attempt is made to execute either of these instructions when E=0 the system data storage error handler is invoked. The location of these fields is described in Book III. The storage access caused by **eciwx** and **ecowx** is performed as though the specified storage location is Caching Inhibited and Guarded, and is neither Write Through Required nor Memory Coherence Required. Interpretation of the real address transmitted by *eciwx* and *ecowx* and of the 32-bit value transmitted by *ecowx* is up to the target device, and is not specified by the Power ISA. See the System Architecture documentation for a given Power ISA system for details on how the External Control facility can be used with devices on that system. ## Example An example of a device designed to be used with the External Control facility might be a graphics adapter. The **ecowx** instruction might be used to send the device the translated real address of a buffer containing graphics data, and the word transmitted from the General Purpose Register might be control information that tells the adapter what operation to perform on the data in the buffer. The **eciwx** instruction might be used to load status information from the adapter. A device designed to be used with the External Control facility may also recognize events that indicate that the address translation being used by the processor has changed. In this case the operating system need not "pin" the area of storage identified by an *eciwx* or *ecowx* instruction (i.e., need not protect it from being paged out). ## 5.1 External Access Instructions In the instruction descriptions the statements "this instruction is treated as a *Load*" and "this instruction is treated as a *Store*" have the same meanings as for the *Cache Management* instructions; see Section 3.3. ## External Control In Word Indexed X-form eciwx RT,RA,RB | 31 | RT | RA | RB | 310 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) raddr \leftarrow address translation of EA send load word request for raddr to device identified by RID RT \leftarrow 320 || word from device ``` Let the effective address (EA) be the sum (RA|0)+(RB). A load word request for the real address corresponding to EA is sent to the device identified by RID, bypassing the cache. The word returned by the device is placed into RT<sub>32:63</sub>. RT<sub>0:31</sub> are set to 0. The E bit must be 1. If it is not, the data storage error handler is invoked. EA must be a multiple of 4. If it is not, either the system alignment error handler is invoked or the results are boundedly undefined. This instruction is treated as a Load. See Book III-S for additional information about this instruction. ## **Special Registers Altered:** None ## **Programming Note** The **eieio**<S> or **mbar**<E> instruction can be used to ensure that the storage accesses caused by **eciwx** and **ecowx** are performed in program order with respect to other Caching Inhibited and Guarded storage accesses. ## External Control Out Word Indexed X-form ecowx RS,RA,RB | 31 | RS | RA | RB | 438 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 ``` ``` else b ← (RA) EA ← b + (RB) raddr ← address translation of EA send store word request for raddr to device identified by RID send (RS)<sub>32:63</sub> to device ``` Let the effective address (EA) be the sum (RA|0)+(RB). A store word request for the real address corresponding to EA and the contents of RS<sub>32:63</sub> are sent to the device identified by RID, bypassing the cache. The E bit must be 1. If it is not, the data storage error handler is invoked. EA must be a multiple of 4. If it is not, either the system alignment error handler is invoked or the results are boundedly undefined. This instruction is treated as a *Store*, except that its storage access is not performed in program order with respect to accesses to other Caching Inhibited and Guarded storage locations unless software explicitly imposes that order. See Book III-S for additional information about this instruction. ## **Special Registers Altered:** None ## **Appendix A. Assembler Extended Mnemonics** In order to make assembler language programs simpler to write and easier to understand, a set of extended mnemonics and symbols is provided for certain instructions. This appendix defines extended mnemonics and symbols related to instructions defined in Book II. Assemblers should provide the extended mnemonics and symbols listed here, and may provide others. # A.1 Data Cache Block Flush Mnemonics The L field in the *Data Cache Block Flush* instruction controls the scope of the flush function performed by the instruction. Extended mnemonics are provided that represent the L value in the mnemonic rather than requiring it to be coded as a numeric operand. **Note:** *dcbf* serves as both a basic and an extended mnemonic. The Assembler will recognize a *dcbf* mnemonic with three operands as the basic form, and a *dcbf* mnemonic with two operands as the extended form. In the extended form the L operand is omitted and assumed to be 0. dcbf RA,RB (equivalent to: dcbf RA,RB,0) dcbfl<S> RA,RB (equivalent to: dcbfl RA,RB,1) ## **A.2 Synchronize Mnemonics** The L field in the *Synchronize* instruction controls the scope of the synchronization function performed by the instruction. Extended mnemonics are provided that represent the L value in the mnemonic rather than requiring it to be coded as a numeric operand. Two extended mnemonics are provided for the L=0 value in order to support assemblers that do not recognize the *sync* mnemonic. **Note:** *sync* serves as both a basic and an extended mnemonic. The Assembler will recognize a *sync* mnemonic with one operand as the basic form, and a *sync* mnemonic with no operand as the extended form. In the extended form the L operand is omitted and assumed to be 0. | sync | (equivalent to: | sync | 0) | |-----------------|-----------------|------|----| | msync <e></e> | (equivalent to: | sync | 0) | | lwsync | (equivalent to: | sync | 1) | | ptesync <s></s> | (equivalent to: | sync | 2) | # **Appendix B. Programming Examples for Sharing Storage** This appendix gives examples of how dependencies and the *Synchronization* instructions can be used to control storage access ordering when storage is shared between programs. Many of the examples use extended mnemonics (e.g., **bne**, **bne**-, **cmpw**) that are defined in Appendix D of Book I. Many of the examples use the Load And Reserve and Store Conditional instructions, in a sequence that begins with a Load And Reserve instruction and ends with a Store Conditional instruction (specifying the same storage location as the Load Conditional) followed by a Branch Conditional instruction that tests whether the Store Conditional instruction succeeded. In these examples it is assumed that contention for the shared resource is low; the conditional branches are optimized for this case by using "+" and "-" suffixes appropriately. The examples deal with words; they can be used for doublewords by changing all word-specific mnemonics to the corresponding doubleword-specific mnemonics (e.g., *Iwarx* to *Idarx*, *cmpw* to *cmpd*). In this appendix it is assumed that all shared storage locations are in storage that is Memory Coherence Required, and that the storage locations specified by Load And Reserve and Store Conditional instructions are in storage that is neither Write Through Required nor Caching Inhibited. ## **B.1 Atomic Update Primitives** This section gives examples of how the *Load And Reserve* and *Store Conditional* instructions can be used to emulate atomic read/modify/write operations. An atomic read/modify/write operation reads a storage location and writes its next value, which may be a function of its current value, all as a single atomic operation. The examples shown provide the effect of an atomic read/modify/write operation, but use several instructions rather than a single atomic instruction. ## **Fetch and No-op** The "Fetch and No-op" primitive atomically loads the current value in a word in storage. In this example it is assumed that the address of the word to be loaded is in GPR 3 and the data loaded are returned in GPR 4. #### Note: The stwcx., if it succeeds, stores to the target location the same value that was loaded by the preceding lwarx. While the store is redundant with respect to the value in the location, its success ensures that the value loaded by the lwarx is still the current value at the time the stwcx. is executed. ## **Fetch and Store** The "Fetch and Store" primitive atomically loads and replaces a word in storage. In this example it is assumed that the address of the word to be loaded and replaced is in GPR 3, the new value is in GPR 4, and the old value is returned in GPR 5. ## **Fetch and Add** The "Fetch and Add" primitive atomically increments a word in storage. In this example it is assumed that the address of the word to be incremented is in GPR 3, the increment is in GPR 4, and the old value is returned in GPR 5. ``` loop: lwarx r5,0,r3 #load and reserve add r0,r4,r5#increment word stwcx. r0,0,r3 #store new value if still res'ved bne- loop #loop if lost reservation ``` ## Fetch and AND The "Fetch and AND" primitive atomically ANDs a value into a word in storage. In this example it is assumed that the address of the word to be ANDed is in GPR 3, the value to AND into it is in GPR 4, and the old value is returned in GPR 5. ``` loop: lwarx r5,0,r3 #load and reserve and r0,r4,r5#AND word stwcx. r0,0,r3 #store new value if still res'ved bne- loop #loop if lost reservation ``` #### Note: The sequence given above can be changed to perform another Boolean operation atomically on a word in storage, simply by changing the and instruction to the desired Boolean instruction (or, xor, etc.). ## Test and Set This version of the "Test and Set" primitive atomically loads a word from storage, sets the word in storage to a nonzero value if the value loaded is zero, and sets the EQ bit of CR Field 0 to indicate whether the value loaded is zero. In this example it is assumed that the address of the word to be tested is in GPR 3, the new value (nonzero) is in GPR 4, and the old value is returned in GPR 5. ``` loop: lwarx r5,0,r3 #load and reserve cmpwi r5,0 #done if word not equal to 0 bne- exit stwcx. r4,0,r3 #try to store non-0 bne- loop #loop if lost reservation exit: ... ``` ## Compare and Swap The "Compare and Swap" primitive atomically compares a value in a register with a word in storage, if they are equal stores the value from a second register into the word in storage, if they are unequal loads the word from storage into the first register, and sets the EQ bit of CR Field 0 to indicate the result of the comparison. In this example it is assumed that the address of the word to be tested is in GPR 3, the comparand is in GPR 4 and the old value is returned there, and the new value is in GPR 5. ``` loop: lwarx r6,0,r3 #load and reserve cmpw r4,r6 #1st 2 operands equal? bne- exit #skip if not stwcx. r5,0,r3 #store new value if still res'ved bne- loop #loop if lost reservation exit: mr r4,r6 #return value from storage ``` #### Notes: - The semantics given for "Compare and Swap" above are based on those of the IBM System/370 Compare and Swap instruction. Other architectures may define a Compare and Swap instruction differently. - 2. "Compare and Swap" is shown primarily for pedagogical reasons. It is useful on machines that lack the better synchronization facilities provided by *Iwarx* and *stwcx*.. A major weakness of a System/370-style Compare and Swap instruction is that, although the instruction itself is atomic, it checks only that the old and current values of the word being tested are equal, with the result that programs that use such a Compare and Swap to control a shared resource can err if the word has been modified and the old value subsequently restored. The sequence shown above has the same weakness. - 3. In some applications the second bne- instruction and/or the mr instruction can be omitted. The bne- is needed only if the application requires that if the EQ bit of CR Field 0 on exit indicates "not equal" then (r4) and (r6) are in fact not equal. The mr is needed only if the application requires that if the comparands are not equal then the word from storage is loaded into the register with which it was compared (rather than into a third register). If either or both of these instructions is omitted, the resulting Compare and Swap does not obey System/370 semantics. ## **B.2 Lock Acquisition and Release, and Related Techniques** This section gives examples of how dependencies and the *Synchronization* instructions can be used to implement locks, import and export barriers, and similar constructs. # **B.2.1** Lock Acquisition and Import Barriers An "import barrier" is an instruction or sequence of instructions that prevents storage accesses caused by instructions following the barrier from being performed before storage accesses that acquire a lock have been performed. An import barrier can be used to ensure that a shared data structure protected by a lock is not accessed until the lock has been acquired. A **sync** instruction can be used as an import barrier, but the approaches shown below will generally yield better performance because they order only the relevant storage accesses. # B.2.1.1 Acquire Lock and Import Shared Storage If *Iwarx* and *stwcx*. instructions are used to obtain the lock, an import barrier can be constructed by placing an *isync* instruction immediately following the loop containing the *Iwarx* and *stwcx*. The following example uses the "Compare and Swap" primitive to acquire the lock. In this example it is assumed that the address of the lock is in GPR 3, the value indicating that the lock is free is in GPR 4, the value to which the lock should be set is in GPR 5, the old value of the lock is returned in GPR 6, and the address of the shared data structure is in GPR 9. ``` loop: lwarx r6,0,r3,1 #load lock and reserve cmpw r4,r6 #skip ahead if # lock not free bne- wait stwcx. r5,0,r3 #try to set lock bne- loop #loop if lost reservation isync #import barrier lwz r7, data1(r9) #load shared data #wait for lock to free ``` The hint provided with *Iwarx* indicates that after the program acquires the lock variable (i.e. *stwcx*. is successful), it will release it (i.e. store to it) prior to another program attempting to modify it. The second **bne-** does not complete until CR0 has been set by the **stwcx.**. The **stwcx.** does not set CR0 until it has completed (successfully or unsuccessfully). The lock is acquired when the **stwcx.** completes successfully. Together, the second **bne-** and the subse- quent *isync* create an import barrier that prevents the load from "data1" from being performed until the branch has been resolved not to be taken. If the shared data structure is in storage that is neither Write Through Required nor Caching Inhibited, an *Iwsync* instruction can be used instead of the *isync* instruction. If *Iwsync* is used, the load from "data1" may be performed before the *stwcx*. But if the *stwcx*. fails, the second branch is taken and the *Iwarx* is reexecuted. If the *stwcx*. succeeds, the value returned by the load from "data1" is valid even if the load is performed before the *stwcx*., because the *Iwsync* ensures that the load is performed after the instance of the *Iwarx* that created the reservation used by the successful *stwcx*. # **B.2.1.2** Obtain Pointer and Import Shared Storage If *Iwarx* and *stwcx*. instructions are used to obtain a pointer into a shared data structure, an import barrier is not needed if all the accesses to the shared data structure depend on the value obtained for the pointer. The following example uses the "Fetch and Add" primitive to obtain and increment the pointer. In this example it is assumed that the address of the pointer is in GPR 3, the value to be added to the pointer is in GPR 4, and the old value of the pointer is returned in GPR 5. ``` loop: lwarx r5,0,r3 #load pointer and reserve add r0,r4,r5#increment the pointer stwcx. r0,0,r3 #try to store new value bne- loop #loop if lost reservation lwz r7,data1(r5) #load shared data ``` The load from "data1" cannot be performed until the pointer value has been loaded into GPR 5 by the *Iwarx*. The load from "data1" may be performed before the *stwcx*. But if the *stwcx*. fails, the branch is taken and the value returned by the load from "data1" is discarded. If the *stwcx*. succeeds, the value returned by the load from "data1" is valid even if the load is performed before the *stwcx*., because the load uses the pointer value returned by the instance of the *Iwarx* that created the reservation used by the successful *stwcx*. An *isync* instruction could be placed between the *bne*and the subsequent *lwz*, but no *isync* is needed if all accesses to the shared data structure depend on the value returned by the *lwarx*. # B.2.2 Lock Release and Export Barriers An "export barrier" is an instruction or sequence of instructions that prevents the store that releases a lock from being performed before stores caused by instructions preceding the barrier have been performed. An export barrier can be used to ensure that all stores to a shared data structure protected by a lock will be performed with respect to any other processor before the store that releases the lock is performed with respect to that processor. # **B.2.2.1** Export Shared Storage and Release Lock A **sync** instruction can be used as an export barrier independent of the storage control attributes (e.g., presence or absence of the Caching Inhibited attribute) of the storage containing the shared data structure. Because the lock must be in storage that is neither Write Through Required nor Caching Inhibited, if the shared data structure is in storage that is Write Through Required or Caching Inhibited a **sync** instruction *must* be used as the export barrier. In this example it is assumed that the shared data structure is in storage that is Caching Inhibited, the address of the lock is in GPR 3, the value indicating that the lock is free is in GPR 4, and the address of the shared data structure is in GPR 9. ``` stw r7,data1(r9)#store shared data (last) sync #export barrier stw r4,lock(r3)#release lock ``` The **sync** ensures that the store that releases the lock will not be performed with respect to any other processor until all stores caused by instructions preceding the **sync** have been performed with respect to that processor. ## B.2.2.2 Export Shared Storage and Release Lock using lwsync If the shared data structure is in storage that is neither Write Through Required nor Caching Inhibited, an *Iwsync* instruction can be used as the export barrier. Using *Iwsync* rather than *sync* will yield better performance in most systems. In this example it is assumed that the shared data structure is in storage that is neither Write Through Required nor Caching Inhibited, the address of the lock is in GPR 3, the value indicating that the lock is free is in GPR 4, and the address of the shared data structure is in GPR 9. ``` stw r7,data1(r9)#store shared data (last) lwsync #export barrier stw r4,lock(r3)#release lock ``` The *Iwsync* ensures that the store that releases the lock will not be performed with respect to any other processor until all stores caused by instructions preceding the *Iwsync* have been performed with respect to that processor. ## B.2.3 Safe Fetch If a load must be performed before a subsequent store (e.g., the store that releases a lock protecting a shared data structure), a technique similar to the following can be used. In this example it is assumed that the address of the storage operand to be loaded is in GPR 3, the contents of the storage operand are returned in GPR 4, and the address of the storage operand to be stored is in GPR 5. ``` lwz r4,0(r3)#load shared data cmpw r4,r4 #set CR0 to "equal" bne- $-8 #branch never taken stw r7,0(r5)#store other shared data ``` An alternative is to use a technique similar to that described in Section B.2.1.2, by causing the **stw** to depend on the value returned by the **Iwz** and omitting the **cmpw** and **bne-**. The dependency could be created by ANDing the value returned by the **Iwz** with zero and then adding the result to the value to be stored by the **stw**. If both storage operands are in storage that is neither Write Through Required nor Caching Inhibited, another alternative is to replace the **cmpw** and **bne-**with an **Iwsync** instruction. ## **B.3 List Insertion** This section shows how the *Iwarx* and *stwcx*. instructions can be used to implement simple insertion into a singly linked list. (Complicated list insertion, in which multiple values must be changed atomically, or in which the correct order of insertion depends on the contents of the elements, cannot be implemented in the manner shown below and requires a more complicated strategy such as using locks.) The "next element pointer" from the list element after which the new element is to be inserted, here called the "parent element", is stored into the new element, so that the new element points to the next element in the list; this store is performed unconditionally. Then the address of the new element is conditionally stored into the parent element, thereby adding the new element to the list. In this example it is assumed that the address of the parent element is in GPR 3, the address of the new element is in GPR 4, and the next element pointer is at offset 0 from the start of the element. It is also assumed that the next element pointer of each list element is in a reservation granule separate from that of the next element pointer of all other list elements. ``` loop: lwarx r2,0,r3 #get next pointer stw r2,0(r4) #store in new element lwsync or sync #order stw before stwcx stwcx. r4,0,r3 #add new element to list bne- loop #loop if stwcx. failed ``` In the preceding example, if two list elements have next element pointers in the same reservation granule then, in a multiprocessor, "livelock" can occur. (Livelock is a state in which processors interact in a way such that no processor makes forward progress.) If it is not possible to allocate list elements such that each element's next element pointer is in a different reservation granule, then livelock can be avoided by using the following, more complicated, sequence. ``` lwz r2,0(r3) #get next pointer loop1: r5,r2 #keep a copy mr r2,0(r4) #store in new element sync #order stw before stwcx. and before lwarx loop2: lwarx r2,0,r3 #get it again r2,r5 #loop if changed (someone cmpw loop1 # else progressed) stwcx. r4,0,r3 #add new element to list bne- loop2 #loop if failed ``` In the preceding example, livelock is avoided by the fact that each processor re-executes the **stw** only if some other processor has made forward progress. ## **B.4 Notes** - 1. To increase the likelihood that forward progress is made, it is important that looping on *Iwarx/stwcx*. pairs be minimized. For example, in the "Test and Set" sequence shown in Section B.1, this is achieved by testing the old value before attempting the store; were the order reversed, more *stwcx*. instructions might be executed, and reservations might more often be lost between the *Iwarx* and the *stwcx*. - 2. The manner in which *Iwarx* and *stwcx*. are communicated to other processors and mechanisms, and between levels of the storage hierarchy within a given processor, is implementation-dependent. In some implementations performance may be improved by minimizing looping on a *Iwarx* instruction that fails to return a desired value. For example, in the "Test and Set" sequence shown in Section B.1, if the programmer wishes to stay in the loop until the word loaded is zero, he could change the "bne- exit" to "bne- loop". However, in some implementations better performance may be obtained by using an ordinary Load instruction to do the initial checking of the value, as follows. ``` loop: lwz r5,0(r3) #load the word cmpwi r5,0 #loop back if word bne- loop # not equal to 0 lwarx r5,0,r3 #try again, reserving cmpwi r5,0 # (likely to succeed) bne- loop stwcx.r4,0,r3 #try to store non-0 bne- loop #loop if lost reserv'n ``` 3. In a multiprocessor, livelock is possible if there is a Store instruction (or any other instruction that can clear another processor's reservation; see Section 1.7.3.1) between the Iwarx and the stwcx. of a Iwarx/stwcx. loop and any byte of the storage location specified by the Store is in the reservation granule. For example, the first code sequence shown in Section B.3 can cause livelock if two list elements have next element pointers in the same reservation granule. ## **Book III-S:** **Power ISA Operating Environment Architecture - Server Environment** ## **Chapter 1. Introduction** | 1.1 Overview | 1.4 Exceptions | |------------------------------------|-------------------------------------| | 1.2 Document Conventions 467 | 1.5 Synchronization | | 1.2.1 Definitions and Notation 467 | 1.5.1 Context Synchronization 469 | | 1.2.2 Reserved Fields 468 | 1.5.2 Execution Synchronization 469 | | 1.3 General Systems Overview 468 | • | ## 1.1 Overview Chapter 1 of Book I describes computation modes, document conventions, a general systems overview, instruction formats, and storage addressing. This chapter augments that description as necessary for the Power ISA Operating Environment Architecture. ## 1.2 Document Conventions The notation and terminology used in Book I apply to this Book also, with the following substitutions. - For "system alignment error handler" substitute "Alignment interrupt". - For "system data storage error handler" substitute "Data Storage interrupt", "Hypervisor Data Storage interrupt", "Data Segment interrupt", or "Hypervisor Data Segment interrupt," as appropriate. - For "system error handler" substitute "interrupt". - For "system floating-point enabled exception error handler" substitute "Floating-Point Enabled Exception type Program interrupt". - For "system illegal instruction error handler" substitute "Illegal Instruction type Program interrupt". (If Category: HEA is supported, see the Programming Note in Section 6.5.9.) - For "system instruction storage error handler" substitute "Instruction Storage interrupt", "Hypervisor Instruction Storage interrupt", "Instruction Segment interrupt", or "Hypervisor Instruction Segment interrupt", as appropriate. - For "system privileged instruction error handler" substitute "Privileged Instruction type Program interrupt". - For "system service program" substitute "System Call interrupt". - For "system trap handler" substitute "Trap type Program interrupt". ## 1.2.1 Definitions and Notation The definitions and notation given in Book I are augmented by the following. ## ■ real page A unit of real storage that is aligned at a boundary that is a multiple of its size. The real page size is 4KB. ## context of a program The processor state (e.g., privilege and relocation) in which the program executes. The context is controlled by the contents of certain System Registers, such as the MSR and SDR1, of certain lookaside buffers, such as the SLB and TLB, and of the Page Table. ## exception An error, unusual condition, or external signal, that may set a status bit and may or may not cause an interrupt, depending upon whether the corresponding interrupt is enabled. ## ■ interrupt The act of changing the machine state in response to an exception, as described in Chapter 6. "Interrupts" on page 547. #### trap interrupt An interrupt that results from execution of a *Trap* instruction. Additional exceptions to the rule that the processor obeys the sequential execution model, beyond those described in the section entitled "Instruction Fetching" in Book I, are the following. - A System Reset or Machine Check interrupt may occur. The determination of whether an instruction is required by the sequential execution model is not affected by the potential occurrence of a System Reset or Machine Check interrupt. (The determination is affected by the potential occurrence of any other kind of interrupt.) - A context-altering instruction is executed (Chapter 10. "Synchronization Requirements for Context Alterations" on page 585). The context alteration need not take effect until the required subsequent synchronizing operation has occurred. - A Reference and Change bit is updated by the processor. The update need not be performed with respect to that processor until the required subsequent synchronizing operation has occurred. - A Branch instruction is executed and the branch is taken. The update of the Come-From Address Register<S> (see Section 8.1.1 of Book III-S) need not occur until a subsequent context synchronizing operation has occurred. #### ■ "must" If hypervisor software violates a rule that is stated using the word "must" (e.g., "this field must be set to 0"), and the rule pertains to the contents of a hypervisor resource, to executing an instruction that can be executed only in hypervisor state, or to accessing storage in real addressing mode, the results are undefined, and may include altering resources belonging to other partitions, causing the system to "hang", etc. #### hardware Any combination of hard-wired implementation, emulation assist, or interrupt for software assistance. In the last case, the interrupt may be to an architected location or to an implementation-dependent location. Any use of emulation assists or interrupts to implement the architecture is implementation-dependent. #### hypervisor privileged A term used to describe an instruction or facility that is available only when the procesor is in hypervisor state. ## privileged state and supervisor mode Used interchangeably to refer to a processor s Used interchangeably to refer to a processor state in which privileged facilities are available. #### ■ problem state and user mode Used interchangeably to refer to a processor state in which privileged facilities are not available. - /, //, ///, ... denotes a field that is reserved in an instruction, in a register, or in an architected storage table. - ?, ??, ???, ... denotes a field that is implementation-dependent in an instruction, in a register, or in an architected storage table. ## 1.2.2 Reserved Fields Book I's description of the handling of reserved bits in System Registers, and of reserved values of defined fields of System Registers, applies also to the SLB. Book I's description of the handling of reserved values of defined fields of System Registers applies also to architected storage tables (e.g., the Page Table). Some fields of certain architected storage tables may be written to automatically by the processor, e.g., Reference and Change bits in the Page Table. When the processor writes to such a table, the following rules are obeyed. - Unless otherwise stated, no defined field other than the one(s) the processor is specifically updating are modified. - Contents of reserved fields are either preserved by the processor or written as zero. #### **Programming Note** Software should set reserved fields in the SLB and in architected storage tables to zero, because these fields may be assigned a meaning in some future version of the architecture. ## 1.3 General Systems Overview The processor or processor unit contains the sequencing and processing controls for instruction fetch, instruction execution, and interrupt action. Most implementations also contain data and instruction caches. Instructions that the processing unit can execute fall into the following classes: - instructions executed in the Branch Processor - instructions executed in the Fixed-Point Processor - instructions executed in the Floating-Point Processor - instructions executed in the Vector Processor Almost all instructions executed in the Branch Processor, Fixed-Point Processor, Floating-Point Processor, and Vector Processor are nonprivileged and are described in Book I. Book II may describe additional nonprivileged instructions (e.g., Book II describes some nonprivileged instructions for cache management). Instructions related to the privileged state of the processor, control of processor resources, control of the storage hierarchy, and all other privileged instructions are described here or are implementation-dependent. ## 1.4 Exceptions The following augments the exceptions defined in Book I that can be caused directly by the execution of an instruction: - the execution of a floating-point instruction when MSR<sub>FP</sub>=0 (Floating-Point Unavailable interrupt) - an attempt to modify a hypervisor resource when the processor is in privileged but non-hypervisor state (see Chapter 2), or an attempt to execute a hypervisor-only instruction (e.g., *tlbie*) when the processor is in privileged but non-hypervisor state - the execution of a traced instruction (Trace interrupt) - the execution of a Vector instruction when the vector processor is unavailable (Vector Unavailable interrupt) ## 1.5 Synchronization The synchronization described in this section refers to the state of the processor that is performing the synchronization. ## 1.5.1 Context Synchronization An instruction or event is *context synchronizing* if it satisfies the requirements listed below. Such instructions and events are collectively called *context synchronizing operations*. The context synchronizing operations are the *isync* instruction, the *System Linkage* instructions, the *mtmsr[d]* instructions with L=0, and most interrupts (see Section 6.4). - The operation causes instruction dispatching (the issuance of instructions by the instruction fetching mechanism to any instruction execution mechanism) to be halted. - The operation is not initiated or, in the case of isync and wait [Category: Wait], does not complete, until all instructions that precede the operation have completed to a point at which they have reported all exceptions they will cause. - 3. The operation ensures that the instructions that precede the operation will complete execution in the context (privilege, relocation, storage protection, etc.) in which they were initiated, except that the operation has no effect on the context in which the associated Reference and Change bit updates are performed. - 4. If the operation directly causes an interrupt (e.g., sc directly causes a System Call interrupt) or is an interrupt, the operation is not initiated until no exception exists having higher priority than the exception associated with the interrupt (see Section 6.8). 5. The operation ensures that the instructions that follow the operation will be fetched and executed in the context established by the operation. (This requirement dictates that any prefetched instructions be discarded and that any effects and side effects of executing them out-of-order also be discarded, except as described in Section 5.5, "Performing Operations Out-of-Order".) ## **Programming Note** A context synchronizing operation is necessarily execution synchronizing; see Section 1.5.2. Unlike the *Synchronize* instruction, a context synchronizing operation does not affect the order in which storage accesses are performed. Item 2 permits a choice only for *isync* (and *sync* and *ptesync*; see Section 1.5.2) because all other execution synchronizing operations also alter context. ## 1.5.2 Execution Synchronization An instruction is *execution synchronizing* if it satisfies items 2 and 3 of the definition of context synchronization (see Section 1.5.1). **sync** and **ptesync** are treated like **isync** with respect to item 2 (i.e., the conditions described in item 2 apply to the completion of **sync** and **ptesync**). Examples of execution synchronizing instructions include **sync**, **ptesync**, and **mtmsrd**. An instruction is *execution synchronizing* if it satisfies items 2 and 3 of the definition of context synchronization (see Section 1.5.1). *sync* and *ptesync* are treated like *isync* with respect to item 2. The execution synchronizing instructions are *sync*, *ptesync*, the *mtmsr[d]*<S> instructions with L=1, and all context synchronizing instructions. ## Programming Note All context synchronizing instructions are execution synchronizing. Unlike a context synchronizing operation, an execution synchronizing instruction does not ensure that the instructions following that instruction will execute in the context established by that instruction. This new context becomes effective sometime after the execution synchronizing instruction completes and before or at a subsequent context synchronizing operation. ## **Chapter 2. Logical Partitioning (LPAR)** | 2.1 Overview | 2.6 Processor Compatibility Register (PCR) | |--------------|--------------------------------------------| |--------------|--------------------------------------------| ## 2.1 Overview The Logical Partitioning (LPAR) facility permits processors and portions of real storage to be assigned to logical collections called *partitions*, such that a program executing on a processor in one partition cannot interfere with any program executing on a processor in a different partition. This isolation can be provided for both problem state and privileged state programs, by using a layer of trusted software, called a *hypervisor* program (or simply a "hypervisor"), and the resources provided by this facility to manage system resources. (A hypervisor is a program that runs in hypervisor state; see below.) The number of partitions supported is implementationdependent. A processor is assigned to one partition at any given time. A processor can be assigned to any given partition without consideration of the physical configuration of the system (e.g., shared registers, caches, organization of the storage hierarchy), except that processors that share certain hypervisor resources may need to be assigned to the same partition; see Section 2.7. The registers and facilities used to control Logical Partitioning are listed below and described in the following subsections. Except in the following subsections, references to the "operating system" in this document include the hypervisor unless otherwise stated or obvious from context. ## 2.2 Logical Partitioning Control Register (LPCR) The layout of the Logical Partitioning Control Register (LPCR) is shown in Figure 1 below. Figure 1. Logical Partitioning Control Register The contents of the LPCR control a number of aspects of the operation of the processor with respect to a logi- cal partition. Below are shown the bit definitions for the LPCR. #### Bit Description ## 0:2 Virtualization Control (VC) Controls the virtualization of partition memory. This field contains two subfields, VPM and ISL. ## 0:1 Virtualized Partition Memory (VPM) This field controls whether VPM mode is enabled as specified below. (See Section 5.7.3.4 and Section 5.7.2, "Virtualized Partition Memory (VPM) Mode" for additional information on VPM mode.) ## Bit Description - This bit controls whether VPM mode is enabled when address translation is disabled - 0 VPM mode disabled - 1 VPM mode enabled - 1 This bit controls whether VPM mode is enabled when address translation is enabled - 0 VPM mode disabled - 1 VPM mode enabled ## 2 Ignore SLB Large Page Specification (ISL) Controls whether ISL mode is enabled as specified below. - 0 ISL mode disabled - 1 ISL mode enabled When ISL mode is enabled and address translation is enabled and the processor is not in hypervisor state, address translation is performed as if the contents of $SLB_{L||LP}$ were 0b000. When address translation is disabled, the setting of the ISL bit has no effect. ISL mode has no effect on SLB, TLB, and ERAT entry invalidations caused by *slbie*, *slbia*, *tlbia*, *tlbie*, and *slbie*. ## 3:8 Reserved #### 9:11 **Default Prefetch Depth** (DPFD) The DPFD field is used as the default prefetch depth for data stream prefetching when DSCR<sub>DPFD</sub>=0; see page 430. ## 12:16 Virtual Real Mode Area Segment Descriptor (VRMASD) When address translation is disabled and VPM<sub>0</sub>=1, the contents of this field specify the L and LP fields of the segment descriptor that apply for storage references to the virtualized real mode area (VRMA). See Section 5.7.3.4 for additional information. The definitions and allowed values of the L and LP fields are the same as for the corresponding fields in the segment descriptor. (See Section 5.7.7.) If VPM<sub>0</sub>=0 or address translation is enabled, the setting of the VRMASD has no effect. #### **Bit Description** - 0 Virtual Page Size Selector Bit 0 (L) - 1:2 Reserved - 3:4 Virtual Page Size Selector Bits 1:2 (LP) ### **Programming Note** 17:33 Reserved ### 34:37 Real Mode Limit Selector (RMLS) The RMLS field specifies the largest effective address that can be used by partition software when address translation is disabled. The valid RMLS values are implementation-dependent, and each value corresponds to a maximum effective address of 2<sup>m</sup>, where m has a minimum value of 12 and a maximum value equal to the number of bits in the real address size supported by the implementation. ## 38 Interrupt Little-Endian (ILE) The contents of the ILE bit are copied into MSR<sub>LE</sub> by interrupts that set MSR<sub>HV</sub> to 0 (see Section 6.5), to establish the Endian mode for the interrupt handler. #### 39:48 Reserved ## 49:51 **Power-saving mode Exit Cause Enable** (PECE) - If PECE<sub>0</sub> = 1 when a power-saving mode instruction is executed, External exceptions are enabled to cause exit from power-saving mode; otherwise External exceptions are disabled from causing exit from power-saving mode. - If PECE<sub>1</sub> = 1 when a power-saving mode instruction is executed, Decrementer exceptions are enabled to cause exit from power-saving mode; otherwise Decrementer exceptions are disabled from causing exit from power-saving mode. (In sleep and rvwinkle power-saving levels, Decrementer exceptions do not occur if the state of the Decrementer is not maintained and updated as if the processor was not in power-saving mode.) - 51 If PECE<sub>2</sub>=1 when a power-saving mode instruction is executed, Machine Check, Hypervisor Maintenance, and certain implementation-specific exceptions are enabled to cause exit from power-saving mode; otherwise Machine Check, Hypervisor Mainte- nance, and the same implementation-specific exceptions are disabled from causing exit from power-saving mode. It is implementation-specific whether the exceptions enabled by the PECE field cause exit from sleep and rvwinkle power-saving levels. See section 6.6.1 and section 6.6.2 for additional information about exit from power-saving mode. ## 52 Mediated External Exception Request (MER) - 0 A Mediated External exception is not requested. - A Mediated External exception is requested. The exception effects of this bit are said to be consistent with the contents of this bit if one of the following statements is true. - LPCR<sub>MER</sub> = 1 and a Mediated External exception exists. - LPCR<sub>MER</sub> = 0 and a Mediated External exception does not exist. A context synchronizing instruction or event that is executed or occurs when LPCR $_{\rm MER}=0$ ensures that the exception effects of LPCR $_{\rm MER}$ are consistent with the contents of LPCR $_{\rm MER}$ . Otherwise, when an instruction changes the contents of LPCR $_{\rm MER}$ , the exception effects of LPCR $_{\rm MER}$ become consistent with the new contents of LPCR $_{\rm MER}$ reasonably soon after the change. ## **Programming Note** LPCR<sub>MER</sub> provides a means for the hypervisor to direct an external exception to a partition independent of the partition's MSR<sub>EE</sub> setting. (When MSR<sub>EE</sub>=0, it is inappropriate for the hypervisor to deliver the exception.) Using LPCR<sub>MER</sub>, the partition can be interrupted upon enabling external interrupts. Without using LPCR<sub>MER</sub>, the hypervisor must check the state of MSR<sub>EE</sub> whenever it gets control, which will result in less timely delivery of the exception to the partition. ## 53 Reserved #### 54 **Translation Control** (TC) - The secondary Page Table search is enabled. - The secondary Page Table search is disabled. ## 55:59 Reserved # 60:61 Logical Partitioning Environment Selector (LPES) Three of the four LPES values are supported. The 0b10 value is reserved. ## 60 LPES<sub>0</sub> Controls whether External interrupts set $MSR_{HV}$ to 1 and $MSR_{RI}$ to 0, or leaves them unchanged. #### 61 LPES₁ Controls how storage is accessed when address translation is disabled, and whether a subset of interrupts set MSR<sub>HV</sub> to 1. ## - Programming Note - LPES<sub>1</sub>=0 provides an environment in which only the hypervisor can run with address translation disabled and in which all interrupts invoke the hypervisor. This value (along with MSR $_{HV}$ =1) can also be used in a system that is not partitioned, to permit the operating system to access all system resources. ## 62 Reserved # 63 Hypervisor Decrementer Interrupt Conditionally Enable (HDICE) - Hypervisor Decrementer interrupts are disabled. - Hypervisor Decrementer interrupts are enabled if permitted by MSR<sub>EE</sub>, MSR<sub>HV</sub>, and MSR<sub>PR</sub>; see Section 6.5.12 on page 565. See Section 5.7.3 on page 508 (including subsections) and Section 5.7.9 on page 524 for a description of how storage accesses are affected by the setting of LPES<sub>1</sub>, and RMLS. See Section 6.5 on page 555 for a description of how the setting of LPES<sub>0:1</sub> affects the processing of interrupts. # 2.3 Real Mode Offset Register (RMOR) The layout of the Real Mode Offset Register (RMOR) is shown in Figure 2 below. **Bits Name Description**4:63 RMO Real Mode Offset ### Figure 2. Real Mode Offset Register All other fields are reserved. The supported RMO values are the non-negative multiples of 2<sup>s</sup>, where 2<sup>s</sup> is the smallest implementation- dependent limit value representable by the contents of the Real Mode Limit Selector field of the LPCR. The contents of the RMOR affect how some storage accesses are performed as described in Section 5.7.3 on page 508 and Section 5.7.4 on page 512. # 2.4 Hypervisor Real Mode Offset Register (HRMOR) The layout of the Hypervisor Real Mode Offset Register (HRMOR) is shown in Figure 3 below. **Bits Name Description**4:63 HRMO Real Mode Offset Figure 3. Hypervisor Real Mode Offset Register All other fields are reserved. The supported HRMO values are the non-negative multiples of $2^r$ , where r is an implementation-dependent value and $12 \le r \le 26$ . The contents of the HRMOR affect how some storage accesses are performed as described in Section 5.7.3 on page 508 and Section 5.7.4 on page 512. # 2.5 Logical Partition Identification Register (LPIDR) The layout of the Logical Partition Identification Register (LPIDR) is shown in Figure 4 below. Bits Name Description 32:63 LPID Logical Partition Identifier ## Figure 4. Logical Partition Identification Register The contents of the LPIDR identify the partition to which the processor is assigned, affecting operations necessary to manage the coherency of some translation lookaside buffers (see Section 5.10.1 and Chapter 10.). The number of LPIDR bits supported is implementation-dependent. ## Programming Note - On some implementations, software must prevent the execution of a *tlbie* instruction on any processor for which the contents of the LPIDR is the same as on the processor on which the LPIDR is being modified or is the same as the new value being written to the LPIDR. This restriction can be met with less effort if one partition identity is used only on processors on which no *tlbie* instruction is ever executed. This partition can be thought of as the transfer partition used exclusively to move a processor from one partition to another. # 2.6 Processor Compatibility Register (PCR) The layout of the Processor Compatibility Register (PCR) is shown in Figure 5 below. Figure 5. Processor Compatibility Register Each defined bit in the PCR controls whether certain instructions, SPRs, and other related facilities are available in problem state. The PCR has no effect on facilities when the processor is not in problem state. Facilities that are made unavailable by the PCR are treated as follows when the processor is in problem state. - Instructions are treated as illegal instructions, - SPRs are treated as if they were not defined for the implementation - Fields in instructions are treated as reserved. A PCR bit may also determine how an instruction field value is interpreted or may define other behavior as specified in the bit definitions below. The PCR has no effect on the setting of the MSR by interrupts or instructions such as *rfid*, *mtmsrd*. When facilities that have enable bits in the MSR are made unavailable by the value in the PCR (e.g. Vector and Decimal Floating-Point Facility), they become unavailable as specified above regardless of whether they are enabled by the corresponding MSR bit. The bit definitions for the PCR are shown below. ## Bit Description ## 0 Vector [Category: Vector] This bit controls the availability, in problem state, of the instructions and facilities in the Vector category. The instructions and facilities in the Vector category are available in problem state. The instructions and facilities in the Vector category are unavailable in problem state. #### 1:62 Reserved ## 63 Version 2.04 (v2.04) This bit controls the availability, in problem state, of the instructions and related resources that were new in the version of the architecture subsequent to Version 2.04. - Decimal Floating-point instructions and facilities [Category: Decimal Floating-Point] - cmpb - fcpsgn - Ifdp,Ifdpx,stfdp,stfdpx - Ifiwax - prtyd, prtyw - W field in the *mtfsfi* instruction - L and W fields in the mtfsf instruction - 0 The listed instructions and related resources are available in problem state - The listed instructions and related resources are unavailable in problem state The initial state of the PCR is all 0s. ### Programming Note - Treating the W field in the *mtfsfi* instruction and the L and W fields in the *mtfsf* instruction as 0s has the effect of making FPSCR<sub>0:31</sub> unavailable. ## Programming Note - Since the PCR has no effect on privileged instructions, privileged instructions that are available on newer processors but not available on older processors will behave differently when the processor is in problem state. On the older processors, an Illegal Instruction type Program interrupt will occur since the instruction is undefined; on the newer processor, a Privileged Instruction type Program interrupt will occur since the instruction is implemented. In future versions of the architecture, in general the lowest-order reserved bit of the PCR will be used to control the availability of the instructions and related resources that are new in that version of the architecture; the name of the bit will correspond to the previous version of the architecture (i.e. the newest version in which the instructions and related resources were not available). In these future versions of the architecture, there will be a requirement that if any bit of the low-order defined bits is set to 1 then all higher-order bits of the defined low-order bits must also be set to 1, and the architecture version with which the processor appears to comply, in problem state, will be the version corresponding to the name of the lowest-order 1 bit in the set of defined low-order PCR bits, or the current architecture version if none of these bits are 1. Also, in general the highest-order reserved bits will be used to control the availability of sets of instructions and related resources having the requirement that their availability be independent of versions of the architecture. ## 2.7 Other Hypervisor Resources In addition to the resources described above, all hypervisor privileged instructions as well as the following resources are hypervisor resources, accessible to software only when the processor is in hypervisor state except as noted below. - All implementation-specific resources, including implementation-specific registers (e.g., "HID" registers), that control hardware functions or affect the results of instruction execution. Examples include resources that disable caches, disable hardware error detection, set breakpoints, control power management, or significantly affect performance. - ME bit of the MSR - SPRs defined as hypervisor-privileged in Section 4.4.5. (Note: Although the Time Base, the PURR, and the SPURR can be altered only by a hypervisor program, the Time Base can be read by all programs and the PURR and SPURR can be read when the processor is in privileged state.) The contents of a hypervisor resource can be modified by the execution of an instruction (e.g., mtspr) only in hypervisor state (MSR<sub>HV PR</sub> = 0b10). An attempt to modify the contents of a given hypervisor resource, other than MSR<sub>ME</sub>, in privileged but non-hypervisor state (MSR<sub>HV PR</sub> = 0b00) causes a Privileged Instruction type Program interrupt. An attempt to modify MSR<sub>ME</sub> in privileged but non-hypervisor state is ignored (i.e., the bit is not changed). ## Programming Note Because the SPRs listed above are privileged for writing, an attempt to modify the contents of any of these SPRs in problem state (MSR<sub>PR</sub>=1) using *mtspr* causes a Privileged Instruction type Program exception, and similarly for MSR<sub>ME</sub>. # 2.8 Sharing Hypervisor Resources Some hypervisor resources may be shared among processors. Programs that modify these resources must be aware of this sharing, and must allow for the fact that changes to these resources may affect more than one processor. The following resources may be shared among processors. - RMOR (see Section 2.3.) - HRMOR (see Section 2.4.) - LPIDR (see Section 2.5.) - PVR (see Section 4.3.1.) - SDR1 (see Section 5.7.7.2.) - Time Base (see Section 7.2.) - Hypervisor Decrementer (see Section 7.4.) - HMEER (see Section 6.2.9) - certain implementation-specific registers The set of resources that are shared is implementationdependent. Processors that share any of the resources listed above, with the exception of the PVR and the HRMOR, must be in the same partition. For each field of the LPCR, except the HDICE field and the MER field, software must ensure that the contents of the field are identical among all processors that are in the same partition and are in a state such that the contents of the field could have side effects. (E.g., software must ensure that the contents of LPCR<sub>LPES</sub> are identical among all processors that are in the same partition and are not in hypervisor state.) For the HDICE field, software must ensure that the contents of the field are identical among all processors that share the Hypervisor Decrementer and are in a state such that the contents of the field could have side effects. (There are no identity requirements for the MER field). ## 2.9 Hypervisor Interrupt Little-Endian (HILE) Bit The Hypervisor Interrupt Little-Endian (HILE) bit is a bit in an implementation-dependent register or similar mechanism. The contents of the HILE bit are copied into $MSR_{LE}$ by interrupts that set $MSR_{HV}$ to 1 (see Section 6.5), to establish the Endian mode for the interrupt handler. The HILE bit is set, by an implementation-dependent method, during system initialization, and cannot be modified after system initialization. The contents of the HILE bit must be the same for all processors under the control of a given instance of the hypervisor; otherwise all results are undefined. ## **Chapter 3. Branch Processor** | <ul><li>3.1 Branch Processor Overview 477</li><li>3.2 Branch Processor Registers 477</li></ul> | 3.3.2 Power-Saving Mode Instructions 481 3.3.2.1 Entering and Exiting Power-Sav- | |------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------| | 3.2.1 Machine State Register 477 | ing Mode | | 3.3 Branch Processor Instructions 479 | · · | | 3.3.1 System Linkage Instructions 479 | | ## 3.1 Branch Processor Overview This chapter describes the details concerning the registers and the privileged instructions implemented in the Branch Processor that are not covered in Book I. ## 3.2 Branch Processor Registers ## 3.2.1 Machine State Register The Machine State Register (MSR) is a 64-bit register. This register defines the state of the processor. On interrupt, the MSR bits are altered in accordance with Figure 43 on page 555. The MSR can also be modified by the *mtmsr*[d], *rfid*, and *hrfid* instructions. It can be read by the *mfmsr* instruction. ## Figure 6. Machine State Register Below are shown the bit definitions for the Machine State Register. ## Bit Description - 0 Sixty-Four-Bit Mode (SF) - 0 The processor is in 32-bit mode. - 1 The processor is in 64-bit mode. - 1:2 Reserved - 3 Hypervisor State (HV) - **0** The processor is not in hypervisor state. - 1 If MSR<sub>PR</sub>=0 the processor is in hypervisor state; otherwise the processor is not in hypervisor state. ## Programming Note - The privilege state of the processor is determined by $MSR_{HV}$ and $MSR_{PR}$ , as follows. ## HV PR - 0 0 privileged - 0 1 problem - 1 0 privileged and hypervisor - 1 1 problem MSR<sub>HV</sub> can be set to 1 only by the *System Call* instruction and some interrupts. It can be set to 0 only by *rfid* and *hrfid*. - 4:37 Reserved - 38 Vector Available (VEC) [Category: Vector] - The processor cannot execute any vector instructions, including vector loads, stores, and moves. - The processor can execute vector instructions. - 39:46 Reserved - 47 Reserved - 48 External Interrupt Enable (EE) - External and Decrementer interrupts are disabled. - External and Decrementer interrupts are enabled. This bit also affects whether Hypervisor Decrementer and Hypervisor Maintenance interrupts are enabled; see Section 6.5.12 on page 565 and Section 6.2.9 on page 549. - 49 **Problem State** (PR) - 0 The processor is in privileged state. 1 The processor is in problem state. ## **Programming Note** Any instruction that sets $MSR_{PR}$ to 1 also sets $MSR_{EE}$ , $MSR_{IR}$ , and $MSR_{DR}$ to 1. ## 50 Floating-Point Available (FP) [Category: Floating-Point] - The processor cannot execute any floating-point instructions, including floatingpoint loads, stores, and moves. - The processor can execute floating-point instructions. ## 51 Machine Check Interrupt Enable (ME) - 0 Machine Check interrupts are disabled. - 1 Machine Check interrupts are enabled. This bit is a hypervisor resource; see Chapter 2., "Logical Partitioning (LPAR)", on page 471. ## Programming Note — The only instructions that can alter MSR<sub>ME</sub> are *rfid* and *hrfid*. ## 52 Floating-Point Exception Mode 0 (FE0) [Category: Floating-Point] See below. ### 53 Single-Step Trace Enable (SE) [Category: Trace] - The processor executes instructions normally. - 1 The processor generates a Single-Step type Trace interrupt after successfully completing the execution of the next instruction, unless that instruction is *hrfid* or *rfid*, which are never traced. Successful completion means that the instruction caused no other interrupt. ## 54 **Branch Trace Enable (BE)** [Category: Trace] - The processor executes branch instructions normally. - 1 The processor generates a Branch type Trace interrupt after completing the execution of a branch instruction, whether or not the branch is taken. Branch tracing need not be supported on all implementations that support the Trace category. If the function is not implemented, this bit is treated as reserved. ## 55 Floating-Point Exception Mode 1 (FE1) [Category: Floating-Point] See below. #### 56:57 Reserved ## 58 Instruction Relocate (IR) - Instruction address translation is disabled. - 1 Instruction address translation is enabled. ## Programming Note — See the Programming Note in the definition of MSR<sub>PR</sub>. ### 59 **Data Relocate** (DR) - 0 Data address translation is disabled. Effective Address Overflow (EAO) (see Book I) does not occur. - Data address translation is enabled. EAO causes a Data Storage interrupt. ## Programming Note - See the Programming Note in the definition of $MSR_{PR}$ . #### 60 Reserved ## 61 **Performance Monitor Mark** (PMM) [Category: Server.Performance Monitor] See Appendix B of Book III-S. #### 62 **Recoverable Interrupt** (RI) - 0 Interrupt is not recoverable. - 1 Interrupt is recoverable. Additional information about the use of this bit is given in Sections 6.4.3, "Interrupt Processing" on page 551, 6.5.1, "System Reset Interrupt" on page 556, and 6.5.2, "Machine Check Interrupt" on page 557. ## 63 Little-Endian Mode (LE) - 0 The processor is in Big-Endian mode. - 1 The processor is in Little-Endian mode. #### Programming Note - The only instructions that can alter MSR<sub>LE</sub> are *rfid* and *hrfid*. The Floating-Point Exception Mode bits FE0 and FE1 are interpreted as shown below. For further details see Book I. # FE0 FE1 Mode 0 0 Ignore Exceptions 0 1 Imprecise Nonrecoverable 1 0 Imprecise Recoverable 1 1 Precise #### 3.3 Branch Processor Instructions #### 3.3.1 System Linkage Instructions These instructions provide the means by which a program can call upon the system to perform a service, and by which the system can return from performing a service or from processing an interrupt. The System Call instruction is described in Book I, but only at the level required by an application programmer. A complete description of this instruction appears below. #### System Call SC-form sc LEV ``` SRR0 \leftarrow_{\text{iea}} CIA + 4 SRR1<sub>33:36</sub> 42:47 \leftarrow 0 SRR1<sub>0:32</sub> 37:41 48:63 \leftarrow MSR<sub>0:32</sub> 37:41 48:63 MSR \leftarrow new_value (<u>see below</u>) NIA \leftarrow 0x0000 0000 0000 0000 ``` The effective address of the instruction following the *System Call* instruction is placed into SRR0. Bits 0:32, 37:41, and 48:63 of the MSR are placed into the corresponding bits of SRR1, and bits 33:36 and 42:47 of SRR1 are set to zero. Then a System Call interrupt is generated. The interrupt causes the MSR to be set as described in Section 6.5, "Interrupt Definitions" on page 555. The setting of the MSR is affected by the contents of the LEV field. LEV values greater than 1 are reserved. Bits 0:5 of the LEV field (instruction bits 20:25) are treated as a reserved field. The interrupt causes the next instruction to be fetched from effective address 0x0000 0000 0000 0C00. This instruction is context synchronizing. #### **Special Registers Altered:** SRR0 SRR1 MSR #### Programming Note - If LEV=1 the hypervisor is invoked. If LPES<sub>1</sub>=1, executing this instruction with LEV=1 is the only way that executing an instruction can cause hypervisor state to be entered. Because this instruction is not privileged, it is possible for application software to invoke the hypervisor. However, such invocation should be considered a programming error. #### **Programming Note** sc serves as both a basic and an extended mnemonic. The Assembler will recognize an sc mnemonic with one operand as the basic form, and an sc mnemonic with no operand as the extended form. In the extended form the LEV operand is omitted and assumed to be 0. ### Return From Interrupt Doubleword XL-form rfid | 19 | /// | /// | /// | 18 | / | |----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | If $MSR_3$ =1 then bits 3 and 51 of SRR1 are placed into the corresponding bits of the MSR. The result of ORing bits 48 and 49 of SRR1 is placed into $MSR_{48}$ . The result of ORing bits 58 and 49 of SRR1 is placed into $MSR_{58}$ . The result of ORing bits 59 and 49 of SRR1 is placed into $MSR_{58}$ . The result of ORing bits 59 and 49 of SRR1 is placed into $MSR_{59}$ . Bits 0:2, 4:32, 37:41, 49:50, 52:57, and 60:63 of SRR1 are placed into the corresponding bits of the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched, under control of the new MSR value, from the address SRR0<sub>0:61</sub> || 0b00 (when SF=1 in the new MSR value) or <sup>32</sup>0 || SRR0<sub>32:61</sub> || 0b00 (when SF=0 in the new MSR value). If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the value placed into SRR0 or HSRR0 by the interrupt processing mechanism (see Section 6.4.3) is the address of the instruction that would have been executed next had the interrupt not occurred. This instruction is privileged and context synchronizing. #### **Special Registers Altered:** **MSR** #### **Programming Note** If this instruction sets $MSR_{PR}$ to 1, it also sets $MSR_{FF}$ , $MSR_{IR}$ , and $MSR_{DR}$ to 1. #### Hypervisor Return From Interrupt Doubleword XL-form hrfid | 19 | /// | /// | /// | 274 | / | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The result of ORing bits 48 and 49 of HSRR1 is placed into MSR $_{48}$ . The result of ORing bits 58 and 49 of HSRR1 is placed into MSR $_{58}$ . The result of ORing bits 59 and 49 of HSRR1 is placed into MSR $_{59}$ . Bits 0:32, 37:41, 49:57, and 60:63 of HSRR1 are placed into the corresponding bits of the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched, under control of the new MSR value, from the address HSRR0<sub>0:61</sub> || 0b00 (when SF=1 in the new MSR value) or <sup>32</sup>0 || HSRR0<sub>32:61</sub> || 0b00 (when SF=0 in the new MSR value). If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the value placed into SRR0 or HSRR0 by the interrupt processing mechanism (see Section 6.4.3) is the address of the instruction that would have been executed next had the interrupt not occurred. This instruction is hypervisor privileged and context synchronizing. Special Registers Altered: MSR #### **Programming Note** If this instruction sets $MSR_{PR}$ to 1, it also sets $MSR_{EE}$ , $MSR_{IR}$ , and $MSR_{DR}$ to 1. #### 3.3.2 Power-Saving Mode Instructions The power-saving mode instructions provide a means by which the hypervisor can put the processor into power-saving mode. When the processor is in powersaving mode it does not execute instructions, and it may consume less power than it would consume when it is not in power-saving mode. There are four levels of power-savings, called doze, nap, sleep, and rvwinkle. For each level in this list, the power consumed is less than or equal to the power consumed in the preceding level, and the time required for the processor to exit from the level and for software then to resume normal operation is greater than or equal to the corresponding time for the preceding level. Doze power-saving level requires a minimum amount of such time, while the other levels may require more time. Resources other than those listed in the instruction descriptions that are maintained in each level other than doze, and the actions required by the hypervisor in order for software to resume normal operation after the processor exits from power-saving mode, are implementation-specific. Read-only resources are maintained in all power-saving levels. Descriptions of resource state loss in the power-saving mode instruction descriptions do not apply to read-only resources. #### **Programming Note** The hypervisor determines which power-saving level to enter based on how responsive the system needs to be. If the hypervisor decides that some loss of state is acceptable, it can use the *nap* instruction rather than the *doze* instruction, and when the processor exits from power-saving mode the hypervisor can quickly determine whether any resources need to be restored. #### Doze XL-form doze | 19 | /// | /// | /// | 402 | / | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The processor is placed into doze power-saving level. When the processor is in doze power-saving level, the state of all processor resources is maintained as if the processor was not in power-saving mode. When the interrupt that causes exit from doze powersaving level occurs, resource state is as described in the preceding paragraph, except that if the exception that caused the exit is a System Reset, Machine Check, or Hypervisor Maintenance exception, resource state that would be lost if the exception occurred when the processor was not in power-saving mode may be This instruction is hypervisor privileged and context synchronizing. #### **Special Registers Altered:** None #### Nap XL-form nap The processor is placed into nap power-saving level. When the processor is in nap power-saving level, the state of the Decrementer and all hypervisor resources is maintained as if the processor was not in power-saving mode, and sufficient information is maintained to allow the hypervisor to resume execution. When the interrupt that causes exit from nap powersaving level occurs, resource state is as described in the preceding paragraph, except that if the exception that caused the exit is a System Reset, Machine Check, or Hypervisor Maintenance exception, resource state that would be lost if the exception occurred when the processor was not in power-saving mode may be lost. This instruction is hypervisor privileged and context synchronizing. #### **Special Registers Altered:** None #### Programming Note - If the state of the Decrementer were not maintained and updated as if the processor was not in power-saving mode, Decrementer exceptions would not reliably cause exit from nap power-saving level even if Decrementer exceptions were enabled to cause exit. #### Sleep XL-form sleep The processor is placed into sleep power-saving level. When the processor is in sleep power-saving level, the state of all resources may be lost except for the HRMOR. When the interrupt that causes exit from sleep powersaving level occurs, resource state is as described in the preceding paragraph, except that if the exception that caused the exit is a System Reset, Machine Check, or Hypervisor Maintenance exception, resource state that would be lost if the exception occurred when the processor was not in power-saving mode may be lost. This instruction is hypervisor privileged and context synchronizing. #### **Special Registers Altered:** None #### Programming Note - If the state of the Decrementer is not maintained and updated, in sleep or rvwinkle power-saving level, as if the processor was not in power-saving mode, Decrementer exceptions will not reliably cause exit from power-saving mode even if Decrementer exceptions are enabled to cause exit. #### Note - See the Notes that appear in the *rvwinkle* instruction description. #### Rip Van Winkle XL-form rvwinkle The processor is placed into rvwinkle power-saving level. When the processor is in rvwinkle power-saving level, the state of all resources may be lost except for the HRMOR. When the interrupt that causes exit from rvwinkle power-saving level occurs, resource state is as described in the preceding paragraph, except that if the exception that caused the exit is a System Reset, Machine Check, or Hypervisor Maintenance exception, resource state that would be lost if the exception occurred when the processor was not in power-saving mode may be lost. This instruction is hypervisor privileged and context synchronizing. #### **Special Registers Altered:** None #### Programming Note - In the short story by Washington Irving, Rip Van Winkle is a man who fell asleep on a green knoll and awoke twenty years later. #### Note - See the Notes that appear in the *sleep* instruction description. #### 3.3.2.1 Entering and Exiting Power-Saving Mode In order to enter power-saving mode, the hypervisor must use the instruction sequence shown below. Before executing this sequence, the hypervisor must ensure that LPCR<sub>MER</sub> contains the value 0, the LPCR<sub>PECF</sub> contains the desired value if doze or nap power-saving level is to be entered, MSR<sub>SF</sub>, MSR<sub>HV</sub>, and MSR<sub>ME</sub> contain the value 1, and all other bits of the MSR contain the value 0 except for MSR<sub>RI</sub>, which may contain either 0 or 1. Depending on the implementation and on the power-saving mode being entered, it may also be necessary for the hypervisor to save the state of certain processor resources before entering the sequence. The sequence must be exactly as shown, with no intervening instructions, except that any GPR may be used as Rx and as Ry, and any value may be used for "save\_area" provided the resulting effective address is double-word aligned and corresponds to a valid real address. ``` std Rx, save area(Ry) /* last store neces-*/ /* sary to save state*/ /* order load after*/ ptesync /* last store /* reload from last */ ld Rx,save area(Ry) /* store location, */ /* for synchro- */ /* nization loop: /* create dependency */ cmp Rx, Rx bne loop nap/doze/sleep/rvwinkle /* enter power- */ /* saving mode /* branch to self */ ``` After the processor has entered power-saving mode as specified above, various exceptions may cause exit from power-saving mode. The exceptions include, System Reset, Machine Check, Decrementer, External, Hypevisor Maintence, and implementation-specific exceptions. Upon exit from power-saving mode, if the exception was a Machine Check exception, then a Machine Check interrupt occurs; otherwise a System Reset interrupt occurs, and the contents of SRR1 indicate the type of exception that caused exit from power-saving mode. See to 6.6.1 for additional information. #### **Programming Note** - The ptesync instruction (see Book III-S, Section 5.9.2) in the preceding sequence, in conjunction with the *Id* instruction and the loop, ensure that all storage accesses associated with instructions preceding the *ptesync* instruction, and all Reference, and Change bit updates associated with additional address translations that were performed, by the processor executing the ptesync instruction, before the ptesync instruction is executed, have been performed with respect to all processors and mechanisms, to the extent required by the associated Memory Coherence Required attributes, before the processor enters power-saving mode. The **b** instruction (branch to self) is not executed since the preceding power-saving mode instruction puts the processor in a power-saving mode in which instructions are not executed. Even though it is not executed, requiring it to be present simplifies implementation and testing because it reduces the synchronization needed between execution of the instruction stream and entry into power-saving mode. If the performance monitor is in use when the processor enters power-saving mode, the Performance Monitor data obtainable when the processor exits from power-saving mode may be incomplete or otherwise misleading. #### **Programming Note** Software is not required to set the RI bit to any particular value prior to entering power-saving mode because the setting of SRR1<sub>62</sub> upon exit from power-saving mode is independent of the value of the RI bit upon entry into power-saving mode. ### **Chapter 4. Fixed-Point Processor** | I | <ul><li>4.1 Fixed-Point Processor Overview . 487</li><li>4.2 Special Purpose Registers 487</li></ul> | | Fixed-Point Load and Store Quad-<br>Instructions [Category: Load/Store | |---|------------------------------------------------------------------------------------------------------------|---|------------------------------------------------------------------------| | | 4.3 Fixed-Point Processor Registers . 487 | | dword]493 | | | <ul><li>4.3.1 Processor Version Register 487</li><li>4.3.2 Processor Identification Register 487</li></ul> | | Binary Coded Decimal (BCD) Assise Instructions [Category: BCD Assis- | | | 4.3.3 Control Register 488 | | e] | | | 4.3.4 Program Priority Register 488 | | OR Instruction 496 | | | <ul><li>4.3.5 Software-use SPRs</li></ul> | | Move To/From System Register uctions | | | Inhibited Instructions | I | | ## 4.1 Fixed-Point Processor Overview This chapter describes the details concerning the registers and the privileged instructions implemented in the Fixed-Point Processor that are not covered in Book I. #### 4.2 Special Purpose Registers Special Purpose Registers (SPRs) are read and written using the *mfspr* (page 500) and *mtspr* (page 499) instructions. Most SPRs are defined in other chapters of this book; see the index to locate those definitions. ## 4.3 Fixed-Point Processor Registers #### 4.3.1 Processor Version Register The Processor Version Register (PVR) is a 32-bit readonly register that contains a value identifying the version and revision level of the processor. The contents of the PVR can be copied to a GPR by the *mfspr* instruction. Read access to the PVR is privileged; write access is not provided. Figure 7. Processor Version Register The PVR distinguishes between processors that differ in attributes that may affect software. It contains two fields. Version A 16-bit number that identifies the version of the processor. Different version numbers indicate major differences between processors, such as which categories are supported. Revision A 16-bit number that distinguishes between implementations of the version. Different revision numbers indicate minor differences between processors having the same version number, such as clock rate and Engineering Change level. Version numbers are assigned by the Power ISA process. Revision numbers are assigned by an implementation-defined process. ## 4.3.2 Processor Identification Register The Processor Identification Register (PIR) is a 32-bit register that contains a value that can be used to distinguish the processor from other processors in the system. The contents of the PIR can be copied to a GPR by the *mfspr* instruction. Read access to the PIR is privileged; write access, if provided, is hypervisor privi- leged. It is implementation-dependent whether write access is provided.. | Bits | Name | Description | |------|--------|--------------| | 0:31 | PROCID | Processor ID | Figure 8. Processor Identification Register The means by which the PIR is initialized are implementation-dependent. The PIR is a hypervisor resource; see Chapter 2. #### 4.3.3 Control Register The Control Register (CTRL) is a 32-bit register that controls an external I/O pin. This signal may be used for the following: - driving the RUN Light on a system operator panel - Direct External exception routing - Performance Monitor Counter incrementing (see Appendix B) | Bit | Name | Description | |-----|------|---------------| | 63 | RUN | Run state bit | All other fields are implementation-dependent. #### Figure 9. Control Register The CTRL RUN can be used by the operating system to indicate when the processor is doing useful work. The contents of the CTRL can be written by the *mtspr* instruction and read by the *mfspr* instruction. Write access to the CTRL is privileged. Reads can be performed in privileged or problem state. #### 4.3.4 Program Priority Register The Program Priority Register (PPR) is a 64-bit register that controls the program's priority. The layout of the PPR is shown in Figure 10. A subset of the PRI values may be set by problem state programs (see Section 3.2.3 of Book I). #### Bit(s) Description #### 11:13 Program Priority (PRI) 001 very low 010 low 011 medium low 100 medium (normal) 101 medium high 110 high 111 very high 44:63 Implementation-specific Figure 10. Program Priority Register #### 4.3.5 Software-use SPRs Software-use SPRs are 64-bit registers provided for use by software. | | SPRG0 | | |---|-------|----| | | SPRG1 | | | | SPRG2 | | | | SPRG3 | | | 0 | | 63 | Figure 11. Software-use SPRs SPRG0, SPRG1, and SPRG2 are privileged registers. SPRG3 is a privileged register except that the contents may be copied to a GPR in Problem state when accessed using the *mfspr* instruction. #### - Programming Note Neither the contents of the SPRGs, nor accessing them using *mtspr* or *mfspr*, has a side effect on the operation of the processor. One or more of the registers is likely to be needed by non-hypervisor interrupt handler programs (e.g., as scratch registers and/or pointers to per processor save areas). Operating systems must ensure that no sensitive data are left in SPRG3 when a problem state program is dispatched, and operating systems for secure systems must ensure that SPRG3 cannot be used to implement a "covert channel" between problem state programs. These requirements can be satisfied by clearing SPRG3 before passing control to a program that will run in problem state. HSPRG0 and HSPRG1 are 64-bit registers provided for use by hypervisor programs. | | HSPRG0 | | |---|--------|----| | | HSPRG1 | | | 0 | | 63 | Figure 12. SPRs for use by hypervisor programs #### Programming Note Neither the contents of the HSPRGs, nor accessing them using *mtspr* or *mfspr*, has a side effect on the operation of the processor. One or more of the registers is likely to be needed by hypervisor interrupt handler programs (e.g., as scratch registers and/or pointers to per processor save areas). #### 4.4 Fixed-Point Processor Instructions #### 4.4.1 Fixed-Point Load and Store Caching Inhibited Instructions The storage accesses caused by the instructions described in this section are performed as though the specified storage location is Caching Inhibited and Guarded. The instructions can be executed only in hypervisor state. If any of the following restrictions on execution of these instructions (while in hypervisor state) are violated, the results are undefined. - They must be executed only when MSR<sub>DR</sub>=0. - The specified storage location must not be in storage specified by the Real Mode Storage Control facility to be treated as non-Guarded. - Software must ensure that the specified storage location is not in the caches. #### **Programming Note** The instructions described in this section can be used to permit a control register on an I/O device to be accessed without permitting the corresponding storage location to be copied into the caches. See also, in Book I, the introductions to Section 3.3.1, "Fixed-Point Storage Access Instructions", Section 3.3.2, "Fixed-Point Load Instructions", and Section 3.3.3, "Fixed-Point Store Instructions". ### Load Byte and Zero Caching Inhibited Indexed X-form Ibzcix RT,RA,RB | 31 | RT | RA | RB | 853 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow 560 | MEM(EA, 1) ``` Let the effective address (EA) be the sum (RA|0)+ (RB). The byte in storage addressed by EA is loaded into $RT_{56:63}$ . $RT_{0:55}$ are set to 0. The storage access caused by this instruction is performed as though the specified storage location is Caching Inhibited and Guarded. This instruction is hypervisor privileged. #### Special Registers Altered: None ### Load Word and Zero Caching Inhibited Indexed X-form lwzcix RT,RA,RB | 31 | D | Δ. | DR | 780 | / | |----|------|-----|-----|-----|----| | 31 | 17.1 | 11/ | IVD | 109 | / | | 0 | c | 11 | 16 | 24 | 24 | | U | O | 111 | 10 | Z I | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow <sup>32</sup>0 | MEM(EA, 4) ``` Let the effective address (EA) be the sum (RA|0)+ (RB). The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. The storage access caused by this instruction is performed as though the specified storage location is Caching Inhibited and Guarded. This instruction is hypervisor privileged. #### **Special Registers Altered:** None ## Load Halfword and Zero Caching Inhibited Indexed X-form Ihzcix RT,RA,RB | Ī | 31 | RT | RA | RB | 821 | / | |---|----|----|----|----|-----|----| | | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow ^{48}0 | MEM(EA, 2) ``` Let the effective address (EA) be the sum (RA|0)+ (RB). The halfword in storage addressed by EA is loaded into $RT_{48:63}$ . $RT_{0:47}$ are set to 0. The storage access caused by this instruction is performed as though the specified storage location is Caching Inhibited and Guarded. This instruction is hypervisor privileged. #### **Special Registers Altered:** None ### Load Doubleword Caching Inhibited Indexed X-form Idcix RT,RA,RB | 31 | RT | RA | RB | 885 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow MEM(EA, 8) ``` Let the effective address (EA) be the sum (RA|0)+ (RB). The doubleword in storage addressed by EA is loaded into RT. The storage access caused by this instruction is performed as though the specified storage location is Caching Inhibited and Guarded. This instruction is hypervisor privileged. #### **Special Registers Altered:** ### Store Byte Caching Inhibited Indexed X-form stbcix RS,RA,RB | 3 | | RS RA | A RI | 3 98 | 31 / | |---|---|-------|------|------|------| | 0 | 6 | 11 | 16 | 21 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) MEM(EA, 1) $\leftarrow$ (RS)<sub>56:63</sub> Let the effective address (EA) be the sum (RA|0)+ (RB). (RS) $_{56:63}$ are stored into the byte in storage addressed by EA. The storage access caused by this instruction is performed as though the specified storage location is Caching Inhibited and Guarded. This instruction is hypervisor privileged. #### Special Registers Altered: None ### Store Halfword Caching Inhibited Indexed sthcix RS,RA,RB | 31 | RS | RA | RB | 949 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) MEM(EA, 2) $\leftarrow$ (RS)<sub>48:63</sub> Let the effective address (EA) be the sum (RA|0)+ (RB). (RS) $_{48:63}$ are stored into the halfword in storage addressed by EA. The storage access caused by this instruction is performed as though the specified storage location is Caching Inhibited and Guarded. This instruction is hypervisor privileged. #### **Special Registers Altered:** None ## Store Word Caching Inhibited Indexed X-form stwcix RS,RA,RB | | 31 | RS | RA | RB | 917 | / | |---|----|----|----|----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) MEM(EA, 4) $\leftarrow$ (RS)<sub>32:63</sub> Let the effective address (EA) be the sum (RA|0)+(RB). $(RS)_{32:63}$ are stored into the word in storage addressed by EA. The storage access caused by this instruction is performed as though the specified storage location is Caching Inhibited and Guarded. This instruction is hypervisor privileged. #### **Special Registers Altered:** None ### Store Doubleword Caching Inhibited Indexed X-form stdcix RS,RA,RB | Ī | 31 | RS | RA | RB | 1013 | / | |---|----|----|----|----|------|----| | ( | ) | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA, 8) \leftarrow (RS) ``` Let the effective address (EA) be the sum (RA|0)+ (RB). (RS) is stored into the doubleword in storage addressed by EA. The storage access caused by this instruction is performed as though the specified storage location is Caching Inhibited and Guarded. This instruction is hypervisor privileged. #### **Special Registers Altered:** ## 4.4.2 Fixed-Point Load and Store Quadword Instructions [Category: Load/ Store Quadword] /// 28 31 # Load QuadwordDQ-formStore QIqRTp,DQ(RA)stq DQ if RA = 0 then $$b \leftarrow 0$$ else $b \leftarrow (RA)$ EA $\leftarrow b + EXTS(DQ \mid \mid 0b0000)$ RTp $\leftarrow MEM(EA, 8)$ 11 RA RTp 56 Let the effective address (EA) be the sum (RA|0)+ (DQ||0b0000). The quadword in storage addressed by EA is loaded into register pair RTp. 16 EA must be a multiple of 16. If it is not, an Alignment interrupt occurs. If RTp is odd or RTp=RA, the instruction form is invalid. If RTp=RA, an attempt to execute this instruction causes an Illegal Instruction type Program interrupt. (The RTp=RA case includes the case of RTp=RA=0.) This instruction is not supported in Little-Endian mode. Execution of this instruction in Little-Endian mode causes either an Alignment interrupt or the results are boundedly undefined. This instruction is privileged. #### **Special Registers Altered:** None #### Store Quadword DS-form RSp,DS(RA) 62 RSp RA DS 2 0 6 11 16 30 31 if RA = 0 then $$b \leftarrow 0$$ else $b \leftarrow (RA)$ EA $\leftarrow b + EXTS(DS \mid \mid 0b00)$ MEM(EA, 8) $\leftarrow RSp$ Let the effective address (EA) be the sum (RA|0)+ (DS||0b00). The contents of register pair RSp are stored into the quadword in storage addressed by EA. EA must be a multiple of 16. If it is not, an Alignment interrupt occurs. If RSp is odd, the instruction form is invalid. This instruction is not supported in Little-Endian mode. Execution of this instruction in Little-Endian mode causes either an Alignment interrupt or the results are boundedly undefined. This instruction is privileged. #### **Special Registers Altered:** ## 4.4.3 Binary Coded Decimal (BCD) Assistance Instructions [Category: BCD Assistance] The Binary Coded Decimal Assist instructions operate on Binary Coded Decimal operands (cbcdtd and addg6s) and Decimal Floating-Point operands (cdtbcd) (see Chapter 5 of Book I - III). ## Convert Declets To Binary Coded Decimal X-form cdtbcd RA, RS ``` do i = 0 to 1 n \leftarrow i \times 32 RA_{n+0:n+7} \leftarrow 0 RA_{n+8:n+19} \leftarrow DEC\_TO\_BCD((RS)_{n+12:n+21}) RA_{n+20:n+31} \leftarrow DEC\_TO\_BCD((RS)_{n+22:n+31}) ``` The low-order 20 bits of each word of register RS contain two declets which are converted to six 4-bit BCD fields, and the result is placed into the low-order 24 bits of the corresponding word in RA. The high-order 8 bits in each word of RA are set to 0. This instruction is hypervisor privileged. #### **Special Registers Altered:** None #### Convert Binary Coded Decimal To Declets X-form cbcdtd RA, RS ``` do i = 0 to 1 n \leftarrow i \times 32 RA_{n+0:n+11} \leftarrow 0 RA_{n+12:n+21} \leftarrow BCD\_TO\_DEC((RS)_{n+8:n+19}) RA_{n+22:n+31} \leftarrow BCD\_TO\_DEC((RS)_{n+20:n+31}) ``` The low-order 24 bits of each word of register RS contain six 4-bit BCD fields which are converted to two declets, and the result is placed into the low-order 20 bits of the corresponding word in RA. The high-order 12 bits in each word of RA are set to 0. If a 4-bit BCD field has a value greater than 9, the results are undefined. This instruction is hypervisor privileged. #### **Special Registers Altered:** #### Add and Generate Sixes XO-form addg6s RT,RA,RB | 31 | RT | RA | RB | / | 74 | / | |----|----|----|----|-----|----|----| | 0 | 6 | 11 | 16 | 212 | 2 | 31 | The contents of register RA are added to the contents of register RB. The carry out of decimal digit position n (bit position $4\times n$ ) produces 16 carry bits. A doubleword is composed from the 16 carry bits, and placed into RT. The doubleword consists of a decimal six (0b0110) in every decimal digit position for which the corresponding carry bit is 0, and a zero (0b0000) in every position for which the corresponding carry bit is 1. This instruction is hypervisor privileged. #### **Special Registers Altered:** None #### Programming Note - See Appendix E of Book III-S for programming examples using *add6gs*. #### 4.4.4 OR Instruction or Rx,Rx,Rx can be used to set PPR<sub>PRI</sub> (see Section 4.3.4) as shown in Figure 13. PPR<sub>PRI</sub> remains unchanged if the privilege state of the processor executing the instruction is lower than the privilege indicated in the figure. (The encodings available to application programs are also shown in Book I.) | Rx | PPR <sub>PRI</sub> | Priority | Privileged | |----|--------------------|-----------------|------------| | 31 | 001 | very low | yes | | 1 | 010 | low | no | | 6 | 011 | medium low | no | | 2 | 100 | medium (normal) | no | | 5 | 101 | medium high | yes | | 3 | 110 | high | yes | | 7 | 111 | very high | hypv | Figure 13. Priority levels for or Rx,Rx,Rx ## 4.4.5 Move To/From System Register Instructions The Move To Special Purpose Register and Move From Special Purpose Register instructions are described in Book I, but only at the level available to an application programmer. For example, no mention is made there of registers that can be accessed only in privileged state. The descriptions of these instructions given below extend the descriptions given in Book I, but do not list Special Purpose Registers that are implementation-dependent. In the descriptions of these instructions given below, the "defined" SPR numbers are the SPR numbers shown in the figure for the instruction and the implementation-specific SPR numbers that are implemented, and similarly for "defined" registers. #### **Extended mnemonics** Extended mnemonics are provided for the *mtspr* and *mfspr* instructions so that they can be coded with the SPR name as part of the mnemonic rather than as a numeric operand. See Appendix A, "Assembler Extended Mnemonics" on page 589. #### Figure 14. SPR encodings (Sheet 1 of 2) | | doolmal | SPR <sup>1</sup> | Degister Name | Privile | eged | Length | Cat <sup>2</sup> | |---|---------|---------------------------------------|---------------|-------------------|-------------------|--------|------------------| | | decimal | spr <sub>5:9</sub> spr <sub>0:4</sub> | Register Name | mtspr | mfspr | (bits) | Cat | | | 1 | 00000 00001 | XER | no | no | 64 | В | | | 8 | 00000 01000 | LR | no | no | 64 | В | | | 9 | 00000 01001 | CTR | no | no | 64 | В | | | 17 | 00000 10001 | DSCR | yes | yes | 64 | S | | _ | 18 | 00000 10010 | DSISR | yes | yes | 32 | S | | | 19 | 00000 10011 | DAR | yes | yes | 64 | S | | | 22 | 00000 10110 | DEC | yes | yes | 32 | В | | | 25 | 00000 11001 | SDR1 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | - | 26 | 00000 11010 | SRR0 | yes | yes | 64 | В | | | 27 | 00000 11011 | SRR1 | yes | yes | 64 | В | | | 28 | 00000 11100 | CFAR | yes | yes | 64 | S | | _ | 29 | 00000 11101 | AMR | yes | yes | 64 | S | | | 136 | 00100 01000 | CTRL | - | no | 32 | S | | | 152 | 00100 11000 | CTRL | yes | - | 32 | S | | | 256 | 01000 00000 | VRSAVE | no | no | 32 | V | | | 259 | 01000 00011 | SPRG3 | - | no | 64 | В | | | 268 | 01000 01100 | TB | - | no | 64 | В | | | 269 | 01000 01100 | TBU | - | no | 32 | В | | | 272-275 | 01000 100xx | SPRG[0-3] | yes | yes | 64 | В | | | 282 | 01000 11010 | EAR | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | EC | | _ | 284 | 01000 11100 | TBL | hypv <sup>3</sup> | - | 32 | В | | | 285 | 01000 11101 | TBU | hypv <sup>3</sup> | - | 32 | В | | | 286 | 01000 11110 | TBU40 | hypv | - | 64 | S | | | 287 | 01000 11111 | PVR | | yes | 32 | В | | | 304 | 01001 10000 | HSPRG0 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 305 | 01001 10001 | HSPRG1 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 306 | 01001 10010 | HDSISR | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | В | | | 307 | 01001 10011 | HDAR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | В | | | 308 | 01001 10100 | SPURR | hypv <sup>3</sup> | yes | 64 | S | | _ | 309 | 01001 10101 | PURR | hypv <sup>3</sup> | yes | 64 | S | | | 310 | 01001 10110 | HDEC | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | S | | | 312 | 01001 11000 | RMOR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 313 | 01001 11001 | HRMOR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 314 | 01001 11010 | HSRR0 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 315 | 01001 11011 | HSRR1 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | Figure 14. SPR encodings (Sheet 2 of 2) | decimal | SPR <sup>1</sup> | Pogistor Namo | Privile | eged | Length | Cat <sup>2</sup> | |---------|---------------------------------------|---------------|---------------------|-------------------|--------|------------------| | decimai | spr <sub>5:9</sub> spr <sub>0:4</sub> | Register Name | mtspr | mfspr | (bits) | Cal | | 318 | 01001 11110 | LPCR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | 319 | 01001 11111 | LPIDR | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | S | | 336 | 01010 10000 | HMER | hypv <sup>3,4</sup> | hypv <sup>3</sup> | 64 | S | | 337 | 01010 10001 | HMEER | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | 338 | 01010 10010 | PCR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | 339 | 01010 10011 | HEIR | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | HEA | | 512 | 10000 00000 | SPEFSCR | no | no | 32 | SP | | 526 | 10000 01110 | ATB/ATBL | - | no | 64 | ATB | | 527 | 10000 01111 | ATBU | - | no | 32 | ATB | | 768-783 | 11000 0xxxx | perf_mon | - | no | 64 | S.PM | | 784-799 | 11000 1xxxx | perf_mon | varies | yes | 64 | S.PM | | 896 | 11100 00000 | PPR | no | no | 64 | S | | 1013 | 11111 10101 | DABR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | 1015 | 11111 10111 | DABRX | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | 1023 | 11111 11111 | PIR | - | yes | 32 | S | <sup>-</sup> This register is not defined for this instruction. All SPR numbers that are not shown above and are not implementation-specific are reserved. Note that the order of the two 5-bit halves of the SPR number is reversed. <sup>&</sup>lt;sup>2</sup> See Section 1.3.5 of Book I. This register is a hypervisor resource, and can be modified by this instruction only in hypervisor state (see Chapter 2). <sup>&</sup>lt;sup>4</sup> This register cannot be directly written. Instead, bits in the register corresponding to 0 bits in (RS) can be cleared using *mtspr SPR,RS*. #### Move To Special Purpose Register XFX-form mtspr SPR,RS | 31 | RS | spr | 467 | / | |----|----|-----|-----|----| | 0 | 6 | 11 | 21 | 31 | ``` n \leftarrow \operatorname{spr}_{5:9} \mid \mid \operatorname{spr}_{0:4} if length(SPR(n)) = 64 then if n = 336 then \operatorname{SPR}(336) \leftarrow (\operatorname{SPR}(336)) \& (\operatorname{RS}) else \operatorname{SPR}(n) \leftarrow (\operatorname{RS}) else \operatorname{SPR}(n) \leftarrow (\operatorname{RS})_{32:63} ``` The SPR field denotes a Special Purpose Register, encoded as shown in Figure 14. The contents of register RS are placed into the designated Special Purpose Register, except in the case described below. For Special Purpose Registers that are 32 bits long, the low-order 32 bits of RS are placed into the SPR. When the designated SPR is the Hypervisor Maintenance Exception Register (HMER), the contents of register RS are ANDed with the contents of the HMER and the result is placed into the HMER. For this instruction, SPRs TBL and TBU are treated as separate 32-bit registers; setting one leaves the other unaltered. $\rm spr_0=1$ if and only if writing the register is privileged. Execution of this instruction specifying an SPR number with $\rm spr_0=1$ causes a Privileged Instruction type Program interrupt when $\rm MSR_{PR}=1$ and, if the SPR is a hypervisor resource (see Figure), when $\rm MSR_{HV\ PR}=0b00$ except that when $\rm MSR_{HV\ PR}=0b00$ no operation occurs for SPRs 791 or 792 on proceessors in which these registers are hypervisor privileged. Execution of this instruction specifying an SPR number that is not defined for the implementation causes either an Illegal Instruction type Program interrupt or one of the following. - if spr<sub>0</sub>=0: - if MSR<sub>PR</sub>=1: Hypervisor Emulation Assistance interrupt if Category: HEA is supported; otherwise Illegal Instruction type Program interrupt - if MSR<sub>PR</sub>=0: Ilegal Instruction type Program interrupt for SPR 0 and no operation (i.e. the instruction is treated as a no-op) for all other SPRs - if spr<sub>0</sub>=1: - if MSR<sub>PR</sub>=1: Privileged Instruction type Program interrupt - if MSR<sub>PR</sub>=0: no operation (i.e. the instruction is treated as a no-op) If the SPR number is set to a value that is shown in Figure but corresponds to an optional Special Purpose Register that is not provided by the implementation, the effect of executing this instruction is the same as if the SPR number were reserved. #### **Special Registers Altered:** See Figure #### Programming Note For a discussion of software synchronization requirements when altering certain Special Purpose Registers, see Chapter 10. "Synchronization Requirements for Context Alterations" on page 585. ### Move From Special Purpose Register XFX-form mfspr RT,SPR | 31 | RT | spr | 339 | / | |----|----|-----|-----|----| | 0 | 6 | 11 | 21 | 31 | ``` \begin{array}{ll} n \leftarrow spr_{5:9} \mid \mid spr_{0:4} \\ \text{if length}(SPR(n)) = 64 \text{ then} \\ RT \leftarrow SPR(n) \\ \text{else} \\ RT \leftarrow \begin{array}{ll} 32_0 \mid \mid SPR(n) \end{array} \end{array} ``` The SPR field denotes a Special Purpose Register, encoded as shown in Figure . The contents of the designated Special Purpose Register are placed into register RT. For Special Purpose Registers that are 32 bits long, the low-order 32 bits of RT receive the contents of the Special Purpose Register and the high-order 32 bits of RT are set to zero. $\rm spr_0{=}1$ if and only if reading the register is privileged. Execution of this instruction specifying an SPR number with $\rm spr_0{=}1$ causes a Privileged Instruction type Program interrupt when MSR\_{PR}{=}1 and, if the SPR is a hypervisor resource (see Figure), when MSR\_{PR} $_{\rm HV}{=}0b00$ . Execution of this instruction specifying an SPR number that is not defined for the implementation causes either an Illegal Instruction type Program interrupt or one of the following. - if $spr_0=0$ : - if MSR<sub>PR</sub>=1: Hypervisor Emulation Assistance interrupt if Category: HEA is supported; otherwise Illegal Instruction type Program interrupt - if MSR<sub>PR</sub>=0:llegal Instruction type Program interrupt for SPRs 0, 4, 5, and 6 and no operation (i.e. the instruction is treated as a no-op) for all other SPRs - $\blacksquare$ if spr<sub>0</sub>=1: - if MSR<sub>PR</sub>=1: Privileged Instruction type Program interrupt - if MSR<sub>PR</sub>=0: "no-op" If the SPR field contains a value that is shown in Figure but corresponds to an optional Special Purpose Register that is not provided by the implementation, the effect of executing this instruction is the same as if the SPR number were reserved. #### **Special Registers Altered:** None #### Note See the Notes that appear with *mtspr*. #### Move To Machine State Register X-form mtmsr RS,L | 31 | RS | /// | L | /// | 146 | / | |----|----|-----|----|-----|-----|----| | 0 | 6 | 11 | 15 | 16 | 21 | 31 | The MSR is set based on the contents of register RS and of the L field. #### L=0: The result of ORing bits 48 and 49 of register RS is placed into $MSR_{48}$ . The result of ORing bits 58 and 49 of register RS is placed into $MSR_{58}$ . The result of ORing bits 59 and 49 of register RS is placed into $MSR_{59}$ . Bits 32:47, 49:50, 52:57, and 60:62 of register RS are placed into the corresponding bits of the MSR. #### L=1: Bits 48 and 62 of register RS are placed into the corresponding bits of the MSR. The remaining bits of the MSR are unchanged. This instruction is privileged. If L=0 this instruction is context synchronizing. If L=1 this instruction is execution synchronizing; in addition, the alterations of the EE and RI bits take effect as soon as the instruction completes. #### **Special Registers Altered:** **MSR** Except in the *mtmsr* instruction description in this section, references to "*mtmsr*" in this document imply either L value unless otherwise stated or obvious from context (e.g., a reference to an *mtmsr* instruction that modifies an MSR bit other than the EE or RI bit implies L=0). #### Programming Note If this instruction sets $MSR_{PR}$ to 1, it also sets $MSR_{EE}$ , $MSR_{IR}$ , and $MSR_{DR}$ to 1. This instruction does not alter $MSR_{ME}$ or $MSR_{LE}$ . (This instruction does not alter $MSR_{HV}$ because it does not alter any of the high-order 32 bits of the MSR.) If the only MSR bits to be altered are MSR<sub>EE RI</sub>, to obtain the best performance L=1 should be used. #### **Programming Note** - If MSR<sub>EE</sub>=0 and an External or Decrementer exception is pending, executing an *mtmsr* instruction that sets MSR<sub>EE</sub> to 1 will cause the External or Decrementer interrupt to occur before the next instruction is executed, if no higher priority exception exists (see Section 6.8, "Interrupt Priorities" on page 571). Similarly, if a Hypervisor Decrementer interrupt is pending, execution of the instruction by the hypervisor causes a Hypervisor Decrementer interrupt to occur if HDICE=1. For a discussion of software synchronization requirements when altering certain MSR bits, see Chapter 10. #### - Programming Note **mtmsr** serves as both a basic and an extended mnemonic. The Assembler will recognize an **mtmsr** mnemonic with two operands as the basic form, and an **mtmsr** mnemonic with one operand as the extended form. In the extended form the L operand is omitted and assumed to be 0. #### Programming Note - There is no need for an analogous version of the *mfmsr* instruction, because the existing instruction copies the entire contents of the MSR to the selected GPR. #### Move To Machine State Register Doubleword #### X-form mtmsrd RS,L | 31 | RS | /// | L | /// | 178 | / | |----|----|-----|----|-----|-----|----| | 0 | 6 | 11 | 15 | 16 | 21 | 31 | if L = 0 then The MSR is set based on the contents of register RS and of the L field. #### L=0: The result of ORing bits 48 and 49 of register RS is placed into MSR $_{48}$ . The result of ORing bits 58 and 49 of register RS is placed into MSR $_{58}$ . The result of ORing bits 59 and 49 of register RS is placed into MSR $_{59}$ . Bits 0:2, 4:47, 49:50, 52:57, and 60:62 of register RS are placed into the corresponding bits of the MSR. #### L=1: Bits 48 and 62 of register RS are placed into the corresponding bits of the MSR. The remaining bits of the MSR are unchanged. This instruction is privileged. If L=0 this instruction is context synchronizing. If L=1 this instruction is execution synchronizing; in addition, the alterations of the EE and RI bits take effect as soon as the instruction completes. #### **Special Registers Altered:** **MSR** Except in the *mtmsrd* instruction description in this section, references to "*mtmsrd*" in this document imply either L value unless otherwise stated or obvious from context (e.g., a reference to an *mtmsrd* instruction that modifies an MSR bit other than the EE or RI bit implies L=0). #### Programming Note If this instruction sets ${\rm MSR}_{\rm PR}$ to 1, it also sets ${\rm MSR}_{\rm EE}$ , ${\rm MSR}_{\rm IR}$ , and ${\rm MSR}_{\rm DR}$ to 1. This instruction does not alter $\mathsf{MSR}_\mathsf{LE}$ , $\mathsf{MSR}_\mathsf{ME}$ or $\mathsf{MSR}_\mathsf{HV}$ . If the only MSR bits to be altered are MSR<sub>EE RI</sub>, to obtain the best performance L=1 should be used. #### Programming Note If MSR<sub>EE</sub>=0 and an External or Decrementer exception is pending, executing an *mtmsrd* instruction that sets MSR<sub>EE</sub> to 1 will cause the External or Decrementer interrupt to occur before the next instruction is executed, if no higher priority exception exists (see Section 6.8, "Interrupt Priorities" on page 571). Similarly, if a Hypervisor Decrementer interrupt is pending, execution of the instruction by the hypervisor causes a Hypervisor Decrementer interrupt to occur if HDICE=1. For a discussion of software synchronization requirements when altering certain MSR bits, see Chapter 10. #### - Programming Note **mtmsrd** serves as both a basic and an extended mnemonic. The Assembler will recognize an **mtmsrd** mnemonic with two operands as the basic form, and an **mtmsrd** mnemonic with one operand as the extended form. In the extended form the L operand is omitted and assumed to be 0. ## Move From Machine State Register X-form mfmsr RT | 31 | RT | /// | /// | 83 | / | |----|----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | RT ← MSR The contents of the MSR are placed into register RT. This instruction is privileged. #### **Special Registers Altered:** ## **Chapter 5. Storage Control** | 5.1 Overview. 505 5.2 Storage Exceptions 506 5.3 Instruction Fetch 506 5.3.1 Implicit Branch 506 5.3.2 Address Wrapping Combined with Changing MSR Bit SF 506 5.4 Data Access 506 5.5 Performing Operations Out-of-Order 506 5.6 Invalid Real Address 507 5.7 Storage Addressing 508 | 5.7.7.4 Relaxed Page Table Alignment [Category: Server.Relaxed Page Table Alignment] | |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------| | 5.7.1 32-Bit Mode 508 | 5.7.9.3 Storage Protection, Address | | 5.7.2 Virtualized Partition Memory (VPM) | Translation Disabled 526 | | Mode | 5.8 Storage Control Attributes 527 | | 5.7.3 Real And Virtual Real Addressing | 5.8.1 Guarded Storage 527 | | Modes | 5.8.1.1 Out-of-Order Accesses to Guarded | | 5.7.3.1 Hypervisor Offset Real Mode | Storage | | Address | 5.8.2 Storage Control Bits | | 5.7.3.2 Offset Real Mode Address 509<br>5.7.3.3 Storage Control Attributes for | 5.8.2.1 Storage Control Bit Restrictions 528 | | Accesses in Real and Hypervisor Real | 5.8.2.2 Altering the Storage Control Bits | | Addressing Modes 510 | 528 | | 5.7.3.3.1 Hypervisor Real Mode Storage | 5.9 Storage Control Instructions 529 | | Control | 5.9.1 Cache Management Instructions 529 | | 5.7.3.4 Virtual Real Mode Addressing | 5.9.2 Synchronize Instruction 529 | | Mechanism 510 | 5.9.3 Lookaside Buffer | | 5.7.3.5 Storage Control Attributes for | Management | | Implicit Storage Accesses 511 | 5.9.3.1 SLB Management Instructions 530 | | 5.7.4 Address Ranges Having Defined | 5.9.3.2 Bridge to SLB Architecture [Cate- | | Uses | gory:Server.Phased-Out]536 | | 5.7.5 Address Translation Overview 514 | 5.9.3.2.1 Segment Register | | 5.7.6 Virtual Address Generation 514 5.7.6.1 Segment Lookaside Buffer (SLB) | Manipulation Instructions536<br>5.9.3.3 TLB Management Instructions. 539 | | 514 | 5.10 Page Table Update Synchronization | | 5.7.6.2 SLB Search 515 | Requirements543 | | 5.7.7 Virtual to Real Translation 517 | 5.10.1 Page Table Updates 543 | | 5.7.7.1 Page Table 518 | 5.10.1.1 Adding a Page Table Entry 544 | | 5.7.7.2 Storage Description | 5.10.1.2 Modifying a Page Table Entry 545 | | Register 1 | 5.10.1.3 Deleting a Page Table Entry . 546 | | 5.7.7.3 Page Table Search 520 | | | | | ### 5.1 Overview A program references storage using the effective address computed by the processor when it executes a Load, Store, Branch, or Cache Management instruction, or when it fetches the next sequential instruction. The effective address is translated to a real address according to procedures described in Section 5.7.3, in Section 5.7.5 and in the following sections. The real address is what is presented to the storage subsystem. For a complete discussion of storage addressing and effective address calculation, see Section 1.10 of Book I #### 5.2 Storage Exceptions A storage exception results when the sequential execution model requires that a storage access be performed but the access is not permitted (e.g., is not permitted by the storage protection mechanism), the access cannot be performed because the effective address cannot be translated to a real address, or the access matches some tracking mechanism criteria (e.g., Data Address Breakpoint). In certain cases a storage exception may result in the "restart" of (re-execution of at least part of) a Load or Store instruction. See Section 2.1 of Book II, and Section 6.6 in this Book. #### 5.3 Instruction Fetch Instructions are fetched under control of MSR<sub>IR</sub>. #### MSR<sub>IR</sub>=0 The effective address of the instruction is interpreted as described in Section 5.7.3. #### MSR<sub>IR</sub>=1 The effective address of the instruction is translated by the Address Translation mechanism described beginning in Section 5.7.5. #### 5.3.1 Implicit Branch Explicitly altering certain MSR bits (using mtmsr[d]), or explicitly altering SLB entries, Page Table Entries, or certain System Registers (including the HRMOR, and possibly other implementation-dependent registers), may have the side effect of changing the addresses, effective or real, from which the current instruction stream is being fetched. This side effect is called an implicit branch. For example, an mtmsrd instruction that changes the value of ${ m MSR}_{ m SF}$ may change the effective addresses from which the current instruction stream is being fetched. The MSR bits and System Registers (excluding implementation-dependent registers) for which alteration can cause an implicit branch are indicated as such in Chapter 10. "Synchronization Requirements for Context Alterations" on page 585. Implicit branches are not supported by the Power ISA. If an implicit branch occurs, the results are boundedly undefined. ## 5.3.2 Address Wrapping Combined with Changing MSR Bit SF If the current instruction is at effective address $2^{32}$ - 4 and is an *mtmsrd* instruction that changes the contents of MSR<sub>SF</sub>, the effective address of the next sequential instruction is undefined. #### **Programming Note** In the case described in the preceding paragraph, if an interrupt occurs before the next sequential instruction is executed, the contents of SRR0, or HSRR0, as appropriate to the interrupt, are undefined. #### 5.4 Data Access Data accesses are controlled by MSR<sub>DR</sub>. #### MSR<sub>DR</sub>=0 The effective address of the data is interpreted as described in Section 5.7.3. #### MSR<sub>DR</sub>=1 The effective address of the data is translated by the Address Translation mechanism described in Section 5.7.5. ## 5.5 Performing Operations Out-of-Order An operation is said to be performed "in-order" if, at the time that it is performed, it is known to be required by the sequential execution model. An operation is said to be performed "out-of-order" if, at the time that it is performed, it is not known to be required by the sequential execution model. Operations are performed out-of-order by the processor on the expectation that the results will be needed by an instruction that will be required by the sequential execution model. Whether the results are really needed is contingent on everything that might divert the control flow away from the instruction, such as *Branch*, *Trap*, *System Call*, and *Return From Interrupt* instructions, and interrupts, and on everything that might change the context in which the instruction is executed. Typically, the processor performs operations out-oforder when it has resources that would otherwise be idle, so the operation incurs little or no cost. If subsequent events such as branches or interrupts indicate that the operation would not have been performed in the sequential execution model, the processor abandons any results of the operation (except as described below). In the remainder of this section, including its subsections, "Load instruction" includes the Cache Management and other instructions that are stated in the instruction descriptions to be "treated as a Load", and similarly for "Store instruction". A data access that is performed out-of-order may correspond to an arbitrary *Load* or *Store* instruction (e.g., a *Load* or *Store* instruction that is not in the instruction stream being executed). Similarly, an instruction fetch that is performed out-of-order may be for an arbitrary instruction (e.g., the aligned word at an arbitrary location in instruction storage). Most operations can be performed out-of-order, as long as the machine appears to follow the sequential execution model. Certain out-of-order operations are restricted, as follows. #### ■ Stores Stores are not performed out-of-order (even if the Store instructions that caused them were executed out-of-order). Accessing Guarded Storage The restrictions for this case are given in Section 5.8.1.1. The only permitted side effects of performing an operation out-of-order are the following. - A Machine Check or Checkstop that could be caused by in-order execution may occur out-oforder, except as described in Section 5.7.3.3.1 for the Real Mode Storage Control facility. - On implementations which support Reference and Change bits, these bits may be set as described in Section 5.7.8. - Non-Guarded storage locations that could be fetched into a cache by in-order fetching or execution of an arbitrary instruction may be fetched outof-order into that cache. #### 5.6 Invalid Real Address A storage access (including an access that is performed out-of-order; see Section 5.5) may cause a Machine Check if the accessed storage location contains an uncorrectable error or does not exist. In the case that the accessed storage location does not exist, the Checkstop state may be entered. See Section 6.5.2 on page 557. #### Programming Note - In configurations supporting multiple partitions, hypervisor software must ensure that a storage access by a program in one partition will not cause a Checkstop or other system-wide event that could affect the integrity of other partitions (see Chapter 2). For example, such an event could occur if a real address placed in a Page Table Entry or made accessible to a partition using the Offset Real Mode Address mechanism (see Section 5.7.3.3) does not exist. #### 5.7 Storage Addressing #### Storage Control Overview - Real address space size is 2<sup>m</sup> bytes, m≤60; see Note 1. - Real page size is 2<sup>12</sup> bytes (4 KB). - Effective address space size is 2<sup>64</sup> bytes. - An effective address is translated to a virtual address via the Segment Lookaside Buffer (SLB). - Virtual address space size is 2<sup>n</sup> bytes, 65≤n≤78; see Note 2. - Segment size is 2<sup>s</sup> bytes, s=28 or 40. - 2<sup>n-40</sup> ≤ number of virtual segments ≤ 2<sup>n-28</sup>; see Note 2. - Virtual page size is 2<sup>p</sup> bytes, where 12≤p, and 2<sup>p</sup> is no larger than either the size of the biggest segment or the real address space; a size of 4KB, 64 KB, and an implementation-dependent number of other sizes are supported; see Note 3. - Segments contain pages of a single size or a mixture of 4KB and 64KB pages - A virtual address is translated to a real address via the Page Table. #### Notes: - The value of m is implementation-dependent (subject to the maximum given above). When used to address storage, the high-order 60-m bits of the "60-bit" real address must be zeros. - The value of n is implementation-dependent (subject to the range given above). In references to 78-bit virtual addresses elsewhere in this Book, the high-order 78-n bits of the "78-bit" virtual address are assumed to be zeros. - 3. The supported values of p for the larger virtual page sizes are implementation-dependent (subject to the limitations given above). #### 5.7.1 32-Bit Mode The computation of the 64-bit effective address is independent of whether the processor is in 32-bit mode or 64-bit mode. In 32-bit mode (MSR<sub>SF</sub>=0), the high-order 32 bits of the 64-bit effective address are treated as zeros for the purpose of addressing storage. This applies to both data accesses and instruction fetches. It applies independent of whether address translation is enabled or disabled. This truncation of the effective address is the only respect in which storage accesses in 32-bit mode differ from those in 64-bit mode. #### Programming Note - Treating the high-order 32 bits of the effective address as zeros effectively truncates the 64-bit effective address to a 32-bit effective address such as would have been generated on a 32-bit implementation of the Power ISA. Thus, for example, the ESID in 32-bit mode is the high-order four bits of this truncated effective address; the ESID thus lies in the range 0-15. When address translation is enabled, these four bits would select a Segment Register on a 32-bit implementation of the Power ISA. The SLB entries that translate these 16 ESIDs can be used to emulate these Segment Registers. ## 5.7.2 Virtualized Partition Memory (VPM) Mode VPM mode enables the hypervisor to reassign all or part of a partition's memory transparently so that the reassignment is not visible to the partition. When this is done, the partition's memory is said to be "virtualized." The VPM field in the LPCR enables VPM mode separately when address translation is enabled and when translation is disabled. If the processor is not in hypervisor state, and either address translation is enabled and VPM<sub>1</sub>=1, or address translation is disabled and VPM<sub>0</sub>=1, conditions that would have caused a Data Storage or an Instruction Storage interrupt if the affected memory were not virtualized instead cause a Hypervisor Data Storage or a Hypervisor Instruction Storage interrupt respectively. Because the Hypervisor Data Storage and Hypervisor Instruction Storage interrupts always put the processor in hypervisor state, they permit the hypervisor to handle the condition if appropriate (e.g., to restore the contents of a page that was reassigned), and to reflect it to the operating system's Data Storage or Instruction Storage interrupt handler otherwise. When address translation is enabled, VPM mode has no effect on address translation. When address translation is disabled, addressing is controlled as specified in Section 5.7.3. ## 5.7.3 Real And Virtual Real Addressing Modes When a storage access is an instruction fetch performed when instruction address translation is disabled, or if the access is a data access and data address translation is disabled, it is said to be performed in "real addressing mode" if VPM<sub>0</sub>=0 and the processor is not in hypervisor state. If the processor is in hypervisor state, the access is said to be performed in "hypervisor real addressing mode" regardless of the value of VPM $_0$ . If the processor is not in hypervisor state and VPM $_0$ =1, the access is said to be performed in "virtual real addressing mode." Storage accesses in real, hypervisor real, and virtual real addressing modes are performed in a manner that depends on the contents of MSR $_{\rm HV}$ , LPES, VPM, VRMASD, HRMOR, RMLS, and RMOR (see Chapter 2), and bit 0 of the effective address (EA $_0$ ) as described below. Bit 1 of the effective address is ignored. #### MSR<sub>HV</sub>=1 - If EA<sub>0</sub>=0, the Hypervisor Offset Real Mode Address mechanism, described in Section 5.7.3.1, controls the access. - If EA<sub>0</sub>=1, bits 4:63 of the effective address are used as the real address for the access. #### MSR<sub>HV</sub>=0 - If LPES<sub>1</sub>=0, the access causes a storage exception as described in Section 5.7.9.3. - If LPES<sub>1</sub>=1 and VPM<sub>0</sub>=0, the Offset Real Mode Address mechanism, described in Section 5.7.3.2, controls the access. - If LPES<sub>1</sub>=1 and VPM<sub>0</sub>=1, the Virtual Real Mode Addressing mechanism, described in Section 5.7.3.4, controls the access. ## 5.7.3.1 Hypervisor Offset Real Mode Address If $MSR_{HV} = 1$ and $EA_0 = 0$ , the access is controlled by the contents of the Hypervisor Real Mode Offset Register, as follows. #### Hypervisor Real Mode Offset Register (HRMOR) Bits 4:63 of the effective address for the access are ORed with the 60-bit offset represented by the contents of the HRMOR, and the 60-bit result is used as the real address for the access. The supported offset values are all values of the form i×2<sup>r</sup>, where $0 \le i < 2^j$ , and j and r are implementation-dependent values having the properties that $12 \le r \le 26$ (i.e., the minimum offset granularity is 4 KB and the maximum offset granularity is 64 MB) and j+r = m, where the real address size supported by the implementation is m bits. #### Programming Note - $\rm EA_{4:63-r}$ should equal $^{60-r}$ 0. If this condition is satisfied, ORing the effective address with the offset produces a result that is equivalent to adding the effective address and the offset. If m<60, $\mbox{EA}_{4:63\mbox{-m}}$ and $\mbox{HRMOR}_{0:59\mbox{-m}}$ must be zeros. Software must ensure that altering the HRMOR does not cause an implicit branch. #### 5.7.3.2 Offset Real Mode Address If VPM<sub>0</sub>=0, MSR<sub>HV</sub>=0, and LPES<sub>1</sub>=1, the access is controlled by the contents of the Real Mode Limit Selector and Real Mode Offset Register, as specified below, and the set of storage locations accessible by code is referred to as the Real Mode Area (RMA). #### Real Mode Limit Selector (RMLS) If bits 4:63 of effective address for the access are greater than or equal to the value (limit) represented by the contents of the RMLR, the access causes a storage exception (see Section 5.7.9.3). In this comparison, if m<60, bits 4:63-m of the effective address may be ignored (i.e., treated as if they were zeros), where the real address size supported by the implementation is m bits. The supported limit values are of the form $2^j$ , where $12 \le j \le 60$ . Subject to the preceding sentence, the number and values of the limits supported are implementation-dependent. #### Real Mode Offset Register (RMOR) If the access is permitted by the RMLR, bits 4:63 of the effective address for the access are ORed with the 60-bit offset represented by the contents of the RMOR, and the low-order m bits of the 60-bit result are used as the real address for the access. The supported offset values are all values of the form ix2 $^{\rm S}$ , where $0 \le i < 2^{\rm k}$ , and k and s are implementation-dependent values having the properties that $2^{\rm S}$ is the minimum limit value supported by the implementation (i.e., the minimum value representable by the contents of the RMLR) and k+s = m. #### Programming Note - The offset specified by the RMOR should be a non-zero multiple of the limit specified by the RMLS. If these registers are set thus, ORing the effective address with the offset produces a result that is equivalent to adding the effective address and the offset. (The offset must not be zero, because real page 0 contains the fixed interrupt vectors and real pages 1 and 2 may be used for implementation-specific purposes; see Section 5.7.4, "Address Ranges Having Defined Uses" on page 512.) #### 5.7.3.3 Storage Control Attributes for Accesses in Real and Hypervisor Real Addressing Modes Storage accesses in hypervisor real addressing mode are performed as though all of storage had the following storage control attributes, except as modified by the Real Mode Storage Control facility (see Section 5.7.3.3.1). (The storage control attributes are defined in Book II.) - not Write Through Required - not Caching Inhibited, for instruction fetches - not Caching Inhibited, for data accesses except those caused by the Load/Store Caching Inhibited instructions; Caching Inhibited, for data accesses caused by the Load/Store Caching Inhibited instructions - Memory Coherence Required, for data accesses - Guarded Storage accesses in real addressing mode are performed as though all of storage had the following storage control attributes. (Such accesses use the Offset Real Mode Address mechanism.) - not Write Through Required - not Caching Inhibited - Memory Coherence Required, for data accesses - not Guarded Additionally, storage accesses in real or hypervisor real addressing modes are performed as though all storage was not No-execute. #### Programming Note Because storage accesses in real addressing mode and hypervisor real addressing mode do not use the SLB or the Page Table, accesses in these modes bypass all checking and recording of information contained therein (e.g., storage protection checks that use information contained therein are not performed, and reference and change information is not recorded). #### 5.7.3.3.1 Hypervisor Real Mode Storage Control The Hypervisor Real Mode Storage Control facility provides a means of specifying portions of real storage that are treated as non-Guarded in hypervisor real addressing mode (MSR<sub>HV PR</sub>=0b10, and MSR<sub>IR</sub>=0 or MSR<sub>DR</sub>=0, as appropriate for the type of access). The remaining portions are treated as Guarded in hypervisor real addressing mode. The means is a hypervisor resource (see Chapter 2), and may also be system-specific. When executing a Load/Store Caching Inhibited instruction, the specified storage location must not be in storage specified by the Real Mode Storage Control facility to be treated as non-Guarded. If this restriction is violated, the results are undefined. The facility does not apply to implicit accesses to the Page Table by the processor in performing address translation or in recording reference and change information. These accesses are performed as described in Section 5.7.3.3. #### Programming Note - The preceding capability can be used to improve the performance of hypervisor software that runs in hypervisor real addressing mode, by causing accesses to instructions and data that occupy well-behaved storage to be treated as non-Guarded. See also the second paragraph of the Programming Note in Section 5.7.3.3. The statement in Section 5.5, that non-Guarded storage locations may be fetched out-of-order into a cache only if they could be fetched into that cache by in-order execution, does not preclude the out-of-order fetching into the data cache of storage locations that are treated as non-Guarded for *Load/Store Caching-Inhibited* instructions, because the effective Caching Inhibited value that could be used for an in-order data access to such a storage location is undefined and hence could be 0. ## 5.7.3.4 Virtual Real Mode Addressing Mechanism If VPM $_0$ =1, MSR $_{HV}$ =0, LPES $_1$ =1, and MSR $_{DR}$ =0 or MSR $_{IR}$ =0 as appropriate for the type of access, the access is said to be made in virtual real addressing mode and is controlled by the mechanism specified below. The set of storage locations accessible by code is referred to as the Virtualized Real Mode Area (VRMA). In virtual real addressing mode, address translation, storage protection, and reference and change recording are handled as follows. Address translation and storage protection are handled as if address translation were enabled, except that translation of effective addresses to virtual addresses use the SLBE values in Figure 15 instead of the entry in the SLB corresponding to the ESID, bits 0:3 of the effective address are ignored (i.e. treated as if they were 0s), bits 4:63-m of the effective address may be ignored (where the real address size supported by the implementation is m bits), and the Virtual Page Class Key protection mechanism does not apply. #### Programming Note- The Virtual Page Class Key protection mechanism does not apply because the authority mask that an OS has set for application programs executing with address translation enabled may not be the same as the authority mask required by the OS when address translation is disabled, such as when first entering an interrupt handler. Reference and change recording are handled as if address translation were enabled. | Field | Value | | |----------------|----------------------|--| | ESID | <sup>36</sup> 0 | | | V | 1 | | | В | 0b01 - 1 TB | | | VSID | 0x0_01FF_FFFF | | | K <sub>s</sub> | 0 | | | K <sub>p</sub> | undefined | | | N | 0 | | | L | VRMASD <sub>L</sub> | | | С | 0 | | | LP | VRMASD <sub>LP</sub> | | Figure 15. SLBE for VRMA If the effective address is not less than 1 TB, a Hypervisor Data Segment or Hypervisor Instruction Segment interrupt may occur. #### Programming Note The C bit in Figure 15 is set to 0 because the implementation-dependent lookaside information associated with the VRMA is expected to be long-lived. See Section 5.9.3.1. #### Programming Note - The 1 TB VSID 0x0\_01FF\_FFFF should not be used by the operating system for purposes other than mapping the VRMA when address translation is enabled. #### Programming Note - Software should specify $PTE_B = 0b01$ for all Page Table Entries that map the VRMA in order to be consistent with the values in Figure 15. #### Programming Note - All accesses to the RMA are considered not Guarded. The G bit of the associated Page Table Entry determines whether an access to the VRMA is Guarded. Therefore, if an instruction is fetched from the VRMA, a Hypervisor Instruction Storage interrupt will result if G=1 in the associated Page Table Entry. ## 5.7.3.5 Storage Control Attributes for Implicit Storage Accesses Implicit accesses to the Page Table by the processor in performing address translation and in recording reference and change information are performed as though the storage occupied by the Page Table had the following storage control attributes. - not Write Through Required - not Caching Inhibited - Memory Coherence Required - not Guarded The definition of "performed" given in Book II applies also to these implicit accesses; accesses for performing address translation are considered to be loads in this respect, and accesses for recording reference and change information are considered to be stores. These implicit accesses are ordered by the *ptesync* instruction as described in Section 5.9.2. ## 5.7.4 Address Ranges Having Defined Uses The address ranges described below have uses that are defined by the architecture. #### ■ Fixed interrupt vectors Except for the first 256 bytes, which are reserved for software use, the real page beginning at real address 0x0000\_0000\_0000\_0000 is either used for interrupt vectors or reserved for future interrupt vectors. #### ■ Implementation-specific use The two contiguous real pages beginning at real address 0x0000\_0000\_0000\_1000 are reserved for implementation-specific purposes. #### ■ Offset Real Mode interrupt vectors The real pages beginning at the real address specified by the HRMOR and RMOR are used similarly to the page for the fixed interrupt vectors. #### ■ Page Table A contiguous sequence of real pages beginning at the real address specified by SDR1 contains the Page Table. #### 5.7.5 Address Translation Overview The effective address (EA) is the address generated by the processor for an instruction fetch or for a data access. If address translation is enabled, this address is passed to the Address Translation mechanism, which attempts to convert the address to a real address which is then used to access storage. The first step in address translation is to convert the effective address to a virtual address (VA), as described in Section 5.7.6. The second step, conversion of the virtual address to a real address (RA), is described in Section 5.7.7. If the effective address cannot be translated, a storage exception (see Section 5.2) occurs. Figure 16 gives an overview of the address translation process. Figure 16. Address translation overview #### 5.7.6 Virtual Address Generation Conversion of a 64-bit effective address to a virtual address is done by searching the Segment Lookaside Buffer (SLB) as shown in Figure 17. Figure 17. Translation of 64-bit effective address to 78 bit virtual address ## 5.7.6.1 Segment Lookaside Buffer (SLB) The Segment Lookaside Buffer (SLB) specifies the mapping between Effective Segment IDs (ESIDs) and Virtual Segment IDs (VSIDs). The number of SLB entries is implementation-dependent, except that all implementations provide at least 32 entries. The contents of the SLB are managed by software, using the instructions described in Section 5.9.3.1. See Chapter 10. "Synchronization Requirements for Context Alterations" on page 585 for the rules that software must follow when updating the SLB. #### **SLB Entry** Each SLB entry (SLBE, sometimes referred to as a "segment descriptor") maps one ESID to one VSID. Figure 18 shows the layout of an SLB entry ESID V B VSID K<sub>s</sub>K<sub>p</sub>NLC / LP 0 36 37 39 89 94 95 96 | Bit(s)<br>0:35<br>36<br>37:38 | Name<br>ESID<br>V<br>B | Description Effective Segment ID Entry valid (V=1) or invalid (V=0) Segment Size Selector 0b00 - 256 MB (s=28) 0b01 - 1 TB (s=40) 0b10 - reserved 0b11 - reserved | |-------------------------------|------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 39:88 | VSID | Virtual Segment ID | | 89 | K <sub>s</sub> | Supervisor (privileged) state storage key (see Section 5.7.9.2) | | 90 | $K_p$ | Problem state storage key (See Section 5.7.9.2.) | | 91 | N | No-execute segment if N=1 | | 92 | L | Virtual page size selector bit 0. | | 93 | С | Class | | 95:96 | LP | Virtual page size selector bits 1:2. | All other fields are reserved. $B_0$ (SLBE $_{37}$ )is treated as a reserved field. #### Figure 18. SLB Entry Instructions cannot be executed from a No-execute (N=1) segment. The L and LP bits specify the page size or sizes that the segment may contain as shown in Figure 19. A Mixed Page Size (MPS) segment is a segment that may contain 4 KB pages, 64 KB pages, or a mixture of both. A Uniform Page Size (UPS) segment is a segment that must contain pages of only a single size. | SLBE <sub>L LP</sub> | Seg-<br>ment<br>Type | Virtual Page Size(s) | |-----------------------------------|----------------------|---------------------------------------------------------------------------------------------| | 0b000 | MPS | 4 KB, 64 KB if PTE <sub>L LP</sub> specifies<br>64 KB page in MPS segment, or<br>both sizes | | 0b101 | UPS | 64 KB if PTE <sub>L LP</sub> specifies 64 KB page in UPS segment | | additional<br>values <sup>1</sup> | UPS | 2 <sup>p</sup> bytes, where p > 12 and may differ among SLB <sub>L LP</sub> values | The "additional values" of SLB<sub>L||LP</sub> are implementation-dependent, as are the corresponding virtual page sizes. Figure 19. SLBL<sub>LIILP</sub> Encoding For each SLB entry, software must ensure the following requirements are satisfied. - L||LP contains a value supported by the implementation. - The page size selected by the L and LP fields does not exceed the segment size selected by the B field. - If s=40, the following bits of the SLB entry contain 0s. - ESID<sub>24:35</sub> - VSID<sub>38:49</sub> The bits in the above two items are ignored by the processor. The Class field of the SLB is used in conjunction with the *slbie* and *slbia* instructions (see Section 5.9.3.1). "Class" refers to a grouping of SLB entries and implementation-specific lookaside information so that only entries in a certain group need be invalidated and others might be preserved. The Class value assigned to an implementation-specific lookaside entry derived from an SLB entry must match the Class value of that SLB entry. The Class value assigned to an implementation-specific lookaside entry that is not derived from an SLB entry (such as real mode address "translations") is 0. Software must ensure that the SLB contains at most one entry that translates a given effective address, and that if the SLB contains an entry that translates a given effective address, then any previously existing translation of that effective address has been invalidated. An attempt to create an SLB entry that violates this requirement may cause a Machine Check. #### **Programming Note** It is permissible for software to replace the contents of a valid SLB entry without invalidating the translation specified by that entry provided the specified restrictions are followed. See Chapter 10 Note 11. #### 5.7.6.2 SLB Search When the hardware searches the SLB, all entries are tested for a match with the EA. For a match to exist, the following conditions must be satisfied for indicated fields in the SLBE. - V=1 - ESID<sub>0:63-s</sub>=EA<sub>0:63-s</sub>, where the value of s is specified by the B field in the SLBE being tested If no match is found, the search fails. If one match is found, the search succeeds. If more than one match is found, one of the matching entries is used as if it were the only matching entry, or a Machine Check occurs. If the SLB search succeeds, the virtual address (VA) is formed from the EA and the matching SLB entry fields as follows. $VA=VSID_{0:77-s} || EA_{64-s:63}$ The Virtual Page Number (VPN) is bits 0:77-p of the virtual address. If the value of the virtual page size selector field in the matching SLBE is 0b000, then the value of p is the value specified in the PTE used to translate the virtual address (see Section 5.7.7.1); otherwise the value of p is the value specified in the virtual page size selector field in the matching SLBE. If SLBE $_{\rm N}$ = 1, the N (No-execute) value used for the storage access is 1. If the SLB search fails, a *segment fault* occurs. This is an Instruction Segment exception or a Data Segment exception, depending on whether the effective address is for an instruction fetch or for a data access. #### 5.7.7 Virtual to Real Translation Conversion of a 78-bit virtual address to a real address is done by searching the Page Table as shown in Figure 20. Figure 20. Translation of 78-bit virtual address to 60-bit real address #### 5.7.7.1 Page Table The Hashed Page Table (HTAB) is a variable-sized data structure that specifies the mapping between virtual page numbers and real page numbers, where the real page number of a real page is bits 0:49 of the address of the first byte in the real page. The HTAB's size can be any size 2<sup>n</sup> bytes where 18≤n≤46. The HTAB must be located in storage having the storage control attributes that are used for implicit accesses to it (see Section 5.7.3.3). The starting address must be a multiple of its size unless the implementation supports the Server.Relaxed Page Table Alignment category, in which case its starting address is a multiple of 2<sup>18</sup> bytes (see Section 5.7.7.4). The HTAB contains Page Table Entry Groups (PTEGs). A PTEG contains 8 Page Table Entries (PTEs) of 16 bytes each; each PTEG is thus 128 bytes long. PTEGs are entry points for searches of the Page Table. See Section 5.10 for the rules that software must follow when updating the Page Table. #### Programming Note – The Page Table must be treated as a hypervisor resource (see Chapter 2), and therefore must be placed in real storage to which only the hypervisor has write access. Moreover, the contents of the Page Table must be such that non-hypervisor software cannot modify storage that contains hypervisor programs or data. #### **Page Table Entry** Each Page Table Entry (PTE) maps one VPN to one RPN. Figure 21 shows the layout of a PTE. This layout is independent of the Endian mode of the processor. | 0 | | | | | | | | | 57 | 61 | 62 | 63 | |----|---|-----|---|------|----|-----|----|----|------|----|----|----| | I | В | | | AVI | PΝ | | | | SW | L | Н | ٧ | | pp | / | key | ' | ARPN | LP | key | R | С | WIMG | Ν | р | р | | 0 | 1 | 2 | 4 | | 44 | 52 | 55 | 56 | 57 | 61 | 62 | 63 | | <b>Dword</b><br>0 | <b>Bit(s)</b> 0:1 | <b>Name</b><br>B | Description Segment Size 0b00 - 256 MB 0b01 - 1 TB 0b10 - reserved 0b11 - reserved | |-------------------|-------------------|------------------|------------------------------------------------------------------------------------| | | 2:56 | AVPN | Abbreviated Virtual Page<br>Number | | | 57:60 | SW | Available for software use | | | 61 | L | Virtual page size<br>0b0 - 4 KB<br>0b1 - greater than 4KB | | | | | (large page) | | | 62 | Н | Hash function identifier | | | 63 | V | Entry valid (V=1) or invalid (V=0) | | 1 | 0 | pp | Page Protection bit 0 | | | 2:3 | key | KEY bits 0:1 | | | 4:43 | ARPN | Abbreviated Real Page<br>Number | | | 44:51 | LP | Large page size selector | | | 52:54 | key | KEY bits 2:4 | | | 55 | R | Reference bit | | | 56 | С | Change bit | | | 57:60 | WIMG | Storage control bits | | | 61 | N | No-execute page if N=1 | | | 62:63 | рр | Page protection bits 1:2 | All other fields are reserved. Figure 21. Page Table Entry #### **Programming Note** The H bit in the Page Table entry should not be set to one unless the secondary Page Table search has been enabled. If p≤23, the Abbreviated Virtual Page Number (AVPN) field contains bits 0:54 of the VPN. Otherwise bits 0:77-p of the AVPN field contain bits 0:77-p of the VPN, and bits 78-p:54 of the AVPN field must be zeros and are ignored by the processor. #### Programming Note If p≤23, the AVPN field omits the low-order 23-p bits of the VPN. These bits are not needed in the PTE, because the low-order 11 bits of the VPN are always used in selecting the PTEGs to be searched (see Section 5.7.7.3). On implementations that support a virtual address size of only n bits, n<78, bits 0:77-n of the AVPN field must be zeros. A virtual page is mapped to a sequence of 2<sup>p-12</sup> contiguous real pages such that the low-order p-12 bits of the real page number of the first real page in the sequence are 0s. If PTE<sub>L</sub>=0, the virtual page size is 4KB, and ARPN concatenated with LP (ARPN||LP) contains the page number of the real page that maps the virtual page described by the entry. If $PTE_L=1$ , the virtual page size is specified by $PTE_{LP}$ . In this case, the contents of $PTE_{LP}$ have the format shown in Figure 22. Bits labelled "r" are bits of the real page number. The page size specified by the non-r bits of $PTE_{LP}$ is implementation-dependent. ``` rrrr_rr0 rrrr_rr01 rrrr_r011 rrrr_0111 rrr0_1111 rr01_1111 0111_1111 ``` #### Figure 22. Format of PTE<sub>LP</sub> There are at least 2 formats of PTE<sub>LP</sub> that specify a 64 KB page. One format specifies a 64 KB page contained in an MPS segment, and another specifies a 64 K page contained in a Uniform segment. If L=1, the page size selected by the LP field must not exceed the segment size selected by the B field. Forms of $PTE_{LP}$ not supported by a given processor are treated as reserved values for that processor. The concatenation of the ARPN field and bits labeled "r" in the LP field contain the high-order bits of the real page number of the real page that maps the first 4KB of the virtual page described by the entry. The low-order p-12 bits of the real page number contained in the ARPN and LP fields must be 0s and are ignored by the processor. #### Programming Note The page size specified by a given $PTE_{LP}$ format is at least $2^{12+(8-c)}$ , where c is the number of r bits in the format. #### Programming Note - The processor often has implementation-dependent lookaside buffers (e.g. TLBs and ERATs) used to cache translations of recently used storage addresses. Mapping virtual storage to large pages may increase the effectiveness of such lookaside buffers, improving performance, because it is possible for such buffers to translate a larger range of addresses, reducing the frequency that the Page Table must be searched to translate an address. Instructions cannot be executed from a No-execute (N=1) page. #### Page Table Size The number of entries in the Page Table directly affects performance because it influences the hit ratio in the Page Table and thus the rate of page faults. If the table is too small, it is possible that not all the virtual pages that actually have real pages assigned can be mapped via the Page Table. This can happen if too many hash collisions occur and there are more than 16 entries for the same primary/secondary pair of PTEGs (when the secondary Page Table search is enabled) or more than 8 entries for the same primary PTEG (when the secondary Page Table search is disabled). . While this situation cannot be guaranteed not to occur for any size Page Table, making the Page Table larger than the minimum size (see Section 5.7.7.2) will reduce the frequency of occurrence of such collisions. #### Programming Note - If large pages are not used, it is recommended that the number of PTEGs in the Page Table be at least half the number of real pages to be accessed. For example, if the amount of real storage to be accessed is $2^{31}$ bytes (2 GB), then we have $2^{31-12}=2^{19}$ real pages. The minimum recommended Page Table size would be $2^{18}$ PTEGs, or $2^{25}$ bytes (32 MB). ### 5.7.7.2 Storage Description Register 1 The Storage Description Register 1 (SDR1) register is shown in Figure 23. | Bits | Name | Description | |-------|----------|----------------------------| | 4:45 | HTABORG | Real address of Page Table | | 59:63 | HTABSIZE | Encoded size of Page Table | All other fields are reserved. #### Figure 23. SDR1 SDR1 is a hypervisor resource; see Chapter 2. The HTABORG field in SDR1 contains the high-order 42 bits of the 60-bit real address of the Page Table. The Page Table is thus constrained to lie on a 2<sup>18</sup> byte (256 KB) boundary at a minimum. At least 11 bits from the hash function (see Figure 20) are used to index into the Page Table. The minimum size Page Table is 256 KB (2<sup>11</sup> PTEGs of 128 bytes each). The Page Table can be any size 2<sup>n</sup> bytes where 18≤n≤46. As the table size is increased, more bits are used from the hash to index into the table and the value in HTABORG must have more of its low-order bits equal to 0 unless the implementation supports the Server.Relaxed Page Table Alignment category. The HTABSIZE field in SDR1 contains an integer giving the number of bits (in addition to the minimum of 11 bits) from the hash that are used in the Page Table index. This number must not exceed 28. HTABSIZE is used to generate a mask of the form 0b00...011...1, which is a string of 28 - HTABSIZE 0-bits followed by a string of HTABSIZE 1-bits. The 1-bits determine which additional bits (beyond the minimum of 11) from the hash are used in the index (see Figure 20). The number of low-order 0 bits in HTABORG must be greater than or equal to the value in HTABSIZE. On implementations that support a real address size of only m bits, m<60, bits 0:59-m of the HTABORG field are treated as reserved bits, and software must set them to zeros. #### Programming Note - Let n equal the virtual address size (in bits) supported by the implementation. If n<67, software should set the HTABSIZE field to a value that does not exceed n-39. Because the high-order 78-n bits of the VSID are assumed to be zeros, the hash value used in the Page Table search will have the high-order 67-n bits either all 0s (primary hash; see Section 5.7.7.3) or all 1s (secondary hash). If HTABSIZE > n-39, some of these hash value bits will be used to index into the Page Table, with the result that certain PTEGs will not be searched. #### **Example:** Suppose that the Page Table is 16,384 ( $2^{14}$ ) 128-byte PTEGs, for a total size of $2^{21}$ bytes (2 MB). A 14-bit index is required. Eleven bits are provided from the hash to start with, so 3 additional bits from the hash must be selected. Thus the value in HTABSIZE must be 3 and the value in HTABORG must have its low-order 3 bits (bits 43:45 of SDR1) equal to 0. This means that the Page Table must begin on a $2^{3+11+7} = 2^{21} = 2$ MB boundary. #### 5.7.7.3 Page Table Search When the hardware searches the Page Table, the accesses are performed as described in Section 5.7.3.3. An outline of the HTAB search process is shown in Figure 20. If the implementation supports the Server.Relaxed Page Table Alignment category see Section 5.7.7.4. Up to two hash functions are used to locate a PTE that may translate the given virtual address. A 39-bit hash value is computed from the VPN. The value of s is the value specified in the SLBE that was used to generate the virtual address; the value of p used when computing the hash function is 12 if $SLBE_{L||LP}$ =0b000, otherwise the value of p is the value specified in the SLBE. #### 1. Primary Hash: If s=28, the hash value is computed by Exclusive ORing VPN<sub>11:49</sub> with ( $^{11+p}0||{\rm VPN}_{50:77-p})$ If s=40, the hash value is computed by Exclusive ORing the following three quantities: (VPN<sub>24:37</sub> $\parallel^{25}$ 0), (0||VPN<sub>0:37</sub>), and ( $^{p-1}$ 0||VPN<sub>38:77-p</sub>) The 60-bit real address of a PTEG is formed by concatenating the following values: - Bits 4:17 of SDR1 (the high-order 14 bits of HTABORG). - Bits 0:27 of the 39-bit hash value ANDed with the mask generated from bits 59:63 of SDR1 (HTABSIZE) and then ORed with bits 18:45 of SDR1 (the low-order 28 bits of HTABORG). - Bits 28:38 of the 39-bit hash value. - Seven 0-bits. This operation identifies a particular PTEG, called the "primary PTEG", whose eight PTEs will be tested. #### 2. Secondary Hash: If the secondary Page Table search is enabled (LPCR<sub>TC</sub>=0), perform the secondary hash function as follows; otherwise do not perform step 2 and proceed to step 3 below. If s=28, the hash value is computed by taking the ones complement of the Exclusive OR of VPN<sub>11·49</sub> with $(^{11+p}0||VPN_{50:77-p})$ If s=40, the hash value is computed by taking the ones complement of the Exclusive OR of the following three quantities: $(VPN_{24:37} ||^{25}0)$ , $(0||VPN_{0:37})$ , and $(p^{-1}0||VPN_{38:77-p})$ The 60-bit real address of a PTEG is formed by concatenating the following values: - Bits 4:17 of SDR1 (the high-order 14 bits of HTABORG). - Bits 0:27 of the 39-bit hash value ANDed with the mask generated from bits 59:63 of SDR1 (HTABSIZE) and then ORed with bits 18:45 of SDR1 (the low-order 28 bits of HTABORG). - Bits 28:38 of the 39-bit hash value. - Seven 0-bits. This operation identifies the "secondary PTEG". - 3. As many as 8 PTEs in the primary PTEG and, if the secondary Page Table search is enabled, 8 PTEs in the secondary PTEG are tested to determine if any translate the given virtual address. Let q = minimum(54, 77-p). For a match to exist, the following conditions must be satisfied, where SLBE is the SLBE used to form the virtual address. - PTE<sub>H</sub>=0 for the primary PTEG, 1 for the secondary PTEG - $PTE_{V}=1$ - PTE<sub>B</sub>=SLBE<sub>B</sub> - PTE<sub>AVPN[0:q]</sub>=VA<sub>0:q</sub> if PTE<sub>L</sub>=0 then SLBE<sub>L||LP</sub>=0b000 else PTE<sub>LP</sub> specifies a page size specified by SLBE<sub>LIILP</sub> If no match is found, the search fails. If one match is found, the search succeeds. If more than one match is found, one of the matching entries is used as if it were the only matching entry, or a Machine Check occurs. If the Page Table search succeeds, the real address (RA) is formed by concatenating bits 0:59-p of (ARPN||LP) from the matching PTE with bits 64-p:63 of the effective address (the byte offset), where the p value is the value specified by PTE<sub>L LP</sub>. $RA=(ARPN || LP)_{0:59-p} || EA_{64-p:63}$ The N (No-execute) value used for the storage access is the result of ORing the N bit from the matching PTE with the N bit from the SLB entry that was used to translate the effective address. #### **Programming Note** For segments that may contain a mixture of 4 KB and 64 KB pages (i.e. $SLBE_{L||LP} = 0b000$ ), the value of p used when searching the Page Table to identify the PTEGs is specified to be 12. Since the segment may contain pages of size 4KB and 64 KB, the processor searches for PTEs specifying pages of either size, and the real address is formed using a value of p specified by the matching PTE. If the Page Table search fails, a page fault occurs. This is an Instruction Storage exception or a Data Storage exception, depending on whether the effective address is for an instruction fetch or for a data access. The N value used for the storage access is the N bit from the SLB entry that was used to translate the effective #### Programming Note - To obtain the best performance, Page Table Entries should be allocated beginning with the first empty entry in the primary PTEG, or with the first empty entry in the secondary PTEG if the primary PTEG is full and the secondary Page Table search is enabled (LPCR<sub>TC</sub>=0). #### Translation Lookaside Buffer Conceptually, the Page Table is searched by the address relocation hardware to translate every reference. For performance reasons, the hardware usually keeps a Translation Lookaside Buffer (TLB) that holds PTEs that have recently been used. The TLB is searched prior to searching the Page Table. As a consequence, when software makes changes to the Page Table it must perform the appropriate TLB invalidate operations to maintain the consistency of the TLB with the Page Table (see Section 5.10). #### Programming Notes - 1. Page Table Entries may or may not be cached in a TLB. - 2. It is possible that the hardware implements more than one TLB, such as one for data and one for instructions. In this case the size and shape of the TLBs may differ, as may the values contained therein. - 3. Use the *tlbie* or *tlbia* instruction to ensure that the TLB no longer contains a mapping for a particular virtual page. # 5.7.7.4 Relaxed Page Table Alignment [Category: Server.Relaxed Page Table Alignment] The Page Table can be aligned on any 2<sup>18</sup> byte (256 KB) boundary regardless of the HTAB size. Section 5.7.7.2 describes the Storage Description Register, which includes the HTABORG field. That description generally applies except for the following difference. As the Page Table size is increased beyond 256 KB, the value in HTABORG need not have more of its low-order bits equal to 0. Instead, (HTABORG || <sup>18</sup>0) is the real address of the start of the Page Table regardless of the Page Table size. A Page Table search is performed as described in Section 5.7.7.3 except the 60-bit real address of a PTEG for both the primary and, if the secondary Page Table search is enabled, the secondary hash is formed by concatenating the following values: - Bits 0:27 of the 39-bit appropriate primary or secondary hash value ANDed with the mask generated from bits 59:63 of SDR1 (HTAB-SIZE) and then added to the value of bits 4:45 of SDR1 (HTABORG). This part of the real address differs from Section 5.7.7.2. - Bits 28:38 of the 39-bit hash value. - Seven 0-bits. An outline of the PTEG real address computation is shown in Figure 20. # 5.7.8 Reference and Change Recording If address translation is enabled, Reference (R) and Change (C) bits are updated in the Page Table Entry that is used to translate the virtual address. If the storage operand of a *Load* or *Store* instruction crosses a virtual page boundary, the accesses to the components of the operand in each page are treated as separate and independent accesses to each of the pages for the purpose of setting the Reference and Change bits. Reference and Change bits are set by the processor as described below. Setting the bits need not be atomic with respect to performing the access that caused the bits to be updated. An attempt to access storage may cause one or more of the bits to be set (as described below) even if the access is not performed. The bits are updated in the Page Table Entry if the new value would otherwise be different from the old value for the virtual page, as determined by examining either the Page Table Entry or any lookaside information for the virtual page (e.g., TLB) maintained by the processor. Reference Bit The Reference bit is set to 1 if the corresponding access (load, store, or instruction fetch) is required by the sequential execution model and is performed. Otherwise the Reference bit may be set to 1 if the corresponding access is attempted, either in-order or out-of-order, even if the attempt causes an exception. #### **Change Bit** The Change bit is set to 1 if a *Store* instruction is executed and the store is performed. Otherwise the Change bit may be set to 1 if a *Store* instruction is executed and the store is permitted by the storage protection mechanism and, if the *Store* instruction is executed out-of-order, the instruction would be required by the sequential execution model in the absence of the following kinds of interrupts: - system-caused interrupts (see Section 6.4 on page 550) - Floating-Point Enabled Exception type Program interrupts when the processor is in an Imprecise mode. #### **Programming Note** - A 64 KB virtual page in an MPS segment may be mapped by multiple PTEs. For each access of a virtual page, hardware may search the Page Table to update the R and C bits. If lookaside buffer information for the virtual page already indicates that all such bits to be set have already been set in a PTE that maps the virtual page, hardware need not make an update. Consider the following sequence of events: - 1. A virtual page is mapped by 2 PTEs A and B and the R and C bits in both PTEs are 0. - A Load instruction accesses the virtual page and the R bit is updated in PTE A. - A Load instruction accesses the virtual page and the R bit is updated in PTE B. - A Store instruction accesses the virtual page and the C bit is updated in PTE B. - The virtual page is paged out. Software must examine both PTE A and B to get the state of the R and C bits for the virtual page. Furthermore, if in event 2, PTE A was not found, a Data Storage interrupt or Hypervisor Data Storage interrupt may occur. Subsequently, if in event 3 or 4, PTE B was not found, a Data Storage interrupt or Hypervisor Data Storage interrupt may occur. #### Programming Note Even though the execution of a *Store* instruction causes the Change bit to be set to 1, the store might not be performed or might be only partially performed in cases such as the following. - A Store Conditional instruction (stwcx. or stdcx.) is executed, but no store is performed. - A Store String Word Indexed instruction (stswx) is executed, but the length is zero. - The *Store* instruction causes a Data Storage exception (for which setting the Change bit is not prohibited). - The *Store* instruction causes an Alignment exception. - The Page Table Entry that translates the virtual address of the storage operand is altered such that the new contents of the Page Table Entry preclude performing the store (e.g., the PTE is made invalid, or the PP bits are changed). For example, when executing a *Store* instruction, the processor may search the Page Table for the purpose of setting the Change bit and then re-execute the instruction. When reexecuting the instruction, the processor may search the Page Table a second time. If the Page Table Entry has meanwhile been altered, by a program executing on another processor, the second search may obtain the new contents, which may preclude the store. A system-caused interrupt occurs before the store has been performed. Figure 24 on page 523 summarizes the rules for setting the Reference and Change bits. The table applies to each atomic storage reference. It should be read from the top down; the first line matching a given situation applies. For example, if **stwcx**. fails due to both a storage protection violation and the lack of a reservation, the Change bit is not altered. In the figure, the "Load-type" instructions are the Load instructions described in Books I, II, and III-S, eciwx, and the Cache Management instructions that are treated as Loads. The "Store-type" instructions are the Store instructions described in Books I, II, and III-S, ecowx, and the Cache Management instructions that are treated as Stores. The "ordinary" Load and Store instructions are those described in Books I, II, and III-S. "set" means "set to 1". When the processor updates the Reference and Change bits in the Page Table Entry, the accesses are performed as described in Section 5.7.3.3, "Storage Control Attributes for Accesses in Real and Hypervisor Real Addressing Modes" on page 510. The accesses may be performed using operations equivalent to a store to a byte, halfword, word, or doubleword, and are not necessarily performed as an atomic read/modify/ write of the affected bytes. These Reference and Change bit updates are not necessarily immediately visible to software. Executing a **sync** instruction ensures that all Reference and Change bit updates associated with address translations that were performed, by the processor executing the **sync** instruction, before the **sync** instruction is executed will be performed with respect to that processor before the **sync** instruction's memory barrier is created. There are additional requirements for synchronizing Reference and Change bit updates in multiprocessor systems; see Section 5.10, "Page Table Update Synchronization Requirements" on page 543. #### - Programming Note Because the *sync* instruction is execution synchronizing, the set of Reference and Change bit updates that are performed with respect to the processor executing the *sync* instruction before the memory barrier is created includes all Reference and Change bit updates associated with instructions preceding the *sync* instruction. If software refers to a Page Table Entry when MSR<sub>DR</sub>=1, the Reference and Change bits in the associated Page Table Entry are set as for ordinary loads and stores. See Section 5.10 for the rules software must follow when updating Reference and Change bits. | Status of Access | R | С | |--------------------------------------------------------|------------------|--------------------| | Storage protection violation | Acc <sup>1</sup> | No | | Out-of-order I-fetch or <i>Load</i> -type insn | Acc | No | | Out-of-order Store-type insn | | | | Would be required by the sequential | | | | execution model in the absence of | | | | system-caused or imprecise | | | | interrupts <sup>3</sup> | Acc | Acc <sup>1 2</sup> | | All other cases | Acc | No | | In-order <i>Load</i> -type or <i>Store</i> -type insn, | | | | access not performed | | | | <i>Load</i> -type insn | Acc | No | | Store-type insn | Acc | Acc <sup>2</sup> | | Other in-order access | | | | I-fetch | Yes | No | | Ordinary <i>Load</i> , <i>eciwx</i> | Yes | No | | Other ordinary Store, ecowx, dcbz | Yes | Yes | | icbi, dcbt, dcbtst, dcbst, dcbf[I] | Acc | No | "Acc" means that it is acceptable to set the bit. - 1 It is preferable not to set the bit. - 2 If C is set, R is also set unless it is already set. - 3 For Floating-Point Enabled Exception type Program interrupts, "imprecise" refers to the exception mode controlled by MSR<sub>FE0 FE1</sub>. Figure 24. Setting the Reference and Change bits # 5.7.9 Storage and Virtual Page Class Key Protection The storage and virtual page class key protection mechanism provides a means for selectively granting instruction fetch access, granting read access, granting read/write access, and prohibiting access to areas of storage based on a number of control criteria. The operation of the protection mechanism depends on one or more of the following conditions. - the state of MSR bits HV, IR, DR, PR - the value of the key bits in the associated SLB entry - the values of the page protection and key bits in the associated PTE - the contents of the Authority Mask Register When translation is enabled for an access, the access is permitted if and only if the access is permitted by the virtual page class key protection (see Section 5.7.9.1) and the storage protection mechanism (see Section 5.7.9.2). If an instruction fetch is not permitted, an Instruction Storage exception is generated. If a data access is not permitted, a Data Storage exception is generated. (See Section 5.2) Unless otherwise indicated, references to "storage protection mechanism" or "protection mechanism" throughout the Books refer to both the Storage Protection mechansm and the Virtual Page Class Key Protection mechanism. When address translation is enabled, a protection domain is a range of unmapped effective addresses, a virtual page, or a segment. When address translation is disabled and LPES $_1$ =1 there are two protection domains: the set of effective addresses that are less than the value specified by the RMLS, and all other effective addresses. When address translation is disabled and LPES $_1$ =0 the entire effective address space comprises a single protection domain. A protection boundary is a boundary between protection domains. ### 5.7.9.1 Virtual Page Class Key Protection The Virtual Page Class Key protection mechanism provides the means to assign virtual pages to one of 32 classes, and to modify access permissions for each class quickly by modifying the Authority Mask Register (AMR) shown in Figure 25. The access permissions associated with the Virtual Page Class Key protection mechanism apply only to load and store operations when address translation is enabled. The Virtual Page Class Key protection mechanism has no effect on instruction fetches. Figure 25. Authority Mask Register (AMR) The contents of the AMR are as follows. | Bit | Description | |---------|--------------------------------| | 0:1 | Access mask for class number 0 | | 2:3 | Access mask for class number 1 | | | | | 2n:2n+1 | Access mask for class number n | ... 62:63 The access mask for each class defines the access permissions used in conjunction with load and store operations corresponding to page table entries containing a KEY field value equal to the class number. The access permissions associated with each class are defined as follows, where $\mathsf{AMR}_{2n}$ and $\mathsf{AMR}_{2n+1}$ refer to the first and second bits of the of the access mask corresponding to class number n. Access mask for class number 31 - An access caused by a Store instruction is permitted if AMR<sub>2n</sub>=0b0; otherwise the access is not permitted. - An access caused by a Load instruction is permitted if AMR<sub>2n+1</sub>=0b0; otherwise the access is not permitted. #### Programming Note – If translation is disabled for a given access, the access is not affected by the Virtual Page Class Key protection mechanism even if the access is made in virtual real addressing mode. #### **Programming Note** The Virtual Page Class Key protection mechanism replaces the Data Address Compare mechanism that was defined in versions of the architecture that precede Version 2.04 (e.g., the two facilities use some of the same processor resources, as described below). However, the Virtual Page Class Key protection mechanism can be used to emulate the Data Address Compare mechanism. Moreover, programs that use the Data Address Compare mechanism can be modified in a manner such that they will work correctly both on processors that comply with versions of the architecture that precede Version 2.04 (and hence implement the Data Address Compare mechanism) and on processors that comply with Version 2.04 of the architecture or with any subsequent version (and hence instead implement the Virtual Page Class Key protection mechanism). The technique takes advantage of the facts that the AMR has the same SPR number as the Data Address Compare mechanism's ACCR (Address Compare Control Register), that KEY $_4$ occupies the same bit in the PTE as the Data Address Compare mechanism's AC (Address Compare) bit, and that the definition of ACCR $_{62:63}$ is very similar to the definition of each even-odd pair of AMR bits. The technique is as follows, where PTE1 refers to doubleword 1 of the PTE. - Set bits 2:3 and 62:63 of SPR 29 (which is either the ACCR or the AMR) to x, where x is the desired 2-bit value for controlling Data Address Compare matches, and set bits 0:1 to 0s. - Set PTE1<sub>54</sub> (which is either the AC bit or KEY<sub>4</sub>) to the same value that the AC bit would be set to, and set PTE1<sub>2:3</sub> (which are either RPN bits, that correspond to a real address size larger than the size implemented by any processor that implements the Data Address Compare mechanism, or KEY<sub>0:1</sub>) and PTE1<sub>52:53</sub> (which are either reserved bits or KEY<sub>2:3</sub>) to 0s. - Use PTE<sub>KEY</sub> values 0 and 1 only for purposes of emulating the Data Address Compare mechanism, except that PTE<sub>KEY</sub> value 0 may - also be used for any virtual pages for which it is desired that the Virtual Page Class Key mechanism permit all accesses. Do not use $PTE_{KEY} = 31$ . - When a Data Storage interrupt occurs, if DSISR<sub>42</sub>=1 then ignore the interrupt for Cache Management instructions other than *dcbz*. (These instructions can cause a virtual page class key protection violation but cannot cause a Data Address Compare match.) Otherwise treat the interrupt as if a Data Address Compare match had occurred. (Note: Cases for which it is undefined whether a Data Address Compare match occurs do not necessarily cause a virtual page class key protection violation.) ### 5.7.9.2 Storage Protection, Address Translation Enabled When address translation is enabled, the protection mechanism is controlled both by virtual page class key protection (see Section 5.7.9.1) and the following. - MSR<sub>PR</sub>, which distinguishes between supervisor (privileged) state and problem state - K<sub>s</sub> and K<sub>p</sub>, the supervisor (privileged) state and problem state storage key bits in the SLB entry used to translate the effective address - PP, page protection bits 0:2 in the Page Table Entry used to translate the effective address - For instruction fetches only: - the N (No-execute) value used for the access (see Sections 5.7.6.1 and 5.7.7.3) - PTE<sub>G</sub>, the G (Guarded) bit in the Page Table Entry used to translate the effective address Using the above values, the following rules are applied. 1. For an instruction fetch, the access is not permitted if the N value is 1 or if PTE<sub>G</sub>=1. 2. For any access except an instruction fetch that is not permitted by rule 1, a "Key" value is computed using the following formula: $$\texttt{Key} \leftarrow (\texttt{K}_{\texttt{p}} \texttt{ \& MSR}_{\texttt{PR}}) \ | \ (\texttt{K}_{\texttt{S}} \texttt{ \& } \neg \texttt{MSR}_{\texttt{PR}})$$ Using the computed Key, Figure 26 is applied. An instruction fetch is permitted for any entry in the figure except "no access". A load is permitted for any entry except "no access". A store is permitted only for entries with "read/write". | Key | PP | Access Authority | | | | |-----|-----|------------------|--|--|--| | 0 | 000 | read/write | | | | | 0 | 001 | read/write | | | | | 0 | 010 | read/write | | | | | 0 | 011 | ead only | | | | | 0 | 110 | read only | | | | | 1 | 000 | no access | | | | | 1 | 001 | read only | | | | | 1 | 010 | ead/write | | | | | 1 | 011 | read only | | | | | 1 | 110 | no access | | | | All PP encodings not shown above are reserved. The results of using reserved PP encodings are boundedly undefined. Figure 26. PP bit protection states, address translation enabled ### 5.7.9.3 Storage Protection, Address Translation Disabled When address translation is disabled, the protection mechanism is controlled by the following (see Chapter 2 and Section 5.7.3, "Real And Virtual Real Addressing Modes"). - LPES<sub>1</sub>, which distinguishes between the two modes of accessing storage using the LPAR facility - MSR<sub>HV</sub>, which distinguishes between hypervisor state and other privilege states - RMLS, which specifies the real mode limit value Using the above values, Figure 27 is applied. The access is permitted for any entry in the figure except "no access". | LPES <sub>1</sub> | HV | Access Authority | |-------------------|----|--------------------------| | 0 | 0 | no access | | 0 | 1 | read/write | | 1 | 0 | read/write or no access1 | | 1 | 1 | read/write | If VPM<sub>0</sub>=1, the access authority is read/write. If VPM<sub>0</sub>=0 and the effective address for the access is less than the value specified by the RMLS, the access authority is read/write; otherwise the access is not permitted. Figure 27. Protection states, address translation disabled #### Programming Note The comparison described in note 1 in Figure 27 ignores bits 0:3 of the effective address and may ignore bits 4:63-m; see Section 5.7.3. #### 5.8 Storage Control Attributes This section describes aspects of the storage control attributes that are relevant only to privileged software programmers. The rest of the description of storage control attributes may be found in Section 1.6 of Book II and subsections. #### 5.8.1 Guarded Storage Storage is said to be "well-behaved" if the corresponding real storage exists and is not defective, and if the effects of a single access to it are indistinguishable from the effects of multiple identical accesses to it. Data and instructions can be fetched out-of-order from well-behaved storage without causing undesired side effects. Storage is said to be Guarded if any of the following conditions is satisfied. - MSR bit IR or DR is 1 for instruction fetches or data accesses respectively, and the G bit is 1 in the relevant Page Table Entry. - MSR bit IR or DR is 0 for instruction fetches or data accesses respectively, MSR<sub>HV</sub>=1, and the storage is outside the range(s) specified by the Real Mode Storage Control facility (see Section 5.7.3.3.1). In general, storage that is not well-behaved should be Guarded. Because such storage may represent a control register on an I/O device or may include locations that do not exist, an out-of-order access to such storage may cause an I/O device to perform unintended operations or may result in a Machine Check. The following rules apply to in-order execution of *Load* and *Store* instructions for which the first byte of the storage operand is in storage that is both Caching Inhibited and Guarded. Load or Store instruction that causes an atomic access If any portion of the storage operand has been accessed and an External, Decrementer, Hypervisor Decrementer, or Imprecise mode Floating-Point Enabled exception is pending, the instruction completes before the interrupt occurs. Load or Store instruction that causes an Alignment exception, or that causes a Data Storage exception for reasons other than Data Address Breakpoint match. The portion of the storage operand that is in Caching Inhibited and Guarded storage is not accessed. (The corresponding rules for instructions that cause a Data Address Breakpoint match are given in Section 8.1.2.) ## 5.8.1.1 Out-of-Order Accesses to Guarded Storage In general, Guarded storage is not accessed out-oforder. The only exceptions to this rule are the following. #### **Load Instruction** If a copy of any byte of the storage operand is in a cache then that byte may be accessed in the cache or in main storage. #### **Instruction Fetch** If MSR<sub>HV IR</sub>=0b10 then an instruction may be fetched if any of the following conditions are met. - 1. The instruction is in a cache. In this case it may be fetched from the cache or from main storage. - The instruction is in a real page from which an instruction has previously been fetched, except that if that previous fetch was based on condition 1 then the previously fetched instruction must have been in the instruction cache. - The instruction is in the same real page as an instruction that is required by the sequential execution model, or is in the real page immediately following such a page. #### Programming Note Software should ensure that only well-behaved storage is copied into a cache, either by accessing as Caching Inhibited (and Guarded) all storage that may not be well-behaved, or by accessing such storage as not Caching Inhibited (but Guarded) and referring only to cache blocks that are well-behaved. If a real page contains instructions that will be executed when $MSR_{IR}=0$ and $MSR_{HV}=1$ , software should ensure that this real page and the next real page contain only well-behaved storage (or that the Real Mode Storage Control facility specifies that this real page is not Guarded). #### 5.8.2 Storage Control Bits When address translation is enabled, each storage access is performed under the control of the Page Table Entry used to translate the effective address. Each Page Table Entry contains storage control bits that specify the presence or absence of the corresponding storage control for all accesses translated by the entry as shown in Figure 28. | Bit | Storage Control Attribute | |----------------|--------------------------------------------------------------------| | W <sup>1</sup> | 0 - not Write Through Required 1 - Write Through Required | | I | 0 - not Caching Inhibited 1 - Caching Inhibited | | M <sup>2</sup> | 0 - not Memory Coherence Required<br>1 - Memory Coherence Required | | G | 0 - not Guarded<br>1 - Guarded | - Support for the 1 value of the W bit is optional. Implementations that do not support the 1 value treat the bit as reserved and assume its value to be 0. - <sup>2</sup> [Category: Memory Coherence] Support for the 0 value of the M bit is optional, implementations that do not support the 0 value assume the value of the bit to be 1, and may either preserve the value of the bit or write it as 1. #### Figure 28. Storage control bits When address translation is enabled, instructions are not fetched from storage for which the G bit in the Page Table Entry is set to 1; see Section 5.7.9. When address translation is disabled, the storage control attributes are implicit; see Section 5.7.3.3. In Section 5.8.2.1 and 5.8.2.2, "access" includes accesses that are performed out-of-order, and references to W, I, M, and G bits include the values of those bits that are implied when address translation is disabled. #### Programming Note - In a uniprocessor system in which only the processor has caches, correct coherent execution does not require the processor to access storage as Memory Coherence Required, and accessing storage as not Memory Coherence Required may give better performance. #### 5.8.2.1 Storage Control Bit Restrictions All combinations of W, I, M, and G values are permitted except those for which both W and I are 1. #### Programming Note – If an application program requests both the Write Through Required and the Caching Inhibited attributes for a given storage location, the operating system should set the I bit to 1 and the W bit to 0. At any given time, the value of the I bit must be the same for all accesses to a given real page. At any given time, the value of the W bit must be the same for all accesses to a given real page. ### 5.8.2.2 Altering the Storage Control Bits When changing the value of the I bit for a given real page from 0 to 1, software must set the I bit to 1 and then flush all copies of locations in the page from the caches using **dcbf**[I] and **icbi** before permitting any other accesses to the page. When changing the value of the W bit for a given real page from 0 to 1, software must ensure that no processor modifies any location in the page until after all copies of locations in the page that are considered to be modified in the data caches have been copied to main storage using **dcbst** or **dcbf[I]** #### Programming Note - It is recommended that *dcbf* be used, rather than *dcbfl*, when changing the value of the I or W bit from 0 to 1. (*dcbfl* would have to be executed on all processors for which the contents of the data cache may be inconsistent with the new value of the bit, whereas, if the M bit for the page is 1, *dcbf* need be executed on only one processor in the system.) When changing the value of the M bit for a given real page, software must ensure that all data caches are consistent with main storage. The actions required to do this to are system-dependent. #### - Programming Note For example, when changing the M bit in some directory-based systems, software may be required to execute *dcbf[I]* on each processor to flush all storage locations accessed with the old M value before permitting the locations to be accessed with the new M value. Additional requirements for changing the storage control bits in the Page Table are given in Section 5.10. #### **5.9 Storage Control Instructions** #### 5.9.1 Cache Management Instructions This section describes aspects of cache management that are relevant only to privileged software programmers. For a *dcbz* instruction that causes the target block to be newly established in the data cache without being fetched from main storage, the processor need not verify that the associated real address is valid. The existence of a data cache block that is associated with an invalid real address (see Section 5.6) can cause a delayed Machine Check interrupt or a delayed Checkstop. Each implementation provides an efficient means by which software can ensure that all blocks that are considered to be modified in the data cache have been copied to main storage before the processor enters any power conserving mode in which data cache contents are not maintained. #### 5.9.2 Synchronize Instruction The *Synchronize* instruction is described in Section 3.4.3 of Book II, but only at the level required by an application programmer (*sync* with L=0 or L=1). This section describes properties of the instruction that are relevant only to operating system and hypervisor software programmers. This variant of the *Synchronize* instruction is designated the Page Table Entry *sync* and is specified by the extended mnemonic *ptesync* (equivalent to *sync* with L=2). The **ptesync** instruction has all of the properties of **sync** with L=0 and also the following additional properties. - The memory barrier created by the *ptesync* instruction provides an ordering function for the storage accesses associated with all instructions that are executed by the processor executing the *ptesync* instruction and, as elements of set A, for all Reference and Change bit updates associated with additional address translations that were performed, by the processor executing the *ptesync* instruction, before the *ptesync* instruction is executed. The applicable pairs are all pairs a<sub>i</sub>,b<sub>j</sub> in which b<sub>j</sub> is a data access and a<sub>i</sub> is not an instruction fetch. - The *ptesync* instruction causes all Reference and Change bit updates associated with address translations that were performed, by the processor executing the *ptesync* instruction, before the *ptesync* instruction is executed, to be performed with respect to that processor before the *ptesync* instruction's memory barrier is created. - The *ptesync* instruction provides an ordering function for all stores to the Page Table caused by *Store* instructions preceding the *ptesync* instruction with respect to searches of the Page Table that are performed, by the processor executing the *ptesync* instruction, after the *ptesync* instruction completes. Executing a *ptesync* instruction ensures that all such stores will be performed, with - respect to the processor executing the *ptesync* instruction, before any implicit accesses to the affected Page Table Entries, by such Page Table searches, are performed with respect to that processor - In conjunction with the *tlbie* and *tlbsync* instructions, the *ptesync* instruction provides an ordering function for TLB invalidations and related storage accesses on other processors as described in the *tlbsync* instruction description on page 542. #### Programming Note For instructions following a **ptesync** instruction, the memory barrier need not order implicit storage accesses for purposes of address translation and reference and change recording. The functions performed by the *ptesync* instruction may take a significant amount of time to complete, so this form of the instruction should be used only if the functions listed above are needed. Otherwise *sync* with L=0 should be used (or *sync* with L=1, or *eieio*, if appropriate). Section 5.10, "Page Table Update Synchronization Requirements" on page 543 gives examples of uses of *ptesync*. #### 5.9.3 Lookaside Buffer Management All implementations have a Segment Lookaside Buffer (SLB). For performance reasons, most implementations also have implementation-specific lookaside information that is used in address translation. This lookaside information may be: a Translation Lookaside Buffer (TLB) which is a cache of recently used Page Table Entries (PTEs); a cache of recently used translations of effective addresses to real addresses; etc.; or any combination of these. Lookaside information, including the SLB, is managed using the instructions described in the subsections of this section. Lookaside information derived from PTEs is not necessarily kept consistent with the Page Table. When software alters the contents of a PTE, in general it must also invalidate all corresponding implementation-specific lookaside information; exceptions to this rule are described in Section 5.10.1.2. The effects of the *slbie*, *slbia*, and *TLB Management* instructions on address translations, as specified in Sections 5.9.3.1 and 5.9.3.3 for the SLB and TLB respectively, apply to all implementation-specific lookaside information that is used in address translation. Unless otherwise stated or obvious from context, references to SLB entry invalidation and TLB entry invalidation elsewhere in the Books apply also to all implementation-specific lookaside information that is derived from SLB entries and PTEs respectively. The *tlbia* instruction is optional. However, all implementations provide a means by which software can invalidate all implementation-specific lookaside information that is derived from PTEs. Implementation-specific lookaside information that contains translations of effective addresses to real addresses may include "translations" that apply in real addressing mode. Because such "translations" are affected by the contents of the LPCR, RMOR, and HRMOR, when software alters the contents of these registers it must also invalidate the corresponding implementation-specific lookaside information. Software can invalidate all such lookaside information by using the *slbia* instruction with IH=0b000. However, better performance will likely be observed if other appropriate IH values are used to limit the amount of lookaside information invalidated. All implementations that have such lookaside information provide a means by which software can invalidate all such lookaside information. For simplicity, elsewhere in the Books it is assumed that the TLB exists. #### - Programming Note Because the instructions used to manage implementation-specific lookaside information that is derived from PTEs may be changed in a future version of the architecture, it is recommended that software "encapsulate" uses of the TLB Management instructions into subroutines. #### Programming Note – The function of all the instructions described in Sections 5.9.3.1 - 5.9.3.3 is independent of whether address translation is enabled or disabled. For a discussion of software synchronization requirements when invalidating SLB and TLB entries, see Chapter 10. #### 5.9.3.1 SLB Management Instructions #### Programming Note Accesses to a given SLB entry caused by the instructions described in this section obey the sequential execution model with respect to the contents of the entry and with respect to data dependencies on those contents. That is, if an instruction sequence contains two or more of these instructions, when the sequence has completed, the final state of the SLB entry and of General Purpose Registers is as if the instructions had been executed in program order. However, software synchronization is required in order to ensure that any alterations of the entry take effect correctly with respect to address translation; see Chapter 10. #### SLB Invalidate Entry X-form slbie RB | 31 | /// | /// | RB | 434 | / | |----|-----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` ea_{0:35} \leftarrow (RB)_{0:35} if, for SLB entry that translates or most recently translated ea, entry class = (RB)_{36} and entry seg size = size specified in (RB) 37.38 then for SLB entry (if any) that translates ea SLBE_{V} \leftarrow 0 all other fields of SLBE ← undefined else s ← log_base_2(entry_seg_size) esid \leftarrow (RB)<sub>0:63-s</sub> u \leftarrow undefined 1-bit value if u then if an SLB entry translates esid SLBE_V \leftarrow 0 all other fields of SLBE ← undefined ``` Let the Effective Address (EA) be any EA for which $EA_{0:35} = (RB)_{0:35}$ . Let the class be (RB)<sub>36</sub>. Let the segment size be equal to the segment size specified in (RB)<sub>37:38</sub>; the allowed values of (RB)<sub>37:38</sub>, and the correspondence between the values and the segment size, are the same as for the B field in the SLBE (see Figure 18 on page 515). The class value and segment size must be the same as the class value and segment size in the SLB entry that translates the EA, or the values that were in the SLB entry that most recently translated the EA if the translation is no longer in the SLB; if these values are not the same, it is implementation-dependent whether the SLB entry (or implementation-dependent translation information) that translates the EA is invalidated, and the next paragraph need not apply. If the SLB contains only a single entry that translates the EA, then that is the only SLB entry that is invalidated, except that it is implementation-dependent whether an implementation-specific lookaside entry for a real mode address "translation" is invalidated. If the SLB contains more than one such entry, then zero or more such entries are invalidated, and similarly for any implementation-specific lookaside information used in address translation; additionally, a machine check may occur. SLB entries are invalidated by setting the V bit in the entry to 0, and the remaining fields of the entry are set to undefined values. The processor ignores the contents of RB listed below and software must set them to 0s. - (RB)<sub>37</sub> - (RB)<sub>39:63</sub> - If s = 40, $(RB)_{24:35}$ If this instruction is executed in 32-bit mode, $(RB)_{0:31}$ must be zeros. This instruction is privileged. #### **Special Registers Altered:** None #### Programming Note slbie does not affect SLBs on other processors. #### Programming Note The reason the class value specified by *slbie* must be the same as the Class value that is or was in the relevant SLB entry is that the processor may use these values to optimize invalidation of implementation-specific lookaside information used in address translation. If the value specified by *slbie* differs from the value that is or was in the relevant SLB entry, these optimizations may produce incorrect results. (An example of implementation-specific address translation lookaside information is the set of recently used translations of effective addresses to real addresses that some processors maintain in an Effective to Real Address Translation (ERAT) lookaside buffer.) When switching tasks in certain cases, it may be advantageous to preserve some implementation-specific lookaside entries while invalidating others. The IH=0b001 invalidation hint of the *slbia* instruction can be used for this purpose if SLB class values are appropriately assigned, i.e. a class value of 0 gives the hint that the entry should be preserved and a class value of 1 indicates the entry must be invalidated. Also, it is advantageous to assign a class value of 1 to entries that need to be invalidated via an *slbie* instruction while preserving implementation-specific lookaside entries that are not derived from an SLB entry since such entries are assigned a class value of 0. The *Move To Segment Register* instructions (see Section 5.9.3.2.1) create SLB entries in which the Class value is 0. #### **Programming Note** The B value in register RB may be needed for invalidating ERAT entries corresponding to the translation being invalidated. #### SLB Invalidate All #### X-form slbia IH | 31 | // | IH | /// | /// | 498 | / | |----|----|----|-----|-----|-----|----| | 0 | 6 | 8 | 11 | 16 | 21 | 31 | for each SLB entry except SLB entry 0 SLBEV ← 0 all other fields of SLBE ← undefined For all SLB entries except SLB entry 0, the V bit in the entry is set to 0, making the entry invalid, and the remaining fields of the entry are set to undefined values. SLB entry 0 is not altered. On implementations that have implementation-specific lookaside information for effective to real address translations, the IH field provides a hint which can be used to selectively invalidate entries in such lookaside information. The defined values for IH are as follows. 0b000 All implementation-specific lookaside information is invalidated. (This value is not a hint.) 0b001 Preserve implementation-specific lookaside information with a Class value of 0. 0b010 Preserve implementation-specific lookaside information created when MSR<sub>IR/DR</sub>=0. 0b110 Preserve implementation-specific lookaside information created when MSR $_{HV}$ =1, MSR $_{PR}$ =0, and MSR $_{IR/DR}$ =0. All other values are reserved. Implementation specific lookaside information for which preservation is not requested must be invalidated. Implementation specific lookaside information for which preservation is requested may be invalidated. This instruction is privileged. #### **Special Registers Altered:** None #### Programming Note — slbia does not affect SLBs on other processors. #### Programming Note If *slbia* is executed when instruction address translation is enabled, software can ensure that attempting to fetch the instruction following the *slbia* does not cause an Instruction Segment interrupt by placing the *slbia* and the subsequent instruction in the effective segment mapped by SLB entry 0. (The preceding assumes that no other interrupts occur between executing the *slbia* and executing the subsequent instruction.) #### Programming Note - The defined values for IH are as follows. 0b000 All ERAT entries are invalidated. (This value is not a hint.) This value should be used by the hypervisor when relocating itself (i.e. when modifying the HRMOR) or when reconfiguring real storage. 0b001 Preserve ERAT entries with a Class value of 0. This value should be used by an operating system when switching tasks in certain cases; for example, if SLBE<sub>C</sub>=0 is used for SLB translations shared between the tasks. 0b010 Preserve ERAT entries created when MSR<sub>IR/DR</sub>=0. This value should generally be used by an operating system when switching tasks. 0b110 Preserve ERAT entries created when MSR<sub>HV</sub>=1 and MSR<sub>IR/DR</sub>=0. This value should be used by the hypervisor when switching partitions. All other values are reserved. If the IH field contains a reserved value, the hint provided by the instruction is undefined. #### - Programming Note **slbia** serves as both a basic and an extended mnemonic. The Assembler will recognize an **slbia** mnemonic with one operand as the basic form, and an **slbia** mnemonic with no operand as the extended form. In the extended form the IH operand is omitted and assumed to be 0. #### **SLB Move To Entry** X-form slbmte RS,RB The SLB entry specified by bits 52:63 of register RB is loaded from register RS and from the remainder of register RB. The contents of these registers are interpreted as shown in Figure 29. RS RB RS<sub>0:1</sub> B RS<sub>2:51</sub> VSID RS<sub>52</sub> K<sub>s</sub> RS<sub>53</sub> K<sub>p</sub> RS<sub>54</sub> N RS<sub>55</sub> L RS<sub>56</sub> C RS<sub>57</sub> must be 0b0 RS<sub>58:59</sub> LP RS<sub>60:63</sub> must be 0b0000 RB<sub>0:35</sub> ESID RB<sub>36</sub> V RB<sub>37:51</sub> must be 0b000 || 0x000 RB<sub>52:63</sub> index, which selects the SLB entry #### Figure 29. GPR contents for slbmte On implementations that support a virtual address size of only n bits, n<78, $(RS)_{0:77-n}$ must be zeros. (RS)<sub>57</sub> and(RS)<sub>60.63</sub> must be ignored by the processor. High-order bits of (RB)<sub>52:63</sub> that correspond to SLB entries beyond the size of the SLB provided by the implementation must be zeros. If this instruction is executed in 32-bit mode, $(RB)_{0:31}$ must be zeros (i.e., the ESID must be in the range 0-15). This instruction cannot be used to invalidate an SLB entry. This instruction is privileged. #### **Special Registers Altered:** None #### Programming Note - The reason *slbmte* cannot be used to invalidate an SLB entry is that it does not necessarily affect implementation-specific address translation lookaside information. *slbie* (or *slbia*) must be used for this purpose. #### SLB Move From Entry VSID X-form | sibmtev | R1,RI | В | | |---------|-------|---|--| | | | | | If the SLB entry specified by bits 52:63 of register RB is valid (V=1), the contents of the B, VSID, $K_s$ , $K_p$ , N, L, C, and LP fields of the entry are placed into register RT. The contents of these registers are interpreted as shown in Figure 30. RT RB RT<sub>60:63</sub> RT<sub>0:1</sub> В RT<sub>2:51</sub> **VSID** RT<sub>52</sub> $K_s$ $\mathsf{K}_\mathsf{p}$ $RT_{53}$ $RT_{54}$ N L $RT_{55}$ С $RT_{56}$ set to 0b0 RT<sub>57</sub> LP RT<sub>58:59</sub> RB<sub>0:51</sub> must be 0x0\_0000\_0000\_0000 RB<sub>52:63</sub> index, which selects the SLB entry #### Figure 30. GPR contents for slbmfev set to 0b0000 On implementations that support a virtual address size of only n bits, n<78, RT<sub>0:77-n</sub> are set to zeros. If the SLB entry specified by bits 52:63 of register RB is invalid (V=0), the contents of register RT are set to 0. High-order bits of $(RB)_{52:63}$ that correspond to SLB entries beyond the size of the SLB provided by the implementation must be zeros. This instruction is privileged. #### Special Registers Altered: None #### SLB Move From Entry ESID X-form | slbmfee | RT,R | В | | | | |---------|------|-----|----|-----|----| | 31 | RT | /// | RB | 915 | / | | 0 | 6 | 11 | 16 | 21 | 31 | If the SLB entry specified by bits 52:63 of register RB is valid (V=1), the contents of the ESID and V fields of the entry are placed into register RT. The contents of these registers are interpreted as shown in Figure 31. RT $\begin{array}{lll} \text{RT}_{0:35} & \text{ESID} \\ \text{RT}_{36} & \text{V} \\ \text{RT}_{37:63} & \text{set to } 0\text{b000} \mid\mid 0\text{x}00\_0000 \\ \text{RB}_{0:51} & \text{must be } 0\text{x}0\_0000\_0000\_0000 \\ \text{RB}_{52:63} & \text{index, which selects the SLB entry} \end{array}$ #### Figure 31. GPR contents for slbmfee If the SLB entry specified by bits 52:63 of register RB is invalid (V=0), the contents of register RT are set to 0. High-order bits of (RB)<sub>52:63</sub> that correspond to SLB entries beyond the size of the SLB provided by the implementation must be zeros. This instruction is privileged. #### **Special Registers Altered:** #### SLB Find Entry ESID X-form slbfee. RT,RB The SLB is searched for an entry that matches the effective address specified by register RB. The search is performed as if it were being performed for purposes of address translation. E.g., in order for a given entry to satisfy the search, the entry must be valid (V=1), and (RB)<sub>0:63-s</sub> must equal SLBE[ESID<sub>0:63-s</sub>] (where 2<sup>s</sup> is the segment size selected by the B field in the entry). If exactly one matching entry is found, the contents of the B, VSID, K<sub>s</sub>, K<sub>p</sub>, N, L, C, and LP fields of the entry are placed into register RT. If no matching entry is found, register RT is set to 0. If more than one matching entry is found, either one of the matching entries is used, as if it were the only matching entry, or a Machine Check occurs. If a Machine Check occurs, register RT, and CR Field 0 are set to undefined values, and the description below of how this register and this field is set does not apply. The contents of registers RT and RB are interpreted as shown in Figure 32. RT | В | VSID | K <sub>s</sub> K <sub>p</sub> NLC | 0 | LP | 0s | |---|------|-----------------------------------|----|----|-------| | 0 | 2 | 52 | 57 | 58 | 60 63 | RB RT<sub>0:1</sub> В **VSID** RT<sub>2:51</sub> $RT_{52}$ $K_s$ $RT_{53}$ N $RT_{54}$ $RT_{55}$ С $RT_{56}$ RT<sub>57</sub> set to 0b0 RT<sub>58:59</sub> LP RT<sub>60:63</sub> set to 0b0000 RB<sub>0:35</sub> ESID RB<sub>36:63</sub> must be 0x0000000 #### Figure 32. GPR contents for slbfee. If s > 28, $RT_{80-s:51}$ are set to zeros. On implementations that support a virtual address size of only n bits, n < 78, RT<sub>2:79-n</sub> are set to zeros. CR Field 0 is set as follows. j is a 1-bit value that is equal to 0b1 if a matching entry was found. Otherwise, j is 0b0. $CR0_{LT\ GT\ EQ\ SO} = 0b00 \parallel j \parallel XER_{SO}$ If this instruction is executed in 32-bit mode, (RB)<sub>0:31</sub> must be zeros (i.e., the ESID must be in the range 0- This instruction is privileged. **Special Registers Altered:** #### 5.9.3.2 Bridge to SLB Architecture [Category:Server.Phased-Out] The facility described in this section can be used to ease the transition to the current Power ISA software-managed Segment Lookaside Buffer (SLB) architecture, from the Segment Register architecture provided by 32-bit PowerPC implementations. A complete description of the Segment Register architecture may be found in "Segmented Address Translation, 32-Bit Implementations," Section 4.5, Book III of Version 1.10 of the PowerPC architecture, referenced in the introduction to this architecture. The facility permits the operating system to continue to use the 32-bit PowerPC implementation's Segment Register Manipulation instructions. ### 5.9.3.2.1 Segment Register Manipulation Instructions The instructions described in this section -- *mtsr*, *mtsrin*, *mfsr*, and *mfsrin* -- allow software to associate effective segments 0 through 15 with any of virtual segments 0 through 2<sup>27</sup>-1. SLB entries 0:15 serve as virtual Segment Registers, with SLB entry i used to emulate Segment Register i. The *mtsr* and *mtsrin* instructions move 32 bits from a selected GPR to a selected SLB entry. The *mfsr* and *mfsrin* instructions move 32 bits from a selected GPR to a selected GPR. The contents of the GPRs used by the instructions described in this section are shown in Figure 33. Fields shown as zeros must be zero for the *Move To Segment Register* instructions. Fields shown as hyphens are ignored. Fields shown as periods are ignored by the *Move To Segment Register* instructions and set to zero by the *Move From Segment Register* instructions. Fields shown as colons are ignored by the *Move To Segment Register* instructions and set to undefined values by the *Move From Segment Register* instructions. Figure 33. GPR contents for mtsr, mtsrin, mfsr, and mfsrin #### Programming Note The "Segment Register" format used by the instructions described in this section corresponds to the low-order 32 bits of RS and RT shown in the figure. This format is essentially the same as that for the Segment Registers of 32-bit PowerPC implementations. The only differences are the following. - Bit 36 corresponds to a reserved bit in Segment Registers. Software must supply 0 for the bit because it corresponds to the L bit in SLB entries, and large pages are not supported for SLB entries created by the *Move To Segment Register* instructions. - VSID bits 23:25 correspond to reserved bits in Segment Registers. Software can use these extra VSID bits to create VSIDs that are larger than those supported by the Segment Register Manipulation instructions of 32-bit PowerPC implementations. Bit 32 of RS and RT corresponds to the T (directstore) bit of early 32-bit PowerPC implementations. No corresponding bit exists in SLB entries. #### **Programming Note** The Programming Note in the introduction to Section 5.9.3.1 applies also to the *Segment Register Manipulation* instructions described in this section, and to any combination of the instructions described in the two sections, except as specified below for *mfsr* and *mfsrin*. The requirement that the SLB contain at most one entry that translates a given effective address (see Section 5.7.6.1) applies to SLB entries created by *mtsr* and *mtsrin*. This requirement is satisfied naturally if only *mtsr* and *mtsrin* are used to create SLB entries for a given ESID, because for these instructions the association between SLB entries and ESID values is fixed (SLB entry i is used for ESID i). However, care must be taken if *slbmte* is also used to create SLB entries for the ESID, because for *slbmte* the association between SLB entries and ESID values is specified by software. #### Move To Segment Register X-form mtsr SR,RS | 31 | RS | / | SR | /// | 210 | / | |----------|----|----|-----|-----|------------|-----| | <u> </u> | NO | ′ | ٠.٠ | "" | | ı ′ | | 0 | 6 | 11 | 12 | 16 | 21 | 31 | | 0 | J | | | 1.0 | <u>-</u> ' | ٠. | The SLB entry specified by SR is loaded from register RS, as follows. | SLBE<br>Bit(s) | Set to | SLB Field(s) | |----------------|-----------------------|------------------------------------| | 0:31 | 0x0000_0000 | ESID <sub>0:31</sub> | | 32:35 | SR | ESID <sub>32:35</sub> | | 36 | 0b1 | V | | 37:38 | 0b00 | В | | 39:61 | 0b000 0x0_0000 | VSID <sub>0:22</sub> | | 62:88 | (RS) <sub>37:63</sub> | VSID <sub>23:49</sub> | | 89:91 | (RS) <sub>33:35</sub> | $K_sK_pN$ | | 92 | (RS) <sub>36</sub> | L ((RS) <sub>36</sub> must be 0b0) | | 93 | 0b0 | С | | 94 | 0b0 | reserved | | 95:96 | 0b00 | LP | $\ensuremath{\mathsf{MSR}_\mathsf{SF}}$ must be 0 when this instruction is executed; otherwise the results are boundedly undefined. This instruction is privileged. #### **Special Registers Altered:** None ### Move To Segment Register Indirect X-form mtsrin RS,RB | 31 | RS | /// | RB | 242 | / | |----|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The SLB entry specified by $(RB)_{32:35}$ is loaded from register RS, as follows. | SLBE<br>Bit(s) | Set to | SLB Field(s) | |----------------|-----------------------|------------------------------------| | 0:31 | 0x0000_0000 | ESID <sub>0:31</sub> | | 32:35 | (RB) <sub>32:35</sub> | ESID <sub>32:35</sub> | | 36 | 0b1 | V | | 37:38 | 0b00 | В | | 39:61 | 0b000 0x0_0000 | VSID <sub>0:22</sub> | | 62:88 | (RS) <sub>37:63</sub> | VSID <sub>23:49</sub> | | 89:91 | (RS) <sub>33:35</sub> | $K_sK_pN$ | | 92 | (RS) <sub>36</sub> | L ((RS) <sub>36</sub> must be 0b0) | | 93 | 0b0 | C | | 94 | 0b0 | reserved | | 95:96 | 0b00 | LP | ${\sf MSR}_{\sf SF}$ must be 0 when this instruction is executed; otherwise the results are boundedly undefined. This instruction is privileged. #### **Special Registers Altered:** #### Move From Segment Register X-form mfsr RT,SR | 31 | RT | / | SR | /// | 595 | / | |----|----|----|----|-----|-----|----| | 0 | 6 | 11 | 12 | 16 | 21 | 31 | The contents of the low-order 27 bits of the VSID field and the contents of the $K_s$ , $K_p$ , N, and L fields of the SLB entry specified by SR are placed into register RT as follows. | SLBE Bit(s) | Copied to | SLB Field(s) | |-------------|---------------------|-----------------------------------| | 62:88 | RT <sub>37:63</sub> | VSID <sub>23:49</sub> | | 89:91 | RT <sub>33:35</sub> | $K_sK_pN$ | | 92 | RT <sub>36</sub> | L (SLBE <sub>1</sub> must be 0b0) | $RT_{32}$ is set to 0. The contents of $RT_{0:31}$ are undefined. MSR<sub>SF</sub> must be 0 when this instruction is executed; otherwise the results are boundedly undefined. This instruction must be used only to read an SLB entry that was, or could have been, created by mtsr or mtsrin and has not subsequently been invalidated (i.e., an SLB entry in which ESID<16, V=1, VSID<2<sup>27</sup>, L=0, and C=0). If the SLB entry is invalid (V=0), RT<sub>33:63</sub> are set to 0. Otherwise the contents of register RT are undefined. This instruction is privileged. #### **Special Registers Altered:** None ### Move From Segment Register Indirect X-form mfsrin RT,RB | 31 | RT | /// | RB | 659 | / | |----|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The contents of the low-order 27 bits of the VSID field and the contents of the $K_s$ , $K_p$ , N, and L fields of the SLB entry specified by (RB) $_{32:35}$ are placed into register RT as follows. | SLBE Bit(s) | Copied to | SLB Field(s) | |-------------|---------------------|-----------------------------------| | 62:88 | RT <sub>37:63</sub> | VSID <sub>23:49</sub> | | 89:91 | RT <sub>33:35</sub> | $K_sK_pN$ | | 92 | RT <sub>36</sub> | L (SLBE <sub>L</sub> must be 0b0) | $RT_{32}$ is set to 0. The contents of $RT_{0:31}$ are undefined. MSR<sub>SF</sub> must be 0 when this instruction is executed; otherwise the results are boundedly undefined. This instruction must be used only to read an SLB entry that was, or could have been, created by mtsr or mtsrin and has not subsequently been invalidated (i.e., an SLB entry in which ESID<16, V=1, VSID<2<sup>27</sup>, L=0, and C=0). If the SLB entry is invalid (V=0), RT<sub>33:63</sub> are set to 0. Otherwise the contents of register RT are undefined. This instruction is privileged. #### **Special Registers Altered:** #### 5.9.3.3 TLB Management Instructions #### TLB Invalidate Entry X-form tlbie RB,L ``` if L = 0 then p = 12 if (RB)<sub>56</sub>=0 then pg_size ← 4 KB else pg_size ← 64 KB else pg_size ← page size specified in (RB)<sub>44:51</sub> p ← log_base_2(pg_size) sg_size ← segment size specified in (RB)<sub>54:55</sub> for each processor in the partition for each TLB entry if (entry_VA<sub>14:77-p</sub> = (RB)<sub>0:63-p</sub>) & (entry_sg_size = sg_size) & (entry_pg_size = pg_size) then TLB entry ← invalid ``` The operation performed by this instruction is based upon the contents of RB and the L field. The contents of RB are shown below, where L is the L field in the instruction. L=0: L=1: If the L field of the instruction contains 0, $RB_{56}$ (AP - Admixed Page size field) must be set to 0 if the page size specified by the PTE that was used to create the TLB entry to be invalidated is 4 KB and must be set to 1 if the page size specified by the PTE that was used to create the TLB entry to be invalidated is 64 KB. The VPN field in register RB must contain bits 14:65 of the virtual address translated by the TLB entry to be invalidated. If the L field in the instruction contains 1, the following rules apply, where c is the number of "r" bits in the LP field of the PTE that was used to create the TLB entry to be invalidated. The page size is specified in the LP field in register RB, where the relationship between (RB)<sub>LP</sub> and the page size is the same as the relationship between PTE<sub>LP</sub> and the page size (see Figure 22). Specifically, (RB)<sub>44+c:51</sub> must be equal to the contents of bits c:7 of the - LP field of the PTE that was used to create the TLB entry to be invalidated. - (RB)<sub>0:43+c</sub> must contain bits 14:77-p of the virtual address translated by the TLB to be invalidated, followed by p+c-20 zeros which must be ignored by the processor. Let the segment size be equal to the segment size specified in RB $_{54:55}$ (B field). The contents of RB $_{54:55}$ must be the same as the contents of PTE $_{\rm B}$ used to create the TLB entry to be invalidated. $RB_{52:53}$ , $RB_{56}$ (when the L field of the instruction is 1), and $RB_{57:63}$ must be set to zeros and must be ignored by the processor. All TLB entries that have all of the following properties are made invalid on all processors that are in the same partition as the processor executing the *tlbie* instruction. - The entry translates a virtual address for which VA<sub>14:77-p</sub> is equal to (RB)<sub>0:63-p</sub>. - The segment size of the entry is the same as the segment size specified in (RB)<sub>54:55</sub>. - Either of the following is true: - The L field in the instruction is 0, and either the page size of the entry is 4KB and (RB)<sub>56</sub>=0, or the page size of the entry is 64KB and (RB)<sub>56</sub>=1. - The L field of the instruction is 1, and the page size of the entry matches the page size specified in (RB)<sub>44:51</sub>. Additional TLB entries may also be made invalid on any processor that is in the same partition as the processor executing the *tlbie* instruction. MSR<sub>SF</sub> must be 1 when this instruction is executed; otherwise the results are undefined. The operation performed by this instruction is ordered by the *eieio* (or *sync* or *ptesync*) instruction with respect to a subsequent *tlbsync* instruction executed by the processor executing the *tlbie* instruction. The operations caused by *tlbie* and *tlbsync* are ordered by *eieio* as a fourth set of operations, which is independent of the other three sets that *eieio* orders. This instruction is hypervisor privileged. See Section 5.10, "Page Table Update Synchronization Requirements" for a description of other requirements associated with the use of this instruction. #### **Special Registers Altered:** #### Programming Note - For *tlbie[I]* instructions in which L=0, the AP value in RB is provided to make it easier for the processor to locate address translations, in lookaside buffers, corresponding to the address translation being invalidated. #### TLB Invalidate Entry Local X-form tlbiel RB,L ``` if L = 0 then p = 12 if (RB)_{56}=0 then p=12 t ``` The operation performed by this instruction is based upon the contents of RB and the L field. The contents of RB are shown below, where L is the L field in the instruction. L=0: | VPN | 0s | В | ΑP | 0s | | |-----|----|----|----|----|----| | 0 | 52 | 54 | 56 | 57 | 63 | L=1: If the L field of the instruction contains 0, $RB_{56}$ (AP - Admixed Page size field) must be set to 0 if the page size specified by the PTE that was used to create the TLB entry to be invalidated is 4 KB and must be set to 1 if the page size specified by the PTE that was used to create the TLB entry to be invalidated is 64 KB. The VPN field in register RB must contain bits 14:65 of the virtual address translated by the TLB entry to be invalidated. If the L field in the instruction contains 1, the following rules apply, where c is the number of "r" bits in the LP field of the PTE that was used to create the TLB entry to be invalidated. - The page size is specified in the LP field in register RB, where the relationship between (RB)<sub>LP</sub> and the page size is the same as the relationship between PTE<sub>LP</sub> and the page size (see Figure 22). Specifically, (RB)<sub>44+c:51</sub> must be equal to the contents of bits c:7 of the LP field of the PTE that was used to create the TLB entry to be invalidated. - (RB)<sub>0:43+c</sub> must contain bits 14:77-p of the virtual address translated by the TLB to be inval- idated, followed by p+c-20 zeros which must be ignored by the processor. Let the segment size be equal to the segment size specified in RB $_{54:55}$ (B field). The contents of RB $_{54:55}$ must be the same as the contents of PTE $_{\rm B}$ used to create the TLB entry to be invalidated. $\rm RB_{52:53}$ , $\rm RB_{56}$ (when the L field of the instruction is 1), and RB $_{57:63}$ must be set to 0s and must be ignored by the processor. All TLB entries that have all of the following properties are made invalid on the processor executing the *tlbiel* instruction. - The entry translates a virtual address for which VA<sub>14:77-p</sub> is equal to (RB)<sub>0:63-p</sub>. - The segment size of the entry is the same as the segment size specified in (RB)<sub>54·55</sub>. - Either of the following is true: - The L field in the instruction is 0, and either the page size of the entry is 4KB and (RB)<sub>56</sub>=0, or the page size of the entry is 64KB and (RB)<sub>56</sub>=1. - The L field of the instruction is 1, and the page size of the entry matches the page size specified in (RB)<sub>44-51</sub>. Only TLB entries on the processor executing the *tlbiel* instruction are affected. MSR<sub>SF</sub> must be 1 when this instruction is executed; otherwise the results are undefined. This instruction is hypervisor privileged. See Section 5.10, "Page Table Update Synchronization Requirements" on page 543 for a description of other requirements associated with the use of this instruction. #### **Special Registers Altered:** None #### **Programming Note** The primary use of this instruction by hypervisor state code is to invalidate TLB entries prior to reassigning a processor to a new logical partition. **tlbiel** may be executed on a given processor even if the sequence **tlbie** - **eieio** - **tlbsync** - **ptesync** is concurrently being executed on another processor. See also the Programming Note with the description of the *tlbie* instruction. #### TLB Invalidate All X-form #### TLB Synchronize X-form tlbia | 3 | 1 / | /// /// | /// | 370 | / | |---|-----|---------|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | all TLB entries ← invalid All TLB entries are made invalid on the processor executing the *tlbia* instruction. This instruction is hypervisor privileged. This instruction is optional, and need not be implemented. #### **Special Registers Altered:** None #### **Programming Note** tlbia does not affect TLBs on other processors. tlbsync | 31 | /// | /// | /// | 566 | / | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The *tlbsync* instruction provides an ordering function for the effects of all *tlbie* instructions executed by the processor executing the *tlbsync* instruction, with respect to the memory barrier created by a subsequent *ptesync* instruction executed by the same processor. Executing a *tlbsync* instruction ensures that all of the following will occur. - All TLB invalidations caused by tlbie instructions preceding the tlbsync instruction will have completed on any other processor before any data accesses caused by instructions following the ptesync instruction are performed with respect to that processor. - All storage accesses by other processors for which the address was translated using the translations being invalidated, and all Reference and Change bit updates associated with address translations that were performed by other processors using the translations being invalidated, will have been performed with respect to the processor executing the ptesync instruction, to the extent required by the associated Memory Coherence Required attributes, before the ptesync instruction's memory barrier is created. The operation performed by this instruction is ordered by the *eieio* (or *sync* or *ptesync*) instruction with respect to preceding *tlbie* instructions executed by the processor executing the *tlbsync* instruction. The operations caused by *tlbie* and *tlbsync* are ordered by *eieio* as a fourth set of operations, which is independent of the other three sets that *eieio* orders. The *tlbsync* instruction may complete before operations caused by *tlbie* instructions preceding the *tlb-sync* instruction have been performed. This instruction is hypervisor privileged. See Section 5.10 for a description of other requirements associated with the use of this instruction. #### **Special Registers Altered:** None #### **Programming Note -** *tlbsync* should not be used to synchronize the completion of *tlbiel*. #### 5.10 Page Table Update Synchronization Requirements This section describes rules that software must follow when updating the Page Table, and includes suggested sequences of operations for some representative cases. In the sequences of operations shown in the following subsections, any alteration of a Page Table Entry (PTE) that corresponds to a single line in the sequence is assumed to be done using a *Store* instruction for which the access is atomic. Appropriate modifications must be made to these sequences if this assumption is not satisfied (e.g., if a store doubleword operation is done using two *Store Word* instructions). Stores are not performed out-of-order, as described in Section 5.5, "Performing Operations Out-of-Order" on page 506. Moreover, address translations associated with instructions preceding the corresponding Store instructions are not performed again after the stores have been performed. (These address translations must have been performed before the store was determined to be required by the sequential execution model, because they might have caused an exception.) As a result, an update to a PTE need not be preceded by a context synchronizing operation. All of the sequences require a context synchronizing operation after the sequence if the new contents of the PTE are to be used for address translations associated with subsequent instructions. As noted in the description of the *Synchronize* instruction in Section 3.4.3 of Book II, address translation associated with instructions which occur in program order subsequent to the *Synchronize* (and this includes the *ptesync* variant) may actually be performed prior to the completion of the *Synchronize*. To ensure that these instructions and data which may have been speculatively fetched are discarded, a context synchronizing operation is required. #### **Programming Note** In many cases this context synchronization will occur naturally; for example, if the sequence is executed within an interrupt handler the *rfid* or *hrfid* instruction that returns from the interrupt handler may provide the required context synchronization. Page Table Entries must not be changed in a manner that causes an implicit branch. #### 5.10.1 Page Table Updates **TLBs are non-coherent caches of the HTAB.** TLB entries must be invalidated explicitly with one of the *TLB Invalidate* instructions. Unsynchronized lookups in the HTAB continue even while it is being modified. Any processor, including a processor on which software is modifying the HTAB, may look in the HTAB at any time in an attempt to translate a virtual address. When modifying a PTE, software must ensure that the PTE's Valid bit is 0 if the PTE is inconsistent (e.g., if the RPN field is not correct for the current AVPN field). Updates of Reference and Change bits by the processor are not synchronized with the accesses that cause the updates. When modifying doubleword 1 of a PTE, software must take care to avoid overwriting a processor update of these bits and to avoid having the value written by a *Store* instruction overwritten by a processor update. Before permitting one or more *tlbie* instructions to be executed on a given processor in a given partition software must ensure that no other processor will execute a "conflicting instruction" until after the following sequence of instructions has been executed on the given processor. the *tlbie* instruction(s) *eieio tlbsync ptesync* The "conflicting instructions" in this case are the following. - a tibie or tibsync instruction, if executed on another processor in the given partition - an mtspr instruction that modifies the LPIDR, if the modification has either of the following properties. - The old LPID value (i.e., the contents of the LPIDR just before the *mtspr* instruction is executed) is the value that identifies the given partition - The new LPID value (i.e., the value specified by the *mtspr* instruction) is the value that identifies the given partition Other instructions (excluding *mtspr* instructions that modify the LPIDR as described above, and excluding *tlbie* instructions except as shown) may be interleaved with the instruction sequence shown above, but the instructions in the sequence must appear in the order shown. On uniprocessor systems, the *eieio* and *tlb-sync* instructions can be omitted. Other instructions may be interleaved with this sequence of instructions, but these instructions must appear in the order shown. #### Programming Note - The eieio instruction prevents the reordering of tlbie instructions previously executed by the processor with respect to the subsequent tlbsync instruction. The tlbsync instruction and the subsequent ptesync instruction together ensure that all storage accesses for which the address was translated using the translations being invalidated, and all Reference and Change bit updates associated with address translations that were performed using the translations being invalidated, will be performed with respect to any processor or mechanism, to the extent required by the associated Memory Coherence Required attributes, before any data accesses caused by instructions following the *ptesync* instruction are performed with respect to that processor or mechanism. The requirements specified above for *tlbie* instructions apply also to *tlbsync* instructions, except that the "sequence of instructions" consists solely of the *tlbsync* instruction(s) followed by a *ptesync* instruction. Before permitting an *mtspr* instruction that modifies the LPIDR to be executed on a given processor, software must ensure that no other processor will execute a "conflicting instruction" until after the *mtspr* instruction followed by a context synchronizing instruction have been executed on the given processor (a context synchronizing event can be used instead of the context synchronizing instruction; see Chapter 10). The "conflicting instructions" in this case are the following. - a tlbie or tlbsync instruction, if executed on a processor in either of the following partitions - the partition identified by the old LPID value - the partition identified by the new LPID value #### Programming Note The restrictions specified above regarding modifying the LPIDR apply even on uniprocessor systems, and even if the new LPID value is equal to the old LPID value. Similarly, when a *tlbsync* instruction has been executed by a processor in a given partition, a *ptesync* instruction must be executed by that processor before a *tlbie* or *tlbsync* instruction is executed by another processor in that partition. The sequences of operations shown in the following subsections assume a multiprocessor environment. In a uniprocessor environment the *tlbsync* must be omitted, and the *eieio* that separates the *tlbie* from the *tlbsync* can be omitted. In a multiprocessor environment, when *tlbiel* is used instead of *tlbie* in a Page Table update, the synchronization requirements are the same as when *tlbie* is used in a uniprocessor environment. #### Programming Note For all of the sequences shown in the following subsections, if it is necessary to communicate completion of the sequence to software running on another processor, the ptesync instruction at the end of the sequence should be followed by a Store instruction that stores a chosen value to some chosen storage location X. The memory barrier created by the ptesync instruction ensures that if a Load instruction executed by another processor returns the chosen value from location X, the sequence's stores to the Page Table have been performed with respect to that other processor. The Load instruction that returns the chosen value should be followed by a context synchronizing instruction in order to ensure that all instructions following the context synchronizing instruction will be fetched and executed using the values stored by the sequence (or values stored subsequently). (These instructions may have been fetched or executed out-of-order using the old contents of the PTE.) This Note assumes that the Page Table and location X are in storage that is Memory Coherence Required. #### 5.10.1.1 Adding a Page Table Entry This is the simplest Page Table case. The Valid bit of the old entry is assumed to be 0. The following sequence can be used to create a PTE, maintain a consistent state, and ensure that a subsequent reference to the virtual address translated by the new entry will use the correct real address and associated attributes ``` PTE<sub>ARPN,LP,AC,R,C,WIMG,N,PP</sub> ← new values eieio /* order 1st update before 2nd */ PTE<sub>B,AVPN,SW,L,H,V</sub> ← new values (V=1) ptesync /* order updates before next Page Table search and before next data access. */ ``` #### 5.10.1.2 Modifying a Page Table Entry #### **General Case** If a valid entry is to be modified and the translation instantiated by the entry being modified is to be invalidated, the following sequence can be used to modify the PTE, maintain a consistent state, ensure that the translation instantiated by the old entry is no longer available, and ensure that a subsequent reference to the virtual address translated by the new entry will use the correct real address and associated attributes. (The sequence is equivalent to deleting the PTE and then adding a new one; see Sections 5.10.1.1 and 5.10.1.3.) ``` PTE_V \leftarrow 0 /* (other fields don't matter)*/ ptesync /* order update before tlbie and before next Page Table search */ tlbie(old_B,old_VA<sub>14:77-p</sub>,old_L,old_LP,old_AP) /*invalidate old translation*/ /* order tlbie before tlbsync eieio tlbsync /* order tlbie before ptesync ptesync /* order tlbie, tlbsync and 1st update before 2nd update \texttt{PTE}_{\texttt{ARPN},\texttt{LP},\texttt{AC},\texttt{R},\texttt{C},\texttt{WIMG},\texttt{N},\texttt{PP}} \leftarrow \texttt{new values} /* order 2nd update before 3rd */ \texttt{PTE}_{\texttt{B,AVPN,SW,L,H,V}} \leftarrow \texttt{new values} \ (\texttt{V=1}) ptesync /* order 2nd and 3rd updates before next Page Table search and before next data access ``` #### Resetting the Reference Bit If the only change being made to a valid entry is to set the Reference bit to 0, a simpler sequence suffices because the Reference bit need not be maintained exactly. #### Modifying the SW field If the only change being made to a valid entry is to modify the SW field, the following sequence suffices, because the SW field is not used by the processor and doubleword 0 of the PTE is not modified by the processor. ``` loop: ldarx r1 ← PTE_dwd_0 /* load dwd 0 of PTE */ r1<sub>57:60</sub> ← new SW value /* replace SW, in r1 */ stdcx. PTE_dwd_0 ← r1 /* store dwd 0 of PTE if still reserved (new SW value, other fields unchanged) */ bne- loop /* loop if lost reservation */ ``` A *Iwarx/stwcx*. pair (specifying the low-order word of doubleword 0 of the PTE) can be used instead of the *Idarx /stdcx*. pair shown above. #### Modifying the Virtual Address If the virtual addrss translated by a valid PTE is to be modified and the new virtual address hashes to the same PTEG (or the same two PTEGs if the secondary Page Table search is enabled) as does the old virtual address, the following sequence can be used to modify the PTE, maintain a consistent state, ensure that the translation instantiated by the old entry is nolonger available, and ensure that a subsequent reference to the virtual address translated by the enew entry will use the correct real address and associated attributes. #### 5.10.1.3 Deleting a Page Table Entry The following sequence can be used to ensure that the translation instantiated by an existing entry is no longer available. ``` PTE<sub>V</sub> ← 0 /* (other fields don't matter) */ ptesync /* order update before tlbie and before next Page Table search */ tlbie(old_B,old_VA<sub>14:77-p</sub>,old_L,old_LP,old_AP) /*invalidate old translation*/ eieio /* order tlbie before tlbsync */ tlbsync /* order tlbie before ptesync */ ptesync /* order tlbie, tlbsync, and update before next data access */ ``` ### Chapter 6. Interrupts | 6.1 Overview 547 | 6.5.8 Alignment Interrupt562 | |------------------------------------------|--------------------------------------------| | 6.2 Interrupt Registers 548 | 6.5.9 Program Interrupt 563 | | 6.2.1 Machine Status Save/Restore Regis- | 6.5.10 Floating-Point Unavailable | | ters | Interrupt | | 6.2.2 Hypervisor Machine Status Save/ | 6.5.11 Decrementer Interrupt 565 | | Restore Registers 548 | 6.5.12 Hypervisor Decrementer | | 6.2.3 Data Address Register 548 | Interrupt | | 6.2.4 Hypervisor Data Address Register | 6.5.13 System Call Interrupt 565 | | 548 | 6.5.14 Trace Interrupt [Category: Trace] | | 6.2.5 Data Storage Interrupt | 565 | | Status Register 548 | 6.5.15 Hypervisor Data Storage Inter- | | 6.2.6 Hypervisor Data Storage Interrupt | rupt | | Status Register 549 | 6.5.16 Hypervisor Instruction Storage | | 6.2.7 Hypervisor Emulation Instruction | Interrupt | | Register [Category: Hypervisor Emula- | 6.5.17 Hypervisor Data Segment Inter- | | tion Assistance] 549 | rupt | | 6.2.8 Hypervisor Maintenance Exception | 6.5.18 Hypervisor Instruction Segment | | Register | Interrupt | | 6.2.9 Hypervisor Maintenance Exception | 6.5.19 Hypervisor Emulation Assis- | | Enable Register 549 | tance Interrupt [Category: Hypervisor | | 6.3 Interrupt Synchronization 550 | Emulation Assistance] 568 | | 6.4 Interrupt Classes 550 | 6.5.20 Hypervisor Maintenance Interrupt . | | 6.4.1 Precise Interrupt 550 | 568 | | 6.4.2 Imprecise Interrupt 550 | 6.5.21 Performance Monitor | | 6.4.3 Interrupt Processing 551 | Interrupt [Category: Server.Performance | | 6.4.4 Implicit alteration of HSRR0 and | Monitor] | | HSRR1 | 6.5.22 Vector Unavailable Interrupt [Cate- | | 6.5 Interrupt Definitions 555 | gory: Vector] | | 6.5.1 System Reset Interrupt 556 | 6.6 Partially Executed | | 6.5.2 Machine Check Interrupt 557 | Instructions 570 | | 6.5.3 Data Storage Interrupt 559 | 6.7 Exception Ordering 571 | | 6.5.4 Data Segment Interrupt 560 | 6.7.1 Unordered Exceptions 571 | | 6.5.5 Instruction Storage Interrupt 560 | 6.7.2 Ordered Exceptions 571 | | 6.5.6 Instruction Segment | 6.8 Interrupt Priorities 571 | | Interrupt | | | 6.5.7 External Interrupt 561 | | | | | #### 6.1 Overview The Power ISA provides an interrupt mechanism to allow the processor to change state as a result of external signals, errors, or unusual conditions arising in the execution of instructions. System Reset and Machine Check interrupts are not ordered. All other interrupts are ordered such that only one interrupt is reported, and when it is processed (taken) no program state is lost. Since Save/Restore Registers SRR0 and SRR1 are serially reusable resources used by most interrupts, program state may be lost when an unordered interrupt is taken. #### 6.2 Interrupt Registers #### 6.2.1 Machine Status Save/ Restore Registers When various interrupts occur, the state of the machine is saved in the Machine Status Save/Restore registers (SRR0 and SRR1). Section 6.5 describes which registers are altered by each interrupt. Figure 34. Save/Restore Registers SRR1 bits may be treated as reserved in a given implementation if they correspond to MSR bits that are reserved or are treated as reserved in that implementation or, for SRR1 bits in the range 33:36 and 42:47, they are specified as being set either to 0 or to an undefined value for all interrupts that set SRR1 (including implementation-dependent setting, e.g. by the Machine Check interrupt or by implementation-specific interrupts). # 6.2.2 Hypervisor Machine Status Save/Restore Registers When various interrupts occur, the state of the machine is saved in the Hypervisor Machine Status Save/Restore registers (HSRR0 and HSRR1). Section 6.5 describes which registers are altered by each interrupt. Figure 35. Hypervisor Save/Restore Registers HSRR1 bits may be treated as reserved in a given implementation if they correspond to MSR bits that are reserved or are treated as reserved in that implementation or, for HSRR1 bits in the range 33:36 and 42:47, they are specified as being set either to 0 or to an undefined value for all interrupts that set HSRR1 (including implementation-dependent setting, e.g. by implementation-specific interrupts). The HSRR0 and HSRR1 are hypervisor resources; see Chapter 2. #### Programming Note Execution of some instructions, and fetching instructions when $MSR_{IR}=1$ , may have the side effect of modifying HSRR0 and HSRR1; see Section 6.4.4. #### 6.2.3 Data Address Register The Data Address Register (DAR) is a 64-bit register that is set by the Machine Check, Data Storage, Data Segment, and Alignment interrupts; see Sections 6.5.2, 6.5.3, 6.5.4, and 6.5.8. In general, when one of these interrupts occurs the DAR is set to an effective address associated with the storage access that caused the interrupt, with the high-order 32 bits of the DAR set to 0 if the interrupt occurs in 32-bit mode. Figure 36. Data Address Register # 6.2.4 Hypervisor Data Address Register The Hypervisor Data Address Register (HDAR) is a 64-bit register that is set by the Hypervisor Data Storage and Hypervisor Data Segment interrupts; see Section 6.5.15 and Section 6.5.17. In general, when one of these interrupts occurs the HDAR is set to an effective address associated with the storage access that caused the interrupt, with the high-order 32 bits of the HDAR set to 0 if the interrupt occurs in 32-bit mode. Figure 37. Hypervisor Data Address Register # 6.2.5 Data Storage Interrupt Status Register The Data Storage Interrupt Status Register (DSISR) is a 32-bit register that is set by the Machine Check, Data Storage, Data Segment, and Alignment interrupts; see Sections 6.5.2, 6.5.3, 6.5.4, and 6.5.8. In general, when one of these interrupts occurs the DSISR is set to indicate the cause of the interrupt. Figure 38. Data Storage Interrupt Status Register DSISR bits may be treated as reserved in a given implementation if they are specified as being set either to 0 or to an undefined value for all interrupts that set the DSISR (including implementation-dependent set- ting, e.g. by the Machine Check interrupt or by implementation-specific interrupts). # 6.2.6 Hypervisor Data Storage Interrupt Status Register The Hypervisor Data Storage Interrupt Status Register (HDSISR) is a 32-bit register that is set by the Hypervisor Data Storage interrupt. In general, when one of these interrupts occurs the HDSISR is set to indicate the cause of the interrupt. Figure 39. Hypervisor Data Storage Interrupt Status Register # 6.2.7 Hypervisor Emulation Instruction Register [Category: Hypervisor Emulation Assistance] The Hypervisor Emulation Instruction Register (HEIR) is a 32-bit register that is set by the Hypervisor Emulation Assistance interrupt; see Section 6.5.19. The image of the instruction that caused the interrupt is loaded into the register. Figure 40. Hypervisor Emulation Instruction Register # 6.2.8 Hypervisor Maintenance Exception Register Each bit in the Hypervisor Maintenance Exception Register (HMER) is associated with one or more causes of the Hypervisor Maintenance exception, and is set when the associated exception(s) occur. If the corresponding bit in the Hypervisor Maintenance Exception Enable Register (HMEER) is set, a Hypervisor Maintenance Interrupt (HMI) may occur. If the processor is in a power-saving mode when the interrupt would have occurred, the processor will exit the power-saving mode; see Section 6.5.20 and Section 3.3.2. Figure 41. Hypervisor Maintenance Exception Register The contents of the HMER are as follows: O Set to 1 for a Malfunction Alert. - Set to 1 when performance is degraded for thermal reasons. - Set to 1 when processor recovery is invoked. Others Implementation-specific. When the *mtspr* instruction is executed with the HMER as the encoded Special Purpose Register, the contents of register RS are ANDed with the contents of the HMER and the result is placed into the HMER. The exception bits in the HMER are sticky; that is, once set to 1 they remain set to 1 until they are set to 0 by an *mthmer* instruction. #### Programming Note An access to the HMER is likely to be very slow. Software should access it sparingly. # 6.2.9 Hypervisor Maintenance Exception Enable Register The Hypervisor Maintenance Exception Enable Register (HMEER) is a 64-bit register in which each bit enables the corresponding exception in the HMER to cause the Hypervisor Maintenance interrupt, potentially causing exit from power-saving mode; see Section 6.5.20 and Section 3.3.2. Figure 42. Hypervisor Maintenance Exception Enable Register #### 6.3 Interrupt Synchronization When an interrupt occurs, SRR0 or HSRR0 is set to point to an instruction such that all preceding instructions have completed execution, no subsequent instruction has begun execution, and the instruction addressed by SRR0 or HSRR0 may or may not have completed execution, depending on the interrupt type. With the exception of System Reset and Machine Check interrupts, all interrupts are context synchronizing as defined in Section 1.5.1. System Reset and Machine Check interrupts are context synchronizing if they are recoverable (i.e., if bit 62 of SRR1 is set to 1 by the interrupt). If a System Reset or Machine Check interrupt is not recoverable (i.e., if bit 62 of SRR1 is set to 0 by the interrupt), it acts like a context synchronizing operation with respect to subsequent instructions. That is, a non-recoverable System Reset or Machine Check interrupt need not satisfy items 1 through 3 of Section 1.5.1, but does satisfy items 4 and 5. #### 6.4 Interrupt Classes Interrupts are classified by whether they are directly caused by the execution of an instruction or are caused by some other system exception. Those that are "system-caused" are: - System Reset - Machine Check - External - Decrementer - Hypervisor Decrementer - Hypervisor Maintenance External, Decrementer, Hypervisor Decrementer, and Hypervisor Maintenance interrupts are maskable interrupts. Therefore, software may delay the generation of these interrupts. System Reset and Machine Check interrupts are not maskable. "Instruction-caused" interrupts are further divided into two classes, *precise* and *imprecise*. #### 6.4.1 Precise Interrupt Except for the Imprecise Mode Floating-Point Enabled Exception type Program interrupt, all instruction-caused interrupts are precise. When the fetching or execution of an instruction causes a precise interrupt, the following conditions exist at the interrupt point. - SRR0 addresses either the instruction causing the exception or the immediately following instruction. Which instruction is addressed can be determined from the interrupt type and status bits. - 2. An interrupt is generated such that all instructions preceding the instruction causing the exception - appear to have completed with respect to the executing processor. - The instruction causing the exception may appear not to have begun execution (except for causing the exception), may have been partially executed, or may have completed, depending on the interrupt type. - Architecturally, no subsequent instruction has begun execution. #### 6.4.2 Imprecise Interrupt This architecture defines one imprecise interrupt, the Imprecise Mode Floating-Point Enabled Exception type Program interrupt. When an Imprecise Mode Floating-Point Enabled Exception type Program interrupt occurs, the following conditions exist at the interrupt point. - SRR0 addresses either the instruction causing the exception or some instruction following that instruction; see Section 6.5.9, "Program Interrupt" on page 563. - An interrupt is generated such that all instructions preceding the instruction addressed by SRR0 appear to have completed with respect to the executing processor. - The instruction addressed by SRR0 may appear not to have begun execution (except, in some cases, for causing the interrupt to occur), may have been partially executed, or may have completed; see Section 6.5.9. - No instruction following the instruction addressed by SRR0 appears to have begun execution. All Floating-Point Enabled Exception type Program interrupts are maskable using the MSR bits FE0 and FE1. Although these interrupts are maskable, they differ significantly from the other maskable interrupts in that the masking of these interrupts is usually controlled by the application program, whereas the masking of all other maskable interrupts is controlled by either the operating system or the hypervisor. ### 6.4.3 Interrupt Processing Associated with each kind of interrupt is an *interrupt vector*, which contains the initial sequence of instructions that is executed when the corresponding interrupt occurs. Interrupt processing consists of saving a small part of the processor's state in certain registers, identifying the cause of the interrupt in other registers, and continuing execution at the corresponding interrupt vector location. When an exception exists that will cause an interrupt to be generated and it has been determined that the interrupt will occur, the following actions are performed. The handling of Machine Check interrupts (see Section 6.5.2) differs from the description given below in several respects. - 1. SRR0 or HSRR0 is loaded with an instruction address that depends on the type of interrupt; see the specific interrupt description for details. - Bits 33:36 and 42:47 of SRR1 or HSRR1 are loaded with information specific to the interrupt type. - Bits 0:32, 37:41, and 48:63 of SRR1 or HSRR1 are loaded with a copy of the corresponding bits of the MSR. - 4. The MSR is set as shown in Figure 43 on page 555. In particular, MSR bits IR and DR are set to 0, disabling relocation, and MSR bit SF is set to 1, selecting 64-bit mode. The new values take effect beginning with the first instruction executed following the interrupt. - Instruction fetch and execution resumes, using the new MSR value, at the effective address specific to the interrupt type. These effective addresses are shown in Figure 44 on page 556. Interrupts do not clear reservations obtained with *Iwarx* or *Idarx*. ### Programming Note - In general, when an interrupt occurs, the following instructions should be executed by the operating system before dispatching a "new" program. - stwcx. or stdcx., to clear the reservation if one is outstanding, to ensure that a lwarx or ldarx in the interrupted program is not paired with a stwcx. or stdcx. in the "new" program. - sync, to ensure that all storage accesses caused by the interrupted program will be performed with respect to another processor before the program is resumed on that other processor. - isync or rfid, to ensure that the instructions in the "new" program execute in the "new" context. ### **Programming Note** For instruction-caused interrupts, in some cases it may be desirable for the operating system to emulate the instruction that caused the interrupt, while in other cases it may be desirable for the operating system not to emulate the instruction. The following list, while not complete, illustrates criteria by which decisions regarding emulation should be made. The list applies to general execution environments; it does not necessarily apply to special environments such as program debugging, processor bring-up, etc. In general, the instruction should be emulated if: - The interrupt is caused by a condition for which the instruction description (including related material such as the introduction to the section describing the instruction) implies that the instruction works correctly. Example: Alignment interrupt caused by *Imw* for which the storage operand is not aligned, or by *dcbz* for which the storage operand is in storage that is Write Through Required or Caching Inhibited. - The instruction is an illegal instruction that should appear, to the program executing it, as if it were supported by the implementation. Example: An Illegal Instruction type Program interrupt (or a Hypervisor Emulation Assistance interrupt if Category: HEA is supported) is caused by an instruction that has been phased out of the architecture but is still used by some programs that the operating system supports, or by an instruction that is in a category that the implementation does not support but is used by some programs that the operating system supports. In general, the instruction should not be emulated if: - The purpose of the instruction is to cause an interrupt. Example: System Call interrupt caused by sc. - The interrupt is caused by a condition that is stated, in the instruction description, potentially to cause the interrupt. Example: Alignment interrupt caused by *Iwarx* for which the storage operand is not aligned. - The program is attempting to perform a function that it should not be permitted to perform. Example: Data Storage interrupt caused by *Iwz* for which the storage operand is in storage that the program should not be permitted to access. (If the function is one that the program should be permitted to perform, the conditions that caused the interrupt should be corrected and the program re-dispatched such that the instruction will be re-executed. Example: Data Storage interrupt caused by *Iwz* for which the storage operand is in storage that the program should be permitted to access but for which there currently is no PTE that satisfies the Page Table search.) #### Programming Note If a program modifies an instruction that it or another program will subsequently execute and the execution of the instruction causes an interrupt, the state of storage and the content of some processor registers may appear to be inconsistent to the interrupt handler program. For example, this could be the result of one program executing an instruction that causes a Hypervisor Emulation Assistance interrupt if Category: HEA is supported or the Illegal Instruction type Program interrupt if Category: HEA is not supported just before another instance of the same program stores an Add Immediate instruction in that storage location. To the interrupt handler code, it would appear that a processor generated the interrupt as the result of executing a valid instruction. ### Programming Note - In order to handle Machine Check and System Reset interrupts correctly, the operating system should manage MSR<sub>RI</sub> as follows. - In the Machine Check and System Reset interrupt handlers, interpret SRR1 bit 62 (where MSR<sub>RI</sub> is placed) as: - 0: interrupt is not recoverable - 1: interrupt is recoverable - In each interrupt handler, when enough state has been saved that a Machine Check or System Reset interrupt can be recovered from, set MSR<sub>RI</sub> to 1. - In each interrupt handler, do the following (in order) just before returning. - 1. Set MSR<sub>RI</sub> to 0. - Set SRR0 and SRR1 to the values to be used by *rfid*. The new value of SRR1 should have bit 62 set to 1 (which will happen naturally if SRR1 is restored to the value saved there by the interrupt, because the interrupt handler will not be executing this sequence unless the interrupt is recoverable). - 3. Execute rfid. For interrupts that set the SRRs other than Machine Check or System Reset, MSR<sub>RI</sub> can be managed similarly when these interrupts occur within interrupt handlers for other interrupts that set the SRRs. This Note does not apply to interrupts that set the HSRRs because these interrupts put the processor into hypervisor state, and either do not occur or can be prevented from occurring within interrupt handlers for other interrupts that set the HSRRs. ## 6.4.4 Implicit alteration of HSRR0 and HSRR1 Executing some of the more complex instructions may have the side effect of altering the contents of HSRR0 and HSRR1. The instructions listed below are guaranteed not to have this side effect. Any omission of instruction suffixes is significant; e.g., **add** is listed but **add.** is excluded. 1. Branch instructions b[I][a], bc[I][a], bcIr[I], bcctr[I] 2. Fixed-Point Load and Store Instructions lbz, lbzx, lhz, lhzx, lwz, lwzx, ld<64>, ldx<64>, stb, stbx, sth, sthx, stw, stwx, std<64>, stdx<64> Execution of these instructions is guaranteed not to have the side effect of altering HSRR0 and HSRR1 only if the storage operand is aligned and $MSR_{DR}=0$ . 3. Arithmetic instructions addi, addis, add, subf, neg 4. Compare instructions cmpi, cmp, cmpli, cmpl 5. Logical and Extend Sign instructions ori, oris, xori, xoris, and, or, xor, nand, nor, eqv, andc, orc, extsb, extsh, extsw 6. Rotate and Shift instructions rldicl<64>, rldicr<64>, rldic<64>, rlwinm, rldcl<64>, rldcr<64>, rlwnm, rldimi<64>, rlwimi, sld<64>, slw, srd<64>, srw 7. Other instructions isync rfid, hrfid mtspr, mfspr, mtmsrd, mfmsr #### Programming Note - Instructions excluded from the list include the following. - instructions that set or use XER<sub>CA</sub> - instructions that set XER<sub>OV</sub> or XER<sub>SO</sub> - andi., andis., and fixed-point instructions with Rc=1 (Fixed-point instructions with Rc=1 can be replaced by the corresponding instruction with Rc=0 followed by a Compare instruction.) - all floating-point instructions - mftb These instructions, and the other excluded instructions, may be implemented with the assistance of implementation-specific interrupts that modify HSRR0 and HSRR1. The included instructions are guaranteed not to be implemented thus. (The included instructions are sufficiently simple as to be unlikely to need such assistance. Moreover, they are likely to be needed in interrupt handlers before HSRR0 and HSRR1 have been saved or after HSRR0 and HSRR1 have been restored.) Similarly, fetching instructions may have the side effect of altering the contents of HSRR0 and HSRR1 unless MSR<sub>IR</sub>=0. ## **6.5 Interrupt Definitions** Figure 43 shows all the types of interrupts and the values assigned to the MSR for each. Figure 44 shows the effective address of the interrupt vector for each interrupt type. (Section 5.7.4 on page 512 summarizes all architecturally defined uses of effective addresses, including those implied by Figure 44.) | u | ues assigned to the MSR for each. Figure 44 shows the | inc | ludi | ng t | hose | imp | olied | by l | Fig | |--------|------------------------------------------------------------|-------------|-------|------|-------|------|-------|-------|------| | Ī | nterrupt Type | MSF<br>IR [ | | | FE1 | EE | RII | ME I | ٦V | | S | System Reset | 0 | 0 | 0 | 0 | 0 | 0 | р | 1 | | | Machine Check | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | | | Data Storage | 0 | 0 | 0 | 0 | 0 | 0 | _ | m | | | Data Segment | 0 | 0 | 0 | 0 | 0 | 0 | _ | m | | | nstruction Storage | 0 | 0 | 0 | 0 | 0 | 0 | - | m | | I | nstruction Segment | 0 | 0 | 0 | 0 | 0 | 0 | _ | m | | E | External | 0 | 0 | 0 | 0 | 0 | h | _ | е | | | Alignment | 0 | 0 | 0 | 0 | 0 | 0 | _ | m | | | Program | 0 | 0 | 0 | 0 | 0 | 0 | _ | m | | F | FP Unavailable | 0 | 0 | 0 | 0 | 0 | 0 | _ | m | | | Decrementer | 0 | 0 | 0 | 0 | 0 | 0 | _ | m | | H | Hypervisor Decrementer | 0 | 0 | 0 | 0 | 0 | _ | _ | 1 | | 5 | System Call | 0 | 0 | 0 | 0 | 0 | 0 | _ | S | | 1 | Trace | 0 | 0 | 0 | 0 | 0 | 0 | - | m | | H | Hypervisor Data Storage | 0 | 0 | 0 | 0 | 0 | - | - | 1 | | H | Hypervisor Instr. Storage. | 0 | 0 | 0 | 0 | 0 | - | - | 1 | | H | Hypervisor Instr. Segment | 0 | 0 | 0 | 0 | 0 | - | - | 1 | | F | Hypervisor Data Segment | 0 | 0 | 0 | 0 | 0 | - | - | 1 | | ŀ | Hypv Em'n Assistance | 0 | 0 | 0 | 0 | 0 | - | - | 1 | | F | Hypervisor Maintenance | 0 | 0 | 0 | 0 | 0 | - | - | 1 | | | Performance Monitor | 0 | 0 | 0 | 0 | 0 | 0 | - | m | | ١ | Vector Unavailable <sup>1</sup> | 0 | 0 | 0 | 0 | 0 | 0 | - | m | | ,<br>, | | U | 0 | - | - | - | | | - 11 | | 1 | | | | | | | | | | | . 1 | | was | in po | owe | r-sav | /ina | mod | de: c | oth- | | þ | erwise not altered | | | | | 9 | | , - | | | ' - | bit is not altered | | | | | | | | | | r | m if LPES <sub>1</sub> =0, set to 1; otherwise not altered | | | | | | | | | | e | , , | | | | | | | | | | l þ | 0 , , | | | | | | | | | | S | 1 / / | itere | d | | | | | | | | | Settings for Other Bits | | | | | | | | | Settings for Other Bits Bits BE, FP, PMM, PR, SE, and VEC<sup>1</sup>are set to 0. If the interrupt results in HV being equal to 1, the LE bit is copied from the HILE bit; otherwise the LE bit is copied from the LPCR $_{\rm ILE}$ bit. The SF bit is set to 1. Reserved bits are set as if written as 0. Category: Vector Figure 43. MSR setting due to interrupt | Effective | _ | | | | | |----------------------|---------------------------------|--|--|--|--| | Address <sup>1</sup> | Interrupt Type | | | | | | 000000_0100 | System Reset | | | | | | 000000_0200 | Machine Check | | | | | | 000000_0300 | Data Storage | | | | | | 000000_0380 | Data Segment | | | | | | 000000_0400 | Instruction Storage | | | | | | 000000_0480 | Instruction Segment | | | | | | 000000_0500 | External | | | | | | 000000_0600 | Alignment | | | | | | 000000_0700 | Program | | | | | | 000000_0800 | Floating-Point Unavailable | | | | | | 000000_0900 | Decrementer | | | | | | 000000_0980 | Hypervisor Decrementer | | | | | | 00000_0A00 | Reserved | | | | | | 000000_0B00 | Reserved | | | | | | 000000_0C00 | System Call | | | | | | 000000_0D00 | Trace | | | | | | 000000_0E00 | Hypervisor Data Storage | | | | | | 000000_0E10 | Hypervisor Instruction Storage | | | | | | 000000_0E20 | Hypervisor Data Segment | | | | | | 000000_0E30 | Hypervisor Instruction Segment | | | | | | 000000_0E40 | Hypervisor Emulation Assistance | | | | | | 000000_0E50 | Hypervisor Maintenance | | | | | | 000000_0E60 | Reserved | | | | | | | | | | | | | 000000_0EFF | Reserved | | | | | | 000000_0F00 | Performance Monitor | | | | | | 000000_0F10 | Reserved | | | | | | 000000_0F20 | Vector Unavailable <sup>3</sup> | | | | | | 000000_0F30 | Reserved | | | | | | | | | | | | | 000000_0FFF | | | | | | | | Effective Address column are | | | | | | interpreted as fol | interpreted as follows. | | | | | ■ 00...0000\_nnnn *means* 0x0000\_0000\_0000\_nnnn Effective addresses 0x0000\_0000\_0000\_0000 through 0x0000\_0000\_0000\_00FF are used by software and will not be assigned as interrupt vectors. Category: Vector. Figure 44. Effective address of interrupt vector by interrupt type #### Programming Note When address translation is disabled, use of any of the effective addresses that are shown as reserved in Figure 44 risks incompatibility with future implementations. #### 6.5.1 **System Reset Interrupt** If a System Reset exception causes an interrupt that is not context synchronizing or causes the loss of a Machine Check exception or a Direct External exception, or if the state of the processor has been corrupted, the interrupt is not recoverable. When the processor is in any power-saving level, a System Reset interrupt occurs when a System Reset exception exists. When the processor is in doze or nap power-saving levels, a System Reset interrupt occurs when any of the following exceptions exists provided that the exception is enabled to cause exit from power saving mode (see Section 2.2, "Logical Partitioning Control Register (LPCR)"). When the processor is in sleep or rvwinkle power-saving level, it is implementation-specific whether the following exceptions, when enabled, cause exit, or whether only a system-reset causes exit. - External - Decrementer - Hypervisor Maintenance - Implementation-specific SRR1 indicates the exception that caused exit from power-saving mode as specified below. The following registers are set: SRR0 If the interrupt did not occur when the processor was in power-saving mode, set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present; otherwise, set to an undefined value. ### SRR1 33:36 Set to 0. 42:44 If the interrupt did not occur when the processor was in power-saving mode, set to implementation-specific value. If the interrupt occurred when the processor was in power-saving mode, set to indicate the exception that caused exit from power-saving mode as shown below: If multiple exceptions that cause exit from power-saving mode exist, the exception reported is the exception corresponding to the interrupt that would have occurred if the same exceptions existed and the processor was not in power-saving mode. **45** Set to 0 46:47 Set to indicate whether the interrupt occurred when the processor was in power-saving mode and, if so, the extent to which resource state was maintained while the processor was in power-saving mode, as follows: - The interrupt did not occur when the processor was in power-saving mode. - O1 The interrupt occurred when the processor was in power-saving mode. The state of all resources was maintained as if the processor was not in power-saving mode. - The interrupt occurred when the processor was in power-saving mode. The state of some resources was not maintained, but the state of all hypervisor resources was maintained as if the processor was not in power-saving mode and the state of all other resources is such that the hypervisor can resume execution. - 11 The interrupt occurred when the processor was in power-saving mode. The state of some resources was not maintained, and the state of some hypervisor resources was not maintained or the state of some resources is such that the hypervisor cannot resume execution. ### Programming Note - Although the resources that are maintained in power-saving mode (except in doze power-saving level) are implementation-dependent, the hypervisor can avoid implementation-dependence in the portion of the System Reset and Machine Check interrupt handlers that recover from having been in power-saving mode by using the contents of SRR146:47, to determine what state to restore. To avoid implementation-dependence in the portion of the hypervisor that enters power-saving mode, the hypervisor must use the specification of the four instructions to determine what state to save. If the interrupt did not occur while the processor was in power-saving mode, loaded from bit 62 of the MSR if the processor is in a recoverable state; otherwise set to 0. If the interrupt occurred while the processor was in power-saving mode, set to 1 if the processor is in a recoverable state; otherwise set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. In addition, if the interrupt occurs when the processor is in power-saving mode and is caused by an exception other than a System Reset exception, all other registers, except HSRR0 and HSRR1, that would be set by the corresponding interrupt if the exception occurred when the processor was not in power-saving mode are set by the System Reset interrupt, and are set to the values to which they would be set if the exception occurred when the processor was not in power-saving mode. Execution resumes at effective address 0x0000\_0000\_0000\_0100. The means for software to distinguish between poweron Reset and other types of System Reset are implementation-dependent. ### 6.5.2 Machine Check Interrupt The causes of Machine Check interrupts are implementation-dependent. For example, a Machine Check interrupt may be caused by a reference to a storage location that contains an uncorrectable error or does not exist (see Section 5.6), or by an error in the storage subsystem. When the processor is not in power-saving mode, Machine Check interrupts are enabled when MSR<sub>ME</sub>=1; if MSR<sub>ME</sub>=0 and a Machine Check exception occurs, the processor enters the Checkstop state. When the processor is in doze or nap power-saving levels, Machine Check interrupts are treated as enabled when LPCR<sub>PECE[2]</sub>=1 and cannot occur when LPCR<sub>PECF[2]</sub>=0. When the processor is in sleep or rvwinkle power-saving level, it is implementation-specific whether Machine Check interrupts are treated as enabled under the same conditions as in doze and nap power-saving level or if they cannot occur. If a Machine Check exception occurs while the processor is in power-saving mode and the Machine Check exception is not enabled to cause exit from power-saving mode, the result is implementation specific The Checkstop state may also be entered if an access is attempted to a storage location that does not exist (see Section 5.6), or if an implementation-dependant hardware error occurs that prevents processor operation. **Disabled Machine Check (Checkstop State)** When a processor is in Checkstop state, instruction processing is suspended and generally cannot be restarted without resetting the processor. Some implementations may preserve some or all of the internal state of the processor when entering Checkstop state, so that the state can be analyzed as an aid in problem determination. #### **Enabled Machine Check** If a Machine Check exception causes an interrupt that is not context synchronizing or causes the loss of a Direct External exception, or if the state of the processor has been corrupted, the interrupt is not recoverable. In some systems, the operating system may attempt to identify and log the cause of the Machine Check. The following registers are set: SRR0 If the interrupt did not occur while the processor was in power-saving mode, set on a "best effort" basis to the effective address of some instruction that was executing or was about to be executed when the Machine Check exception occurred; otherwise set to an undefined value. ### **Programming Note** Since the hypervisor can save the address of the instruction following the power-saving mode instruction if needed, there is no need for the processor to preserve it and store it into SRRO. Therefore, for ease of implementation, the contents of SRRO upon exit from power-saving mode are specified to be undefined. ### SRR1 46:47 Set to indicate whether the interrupt occurred when the processor was in power-saving mode and, if so, the extent to which resource state was maintained while the processor was in power-saving mode, as follows. - 00 The interrupt did not occur when the processor was in power-saving mode. - O1 The interrupt occurred when the processor was in power-saving mode. The state of all resources was maintained as if the processor was not in power-saving mode. - The interrupt occurred when the processor was in power-saving mode. The state of some resources was not maintained, but the state of all hypervisor resources was maintained as if the processor was not in power-saving mode and the state of all other resources is such that the hypervisor can resume execution. - The interrupt occurred when the processor was in power-saving mode. The state of some resources was not maintained, and the state of some hypervisor resources was not maintained or the state of some resources is such that the hypervisor cannot resume execution. ### **Programming Note** - Although the resources that are maintained in power-saving mode (except in the doze power-saving level) are implementation-dependent, the hypervisor can avoid implementation-dependence in the portion of the System Reset and Machine Check interrupt handlers that recover from having been in power-saving mode by using the contents of SRR1<sub>46:47</sub>, to determine what state to restore. (To avoid implementation-dependence in the portion of the hypervisor that enters power-saving mode, the hypervisor must use the specification of the four instructions to determine what state to save) If the interrupt did not occur while the processor was in power-saving mode, loaded from bit 62 of the MSR if the processor is in a recoverable state; otherwise set to 0. If the interrupt occurred while the processor was in power-saving mode, set to 1 if the processor is in a recoverable state; otherwise set to 0. Others Set to an implementation-dependent value. MSR See Figure 43. **DSISR** Set to an implementation-dependent value. **DAR** Set to an implementation-dependent value. Execution resumes at effective address 0x0000\_0000\_0000\_0200. A Machine Check interrupt caused by the existence of multiple SLB entries or TLB entries (or similar entries in implementation-specific translation caches) which translate a given effective or virtual address (see Sections 5.7.6.2 and 5.7.7.3.) must occur while still in the context of the partition that caused it. The interrupt must be presented in a way that permits continuing execution, with damage limited to the causing partition. Treating the exception as instruction-caused will achieve these requirements. ### Programming Note If a Machine Check interrupt is caused by an error in the storage subsystem, the storage subsystem may return incorrect data, which may be placed into registers. This corruption of register contents may occur even if the interrupt is recoverable. ### 6.5.3 Data Storage Interrupt A Data Storage interrupt occurs when no higher priority exception exists, the value of the expression $$(MSR_{HV PR} = 0b10) \mid (\neg VPM_0 \& \neg MSR_{DR})$$ $$\mid (\neg VPM_1 \& MSR_{DR})$$ is 1, and a data access cannot be performed for any of the following reasons. - Data address translation is enabled (MSR<sub>DR</sub>=1) and the virtual address of any byte of the storage location specified by a Load, Store, icbi, dcbz, dcbst, dcbf[I], eciwx, or ecowx instruction cannot be translated to a real address. - The effective address specified by a *Iq*, *stq*, *Iwarx*, *Idarx*, *stwcx*., or *stdcx*. instruction refers to storage that is Write Through Required or Caching Inhibited. - The access violates storage protection. - A Data Address Breakpoint match occurs. - Execution of an *eciwx* or *ecowx* instruction is disallowed because EAR<sub>E</sub>=0. If a **stwcx.** or **stdcx.** would not perform its store in the absence of a Data Storage interrupt, and either (a) the specified effective address refers to storage that is Write Through Required or Caching Inhibited, or (b) a non-conditional *Store* to the specified effective address would cause a Data Storage interrupt, it is implementation-dependent whether a Data Storage interrupt If the contents of the XER specifies a length of zero bytes for a *Move Assist* instruction, a Data Storage interrupt does not occur for reasons of address translation, or storage protection. If such an instruction causes a Data Storage interrupt for other reasons, the setting of the DSISR and DAR reflects only these other reasons listed in the preceding sentence. (E.g., if such an instruction causes a storage protection violation and a Data Address Breakpoint match, the DSISR and DAR are set as if the storage protection violation did not occur.) The following registers are set: SRR0 Set to the effective address of the instruction that caused the interrupt. SRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43. **DSISR** **32** Set to 0. 33 Set to 1 if MSR<sub>DR</sub>=1 and the translation for an attempted access is not found in the Page Table; otherwise set to 0.. **34:35** Set to 0. 36 Set to 1 if the access is not permitted by Figure 26 or 27, as appropriate; otherwise set to 0. 37 Set to 1 if the access is due to a *Iq*, *stq*, *Iwarx*, *Idarx*, *stwcx.*, or *stdcx.* instruction that addresses storage that is Write Through Required or Caching Inhibited; otherwise set to 0. 38 Set to 1 for a *Store*, *dcbz*, or *ecowx* instruction; otherwise set to 0. **39:40** Set to 0. 41 Set to 1 if a Data Address Breakpoint match occurs; otherwise set to 0. Set to 1 if the access is not permitted by virtual page class key protection; otherwise set to 0. Set to 1 if execution of an eciwx or ecowx instruction is attempted when EAR<sub>E</sub>=0; otherwise set to 0. **44:63** Set to 0. DAR Set to the effective address of a storage element as described in the following list. The list should be read from the top down; the DAR is set as described by the first item that corresponds to an exception that is reported in the DSISR. For example, if a *Load* instruction causes a storage protection violation and a Data Address Breakpoint match (and both are reported in the DSISR), the DAR is set to the effective address of a byte in the first aligned doubleword for which access was attempted in the page that caused the exception. - a Data Storage exception occurs for reasons other than a Data Address Breakpoint match or, for eciwx and ecowx, EAR<sub>F</sub>=0 - a byte in the block that caused the exception, for a Cache Management instruction - a byte in the first aligned doubleword for which access was attempted in the page that caused the exception, for a Load, Store, eciwx, or ecowx instruction ("first" refers to address order; see Section 6.7) undefined, for a Data Address Breakpoint match, or if eciwx or ecowx is executed when EAR<sub>E</sub>=0 For the cases in which the DAR is specified above to be set to a defined value, if the interrupt occurs in 32-bit mode the high-order 32 bits of the DAR are set to 0. If multiple Data Storage exceptions occur for a given effective address, any one or more of the bits corresponding to these exceptions may be set to 1 in the DSISR. Execution resumes at effective address 0x0000 0000 0000 0300. ### 6.5.4 Data Segment Interrupt A Data Segment interrupt occurs when no higher priority exception exists and a data access cannot be performed because data address translation is enabled and the effective address of any byte of the storage location specified by a Load, Store, ichi, dcbz, dcbst, dcbf[I] eciwx, or ecowx instruction cannot be translated to a virtual address. If a **stwcx.** or **stdcx.** would not perform its store in the absence of a Data Segment interrupt, and a non-conditional *Store* to the specified effective address would cause a Data Segment interrupt, it is implementation-dependent whether a Data Segment interrupt occurs. If a *Move Assist* instruction has a length of zero (in the XER), a Data Segment interrupt does not occur, regardless of the effective address. The following registers are set: SRR0 Set to the effective address of the instruction that caused the interrupt. SRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43. **DSISR** Set to an undefined value. DAR Set to the effective address of a storage element as described in the following list. - a byte in the block that caused the Data Segment interrupt, for a Cache Management instruction - a byte in the first aligned doubleword for which access was attempted in the segment that caused the Data Segment interrupt, for a Load, Store, eciwx, or ecowx instruction ("first" refers to address order; see Section 6.7) If the interrupt occurs in 32-bit mode, the high-order 32 bits of the DAR are set to 0. Execution resumes at effective address 0x0000 0000 0000 0380. ### **Programming Note** A Data Segment interrupt occurs if MSR<sub>DR</sub>=1 and the translation of the effective address of any byte of the specified storage location is not found in the SLB (or in any implementation-specific address translation lookaside information). ### 6.5.5 Instruction Storage Interrupt An Instruction Storage interrupt occurs when no higher priority exception exists, the value of the expression $$(MSR_{HV PR} = 0b10) \mid (\neg VPM_0 \& \neg MSR_{IR})$$ $$\mid (\neg VPM_1 \& MSR_{IR})$$ is 1, and the next instruction to be executed cannot be fetched for any of the following reasons. - Instruction address translation is enabled and the virtual address cannot be translated to a real address. - The fetch access violates storage protection. The following registers are set: SRR0 Set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present (if the interrupt occurs on attempting to fetch a branch target, SRR0 is set to the branch target address). SRR1 33 Set to 1 if MSR<sub>IR</sub>=1 and the translation for an attempted access is not found in the Page Table; otherwise set to 0. **34** Set to 0. 35 Set to 1 if the access is to No-execute or Guarded storage; otherwise set to 0. Set to 1 if the access is not permitted by Figure 26, or 27, as appropriate; otherwise set to 0. ### **Programming Note** Storage protection violations for the Data Storage Interrupt are reported in $DSISR_{36}$ and $DSISR_{42}$ , whereas storage protection violations for the Instruction Storage Interrupt are reported in $SRR1_{35}$ and $SRR1_{36}$ . **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43. If multiple Instruction Storage exceptions occur due to attempting to fetch a single instruction, any one or more of the bits corresponding to these exceptions may be set to 1 in SRR1. Execution resumes at effective address 0x0000 0000 0000 0400. ## 6.5.6 Instruction Segment Interrupt An Instruction Segment interrupt occurs when no higher priority exception exists and the next instruction to be executed cannot be fetched because instruction address translation is enabled and the effective address cannot be translated to a virtual address. The following registers are set: SRR0 Set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present (if the interrupt occurs on attempting to fetch a branch target, SRR0 is set to the branch target address). SRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000\_0000\_0000\_0480. #### Programming Note An Instruction Segment interrupt occurs if $MSR_{IR}=1$ and the translation of the effective address of the next instruction to be executed is not found in the SLB (or in any implementation-specific address translation lookaside information). ### 6.5.7 External Interrupt An External interrupt is classified as being either a Direct External interrupt or a Mediated External interrupt. Throughout this Book, usage of the phrase "External interrupt', without further classification, refers to both a Direct External interrupt and a Mediated External interrupt. ### 6.5.7.1 Direct External Interrupt A Direct External interrupt occurs when no higher priority exception exists, a Direct External exception exists, and the value of the expression $MSR_{FF} \mid (\neg(LPES_0) \& (\neg(MSR_{HV}) \mid MSR_{PR}))$ is one. The occurrence of the interrupt does not cause the exception to cease to exist. When LPES<sub>0</sub>=0, the following registers are set: **HSRR0** Set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present. HSRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. When LPES<sub>0</sub>=1, the following registers are set: SRR0 Set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present. SRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000 0000 0000 0500. ### Programming Note - Because the value of MSR<sub>EE</sub> is always 1 when the processor is in problem state, the simpler expression $MSR_{FF} \mid \neg (LPES_0 \mid MSR_{HV})$ is equivalent to the expression given above. ### Programming Note - The Direct External exception has the same meaning as the External exception in versions of the architecture prior to Version 2.05. ### 6.5.7.2 Mediated External Interrupt A Mediated External interrupt occurs when no higher priority exception exists, a Mediated External exception exists (see the definition of LPCR<sub>MER</sub> in Section 2.2), and the value of the expression $$MSR_{EE} \& (\neg (MSR_{HV}) | MSR_{PR})$$ is one. The occurrence of the interrupt does not cause the exception to cease to exist. When LPES<sub>0</sub>=0, the following registers are set: **HSRR0** Set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present. #### HSRR1 33:36 Set to 0. 42 Set to 1. 43:47 Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. When LPES $_0$ =1, the following registers are set: **SRR0** Set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present. #### SRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000\_0000\_0000\_0500. ### 6.5.8 Alignment Interrupt An Alignment interrupt occurs when no higher priority exception exists and a data access cannot be performed for any of the following reasons. - The operand of a floating-point *Load* or *Store* is not word-aligned, or crosses a virtual page boundary. - The operand of Iq, stq, Imw, stmw, Iwarx, Idarx, stwcx., stdcx., eciwx, ecowx, fdp, Ifdpx, stfdp, or stfdpx is not aligned. - The operand of a single-register Load or Store is not aligned and the processor is in Little-Endian mode. - The instruction is Iq, stq, Imw, stmw, Iswi, Iswx, stswi, or stswx, and the operand is in storage that is Write Through Required or Caching Inhibited, or the processor is in Little-Endian mode. - The operand of a Load or Store crosses a segment boundary, or crosses a boundary between virtual pages that have different storage control attributes. - The operand of a Load or Store is not aligned and is in storage that is Write Through Required or Caching Inhibited. - The operand of *Ifdp*, *Ifdpx*, *stfdp*, *stfdpx*, *dcbz*, *Iwarx*, *Idarx*, *stwcx*., or *stdcx*. is in storage that is Write Through Required or Caching Inhibited. If a **stwcx**. or **stdcx**. would not perform its store in the absence of an Alignment interrupt and the specified effective address refers to storage that is Write Through Required or Caching Inhibited, it is implementation-dependent whether an Alignment interrupt occurs. Setting the DSISR and DAR as described below is optional for implementations on which Alignment interrupts occur rarely, if ever, for cases that the Alignment interrupt handler emulates. For such implementations, if the DSISR and DAR are not set as described below they are set to undefined values. The following registers are set: **SRR0** Set to the effective address of the instruction that caused the interrupt. #### SRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43. #### **DSISR** 32:43 Set to 0. **44:45** Set to bits 30:31 of the instruction if DS-form. Set to 0b00 if D-, or X-form. **46** Set to 0. **47:48** Set to bits 29:30 of the instruction if X-form. Set to 0b00 if D- or DS-form. 49 Set to bit 25 of the instruction if X-form. Set to bit 5 of the instruction if D- or DS-form. 50:53 Set to bits 21:24 of the instruction if X-form. Set to bits 1:4 of the instruction if D- or DS-form **54:58** Set to bits 6:10 of the instruction (RT/RS/FRT/FRS), except undefined for *dcbz*. 59:63 Set to bits 11:15 of the instruction (RA) for update form instructions; set to either bits 11:15 of the instruction or to any register number not in the range of registers to be loaded for a valid form *Imw*, a valid form *Iswi*, or a valid form *Iswx* for which neither RA nor RB is in the range of registers to be loaded; otherwise undefined. DAR Set to the effective address computed by the instruction, except that if the interrupt occurs in 32-bit mode the high-order 32 bits of the DAR are set to 0. For an X-form *Load* or *Store*, it is acceptable for the processor to set the DSISR to the same value that would have resulted if the corresponding D- or DS-form instruction had caused the interrupt. Similarly, for a D- or DS-form *Load* or *Store*, it is acceptable for the processor to set the DSISR to the value that would have resulted for the corresponding X-form instruction. For example, an unaligned *Iwax* (that crosses a protection boundary) would normally, following the description above, cause the DSISR to be set to binary: 00000000000 00 0 01 0 0101 ttttt ????? where "ttttt" denotes the RT field, and "?????" denotes an undefined 5-bit value. However, it is acceptable if it causes the DSISR to be set as for *Iwa*, which is ``` 000000000000 10 0 00 0 1101 ttttt ????? ``` If there is no corresponding alternative form instruction (e.g., for *Iwaux*), the value described above is set in the DSISR. The instruction pairs that may use the same DSISR value are. | lhz/lhzx<br>lwz/lwzx | lhzu/lhzux<br>lwzu/lwzux | lha/lh<br>lwa/lv | | lhau | /lhaux | |---------------------------------|--------------------------------------|-------------------|----------|-----------------|------------------| | ld/ldx<br>lsth/sthx<br>std/stdx | ldu/ldux<br>sthu/sthux<br>stdu/stdux | stw/s | twx | stwu | /stwux | | lfs/lfsx<br>stfs/stfsx | lfsu/lfsux<br>stfsu/stfsux | lfd/lfd<br>stfd/s | | lfdu/l<br>stfdu | fdux<br>ı/stfdux | | Execution 0x0000 0000 | resumes<br>0000 0600. | at | effectiv | e e | address | ### Programming Note - The architecture does not support the use of an unaligned effective address by *Iwarx*, *Idarx*, *stwcx.*, *stdcx.*, *eciwx*, and *ecowx*. If an Alignment interrupt occurs because one of these instructions specifies an unaligned effective address, the Alignment interrupt handler must not attempt to simulate the instruction, but instead should treat the instruction as a programming error. ### 6.5.9 Program Interrupt A Program interrupt occurs when no higher priority exception exists and one of the following exceptions arises during execution of an instruction: #### Floating-Point Enabled Exception A Floating-Point Enabled Exception type Program interrupt is generated when the value of the expression is 1. $FPSCR_{FEX}$ is set to 1 by the execution of a floating-point instruction that causes an enabled exception, including the case of a *Move To FPSCR* instruction that causes an exception bit and the corresponding enable bit both to be 1. ### **Illegal Instruction** An Illegal Instruction type Program interrupt is generated when execution is attempted of an illegal instruction, or of a reserved instruction or an instruction that is not provided by the implementation. It is also generated under the following conditions. - an mtspr or mfspr instruction is executed when MSR<sub>PR</sub>=1 if the instruction specifies an SPR with SPR<sub>0</sub>=0 that is not provided by the implementation - an mtspr or mfspr instruction is executed when MSR<sub>PR</sub>=1 if the instruction specifies SPR 0 - an mfspr instruction is executed when MSR<sub>PR</sub>=0 if the instruction specifies SPR 4 ,5, or 6. An Illegal Instruction type Program interrupt may be generated when execution is attempted of any of the following kinds of instruction. - an instruction that is in invalid form - an Iswx instruction for which RA or RB is in the range of registers to be loaded Note: If the Hypervisor Emulation Assistance category is supported, see the following Programming Note. ### Programming Note When the Hypervisor Emulation Assistance category is supported, the hardware will not generate this interrupt, but instead will generate a Hypervisor Emulation Assistance interrupt. See Section 6.5.19. The hypervisor must then present this interrupt to operating system software as if it were an Illegal Instruction type Program interrupt when the instruction that caused the interrupt is truely illegal, rather than one to be emulated. ### **Privileged Instruction** The following applies if the instruction is executed when $MSR_{PR} = 1$ . A Privileged Instruction type Program interrupt is generated when execution is attempted of a privileged instruction, or of an mtspr or mfspr instruction with an SPR field that contains one of the defined values having $spr_0=1$ . It may be generated when execution is attempted of an mtspr or mfspr instruction with an SPR field that contains a value having $spr_0=1$ . The following applies if the instruction is executed when $MSR_{HV\ PR} = 0b00$ . A Privileged Instruction type Program interrupt is generated when execution is attempted of an *mtspr* or *mfspr* instruction with an SPR field that designates an SPR that is accessible by the instruction only when the processor is in hypervisor state, or when execution of a hypervisor-privileged instruction is attempted. ### Programming Note - These are the only cases in which a Privileged Instruction type Program interrupt can be generated when $MSR_{PR}$ =0. They can be distinguished from other causes of Privileged Instruction type Program interrupts by examining $SRR1_{49}$ (the bit in which $MSR_{PR}$ was saved by the interrupt). #### Trap A Trap type Program interrupt is generated when any of the conditions specified in a *Trap* instruction is met. The following registers are set: SRRO For all Program interrupts except a Floating-Point Enabled Exception type Program interrupt, set to the effective address of the instruction that caused the corresponding exception. For a Floating-Point Enabled Exception type Program interrupt, set as described in the following list. If MSR<sub>FE0 FE1</sub> = 0b00, FPSCR<sub>FEX</sub> = 1, and an instruction is executed that changes MSR<sub>FE0 FE1</sub> to a nonzero value, set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present. ### **Programming Note** Recall that all instructions that can alter MSR<sub>FE0 FE1</sub> are context synchronizing, and therefore are not initiated until all preceding instructions have reported all exceptions they will cause. - If MSR<sub>FE0 FE</sub> = 0b11, set to the effective address of the instruction that caused the Floating-Point Enabled Exception. - If MSR<sub>FE0 FE</sub> = 0b01 or 0b10, set to the effective address of the first instruction that caused a Floating-Point Enabled Exception since the most recent time FPSCR<sub>FEX</sub> was changed from 1 to 0 or of some subsequent instruction. ### Programming Note - If SRR0 is set to the effective address of a subsequent instruction, that instruction will not be beyond the first such instruction at which synchronization of floating-point instructions occurs. (Recall that such synchronization is caused by Floating-Point Status and Control Register instructions, as well as by execution synchronizing instructions and events.) ### SRR1 33:36 Set to 0. **42** Set to 0. - Set to 1 for a Floating-Point Enabled Exception type Program interrupt; otherwise set to 0. - 44 Set to 1 for an Illegal Instruction type Program interrupt; otherwise set to 0. - 45 Set to 1 for a Privileged Instruction type Program interrupt; otherwise set to 0. - 46 Set to 1 for a Trap type Program interrupt; otherwise set to 0. - 47 Set to 0 if SRR0 contains the address of the instruction causing the exception and there is only one such instruction; otherwise set to 1. ### Programming Note - SRR1<sub>47</sub> can be set to 1 only if the exception is a Floating-Point Enabled Exception and either $MSR_{FE0\ FE1} = 0b01$ or 0b10 or $MSR_{FE0\ FE1}$ has just been changed from 0b00 to a nonzero value. (SRR1<sub>47</sub> is always set to 1 in the last case.) Others Loaded from the MSR. Only one of bits 43:46 can be set to 1. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000\_0000\_0000\_0700. ## 6.5.10 Floating-Point Unavailable Interrupt A Floating-Point Unavailable interrupt occurs when no higher priority exception exists, an attempt is made to execute a floating-point instruction (including floating-point loads, stores, and moves), and $MSR_{FP}=0$ . The following registers are set: **SRR0** Set to the effective address of the instruc- tion that caused the interrupt. SRR1 **33:36** Set to 0. **42:47** Set to 0. **Others** Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000 0000 0000 0800. ### 6.5.11 Decrementer Interrupt A Decrementer interrupt occurs when no higher priority exception exists, a Decrementer exception exists, and MSR<sub>FF</sub>=1. The following registers are set: SRR0 Set to the effective address of the instruc- tion that the processor would have attempted to execute next if no interrupt conditions were present. SRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000\_0000\_0000\_0900. ## 6.5.12 Hypervisor Decrementer Interrupt A Hypervisor Decrementer interrupt occurs when no higher priority exception exists, a Hypervisor Decrementer exception exists, and the value of the following expression is 1. (MSR<sub>EE</sub> | ¬(MSR<sub>HV</sub>) | MSR<sub>PR</sub>) & HDICE The following registers are set: HSRR0 Set to the effective address of the instruc- tion that the processor would have attempted to execute next if no interrupt conditions were present. HSRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000\_0000\_0000\_0980. ### Programming Note - Because the value of $\mathsf{MSR}_{\mathsf{EE}}$ is always 1 when the processor is in problem state, the simpler expression (MSR<sub>EE</sub> | ¬(MSR<sub>HV</sub>)) & HDICE is equivalent to the expression given above. ### 6.5.13 System Call Interrupt A System Call interrupt occurs when a System Call instruction is executed. The following registers are set: **SRR0** Set to the effective address of the instruc- tion following the System Call instruction. SRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000\_0000\_0000\_0C00. ### **Programming Note** An attempt to execute an $\emph{sc}$ instruction with LEV=1 in problem state should be treated as a programming error. ## **6.5.14** Trace Interrupt [Category: Trace] A Trace interrupt occurs when no higher priority exception exists and either MSR<sub>SE</sub>=1 and any instruction except *rfid* or *hrfid*, is successfully completed, or MSR<sub>BE</sub>=1 and a *Branch* instruction is completed. Successful completion means that the instruction caused no other interrupt. Thus a Trace interrupt never occurs for a *System Call* instruction, or for a *Trap* instruction that traps. The instruction that causes a Trace interrupt is called the "traced instruction". When a Trace interrupt occurs, the following registers are set: SRR0 Set to the effective address of the instruc- tion that the processor would have attempted to execute next if no interrupt conditions were present. SRR1 33:36 and 42:47 Set to an implementation-dependent value. Others Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000\_0000\_0000\_0D00. Extensions to the Trace facility are described in Appendix C. ### Programming Note The following instructions are not traced. - rfid - hrfid - **sc**, and *Trap* instructions that trap - other instructions that cause interrupts (other than Trace interrupts) - the first instructions of any interrupt handler - instructions that are emulated by software In general, interrupt handlers can achieve the effect of tracing these instructions. ## 6.5.15 Hypervisor Data Storage Interrupt A Hypervisor Data Storage interrupt occurs when the processor is not in hypervisor state, no higher priority exception exists, the value of the expression $(VPM_0 \& \neg MSR_{DR}) | (VPM_1 \& MSR_{DR})$ is 1, and a data access cannot be performed for any of the following reasons. - Data address translation is enabled (MSR<sub>DR</sub>=1) and the virtual address of any byte of the storage location specified by a Load, Store, icbi, dcbz, dcbst, dcbf[I], eciwx, or ecowx instruction cannot be translated to a real address. - Data address translation is disabled (MSR<sub>DR</sub>=0), LPES<sub>1</sub> =1, and the virtual address of any byte of the storage location specified by a *Load*, *Store*, *icbi*, *dcbz*, *dcbst*, *dcbf*[*I*], *eciwx*, or *ecowx* instruction cannot be translated to a real address by means of the virtual real addressing mechanism. - The effective address specified by a *Iwarx*, *Idarx*, *stwcx.*, or *stdcx.* instruction refers to storage that is Write Through Required or Caching Inhibited. - The access violates storage protection. - A Data Address Compare match or a Data Address Breakpoint match occurs. - Execution of an eciwx or ecowx instruction is disallowed because EA<sub>RE</sub>=0. If a **stwcx**. or **stdcx**. would not perform its store in the absence of a Hypervisor Data Storage interrupt, and either (a) the specified effective address refers to storage that is Write Through Required or Caching Inhibited, or (b) a non-conditional *Store* to the specified effective address would cause a Hypervisor Data Stor- age interrupt, it is implementation-dependent whether a Hypervisor Data Storage interrupt occurs. If the contents of the XER specifies a length of zero bytes for a *Move Assist* instruction, a Hypervisor Data Storage interrupt does not occur for reasons of address translation, or storage protection. If such an instruction causes a Hypervisor Data Storage interrupt for other reasons, the setting of the HDSISR and HDAR reflects only these other reasons listed in the preceding sentence. (E.g., if such an instruction causes a storage protection violation and a Data Address Breakpoint match, the HDSISR and HDAR are set as if the storage protection violation did not occur.) The following registers are set: **HSRR0** Set to the effective address of the instruction that caused the interrupt. ### HSRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43. ### **HDSISR** **32** Set to 0. 33 Set to 1 if the value of the expression (MSR<sub>DR</sub>) | ((¬MSR<sub>DR</sub> & VPM<sub>0</sub>) & LPES<sub>1</sub>) is 1 and the translation for an attempted access is not found in the Page Table; oth- erwise set to 0. **34:35** Set to 0. 36 Set to 1 if the access is not permitted by the storage protection mechanism; otherwise set to 0. 37 Set to 1 if the access is due to a *Iq*, *stq*, *Iwarx*, *Idarx*, *stwcx*., or *stdcx*. instruction that addresses storage that is Write Through Required or Caching Inhibited; otherwise set to 0. 38 Set to 1 for a *Store*, *dcbz*, or *ecowx* instruction; otherwise set to 0. **39:40** Set to 0. Set to 1 if a Data Address Compare match or a Data Address Breakpoint match occurs; otherwise set to 0. **42** Set to 0. Set to 1 if execution of an *eciwx* or *ecowx* instruction is attempted when EAR<sub>E</sub>=0; otherwise set to 0. **44:63** Set to 0. **HDAR** Set to the effective address of a storage element as described in the following list. The list should be read from the top down; the HDAR is set as described by the first item that corresponds to an exception that is reported in the HDSISR. For example, if a *Load* instruction causes a storage protection violation and a Data Address Break- point match (and both are reported in the HDSISR), the HDAR is set to the effective address of a byte in the first aligned doubleword for which access was attempted in the page that caused the exception. - a Data Storage exception occurs for reasons other than a Data Address Breakpoint match or, for eciwx and ecowx, EAR<sub>E</sub>=0 - a byte in the block that caused the exception, for a Cache Management instruction - a byte in the first aligned doubleword for which access was attempted in the page that caused the exception, for a Load, Store, eciwx, or ecowx instruction ("first" refers to address order; see Section 6.7) - undefined, for a Data Address Breakpoint match, or if eciwx or ecowx is executed when EAR<sub>E</sub>=0 For the cases in which the HDAR is specified above to be set to a defined value, if the interrupt occurs in 32-bit mode the high-order 32 bits of the DAR are set to 0. If multiple Hypervisor Data Storage exceptions occur for a given effective address, any one or more of the bits corresponding to these exceptions may be set to 1 in the HDSISR. Execution resumes at effective address 0x0000\_0000\_0000\_0E00. ## 6.5.16 Hypervisor Instruction Storage Interrupt A Hypervisor Instruction Storage interrupt occurs when the processor is not in hypervisor state, no higher priority exception exists, the value of the expression $(VPM_0 \& \neg MSR_{IR}) | (VPM_1 \& MSR_{IR})$ is 1, and the next instruction to be executed cannot be fetched for any of the following reasons. - Instruction address translation is enabled (MSR<sub>IR</sub>=1) and the virtual address cannot be translated to a real address. - Instruction address translation is disabled (MSR<sub>IR</sub>=0), LPES<sub>1</sub> =1, and the virtual address cannot be translated to a real address by means of the virtual real addressing mechanism. - The fetch access violates storage protection. The following registers are set: **HSRR0** Set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present (if the interrupt occurs on attempting to fetch a branch target, HSRR0 is set to the branch target address). ### HSRR1 Set to 1 if the value of the expression $(MSR_{IR}) \mid ((\neg MSR_{IR} \& VPM_0) \& LPES_1)$ is 1 and the translation for an attempted access is not found in the Page Table; otherwise set to 0. 34 Set to 0. Set to 1 if the access is to No-execute or Guarded storage; otherwise set to 0. 36 Set to 1 if the access is not permitted by Figure 26; otherwise set to 0. ### **Programming Note** Storage protection violations for the Hypervisor Data Storage Interrupt are reported in HDSISR<sub>36</sub>, whereas storage protection violations for the Hypervisor Instruction Storage Interrupt are reported in HSRR1<sub>35</sub> and HSRR1<sub>36</sub>. **42:46** Set to 0. **47** Set to 0. Others Loaded from the MSR. MSR See Figure 43. If multiple Instruction Storage exceptions occur due to attempting to fetch a single instruction, any one or more of the bits corresponding to these exceptions may be set to 1 in HSRR1. Execution resumes at effective address 0x0000\_0000\_0000\_0E10. ## 6.5.17 Hypervisor Data Segment Interrupt A Hypervisor Data Segment interrupt may occur when the processor is not in hypervisor state, data address translation is disabled (MSR<sub>DR</sub>=0), VPM<sub>0</sub>=1, LPES<sub>1</sub>=1, no higher priority exception exists, the effective address of any byte of the storage location specified by a *Load*, *Store*, *icbi*, *dcbz*, *dcbst*, *dcbf[I] eciwx*, or *ecowx* instruction is beyond the 1 TB VRMA. If a **stwcx**. or **stdcx**. would not perform its store in the absence of a Hypervisor Data Segment interrupt, and a non-conditional *Store* to the specified effective address would cause a Hypervisor Data Segment interrupt, it is implementation-dependent whether a Hypervisor Data Segment interrupt occurs. If a *Move Assist* instruction has a length of zero (in the XER), a Hypervisor Data Segment interrupt does not occur, regardless of the effective address. The following registers are set: HSRR0 Set to the effective address of the instruc- tion that caused the interrupt. HSRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43. HDSISR Set to an undefined value. **HDAR** Set to the effective address of a storage element as described in the following list. a byte in the block that caused the Hypervisor Data Segment interrupt, for a Cache Management instruction a byte in the first aligned doubleword for which access was attempted in the segment that caused the Hypervisor Data Segment interrupt, for a Load, Store, eciwx, or ecowx instruction ("first" refers to address order; see Section 6.7) Execution resumes at effective address 0x0000\_0000\_0000\_0E20. ### 6.5.18 Hypervisor Instruction Segment Interrupt A Hypervisor Instruction Segment interrupt may occur when the processor is not in hypervisor state, instruction address translation is disabled (MSR $_{\rm IR}$ =0), VPM $_{\rm 0}$ =1, LPES $_{\rm 1}$ =1, no higher priority exception exists, and the effective address of any byte of the instruction is beyond the 1 TB VRMA. The following registers are set: HSRR0 Set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present (if the interrupt occurs on attempting to fetch a branch target, HSRR0 is set to the branch target address). HSRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000\_0000\_0000\_0E30. ## 6.5.19 Hypervisor Emulation Assistance Interrupt [Category: ## Hypervisor Emulation Assistance] A Hypervisor Emulation Assistance interrupt is generated when execution is attempted of an illegal instruction, or of a reserved instruction or an instruction that is not provided by the implementation. It is also generated under the following conditions - an *mtspr* or *mfspr* instruction is executed when MSR<sub>PR</sub>=1 if the instruction specifies an SPR with SPR<sub>0</sub>=0 that is not provided by the implementation - an mtspr or mfspr instruction is executed when MSR<sub>PR</sub>=1 if the instruction specifies SPR 0 - an *mfspr* instruction is executed when MSR<sub>PR</sub>=0 if the instruction specifies SPR 4, 5, or 6 A Hypervisor Emulation Assistance interrupt may be generated when execution is attempted of any of the following kinds of instruction. - an instruction that is in invalid form - an Iswx instruction for which RA or RB is in the range of registers to be loaded The following registers are set: **HSRR0** Set to the effective address of the instruc- tion that caused the interrupt. HSRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. **HEIR** Set to a copy of the instruction that caused the interrupt Execution resumes at effective address 0x0000\_0000\_0000\_0E40. ## **6.5.20 Hypervisor Maintenance Interrupt** A Hypervisor Maintenance interrupt occurs when no higher priority exception exists, a Hypervisor Maintenance exception exists (a bit in the HMER is set to one), the exception is enabled in the HMEER, and the value of the following expression is 1. $(MSR_{EE} \mid \neg (MSR_{HV}) \mid MSR_{PR})$ The following registers are set: **HSRR0** Set to the effective address of the instruction that the processor would have attempted to execute next if no interrupt conditions were present. HSRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. HMER See Section 6.2.8 on page 549. The exception bits in the HMER are sticky; that is, once set to 1 they remain set to 1 until they are set to 0 by an *mthmer* instruction. Execution resumes at effective address 0x0000\_0000\_0000\_0E50. ### Programming Note - Because the value of $\mathsf{MSR}_{\mathsf{EE}}$ is always 1 when the processor is in problem state, the simpler expression $(MSR_{EE} \mid \neg(MSR_{HV}))$ is equivalent to the expression given above. ### - Programming Note If an implementation uses the HMER to record that a readable resource, such as the Time Base, has been corrupted, then, because the HMI is disabled in the hypervisor state, it is necessary for the hypervisor to check HMER after reading that resource to be sure an error has not occurred. # 6.5.21 Performance Monitor Interrupt [Category: Server.Performance Monitor] The Performance Monitor interrupt is part of the Performance Monitor facility; see Appendix C. If the Performance Monitor facility is not implemented or does not use this interrupt, the corresponding interrupt vector (see Figure 44 on page 556) is treated as reserved. ## 6.5.22 Vector Unavailable Interrupt [Category: Vector] A Vector Unavailable interrupt occurs when no higher priority exception exists, an attempt is made to execute a Vector instruction (including Vector loads, stores, and moves), and MSR<sub>VEC</sub>=0. The following registers are set: SRR0 Set to the effective address of the instruction that caused the interrupt. SRR1 **33:36** Set to 0. **42:47** Set to 0. Others Loaded from the MSR. MSR See Figure 43 on page 555. Execution resumes at effective address 0x0000 0000 0000 0F20. ## 6.6 Partially Executed Instructions If a Data Storage, Data Segment, Alignment, systemcaused, or imprecise exception occurs while a *Load* or *Store* instruction is executing, the instruction may be aborted. In such cases the instruction is not completed, but may have been partially executed in the following respects. - Some of the bytes of the storage operand may have been accessed, except that if access to a given byte of the storage operand would violate storage protection, that byte is neither copied to a register by a *Load* instruction nor modified by a *Store* instruction. Also, the rules for storage accesses given in Section 5.8.1, "Guarded Storage" and in Section 2.1 of Book II are obeyed. - Some registers may have been altered as described in the Book II section cited above. - Reference and Change bits may have been updated as described in Section 5.7.8. - For a stwcx. or stdcx. instruction that is executed in-order, CR0 may have been set to an undefined value and the reservation may have been cleared. - For an Iq instruction that is executed in-order, the TGCC may have been set to an undefined value. The architecture does not support continuation of an aborted instruction but intends that the aborted instruction be re-executed if appropriate. ### Programming Note - An exception may result in the partial execution of a *Load* or *Store* instruction. For example, if the Page Table Entry that translates the address of the storage operand is altered, by a program running on another processor, such that the new contents of the Page Table Entry preclude performing the access, the alteration could cause the *Load* or *Store* instruction to be aborted after having been partially executed. As stated in the Book II section cited above, if an instruction is partially executed the contents of registers are preserved to the extent that the instruction can be re-executed correctly. The consequent preservation is described in the following list. For any given instruction, zero or one item in the list applies. - For a fixed-point Load instruction that is not a multiple or string form, or for an eciwx instruction, if RT=RA or RT=RB then the contents of register RT are not altered. - For an *Iq* instruction, if RT+1 = RA then the contents of register RT+1 are not altered. - For an update form Load or Store instruction, the contents of register RA are not altered. ### 6.7 Exception Ordering Since multiple exceptions can exist at the same time and the architecture does not provide for reporting more than one interrupt at a time, the generation of more than one interrupt is prohibited. Some exceptions, such as the Mediated External exception, persist and can be deferred. However, other exceptions would be lost if they were not recognized and handled when they occur. For example, if an External interrupt was generated when a Data Storage exception existed, the Data Storage exception would be lost. If the Data Storage exception was caused by a Store Multiple instruction for which the storage operand crosses a virtual page boundary and the exception was a result of attempting to access the second virtual page, the store could have modified locations in the first virtual page even though it appeared that the Store Multiple instruction was never executed. For the above reasons, all exceptions are prioritized with respect to other exceptions that may exist at the same instant to prevent the loss of any exception that is not persistent. Some exceptions cannot exist at the same instant as some others. Data Storage, Hypervisor Data Storage, Data Segment, Hypervisor Data Segment, and Alignment exceptions occur as if the storage operand were accessed one byte at a time in order of increasing effective address (with the obvious caveat if the operand includes both the maximum effective address and effective address 0). ### 6.7.1 Unordered Exceptions The exceptions listed here are unordered, meaning that they may occur at any time regardless of the state of the interrupt processing mechanism. These exceptions are recognized and processed when presented. - 1. System Reset - 2. Machine Check ### 6.7.2 Ordered Exceptions The exceptions listed here are ordered with respect to the state of the interrupt processing mechanism. In the following list, the hypervisor forms of the Data Storage, Instruction Storage, Data Segment, and Instruction Segment exceptions can be substituted for the non-hypervisor forms since the hypervisor forms cannot be caused by the same instruction and have the same ordering. ### **System-Caused or Imprecise** - 1. Program - Imprecise Mode Floating-Point Enabled Exception - 2. Hypervisor Maintenance - 3. External and [Hypervisor] Decrementer #### **Instruction-Caused and Precise** - 1. [Hypervisor] Instruction Segment - 2. [Hypervisor] Instruction Storage - 3.a Hypervisor Emulation Assistance [Category: HEA] - 3.b Program - Illegal Instruction, if Category: HEA is not supported - Privileged Instruction - 4. Function-Dependent - 4.a Fixed-Point and Branch - 1a Program - Trap - 1b System Call - 1c [Hypervisor] Data Storage, [Hypervisor] Data Segment, or Alignment - 2 Trace - 4.b Floating-Point - 1 FP Unavailable - 2a Program - Precise Mode Floating-Pt Enabled Excep'n - 2b [Hypervisor] Data Storage, [Hypervisor] Data Segment, or Alignment - 3 Trace - 4.c Vector - 1 Vector Unavailable - 2a [Hypervisor] Data Storage, [Hypervisor] Data Segment, or Alignment - 3 Trace For implementations that execute multiple instructions in parallel using pipeline or superscalar techniques, or combinations of these, it can be difficult to understand the ordering of exceptions. To understand this ordering it is useful to consider a model in which each instruction is fetched, then decoded, then executed, all before the next instruction is fetched. In this model, the exceptions a single instruction would generate are in the order shown in the list of instruction-caused exceptions. Exceptions with different numbers have different ordering. Exceptions with the same numbering but different lettering are mutually exclusive and cannot be caused by the same instruction. The External, Decrementer, and Hypervisor Decrementer interrupts have equal ordering. Similarly, where Data Storage, Data Segment, and Alignment exceptions are listed in the same item they have equal ordering. Even on processors that are capable of executing several instructions simultaneously, or out of order, instruction-caused interrupts (precise and imprecise) occur in program order. ### **6.8 Interrupt Priorities** This section describes the relationship of non-maskable, maskable, precise, and imprecise interrupts. In the following descriptions, the interrupt mechanism waiting for all possible exceptions to be reported includes only exceptions caused by previously initiated instructions (e.g., it does not include waiting for the Decrementer to step through zero). The exceptions are listed in order of highest to lowest priority. The phrase "corresponding interrupt" means the interrupt having the same name as the exception unless the processor is in power-saving mode, in which case the phrase means the System Reset interrupt. Unless otherwise stated or obvious from context, it is assumed below that one of the following conditions is satisfied. - The processor is not in power-saving mode and the interrupt, unless it is the Machine Check interrupt, is not disabled. (For the Machine Check interrupt no assumption is made regarding enablement.) - The processor is in power-saving mode and the exception is enabled to cause exit from the mode. In the following list, the hypervisor forms of the Data Storage, Instruction Storage, Data Segment, and Instruction Segment exceptions can be substituted for the non-hypervisor forms since the hypervisor forms cannot occur simultaneously and have the same priority. ### 1. System Reset System Reset exception has the highest priority of all exceptions. If this exception exists, the interrupt mechanism ignores all other exceptions and generates a System Reset interrupt. Once the System Reset interrupt is generated, no nonmaskable interrupts are generated due to exceptions caused by instructions issued prior to the generation of this interrupt. ### 2. Machine Check Machine Check exception is the second highest priority exception. If this exception exists and a System Reset exception does not exist, the interrupt mechanism ignores all other exceptions and generates a Machine Check interrupt. Once the Machine Check interrupt is generated, no nonmaskable interrupts are generated due to exceptions caused by instructions issued prior to the generation of this interrupt. ### 3. Instruction-Dependent This exception is the third highest priority exception. When this exception is created, the interrupt mechanism waits for all possible Imprecise exceptions to be reported. It then generates the appropriate ordered interrupt if no higher priority exception exists when the interrupt is to be generated. Within this category a particular instruction may present more than a single exception. When this occurs, those exceptions are ordered in priority as indicated in the following lists. Where [Hypervisor] Data Storage, [Hypervisor] Data Segment, and Alignment exceptions are listed in the same item they have equal priority (i.e., the processor may generate any one of the three interrupts for which an exception exists). #### A. Fixed-Point Loads and Stores - a. These exceptions are mutually exclusive and have the same priority: - Hypervisor Emulation Assistance [Category: HEA] - Program Illegal Instruction if Category: HEA is not supported - Program Privileged Instruction - b. [Hypervisor] Data Storage, [Hypervisor] Data Segment, or Alignment - c. Trace ### B. Floating-Point Loads and Stores - a. Hypervisor Emulation Assistance [Category: HEA], or Program - Illegal Instruction if Category: HEA is not supported - b. Floating-Point Unavailable - c. [Hypervisor] Data Storage, [Hypervisor] Data Segment, or Alignment - d. Trace ### C. Vector Loads and Stores - a. Hypervisor Emulation Assistance [Category: HEA], or Program - Illegal Instruction if Category: HEA is not supported - b. Vector Unavailable - c. [Hypervisor] Data Storage, [Hypervisor] Data Segment, or Alignment - d. Trace ### D. Other Floating-Point Instructions - a. Floating-Point Unavailable - b. Program Precise Mode Floating-Point Enabled Exception - c. Trace #### E. Other Vector Instructions - a. Vector Unavailable - b. Trace ### F. *rfid*, *hrfid* and *mtmsr*[*d*] - a. Program Privileged Instruction - b. Program Floating-Point Enabled Exception - c. Trace, for mtmsr[d] only ### G. Other Instructions - a. These exceptions are mutually exclusive and have the same priority: - Program Trap - System Call - Program Privileged Instruction - Hypervisor Emulation Assistance [Category: HEA], or Program Illegal Instruction if Category: HEA is not supported b.Trace ## H. [Hypervisor] Instruction Storage and [Hypervisor] Instruction Segment These exceptions have the lowest priority in this category. They are recognized only when all instructions prior to the instruction causing one of these exceptions appear to have completed and that instruction is the next instruction to be executed. The two exceptions are mutually exclusive. The priority of these exceptions is specified for completeness and to ensure that they are not given more favorable treatment. It is acceptable for an implementation to treat these exceptions as though they had a lower priority. Program - Imprecise Mode Floating-Point Enabled Exception This exception is the fourth highest priority exception. When this exception is created, the interrupt mechanism waits for all other possible exceptions to be reported. It then generates this interrupt if no higher priority exception exists when the interrupt is to be generated. ### 5. Hypervisor Maintenance This exception is the fifth highest priority exception. When this exception is created, the interrupt mechanism waits for all other possible exceptions to be reported. It then generates this interrupt if no higher priority exception exists when the interrupt is to be generated. If a Hypervisor Maintenance exception exists and each attempt to execute an instruction when the Hypervisor Maintenance interrupt is enabled causes an exception (see the Programming Note below), the Hypervisor Maintenance interrupt is not delayed indefinitely. Direct External, Mediated External, and [Hypervisor] Decrementer These exceptions are the lowest priority exceptions. All have equal priority (i.e., the processor may generate any one of the corresponding interrupts for which an exception exists). When one of these exceptions is created, the interrupt processing mechanism waits for all other possible exceptions to be reported. It then generates the corresponding interrupt if no higher priority exception exists when the interrupt is to be generated. If a Hypervisor Decrementer exception exists and each attempt to execute an instruction when the Hypervisor Decrementer interrupt is enabled causes an exception (see the Programming Note below), the Hypervisor Decrementer interrupt is not delayed indefinitely. If LPES<sub>0</sub>=1 and a Direct External exception exists and each attempt to execute an instruction when this interrupt is enabled causes an exception (see the Programming Note below), the Direct External interrupt is not delayed indefinitely. ### **Programming Note** An incorrect or malicious operating system could corrupt the first instruction in the interrupt vector location for an instructioncaused interrupt such that the attempt to execute the instruction causes the same exception that caused the interrupt (a looping interrupt; e.g., illegal instruction and Program interrupt). Similarly, the first instruction of the interrupt vector for one instruction-caused interrupt could cause a different instructioncaused interrupt, and the first instruction of interrupt vector for the second instruction-caused interrupt could cause first instruction-caused interrupt (e.g., Program interrupt and Floating-Point Unavailable interrupt). Similarly, if the Real Mode Area is virtualized and there is no PTE for the page containing the interrupt vectors, every attempt to execute the first instruction of the OS's Instruction Storage interrupt handler would cause a Hypervisor Instruction Storage interrupt; if the Hypervisor Instruction Storage interrupt handler returns to the OS's Instruction Storage interrupt handler without the relevant PTE having been created, another Hypervisor Instruction Storage interrupt would occur immediately. The looping caused by these and similar cases is terminated by the occurrence of a System Reset or Hypervisor Decrementer interrupt. ## **Chapter 7. Timer Facilities** | 7.1 Overview 575 | 7.4 Hypervisor Decrementer 577 | |--------------------------------------|----------------------------------------| | 7.2 Time Base (TB) 575 | 7.5 Processor Utilization of Resources | | 7.2.1 Writing the Time Base 576 | Register (PURR) 578 | | 7.3 Decrementer 576 | 7.6 Scaled Processor Utilization of | | 7.3.1 Writing and Reading the Decre- | Resources Register (PURR) 578 | | menter 577 | - , , | ### 7.1 Overview The Time Base, Decrementer, Hypervisor Decrementer, Processor Utilization of Resources, and Scaled Processor Utilization of Resources registers provide timing functions for the system. The remainder of this section describes these registers and related facilities. ## 7.2 Time Base (TB) The Time Base (TB) is a 64-bit register (see Figure 45) containing a 64-bit unsigned integer that is incremented periodically. | Field | Description | |-------|----------------------------| | TBU40 | Upper 40 bits of Time Base | | TBU | Upper 32 bits of Time Base | | TBL | Lower 32 bits of Time Base | ### Figure 45. Time Base The Time Base is a hypervisor resource; see Chapter 2. The SPRs TBU40, TBU, and TBL provide access to the fields of the Time Base shown in Figure 45. When a *mtspr* instruction is executed specifying one of these SPRs, the associated field of the Time Base is altered and the remaining bits of the Time Base are not affected. See Chapter 4 of Book II for infromation about the update frequency of the Time Base. The Time Base is implemented such that: - Loading a GPR from the Time Base has no effect on the accuracy of the Time Base. - Copying the contents of a GPR to the Time Base replaces the contents of the Time Base with the contents of the GPR. The Power ISA does not specify a relationship between the frequency at which the Time Base is updated and other frequencies, such as the CPU clock or bus clock in a Power ISA system. The Time Base update frequency is not required to be constant. What *is* required, so that system software can keep time of day and operate interval timers, is one of the following. - The system provides an (implementation-dependent) interrupt to software whenever the update frequency of the Time Base changes, and a means to determine what the current update frequency is. - The update frequency of the Time Base is under the control of the system software. Implementations must provide a means for either preventing the Time Base from incrementing or preventing it from being read in problem state (MSR $_{PR}$ =1). If the means is under software control, it must be privileged and, in implementations of the Server environment, must be accessible only in hypervisor state (MSR $_{HV}$ $_{PR}$ = 0b10). There must be a method for getting all processors' Time Bases to start incrementing with values that are identical or almost identical in all processors. ### - Programming Note If software initializes the Time Base on power-on to some reasonable value and the update frequency of the Time Base is constant, the Time Base can be used as a source of values that increase at a constant rate, such as for time stamps in trace entries. Even if the update frequency is not constant, values read from the Time Base are monotonically increasing (except when the Time Base wraps from $2^{64}$ -1 to 0). If a trace entry is recorded each time the update frequency changes, the sequence of Time Base values can be post-processed to become actual time values. Successive readings of the Time Base may return identical values. If Time Base bits 60:63 are used as part of a random number generator, software must account for the fact that these bits are set to 0x0 only when bit 59 changes state regardless of whether or not they incremented to 0xF since they were previously set to 0x0. See the description of the Time Base in Chapter of Book II for ways to compute time of day in POSIX format from the Time Base. ### 7.2.1 Writing the Time Base Writing the Time Base is privileged, and can be done only in hypervisor state. Reading the Time Base is not privileged; it is discussed in Chapter 4 of Book II. It is not possible to write the entire 64-bit Time Base using a single instruction. The *mttbl* and *mttbu* extended mnemonics write the lower and upper halves of the Time Base (TBL and TBU), respectively, preserving the other half. These are extended mnemonics for the *mtspr* instruction; see Appendix A, "Assembler Extended Mnemonics" on page 589. The Time Base can be written by a sequence such as: ``` lwz Rx,upper # load 64-bit value for lwz Ry,lower # TB into Rx and Ry li Rz,0 mttbl Rz # set TBL to 0 mttbu Rx # set TBU mttbl Ry # set TBL ``` Provided that no interrupts occur while the last three instructions are being executed, loading 0 into TBL prevents the possibility of a carry from TBL to TBU while the Time Base is being initialized. The preferred method of changing the Time Base utilizes the TBU40 facility. The following code sequence demonstrates the process. Assume the upper 40 bits of Rx contain the desired value upper 40 bits of the Time Base. ``` mftb Ry # Read 64-bit Time Base value clrldi Ry,Ry,40# lower 24 bits of old TB mttbu40Rx # write upper 40 bits of TB mftb Rz # read TB value again clrldi Rz,Rz,40# lower 24 bits of new TB cmpld Rz,Ry # compare new and old lwr 24 bge done # no carry out of low 24 bits addis Rx,Rx,0x0100#increment upper 40 bits mttbu40 Rx # update to adjust for carry ``` ### Programming Note The instructions for writing the Time Base are mode-independent. Thus code written to set the Time Base will work correctly in either 64-bit or 32-bit mode. ### 7.3 Decrementer The Decrementer (DEC) is a 32-bit decrementing counter that provides a mechanism for causing a Decrementer interrupt after a programmable delay. The contents of the Decrementer are treated as a signed integer. Figure 46. Decrementer The Decrementer counts down. Decrementer bits 32:59 count down until their value becomes 0x000\_0000, at the next increment their value becomes 0xFFF\_FFFF. Decrementer bits 60:63 may decrement at a variable rate. When the value of bit 59 changes, bits 60:63 are set to 0xF; if bits 60:63 decrement to 0x0 before the value of bit 59 changes, they remain at 0x0 until the value of bit 59 changes. The Decrementer is driven by the same frequency as the Time Base. The period of the Decrementer will depend on the driving frequency, but if the same values are used as given earlier for the Time Base (see Section 4.2 of Book II), and if the Time Base update frequency is constant, the period would be $$T_{\text{DEC}} = \frac{2^{32} \times 32}{1 \text{ GHz}} = 137 \text{ seconds.}$$ When the contents of $\mathrm{DEC}_{32}$ change from 0 to 1, a Decrementer exception will come into existence within a reasonable period or time. When the contents of $\mathrm{DEC}_{32}$ change from 1 to 0, an existing Decrementer exception will cease to exist within a reasonable period of time, but not later than the completion of the next context synchronizing instruction or event. The preceding paragraph applies regardless of whether the change in the contents of $DEC_{32}$ is the result of decrementation of the Decrementer by the processor or of modification of the Decrementer caused by execution of an mtspr instruction. The operation of the Decrementer satisfies the following constraints. - The operation of the Time Base and the Decrementer is coherent, i.e., the counters are driven by the same fundamental time base. - Loading a GPR from the Decrementer has no effect on the accuracy of the Time Base. - Copying the contents of a GPR to the Decrementer replaces the contents of the Decrementer with the contents of the GPR. ### **Programming Note** - In systems that change the Time Base update frequency for purposes such as power management, the Decrementer input frequency will also change. Software must be aware of this in order to set interval timers. If Decrementer bits 60:63 are used as part of a random number generator, software must account for the fact that these bits are set to 0xF only when bit 59 changes state regardless of whether or not they decremented to 0x0 since they were previously set to 0xF. ## 7.3.1 Writing and Reading the Decrementer The contents of the Decrementer can be read or written using the *mfspr* and *mtspr* instructions, both of which are privileged when they refer to the Decrementer. Using an extended mnemonic (see Appendix A, "Assembler Extended Mnemonics" on page 589), the Decrementer can be written from GPR Rx using: The Decrementer can be read into GPR Rx using: Copying the Decrementer to a GPR has no effect on the Decrementer contents or on the interrupt mechanism. ## 7.4 Hypervisor Decrementer The Hypervisor Decrementer (HDEC) is a 32-bit decrementing counter that provides a mechanism for causing a Hypervisor Decrementer interrupt after a programma- ble delay. The contents of the Decrementer are treated as a signed integer. Figure 47. Hypervisor Decrementer The Hypervisor Decrementer is a hypervisor resource; see Chapter 2. Hypervisor Decrementer bits 32:59 count down until their value becomes 0x000\_0000, at the next increment their value becomes 0xFFF\_FFFF. Bits 60:63 may decrement at a variable rate. When the value of bit 59 changes, bits 60:63 are set to 0xF; if bits 60:63 decrement to 0x0 before the value of bit 59 changes, they remain at 0x0 until the value of bit 59 changes. The Hypervisor Decrementer is driven by the same frequency as the Time Base. The period of the Hypervisor Decrementer will depend on the driving frequency, but if the same values are used as given above for the Time Base (see Section 7.2), and if the Time Base update frequency is constant, the period would be $$T_{\text{DEC}} = \frac{2^{32} \times 32}{1 \text{ GHz}} = 137 \text{ seconds.}$$ When the contents of $HDEC_{32}$ change from 0 to 1, a Hypervisor Decrementer exception will come into existence within a reasonable period or time. When the contents of $HDEC_{32}$ change from 1 to 0, an existing Hypervisor Decrementer exception will cease to exist within a reasonable period of time, but not later than the completion of the next context synchronizing instruction or event. The preceding paragraph applies regardless of whether the change in the contents of $HDEC_{32}$ is the result of decrementation of the Hypervisor Decrementer by the processor or of modification of the Hypervisor Decrementer caused by execution of an mtspr instruction. The operation of the Hypervisor Decrementer satisfies the following constraints. - The operation of the Time Base and the Hypervisor Decrementer is coherent, i.e., the counters are driven by the same fundamental time base. - Loading a GPR from the Hypervisor Decrementer has no effect on the accuracy of the Hypervisor Decrementer. - Copying the contents of a GPR to the Hypervisor Decrementer replaces the contents of the Hypervisor Decrementer with the contents of the GPR. ### **Programming Note** In systems that change the Time Base update frequency for purposes such as power management, the Hypervisor Decrementer update frequency will also change. Software must be aware of this in order to set interval timers. If Hypervisor Decrementer bits 60:63 are used as part of a random number generator, software must account for the fact that these bits are set to 0xF only when bit 59 changes state regardless of whether or not they decremented to 0x0 since they were previously set to 0xF. ## 7.5 Processor Utilization of Resources Register (PURR) The Processor Utilization of Resources Register (PURR) is a 64-bit counter, the contents of which provide an estimate of the resources used by the processor. The contents of the PURR are treated as a 64-bit unsigned integer. Figure 48. Processor Utilization of Resources Register The PURR is a hypervisor resource; see Chapter 2. The contents of the PURR increase monotonically, unless altered by software, until the sum of the contents plus the amount by which it is to be increased exceed 0xFFFF\_FFFF\_FFFF\_FFFF (2<sup>64</sup> - 1) at which point the contents are replaced by that sum modulo 2<sup>64</sup>. There is no interrupt or other indication when this occurs. The rate at which the value represented by the contents of the PURR increases is an estimate of the portion of resources used by the processor per unit time with respect to other processors that share those resources monitored by the PURR. When the processor is idle, the rate at which the PURR value increases is implementation dependent. Let the difference between the value represented by the contents of the Time Base at times $T_a$ and $T_b$ be $T_{ab}$ . Let the difference between the value represented by the contents of the PURR at time $T_a$ and $T_b$ be the value $P_{ab}$ . The ratio of $P_{ab}/T_{ab}$ is an estimate of the percentage of shared resources used by the processor during the interval $T_{ab}$ . For the set {S} of processors that share the resources monitored by the PURR, the sum of the usage estimates for all the processors in the set is 1.0. The definition of the set of processors S, the shared resources corresponding to the set S, and specifics of the algorithm for incrementing the PURR are implementation-specific. The PURR is implemented such that: - Loading a GPR from the PURR has no effect on the accuracy of the PURR. - Copying the contents of a GPR to the PURR replaces the contents of the PURR with the contents of the GPR. ### Programming Note - Estimates computed as described above may be useful for purposes related to resource utilization, including utilization-based system management and planning. Because the rate at which the PURR accumulates resource usage estimates is dependent on the frequency at which the Time Base is incremented, and the frequency of the oscillator that drives instruction execution may vary independently from that of the Time Base, the interpretation of the contents of the PURR may be inaccurate as a measurement of capacity consumption for accounting purposes. The SPURR should be used for accounting purposes. ## 7.6 Scaled Processor Utilization of Resources Register (PURR) The Scaled Processor Utilization of Resources Register (SPURR) is a 64-bit counter, the contents of which provide an estimate of the resources used by the processor. The contents of the SPURR are treated as a 64-bit unsigned integer. Figure 49. Scaled Processor Utilization of Resources Register The SPURR is a hypervisor resource; see Section 2.7. The contents of the SPURR increase monotonically, unless altered by software, until the sum of the contents plus the amount by which it is to be increased exceed 0xFFFF\_FFFF\_FFFF\_FFFF (2<sup>64</sup> - 1) at which point the contents are replaced by that sum modulo 2<sup>64</sup>. There is no interrupt or other indication when this occurs. The rate at which the value represented by the contents of the SPURR increases is an estimate of the portion of resources used by the processor with respect to other processors that share those resources monitored by the SPURR, and relative to the computational capacity provided by those resources. The computational capacity provided by the shared resources may vary as a function of the frequency of the oscillator which drives the resources or as a result of deliberate delays in processing that are created to reduce power consumption. When the processor is idle, the rate at which the SPURR value increases is implementation dependent. Let the difference between the value represented by the contents of the Time Base at times $T_a$ and $T_b$ be $T_{ab}$ . Let the ratio of the effective and nominal frequencies of the oscillator driving instruction execution $f_e/f_n$ be $f_r$ . Let the ratio of delay cycles created by power reduction circuitry and total cycles $c_d/c_t$ be $c_r$ . Let the difference between the value represented by the contents of the SPURR at time $T_a$ and $T_b$ be the value $S_{ab}$ . The ratio of $S_{ab}/(T_{ab} \times f_r \times (1 - c_r))$ is an estimate of the percentage of shared resource capacity used by the processor during the interval $T_{ab}$ . For the set {S} of processors that share the resources monitored by the SPURR, the sum of the usage estimates for all the processors in the set is 1.0. The definition of the set of processors S, the shared resources corresponding to the set S, and specifics of the algorithm for incrementing the SPURR are implementation-specific. The SPURR is implemented such that: - 1. Loading a GPR from the SPURR has no effect on the accuracy of the SPURR. - Copying the contents of a GPR to the SPURR replaces the contents of the SPURR with the contents of the GPR. ### **Programming Note** Estimates computed as described above may be useful for purposes of resource use accounting, program dispatching, etc. ## **Chapter 8. Debug Facilities** | 8.1 | Overview | 581 | 8.1.2 | Data Address Breakpoint | 581 | |-------|----------------------------|-----|-------|-------------------------|-----| | 8.1.1 | Come-From Address Register | 581 | | | | ### 8.1 Overview Processors provide debug facilities to enable hardware and software debug functions, such as instructions and data breakpoints and program single stepping. The debug facilities consist of a data address breakpoint register (DABR), a data address breakpoint register extension (DABRX) (see Section 8.1.2) and an associated interrupt (see Section 6.5.3). The *mfspr* and *mtspr* instructions (see Section 4.4.5) provide access to the registers of the debug facilities. In addition to the facilities described here, implementations will typically include debug facilities, modes, and access mechanisms which are implementation-specific. For example, implementations will typically provide access to the debug facilities via a dedicated interface such as the IEEE 1149.1 Test Access Port (JTAG). ## 8.1.1 Come-From Address Register The Come-From Address Register (CFAR) is a 64-bit register. When an *rfid* instruction is executed, the register is set to the effective address of the instruction. When a *Branch* instruction is executed and the branch is taken, the register is set to the effective address of an instruction in the instruction cache block containing the *Branch* instruction. For *Branch* instructions, this setting need not occur until a subsequent context synchronizing operation has occurred. Figure 50. Come-From Address Register The contents of the CFAR can be read and written using the *mfspr* and *mtspr* instructions. Access to the CFAR is privileged. ### - Programming Note - This register can be used for purposes of debugging software. For example, often a software bug results in the program executing a portion of the code that it should not have reached or causing an unexpected interrupt. In the former case, a breakpoint can be placed in the portion of the code that was erroneously reached and the program reexecuted. In either case, the interrupt handler can save the contents of the CFAR (before executing the first instruction that would modify the register), and then make the saved contents available for a debugger to use in determining the control flow path by which the exception was reached. In order to preserve the CFAR's contents for each partition and to prevent it from being used to implement a "covert channel" between partitions, the hypervisor should initialize/save/restore the CFAR when switching partitions on a given processor. ### 8.1.2 Data Address Breakpoint The Data Address Breakpoint mechanism provides a means of detecting load and store accesses to a designated doubleword. The address comparison is done on an effective address (EA). The Data Address Breakpoint mechanism is controlled by the Data Address Breakpoint Register (DABR), shown in Figure 51, and the Data Address Breakpoint Register Extension (DABRX), shown in Figure 52. ### Bit(s) Name Description 0:60 DAB Data Address Breakpoint61 BT Breakpoint Translation 62 DW Data Write63 DR Data Read Figure 51. Data Address Breakpoint Register | | /// | BTI | PRIV | /M | |---|-----|-----|------|----| | 0 | | 60 | 61 | 63 | ### Bit(s) Name Description 60 BTI Breakpoint Translation Ignore 61:63 PRIVM Privilege Mask 61 HYP Hypervisor state 62 PNH Privileged but Non-Hypervisor state 63 PRO Problem state All other fields are reserved. ## Figure 52. Data Address Breakpoint Register Extension The DABR and DABRX are hypervisor resources; see Section 2.7 on page 475. The supported PRIVM values are 0b000, 0b001, 0b010, 0b011, 0b100, and 0b111. If the PRIVM field does not contain one of the supported values, then whether a match occurs for a given storage access is undefined. Elsewhere in this section it is assumed that the PRIVM field contains one of the supported values. ### **Programming Note** PRIVM value 0b000 causes matches not to occur regardless of the contents of other DABR and DABRX fields. PRIVM values 0b101 and 0b110 are not supported because a storage location that is shared between the hypervisor and non-hypervisor software is unlikely to be accessed using the same EA by both the hypervisor and the non-hypervisor software. (PRIVM value 0b111 is supported primarily for reasons of software compatibility, as described in a subsequent Programming Note.) A Data Address Breakpoint match occurs for a Load or Store instruction if, for any byte accessed, all of the following conditions are satisfied. - $EA_{0:60} = DABR_{DAB}$ - $(MSR_{DR} = DABR_{BT}) | DABRX_{BTI}$ - if the processor is in - hypervisor state and DABRX<sub>HYP</sub> = 1 or - privileged but non-hypervisor state and DABRX<sub>PNH</sub> = 1 or - problem state and DABRX<sub>PR</sub> = 1 the instruction is a Store and DABR<sub>DW</sub> = 1, or the instruction is a Load and DABR<sub>DR</sub> = 1. In 32-bit mode the high-order 32 bits of the EA are treated as zeros for the purpose of detecting a match. If the above conditions are satisfied, a match also occurs for *eciwx* and *ecowx*. For the purpose of determining whether a match occurs, *eciwx* is treated as a *Load*, and *ecowx* is treated as a *Store*. If the above conditions are satisfied, it is undefined whether a match occurs in the following cases. - The instruction is Store Conditional but the store is not performed. - The instruction is a *Load/Store String* of zero length. - The instruction is dcbz. (For the purpose of determining whether a match occurs, dcbz is treated as a Store.) The Cache Management instructions other than **dcbz** never cause a match. A Data Address Breakpoint match causes a Data Storage exception (see Section 6.5.3, "Data Storage Interrupt" on page 559). If a match occurs, some or all of the bytes of the storage operand may have been accessed; however, if a *Store* or *ecowx* instruction causes the match, the storage operand is not modified if the instruction is one of the following: - any Store instruction that causes an atomic access - ecowx ### Programming Note The Data Address Breakpoint mechanism does not apply to instruction fetches. ### - Programming Note Before setting a breakpoint requested by the operating system, the hypervisor must verify that the requested contents of the DABR and DABRX cannot cause the hypervisor to receive a Data Storage interrupt that it is not prepared to handle, or that it intrinsically cannot handle (e.g., the EA is in the range of EAs at which the hypervisor's Data Storage interrupt handler saves registers, DABR<sub>BT</sub> || DABRX<sub>BTI</sub> $\neq$ 0b10, DABR<sub>DW</sub> = 1, and DABRX<sub>HYP</sub> = 1). ### **Programming Note** Processors that comply with versions of the architecture that precede Version 2.02 do not provide the DABRX. Forward compatibility for software that was written for such processors (and uses the Data Address Breakpoint facility) can be obtained by setting DABRX<sub>60:63</sub> to 0b0111. ## Chapter 9. External Control [Category: External Control] The External Control facility permits a program to communicate with a special-purpose device. The facility consists of a Special Purpose Register, called EAR, and two instructions, called External Control In Word Indexed (eciwx) and External Control Out Word Indexed (ecowx). This facility must provide a means of synchronizing the devices with the processor to prevent the use of an address by the device when the translation that produced that address is being invalidated. ## 9.1 External Access Register This 32-bit Special Purpose Register controls access to the External Control facility and, for external control operations that are permitted, identifies the target device. | Е | /// | RID | |-------|-----|-------| | 32 33 | | 58 63 | | Bit(s) | Name | Description | |--------|------|-------------| | 32 | E | Enable bit | | 58:63 | RID | Resource ID | All other fields are reserved. #### Figure 53. External Access Register The External Access Register (EAR) is a hypervisor resource; see Chapter 2. The high-order bits of the RID field that correspond to bits of the Resource ID beyond the width of the Resource ID supported by the implementation are treated as reserved bits. #### Programming Note - The hypervisor can use the EAR to control which programs are allowed to execute *External Access* instructions, when they are allowed to do so, and which devices they are allowed to communicate with using these instructions. ### 9.2 External Access Instructions The External Access instructions, External Control In Word Indexed (eciwx) and External Control Out Word Indexed (ecowx), are described in Book II. Additional information about them is given below. If attempt is made to execute either of these instructions when EAR<sub>E</sub>=0, a Data Storage interrupt occurs with bit 43 of the DSISR set to 1. The instructions are supported whenever $MSR_{DR}=1$ . If either instruction is executed when $MSR_{DR}=0$ (real addressing mode), the results are boundedly undefined. ## **Chapter 10. Synchronization Requirements for Context Alterations** Changing the contents of certain System Registers, the contents of SLB entries, or the contents of other system resources that control the context in which a program executes can have the side effect of altering the context in which data addresses and instruction addresses are interpreted, and in which instructions are executed and data accesses are performed. For example, changing MSR<sub>IR</sub> from 0 to 1 has the side effect of enabling translation of instruction addresses. These side effects need not occur in program order, and therefore may require explicit synchronization by software. (Program order is defined in Book II.) An instruction that alters the context in which data addresses or instruction addresses are interpreted, or in which instructions are executed or data accesses are performed, is called a *context-altering instruction*. This chapter covers all the context-altering instructions. The software synchronization required for them is shown in Table 1 (for data access) and Table 2 (for instruction fetch and execution). The notation "CSI" in the tables means any context synchronizing instruction (e.g., sc, isync, or rfid). A context synchronizing interrupt (i.e., any interrupt except non-recoverable System Reset or non-recoverable Machine Check) can be used instead of a context synchronizing instruction. If it is, phrases like "the synchronizing instruction", below, should be interpreted as meaning the instruction at which the interrupt occurs. If no software synchronization is required before (after) a context-altering instruction, "the synchronizing instruction before (after) the context-altering instruction" should be interpreted as meaning the context-altering instruction itself. The synchronizing instruction before the context-altering instruction ensures that all instructions up to and including that synchronizing instruction are fetched and executed in the context that existed before the alteration. The synchronizing instruction after the context-altering instruction ensures that all instructions after that synchronizing instruction are fetched and executed in the context established by the alteration. Instructions after the first synchronizing instruction, up to and including the second synchronizing instruction, may be fetched or executed in either context. If a sequence of instructions contains context-altering instructions and contains no instructions that are affected by any of the context alterations, no software synchronization is required within the sequence. ### **Programming Note** - Sometimes advantage can be taken of the fact that certain events, such as interrupts, and certain instructions that occur naturally in the program, such as the *rfid* that returns from an interrupt handler, provide the required synchronization. No software synchronization is required before or after a context-altering instruction that is also context synchronizing or when altering the MSR in most cases (see the tables). No software synchronization is required before most of the other alterations shown in Table 2, because all instructions preceding the context-altering instruction are fetched and decoded before the context-altering instruction is executed (the processor must determine whether any of these preceding instructions are context synchronizing). Unless otherwise stated, the material in this chapter assumes a uniprocessor environment. | Instruction or<br>Event | Required<br>Before | Required<br>After | Notes | |-------------------------|--------------------|----------------------------|-------| | interrupt | none | none | | | rfid | none | none | | | hrfid | none | none | | | sc | none | none | | | Trap | none | none | | | mtmsrd (SF) | none | none | | | mtmsr[d] (PR) | none | none | | | mtmsr[d] (DR) | none | none | | | mtsr[in] | CSI | CSI | | | mtspr (SDR1) | ptesync | CSI | 3,4 | | mtspr (AMR) | CSI | CSI | | | mtspr (EAR) | CSI | CSI | | | mtspr (RMOR) | CSI | CSI | 13 | | mtspr (HRMOR) | CSI | CSI | 13 | | mtspr (LPCR) | CSI | CSI | 13 | | mtspr (DABR) | | | 2 | | mtspr (DABRX) | | | 2 | | slbie | CSI | CSI | | | slbia | CSI | CSI | | | slbmte | CSI | CSI | 11 | | tlbie | CSI | CSI | 5,7 | | tlbiel | CSI | ptesync | 5 | | tlbia | CSI | CSI | 5 | | Store(PTE) | none | { <b>ptesync</b> ,<br>CSI} | 6,7 | Table 1: Synchronization requirements for data access | Instruction or Event | Required<br>Before | Required<br>After | Notes | |----------------------|--------------------|-------------------|--------| | | | | | | interrupt | none | none | | | rfid | none | none | | | hrfid | none | none | | | sc | none | none | | | Trap | none | none | | | mtmsrd (SF) | none | none | 8 | | mtmsr[d] (EE) | none | none | 1 | | mtmsr[d] (PR) | none | none | 9 | | mtmsr[d] (FP) | none | none | | | mtmsr[d](FE0,FE1) | none | none | | | mtmsr[d] (SE, BE) | none | none | | | mtmsr[d] (IR) | none | none | 9 | | mtmsr[d] (RI) | none | none | | | mtsr[in] | none | CSI | 9 | | mtspr (DEC) | none | none | 10 | | mtspr (SDR1) | ptesync | CSI | 3,4 | | mtspr (CTRL) | none | none | | | mtspr (HDEC) | none | none | 10 | | mtspr (RMOR) | none | CSI | 13 | | mtspr (HRMOR) | none | CSI | 9,13 | | mtspr (LPCR) | none | CSI | 13, 14 | | mtspr (LPIDR) | CSI | CSI | 7,12 | | mtspr (PCR) | none | CSI | | | slbie | none | CSI | | | slbia | none | CSI | | | slbmte | none | CSI | 9,11 | | tlbie | none | CSI | 5,7 | | tlbiel | none | CSI | 5 | | tlbia | none | CSI | 5 | | Store(PTE) | none | {ptesync, CSI} | 6,7 | Table 2: Synchronization requirements for instruction fetch and/or execution ## Notes: - The effect of changing the EE bit is immediate, even if the *mtmsr[d]* instruction is not context synchronizing (i.e., even if L=1). - If an mtmsr[d] instruction sets the EE bit to 0, neither an External interrupt nor a Decrementer interrupt occurs after the mtmsr[d] is executed. - If an mtmsr[d] instruction changes the EE bit from 0 to 1 when an External, Decrementer, or higher priority exception exists, the corresponding interrupt occurs immediately after the mtmsr[d] is executed, and before the next instruction is executed in the program that set EE to 1. - If a hypervisor executes the mtmsr[d] instruction that sets the EE bit to 0, a Hypervisor Decrementer interrupt does not occur after mtmsr[d] is executed as long as the processor remains in hypervisor state. - If the hypervisor executes an *mtmsr[d]* instruction that changes the EE bit from 0 to 1 when a Hypervisor Decrementer or higher priority exception exists, the corresponding interrupt occurs immediately after the *mtmsr[d]* instruction is executed, and before the next instruction is executed, provided HDICE is 1. - 2. Synchronization requirements for this instruction are implementation-dependent. - SDR1 must not be altered when MSR<sub>DR</sub>=1 or MSR<sub>IR</sub>=1; if it is, the results are undefined. - 4. A ptesync instruction is required before the mtspr instruction because (a) SDR1 identifies the Page Table and thereby the location of Reference and Change bits, and (b) on some implementations, use of SDR1 to update Reference and Change bits may be independent of translating the virtual address. (For example, an implementation might identify the PTE in which to update the Reference and Change bits in terms of its offset in the Page Table, instead of its real address, and then add the Page Table address from SDR1 to the offset to determine the real address at which to update the bits.) To ensure that Reference and Change bits are updated in the correct Page Table, SDR1 must not be altered until all Reference and Change bit updates associated with address translations that were performed, by the processor executing the mtspr instruction, before the mtspr instruction is executed have been performed with respect to that processor. A ptesync instruction guarantees this synchronization of Reference and Change bit updates, while neither a context synchronizing operation nor the instruction fetching mechanism does so. - For data accesses, the context synchronizing instruction before the *tlbie*, *tlbiel*, or *tlbia* instruc- tion ensures that all preceding instructions that access data storage have completed to a point at which they have reported all exceptions they will cause. The context synchronizing instruction after the *tlbie*, *tlbiel*, or *tlbia* instruction ensures that storage accesses associated with instructions following the context synchronizing instruction will not use the TLB entry(s) being invalidated. (If it is necessary to order storage accesses associated with preceding instructions, or Reference and Change bit updates associated with preceding address translations, with respect to subsequent data accesses, a *ptesync* instruction must also be used, either before or after the *tlbie*, *tlbiel*, or *tlbia* instruction. These effects of the *ptesync* instruction are described in the last paragraph of Note 8.) The notation "{ptesync,CSI}" denotes an instruction sequence. Other instructions may be interleaved with this sequence, but these instructions must appear in the order shown. No software synchronization is required before the *Store* instruction because (a) stores are not performed out-of-order and (b) address translations associated with instructions preceding the *Store* instruction are not performed again after the store has been performed (see Section 5.5). These properties ensure that all address translations associated with instructions preceding the *Store* instruction will be performed using the old contents of the PTE. The **ptesync** instruction after the **Store** instruction ensures that all searches of the Page Table that are performed after the **ptesync** instruction completes will use the value stored (or a value stored subsequently). The context synchronizing instruction after the **ptesync** instruction ensures that any address translations associated with instructions following the context synchronizing instruction that were performed using the old contents of the PTE will be discarded, with the result that these address translations will be performed again and, if there is no corresponding entry in any implementation-specific address translation lookaside information, will use the value stored (or a value stored subsequently). The *ptesync* instruction also ensures that all storage accesses associated with instructions preceding the *ptesync* instruction, and all Reference and Change bit updates associated with additional address translations that were performed, by the processor executing the *ptesync* instruction, before the *ptesync* instruction is executed, will be performed with respect to any processor or mechanism, to the extent required by the associated Memory Coherence Required attributes, before any data accesses caused by instructions follow- ing the *ptesync* instruction are performed with respect to that processor or mechanism. 7. There are additional software synchronization requirements for this instruction in multiprocessor environments (e.g., it may be necessary to invalidate one or more TLB entries on all processors in the multiprocessor system and to be able to determine that the invalidations have completed and that all side effects of the invalidations have taken effect). Section 5.10 gives examples of using *tlbie*, *Store*, and related instructions to maintain the Page Table, in both multiprocessor and uniprocessor environments. ### **Programming Note -** In a multiprocessor system, if software locking is used to help ensure that the requirements described in Section 5.10 are satisfied, the *Iwsync* instruction near the end of the lock acquisition sequence (see Section B.2.1.1 of Book II) may naturally provide the context synchronization that is required before the alteration. - 8. The alteration must not cause an implicit branch in effective address space. Thus, when changing MSR<sub>SF</sub> from 1 to 0, the *mtmsrd* instruction must have an effective address that is less than 2<sup>32</sup> 4. Furthermore, when changing MSR<sub>SF</sub> from 0 to 1, the *mtmsrd* instruction must not be at effective address 2<sup>32</sup> 4 (see Section 5.3.2 on page 506). - 9. The alteration must not cause an implicit branch in real address space. Thus the real address of the context-altering instruction and of each subsequent instruction, up to and including the next context synchronizing instruction, must be independent of whether the alteration has taken effect. - The elapsed time between the contents of the Decrementer or Hypervisor Decrementer becoming negative and the signaling of the corresponding exception is not defined. - 11. If an *slbmte* instruction alters the mapping, or associated attributes, of a currently mapped ESID, the *slbmte* must be preceded by an *slbie* (or *slbia*) instruction that invalidates the existing translation. This applies even if the corresponding entry is no longer in the SLB (the translation may still be in implementation-specific address translation lookaside information). No software synchronization is needed between the *slbie* and the *slbmte*, regardless of whether the index of the SLB entry (if any) containing the current translation is the same as the SLB index specified by the *slb-mte*. No *slbie* (or *slbia*) is needed if the *slbmte* instruction replaces a valid SLB entry with a mapping of a different ESID (e.g., to satisfy an SLB miss). However, the *slbie* is needed later if and when the translation that was contained in the replaced SLB entry is to be invalidated. 12. The context synchronizing instruction before the *mtspr* instruction ensures that the LPIDR is not altered out-of-order. (Out-of-order alteration of the LPIDR could permit the requirements described in Section 5.10.1 to be violated. For the same reason, such a context synchronizing instruction may be needed even if the new LPID value is equal to the old LPID value.) See also Chapter 2. "Logical Partitioning (LPAR)" on page 471 regarding moving a processor from one partition to another. When the RMOR or HRMOR is modified, or the VC, VRMASD, RMLS, or LPES<sub>1</sub> fields of the LPCR are modified, software must invalidate all implementation-specific lookaside information used in address translation that depends on values stored in these registers. The *slbia* instruction can be used to invalidate all such implementation-specific lookaside information. 13. A context synchronizing instruction or event that is executed or occurs when LPCR<sub>MER</sub> = 1 does not necessarily ensure that the exception effects of LPCR<sub>MER</sub> are consistent with the contents of LPCR<sub>MER</sub>. See Section 2.2. ## **Appendix A. Assembler Extended Mnemonics** In order to make assembler language programs simpler to write and easier to understand, a set of extended mnemonics and symbols is provided for certain instructions. This appendix defines extended mnemonics and symbols related to instructions defined in Book III. Assemblers should provide the extended mnemonics and symbols listed here, and may provide others. ## A.1 Move To/From Special Purpose Register Mnemonics This section defines extended mnemonics for the *mtspr* and *mfspr* instructions, including the Special Purpose Registers (SPRs) defined in Book I and certain privileged SPRs, and for the *Move From Time Base* instruction defined in Book II. The *mtspr* and *mfspr* instructions specify an SPR as a numeric operand; extended mnemonics are provided that represent the SPR in the mnemonic rather than requiring it to be coded as an operand. Similar extended mnemonics are provided for the *Move From Time Base* instruction, which specifies the portion of the Time Base as a numeric operand. **Note:** *mftb* serves as both a basic and an extended mnemonic. The Assembler will recognize an *mftb* mnemonic with two operands as the basic form, and an *mftb* mnemonic with one operand as the extended form. In the extended form the TBR operand is omitted and assumed to be 268 (the value that corresponds to TR) ### **Programming Note** - The extended mnemonics in Table 3 for SPRs associated with the Performance Monitor facility are based on the definitions in Appendix B. Other versions of Performance Monitor facilities used different sets of SPR numbers (all 32-bit PowerPC processors used a different set, and some early Power ISA processors used yet a different set). | Created Burress Berister | Move ' | To SPR | Move Fr | om SPR <sup>1</sup> | |--------------------------------------------|-------------|----------------|-------------|----------------------------| | Special Purpose Register | Extended | Equivalent to | Extended | Equivalent to | | Fixed-Point Exception Register | mtxer Rx | mtspr 1,Rx | mfxer Rx | mfspr Rx,1 | | Link Register | mtlr Rx | mtspr 8,Rx | mflr Rx | mfspr Rx,8 | | Count Register | mtctr Rx | mtspr 9,Rx | mfctr Rx | mfspr Rx,9 | | Data Stream Control Register | mtdscr Rx | mtspr 17,Rx | mfdscr Rx | mfspr Rx,17 | | Data Storage Interrupt Status<br>Register | mtdsisr Rx | mtspr 18,Rx | mfdsisr Rx | mfspr Rx,18 | | Data Address Register | mtdar Rx | mtspr 19,Rx | mfdar Rx | mfspr Rx,19 | | Decrementer | mtdec Rx | mtspr 22,Rx | mfdec Rx | mfspr Rx,22 | | Storage Description Register 1 | mtsdr1 Rx | mtspr 25,Rx | mfsdr1 Rx | mfspr Rx,25 | | Save/Restore Register 0 | mtsrr0 Rx | mtspr 26,Rx | mfsrr0 Rx | mfspr Rx,26 | | Save/Restore Register 1 | mtsrr1 Rx | mtspr 27,Rx | mfsrr1 Rx | mfspr Rx,27 | | Come-From Address Register | mtcfar Rx | mtspr 28,Rx | mfcfar Rx | mfspr Rx,28 | | AMR | mtamr Rx | mtspr 29,Rx | mfamr Rx | mfspr Rx,29 | | CTRL | mtctrl Rx | mtspr 152,Rx | mfctrl Rx | mfspr Rx,13 | | Special Purpose Registers<br>G0 through G3 | mtsprg n,Rx | mtspr 272+n,Rx | mfsprg Rx,n | mfspr Rx,272 | | Time Base [Lower] | mttbl Rx | mtspr 284,Rx | mftb Rx | mftb Rx,268<br>mfspr Rx,26 | | Time Base Upper | mttbu Rx | mtspr 285,Rx | mftbu Rx | mftb Rx,269<br>mfspr Rx,26 | | Time Base Upper 40 | mttbu40 Rx | mtspr 286,Rx | - | - | | Processor Version Register | - | - | mfpvr Rx | mfspr Rx,28 | | HMER | mthmer Rx | mtspr 336,Rx | mfhmer Rx | mfspr Rx,33 | | HMEER | mthmeer Rx | mtspr 337,Rx | mfhmeer Rx | mfspr Rx,33 | | MMCRA | mtmmcra Rx | mtspr 786,Rx | mfmmcra Rx | mfspr Rx,77 | | PMC1 | mtpmc1 Rx | mtspr 787,Rx | mfpmc1 Rx | mfspr Rx,77 | | PMC2 | mtpmc2 Rx | mtspr 788,Rx | mfpmc2 Rx | mfspr Rx,77 | | PMC3 | mtpmc3 Rx | mtspr 789,Rx | mfpmc3 Rx | mfspr Rx,77 | | PMC4 | mtpmc4 Rx | mtspr 790,Rx | mfpmc4 Rx | mfspr Rx,77 | | PMC5 | mtpmc5 Rx | mtspr 791,Rx | mfpmc5 Rx | mfspr Rx,77 | | PMC6 | mtpmc6 Rx | mtspr 792,Rx | mfpmc6 Rx | mfspr Rx,77 | | MMCR0 | mtmmcr0 Rx | mtspr 795,Rx | mfmmcr0 Rx | mfspr Rx,77 | | MMCR1 | mtmmcr1 Rx | mtspr 798,Rx | mfmmcr1 Rx | mfspr Rx,78 | | PPR | mtppr Rx | mtspr 896, Rx | mfppr Rx | mfspr Rx, 89 | | Processor Identification Register | - | - | mfpir Rx | mfspr Rx,102 | The *mftb* instruction is Category: Server.Phased-Out. Assemblers targeting version 2.03 or later of the architecture should generate an *mfspr* instruction for the mftb and mftbu extended mnemonics; see the corresponding Assembler Note in the *mftb* instruction description (see Section 4.2.1 of Book II). ## **Appendix B. Example Performance Monitor** #### Note This Appendix describes an example implementation of a Performance Monitor. A subset of these requirements are being considered for inclusion in the Architecture as part of Category: Server.Performance Monitor. A Performance Monitor facility provides a means of collecting information about program and system performance. The resources (e.g., SPR numbers) that a Performance Monitor facility may use are identified elsewhere in this Book. All other aspects of any Performance Monitor facility are implementation-dependent. This appendix provides an example of a Performance Monitor facility. It is only an example; implementations may provide all, some, or none of the features described here, or may provide features that are similar to those described here but differ in detail. ## **Programming Note** Because the features provided by a Performance Monitor facility are implementation-dependent, operating systems should provide services that support the useful performance monitoring functions in a generic fashion. Application programs should use these services, and should not depend on the features provided by a particular implementation. The example Performance Monitor facility consists of the following features (described in detail in subsequent sections). - one MSR bit - PMM (Performance Monitor Mark), which can be used to select one or more programs for monitoring - SPRs - PMC1 PMC6 (Performance Monitor Counter registers 1 - 6), which count events - MMCR0, MMCR1, and MMCRA (Monitor Mode Control Registers 0, 1, and A), which control the Performance Monitor facility - SIAR and SDAR (Sampled Instruction Address Register and Sampled Data Address Register), which contain the address of the "sampled instruction" and of the "sampled data" - the Performance Monitor interrupt, which can be caused by monitored conditions and events The minimal subset of the features that makes the resulting Performance Monitor useful to software consists of MSR<sub>PMM</sub>, PMC1, PMC2, PMC3, PMC4, MMCR0, MMCR1, and MMCRA and certain bits and fields of these three Monitor Mode Control Registers, and the Performance Monitor Interrupt. These features are identified as the "basic" features below. The remaining features (the remaining SPRs, and the remaining bits and fields in the three Monitor Mode Control Registers) are considered "extensions". The events that can be counted in the PMCs as well as the code that identifies each event are implementation-dependent. The events and codes may vary between PMCs, as well as between implementations. For the programmable PMCs, the event to be counted is selected by specifying the appropriate code in the MMCR "Selector" field for the PMC. Some events may include operations that are performed out-of-order. Many aspects of the operation of the Performance Monitor are summarized by the following hierarchy, which is described starting at the lowest level. - A "counter negative condition" exists when the value in a PMC is negative (i.e., when bit 0 of the PMC is 1). A "Time Base transition event" occurs when a selected bit of the Time Base changes from 0 to 1 (the bit is selected by an MMCR field). The term "condition or event" is used as an abbreviation for "counter negative condition or Time Base transition event". A condition or event can be caused implicitly by the processor (e.g., incrementing a PMC) or explicitly by software (*mtspr*). - A condition or event is enabled if the corresponding "Enable" bit in an MMCR is 1. The occurrence of an enabled condition or event can have side effects within the Performance Monitor, such as causing the PMCs to cease counting. - An enabled condition or event causes a Performance Monitor alert if Performance Monitor alerts are enabled by the corresponding "Enable" bit in an MMCR. A single Performance Monitor alert may reflect multiple enabled conditions and events. A Performance Monitor alert causes a Performance Monitor exception. The exception effects of the Performance Monitor are said to be consistent with the contents of MMCR0<sub>PMAO</sub> if one of the following statements is true. (MMCR0<sub>PMAO</sub> reflects the occurrence of Performance Monitor alerts; see the definition of that bit in Section B.2.2.) - MMCR0<sub>PMAO</sub>=0 and a Performance Monitor exception does not exist. - MMCR0<sub>PMAO</sub>=1 and a Performance Monitor exception exists. A context synchronizing instruction or event that occurs when MMCR0 $_{PMAO}$ =0 ensures that the exception effects of the Performance Monitor are consistent with the contents of MMCR0 $_{PMAO}$ . Even without software synchronization, when the contents of MMCR0<sub>PMAO</sub> change, the exception effects of the Performance Monitor become consistent with the new contents of MMCR0<sub>PMAO</sub> sufficiently soon that the Performance Monitor facility is useful to software for its intended purposes. A Performance Monitor exception causes a Performance Monitor interrupt when MSR<sub>FF</sub>=1. ## Programming Note The Performance Monitor can be effectively disabled (i.e., put into a state in which Performance Monitor SPRs are not altered and Performance Monitor interrupts do not occur) by setting MMCR0 to 0x0000\_0000\_8000\_0000. # B.1 PMM Bit of the Machine State Register The Performance Monitor uses MSR bit PMM, which is defined as follows. ### Bit Description #### 61 **Performance Monitor Mark (PMM)** This bit is a basic feature. This bit contains the Performance Monitor "mark" (0 or 1). ## - Programming Note - Software can use this bit as a process-specific marker which, in conjunction with MMCR0<sub>FCM0</sub> FCM1 (see Section B.2.2), permits events to be counted on a process-specific basis. (The bit is saved by interrupts and restored by *rfid.*) Common uses of the PMM bit include the following. - Count events for a few selected processes. This use requires the following bit settings. - MSR<sub>PMM</sub>=1 for the selected processes, MSR<sub>PMM</sub>=0 for all other processes - MMCR0<sub>FCM0</sub>=1 - MMCR0<sub>FCM1</sub>=0 - Count events for all but a few selected processes. This use requires the following bit settings. - MSR<sub>PMM</sub>=1 for the selected processes, MSR<sub>PMM</sub>=0 for all other processes - MMCR0<sub>FCM0</sub>=0 - MMCR0<sub>FCM1</sub>=1 Notice that for both of these uses a mark value of 1 identifies the "few" processes and a mark value of 0 identifies the remaining "many" processes. Because the PMM bit is set to 0 when an interrupt occurs (see Figure 43 on page 555), interrupt handlers are treated as one of the "many". If it is desired to treat interrupt handlers as one of the "few", the mark value convention just described would be reversed. ## **B.2 Special Purpose Registers** The Performance Monitor SPRs count events, control the operation of the Performance Monitor, and provide associated information. The Performance Monitor SPRs can be read and written using the *mfspr* and *mtspr* instructions (see Section 4.4.5, "Move To/From System Register Instructions" on page 496). The Performance Monitor SPR numbers are shown in Figure 54. Writing any of the Performance Monitor SPRs is privileged. Reading any of the Performance Monitor SPRs is *not* privileged (however, the privileged SPR numbers used to write the SPRs can also be used to read them; see the figure). The elapsed time between the execution of an instruction and the time at which events due to that instruction have been reflected in Performance Monitor SPRs is not defined. No means are provided by which software can ensure that all events due to preceding instructions have been reflected in Performance Monitor SPRs. Similarly, if the events being monitored may be caused by operations that are performed out-of-order, no means are provided by which software can prevent such events due to subsequent instructions from being reflected in Performance Monitor SPRs. Thus the contents obtained by reading a Performance Monitor SPR may not be precise: it may fail to reflect some events due to instructions that precede the *mfspr* and may reflect some events due to instructions that follow the *mfspr*. This lack of precision applies regardless of whether the state of the processor is such that the SPR is subject to change by the processor at the time the *mfspr* is executed. Similarly, if an *mtspr* instruction is executed that changes the contents of the Time Base, the change is not guaranteed to have taken effect with respect to causing Time Base transition events until after a subsequent context synchronizing instruction has been executed. If an *mtspr* instruction is executed that changes the value of a Performance Monitor SPR other than SIAR or SDAR, the change is not guaranteed to have taken effect until after a subsequent context synchronizing instruction has been executed (see Chapter 10. "Synchronization Requirements for Context Alterations" on page 585). ### Programming Note - Depending on the events being monitored, the contents of Performance Monitor SPRs may be affected by aspects of the runtime environment (e.g., cache contents) that are not directly attributable to the programs being monitored. | SPR <sup>1,2</sup> | Register | Privi- | |---------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | spr <sub>5:9</sub> spr <sub>0:4</sub> | Name | leged | | 11000 n0010 | MMCRA | no,yes | | 11000 n0011 | PMC1 | no,yes | | 11000 n0100 | PMC2 | no,yes | | 11000 n0101 | PMC3 | no,yes | | 11000 n0110 | PMC4 | no,yes | | 11000 n0111 | PMC5 | no,yes | | 11000 n1000 | PMC6 | no,yes | | | | | | 11000 n1011 | MMCR0 | no,yes | | 11000 n1100 | SIAR | no,yes | | 11000 n1101 | SDAR | no,yes | | 11000 n1110 | MMCR1 | no,yes | | | 11000 n0010<br>11000 n0011<br>11000 n0100<br>11000 n0101<br>11000 n0110<br>11000 n1000<br>11000 n1011<br>11000 n1101<br>11000 n1100 | spr <sub>5:9</sub> spr <sub>0:4</sub> Name 11000 n0010 MMCRA 11000 n0011 PMC1 11000 n0100 PMC2 11000 n0101 PMC3 11000 n0110 PMC4 11000 n0111 PMC5 11000 n1000 PMC6 11000 n1011 MMCR0 11000 n1100 SIAR 11000 n1101 SDAR | Note that the order of the two 5-bit halves of the SPR number is reversed. Figure 54. Performance Monitor SPR encodings for mfspr | | SPR <sup>1</sup> | Register | Privi- | |---------|---------------------------------------|----------|--------| | decimal | spr <sub>5:9</sub> spr <sub>0:4</sub> | Name | leged | | 786 | 11000 10010 | MMCRA | yes | | 787 | 11000 10011 | PMC1 | yes | | 788 | 11000 10100 | PMC2 | yes | | 789 | 11000 10101 | PMC3 | yes | | 790 | 11000 10110 | PMC4 | yes | | 791 | 11000 10111 | PMC5 | yes | | 792 | 11000 11000 | PMC6 | yes | | 795 | 11000 11011 | MMCR0 | yes | | 796 | 11000 11100 | SIAR | yes | | 797 | 11000 11101 | SDAR | yes | | 798 | 11000 11110 | MMCR1 | yes | Note that the order of the two 5-bit halves of the SPR number is reversed. Figure 55. Performance Monitor SPR encodings for mtspr # **B.2.1 Performance Monitor Counter Registers** The six Performance Monitor Counter registers, PMC1 through PMC6, are 32-bit registers that count events. | | PMC1 | | |----|------|----| | | PMC2 | | | | PMC3 | | | | PMC4 | | | | PMC5 | | | | PMC6 | | | | | | | 32 | | 63 | Figure 56. Performance Monitor Counter registers PMC1, PMC2, PMC3, and PMC4 are basic features. PMC5 and PMC6 are not programmable. PMC5 counts instructions completed and PMC6 counts cycles. Normally each PMC is incremented each processor cycle by the number of times the corresponding event occurred in that cycle. Other modes of incrementing may also be provided (e.g., see the description of MMCR1 bits PMC1HIST and PMCjHIST). "PMCj" is used as an abbreviation for "PMCi, i > 1". ### Programming Note - PMC5 and PMC6 are defined to facilitate calculating basic performance metrics such as cycles per instruction (CPI). Reading the SPR is privileged if and only if n=1. ## Programming Note - Software can use a PMC to "pace" the collection of Performance Monitor data. For example, if it is desired to collect event counts every n cycles, software can specify that a particular PMC count cycles and set that PMC to 0x8000\_0000 - n. The events of interest would be counted in other PMCs. The counter negative condition that will occur after n cycles can, with the appropriate setting of MMCR bits, cause counter values to become frozen, cause a Performance Monitor interrupt to occur, etc. # B.2.2 Monitor Mode Control Register 0 Monitor Mode Control Register 0 (MMCR0) is a 64-bit register. This register, along with MMCR1 and MMCRA, controls the operation of the Performance Monitor. MMCR0 0 63 ### Figure 57. Monitor Mode Control Register 0 MMCR0 is a basic feature. Within MMCR0, some of the bits and fields are basic features and some are extensions. The basic bits and fields are identified as such, below. Some bits of MMCR0 are altered by the processor when various events occur, as described below. The bit definitions of MMCR0 are as follows. MMCR0 bits that are not implemented are treated as reserved. #### Bit(s) Description 0:31 Reserved #### 32 Freeze Counters (FC) This bit is a basic feature. - 0 The PMCs are incremented (if permitted by other MMCR bits). - 1 The PMCs are not incremented. The processor sets this bit to 1 when an enabled condition or event occurs and $\mbox{MMCR0}_{\mbox{FCECE}}=1$ . ### 33 Freeze Counters in Privileged State (FCS) This bit is a basic feature. - The PMCs are incremented (if permitted by other MMCR bits). - 1 The PMCs are not incremented if MSR<sub>HV PR</sub>=0b00. ## 34 Freeze Counters in Problem State (FCP) This bit is a basic feature. - O The PMCs are incremented (if permitted by other MMCR bits). - 1 The PMCs are not incremented if MSR<sub>PR</sub>=1. #### 35 Freeze Counters while Mark = 1 (FCM1) This bit is a basic feature. - 0 The PMCs are incremented (if permitted by other MMCR bits). - 1 The PMCs are not incremented if MSR<sub>PMM</sub>=1. ## 36 Freeze Counters while Mark = 0 (FCM0) This bit is a basic feature. - The PMCs are incremented (if permitted by other MMCR bits). - 1 The PMCs are not incremented if MSR<sub>PMM</sub>=0. ### 37 **Performance Monitor Alert Enable (PMAE)** This bit is a basic feature. - 0 Performance Monitor alerts are disabled. - 1 Performance Monitor alerts are enabled until a Performance Monitor alert occurs, at which time: - MMCR0<sub>PMAE</sub> is set to 0 - MMCR0<sub>PMAO</sub> is set to 1 ## **Programming Note -** Software can set this bit and MMCR0<sub>PMAO</sub> to 0 to prevent Performance Monitor interrupts. Software can set this bit to 1 and then poll the bit to determine whether an enabled condition or event has occurred. This is especially useful for software that runs with MSR<sub>FF</sub>=0. In earlier versions of the architecture that lacked the concept of Performance Monitor alerts, this bit was called Performance Monitor Exception Enable (PMXE). ## 38 Freeze Counters on Enabled Condition or Event (FCECE) - The PMCs are incremented (if permitted by other MMCR bits). - 1 The PMCs are incremented (if permitted by other MMCR bits) until an enabled condition or event occurs when MMCR0<sub>TRIGGER</sub>=0, at which time: - MMCR0<sub>FC</sub> is set to 1 If the enabled condition or event occurs when MMCR0<sub>TRIGGER</sub>=1, the FCECE bit is treated as if it were 0. ## 39:40 Time Base Selector (TBSEL) This field selects the Time Base bit that can cause a Time Base transition event (the event occurs when the selected bit changes from 0 to 1). - 00 Time Base bit 63 is selected. - 01 Time Base bit 55 is selected. - 10 Time Base bit 51 is selected. - 11 Time Base bit 47 is selected. ### **Programming Note** Time Base transition events can be used to collect information about processor activity, as revealed by event counts in PMCs and by addresses in SIAR and SDAR, at periodic intervals. In multiprocessor systems in which the Time Base registers are synchronized among the processors, Time Base transition events can be used to correlate the Performance Monitor data obtained by the several processors. For this use, software must specify the same TBSEL value for all the processors in the system. Because the frequency of the Time Base is implementation-dependent, software should invoke a system service program to obtain the frequency before choosing a value for TBSEL. #### 41 **Time Base Event Enable (TBEE)** - 0 Time Base transition events are disabled. - 1 Time Base transition events are enabled. #### 42:47 Reserved ## 48 **PMC1 Condition Enable** (PMC1CE) This bit controls whether counter negative conditions due to a negative value in PMC1 are enabled. - Counter negative conditions for PMC1 are - Counter negative conditions for PMC1 are enabled. ## 49 **PMCj Condition Enable** (PMCjCE) This bit controls whether counter negative conditions due to a negative value in any PMCj (i.e., in any PMC except PMC1) are enabled. - Counter negative conditions for all PMCjs are disabled. - Counter negative conditions for all PMCjs are enabled. #### 50 **Trigger** (TRIGGER) 0 The PMCs are incremented (if permitted by other MMCR bits). - 1 PMC1 is incremented (if permitted by other MMCR bits). The PMCjs are not incremented until PMC1 is negative or an enabled condition or event occurs, at which time: - the PMCjs resume incrementing (if permitted by other MMCR bits) - MMCR0<sub>TRIGGER</sub> is set to 0 See the description of the FCECE bit, above, regarding the interaction between TRIGGER and FCECE. ## Programming Note – Uses of TRIGGER include the following. - Resume counting in the PMCjs when PMC1 becomes negative, without causing a Performance Monitor interrupt. Then freeze all PMCs (and optionally cause a Performance Monitor interrupt) when a PMCj becomes negative. The PMCjs then reflect the events that occurred between the time PMC1 became negative and the time a PMCj becomes negative. This use requires the following MMCR0 bit settings. - TRIGGER=1 - PMC1CE=0 - PMCjCE=1 - TBEE=0 - FCECE=1 - PMAE=1 (if a Performance Monitor interrupt is desired) - Resume counting in the PMCjs when PMC1 becomes negative, and cause a Performance Monitor interrupt without freezing any PMCs. The PMCjs then reflect the events that occurred between the time PMC1 became negative and the time the interrupt handler reads them. This use requires the following MMCR0 bit settings. - TRIGGER=1 - PMC1CE=1 - TBEE=0 - FCECE=0 - PMAE=1 - 51:52 Setting is implementation-dependent. - 53:55 Reserved - 56 **Performance Monitor Alert Occurred** (PMAO) This bit is a basic feature. - O A Performance Monitor alert has not occurred since the last time software set this bit to 0. - A Performance Monitor alert has occurred since the last time software set this bit to 0. This bit is set to 1 by the processor when a Performance Monitor alert occurs. This bit can be set to 0 only by the *mtspr* instruction. ## **Programming Note** - Software can set this bit to 1 to simulate the occurrence of a Performance Monitor alert. Software should set this bit to 0 after handling the Performance Monitor alert. - 57 Setting is implementation-dependent. - 58 Freeze Counters 1-4 (FC1-4) - PMC1 PMC4 are incremented (if permitted by other MMCR bits). - 1 PMC1 PMC4 are not incremented. - 59 **Freeze Counters 5-6** (FC5-6) - 0 PMC5 PMC6 are incremented (if permitted by other MMCR bits). - 1 PMC5 PMC6 are not incremented. #### 60:61 Reserved ## 62 Freeze Counters in Wait State (FCWAIT) This bit is a basic feature. - 0 The PMCs are incremented (if permitted by other MMCR bits). - 1 The PMCs are not incremented if CTRL<sub>31</sub>=0. Software is expected to set CTRL<sub>31</sub>=0 when it is in a "wait state", i.e, when there is no process ready to run. Only Branch Unit type of events do not increment if $CTRL_{31}=0$ . Other units continue to count. ## 63 Freeze Counters in Hypervisor State (FCH) This bit is a basic feature. - 0 The PMCs are incremented (if permitted by other MMCR bits). - 1 The PMCs are not incremented if MSR<sub>HV PR</sub>=0b10. ## B.2.3 Monitor Mode Control Register 1 Monitor Mode Control Register 1 (MMCR1) is a 64-bit register. This register, along with MMCR0 and MMCRA, controls the operation of the Performance Monitor. MMCR1 0 63 ### Figure 58. Monitor Mode Control Register 1 MMCR1 is a basic feature. Within MMCR1, some of the bits and fields are basic features and some are extensions. The basic bits and fields are identified as such, below. Some bits of MMCR1 are altered by the processor when various events occur, as described below. The bit definitions of MMCR1 are as follows. MMCR1 bits that are not implemented are treated as reserved. ## Bit(s) Description 0:31 Implementation-Dependent Use These bits have implementation-dependent uses (e.g., extended event selection). 32:39 *PMC1 Selector* (PMC3SEL) 40:47 *PMC2 Selector* (PMC4SEL) 48:55 *PMC3 Selector* (PMC5SEL) 56:63 *PMC4 Selector* (PMC6SEL) Each of these fields contains a code that identifies the event to be counted by PMCs 1 through 4 respectively. PMC Selectors are basic features. ## Compatibility Note In versions of the architecture that precede Version 2.02 the PMC Selector Fields were six bits long, and were split between MMCR0 and MMCR1. PMC1-8 were all programmable. If more programmable PMCs are implemented in the future, additional MMCRs may be defined to cover the additional selectors. ## B.2.4 Monitor Mode Control Register A Monitor Mode Control Register A (MMCRA) is a 64-bit register. This register, along with MMCR0 and MMCR1, controls the operation of the Performance Monitor. MMCRA 63 #### Figure 59. Monitor Mode Control Register A MMCRA is a basic feature. Within MMCRA, some of the bits and fields are basic features and some are extensions. The basic bits and fields are identified as such, below. Some bits of MMCRA are altered by the processor when various events occur, as described below. The bit definitions of MMCRA are as follows. MMCRA bits that are not implemented are treated as reserved. ## Bit(s) Description 0:31 Reserved ## 32 Contents of SIAR and SDAR Are Related (CSSR) Set to 1 by the processor if the contents of SIAR and SDAR are associated with the same instruction; otherwise set to 0. 33:34 Setting is implementation-dependent. ## 35 **Sampled MSR**<sub>HV</sub> (SAMPHV) Value of MSR<sub>HV</sub> when the Performance Monitor Alert occurred. ## 36 Sampled MSR<sub>PR</sub> (SAMPPR) Value of MSR<sub>PR</sub> when the Performance Monitor Alert occurred. 37:47 Setting is implementation-dependent. ### 48:53 *Threshold* (THRESHOLD) This field contains a "threshold value", which is a value such that only events that exceed the value are counted. The events to which a threshold value can apply are implementation-dependent, as are the dimension of the threshold (e.g., duration in cycles) and the granularity with which the threshold value is interpreted. ## **Programming Note** By varying the threshold value, software can obtain a profile of the characteristics of the events subject to the threshold. For example, if PMC1 counts the number of cache misses for which the duration exceeds the threshold value, then software can obtain the distribution of cache miss durations for a given program by monitoring the program repeatedly using a different threshold value each time. 54:59 Reserved for implementation-specific use. #### 60:62 Reserved 63 Setting is implementation-dependent. # **B.2.5 Sampled Instruction Address Register** The Sampled Instruction Address Register (SIAR) is a 64-bit register. It contains the address of the "sampled instruction" when a Performance Monitor alert occurs. ## Figure 60. Sampled Instruction Address Register When a Performance Monitor alert occurs, SIAR is set to the effective address of an instruction that was being executed, possibly out-of-order, at or around the time that the Performance Monitor alert occurred. This instruction is called the "sampled instruction". The contents of SIAR may be altered by the processor if and only if MMCR0<sub>PMAE</sub>=1. Thus after the Performance Monitor alert occurs, the contents of SIAR are not altered by the processor until software sets MMCR0<sub>PMAE</sub> to 1. After software sets MMCR0<sub>PMAE</sub> to 1, the contents of SIAR are undefined until the next Performance Monitor alert occurs. See Section B.4 regarding the effects of the Trace facility on SIAR. ### Programming Note - If the Performance Monitor alert causes a Performance Monitor interrupt, the value of MSR<sub>HV PR</sub> that was in effect when the sampled instruction was being executed is reported in MMCRA. # **B.2.6 Sampled Data Address Register** The Sampled Data Address Register (SDAR) is a 64-bit register. It contains the address of the "sampled data" when a Performance Monitor alert occurs. ## Figure 61. Sampled Data Address Register When a Performance Monitor alert occurs, SDAR is set to the effective address of the storage operand of an instruction that was being executed, possibly out-of-order, at or around the time that the Performance Monitor alert occurred. This storage operand is called the "sampled data". The sampled data may be, but need not be, the storage operand (if any) of the sampled instruction (see Section B.2.5). The contents of SDAR may be altered by the processor if and only if MMCR0<sub>PMAE</sub>=1. Thus after the Performance Monitor alert occurs, the contents of SDAR are not altered by the processor until software sets ${\rm MMCR0_{PMAE}}$ to 1. After software sets ${\rm MMCR0_{PMAE}}$ to 1, the contents of SDAR are undefined until the next Performance Monitor alert occurs. See Section B.4 regarding the effects of the Trace facility on SDAR. ## Programming Note If the Performance Monitor alert causes a Performance Monitor interrupt, MMCRA indicates whether the sampled data is the storage operand of the sampled instruction. # B.3 Performance Monitor Interrupt The Performance Monitor interrupt is a system caused interrupt (Section 6.4). It is masked by MSR<sub>EE</sub> in the same manner that External and Decrementer interrupts are. The Performance Monitor interrupt is a basic feature. A Performance Monitor interrupt occurs when no higher priority exception exists, a Performance Monitor exception exists, and MSR<sub>FF</sub>=1. If multiple Performance Monitor exceptions occur before the first causes a Performance Monitor interrupt, the interrupt reflects the most recent Performance Monitor exception and the preceding Performance Monitor exceptions are lost. The following registers are set: SRR0 Set to the effective address of the instruc- tion that the processor would have attempted to execute next if no interrupt conditions were present. SRR1 33:36 and 42:47 Implementation-specific. **Others** Loaded from the MSR. MSR See Figure 43 on page 555. SIAR Set to the effective address of the "sampled instruction" (see Section B.2.5). **SDAR** Set to the effective address of the "sampled data" (see Section B.2.6). Execution resumes at effective address 0x0000 0000 0000 0F00. In general, statements about External and Decrementer interrupts elsewhere in this Book apply also to the Performance Monitor interrupt; for example, if a Performance Monitor exception exists when an *mtm-srd[d]* instruction is executed that changes MSR<sub>EE</sub> from 0 to 1, the Performance Monitor interrupt will occur before the next instruction is executed (if no higher priority exception exists). The priority of the Performance Monitor exception is equal to that of the External, Decrementer, and Hypervisor Decrementer exceptions (i.e., the processor may generate any one of the four interrupts for which an exception exists) (see Section 6.7.2, "Ordered Exceptions" on page 571 and Section 6.8, "Interrupt Priorities" on page 571). # B.4 Interaction with the Trace Facility If the Trace facility includes setting SIAR and SDAR (see Appendix C, "Example Trace Extensions" on page 599), and tracing is active (MSR $_{\rm SE}$ =1) or MSR $_{\rm BE}$ =1), the contents of SIAR and SDAR as used by the Performance Monitor facility are undefined and may change even when MMCR0 $_{\rm PMAE}$ =0. ## Programming Note — A potential combined use of the Trace and Performance Monitor facilities is to trace the control flow of a program and simultaneously count events for that program. ## **Appendix C. Example Trace Extensions** #### Note This Appendix describes an example implementation of Trace Extensions. A subset of these requirements are being considered for inclusion in the Architecture as part of Category: Trace. This appendix provides an example of extensions that may be added to the Trace facility described in Section 6.5.14, "Trace Interrupt [Category: Trace]" on page 565. It is only an example; implementations may provide all, some, or none of the features described here, or may provide features that are similar to those described here but differ in detail. The extensions consist of the following features (described in detail below). - use of MSR<sub>SE BE</sub>=0b11 to specify new causes of Trace interrupts - specification of how certain SRR1 bits are set when a Trace interrupt occurs - setting of SIAR and SDAR (see Appendix B, "Example Performance Monitor" on page 591) when a Trace interrupt occurs ## $MSR_{SEBE} = 0b11$ If MSR<sub>SE BE</sub>=0b11, the processor generates a Trace exception under the conditions described in Section 6.5.14 for MSR<sub>SE BE</sub>=0b01, and also after successfully completing the execution of any instruction that would cause at least one of SRR1 bits 33:36, 42, and 44:46 to be set to 1 (see below) if the instruction were executed when MSR<sub>SE BE</sub>=0b10. This overrides the implicit statement in Section 6.5.14 that the effects of MSR $_{\rm SE}$ BE=0b11 are the same as those of MSR $_{\rm SE}$ BE=0b10. ## SRR1 When a Trace interrupt occurs, the SRR1 bits that are not loaded from the MSR are set as follows instead of as described in Section 6.5.14. 33 Set to 1 if the traced instruction is *icbi*; otherwise set to 0. - 34 Set to 1 if the traced instruction is dcbt, dcbtst, dcbz, dcbst, dcbf[I]; otherwise set to 0. - Set to 1 if the traced instruction is a Load instruction or *eciwx*; may be set to 1 if the traced instruction is *icbi*, *dcbt*, *dcbtst*, *dcbst*, *dcbf*, *dcbtst*, - 36 Set to 1 if the traced instruction is a Store instruction, dcbz, or ecowx; otherwise set to 0. - 42 Set to 1 if the traced instruction is *Iswx* or *stswx*: otherwise set to 0. - 43 Implementation-dependent. - Set to 1 if the traced instruction is a *Branch* instruction and the branch is taken; otherwise set to 0. - 45 Set to 1 if the traced instruction is eciwx or ecowx; otherwise set to 0. - Set to 1 if the traced instruction is *Iwarx*, *Idarx*, *stwcx*., or *stdcx*.; otherwise set to 0. - 47 Implementation-dependent. ## SIAR and SDAR If the Performance Monitor facility is implemented and includes SIAR and SDAR (see Appendix B), the following additional registers are set when a Trace interrupt occurs: SIAR Set to the effective address of the traced instruction. SDAR Set to the effective address of the storage operand (if any) of the traced instruction; otherwise undefined. If the state of the Performance Monitor is such that the Performance Monitor may be altering these registers (i.e., if MMCR0<sub>PMAE</sub>=1), the contents of SIAR and SDAR as used by the Trace facility are undefined and may change even when no Trace interrupt occurs. # Appendix D. Interpretation of the DSISR as Set by an Alignment Interrupt For most causes of Alignment interrupt, the interrupt handler will emulate the interrupting instruction. To do this, it needs the following characteristics of the interrupting instruction: Load or store Length (halfword, word, doubleword) String, multiple, or elementary Fixed-point or floating-point Update or non-update Byte reverse or not Is it *dcbz*? The Power ISA optionally provides this information by setting bits in the DSISR that identify the interrupting instruction type. It is not necessary for the interrupt handler to load the interrupting instruction from storage. The mapping is unique except for a few exceptions that are discussed below. The near-uniqueness depends on the fact that many instructions, such as the fixed-and floating-point arithmetic instructions and the one-byte loads and stores, cannot cause an Alignment interrupt. See Section 6.5.8 for a description of how the opcode and extended opcode are mapped to a DSISR value for an X-, D-, or DS-form instruction that causes an Alignment interrupt. The table on the next page shows the inverse mapping: how the DSISR bits identify the interrupting instruction. The following notes are cited in the table. 1. The instructions *Iwz* and *Iwarx* give the same DSISR bits (all zero). But if *Iwarx* causes an Alignment interrupt, it should not be emulated. It is adequate for the Alignment interrupt handler simply to treat the instruction as if it were *Iwz*. The emulator must use the address in the DAR, rather than compute it from RA/RB/D, because *Iwz* and *Iwarx* have different instruction formats. If opcode 0 ("Illegal or Reserved") can cause an Alignment interrupt, it will be indistinguishable to the interrupt handler from *Iwarx* and *Iwz*. These are distinguished by DSISR bits 44:45, which are not shown in the table. The interrupt handler has no need to distinguish between an X-form instruction and the corresponding D- or DS-form instruction if one exists, and vice versa. Therefore two such instructions may yield the same DSISR value (all 32 bits). For example, *stw* and *stwx* may both yield either the DSISR value shown in the following table for *stw*, or that shown for *stwx*. | | then it is | or D/ | | | | then it is | or D/ | | |------------------------|--------------------------|-------------|------------------------|---|------------------------|--------------------------|---------|------------------------| | | either X- | DS- | | | | either X- | DS- | | | If DSISR | form | form | | | If DSISR | form | form | | | 47:53 is: | opcode: | opcode: | so the instruction is: | | 47:53 is: | opcode: | opcode: | so the instruction is: | | 00 0 0000 | 00000xxx00 | x00000 | lwarx,lwz,reserved(1) | | 1000000 | | | - | | 00 0 0001 | 00010xxx00 | x00010 | ldarx | | 1000001 | 00010xxx10 | | - | | | | | stw | | 1000010 | | | stwcx. | | 00 0 0011 | 00110xxx00 | | - | | 10 0 0011 | | | stdcx. | | 00 0 0100 | 01000xxx00 | | lhz | | 1000100 | | | - | | 00 0 0101 | 01010xxx00<br>01100xxx00 | | lha | | 10 0 0101 | | | - | | 00 0 0 110 | 011100xxx00 | | sth<br>Imw | | 1000110 | 011100xxx10 | | <del>-</del> | | 00 0 1000 | 10000xxx00 | | lfs | | 10 0 1000 | | | lwbrx | | 00 0 1001 | 10010xxx00 | x10010 | lfd | | 10 0 1001 | | | - | | 0001010 | 10100xxx00 | | stfs | | 1001010 | | | stwbrx | | 00 0 1011 | 10110xxx00 | x10110 | stfd | | 10 0 1011 | 10110xxx10 | | - | | 00 0 1100 | 11000xxx00 | | <del>-</del> | | 10 0 1100 | 11000xxx10 | | Ihbrx | | 00 0 1101 | 11010xxx00 | | ld, Idu, Iwa (2) | | 10 0 1101 | 11010xxx10 | | - | | 00 0 1110 | 11100xxx00 | | - otd otdu (2) | | 1001110 | 11100xxx10 | | sthbrx | | 00 0 1111 | 11110xxx00<br>00001xxx00 | | std, stdu (2)<br>Iwzu | | 10 0 1111 | 11110xxx10<br>00001xxx10 | | - | | 00 1 0000 | 00001xxx00 | | IWZU | | 10 1 0000 | 00001xxx10 | | [ | | 00 1 0001 | 00101xxx00 | | stwu | | 10 1 0001 | | | - | | 00 1 0011 | 00111xxx00 | | - | | 10 1 0011 | 00111xxx10 | | - | | 00 1 0100 | 01001xxx00 | | lhzu | | 10 1 0100 | 01001xxx10 | | eciwx | | 00 1 0101 | 01011xxx00 | | lhau | | 10 1 0101 | | | - | | 00 1 0110 | 01101xxx00 | | sthu | | 10 1 0110 | | | ecowx | | 00 1 0111 | 011111xxx00 | | stmw | | 10 1 0111 | | | - | | 00 1 1000 | 10001xxx00 | | lfsu | | 10 1 1000 | | | - | | 00 1 1001 | 10011xxx00 | | lfdu | | 10 1 1001 | 10011xxx10 | | - | | 00 1 1010<br>00 1 1011 | 10101xxx00<br>10111xxx00 | | stfsu<br>stfdu | | 10 1 1010<br>10 1 1011 | 10101xxx10<br>10111xxx10 | | - | | | | | | | 10 1 1011 | | | - | | 00 1 1100<br>00 1 1101 | 11001xxx00 | | lfdp<br>- | | 10 1 1101 | 11011xxx10 | | - | | 00 1 1101 | 11011xxx00<br>11101xxx00 | | stfdp | | 10 1 1110 | | | - | | 00 1 1110 | 111111xxx00 | x11111 | -<br>- | | 10 1 1111 | 111111xxx10 | | dcbz | | 01 0 0000 | 00000xxx01 | X 1 1 1 1 1 | ldx | | 11 0 0000 | | | lwzx | | 01 0 0001 | 00010xxx01 | | - | | 11 0 0001 | | | | | 01 0 0010 | 00100xxx01 | | stdx | | | 00100xxx11 | | stwx | | 01 0 0011 | 00110xxx01 | | - | | 11 0 0011 | 00110xxx11<br>01000xxx11 | | -<br>lhav | | 01 0 0100 | 01000xxx01 | | <del>.</del> | | 11 0 0100 | | | Ihzx<br> Ihax | | 01 0 0101 | 01010xxx01 | | lwax | | | 01100xxx11 | | sthx | | 01 0 0110<br>01 0 0111 | 01100xxx01<br>01110xxx01 | | - | | 11 0 0111 | | | - | | 01 0 1000 | 10000xxx01 | | Iswx | | | 10000xxx11 | | Ifsx | | 01 0 1000 | 10010xxx01 | | Iswi | | | 10010xxx11 | | lfdx | | 01 0 1010 | | | stswx | | 11 0 1010 | 10100xxx11 | | stfsx | | 01 0 1011 | 10110xxx01 | | stswi | _ | | 10110xxx11 | | stfdx | | 01 0 1100 | 11000xxx01 | | - | | | 11000xxx11<br>11010xxx11 | | lfdpx<br>lfiwax | | 01 0 1101 | 11010xxx01 | | - | | 11 0 1101 | | | stfdpx | | 01 0 1110 | 11100xxx01 | | - | ı | 11 0 1110 | | | stfiwx | | 01 0 1111<br>01 1 0000 | 11110xxx01<br>00001xxx01 | | -<br>ldux | | | 00001xxx11 | | lwzux | | 01 1 0000 | 00001xxx01 | | iuux | | 11 1 0001 | | | - | | 01 1 0001 | 00101xxx01 | | stdux | | 11 1 0010 | 00101xxx11 | | stwux | | 01 1 00 10 | 00111xxx01 | | - | | 11 1 0011 | | | - | | 01 1 0100 | 01001xxx01 | | - | | | 01001xxx11 | | lhzux | | 01 1 0101 | 01011xxx01 | | lwaux | | | 01011xxx11 | | lhaux | | 01 1 0110 | 01101xxx01 | | - | | 11 1 0110<br>11 1 0111 | 01101xxx11<br>01111xxx11 | | sthux | | 01 1 0111 | 011111xxx01 | | - | | | 10001xxx11 | | Ifsux | | 01 1 1000 | 10001xxx01 | | - | | 11 1 1000 | | | Ifdux | | 01 1 1001<br>01 1 1010 | 10011xxx01<br>10101xxx01 | | <del>-</del> | | | 10101xxx11 | | stfsux | | 01 1 1010 | 10101xxx01 | | <del>-</del> | | 11 1 1011 | 10111xxx11 | | stfdux | | 01 1 1011 | 11001xxx01 | | _ | | 11 1 1100 | 11001xxx11 | | - | | 01 1 1101 | 11011xxx01 | | - | | 11 1 1101 | | | - | | 01 1 1110 | 11101xxx01 | | - | | 11 1 1110 | | | - | | 01 1 1111 | 111111xxx01 | | - | | 11 1 1111 | 111111xxx11 | | - | ## **Appendix E. Programming Examples** ## E.1 Unsigned Single-Precision-BCD Arithmetic **addg6s** can be used to add or subtract two BCD operands. In these examples it is assumed that r0 contains 0x666...666. (BCD data formats are described in Section 5.3 of Book I - III.) Addition of the unsigned BCD operand in register RA to the unsigned BCD operand in register RB can be accomplished as follows. ``` add r1,RA,r0 add r2,r1,RB addg6s RT,r1,RB subf RT,RT,r2 # RT = RA +<sub>BCD</sub> RB ``` Subtraction of the unsigned BCD operand in register RA from the unsigned BCD operand in register RB can be accomplished as follows. (In this example it is assumed that RB is not register 0.) Additional instructions are needed to handle signed BCD operands, and BCD operands that occupy more than one register (e.g., unsigned BCD operands that have more than 16 decimal digits). # E.2 Signed Single-Precision BCD Arithmetic Addition of the signed 15-digit BCD operand in register RA to the signed BCD operand in register RB can be accomplished as follows. If the signs of operands are different, then the operand of smaller magnitude is subtracted from the operand of larger magnitude and the sign of the larger operand is preserved; otherwise the operands are added and the sign is preserved. The sign code is in the low order 4 bits of the operands and uses one of the standard encodings. (See Section 5.3 of Book I - III for a description of BCD and sign encodings.) This example assumes preferred sign option 1 (0b1100 is plus and 0b1101 is minus). For preferred sign option 2 (0b1111 is plus and 0b1101 is minus), replace the xori after the "SignedSub" label with "xori RA,RA,2". Preserving the appropriate sign code is accomplished by zeroing the sign code of the other operand before performing a 16 digit BCD addition/subtraction. Other addends (ones complement or 6's) must leave the sign code position as zero. (In this example r11 contains 0x6666 6666 6666 6660.) ``` SignedSub: RA, RA, 1 xori SignedAdd: r5,RA,RB xor andi. r5, r5, 15 # compare sign codes cmpld cr1,RA,RB # compare magnitudes beq cr0, samesign ble cr1, BminusA \# set up for RT = RA -_{BCD} RB # one's complement of RB nor r9,RB,RB addi r10,RA,16 # generate the carry in submaq BminusA: \# set up for RT = RB -<sub>BCD</sub> RA # one's complement of RA nor r9,RA,RA addi r10,RB,16 # generate the carry in submaq: rldicr r9, r9, 0, 59 # remove the sign code add r8, r10, r9 addg6s RT, r10, r9 rldicr RT, RT, 0, 59 # remove generated 6 from # sign position subf RT,RT,r8 h done samesign: # remove the sign code rldicr r8, RB, 0, 59 add r10,RA,r11 # add 6's add r9, r10, r8 addg6s RT, r10, RB \# RT = RA +<sub>BCD</sub> RB subf RT,RT,r9 done: ``` # E.3 Unsigned Extended-Precision BCD Arithmetic Multiple precision BCD arithmetic requires additional code to add/subtract higher order digits and handle the carry between 16 digit groups. For example, the following sequence implements a 32-digit BCD add. In this example the contents of register R3 concatenated with the contents of R4 represent the first 32-digit operand and the contents of register R5 concatenated with the contents of R6 represents the second operand. The contents of register R3 concatenated with the contents of register R4 represents the result. (In this example r0 contains 0x6666 6666 6666 6666.) ``` add r10,R4,r0 addc r9, r10, R6 # generate the carry R4,r10,R6 addg6s subf R4,R4,r9 \# RT1 = RA1 +<sub>BCD</sub> RB1 addze R5,R5 # propagate the carry add r10,R3,r0 add r9, r10, R5 addg6s R3, r10, R5 subf R3,R3,r9 \# RT0 = RA0 +<sub>BCD</sub> RB0 ``` Note that an extra instruction (addze) is required to propagate the carry so that the same value is used in the subsequent add and addg6s. The following sequence implements a 32-digit BCD subtraction. In this example the first operand in R3 and R4 is subtracted from the 2nd operand in R5 and R6. The result is in R3 and R4. ``` addi r10,R6,1 # one's complement of RAO nor r9,R4,R4 addc r8,r10,r9 # Generate the carry addg6s R4,r10,r9 subf R4,R4,r8 \# RT1 = RB1 -<sub>BCD</sub> RA1 addze r10,R5 # propagate the carry # one's complement of RAO nor r9,R3,R3 add r8,r10,r2 addq6s R3,r10,r9 subf R3,R3,r8 \# RT0 = RB0 -<sub>BCD</sub> RA0 ``` ## **Book III-E:** Power ISA Operating Environment Architecture - Embedded Environment ## **Chapter 1. Introduction** | 1.1 Overview | | 1.5 Exceptions 608 | |--------------------------------|-------|-------------------------------------| | 1.2 32-Bit Implementations | . 607 | 1.6 Synchronization 609 | | 1.3 Document Conventions | . 607 | 1.6.1 Context Synchronization 609 | | 1.3.1 Definitions and Notation | . 607 | 1.6.2 Execution Synchronization 609 | | 1.3.2 Reserved Fields | . 608 | | | 1.4 General Systems Overview | . 608 | | ## 1.1 Overview Chapter 1 of Book I describes computation modes, document conventions, a general systems overview, instruction formats, and storage addressing. This chapter augments that description as necessary for the Power ISA Operating Environment Architecture. ## 1.2 32-Bit Implementations Though the specifications in this document assume a 64-bit implementation, 32-bit implementations are permitted as described in Appendix C, "Guidelines for 64-bit Implementations in 32-bit Mode and 32-bit Implementations" on page 735. ## 1.3 Document Conventions The notation and terminology used in Book I apply to this Book also, with the following substitutions. - For "system alignment error handler" substitute "Alignment interrupt". - For "system auxiliary processor enabled exception error handler" substitute "Auxiliary Processor Enabled Exception type Program interrupt", - For "system data storage error handler" substitute "Data Storage interrupt" or Data TLB Error interrupt" as appropriate. - For "system error handler" substitute "interrupt". - For "system floating-point enabled exception error handler" substitute "Floating-Point Enabled Exception type Program interrupt". - For "system illegal instruction error handler" substitute "Illegal Instruction exception type Program interrupt" or "Unimplemented Operation exception type Program interrupt", as appropriate. - For "system instruction storage error handler" substitute "Instruction Storage interrupt" or "Instruction TLB Error", as appropriate. - For "system privileged instruction error handler" substitute "Privileged Instruction exception type Program interrupt". - For "system service program" substitute "System Call interrupt". - For "system trap handler" substitute "Trap type Program interrupt". ## 1.3.1 Definitions and Notation The definitions and notation given in Book I are augmented by the following. real page A unit of real storage that is aligned at a boundary that is a multiple of its size. The real page size may range from 1KB to 1TB. context of a program The processor state (e.g., privilege and relocation) in which the program executes. The context is controlled by the contents of certain System Registers, such as the MSR, of certain lookaside buffers, such as the TLB, and of other resources. ■ exception An error, unusual condition, or external signal, that may set a status bit and may or may not cause an interrupt, depending upon whether the corresponding interrupt is enabled. #### ■ interrupt The act of changing the machine state in response to an exception, as described in Chapter 5. "Interrupts and Exceptions" on page 661. trap interrupt An interrupt that results from execution of a *Trap* instruction. - Additional exceptions to the rule that the processor obeys the sequential execution model, beyond those described in the section entitled "Instruction Fetching" in Book I, are the following. - A System Reset or Machine Check interrupt may occur. The determination of whether an instruction is required by the sequential execution model is not affected by the potential occurrence of a System Reset or Machine Check interrupt. (The determination is affected by the potential occurrence of any other kind of interrupt.) - A context-altering instruction is executed (Chapter 10. "Synchronization Requirements for Context Alterations" on page 723). The context alteration need not take effect until the required subsequent synchronizing operation has occurred. ### hardware Any combination of hard-wired implementation, emulation assist, or interrupt for software assistance. In the last case, the interrupt may be to an architected location or to an implementation-dependent location. Any use of emulation assists or interrupts to implement the architecture is implementation-dependent. - /, //, ///, ... denotes a field that is reserved in an instruction, in a register, or in an architected storage table. - ?, ??, ???, ... denotes a field that is implementation-dependent in an instruction, in a register, or in an architected storage table. ## 1.3.2 Reserved Fields Some fields of certain architected registers may be written to automatically by the processor, e.g., Reserved bits in System Registers. When the processor writes to such a register, the following rules are obeyed. Unless otherwise stated, no defined field other than the one(s) the processor is specifically updating are modified. Contents of reserved fields are either preserved by the processor or written as zero. The reader should be aware that reading and writing of some of these registers (e.g., the MSR) can occur as a side effect of processing an interrupt and of returning from an interrupt, as well as when requested explicitly by the appropriate instruction (e.g., *mtmsr* instruction). ## 1.4 General Systems Overview The processor or processor unit contains the sequencing and processing controls for instruction fetch, instruction execution, and interrupt action. Most implementations also contain data and instruction caches. Instructions that the processing unit can execute fall into the following classes: - instructions executed in the Branch Processor - instructions executed in the Fixed-Point Processor - instructions executed in the Floating-Point Processor - instructions executed in the Vector Processor - instructions executed in an Auxiliary Processor - other instructions executed by the processor Almost all instructions executed in the Branch Processor, Fixed-Point Processor, Floating-Point Processor, and Vector Processor are nonprivileged and are described in Book I. Book I may describe additional nonprivileged instructions (e.g., Book II describes some nonprivileged instructions for cache management). Instructions executed in an Auxiliary Processor are implementation-dependent. Instructions related to the supervisor mode, control of processor resources, control of the storage hierarchy, and all other privileged instructions are described here or are implementation-dependent. ## 1.5 Exceptions The following augments the exceptions defined in Book I that can be caused directly by the execution of an instruction: - the execution of a floating-point instruction when MSR<sub>FP</sub>=0 (Floating-Point Unavailable interrupt) - execution of an instruction that causes a debug event (Debug interrupt). - the execution of an auxiliary processor instruction when the auxiliary processor instruction is unavailable (Auxiliary Processor Unavailable interrupt) - the execution of a Vector, SPE, or Embedded Floating-Point instruction when MSR<sub>SPV</sub>=0 (SPE/ Embedded Floating-Point/Vector Unavailable interrupt) ## 1.6 Synchronization The synchronization described in this section refers to the state of the processor that is performing the synchronization. ## 1.6.1 Context Synchronization An instruction or event is *context synchronizing* if it satisfies the requirements listed below. Such instructions and events are collectively called *context synchronizing operations*. The context synchronizing operations include the *isync* instruction, the *System Linkage* instructions, the *mtmsr* instruction, and most interrupts (see Section 5.1). - The operation causes instruction dispatching (the issuance of instructions by the instruction fetching mechanism to any instruction execution mechanism) to be halted. - The operation is not initiated or, in the case of *dnh* [Category: Embedded.Enhanced Debug], *isync* and *wait* [Category: Wait], does not complete, until all instructions that precede the operation have completed to a point at which they have reported all exceptions they will cause. - The operation ensures that the instructions that precede the operation will complete execution in the context (privilege, relocation, storage protection, etc.) in which they were initiated. - 4. If the operation directly causes an interrupt (e.g., sc directly causes a System Call interrupt) or is an interrupt, the operation is not initiated until no exception exists having higher priority than the exception associated with the interrupt (see Section 5.9, "Exception Priorities" on page 689). - 5. The operation ensures that the instructions that follow the operation will be fetched and executed in the context established by the operation. (This requirement dictates that any prefetched instructions be discarded and that any effects and side effects of executing them out-of-order also be discarded, except as described in Section 4.5, "Performing Operations Out-of-Order".) ### Programming Note - A context synchronizing operation is necessarily execution synchronizing; see Section 1.6.2. Unlike the *Synchronize* instruction, a context synchronizing operation does not affect the order in which storage accesses are performed. Item 2 permits a choice only for *isync* (and *sync*; see Section 1.6.2) because all other execution synchronizing operations also alter context. ## 1.6.2 Execution Synchronization An instruction is *execution synchronizing* if it satisfies items 2 and 3 of the definition of context synchronization (see Section 1.6.1). *sync* is treated like *isync* with respect to item 2. The execution synchronizing instructions are *sync*, *mtmsr* and all context synchronizing instructions. ### Programming Note All context synchronizing instructions are execution synchronizing. Unlike a context synchronizing operation, an execution synchronizing instruction does not ensure that the instructions following that instruction will execute in the context established by that instruction. This new context becomes effective sometime after the execution synchronizing instruction completes and before or at a subsequent context synchronizing operation. ## **Chapter 2. Branch Processor** | 2.1 | Branch Processor Overview | 611 | 2.3 | Branch Processor Instructions 613 | |------|----------------------------|-----|-----|-----------------------------------| | 2.2 | Branch Processor Registers | 611 | 2.4 | System Linkage Instructions 613 | | 2.2. | 1 Machine State Register | 611 | | , | ## 2.1 Branch Processor Overview This chapter describes the details concerning the registers and the privileged instructions implemented in the Branch Processor that are not covered in Book I. ## 2.2 Branch Processor Registers ## 2.2.1 Machine State Register The MSR (MSR) is a 32-bit register. MSR bits are numbered 32 (most-significant bit) to 63 (least-significant bit). This register defines the state of the processor. The MSR can also be modified by the *mtmsr*, *rfi*, *rfci*, *rfdi* [Category: Embedded.Enhanced Debug], *rfmci*, *wrtee* and *wrteei* instructions and interrupts. It can be read by the *mfmsr* instruction. Figure 1. Machine State Register Below are shown the bit definitions for the Machine State Register. ## Bit Description ## 32 Computation Mode (CM) - 0 The processor runs in 32-bit mode. - 1 The processor runs in 64-bit mode. ## 33 Interrupt Computation Mode (ICM) On interrupt this bit is copied to $MSR_{CM}$ , selecting 32-bit or 64-bit mode for interrupt handling. - 0 MSR<sub>CM</sub> is set to 0 (32-bit mode) when an interrupt occurs. - 1 MSR<sub>CM</sub> is set to 1 (64-bit mode) when an interrupt occurs. ### 34:36 Implementation-dependent ## 37 User Cache Locking Enable (UCLE) [Category: Embedded Cache Locking.User Mode] - 0 Cache Locking instructions are privileged. - 1 Cache Locking instructions can be executed in user mode (MSR<sub>PR</sub>=1). If category Embedded Cache Locking. User Mode is not supported, this bit is treated as reserved. ## 38 SP/Embedded Floating-Point/Vector Available (SPV) [Category: Signal Processing]: - The processor cannot execute any SP instructions except for the *brinc* instruction. - The processor can execute all SP instructions. ## [Category: Vector]: - 0 The processor cannot execute any Vector instruction. - 1 The processor can execute Vector instructions. ## 39:44 Reserved ## 45 Wait State Enable (WE) - The processor is not in wait state and continues processing - 1 The processor enters the wait state by ceasing to execute instructions and entering low power mode. The details of how the wait state is entered and exited, and how the processor behaves while in the wait state, are implementation-dependent. #### 46 Critical Enable (CE) - Critical Input, Watchdog Timer, and Processor Doorbell Critical interrupts are disabled - Critical Input, Watchdog Timer, and Processor Doorbell Critical interrupts are enabled - 47 Reserved ## 48 External Enable (EE) - External Input, Decrementer, Fixed-Interval Timer, Processor Doorbell, and Embedded Performance Monitor [Category:E.PM] interrupts are disabled. - External Input, Decrementer, Fixed-Interval Timer, Processor Doorbell, and Embedded Performance Monitor [Category:E.PM] interrupts are enabled. ## 49 **Problem State** (PR) - The processor is in supervisor mode, can execute any instruction, and can access any resource (e.g. GPRs, SPRs, MSR, etc.). - 1 The processor is in user mode, cannot execute any privileged instruction, and cannot access any privileged resource. MSR<sub>PR</sub> also affects storage access control, as described in Section 6.2.4. ## 50 Floating-Point Available (FP) [Category: Floating-Point] - O The processor cannot execute any floating-point instructions, including floatingpoint loads, stores and moves. - The processor can execute floating-point instructions. #### 51 **Machine Check Enable (ME)** - 0 Machine Check interrupts are disabled. - Machine Check interrupts are enabled. ## 52 Floating-Point Exception Mode 0 (FE0) [Category: Floating-Point] (See below) 53 Implementation-dependent ### 54 **Debug Interrupt Enable (DE)** - 0 Debug interrupts are disabled - 1 Debug interrupts are enabled if DBCR0<sub>IDM</sub>=1 #### 55 Floating-Point Exception Mode 1 (FE1) [Category: Floating-Point] (See below) - 56 Reserved - 57 Reserved ## 58 Instruction Address Space (IS) - 0 The processor directs all instruction fetches to address space 0 (TS=0 in the relevant TLB entry). - 1 The processor directs all instruction fetches to address space 1 (TS=1 in the relevant TLB entry). #### 59 **Data Address Space (DS)** - 0 The processor directs all data storage accesses to address space 0 (TS=0 in the relevant TLB entry). - 1 The processor directs all data storage accesses to address space 1 (TS=1 in the relevant TLB entry). ## 60 Implementation-dependent ## 61 **Performance Monitor Mark** (PMM) [Category: Embedded.Performance Monitor] - Disable statistics gathering on marked processes. - Enable statistics gathering on marked processes See Appendix E for additional information. - 62 Reserved - 63 Reserved The Floating-Point Exception Mode bits FE0 and FE1 are interpreted as shown below. For further details see Book I. | FE0 | FE1 | Mode | |-----|-----|--------------------------| | 0 | 0 | Ignore Exceptions | | 0 | 1 | Imprecise Nonrecoverable | | 1 | 0 | Imprecise Recoverable | | 1 | 1 | Precise | See Section 6.3, "Processor State After Reset" on page 693 for the initial state of the MSR. ## Programming Note - A Machine State Register bit that is reserved may be altered by *rfi/rfci/rfmci/rfdi* [Category:Embedded.Enhanced Debug]. ## 2.3 Branch Processor Instructions ## 2.4 System Linkage Instructions These instructions provide the means by which a program can call upon the system to perform a service, and by which the system can return from performing a service or from processing an interrupt. The System Call instruction is described in Book I, but only at the level required by an application programmer. A complete description of this instruction appears below. ## System Call ### SC-form ## Return From Interrupt XL-form SC | ſ | 17 | /// | /// | /// | /// | // | 1 / | |---|----|-----|-----|-----|-----|----|-------| | C | ) | 6 | 11 | 16 | 20 | 27 | 30 31 | ``` SRRO ←iea CIA + 4 SRR1 ← MSR NIA \leftarrow IVPR_{0:47} \mid | IVOR8_{48:59} \mid | 0b0000 MSR \leftarrow new value (see below) ``` The effective address of the instruction following the System Call instruction is placed into SRR0. The contents of the MSR are copied into SRR1. Then a System Call interrupt is generated. The interrupt causes the MSR to be set as described in Section 5.6 on page 672. The interrupt causes the next instruction to be fetched from effective address IVPR<sub>0:47</sub>||IVOR8<sub>48:59</sub>||0b0000. This instruction is context synchronizing. ## **Special Registers Altered:** SRR0 SRR1 MSR rfi ``` MSR ← SRR1 NIA \leftarrow_{\text{iea}} SRR0_{0:61} \mid\mid 0b00 ``` The rfi instruction is used to return from a base class interrupt, or as a means of simultaneously establishing a new context and synchronizing on that new context. The contents of SRR1 are placed into the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched, under control of the new MSR value, from the address SRR0<sub>0:61</sub>||0b00. (Note: VLE behavior may be different; see Book VLE.) If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the value placed into the applicable save/ restore register 0 by the interrupt processing mechanism (see Section 5.6 on page 672) is the address of the instruction that would have been executed next had the interrupt not occurred (i.e. the address in SRR0 at the time of the execution of the rfi). This instruction is privileged and context synchronizing. ## **Special Registers Altered:** **MSR** ## Return From Critical Interrupt XL-form rfci | 19 | /// | /// | /// | 51 | / | |----|-----|-----|-----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` MSR ← CSRR1 NIA ← iea CSRR0<sub>0:61</sub> || 0b00 ``` The **rfci** instruction is used to return from a critical class interrupt, or as a means of establishing a new context and synchronizing on that new context simultaneously. The contents of CSRR1 are placed into the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched, under control of the new MSR value, from the address CSRR0<sub>0:61</sub>||0b00. (Note: VLE behavior may be different; see Book VLE.) If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the value placed into SRR0 or CSRR0 by the interrupt processing mechanism (see Section 5.6 on page 672) is the address of the instruction that would have been executed next had the interrupt not occurred (i.e. the address in CSRR0 at the time of the execution of the *rfci*). This instruction is privileged and context synchronizing. ## **Special Registers Altered:** **MSR** ## Return From Debug Interrupt X-form rfdi [Category: Embedded.Enhanced Debug] ``` \begin{aligned} & \text{MSR} \leftarrow \text{DSRR1} \\ & \text{NIA} \leftarrow_{\text{iea}} \text{DSRR0}_{\text{0:61}} \mid\mid \text{0b00} \end{aligned} ``` The **rfdi** instruction is used to return from a Debug interrupt, or as a means of establishing a new context and synchronizing on that new context simultaneously. The contents of DSRR1 are placed into the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched, under control of the new MSR value, from the address DSRR0<sub>0:61</sub>||0b00. (Note: VLE behavior may be different; see Book VLE.) If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the value placed into SRR0, CSRR0, or DSRR0 by the interrupt processing mechanism is the address of the instruction that would have been executed next had the interrupt not occurred (i.e. the address in DSRR0 at the time of the execution of the *rfdi*). This instruction is privileged and context synchronizing. ## Special Registers Altered: MSR ## Return From Machine Check Interrupt XL-form rfmci $\begin{array}{l} \text{MSR} \; \leftarrow \; \text{MCSRR1} \\ \text{NIA} \; \leftarrow_{\mathsf{iea}} \; \text{MCSRR0}_{0:61} \; |\, | \; \text{0b00} \end{array}$ The *rfmci* instruction is used to return from a Machine Check class interrupt, or as a means of establishing a new context and synchronizing on that new context simultaneously. The contents of MCSRR1 are placed into the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched, under control of the new MSR value, from the address MCSRR0<sub>0:61</sub>||0b00. (Note: VLE behavior may be different; see Book VLE.) If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the value placed into SRR0, CSRR0, MCSRR0, or DSRR0 [Category: Embedded.Enhanced Debug] by the interrupt processing mechanism (see Section 5.6 on page 672) is the address of the instruction that would have been executed next had the interrupt not occurred (i.e. the address in MCSRR0 at the time of the execution of the *rfmci*). This instruction is privileged and context synchronizing. ## **Special Registers Altered:** **MSR** ## **Chapter 3. Fixed-Point Processor** | 3.1 Fixed-Point Processor Overview . 617 3.2 Special Purpose Registers 617 3.3 Fixed-Point Processor Registers 617 3.3.1 Processor Version Register 617 3.3.2 Processor Identification Register 617 3.3.3 Software-use SPRs 618 3.3.4 External Process ID Registers [Category: Embedded.External PID] 619 3.3.4.1 External Process ID Load Context (EPLC) Register 619 | 3.3.4.2 External Process ID Store Context (EPSC) Register | |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------| | | | # 3.1 Fixed-Point Processor Overview This chapter describes the details concerning the registers and the privileged instructions implemented in the Fixed-Point Processor that are not covered in Book I. ## 3.2 Special Purpose Registers Special Purpose Registers (SPRs) are read and written using the *mfspr* (page 624) and *mtspr* (page 622) instructions. Most SPRs are defined in other chapters of this book; see the index to locate those definitions. # 3.3 Fixed-Point Processor Registers ## 3.3.1 Processor Version Register The Processor Version Register (PVR) is a 32-bit readonly register that contains a value identifying the version and revision level of the processor. The contents of the PVR can be copied to a GPR by the *mfspr* instruction. Read access to the PVR is privileged; write access is not provided. Figure 2. Processor Version Register The PVR distinguishes between processors that differ in attributes that may affect software. It contains two fields. Version A 16-bit number that identifies the version of the processor. Different version numbers indicate major differences between processors, such as which optional facilities and instructions are supported. Revision A 16-bit number that distinguishes between implementations of the version. Different revision numbers indicate minor differences between processors having the same version number, such as clock rate and Engineering Change level. Version numbers are assigned by the Power ISA Architecture process. Revision numbers are assigned by an implementation-defined process. # 3.3.2 Processor Identification Register The Processor Identification Register (PIR) is a 32-bit register that contains a value that can be used to distinguish the processor from other processors in the system. The contents of the PIR can be copied to a GPR by the *mfspr* instruction. Read access to the PIR is privileged; write access, if provided, is implementation-dependent. PROCID 32 63 BitsNameDescription32:63PROCIDProcessor ID ## Figure 3. Processor Identification Register The means by which the PIR is initialized are implementation-dependent. ## 3.3.3 Software-use SPRs Software-use SPRs are 64-bit registers provided for use by software. | SPRG0 | |-------------------------------------------| | SPRG1 | | SPRG2 | | SPRG3 | | SPRG4 | | SPRG5 | | SPRG6 | | SPRG7 | | SPRG8 | | SPRG9 [Category: Embedded.Enhanced Debug] | | 0 63 | Figure 4. Special Purpose Registers ## - Programming Note- USPRG0 was made a 32-bit register and renamed to VRSAVE; see Book I, Section 6.3.3. ## SPRG0 through SPRG2 These 64-bit registers can be accessed only in supervisor mode. ### SPRG3 This 64-bit register can be read in supervisor mode and can be written only in supervisor mode. It is implementation-dependent whether or not this register can be read in user mode. ## SPRG4 through SPRG7 These 64-bit registers can be written only in supervisor mode. These registers can be read in supervisor and user modes. ## SPRG8 through SPRG9 These 64-bit registers can be accessed only in supervisor mode. The contents of SPRGi can be read using *mfspr* and written into SPRGi using *mtspr*. # 3.3.4 External Process ID Registers [Category: Embedded.External PID] The External Process ID Registers provide capabilities for loading and storing General Purpose Registers and performing cache management operations using a supplied context other than the context normally used by the programming model. Two SPRs describe the context for loading and storing using external contexts. The External Process ID Load Context (EPLC) Register provides the context for External Process ID Load instructions, and the External Process ID Store Context (EPSC) Register provides the context for External Process ID Store instructions. Each of these registers contains a PR (privilege) bit, an AS (address space) bit, and a Process ID. Changes to the EPLC or the EPSC Register require that a context synchronizing operation be performed prior to using any External Process ID instructions that use these registers. External Process ID instructions that use the context provided by the EPLC register include *Ibepx, Ihepx, Iwepx, Idepx, dcbtep, dcbtstep, dcbtep, dcbstep, icbiep, Ifdepx, evIddepx, Ivepx,* and *IvepxI* and those that use the context provided by the EPSC register include *stbepx, sthepx, stwepx, stdepx, dcbzep, stfdepx, evstddepx, stvepx,* and *stvepxI*. Instruction definitions appear in Section 3.4.2. System software configures the EPLC register to reflect the Process ID, AS, and PR state from the context that it wishes to perform loads from and configures the EPSC register to reflect the Process ID, AS, and PR state from the context it wishes to perform stores to. Software then issues External Process ID instructions to manipulate data as required. When the processor executes an External Process ID Load instruction, it uses the context information in the EPLC Register instead of the normal context with respect to address translation and storage access control. EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub>, EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub>, and EPLC<sub>EPID</sub> is used in place of any Process ID registers implemented by the processor. Similarly, when the processor executes an External Process ID Store instruction, it uses the context information in the EPSC Register instead of the normal context with respect to address translation and storage access control. EPSC<sub>EPR</sub> is used in place of MSR<sub>PR</sub>, EPSC<sub>EAS</sub> is used in place of MSR<sub>DS</sub>, and EPSC<sub>EPID</sub> is used in place of all Process ID registers implemented by the processor. Translation occurs using the new substituted values. If the TLB lookup is successful, the storage access control mechanism grants or denies the access using context information from EPLC<sub>EPR</sub> or EPSC<sub>EPR</sub> for loads and stores respectively. If access is not granted, a Data Storage interrupt occurs, and the ESR<sub>EPID</sub> bit is set to 1. If the operation was a *Store*, the ESR<sub>ST</sub> bit is also set to 1. ## 3.3.4.1 External Process ID Load Context (EPLC) Register The EPLC register contains fields to provide the context for External Process ID load instructions. Figure 5. External Process ID Load Context Register These bits are interpreted as follows: ### Bit Definition ## 0 External Load Context PR Bit (EPR) Used in place of MSR<sub>PR</sub> by the storage access control mechanism when an External Process ID *Load* instruction is executed. - 0 Supervisor mode - 1 User mode #### 1 External Load Context AS Bit (EAS) Used in place of MSR<sub>DS</sub> for translation when an External Process ID *Load* instruction is executed. - 0 Address space 0 - 1 Address space 1 #### 2:17 Reserved ## 18:31 External Load Context Process ID Value (EPID) Used in place of all Process ID register values for translation when an external Process ID *Load* instruction is executed. ## 3.3.4.2 External Process ID Store Context (EPSC) Register The EPSC register contains fields to provide the context for External Process ID Store instructions. The field encoding is the same as the EPLC Register. ## Figure 6. External Process ID Store Context Register These bits are interpreted as follows: #### Bits Definition ## 0 External Store Context PR Bit (EPR) Used in place of MSR<sub>PR</sub> by the storage access control mechanism when an External Process ID *Store* instruction is executed. - 0 Supervisor mode - 1 User mode ## 1 External Store Context AS Bit (EAS) Used in place of MSR<sub>DS</sub> for translation when an External Process ID *Store* instruction is executed. - 0 Address space 0 - 1 Address space 1 ## 2:17 Reserved ## 18:31 External Store Context Process ID Value (EPID) Used in place of all Process ID register values for translation when an external PID *Store* instruction is executed. ## 3.4 Fixed-Point Processor Instructions ## 3.4.1 Move To/From System Register Instructions The Move To Special Purpose Register and Move From Special Purpose Register instructions are described in Book I, but only at the level available to an application programmer. For example, no mention is made there of registers that can be accessed only in supervisor mode. The descriptions of these instructions given below extend the descriptions given in Book I, but do not list Special Purpose Registers that are implementation-dependent. In the descriptions of these instructions given below, the "defined" SPR numbers are the SPR numbers shown in Table 7 and the imple- mentation-specific SPR numbers that are implemented, and similarly for "defined" registers. ## **Extended mnemonics** Extended mnemonics are provided for the *mtspr* and *mfspr* instructions so that they can be coded with the SPR name as part of the mnemonic rather than as a numeric operand; see Appendix B. Figure 7. SPR Numbers | docimal | SPR <sup>1</sup> | Register | Privi | leged | Length | Cat <sup>2</sup> | |---------|---------------------------------------|-----------|------------------|-------|-----------------|------------------| | decimal | spr <sub>5:9</sub> spr <sub>0:4</sub> | Name | mtspr | mfspr | (bits) | | | 1 | 00000 00001 | XER | no | no | 64 | В | | 8 | 00000 01000 | LR | no | no | 64 | В | | 9 | 00000 01001 | CTR | no | no | 64 | В | | 22 | 00000 10110 | DEC | yes | yes | 32 | В | | 26 | 00000 11010 | SRR0 | yes | yes | 64 | В | | 27 | 00000 11011 | SRR1 | yes | yes | 64 | В | | 48 | 00001 10000 | PID | yes | yes | 32 | Е | | 54 | 00001 10110 | DECAR | yes | yes | 32 | Е | | 58 | 00001 11010 | CSRR0 | yes | yes | 64 | Е | | 59 | 00001 11011 | CSRR1 | yes | yes | 32 | Е | | 61 | 00001 11101 | DEAR | yes | yes | 64 | Е | | 62 | 00001 11110 | ESR | yes | yes | 32 | E | | 63 | 00001 11111 | IVPR | yes | yes | 64 | Е | | 256 | 01000 00000 | VRSAVE | no | no | 32 | E,V | | 259 | 01000 00011 | SPRG3 | - | no | 64 | В | | 260-263 | 01000 001xx | SPRG[4-7] | - | no | 64 | E | | 268 | 01000 01100 | ТВ | - | no | 64_ | В | | 269 | 01000 01101 | TBU | - | no | 32 <sup>5</sup> | В | | 272-275 | 01000 100xx | SPRG[0-3] | yes | yes | 64 | В | | 276-279 | 01000 101xx | SPRG[4-7] | yes | yes | 64 | Е | | 282 | 01000 11010 | EAR | yes | yes | 32 | EC | | 284 | 01000 11100 | TBL | yes | - | 32 | В | | 285 | 01000 11101 | TBU | yes | - | 32 | В | | 286 | 01000 11110 | PIR | - | yes | 32 | E | | 287 | 01000 11111 | PVR | | yes | 32 | В | | 304 | 01001 10000 | DBSR | yes <sup>3</sup> | yes | 32 | E | | 308 | 01001 10100 | DBCR0 | yes | yes | 32 | E | | 309 | 01001 10101 | DBCR1 | yes | yes | 32 | E | | 310 | 01001 10110 | DBCR2 | yes | yes | 32 | Е | | 312 | 01001 11000 | IAC1 | yes | yes | 64 | Е | | 313 | 01001 11001 | IAC2 | yes | yes | 64 | E | | 314 | 01001 11010 | IAC3 | yes | yes | 64 | Е | | 315 | 01001 11011 | IAC4 | yes | yes | 64 | Е | | 316 | 01001 11100 | DAC1 | yes | yes | 64 | E | | 317 | 01001 11101 | DAC2 | yes | yes | 64 | Е | | 318 | 01001 11110 | DVC1 | yes | yes | 64 | E | | 319 | 01001 11111 | DVC2 | yes | yes | 64 | E | | 336 | 01010 10000 | TSR | yes <sup>3</sup> | yes | 32 | E | | 340 | 01010 10100 | TCR | yes | yes | 32 | Е | Figure 7. SPR Numbers | decimal | SPR <sup>1</sup> | Register | Privi | leged | Length | Cat <sup>2</sup> | |---------|---------------------------------------|-------------|----------|-------|--------|------------------| | | spr <sub>5:9</sub> spr <sub>0:4</sub> | Name | mtspr | mfspr | (bits) | Cat- | | 400-415 | 01100 1xxxx | IVOR[0-15] | yes | yes | 32 | Е | | 512 | 10000 00000 | SPEFSCR | no | no | 32 | SP | | 526 | 10000 01110 | ATB/ATBL | - | no | 64 | ATB | | 527 | 10000 01111 | ATBU | - | no | 32 | ATB | | 528 | 10000 10000 | IVOR32 | yes | yes | 32 | SP | | 529 | 10000 10001 | IVOR33 | yes | yes | 32 | SP | | 530 | 10000 10010 | IVOR34 | yes | yes | 32 | SP | | 531 | 10000 10011 | IVOR35 | yes | yes | 32 | E.PM | | 532 | 10000 10100 | IVOR36 | yes | yes | 32 | E.PC | | 533 | 10000 10101 | IVOR37 | yes | yes | 32 | E.PC | | 570 | 10001 11010 | MCSRR0 | yes | yes | 64 | E | | 571 | 10001 11011 | MCSRR1 | yes | yes | 32 | Е | | 572 | 10001 11100 | MCSR | yes | yes | 64 | Е | | 574 | 10001 11110 | DSRR0 | yes | yes | 64 | E.ED | | 575 | 10001 11111 | DSRR1 | yes | yes | 32 | E.ED | | 604 | 10010 11100 | SPRG8 | yes | yes | 64 | Е | | 605 | 10010 11101 | SPRG9 | yes | yes | 64 | E.ED | | 624 | 10011 10000 | MAS0 | yes | yes | 32 | E.MF | | 625 | 10011 10001 | MAS1 | yes | yes | 32 | E.MF | | 626 | 10011 10010 | MAS2 | yes | yes | 64 | E.MF | | 627 | 10011 10011 | MAS3 | yes | yes | 32 | E.MF | | 628 | 10011 10100 | MAS4 | yes | yes | 32 | E.MF | | 630 | 10011 10110 | MAS6 | yes | yes | 32 | E.MF | | 633 | 10011 11001 | PID1 | yes | yes | 32 | E.MF | | 634 | 10011 11010 | PID2 | yes | yes | 32 | E.MF | | 688-691 | 10101 100xx | TLB[0-3]CFG | yes | yes | 32 | E.MF | | 702 | 10101 11110 | EPR | -, | yes | 32 | EXP | | 924 | 11100 11100 | DCDBTRL | _4 | yes | 32 | E.CD | | 925 | 11100 11101 | DCDBTRH | _4<br>_5 | yes | 32 | E.CD | | 926 | 11100 11110 | ICDBTRL | _5 | yes | 32 | E.CD | | 927 | 11100 11111 | ICDBTRH | _5 | yes | 32 | E.CD | | 944 | 11101 10000 | MAS7 | yes | yes | 32 | E.MF | | 947 | 11101 10011 | EPLC | yes | yes | 32 | E.PD | | 948 | 11101 10100 | EPSC | yes | yes | 32 | E.PD | | 979 | 11110 10011 | ICDBDR | _5 | yes | 32 | E.CD | | 1012 | 11111 10100 | MMUCSR0 | yes | yes | 32 | E.MF | | 1015 | 11111 10111 | MMUCFG | yes | yes | 32 | E.MF | - This register is not defined for this instruction. - Note that the order of the two 5-bit halves of the SPR number is reversed. - <sup>2</sup> See Section 1.3.5 of Book I. - This register cannot be directly written to. Instead, bits in the register corresponding to 1 bits in (RS) can be cleared using *mtspr SPR,RS*. - <sup>4</sup> The register can be written by the *dcread* instruction. - <sup>5</sup> The register can be written by the *icread* instruction. All SPR numbers that are not shown above and are not implementationspecific are reserved. ## Move To Special Purpose Register XFX-form mtspr SPR,RS | 31 | RS | spr | 467 | / | |----|----|-----|-----|----| | 0 | 6 | 11 | 21 | 31 | | $\begin{array}{l} n \leftarrow spr_{5:9} \mid \mid spr_{0:4} \\ \text{if length}(SPR(n)) = 64 \end{array}$ | then | |------------------------------------------------------------------------------------------------------------|------| | $SPR(n) \leftarrow (RS)$ | | | else | | | $SPR(n) \leftarrow (RS)_{32:63}$ | | The SPR field denotes a Special Purpose Register, encoded as shown in Figure 7. The contents of register RS are placed into the designated Special Purpose Register. For Special Purpose Registers that are 32 bits long, the low-order 32 bits of RS are placed into the SPR. For this instruction, SPRs TBL and TBU are treated as separate 32-bit registers; setting one leaves the other unaltered. ${\rm spr}_0=1$ if and only if writing the register is privileged. Execution of this instruction specifying a defined and privileged register when MSR<sub>PR</sub>=1 causes a Privileged Instruction type Program interrupt. Execution of this instruction specifying an SPR number that is not defined for the implementation causes either an Illegal Instruction type Program interrupt or one of the following. - if spr<sub>0</sub>=0: boundedly undefined results - $\blacksquare$ if spr<sub>0</sub>=1: - if MSR<sub>PR</sub>=1: Privileged Instruction type Program interrupt; if MSR<sub>PR</sub>=0: boundedly undefined results If the SPR number is set to a value that is shown in Figure 7 but corresponds to an optional Special Purpose Register that is not provided by the implementation, the effect of executing this instruction is the same as if the SPR number were reserved. ### **Special Registers Altered:** See Figure 7 #### Compiler and Assembler Note For the *mtspr* and *mfspr* instructions, the SPR number coded in assembler language does not appear directly as a 10-bit binary number in the instruction. The number coded is split into two 5-bit halves that are reversed in the instruction, with the high-order 5 bits appearing in bits 16:20 of the instruction and the low-order 5 bits in bits 11:15. ### Programming Note - For a discussion of software synchronization requirements when altering certain Special Purpose Registers, see Chapter 10. "Synchronization Requirements for Context Alterations" on page 723. ### Move From Special Purpose Register XFX-form mfspr RT,SPR | 31 | RT | spr | 339 | / | |----|----|-----|-----|----| | 0 | 6 | 11 | 21 | 31 | ``` \begin{array}{ll} n \leftarrow spr_{5:9} \mid \mid spr_{0:4} \\ \text{if length}(SPR(n)) = 64 \text{ then} \\ RT \leftarrow SPR(n) \\ \text{else} \\ RT \leftarrow \begin{array}{ll} 32_0 \mid \mid SPR(n) \end{array} \end{array} ``` The SPR field denotes a Special Purpose Register, encoded as shown in Figure 7. The contents of the designated Special Purpose Register are placed into register RT. For Special Purpose Registers that are 32 bits long, the low-order 32 bits of RT receive the contents of the Special Purpose Register and the high-order 32 bits of RT are set to zero. ${\rm spr_0=1}$ if and only if reading the register is privileged. Execution of this instruction specifying a defined and privileged register when MSR<sub>PR</sub>=1 causes a Privileged Instruction type Program interrupt. Execution of this instruction specifying an SPR number that is not defined for the implementation causes either an Illegal Instruction type Program interrupt or one of the following. - if spr<sub>0</sub>=0: boundedly undefined results - if $spr_0=1$ : - if MSR<sub>PR</sub>=1: Privileged Instruction type Program interrupt - if MSR<sub>PR</sub>=0: boundedly undefined results If the SPR field contains a value that is shown in Figure 7 but corresponds to an optional Special Purpose Register that is not provided by the implementation, the effect of executing this instruction is the same as if the SPR number were reserved. #### **Special Registers Altered:** None Note See the Notes that appear with mtspr. # Move To Device Control Register XFX-form mtdcr DCRN,RS | 31 | RS | dcr | 451 | / | |----|----|-----|-----|----| | 0 | 6 | 11 | 21 | 31 | DCRN $$\leftarrow$$ dcr<sub>0:4</sub> || dcr<sub>5:9</sub> DCR(DCRN) $\leftarrow$ (RS) Let DCRN denote a Device Control Register. (The supported Device Control Registers are implementation-dependent.) The contents of register RS are placed into the designated Device Control Register. For 32-bit Device Control Registers, the contents of bits 32:63 of (RS) are placed into the Device Control Register. This instruction is privileged. #### **Special Registers Altered:** Implementation-dependent. ### Move To Device Control Register Indexed X-form mtdcrx RA,RS | 31 | RS | RA | /// | 387 | / | |----|----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` DCRN ← (RA) DCR(DCRN) ← (RS) ``` Let the contents of register RA denote a Device Control Register. (The supported Device Control Registers supported are implementation-dependent.) The contents of register RS are placed into the designated Device Control Register. For 32-bit Device Control Registers, the contents of RS<sub>32:63</sub> are placed into the Device Control Register. The specification of Device Control Registers using *mtdcrx*, *mtdcrux* (see Book I), and *mtdcr* is implementation-dependent. For example, *mtdcr* 105,r2 and *mtdcrux* r1,r2 (where register r1 contains the value 105) may not produce identical results on an implementation. This instruction is privileged. ### **Special Registers Altered:** Implementation-dependent. # Move From Device Control Register XFX-form mfdcr RT,DCRN | | 31 | RT | dcr | 323 | / | |---|----|----|-----|-----|----| | ( | ) | 6 | 11 | 21 | 31 | DCRN $$\leftarrow$$ dcr<sub>0:4</sub> || dcr<sub>5:9</sub> RT $\leftarrow$ DCR(DCRN) Let DCRN denote a Device Control Register. (The supported Device Control Registers are implementation-dependent.) The contents of the designated Device Control Register are placed into register RT. For 32-bit Device Control Registers, the contents of the Device Control Register are placed into bits 32:63 of RT. Bits 0:31 of RT are set to 0. This instruction is privileged. #### **Special Registers Altered:** Implementation-dependent. # Move From Device Control Register Indexed X-form mfdcrx RT,RA | 31 | RT | RΑ | /// | 259 | / | |----|-------|--------|-----|---------|----| | 01 | 1 ( ) | 1 (7 ( | /// | 200 | , | | 0 | 6 | 11 | 16 | 21 | 31 | | O | U | | 10 | <u></u> | 01 | $$\begin{array}{l} \texttt{DCRN} \; \leftarrow \; (\texttt{RA}) \\ \texttt{RT} \; \leftarrow \; \texttt{DCR} \, (\texttt{DCRN}) \end{array}$$ Let the contents of register RA denote a Device Control Register (the supported Device Control Registers are implementation-dependent.) The contents of the designated Device Control Register are placed into register RT. For 32-bit Device Control Registers, the contents of bits 32:63 of the designated Device Control Register are placed into RT. Bits 0:31 of RT are set to 0. The specification of Device Control Registers using *mfdcrx* and *mfdcrux* (see Book I) compared to the specification of Device Control Registers using *mfdcr* is implementation-dependent. For example, *mfdcr* 12,105 and *mfdcrx* 12,11 (where register r1 contains the value 105) may not produce identical results on an implementation or between implementations. Also, accessing privileged Device Control Registers with *mfdcrux* when the processor is in supervisor mode is implementation-dependent. This instruction is privileged. ### Special Registers Altered: Implementation-dependent. ### Move To Machine State Register X-form mtmsr RS | 31 | RS | /// | /// | 146 | / | |----|----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | newmsr $$\leftarrow$$ (RS)<sub>32:63</sub> if MSR<sub>CM</sub> = 0 & newmsr<sub>CM</sub> = 1 then NIA<sub>0:31</sub> $\leftarrow$ 0 MSR $\leftarrow$ newmsr The contents of register RS $_{32:63}$ are placed into the MSR. If the processor is changing from 32-bit mode to 64-bit mode, the next instruction is fetched from $^{32}0||NIA_{32:63}|$ This instruction is privileged and execution synchronizing. In addition, alterations to the EE or CE bits are effective as soon as the instruction completes. Thus if $MSR_{EE}=0$ and an External interrupt is pending, executing an *mtmsr* that sets $MSR_{EE}$ to 1 will cause the External interrupt to be taken before the next instruction is executed, if no higher priority exception exists. Likewise, if $MSR_{CE}=0$ and a Critical Input interrupt is pending, executing an *mtmsr* that sets $MSR_{CE}$ to 1 will cause the Critical Input interrupt to be taken before the next instruction is executed if no higher priority exception exists. (See Section 5.6 on page 672). ### **Special Registers Altered:** MSR #### Programming Note For a discussion of software synchronization requirements when altering certain MSR bits please refer to Chapter 10. # Move From Machine State Register X-form mfmsr RT | | 31 | RT | /// | /// | 83 | / | |---|----|----|-----|-----|----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | RT $$\leftarrow$$ 320 || MSR The contents of the MSR are placed into bits 32:63 of register RT and bits 0:31 of RT are set to 0. This instruction is privileged. ### **Special Registers Altered:** None ### Write MSR External Enable X-form | wrtee | RS | | | | |-------|----|-----|-----|-----| | 31 | RS | /// | /// | 131 | 16 21 $$MSR_{EE} \leftarrow (RS)_{48}$$ The content of (RS)<sub>48</sub> is placed into MSR<sub>EE</sub>. Alteration of the $MSR_{EE}$ bit is effective as soon as the instruction completes. Thus if $MSR_{EE}$ =0 and an External interrupt is pending, executing a wrtee instruction that sets $MSR_{EE}$ to 1 will cause the External interrupt to occur before the next instruction is executed, if no higher priority exception exists (Section 5.9, "Exception Priorities" on page 689). This instruction is privileged. ### **Special Registers Altered:** **MSR** ### Write MSR External Enable Immediate X-form wrteei E | 31 | /// | /// | Ε | /// | 163 | / | |----|-----|-----|----|-----|-----|----| | 0 | 6 | 11 | 16 | 17 | 21 | 31 | $$MSR_{EE} \leftarrow E$$ The value specified in the E field is placed into MSR<sub>EE</sub>. Alteration of the MSR<sub>EE</sub> bit is effective as soon as the instruction completes. Thus if MSR<sub>EE</sub>=0 and an External interrupt is pending, executing a wrtee instruction that sets $MSR_{EE}$ to 1 will cause the External interrupt to occur before the next instruction is executed, if no higher priority exception exists (Section 5.9, "Exception Priorities" on page 689). This instruction is privileged. ### **Special Registers Altered:** **MSR** ### Programming Note - wrtee and wrteei are used to provide atomic update of MSR<sub>FF</sub>. Typical usage is: ``` update of MSR<sub>EE</sub>. Typical usage is: \#save EE in (Rn)_{48} mfmsr Rn wrteei 0 #turn off EE \#save EE in (Rn)_{48} mfmsr Rn wrteei 0 #turn off EE : #code with EE disabled wrtee #restore EE without altering #other MSR bits that might #have changed ``` ### 3.4.2 External Process ID Instructions [Category: Embedded.External PID] External Process ID instructions provide capabilities for loading and storing General Purpose Registers and performing cache management operations using a supplied context other than the context normally used by translation. The EPLC and EPSC registers provide external contexts for performing loads and stores. The EPLC and the EPSC registers are described in Section 3.3.4. If an Alignment interrupt, Data Storage interrupt, or a Data TLB Error interrupt, occurs while attempting to execute an External Process ID instruction, ESR<sub>EPID</sub> is set to 1 indicating that the instruction causing the interrupt was an External Process ID instruction; any other applicable ESR bits are also set. ### Load Byte by External Process ID Indexed X-form #### Ibepx RT,RA,RB | 31 | RT | RA | RB | 95 | / | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow {}^{56}0 \parallel MEM(EA, 1) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The byte in storage addressed by EA is loaded into $RT_{56:63}$ . $RT_{0:55}$ are set to 0. For *Ibepx*, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID registors This instruction is privileged. ### **Special Registers Altered:** None ### Programming Note This instruction behaves identically to a *lbzx* instruction except for using the EPLC register to provide the translation context. ### Load Halfword by External Process ID Indexed X-form Ihepx RT,RA,RB | 31 | RT | RA | RB | 287 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow ^{48}0 || MEM(EA,2) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The halfword in storage addressed by EA is loaded into $RT_{48:63}$ . $RT_{0:47}$ are set to 0. For *Ihepx*, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>FPID</sub> is used in place of all Process ID regis- ters. This instruction is privileged. #### **Special Registers Altered:** None ### **Programming Note** This instruction behaves identically to a *lhzx* instruction except for using the EPLC register to provide the translation context. # Load Word by External Process ID Indexed X-form lwepx RT,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow 320 || MEM(EA,4) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. For *Iwepx*, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID registers. This instruction is privileged. ### **Special Registers Altered:** None ### **Programming Note** This instruction behaves identically to a *Iwzx* instruction except for using the EPLC register to provide the translation context. # Load Doubleword by External Process ID Indexed X-form Idepx RT,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow MEM(EA, 8) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The doubleword in storage addressed by EA is loaded into RT. For *Idepx*, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID registers. This instruction is privileged. ### **Corequisite Categories:** 64-Bit ### **Special Registers Altered:** None ### **Programming Note** This instruction behaves identically to a *Idx* instruction except for using the EPLC register to provide the translation context. # Store Byte by External Process ID Indexed X-form stbepx RS,RA,RB | 31 | RS | RA | RB | 223 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA,1) \leftarrow (RS)_{56:63} ``` Let the effective address (EA) be the sum (RA|0)+(RB). (RS) $_{56:63}$ are stored into the byte in storage addressed by EA. For **stbepx**, the normal translation mechanism is not used. The contents of the EPSC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPSC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPSC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPSC<sub>EPID</sub> is used in place of all Process ID registers. This instruction is privileged. ### **Special Registers Altered:** None ### **Programming Note** This instruction behaves identically to a **stbx** instruction except for using the EPSC register to provide the translation context. # Store Halfword by External Process ID Indexed X-form sthepx RS,RA,RB | 31 | RS | RA | RB | 415 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) MEM(EA,2) $\leftarrow$ (RS)<sub>48:63</sub> Let the effective address (EA) be the sum (RA|0)+(RB). (RS) $_{48:63}$ are stored into the halfword in storage addressed by EA. For *sthepx*, the normal translation mechanism is not used. The contents of the EPSC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: $$\begin{split} & \mathsf{EPSC}_{\mathsf{EPR}} \text{ is used in place of } \mathsf{MSR}_{\mathsf{PR}} \\ & \mathsf{EPSC}_{\mathsf{EAS}} \text{ is used in place of } \mathsf{MSR}_{\mathsf{DS}} \\ & \mathsf{EPSC}_{\mathsf{EPID}} \text{ is used in place of all Process ID registers} \end{split}$$ This instruction is privileged. ### **Special Registers Altered:** None #### **Programming Note** This instruction behaves identically to a **sthx** instruction except for using the EPSC register to provide the translation context. # Store Word by External Process ID Indexed X-form stwepx RS,RA,RB if RA = 0 then $$b \leftarrow 0$$ else $b \leftarrow (RA)$ EA $\leftarrow b + (RB)$ MEM(EA,4) $\leftarrow (RS)_{32:63}$ Let the effective address (EA) be the sum (RA|0)+(RB). (RS) $_{32:63}$ are stored into the word in storage addressed by EA. For **stwepx**, the normal translation mechanism is not used. The contents of the EPSC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: $$\begin{split} & \mathsf{EPSC}_{\mathsf{EPR}} \text{ is used in place of } \mathsf{MSR}_{\mathsf{PR}} \\ & \mathsf{EPSC}_{\mathsf{EAS}} \text{ is used in place of } \mathsf{MSR}_{\mathsf{DS}} \\ & \mathsf{EPSC}_{\mathsf{EPID}} \text{ is used in place of all Process ID regis-} \end{split}$$ This instruction is privileged. ### **Special Registers Altered:** None ### **Programming Note** This instruction behaves identically to a **stwx** instruction except for using the EPSC register to provide the translation context. # Store Doubleword by External Process ID Indexed X-form stdepx RS,RA,RB | 31 | RS | RA | RB | 157 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) MEM(EA,8) $\leftarrow$ (RS) Let the effective address (EA) be the sum (RA|0)+(RB). (RS) is stored into the doubleword in storage addressed by EA. For **stdepx**, the normal translation mechanism is not used. The contents of the EPSC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: $$\begin{split} &\mathsf{EPSC}_{\mathsf{EPR}} \text{ is used in place of } \mathsf{MSR}_{\mathsf{PR}} \\ &\mathsf{EPSC}_{\mathsf{EAS}} \text{ is used in place of } \mathsf{MSR}_{\mathsf{DS}} \\ &\mathsf{EPSC}_{\mathsf{EPID}} \text{ is used in place of all Process ID regis-} \end{split}$$ This instruction is privileged. #### **Corequisite Categories:** 64-Bit ### **Special Registers Altered:** None ### - Programming Note This instruction behaves identically to a **stdx** instruction except for using the EPSC register to provide the translation context. ### Data Cache Block Store by External PID X-form dcbstep RA,RB | 31 | /// | RA | RB | 63 | / | |----|-----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). If the block containing the byte addressed by EA is in storage that is Memory Coherence Required, a block containing the byte addressed by EA is in the data cache of any processor, and any locations in the block are considered to be modified there, then those locations are written to main storage. Additional locations in the block may be written to main storage. The block ceases to be considered modified in that data cache. If the block containing the byte addressed by EA is in storage that is not Memory Coherence Required and the block is in the data cache of this processor, and any locations in the block are considered to be modified there, those locations are written to main storage. Additional locations in the block may be written to main storage, and the block ceases to be considered modified in that data cache. The function of this instruction is independent of whether the block containing the byte addressed by EA is in storage that is Write Through Required or Caching Inhibited. The instruction is treated as a *Load* with respect to translation, memory protection, and is treated as a *Write* with respect to debug events. This instruction is privileged. For *dcbstep*, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID registers ### **Special Registers Altered:** None ### - Programming Note This instruction behaves identically to a *dcbst* instruction except for using the EPLC register to provide the translation context. ### Data Cache Block Touch by External PID X-form | dobtop | 111,10 | ν,ινο | | | | | |--------|--------|-------|----|-----|---|--| | 31 | TH | RA | RB | 319 | / | | TH RA RR dchten Let the effective address (EA) be the sum (RA|0)+(RB). The *dcbtep* instruction provides a hint that describes a block or data stream, or indicates the expected use thereof. A hint that the program will probably soon load from a given storage location is ignored if the location is Caching Inhibited or Guarded. The only operation that is "caused" by the *dcbtep* instruction is the providing of the hint. The actions (if any) taken by the processor in response to the hint are not considered to be "caused by" or "associated with" the *dcbtep* instruction (e.g., *dcbtep* is considered not to cause any data accesses). No means are provided by which software can synchronize these actions with the execution of the instruction stream. For example, these actions are not ordered by the memory barrier created by a *sync* instruction. The *dcbtep* instruction may complete before the operation it causes has been performed. The nature of the hint depends, in part, on the value of the TH field, as specified in the *dcbt* instruction in Section 3.3.2 of Book II. The instruction is treated as a *Load*, except that no interrupt occurs if a protection violation occurs. The instruction is privileged. The normal address translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID registers ### **Special Registers Altered:** None ### **Extended Mnemonics:** Extended mnemonics are provided for the *Data Cache Block Touch by External PID* instruction so that it can be coded with the TH value as the last operand for all categories. . Extended: Equivalent to: dcbtctep RA,RB,TH dcbtep for TH values of 0b0000 - 0b0111; other TH values are invalid. ### Extended: Equivalent to: dcbtdsep RA,RB,TH dcbtep for TH values of 0b0000 or 0b1000 - 0b1010; other TH values are invalid. #### Programming Note This instruction behaves identically to a *dcbt* instruction except for using the EPLC register to provide the translation context. ### Data Cache Block Flush by External PID X-form dcbfep RA,RB | 31 | /// | RA | RB | 127 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). If the block containing the byte addressed by EA is in storage that is Memory Coherence Required, a block containing the byte addressed by EA is in the data cache of any processor, and any locations in the block are considered to be modified there, then those locations are written to main storage. Additional locations in the block may also be written to main storage. The block is invalidated in the data cache of all processors. If the block containing the byte addressed by EA is in storage that is not Memory Coherence Required, a block containing the byte addressed by EA is in the data cache of this processor, and any locations in the block are considered to be modified there, then those locations are written to main storage. Additional locations in the block may also be written to main storage. The block is invalidated in the data cache of this processor. The function of this instruction is independent of whether the block containing the byte addressed by EA is in storage that is Write Through Required or Caching Inhibited. The instruction is treated as a *Load* with respect to translation, memory protection, and is treated as a *Write* with respect to debug events. This instruction is privileged. The normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID regis- ### **Special Registers Altered:** None ### **Programming Note** This instruction behaves identically to a *dcbf* instruction except for using the EPLC register to provide the translation context. # Data Cache Block Touch for Store by External PID X-form dcbtstep TH,RA,RB Let the effective address (EA) be the sum (RA|0)+(RB). The *dcbtstep* instruction provides a hint that the program will probably soon store to the block containing the byte addressed by EA. If the Cache Specification category is supported, the nature of the hint depends on the value of the TH field, as specified in Section 3.3.2 of Book II. If the Cache Specification category is not supported, the TH field is treated as a reserved field. If the block is in a storage location that is Caching Inhibited or Guarded, then the hint is ignored. The only operation that is "caused" by the *dcbtstep* instruction is the providing of the hint. The actions (if any) taken by the processor in response to the hint are not considered to be "caused by" or "associated with" the *dcbtstep* instruction (e.g., *dcbtstep* is considered not to cause any data accesses). No means are provided by which software can synchronize these actions with the execution of the instruction stream. For example, these actions are not ordered by the memory barrier created by a *sync* instruction. The *dcbtstep* instruction may complete before the operation it causes has been performed. The instruction is treated as a *Load*, except that no interrupt occurs if a protection violation occurs. The instruction is privileged. The normal address translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID registers ### **Special Registers Altered:** None ### **Extended Mnemonics:** Extended mnemonics are provided for the *Data Cache Block Touch for Store by External PID* instruction so that it can be coded with the TH value as the last operand for all categories. . Extended: Equivalent to: dcbtstctep RA,RB,TH dcbtstep for TH values of 0b0000 - 0b0111; other TH values are invalid. ### Programming Note This instruction behaves identically to a *dcbtst* instruction except for using the EPLC register to provide the translation context. # Instruction Cache Block Invalidate by External PID X-form icbiep RA,RB Let the effective address (EA) be the sum (RA|0)+(RB). If the block containing the byte addressed by EA is in storage that is Memory Coherence Required and a block containing the byte addressed by EA is in the instruction cache of any processor, the block is invalidated in those instruction caches. If the block containing the byte addressed by EA is in storage that is not Memory Coherence Required and a block containing the byte addressed by EA is in the instruction cache of this processor, the block is invalidated in that instruction cache. The function of this instruction is independent of whether the block containing the byte addressed by EA is in storage that is Write Through Required or Caching Inhibited. The instruction is treated as a Load. This instruction is privileged. For *icbiep*, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID registers ### **Special Registers Altered:** None #### Programming Note This instruction behaves identically to an *icbi* instruction except for using the EPLC register to provide the translation context. # Data Cache Block set to Zero by External PID X-form dcbzep RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) n \leftarrow block size (bytes) m \leftarrow log<sub>2</sub>(n) ea \leftarrow EA<sub>0:63-m</sub> | | m<sub>0</sub> MEM(ea, n) \leftarrow n0x00 ``` Let the effective address (EA) be the sum (RA|0)+(RB). All bytes in the block containing the byte addressed by EA are set to zero. This instruction is treated as a Store. This instruction is privileged. The normal translation mechanism is not used. The contents of the EPSC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPSC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPSC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPSC<sub>EPID</sub> is used in place of all Process ID registers ### **Special Registers Altered:** None #### Programming Note - See the Programming Notes for the *dcbz* instruction. #### Programming Note This instruction behaves identically to a *dcbz* instruction except for using the EPSC register to provide the translation context. ### Load Floating-Point Double by External Process ID Indexed X-form Ifdepx FRT,RA,RB | 31 | FRT | RA | RB | 607 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) FRT \leftarrow MEM(EA, 8) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The doubleword in storage addressed by EA is loaded into FRT. For *Ifdepx*, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID regis- This instruction is privileged. An attempt to execute *Ifdepx* while MSR<sub>FP</sub>=0 will cause a Floating-Point Unavailable interrupt. ### **Corequisite Categories:** Floating-Point ### **Special Registers Altered:** None #### **Programming Note** This instruction behaves identically to a *Ifdx* instruction except for using the EPLC register to provide the translation context. ### Store Floating-Point Double by External Process ID Indexed X-form stfdepx FRS,RA,RB | 31 | FRS | RA | RB | 735 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA,8) \leftarrow (FRS) ``` Let the effective address (EA) be the sum (RA|0)+(RB). (FRS) is stored into the doubleword in storage addressed by EA. For **stfdepx**, the normal translation mechanism is not used. The contents of the EPSC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPSC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPSC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPSC<sub>EPID</sub> is used in place of all Process ID registers This instruction is privileged. An attempt to execute **stfdepx** while MSR<sub>FP</sub>=0 will cause a Floating-Point Unavailable interrupt. ### **Corequisite Categories:** Floating-Point ### **Special Registers Altered:** None ### Programming Note This instruction behaves identically to a **stfdx** instruction except for using the EPSC register to provide the translation context. ### Vector Load Doubleword into Doubleword by External Process ID Indexed EVX-form evlddepx RT,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) RT \leftarrow MEM(EA, 8) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The doubleword in storage addressed by EA is loaded into RT. For **eviddepx**, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID registers This instruction is privileged. An attempt to execute **evIddepx** while MSR<sub>SPV</sub>=0 will cause an SPE Unavailable interrupt. #### **Corequisite Categories:** Signal Processing Engine ### **Special Registers Altered:** None ### **Programming Note** This instruction behaves identically to a **evlddx** instruction except for using the EPLC register to provide the translation context. ### Vector Store Doubleword into Doubleword by External Process ID Indexed EVX-form evstddepx RS,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA,8) \leftarrow (RS) ``` Let the effective address (EA) be the sum (RA|0)+(RB). (RS) is stored into the doubleword in storage addressed by EA. For **evstddepx**, the normal translation mechanism is not used. The contents of the EPSC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPSC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPSC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPSC<sub>EPID</sub> is used in place of all Process ID registers This instruction is privileged. An attempt to execute **evstddepx** while MSR<sub>SPV</sub>=0 will cause an SPE Unavailable interrupt. ### **Corequisite Categories:** Signal Processing Engine ### **Special Registers Altered:** None ### **Programming Note** This instruction behaves identically to a **evstddx** instruction except for using the EPSC register to provide the translation context. # Load Vector by External Process ID Indexed X-form lvepx VRT,RA,RB | 31 | VRT | RA | RB | 295 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) VRT \leftarrow MEM(EA & 0xFFFF_FFFF_FFFF_FFF0, 16) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The quadword in storage addressed by the result of EA ANDed with 0xFFFF\_FFFF\_FFFF\_FFF0 is loaded into VRT. For *Ivepx*, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: $$\begin{split} & \mathsf{EPLC}_{\mathsf{EPR}} \text{ is used in place of MSR}_{\mathsf{PR}} \\ & \mathsf{EPLC}_{\mathsf{EAS}} \text{ is used in place of MSR}_{\mathsf{DS}} \\ & \mathsf{EPLC}_{\mathsf{EPID}} \text{ is used in place of all Process ID regis-} \end{split}$$ This instruction is privileged. An attempt to execute *Ivepx* while MSR<sub>SPV</sub>=0 will cause a Vector Unavailable interrupt. ### **Corequisite Categories:** Vector ### **Special Registers Altered:** None ### **Programming Note** This instruction behaves identically to a *Ivx* instruction except for using the EPLC register to provide the translation context. # Load Vector by External Process ID Indexed LRU X-form IvepxI VRT,RA,RB | 31 | VRT | RA | RB | 263 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) VRT \leftarrow MEM(EA & 0xFFFF_FFFF_FFF0, 16) mark as_not_likely_to_be_needed_again_anytime_soon (EA) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The quadword in storage addressed by the result of EA ANDed with 0xFFFFFFFFFFFFFFF0 is loaded into VRT. **IvepxI** provides a hint that the quadword in storage addressed by EA will probably not be needed again by the program in the near future. For *IvepxI*, the normal translation mechanism is not used. The contents of the EPLC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPLC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPLC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPLC<sub>EPID</sub> is used in place of all Process ID registers This instruction is privileged. An attempt to execute *IvepxI* while MSR<sub>SPV</sub>=0 will cause a Vector Unavailable interrupt. ### **Corequisite Categories:** Vector ### **Special Registers Altered:** None ### Programming Note See the Programming Notes for the *IvxI* instruction in Section 6.7.2 of Book I. ### Programming Note This instruction behaves identically to a *IvxI* instruction except for using the EPLC register to provide the translation context. # Store Vector by External Process ID Indexed X-form stvepx VRS,RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA & 0xFFFF_FFFF_FFFF_FFF0, 16) \leftarrow (VRS) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The contents of VRS are stored into the quadword in storage addressed by the result of EA ANDed with 0xFFFF\_FFFF\_FFFF\_FFF0. For **stvepx**, the normal translation mechanism is not used. The contents of the EPSC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: $$\begin{split} & \mathsf{EPSC}_{\mathsf{EPR}} \text{ is used in place of } \mathsf{MSR}_{\mathsf{PR}} \\ & \mathsf{EPSC}_{\mathsf{EAS}} \text{ is used in place of } \mathsf{MSR}_{\mathsf{DS}} \\ & \mathsf{EPSC}_{\mathsf{EPID}} \text{ is used in place of all Process ID registers} \end{split}$$ This instruction is privileged. An attempt to execute **stvepx** while MSR<sub>SPV</sub>=0 will cause a Vector Unavailable interrupt. ### **Corequisite Categories:** Vector ### Special Registers Altered: None ### **Programming Note** This instruction behaves identically to a **stvx** instruction except for using the EPSC register to provide the translation context. # Store Vector by External Process ID Indexed LRU X-form stvepxl VRS,RA,RB | 31 | VRS | RA | RB | 775 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) MEM(EA & 0xFFFF_FFFF_FFFF_FFF0, 16) \leftarrow (VRS) mark_as_not_likely_to_be_needed_again_anytime_soon (EA) ``` Let the effective address (EA) be the sum (RA|0)+(RB). The contents of VRS are stored into the quadword in storage addressed by the result of EA ANDed with 0xFFFF\_FFFF\_FFFF\_FFF0. The **stvepxI** instruction provides a hint that the quadword addressed by EA will probably not be needed again by the program in the near future. For *stvepxI*, the normal translation mechanism is not used. The contents of the EPSC register are used to provide the context in which translation occurs. The following substitutions are made for just the translation and access control process: EPSC<sub>EPR</sub> is used in place of MSR<sub>PR</sub> EPSC<sub>EAS</sub> is used in place of MSR<sub>DS</sub> EPSC<sub>EPID</sub> is used in place of all Process ID registers This instruction is privileged. An attempt to execute **stvepxI** while MSR<sub>SPV</sub>=0 will cause a Vector Unavailable interrupt. ### **Corequisite Categories:** Vector ### **Special Registers Altered:** None ### - Programming Note - See the Programming Notes for the *IvxI* instruction in Section 6.7.2 of Book I. ### Programming Note This instruction behaves identically to a *stvxI* instruction except for using the EPSC register to provide the translation context. ### **Chapter 4. Storage Control** | 4.1 Storage Addressing 639 4.2 Storage Exceptions 639 4.3 Instruction Fetch 640 | 4.8 Storage Control Attributes 649 4.8.1 Guarded Storage 649 4.8.1.1 Out-of-Order Accesses to Guarded | |-------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------| | 4.3.1 Implicit Branch 640 | Storage650 | | 4.3.2 Address Wrapping Combined with | 4.8.2 User-Definable 650 | | Changing MSR Bit CM 640 | 4.8.3 Storage Control Bits 650 | | 4.4 Data Access 640 | 4.8.3.1 Storage Control Bit Restrictions | | 4.5 Performing Operations | 650 | | Out-of-Order 640 | 4.8.3.2 Altering the Storage Control Bits | | 4.6 Invalid Real Address 641 | 651 | | 4.7 Storage Control 641 | 4.9 Storage Control Instructions 652 | | 4.7.1 Storage Control Registers 641 | 4.9.1 Cache Management Instructions 652 | | 4.7.1.1 Process ID Register 641 | 4.9.2 Cache Locking [Category: Embed- | | 4.7.1.2 Translation Lookaside Buffer . 641 | ded Cache Locking] | | 4.7.2 Page Identification 643 | 4.9.2.1 Lock Setting and Clearing 653 | | 4.7.3 Address Translation 646 | 4.9.2.2 Error Conditions 653 | | 4.7.4 Storage Access Control 647 | 4.9.2.2.1 Overlocking 653 | | 4.7.4.1 Execute Access 647 | 4.9.2.2.2 Unable-to-lock and Unable-to- | | 4.7.4.2 Write Access 647 | unlock Conditions 654 | | 4.7.4.3 Read Access 647 | 4.9.2.3 Cache Locking Instructions 655 | | 4.7.4.4 Storage Access Control Applied to | 4.9.3 Synchronize Instruction 657 | | Cache Management Instructions 647 | 4.9.4 Lookaside Buffer | | 4.7.4.5 Storage Access Control Applied to | Management 657 | | String Instructions 648 | 4.9.4.1 TLB Management Instructions. 658 | | 4.7.5 TLB Management 648 | | ### 4.1 Storage Addressing A program references storage using the effective address computed by the processor when it executes a Load, Store, Branch, or Cache Management instruction, or when it fetches the next sequential instruction. The effective address is translated to a real address according to procedures described in Section 4.7.2 and in Section 4.7.3. The real address that results from the respective translations is used to access main storage. For a complete discussion of storage addressing and effective address calculation, see Section 1.10 of Book I ### 4.2 Storage Exceptions A storage exception results when the sequential execution model requires that a storage access be performed but the access is not permitted (e.g., is not permitted by the storage protection mechanism), the access cannot be performed because the effective address cannot be translated to a real address, or the access matches some tracking mechanism criteria (e.g., Data Address Breakpoint). In certain cases a storage exception may result in the "restart" of (re-execution of at least part of) a Load or Store instruction. See Section 2.1 of Book II and Section 5.7 on page 686 in this Book. ### 4.3 Instruction Fetch The effective address for an instruction fetch is processed under control of MSR<sub>IS</sub>. The Address Translation mechanism is described beginning in Section 4.7.2. ### 4.3.1 Implicit Branch Explicitly altering certain MSR bits (using *mtmsr*), or explicitly altering TLB entries, certain System Registers and possibly other implementation-dependent registers, may have the side effect of changing the addresses, effective or real, from which the current instruction stream is being fetched. This side effect is called an implicit branch. For example, an mtmsr instruction that changes the value of MSR<sub>CM</sub> may change the real address from which the current instruction stream is being fetched. The MSR bits and System Registers (excluding implementation-dependent registers) for which alteration can cause an implicit branch are indicated as such in Chapter 10. "Synchronization Requirements for Context Alterations" on page 723. Implicit branches are not supported by the Power ISA. If an implicit branch occurs, the results are boundedly undefined. # 4.3.2 Address Wrapping Combined with Changing MSR Bit CM If the current instruction is at effective address $2^{32}$ -4 and is an *mtmsr* instruction that changes the contents of MSR<sub>CM</sub>, the effective address of the next sequential instruction is undefined. ### - Programming Note In the case described in the preceding paragraph, if an interrupt occurs before the next sequential instruction is executed, the contents of SRR0, CSRR0, or MCSRR0, as appropriate to the interrupt, are undefined. ### 4.4 Data Access The effective address for a data access is processed under control of MSR<sub>DS</sub>. The Address Translation mechanism is described beginning in Section 4.7.2. Storage control attributes may also affect instruction fetch. # 4.5 Performing Operations Out-of-Order An operation is said to be performed "in-order" if, at the time that it is performed, it is known to be required by the sequential execution model. An operation is said to be performed "out-of-order" if, at the time that it is performed, it is not known to be required by the sequential execution model. Operations are performed out-of-order by the processor on the expectation that the results will be needed by an instruction that will be required by the sequential execution model. Whether the results are really needed is contingent on everything that might divert the control flow away from the instruction, such as *Branch*, *Trap*, *System Call*, and *Return From Interrupt* instructions, and interrupts, and on everything that might change the context in which the instruction is executed. Typically, the processor performs operations out-oforder when it has resources that would otherwise be idle, so the operation incurs little or no cost. If subsequent events such as branches or interrupts indicate that the operation would not have been performed in the sequential execution model, the processor abandons any results of the operation (except as described below). In the remainder of this section, including its subsections, "Load instruction" includes the Cache Management and other instructions that are stated in the instruction descriptions to be "treated as a Load", and similarly for "Store instruction". A data access that is performed out-of-order may correspond to an arbitrary *Load* or *Store* instruction (e.g., a *Load* or *Store* instruction that is not in the instruction stream being executed). Similarly, an instruction fetch that is performed out-of-order may be for an arbitrary instruction (e.g., the aligned word at an arbitrary location in instruction storage). Most operations can be performed out-of-order, as long as the machine appears to follow the sequential execution model. Certain out-of-order operations are restricted, as follows. ■ Stores Stores are not performed out-of-order (even if the Store instructions that caused them were executed out-of-order). Accessing Guarded Storage The restrictions for this case are given in Section 4.8.1.1. The only permitted side effects of performing an operation out-of-order are the following. - A Machine Check that could be caused by in-order execution may occur out-of-order. - Non-Guarded storage locations that could be fetched into a cache by in-order fetching or execution of an arbitrary instruction may be fetched outof-order into that cache. ### 4.6 Invalid Real Address A storage access (including an access that is performed out-of-order; see Section 4.5) may cause a Machine Check if the accessed storage location contains an uncorrectable error or does not exist. See Section 5.6.2 on page 674. ### 4.7 Storage Control This section describes the address translation facility, access control, and storage control attributes. Demand-paged virtual memory is supported, as well as a variety of other management schemes that depend on precise control of effective-to-real address translation and flexible memory protection. Translation misses and protection faults cause precise exceptions. Sufficient information is available to correct the fault and restart the faulting instruction. The effective address space is divided into pages. The page represents the granularity of effective address translation, access control, and storage control attributes. Up to sixteen page sizes (1KB, 4KB, 16KB, 64KB, 256KB, 1MB, 4MB, 16MB, 64MB, 256MB, 1GB, 4GB, 16GB, 64GB, 256GB, 1TB) may be simultaneously supported. In order for an effective to real translation to exist, a valid entry for the page containing the effective address must be in the Translation Lookaside Buffer (TLB). Addresses for which no TLB entry exists cause TLB Miss exceptions. ### 4.7.1 Storage Control Registers In addition to the registers described below, the Machine State Register provides the IS and DS bits, that specify which of the two address spaces the respective instruction or data storage accesses are directed towards. $MSR_{PR}$ bit is also used by the storage access control mechanism. ### 4.7.1.1 Process ID Register The Process ID Register (PID) is a 32-bit register. Process ID Register bits are numbered 32 (most-significant bit) to 63 (least-significant bit). The Process ID Register provides a value that is used to construct a virtual address for accessing storage. The Process ID Register can be read using *mfspr* and can be written using *mtspr*. An implementation may opt to implement only the least-significant n bits of the Process ID Register, where $0 \le n \le 32$ , and n must be the same as the number of implemented bits in the TID field of the TLB entry. The most-significant 32-n bits of the Process ID Register are treated as reserved. Some implementations may support more than one Process ID Register. See User's Manual for the implementation. ### 4.7.1.2 Translation Lookaside Buffer The Translation Lookaside Buffer (TLB) is the hardware resource that controls translation, protection, and storage control attributes. The organization of the TLB (e.g. unified versus separate instruction and data, hierarchies, associativity, number of entries, etc.) is implementation-dependent. Thus, the software for updating the TLB is also implementation-dependent. For the purposes of this discussion, a unified TLB organization is assumed. The differences for an implementation with separate instruction and data TLBs are for the most part obvious (e.g. separate instructions or separate index ranges for reading, writing, searching, and invalidating each TLB). For details on how to synchronize TLB updates with instruction execution see Chapter 10. Maintenance of TLB entries is under software control. System software determines TLB entry replacement strategy and the format and use of any page state information. The TLB entry contains all the information required to identify the page, to specify the translation, to specify access controls, and to specify the storage control attributes. The format of the TLB entry is implementation-dependent. While the TLB is managed by software, an implementation may include partial or full hardware assist for TLB management (e.g. support of the Server environment's virtual memory architecture). However, such implementations should be able to disable such support with implementation-dependent software or hardware configuration mechanisms. A TLB entry is written by copying information from a GPR or other implementation-dependent source, using a series of *tlbwe* instructions (see page 660). A TLB entry is read by copying information to a GPR or other implementation-dependent target, using a series of *tlbre* instructions (see page 658). Software can also search for specific TLB entries using the *tlbsx* instruction (see page 659). Writing, reading and searching the TLB is implementation-dependent. Each TLB entry describes a page that is eligible for translation and access controls. Fields in the TLB entry fall into four categories: - Page identification fields (information required to identify the page to the hardware translation mechanism). - Address translation fields - Access control fields - Storage attribute fields While the fields in the TLB entry are required, no particular TLB entry format is formally specified. The *tlbre* and *tlbwe* instructions provide the ability to read or write portions of individual entries. Below are shown the field definitions for the TLB entry. # Page Identification Fields Name Description #### EPN **Effective Page Number** (up to 54 bits) Bits 0:*n*–1 of the EPN field are compared to bits 0:*n*–1 of the effective address (EA) of the storage access (where *n*=64–log2(*page size* in bytes) and *page size* is specified by the SIZE field of the TLB entry). See Table 1. Note: Bits X:Y of the EPN field may be implemented, where X=0 or X=32, and Y ≤ 53. The number of bits implemented for EPN are not required to be the same number of bits as are implemented for RPN. #### TS Translation Address Space This bit indicates the address space this TLB entry is associated with. For instruction storage accesses, MSR<sub>IS</sub> must match the value of TS in the TLB entry for that TLB entry to provide the translation. Likewise, for data storage accesses, MSR<sub>DS</sub> must match the value of TS in the TLB entry. For *tlbsx* and *tlbivax* instructions, an implementation-dependent source provides the address space specification that must match the value of TS. #### SIZE Page Size The SIZE field specifies the size of the page associated with the TLB entry as $4^{\text{SIZE}}$ KB, where $0 \le \text{SIZE} \le 15$ . Implementations may implement any one or more of these page sizes. See Table 1. TID **Translation ID** (implementation-dependent size) Field used to identify a shared page (TID=0) or the owner's process ID of a private page (TID≠0). See Section 4.7.2. #### V Valid This bit indicates that this TLB entry is valid and may be used for translation. The Valid bit for a given entry can be set or cleared with a *tlbwe* instruction; alternatively, the Valid bit for an entry may be cleared by a *tlbivax* instruction. #### Translation Field #### Name Description ### RPN Real Page Number (up to 54 bits) Bits 0:*n*–1 of the RPN field are used to replace bits 0:*n*–1 of the effective address to produce the real address for the storage access (where *n*=64–log2(*page size* in bytes) and *page size* is specified by the SIZE field of the TLB entry). Software must set unused loworder RPN bits (i.e. bits *n*:53) to 0. See Section 4.7.3. **Note:** Bits X:Y of the RPN field may be implemented, where $X \ge 0$ and $53 \ge Y$ . The number of bits implemented for EPN are not required to be the same number of bits as are implemented for RPN. Storage Control Bits (see Section 4.8.3 on page 650) ### Name Description - W Write-Through Required See Section 1.6.1 of Book II. - I Caching Inhibited See Section 1.6.2 of Book II. - M Memory Coherence Required See Section 1.6.3 of Book II. - G Guarded See Section 1.6.4 of Book II and Section 4.8.1. - E **Endian Mode** See Section 1.10.1 of Book I and Section 1.6.5 of Book II. ### U0:U3 User-Definable Storage Control Attributes See Section 4.8.2. Specifies implementation-dependent and system-dependent storage control attributes for the page associated with the TLB entry. VLE Variable Length Encoding [Category: VLE] See Section 4.8.3 and Chapter 1 of Book VLE. #### Access Control Fields ### Name Description - UX **User State Execute Enable** See Section 4.7.4.1. - 0 Instruction fetch and execution is not permitted from this page while MSR<sub>PR</sub>=1 and will cause an Execute Access Control exception type Instruction Storage interrupt. - 1 Instruction fetch and execution is permitted from this page while MSR<sub>PR</sub>=1. - SX Supervisor State Execute Enable See Section 4.7.4.1. - 0 Instruction fetch and execution is not permitted from this page while MSR<sub>PR</sub>=0 and will cause an Execute Access Control exception type Instruction Storage interrupt. - Instruction fetch and execution is permitted from this page while MSR<sub>PR</sub>=1. - UW **User State Write Enable** See Section 4.7.4.2. - 0 Store operations, including dcba dcbz, and dcbzep are not permitted to this page when MSR<sub>PR</sub>=1 and will cause a Write Access Control exception. Except as noted in Table 3 on page 648, a Write Access Control exception will cause a Data Storage interrupt. - 1 Store operations, including dcba, dcbz, and dcbzep are permitted to this page when MSR<sub>PR</sub>=1. - SW Supervisor State Write Enable See Section 4.7.4.2. - 0 Store operations, including dcba, dcbi, dcbz, and dcbzep are not permitted to this page when MSR<sub>PR</sub>=0. Store operations, including dcbi, dcbz, and dcbzep, will cause a Write Access Control exception. Except as noted in Table 3 on page 648, a Write Access Control exception will cause a Data Storage interrupt. - 1 Store operations, including dcba, dcbi, dcbz, and dcbzep, are permitted to this page when MSR<sub>PR</sub>=0. - UR **User State Read Enable** See Section 4.7.4.3. - 0 Load operations (including load-class Cache Management instructions) are not permitted from this page when MSR<sub>PR</sub>=1 and will cause a Read Access Control exception. Except as noted in Table 3 on page 648, a Read Access Control exception will cause a Data Storage interrupt. - 1 Load operations (including load-class Cache Management instructions) are permitted from this page when MSR<sub>PR</sub>=1. - SR Supervisor State Read Enable See Section 4.7.4.3. - 0 Load operations (including load-class Cache Management instructions) are not permitted from this page when MSR<sub>PR</sub>=0 and will cause a Read Access Control exception. Except as noted in Table 3 on page 648, a Read Access Control exception will cause a Data Storage interrupt. - 1 Load operations (including load-class Cache Management instructions) are permitted from this page when MSR<sub>PR</sub>=0. ### 4.7.2 Page Identification Instruction effective addresses are generated for sequential instruction fetches and for addresses that correspond to a change in program flow (branches, interrupts). Data effective addresses are generated by Load, Store, and Cache Management instructions. TLB Management instructions generate effective addresses to determine the presence of or to invalidate a specific TLB entry associated with that address. The Valid (V) bit, Effective Page Number (EPN) field, Translation Space Identifier (TS) bit, Page Size (SIZE) field, and Translation ID (TID) field of a particular TLB entry identify the page associated with that TLB entry. Except as noted, all comparisons must succeed to validate this entry for subsequent translation and access control processing. Failure to locate a matching TLB entry based on this criteria for instruction fetches will result in an Instruction TLB Miss exception type Instruction TLB entry based on this criteria for data storage accesses will result in a Data TLB Miss exception which may result in a Data TLB Error interrupt. Figure 8 on page 644 illustrates the criteria for a virtual address to match a specific TLB entry. There are two address spaces, one typically associated with interrupt-related storage accesses and one typically associated with non-interrupt-related storage accesses. There are two bits in the Machine State Register, the Instruction Address Space bit (IS) and the Data Address Space bit (DS), that control which address space instruction and data storage accesses, respectively, are performed in, and a bit in the TLB entry (TS) that specifies which address space that TLB entry is associated with. Load, Store, Cache Management, Branch, tlbsx, and tlbivax instructions and next-sequential-instruction fetches produce a 64-bit effective address. The virtual address space is extended from this 64-bit effective address space by prepending a one-bit address space identifier and a process identifier. For instruction fetches, the address space identifier is provided by MSR<sub>IS</sub> and the process identifier is provided by the contents of the Process ID Register. For data storage accesses, the address space identifier is provided by the MSR<sub>DS</sub> and the process identifier is provided by the contents of the Process ID Register. For tlbsx, and tlbivax instructions, the address space identifier and the process identifier are provided by implementation-dependent sources. This virtual address is used to locate the associated entry in the TLB. The address space identifier, the process identifier, and the effective address of the storage access are compared to the Translation Address Space bit (TS), the Translation ID field (TID), and the value in the Effective Page Number field (EPN), respectively, of each TLB entry. The virtual address of a storage access matches a TLB entry if, for every TLB entry i in the congruence class specified by EA: - the value of the address specifier for the storage access (MSR<sub>IS</sub> for instruction fetches, MSR<sub>DS</sub> for data storage accesses, and implementationdependent source for *tlbsx* and *tlbivax*) is equal to the value of the TS bit of the TLB entry, and - either the value of the process identifier (Process ID Register for instruction and data storage accesses, and implementation-dependent source for *tlbsx* and *tlbivax*) is equal to the value in the TID field of the TLB entry, or the value of the TID field of the TLB entry is equal to 0, and - the contents of bits 0:n-1 of the effective address of the storage or TLB access are equal to the value of bits 0:n-1 of the EPN field of the TLB entry (where n=64-log<sub>2</sub>(page size in bytes) and page size is specified by the value of the SIZE field of the TLB entry). See Table 1. A TLB Miss exception occurs if there is no valid entry in the TLB for the page specified by the virtual address (Instruction or Data TLB Error interrupt). Although the possibility to place multiple entries into the TLB that match a specific virtual address exists, assuming a setassociative or fully-associative organization, doing so is a programming error and the results are undefined. Table 1: Page Size and Effective Address to EPN Comparison | SIZE | Page Size<br>(4 <sup>SIZE</sup> KB) | EA to EPN Comparison (bits 0:53–2¥SIZE) | |---------|-------------------------------------|-----------------------------------------| | =0b0000 | 1KB | $EPN_{0:53} = ? EA_{0:53}$ | | =0b0001 | 4KB | $EPN_{0:51} = ? EA_{0:51}$ | | =0b0010 | 16KB | $EPN_{0:49} = ? EA_{0:49}$ | | =0b0011 | 64KB | $EPN_{0:47} = ? EA_{0:47}$ | | =0b0100 | 256KB | $EPN_{0:45} = ? EA_{0:45}$ | | =0b0101 | 1MB | $EPN_{0:43} = ? EA_{0:43}$ | | =0b0110 | 4MB | $EPN_{0:41} = ? EA_{0:41}$ | | =0b0111 | 16MB | $EPN_{0:39} = ? EA_{0:39}$ | | =0b1000 | 64MB | $EPN_{0:37} = ? EA_{0:37}$ | | =0b1001 | 256MB | $EPN_{0:35} = ? EA_{0:35}$ | | =0b1010 | 1GB | $EPN_{0:33} = ? EA_{0:33}$ | | =0b1011 | 4GB | $EPN_{0:31} = ? EA_{0:31}$ | | =0b1100 | 16GB | $EPN_{0:29} = ? EA_{0:29}$ | | =0b1101 | 64GB | $EPN_{0:27} = ? EA_{0:27}$ | | =0b1110 | 256GB | $EPN_{0:25} = ? EA_{0:25}$ | | =0b1111 | 1TB | $EPN_{0:23} = ? EA_{0:23}$ | Figure 8. Virtual Address to TLB Entry Match Process Figure 9. Effective-to-Real Address Translation Flow ### 4.7.3 Address Translation A program references memory by using the effective address computed by the processor when it executes a *Load, Store, Cache Management*, or *Branch* instruction, and when it fetches the next instruction. The effective address is translated to a real address according to the procedures described in this section. The storage subsystem uses the real address for the access. All storage access effective addresses are translated to real addresses using the TLB mechanism. See Figure If the virtual address of the storage access matches a TLB entry in accordance with the selection criteria specified in Section 4.7.2, the value of the Real Page Number field (RPN) of the selected TLB entry provides the real page number portion of the real address. Let $n=64-\log_2(page\ size\ in\ bytes)$ where $page\ size$ is specified by the SIZE field of the TLB entry. Bits n:63 of the effective address are appended to bits 0:n-1 of the 54-bit RPN field of the selected TLB entry to produce the 64-bit real address (i.e. $RA = RPN_{0:n-1} \mid\mid EA_{n:63}$ ). The page size is determined by the value of the SIZE field of the selected TLB entry. See Table 2. The rest of the selected TLB entry provides the access control bits (UX, SX, UW, SW, UR, SR), and storage control attributes (U0, U1, U2, U3, W, I, M, G, E) for the storage access. The access control bits and storage attribute bits specify whether or not the access is allowed and how the access is to be performed. See Sections 4.7.4 and 4.7.5. The Real Page Number field (RPN) of the matching TLB entry provides the translation for the effective address of the storage access. Based on the setting of the SIZE field of the matching TLB entry, the RPN field replaces the corresponding most-significant N bits of the effective address (where N = $64 - \log_2(\text{page size})$ ), as shown in Table 2, to produce the 64-bit real address that is to be presented to main storage to perform the storage access. | Table 2: E | Table 2: Effective Address to Real Address | | | | | | | |-----------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--|--| | SIZE | Page<br>Size<br>(4 <sup>SIZE</sup><br>KB) | RPN Bits<br>Required<br>to be Equal<br>to 0 | Real Address | | | | | | =0b0000<br>=0b0001<br>=0b0010<br>=0b0011<br>=0b0100<br>=0b0101<br>=0b0111<br>=0b1000<br>=0b1001<br>=0b1010<br>=0b1011 | 1KB<br>4KB<br>16KB<br>64KB<br>256KB<br>1MB<br>4MB<br>16MB<br>64MB<br>256MB<br>1GB<br>4GB | none<br>RPN <sub>52:53</sub> =0<br>RPN <sub>50:53</sub> =0<br>RPN <sub>48:53</sub> =0<br>RPN <sub>46:53</sub> =0<br>RPN <sub>42:53</sub> =0<br>RPN <sub>42:53</sub> =0<br>RPN <sub>40:53</sub> =0<br>RPN <sub>36:53</sub> =0<br>RPN <sub>36:53</sub> =0<br>RPN <sub>32:53</sub> =0<br>RPN <sub>30:53</sub> =0<br>RPN <sub>30:53</sub> =0 | RPN <sub>0:53</sub> EA <sub>54:63</sub><br>RPN <sub>0:51</sub> EA <sub>52:63</sub><br>RPN <sub>0:49</sub> EA <sub>50:63</sub><br>RPN <sub>0:47</sub> EA <sub>48:63</sub><br>RPN <sub>0:45</sub> EA <sub>46:63</sub><br>RPN <sub>0:45</sub> EA <sub>44:63</sub><br>RPN <sub>0:41</sub> EA <sub>42:63</sub><br>RPN <sub>0:39</sub> EA <sub>40:63</sub><br>RPN <sub>0:37</sub> EA <sub>38:63</sub><br>RPN <sub>0:35</sub> EA <sub>36:63</sub><br>RPN <sub>0:33</sub> EA <sub>34:63</sub><br>RPN <sub>0:31</sub> EA <sub>32:63</sub><br>RPN <sub>0:31</sub> EA <sub>32:63</sub><br>RPN <sub>0:29</sub> EA <sub>30:63</sub> | | | | | | =0b1101<br>=0b1110<br>=0b1111 | 64GB<br>256GB<br>1TB | RPN <sub>28:53</sub> =0<br>RPN <sub>26:53</sub> =0<br>RPN <sub>24:53</sub> =0 | RPN <sub>0:25</sub> EA <sub>28:63</sub><br>RPN <sub>0:25</sub> EA <sub>26:63</sub><br>RPN <sub>0:23</sub> EA <sub>24:63</sub> | | | | | Figure 10. Access Control Process ### 4.7.4 Storage Access Control After a matching TLB entry has been identified, an access control mechanism selectively grants shared access, grants execute access, grants read access, grants write access, and prohibits access to areas of storage based on a number of criteria. Figure 10 illustrates the access control process and is described in detail in Sections 4.7.4.1 through 4.7.4.5. An Execute, Read, or Write Access Control exception occurs if the appropriate TLB entry is found but the access is not allowed by the access control mechanism (Instruction or Data Storage interrupt). See Section 5.6 for additional information about these and other interrupt types. In certain cases, Execute, Read, and Write Access Control exceptions may result in the restart of (re-execution of at least part of) a *Load* or *Store* instruction Some implementation may provide additional access control capabilities beyond that described here. ### 4.7.4.1 Execute Access The UX and SX bits of the TLB entry control *execute* access to the page (see Table 3). Instructions may be fetched and executed from a page in storage while in user state (MSR<sub>PR</sub>=1) if the UX access control bit for that page is equal to 1. If the UX access control bit is equal to 0, then instructions from that page will not be fetched, and will not be placed into any cache as the result of a fetch request to that page while in user state. Instructions may be fetched and executed from a page in storage while in supervisor state (MSR<sub>PR</sub>=0) if the SX access control bit for that page is equal to 1. If the SX access control bit is equal to 0, then instructions from that page will not be fetched, and will not be placed into any cache as the result of a fetch request to that page while in supervisor state. Instructions from no-execute storage may be in the instruction cache if they were fetched into that cache when their effective addresses were mapped to execute permitted storage. Software need not flush a page from the instruction cache before marking it no-execute. Furthermore, if the sequential execution model calls for the execution of an instruction from a page that is not enabled for execution (i.e. UX=0 when $MSR_{PR}=1$ or SX=0 when $MSR_{PR}=0$ ), an Execute Access Control exception type Instruction Storage interrupt is taken. ### 4.7.4.2 Write Access The UW and SW bits of the TLB entry control *write* access to the page (seeTable 3). Store operations (including *Store*-class *Cache Management* instructions) are permitted to a page in storage while in user state (MSR<sub>PR</sub>=1) if the UW access control bit for that page is equal to 1. If the UW access control bit is equal to 0, then execution of the *Store* instruction is suppressed and a Write Access Control exception type Data Storage interrupt is taken. Store operations (including *Store*-class *Cache Management* instructions) are permitted to a page in storage while in supervisor state (MSR<sub>PR</sub>=0) if the SW access control bit for that page is equal to 1. If the SW access control bit is equal to 0, then execution of the *Store* instruction is suppressed and a Write Access Control exception type Data Storage interrupt is taken. ### 4.7.4.3 Read Access The UR and SR bits of the TLB entry control *read* access to the page (see Table 3). Load operations (including *Load*-class *Cache Management* instructions) are permitted from a page in storage while in user state (MSR<sub>PR</sub>=1) if the UR access control bit for that page is equal to 1. If the UR access control bit is equal to 0, then execution of the *Load* instruction is suppressed and a Read Access Control exception type Data Storage interrupt is taken. Load operations (including *Load*-class *Cache Management* instructions) are permitted from a page in storage while in supervisor state (MSR<sub>PR</sub>=0) if the SR access control bit for that page is equal to 1. If the SR access control bit is equal to 0, then execution of the *Load* instruction is suppressed and a Read Access Control exception type Data Storage interrupt is taken. # 4.7.4.4 Storage Access Control Applied to Cache Management Instructions dcbi, dcbz, and dcbzep instructions are treated as Stores since they can change data (or cause loss of data by invalidating a dirty line). As such, they both can cause Write Access Control exception type Data Storage interrupts. If an implementation first flushes a line before invalidating it during a dcbi, the dcbi is treated as a a Load since the data is not modified. **dcba** instructions are treated as *Stores* since they can change data. As such, they can cause Write Access Control exceptions. However, such exceptions will not result in a Data Storage interrupt. *icbi* and *icbiep* instructions are treated as *Loads* with respect to protection. As such, they can cause Read Access Control exception type Data Storage interrupts. **dcbt**, **dcbtep**, **dcbtst**, **dcbtstep**, and **icbt** instructions are treated as *Loads* with respect to protection. As such, they can cause Read Access Control exceptions. However, such exceptions will not result in a Data Storage interrupt. dcbf, dcbfep, dcbst, and dcbstep instructions are treated as Loads with respect to protection. Flushing or storing a line from the cache is not considered a Store since the store has already been done to update the cache and the dcbf, dcbfep, dcbst, or dcbstep instruction is only updating the copy in main storage. As a Load, they can cause Read Access Control exception type Data Storage interrupts. Table 3: Storage Access Control Applied to Cache Instructions | Instruction | Read Protection<br>Violation | Write Protection<br>Violation | |-------------|------------------------------|-------------------------------| | dcba | No | Yes <sup>2</sup> | | dcbf | Yes | No | | dcbfep | Yes | No | | dcbi | Yes <sup>3</sup> | Yes <sup>3</sup> | | dcblc | Yes | No | | dcbst | Yes | No | | dcbstep | Yes | No | | dcbt | Yes <sup>1</sup> | No | | dcbtep | Yes <sup>1</sup> | No | | dcbtls | Yes | No | | dcbtst | Yes <sup>1</sup> | No | | dcbtstep | Yes <sup>1</sup> | No | | dcbtstls | Yes <sup>4</sup> | Yes <sup>4</sup> | | dcbz | No | Yes | | dcbzep | No | Yes | | dci | No | No | | icbi | Yes | No | | icbiep | Yes | No | | icblc | Yes <sup>5</sup> | No | | icbt | Yes <sup>1</sup> | No | | icbtls | Yes <sup>5</sup> | No | | ici | No | No | - dcbt, dcbtep, dcbtst, dcbtstep, and icbt may cause a Read Access Control exception but does not result in a Data Storage interrupt. - 2. *dcba* may cause a Write Access Control exception but does not result in a Data Storage interrupt. - dcbi may cause a Read or Write Access Control Exception based on whether the data is flushed prior to invalidation. - 4. It is implementation-dependent whether *dcbtstls* is treated as a *Load* or a *Store*. - 5. icbtls and icblc require execute or read access. # 4.7.4.5 Storage Access Control Applied to String Instructions When the string length is zero, neither *Iswx* nor *stswx* can cause Data Storage interrupts. ### 4.7.5 TLB Management No format for the Page Tables or the Page Table Entries is implied. Software has significant flexibility in implementing a custom replacement strategy. For example, software may choose to lock TLB entries that correspond to frequently used storage, so that those entries are never cast out of the TLB and TLB Miss exceptions to those pages never occur. At a minimum, software must maintain an entry or entries for the Instruction and Data TLB Error interrupt handlers. TLB management is performed in software with some hardware assist. This hardware assist consists of a minimum of: - Automatic recording of the effective address causing a TLB Miss exception. For Instruction TLB Miss exceptions, the address is saved in the Save/Restore Register 0. For Data TLB Miss exceptions, the address is saved in the Data Exception Address Register. - Instructions for reading, writing, searching, invalidating, and synchronizing the TLB (see Section 4.9.4.1). ### **Programming Note** This Note suggests one example for managing reference and change recording. When performing physical page management, it is useful to know whether a given physical page has been referenced or altered. Note that this may be more involved than whether a given TLB entry has been used to reference or alter memory, since multiple TLB entries may translate to the same physical page. If it is necessary to replace the contents of some physical page with other contents, a page which has been referenced (accessed for any purpose) is more likely to be maintained than a page which has never been referenced. If the contents of a given physical page are to be replaced, then the contents of that page must be written to the backing store before replacement, if anything in that page has been changed. Software must maintain records to control this process. Similarly, when performing TLB management, it is useful to know whether a given TLB entry has been referenced. When making a decision about which entry to cast-out of the TLB, an entry which has been referenced is more likely to be maintained in the TLB than an entry which has never been referenced. Execute, Read and Write Access Control exceptions may be used to allow software to maintain reference information for a TLB entry and for its associated physical page. The entry is built, with its UX, SX, UR, SR, UW, and SW bits off, and the index and effective page number of the entry retained by software. The first attempt of application code to use the page will cause an Access Control exception (because the entry is marked "No Execute", "No Read", and "No Write"). The Instruction or Data Storage interrupt handler records the reference to the TLB entry and to the associated physical page in a software table, and then turns on the appropriate access control bit. An initial read from the page could be handled by only turning on the appropriate UR or SR access control bits, leaving the page "read-only". Subsequent execute, read, or write accesses to the page via this TLB entry will proceed normally. In a demand-paged environment, when the contents of a physical page are to be replaced, if any storage in that physical page has been altered, then the backing storage must be updated. The information that a physical page is dirty is typically recorded in a "Change" bit for that page. Write Access Control exceptions may be used to allow software to maintain change information for a physical page. For the example just given for reference recording, the first write access to the page via the TLB entry will create a Write Access Control exception type Data Storage interrupt. The Data Storage interrupt handler records the change status to the physical page in a software table, and then turns on the appropriate UW and SW bits. All subsequent accesses to the page via this TLB entry will proceed normally. ### 4.8 Storage Control Attributes This section describes aspects of the storage control attributes that are relevant only to privileged software programmers. The rest of the description of storage control attributes may be found in Section 1.6 of Book II and subsections. ### 4.8.1 Guarded Storage Storage is said to be "well-behaved" if the corresponding real storage exists and is not defective, and if the effects of a single access to it are indistinguishable from the effects of multiple identical accesses to it. Data and instructions can be fetched out-of-order from well-behaved storage without causing undesired side effects. Storage is said to be Guarded if the G bit is 1 in the TLB entry that translates the effective address. In general, storage that is not well-behaved should be Guarded. Because such storage may represent a control register on an I/O device or may include locations that do not exist, an out-of-order access to such storage may cause an I/O device to perform unintended operations or may result in a Machine Check. Instruction fetching is not affected by the G bit. Software must set guarded pages to no execute (i.e. UX=0 and SX=0) to prevent instruction fetching from guarded storage. The following rules apply to in-order execution of *Load* and *Store* instructions for which the first byte of the storage operand is in storage that is both Caching Inhibited and Guarded. Load or Store instruction that causes an atomic access If any portion of the storage operand has been accessed, the instruction completes before the interrupt occurs if any of the following exceptions is pending. - External, Decrementer, Critical Input, Machine Check, Fixed-Interval Timer, Watchdog Timer, Debug, or Imprecise mode Floating-Point or Auxiliary Processor Enabled - Load or Store instruction that causes an Alignment exception, a Data TLB Error exception, or that causes a Data Storage exception. The portion of the storage operand that is in Caching Inhibited and Guarded storage is not accessed. # 4.8.1.1 Out-of-Order Accesses to Guarded Storage In general, Guarded storage is not accessed out-oforder. The only exceptions to this rule are the following. #### **Load Instruction** If a copy of any byte of the storage operand is in a cache then that byte may be accessed in the cache or in main storage. ### 4.8.2 User-Definable User-definable storage control attributes control user-definable and implementation-dependent behavior of the storage system. These bits are both implementation-dependent and system-dependent in their effect. They may be used in any combination and also in combination with the other storage attribute bits. ### 4.8.3 Storage Control Bits Storage control attributes are specified on a per-page basis. These attributes are specified in storage control bits in the TLB entries. The interpretation of their values is given in Figure 11. | Bit | Storage Control Attribute | |--------------------|--------------------------------------------------------------------| | W <sup>1</sup> | 0 - not Write Through Required 1 - Write Through Required | | I | 0 - not Caching Inhibited<br>1 - Caching Inhibited | | M <sup>2</sup> | 0 - not Memory Coherence Required<br>1 - Memory Coherence Required | | G | 0 - not Guarded<br>1 - Guarded | | E <sup>3</sup> | 0 - Big-Endian<br>1 - Little-Endian | | U0-U3 <sup>4</sup> | User-Definable | | VLE <sup>5</sup> | 0 - non Variable Length Encoding (VLE).<br>1 - VLE | - Support for the 1 value of the W bit is optional. Implementations that do not support the 1 value treat the bit as reserved and assume its value to be 0. - Support of the 1 value is optional for implementations that do not support multiprocessing, implementations that do not support this storage attribute assume the value of the bit to be 0, and setting M=1 in a TLB entry will have no effect. - <sup>3</sup> [Category: Embedded.Little-Endian] - Support for these attributes is optional. - 5 [Category: VLE] Figure 11. Storage control bits In Section 4.8.3.1 and 4.8.3.2, "access" includes accesses that are performed out-of-order. #### Programming Note — In a uniprocessor system in which only the processor has caches, correct coherent execution does not require the processor to access storage as Memory Coherence Required, and accessing storage as not Memory Coherence Required may give better performance. ### 4.8.3.1 Storage Control Bit Restrictions All combinations of W, I, M, G, and E values are permitted except those for which both W and I are 1. ### Programming Note - If an application program requests both the Write Through Required and the Caching Inhibited attributes for a given storage location, the operating system should set the I bit to 1 and the W bit to 0. At any given time, the value of the I bit must be the same for all accesses to a given real page. Accesses to the same storage location using two effective addresses for which the W bit differs meet the memory coherence requirements described in Section 1.6.3 of Book II if the accesses are performed by a single processor. If the accesses are performed by two or more processors, coherence is enforced by the hardware only if the W bit is the same for all the accesses. At any given time, data accesses to a given real page may use both Endian modes. When changing the Endian mode of a given real page for instruction fetching, care must be taken to prevent accesses while the change is made and to flush the instruction cache(s) after the change has been completed. # 4.8.3.2 Altering the Storage Control Bits When changing the value of the I bit for a given real page from 0 to 1, software must set the I bit to 1 and then flush all copies of locations in the page from the caches using *dcbf*, *dcbfep*, or *dcbi*, and *icbi* or *icbiep* before permitting any other accesses to the page. When changing the value of the W bit for a given real page from 0 to 1, software must ensure that no processor modifies any location in the page until after all copies of locations in the page that are considered to be modified in the data caches have been copied to main storage using *dcbst*, *dcbstep*, *dcbf*, *dcbfep*, or *dcbi*. When changing the value of the M bit for a given real page, software must ensure that all data caches are consistent with main storage. The actions required to do this to are system-dependent. ### Programming Note - For example, when changing the M bit in some directory-based systems, software may be required to execute *dcbf* or *dcbfep* on each processor to flush all storage locations accessed with the old M value before permitting the locations to be accessed with the new M value. ### 4.9 Storage Control Instructions ### 4.9.1 Cache Management Instructions This section describes aspects of cache management that are relevant only to privileged software programmers. For a *dcbz* or *dcba* instruction that causes the target block to be newly established in the data cache without being fetched from main storage, the processor need not verify that the associated real address is valid. The existence of a data cache block that is associated with an invalid real address (see Section 4.6) can cause a delayed Machine Check interrupt or a delayed Checkstop. Each implementation provides an efficient means by which software can ensure that all blocks that are considered to be modified in the data cache have been copied to main storage before the processor enters any power conserving mode in which data cache contents are not maintained. ### Data Cache Block Invalidate X-form dcbi RA,RB if RA=0 then b $\leftarrow$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + (RB) InvalidateDataCacheBlock(EA) Let the effective address (EA) be the sum (RA|0)+(RB). If the block containing the byte addressed by EA is in storage that is Memory Coherence Required and a block containing the byte addressed by EA is in the data cache of any processors, then the block is invalidated in those data caches. On some implementations, before the block is invalidated, if any locations in the block are considered to be modified in any such data cache, those locations are written to main storage and additional locations in the block may be written to main storage. If the block containing the byte addressed by EA is in storage that is not Memory Coherence Required and a block containing the byte addressed by EA is in the data cache of this processor, then the block is invalidated in that data cache. On some implementations, before the block is invalidated, if any locations in the block are considered to be modified in that data cache, those locations are written to main storage and additional locations in the block may be written to main storage. The function of this instruction is independent of whether the block containing the byte addressed by EA is in storage that is Write Through Required or Caching Inhibited. This instruction is treated as a *Store* (see Section 4.7.4.4) on implementations that invalidate a block without first writing to main storage all locations in the block that are considered to be modified in the data cache, except that the invalidation is not ordered by **mbar**. On other implementations this instruction is treated as a *Load* (see the section cited above). If a processor holds a reservation and some other processor executes a *dcbi* to the same reservation granule, whether the reservation is lost is undefined. **dcbi** may cause a cache locking exception, the details of which are implementation-dependent. This instruction is privileged. **Special Registers Altered:** None ### 4.9.2 Cache Locking [Category: Embedded Cache Locking] The Embedded Cache Locking category defines instructions and methods for locking cache blocks for frequently used instructions and data. Cache locking allows software to instruct the cache to keep latency sensitive data readily available for fast access. This is accomplished by marking individual cache blocks as locked. A locked block differs from a normal block in the cache in the following way: blocks that are locked in the cache do not participate in the normal replacement policy when a block must be replaced. ### 4.9.2.1 Lock Setting and Clearing Blocks are locked into the cache by software using *Cache Locking* instructions. The following instructions are provided to lock data items into the data and instruction cache: - dcbtls Data cache block touch and lock set. - dcbtstls Data cache block touch for store and lock set. - icbtls Instruction cache block touch and lock set. The RA and RB operands in these instructions are used to identify the block to be locked. The CT field indicates which cache in the cache hierarchy should be targeted. (See Section 3.3 of Book II.) These instructions are similar in nature to the *dcbt*, *dcbtst*, and *icbt* instructions, but are not hints and thus locking instructions do not execute speculatively and may cause additional exceptions. For unified caches, both the instruction lock set and the data lock set target the same cache. Similarly, blocks are unlocked from the cache by software using *Lock Clear* instructions. The following instructions are provided to unlock instructions and data in their respective caches: - dcblc Data cache block lock clear. - *icblc* Instruction cache block lock clear. The RA and RB operands in these instructions are used to identify the block to be unlocked. The CT field indicates which cache in the cache hierarchy should be targeted. Additionally, an implementation-dependent method can be provided for software to clear all the locks in the cache. An implementation is not required to unlock blocks that contain data that has been invalidated unless it is explicitly unlocked with a *dcblc* or *icblc* instruction; if the implementation does not unlock the block upon invalidation, the block remains locked even though it contains invalid data. If the implementation does not clear locks when the associated block is invalidated, the method of locking is said to be *persistent*; otherwise it is *not persistent*. An implementation may choose to implement locks as persistent or not persistent; however, the preferred method is persistent. It is implementation-dependent if cache blocks are implicitly unlocked in the following ways: - A locked block is invalidated as the result of a dcbi, dcbf, dcbfep, icbi, or icbiep instruction. - A locked block is evicted because of an overlocking condition. - A snoop hit on a locked block that requires the block to be invalidated. This can occur because the data the block contains has been modified external to the processor, or another processor has explicitly invalidated the block. - The entire cache containing the locked block is invalidated. ### 4.9.2.2 Error Conditions Setting locks in the cache can fail for a variety of reasons. A *Lock Set* instruction addressing a byte in storage that is not allowed to be accessed by the storage access control mechanism (see Section 4.7.4) will cause a Data Storage interrupt (DSI). Addresses referenced by *Cache Locking* instructions are always translated as data references; therefore, *icbtls* instructions that fail to translate or are not allowed by the storage access control mechanism cause Data TLB Error interrupts and Data Storage interrupts, respectively. Additionally, cache locking and clearing operations can fail due to non-privileged access. The methods for determining other failure conditions such as unable-to-lock or overlocking (see below), is implementation-dependent When a *Cache Locking* instruction is executed in user mode and MSR<sub>UCLE</sub> is 0, a Data Storage interrupt occurs and one of the following ESR bits is set to 1. #### Bit Description - 42 **DLK**<sub>0</sub> - Default setting. - 1 A dcbtls, dcbtstls, or dcblc instruction was executed in user mode. - 43 **DLK**<sub>1</sub> - Default setting. - 1 An *icbtls* or *icblc* instruction was executed in user mode. ### 4.9.2.2.1 Overlocking If no exceptions occur for the execution of an *dcbtls*, *dcbtstls*, or *icbtls* instruction, an attempt is made to lock the specified block into the cache. If all of the available cache blocks into which the specified block may be loaded are already locked, an overlocking condition occurs. The overlocking condition may be reported in an implementation-dependent manner. If an overlocking condition occurs, it is implementationdependent whether the specified block is not locked into the cache or if another locked block is evicted and the specified block is locked. The selection of which block is replaced in an overlocking situation is implementation-dependent. The overlocking condition is still said to exist, and is reflected in any implementation-dependent overlocking status. An attempt to lock a block that is already present and valid in the cache will not cause an overlocking condition. If a cache block is to be loaded because of an instruction other than a *Cache Management* or *Cache Locking* instruction and all available blocks into which the block can be loaded are locked, the instruction executes and completes, but no cache blocks are unlocked and the block is not loaded into the cache. ### - Programming Note Since caches may be shared among processors, an overlocking condition may occur when loading a block even though a given processor has not locked all the available cache blocks. Similarly, blocks may be unlocked as a result of invalidations by other processors. # 4.9.2.2.2 Unable-to-lock and Unable-to-unlock Conditions If no exceptions occur and no overlocking condition exists, an attempt to set or unlock a lock may fail if any of the following are true: - The target address is marked Caching Inhibited, or the storage attributes of the address use a coherency protocol that does not support locking. - The target cache is disabled or not present. - The CT field of the instructions contains a value not supported by the implementation. - Any other implementation-specific error conditions are detected. If an unable-to-lock or unable-to-unlock condition occurs, the lock set or unlock instruction is treated as a no-op and the condition may be reported in an implementation-dependent manner. ### 4.9.2.3 Cache Locking Instructions # Data Cache Block Touch and Lock Set X-form dcbtls CT,RA,RB | 31 | / | CT | RA | RB | 166 | / | |----|---|----|----|----|-----|----| | 0 | 6 | 7 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). The *dcbtls* instruction provides a hint that the program will probably soon load from the block containing the byte addressed by EA, and that the block containing the byte addressed by EA is to be loaded and locked into the cache specified by the CT field. (See Section 3.3 of Book II.) If the CT field is set to a value not supported by the implementation, no operation is performed. If the block already exists in the cache, the block is locked without accessing storage. If the block is in a storage location that is Caching Inhibited, then no cache operation is performed. An unable-to-lock condition may occur (see Section 4.9.2.2.2), or an overlocking condition may occur (see Section 4.9.2.2.1). The *dcbtls* instruction may complete before the operation it causes has been performed. The instruction is treated as a *Load*. This instruction is privileged unless the Embedded Cache Locking. User Mode category is supported. If the Embedded Cache Locking. User Mode category is supported, this instruction is privileged only if MSR<sub>UCLE</sub>=0. ### **Special Registers Altered:** None # Data Cache Block Touch for Store and Lock Set X-form dcbtstls CT,RA,RB | 31 | / | СТ | RA | RB | 134 | / | |----|---|----|----|----|-----|----| | 0 | 6 | 7 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). The *dcbtstls* instruction provides a hint that the program will probably soon store to the block containing the byte addressed by EA, and that the block containing the byte addressed by EA is to be loaded and locked into the cache specified by the CT field. (See Section 3.3 of Book II.) If the CT field is set to a value not supported by the implementation, no operation is performed. If the block already exists in the cache, the block is locked without accessing storage. If the block is in a storage location that is Caching Inhibited, then no cache operation is performed. An unable-to-lock condition may occur (see Section 4.9.2.2.2), or an overlocking condition may occur (see Section 4.9.2.2.1). The *dcbtstls* instruction may complete before the operation it causes has been performed. It is implementation-dependent whether the instruction is treated as a *Load* or a *Store*. This instruction is privileged unless the Embedded Cache Locking. User Mode category is supported. If the Embedded Cache Locking. User Mode category is supported, this instruction is privileged only if MSR<sub>UCLE</sub>=0. ### **Special Registers Altered:** None ### Instruction Cache Block Touch and Lock Set X-form icbtls CT,RA,RB | 31 | / | CT | RA | RB | 486 | / | |----|---|----|----|----|-----|----| | 0 | 6 | 7 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). The *icbtls* instruction causes the block containing the byte addressed by EA to be loaded and locked into the instruction cache specified by CT, and provides a hint that the program will probably soon execute code from the block. See Section 3.3 of Book II for a definition of the CT field. If the block already exists in the cache, the block is locked without refetching from memory. If the block is in storage that is Caching Inhibited, no cache operation is performed. This instruction treated as a *Load* (see Section 3.3), except that the system instruction storage error handler is not invoked. An unable-to-lock condition may occur (see Section 4.9.2.2.2), or an overlocking condition may occur (see Section 4.9.2.2.1). This instruction is privileged unless the Embedded Cache Locking. User Mode category is supported. If the Embedded Cache Locking. User Mode category is supported, this instruction is privileged only if MSR<sub>HCLE</sub>=0. ### Special Registers Altered: None ### Instruction Cache Block Lock Clear X-form icblc CT,RA,RB | 31 | / CT | RA | RB | 230 | / | |----|------|----|----|-----|----| | 0 | 6 7 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). The block containing the byte addressed by EA in the instruction cache specified by the CT field is unlocked. The instruction is treated as a Load. An unable-to-unlock condition may occur (see Section 4.9.2.2.2). If the block containing the byte addressed by EA is not locked in the specified cache, no cache operation is performed. This instruction is privileged unless the Embedded Cache Locking. User Mode category is supported. If the Embedded Cache Locking. User Mode category is supported, this instruction is privileged only if MSR<sub>UCLE</sub>=0. ### **Special Registers Altered:** None ### Data Cache Block Lock Clear X-form dcblc CT,RA,RB | | 31 | / | СТ | RA | RB | 390 | / | |---|----|---|----|----|----|-----|----| | 0 | | 6 | 7 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum (RA|0)+(RB). The block containing the byte addressed by EA in the data cache specified by the CT field is unlocked. The instruction is treated as a Load. An unable-to-unlock condition may occur (see Section 4.9.2.2.2). If the block containing the byte addressed by EA is not locked in the specified cache, no cache operation is performed. This instruction is privileged unless the Embedded Cache Locking. User Mode category is supported. If the Embedded Cache Locking. User Mode category is supported, this instruction is privileged only if MSR<sub>UCLE</sub>=0. ### **Special Registers Altered:** None ### Programming Note The **dcblc** and **icblc** instructions are used to remove locks previously set by the corresponding lock set instructions. ### 4.9.3 Synchronize Instruction The *Synchronize* instruction is described in Section 3.4.3 of Book II, but only at the level required by an application programmer. This section describes properties of the instruction that are relevant only to operating system programmers. In conjunction with the *tlbie* and *tlbsync* instructions, the *sync* instruction provides an ordering function for TLB invalidations and related storage accesses on other processors as described in the *tlbsync* instruction description on page 659. # 4.9.4 Lookaside Buffer Management All implementations include a TLB as the architected repository of translation, protection, and attribute information for storage. Each implementation that has a TLB or similar lookaside buffer provides a means by which software can invalidate the lookaside entry that translates a given effective address. ### Programming Note The invalidate all entries function is not required because each TLB entry can be addressed directly without regard to the contents of the entry. In addition, implementations provide a means by which software can do the following. - Read a specified TLB entry - Identify the TLB entry (if any) associated with a specified effective address - Write a specified TLB entry ### Programming Note - Because the presence, absence, and exact semantics of the *TLB Management* instructions are implementation-dependent, it is recommended that system software "encapsulate" uses of these instructions into subroutines to minimize the impact of moving from one implementation to another. ### 4.9.4.1 TLB Management Instructions The *tlbivax* instruction is used to invalidate TLB entries. Additional instructions are used to read and write, and search TLB entries, and to provide an ordering function for the effects of *tlbivax* ### TLB Invalidate Virtual Address Indexed X-form tlbivax (implementation-dependent) | 31 | ??? | ??? | ??? | 786 | / | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Bits 6:20 of the instruction encoding are implementation-dependent, and may be used to specify the TLB entry or entries to be invalidated. (E.g. they may specify virtual or effective addresses.) If a single *tlbivax* instruction can invalidate more entries than those corresponding to a single VA, a means must be provided to prevent specific TLB entries from being invalidated. If the Translation Lookaside Buffer (TLB) contains an entry specified, the entry or entries are made invalid (i.e. removed from the TLB). This instruction causes the target TLB entry to be invalidated in all processors. If the instruction specifies a TLB entry that does not exist, the results are undefined. Execution of this instruction may cause other implementation-dependent effects. The operation performed by this instruction is ordered by the *mbar* (or *sync*) instruction with respect to a subsequent *tlbsync* instruction executed by the processor executing the *tlbivax* instruction. The operations caused by *tlbivax* and *tlbsync* are ordered by *mbar* as a set of operations which is independent of the other sets that *mbar* orders. This instruction is privileged. ### Special Registers Altered: None #### Programming Note The effects of the invalidation may not be visible until the completion of a context synchronizing operation (see Section 1.6.1). ### Programming Note Care must be taken not to invalidate any TLB entry that contains the mapping for any interrupt vector. ### TLB Read Entry X-form tlbre (implementation-dependent) | Ī | 31 | ??? | ??? | ??? | 946 | / | |---|----|-----|-----|-----|-----|----| | ( | ) | 6 | 11 | 16 | 21 | 31 | Bits 6:20 of the instruction encoding are implementation-dependent, and may be used to specify the source TLB entry, the source portion of the source TLB entry, and the target resource that the result is placed into. The implementation-dependent-specified TLB entry is read, and the implementation-dependent-specified portion of the TLB entry is extracted and placed into an implementation-dependent target resource. If the instruction specifies a TLB entry that does not exist, the results are undefined. Execution of this instruction may cause other implementation-dependent effects. This instruction is privileged. ### **Special Registers Altered:** Implementation-dependent #### TLB Search Indexed #### X-form #### TLB Synchronize X-form tlbsx RA,RB, (implementation-dependent) | 31 | ??? | RA | RB | 914 | ? | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA=0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) AS \leftarrow implementation-dependent value ProcessID \leftarrow implementation-dependent value VA \leftarrow AS \mid\mid ProcessID \mid\mid EA If there is a TLB entry for which TLBentry_VA=VA then result \leftarrow implementation-dependent value else result \leftarrow undefined target resource(???) \leftarrow result ``` Let the effective address (EA) be the sum(RA|0)+ (RB). Let address space (AS) be defined as implementation-dependent (e.g. could be $MSR_{DS}$ or a bit from an implementation-dependent SPR). Let the ProcessID be defined as implementationdependent (e.g. could be from the PID register or from an implementation-dependent SPR). Let the virtual address (VA) be the value AS || ProcessID || EA. See Figure 9 on page 645. Bits 6:10 of the instruction encoding are implementation-dependent, and may be used to specify the target resource that the result of the instruction is placed into. If the Translation Lookaside Buffer (TLB) contains an entry corresponding to VA, an implementation-dependent value is placed into an implementation-dependent-specified target. Otherwise the contents of the implementation-dependent-specified target are left undefined. Bit 31 of the instruction encoding is implementationdependent. For example, bit 31 may be interpreted as an "Rc" bit, used to enable recording the success or failure of the search operation. This instruction is privileged. #### **Special Registers Altered:** None tlbsync | | 31 | /// | /// | /// | 566 | / | |---|----|-----|-----|-----|-----|----| | 0 | | 6 | 11 | 16 | 21 | 31 | The *tlbsync* instruction provides an ordering function for the effects of all *tlbivax* instructions executed by the processor executing the *tlbsync* instruction, with respect to the memory barrier created by a subsequent *sync* instruction executed by the same processor. Executing a *tlbsync* instruction ensures that all of the following will occur. All storage accesses by other processors for which the address was translated using the translations being invalidated will have been performed with respect to the processor executing the sync instruction, to the extent required by the associated Memory Coherence Required attributes, before the sync instruction's memory barrier is created. The operation performed by this instruction is ordered by the *mbar* or *msync* instruction with respect to preceding *tlbivax* instructions executed by the processor executing the *tlbsync* instruction. The operations caused by *tlbivax* and *tlbsync* are ordered by *mbar* as a set of operations, which is independent of the other sets that *mbar* orders. The *tlbsync* instruction may complete before operations caused by *tlbivax* instructions preceding the *tlb-sync* instruction have been performed. This instruction is privileged. #### **Special Registers Altered:** None #### **TLB Write Entry** X-form tlbwe (implementation-dependent) | 31 | ??? | ??? | ??? | 978 | / | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Bits 6:20 of the instruction encoding are implementation-dependent, and may be used to specify the target TLB entry, the target portion of the target TLB entry, and the source of the value that is to be written into the TLB. The contents of the implementation-dependent-specified source are written into the implementation-dependent-specified portion of the implementation-dependent-specified TLB entry. If the instruction specifies a TLB entry that does not exist, the results are undefined. Execution of this instruction may cause other implementation-dependent effects. This instruction is privileged. #### **Special Registers Altered:** Implementation-dependent #### **Programming Note** The effects of the update may not be visible until the completion of a context synchronizing operation (see Section 1.6.1). #### Programming Note Care must be taken not to invalidate any TLB entry that contains the mapping for any interrupt vector. # **Chapter 5. Interrupts and Exceptions** | 5.6.7 Program Interrupt 678 | |---------------------------------------------------------------------------------------| | 5.6.8 Floating-Point Unavailable Interrupt. | | 679 | | 5.6.9 System Call Interrupt 679 | | 5.6.10 Auxiliary Processor Unavailable | | Interrupt | | 5.6.11 Decrementer Interrupt 680 | | 5.6.12 Fixed-Interval Timer Interrupt 680 | | 5.6.13 Watchdog Timer Interrupt 680 | | 5.6.14 Data TLB Error Interrupt 681 | | 5.6.15 Instruction TLB Error Interrupt . 681 | | 5.6.16 Debug Interrupt | | 5.6.17 SPE/Embedded Floating-Point/<br>Vector Unavailable Interrupt | | [Categories: SPE.Embedded Float Scalar | | Double, SPE.Embedded Float Vector, | | Vector] | | 5.6.18 Embedded Floating-Point Data | | Interrupt | | [Categories: SPE.Embedded Float Scalar | | Double, SPE.Embedded Float Scalar Sin- | | gle, SPE.Embedded Float Vector]684 | | 5.6.19 Embedded Floating-Point Round | | Interrupt | | [Categories: SPE.Embedded Float Scalar | | Double, SPE.Embedded Float Scalar Sin- | | gle, SPE.Embedded Float Vector]684 | | 5.6.20 Performance Monitor Interrupt [Cat- | | egory: Embedded.Performance Monitor] | | 685 | | 5.6.21 Processor Doorbell Interrupt [Cate- | | gory: Embedded.Processor Control] 685<br>5.6.22 Processor Doorbell Critical Interrupt | | [Category: Embedded Processor Control]. | | 685 | | 5.7 Partially Executed Instructions 686 | | 5.8 Interrupt Ordering and Masking 687 | | 5.8.1 Guidelines for System Software. 688 | | 5.8.2 Interrupt Order 689 | | 5.9 Exception Priorities 689 | | 5.9.1 Exception Priorities for Defined | | Instructions 690 | | 5.9.1.1 Exception Priorities for Defined | | Floating-Point Load and Store Instructions | | 690 | | | | 5.9.1.6 Exception Priorities for Defined | |------------------------------------------| | System Call Instruction 691 | | 5.9.1.7 Exception Priorities for Defined | | Branch Instructions 691 | | 5.9.1.8 Exception Priorities for Defined | | Return From Interrupt Instructions 691 | | 5.9.1.9 Exception Priorities for Other | | Defined Instructions 691 | | 5.9.2 Exception Priorities for Reserved | | Instructions 691 | | | ### 5.1 Overview An *interrupt* is the action in which the processor saves its old context (MSR and next instruction address) and begins execution at a pre-determined interrupt-handler address, with a modified MSR. *Exceptions* are the events that will, if enabled, cause the processor to take an interrupt. Exceptions are generated by signals from internal and external peripherals, instructions, the internal timer facility, debug events, or error conditions. Interrupts are divided into 4 classes, as described in Section 5.4.3, such that only one interrupt of each class is reported, and when it is processed no program state is lost. Since Save/Restore register pairs SRR0/SRR1, CSRR0/CSRR1, DSRR0/DSRR1 [Category: E.ED], and MCSSR0/MCSSR1 are serially reusable resources used by base, critical, debug [Category: E.ED], Machine Check interrupts, respectively, program state may be lost when an unordered interrupt is taken. (See Section 5.8. All interrupts, except Machine Check, are context synchronizing as defined in Section 1.6.1 on page 609. A Machine Check interrupt acts like a context synchronizing operation with respect to subsequent instructions; that is, a Machine Check interrupt need not satisfy items 2-3 of Section 1.6.1 but does satisfy items 1, 4, and 5. ## 5.2 Interrupt Registers ## 5.2.1 Save/Restore Register 0 Save/Restore Register 0 (SRR0) is a 64-bit register. SRR0 bits are numbered 0 (most-significant bit) to 63 (least-significant bit). The register is used to save machine state on non-critical interrupts, and to restore machine state when an *rfi* is executed. On a non-critical interrupt, SRR0 is set to the current or next instruction address. When *rfi* is executed, instruction execution continues at the address in SRR0. In general, SRR0 contains the address of the instruction that caused the non-critical interrupt, or the address of the instruction to return to after a non-critical interrupt is serviced. The contents of SRR0 when an interrupt is taken are mode dependent, reflecting the computation mode currently in use (specified by $MSR_{CM}$ ) and the computation mode entered for execution of the interrupt (specified by $MSR_{ICM}$ ). The contents of SRR0 upon interrupt can be described as follows (assuming Addr is the address to be put into SRR0): ``` \begin{array}{lll} & \text{if } (\text{MSR}_{\text{CM}} = 0) \& (\text{MSR}_{\text{ICM}} = 0) \\ & \text{then } \text{SRR0} \leftarrow {}^{32} \text{undefined} \parallel \text{Addr}_{32:63} \\ & \text{if } (\text{MSR}_{\text{CM}} = 0) \& (\text{MSR}_{\text{ICM}} = 1) \\ & \text{then } \text{SRR0} \leftarrow {}^{32} 0 \parallel \text{Addr}_{32:63} \\ & \text{if } (\text{MSR}_{\text{CM}} = 1) \& (\text{MSR}_{\text{ICM}} = 1) \text{ then } \text{SRR0} \leftarrow \text{Addr}_{0:63} \\ & \text{if } (\text{MSR}_{\text{CM}} = 1) \& (\text{MSR}_{\text{ICM}} = 0) \text{ then } \text{SRR0} \leftarrow \text{undefined} \\ \end{array} ``` The contents of SRR0 can be read into register RT using *mfspr* RT,SRR0. The contents of register RS can be written into the SRR0 using *mtspr* SRR0,RS. ## 5.2.2 Save/Restore Register 1 Save/Restore Register 1 (SRR1) is a 32-bit register. SRR1 bits are numbered 32 (most-significant bit) to 63 (least-significant bit). The register is used to save machine state on non-critical interrupts, and to restore machine state when an *rfi* is executed. When a non-critical interrupt is taken, the contents of the MSR are placed into SRR1. When *rfi* is executed, the contents of SRR1 are placed into the MSR. Bits of SRR1 that correspond to reserved bits in the MSR are also reserved. #### Programming Note A MSR bit that is reserved may be inadvertently modified by *rfi/rfci/rfmci*. The contents of SRR1 can be read into register RT using *mfspr* RT,SRR1. The contents of register RS can be written into the SRR1 using *mtspr* SRR1,RS. # 5.2.3 Critical Save/Restore Register 0 Critical Save/Restore Register 0 (CSRR0) is a 64-bit register. CSRR0 bits are numbered 0 (most-significant bit) to 63 (least-significant bit). The register is used to save machine state on critical interrupts, and to restore machine state when an **rfci** is executed. When a critical interrupt is taken, the CSRR0 is set to the current or next instruction address. When **rfci** is executed, instruction execution continues at the address in CSRR0. In general, CSRR0 contains the address of the instruction that caused the critical interrupt, or the address of the instruction to return to after a critical interrupt is serviced. The contents of CSRR0 when a critical interrupt is taken are mode dependent, reflecting the computation mode currently in use (specified by MSR<sub>CM</sub>) and the computation mode entered for execution of the critical interrupt (specified by MSR<sub>ICM</sub>). The contents of CSRR0 upon critical interrupt can be described as follows (assuming Addr is the address to be put into CSRR0): ``` \begin{array}{lll} \mbox{if } (\mbox{MSR}_{\mbox{CM}} = 0) & \& (\mbox{MSR}_{\mbox{ICM}} = 0) \\ & \mbox{then CSRR0} \leftarrow \mbox{$^{32}$undefined } \parallel \mbox{Addr}_{32:63} \\ \mbox{if } (\mbox{MSR}_{\mbox{CM}} = 0) & \& (\mbox{MSR}_{\mbox{ICM}} = 1) \\ & \mbox{then CSRR0} \leftarrow \mbox{$^{32}$0} \parallel \mbox{Addr}_{32:63} \\ \mbox{if } (\mbox{MSR}_{\mbox{CM}} = 1) & \& (\mbox{MSR}_{\mbox{ICM}} = 0) \mbox{ then CSRR0} \leftarrow \mbox{Addr}_{0:63} \\ \mbox{if } (\mbox{MSR}_{\mbox{CM}} = 1) & \& (\mbox{MSR}_{\mbox{ICM}} = 0) \mbox{ then CSRR0} \leftarrow \mbox{undefined} \\ \end{array} ``` The contents of CSRR0 can be read into register RT using *mfspr* RT,CSRR0. The contents of register RS can be written into CSRR0 using *mtspr* CSRR0,RS. # 5.2.4 Critical Save/Restore Register 1 Critical Save/Restore Register 1 (CSRR1) is a 32-bit register. CSRR1 bits are numbered 32 (most-significant bit) to 63 (least-significant bit). The register is used to save machine state on critical interrupts, and to restore machine state when an *rfci* is executed. When a critical interrupt is taken, the contents of the MSR are placed into CSRR1. When *rfci* is executed, the contents of CSRR1 are placed into the MSR. Bits of CSRR1 that correspond to reserved bits in the MSR are also reserved. #### Programming Note - A MSR bit that is reserved may be inadvertently modified by *rfi/rfci/rfmci*. The contents of CSRR1 can be read into bits 32:63 of register RT using *mfspr RT,CSRR1*, setting bits 0:31 of RT to zero. The contents of bits 32:63 of register RS can be written into the CSRR1 using *mtspr CSRR1,RS*. # 5.2.5 Debug Save/Restore Register 0 [Category: Embedded.Enhanced Debug] Debug Save/Restore Register 0 (DSRR0) is a 64-bit register used to save machine state on Debug interrupts, and to restore machine state when an *rfdi* is executed. When a Debug interrupt is taken, the DSRR0 is set to the current or next instruction address. When *rfdi* is executed, instruction execution continues at the address in DSRR0. In general, DSRR0 contains the address of an instruction that was executing or just finished execution when the Debug exception occurred. The contents of DSRR0 when a Debug interrupt is taken are mode dependent, reflecting the computation mode currently in use (specified by MSR<sub>CM</sub>) and the computation mode entered for execution of the Debug interrupt (specified by MSR<sub>ICM</sub>). The contents of DSRR0 upon Debug interrupt can be described as follows (assuming Addr is the address to be put into DSRR0): ``` if (MSR<sub>CM</sub> = 0) & (MSR<sub>ICM</sub> = 0) then DSRR0 \leftarrow ^{32}undefined \parallel Addr_{32:63} if (MSR<sub>CM</sub> = 0) & (MSR<sub>ICM</sub> = 1) then DSRR0 \leftarrow ^{32}0 \parallel Addr_{32:63} if (MSR<sub>CM</sub> = 1) & (MSR<sub>ICM</sub> = 1) then DSRR0 \leftarrow Addr_{0:63} if (MSR<sub>CM</sub> = 1) & (MSR<sub>ICM</sub> = 0) then DSRR0 \leftarrow undefined ``` The contents of DSRR0 can be read into register RT using *mfspr* RT,DSRR0. The contents of register RS can be written into DSRR0 using *mtspr* DSRR0,RS. # 5.2.6 Debug Save/Restore Register 1 [Category: Embedded.Enhanced Debug] Debug Save/Restore Register 1 (DSRR1) is a 32-bit register used to save machine state on Debug interrupts, and to restore machine state when an *rfdi* is executed. When a Debug interrupt is taken, the contents of the Machine State Register are placed into DSRR1. When *rfdi* is executed, the contents of DSRR1 are placed into the Machine State Register. Bits of DSRR1 that correspond to reserved bits in the Machine State Register are also reserved. The contents of DSRR1 can be read into bits 32:63 of register RT using *mfspr* RT,DSRR1, setting bits 0:31 of RT to zero. The contents of bits 32:63 of register RS can be written into the DSSR1 using *mtspr* DSRR1.RS. # **5.2.7 Data Exception Address Register** The Data Exception Address Register (DEAR) is a 64-bit register. DEAR bits are numbered 0 (most-significant bit) to 63 (least-significant bit). The DEAR contains the address that was referenced by a *Load*, *Store* or *Cache Management* instruction that caused an Alignment, Data TLB Miss, or Data Storage interrupt. The contents of the DEAR when an interrupt is taken are mode dependent, reflecting the computation mode currently in use (specified by MSR<sub>CM</sub>) and the computation mode entered for execution of the critical interrupt (specified by MSR<sub>ICM</sub>). The contents of the DEAR upon interrupt can be described as follows (assuming Addr is the address to be put into DEAR): ``` \begin{array}{lll} \mbox{if } & (\mbox{MSR}_{\mbox{CM}} = 0) & \& & (\mbox{MSR}_{\mbox{ICM}} = 0) \\ & & \mbox{then DEAR} \leftarrow & ^{32}\mbox{undefined} \parallel \mbox{Addr}_{32:63} \\ \mbox{if } & (\mbox{MSR}_{\mbox{CM}} = 0) & \& & (\mbox{MSR}_{\mbox{ICM}} = 1) \\ & & \mbox{then DEAR} \leftarrow & ^{32}\mbox{0} \parallel \mbox{Addr}_{32:63} \\ \mbox{if } & (\mbox{MSR}_{\mbox{CM}} = 1) & \& & (\mbox{MSR}_{\mbox{ICM}} = 1) & \mbox{then DEAR} \leftarrow \mbox{Addr}_{0:63} \\ \mbox{if } & (\mbox{MSR}_{\mbox{CM}} = 1) & \& & (\mbox{MSR}_{\mbox{ICM}} = 0) & \mbox{then DEAR} \leftarrow \mbox{undefined} \end{array} ``` The contents of DEAR can be read into register RT using *mtspr RT*,*DEAR*. The contents of register RS can be written into the DEAR using *mtspr DEAR*,*RS*. # 5.2.8 Interrupt Vector Prefix Register The Interrupt Vector Prefix Register (IVPR) is a 64-bit register. Interrupt Vector Prefix Register bits are numbered 0 (most-significant bit) to 63 (least-significant bit). Bits 48:63 are reserved. Bits 0:47 of the Interrupt Vector Prefix Register provides the high-order 48 bits of the address of the exception processing routines. The 16-bit exception vector offsets (provided in Section 5.2.10) are concatenated to the right of bits 0:47 of the Interrupt Vector Prefix Register to form the 64-bit address of the exception processing routine. The contents of Interrupt Vector Prefix Register can be read into register RT using *mfspr RT,IVPR*. The contents of register RS can be written into Interrupt Vector Prefix Register using *mtspr IVPR,RS*. ## 5.2.9 Exception Syndrome Register The Exception Syndrome Register (ESR) is a 32-bit register. ESR bits are numbered 32 (most-significant bit) to 63 (least-significant bit). The ESR provides a *syndrome* to differentiate between the different kinds of exceptions that can generate the same interrupt type. Upon the generation of one of these types of interrupts, the bit or bits corresponding to the specific exception that generated the interrupt is set, and all other ESR bits are cleared. Other interrupt types do not affect the contents of the ESR. The ESR does not need to be cleared by software. Figure 12 shows the bit definitions for the ESR. | d | |------------------------| | Vector Unavailable | | | | | | | | | | | | /a atau I la avellabla | | Vector Unavailable | | d | | u | | | | | | | | | | | | | Figure 12. Exception Syndrome Register Definitions #### Programming Note - The information provided by the ESR is not complete. System software may also need to identify the type of instruction that caused the interrupt, examine the TLB entry accessed by a data or instruction storage access, as well as examine the ESR to fully determine what exception or exceptions caused the interrupt. For example, a Data Storage interrupt may be caused by both a Protection Violation exception as well as a Byte Ordering exception. System software would have to look beyond ESR<sub>BO</sub>, such as the state of MSR<sub>PR</sub> in SRR1 and the page protection bits in the TLB entry accessed by the storage access, to determine whether or not a Protection Violation also occurred. The contents of the ESR can be read into bits 32:63 of register RT using *mfspr RT,ESR*, setting bits 0:31 of RT to zero. The contents of bits 32:63 of register RS can be written into the ESR using *mtspr ESR,RS*. # 5.2.10 Interrupt Vector Offset Registers The Interrupt Vector Offset Registers (IVORs) are 32-bit registers. Interrupt Vector Offset Register bits are numbered 32 (most-significant bit) to 63 (least-significant bit). Bits 32:47 and bits 60:63 are reserved. An Interrupt Vector Offset Register provides the quadword index from the base address provided by the IVPR (see Section 5.2.8) for its respective interrupt. Interrupt Vector Offset Registers 0 through 15 and 32-37 are provided for the defined interrupts. SPR numbers corresponding to Interrupt Vector Offset Registers 16 through 31 are reserved. SPR numbers corresponding to Interrupt Vector Offset Registers 38 through 63 are allocated for implementation-dependent use. Figure 13 provides the assignments of specific Interrupt Vector Offset Registers to specific interrupts. | IVORi | Interrupt | | | | | | |----------------|----------------------------------------|--|--|--|--|--| | IVOR0 | Critical Input | | | | | | | IVOR1 | Machine Check | | | | | | | IVOR2 | Data Storage | | | | | | | IVOR3 | Instruction Storage | | | | | | | IVOR4<br>IVOR5 | External | | | | | | | IVOR5 | Alignment<br>Program | | | | | | | IVOR7 | Floating-Point Unavailable | | | | | | | IVOR8 | System Call | | | | | | | IVOR9 | Auxiliary Processor Unavailable | | | | | | | IVOR10 | Decrementer | | | | | | | IVOR11 | Fixed-Interval Timer Interrupt | | | | | | | IVOR12 | Watchdog Timer Interrupt | | | | | | | IVOR13 | Data TLB Error | | | | | | | IVOR14 | Instruction TLB Error | | | | | | | IVOR15 | Debug | | | | | | | IVOR16 | Reserved | | | | | | | :<br>IVOR31 | | | | | | | | | Signal Processing Engine] | | | | | | | [Category: | Vector] | | | | | | | IVOR 32 | SPE/Embedded Floating-Point/Vector | | | | | | | | Unavailable Interrupt | | | | | | | | SP.Embedded Float_*] | | | | | | | | & 34 are required if any SP.Float_ | | | | | | | | ent category is supported.) | | | | | | | | Embedded Floating-Point Data Interrupt | | | | | | | IVOR 34 | Embedded Floatgpt. round Interrupt | | | | | | | [Category: | Embedded Performance Monitor] | | | | | | | IVOR 35 | Embedded Performance Monitor Inter- | | | | | | | | rupt | | | | | | | [Category: | Embedded.Processor Control] | | | | | | | IVOR 36 | Processor Doorbell Interrupt | | | | | | | IVOR 37 | Processor Doorbell Critical Interrupt | | | | | | | IVOR38 | Implementation-dependent | | | | | | | IVOR63 | | | | | | | Figure 13. Interrupt Vector Offset Register Assignments Bits 48:59 of the contents of IVORi can be read into bits 48:59 of register RT using *mfspr RT,IVORi*, setting bits 0:47 and bits 60:63 of GPR(RT) to zero. Bits 48:59 of the contents of register RS can be written into bits 48:59 of IVORi using *mtspr IVORi,RS*. ## 5.2.11 Machine Check Registers A set of Special Purpose Registers are provided to support Machine Check interrupts. # 5.2.11.1 Machine Check Save/Restore Register 0 Machine Check Save/Restore Register 0 (MCSRR0) is a 64-bit register used to save machine state on Machine Check interrupts, and to restore machine state when an *rfmci* is executed. When a Machine Check interrupt is taken, the MCSRR0 is set to the current or next instruction address. When *rfmci* is executed, instruction execution continues at the address in MCSRR0. In general, MCSRR0 contains the address of an instruction that was executing or about to be executed when the Machine Check exception occurred. The contents of MCSRR0 when a Machine Check interrupt is taken are mode dependent, reflecting the computation mode currently in use (specified by MSR<sub>CM</sub>) and the computation mode entered for execution of the Machine Check interrupt (specified by MSR<sub>ICM</sub>). The contents of MCSRR0 upon Machine Check interrupt can be described as follows (assuming Addr is the address to be put into MCSRR0): ``` \begin{array}{llll} & \text{if } (\text{MSR}_{\text{CM}} = 0) & \text{\& } (\text{MSR}_{\text{ICM}} = 0) \\ & \text{then } \text{MCSRR0} \leftarrow & ^{32} \text{undefined} & | | & \text{Addr}_{32:63} \\ & \text{if } (\text{MSR}_{\text{CM}} = 0) & \text{\& } (\text{MSR}_{\text{ICM}} = 1) \\ & \text{then } \text{MCSRR0} \leftarrow & ^{32} \text{O} & | | & \text{Addr}_{32:63} \\ & \text{if } (\text{MSR}_{\text{CM}} = 1) & \text{\& } (\text{MSR}_{\text{ICM}} = 1) & \text{then } \text{MCSRR0} \leftarrow \text{Addr}_{0:63} \\ & \text{if } (\text{MSR}_{\text{CM}} = 1) & \text{\& } (\text{MSR}_{\text{ICM}} = 0) & \text{then } \text{MCSRR0} \leftarrow \text{undefined} \\ \end{array} ``` The contents of MCSRR0 can be read into register RT using *mfspr RT,MCSRR0*. The contents of register RS can be written into MCSRR0 using *mtspr MCSRR0,RS*. # 5.2.11.2 Machine Check Save/Restore Register 1 Machine Check Save/Restore Register 1 (MCSRR1) is a 32-bit register used to save machine state on Machine Check interrupts, and to restore machine state when an **rfmci** is executed. When a Machine Check interrupt is taken, the contents of the MSR are placed into MCSRR1. When **rfmci** is executed, the contents of MCSRR1 are placed into the MSR. Bits of MCSRR1 that correspond to reserved bits in the MSR are also reserved. #### Programming Note A MSR bit that is reserved may be inadvertently modified by *rfi/rfci/rfmci*. The contents of MCSRR1 can be read into register RT using *mfspr RT,MCSRR1*. The contents of register RS can be written into the MCSRR1 using *mtspr MCSRR1,RS*. # 5.2.11.3 Machine Check Syndrome Register MCSR (MCSR) is a 64-bit register that is used to record the cause of the Machine Check interrupt. The specific definition of the contents of this register are implementation-dependent (see the User Manual of the implementation). The contents of MCSR can be read into register RT using *mfspr RT,MCSR*. The contents of register RS can be written into the MCSR using *mtspr MCSR,RS*. # 5.2.12 External Proxy Register [Category: External Proxy] The External Proxy Register (EPR) contains implementation-dependent information related to an External Input interrupt when an External Input interrupt occurs. The EPR is only considered valid from the time that the External Input Interrupt occurs until MSR<sub>EE</sub> is set to 1 as the result of a *mtmsr* or a return from interrupt instruction. The format of the EPR is shown below. Figure 14. External Proxy Register When the External Input interrupt is taken, the contents of the EPR provide information related to the External Input Interrupt. #### Programming Note - The EPR is provided for faster interrupt processing as well as situations where an interrupt must be taken, but software must delay the resultant processing for later. The EPR contains the vector from the interrupt controller. The process of receiving the interrupt into the EPR acknowledges the interrupt to the interrupt controller. The method for enabling or disabling the acknowledgment of the interrupt by placing the interrupt-related information in the EPR is implementation-dependent. If this acknowledgement is disabled, then the EPR is set to 0 when the External Input interrupt occurs. # 5.3 Exceptions There are two kinds of exceptions, those caused directly by the execution of an instruction and those caused by an asynchronous event. In either case, the exception may cause one of several types of interrupts to be invoked. Examples of exceptions that can be caused directly by the execution of an instruction include but are not limited to the following: - an attempt to execute a reserved-illegal instruction (Illegal Instruction exception type Program interrupt) - an attempt by an application program to execute a 'privileged' instruction (Privileged Instruction exception type Program interrupt) - an attempt by an application program to access a 'privileged' Special Purpose Register (Privileged Instruction exception type Program interrupt) - an attempt by an application program to access a Special Purpose Register that does not exist (Unimplemented Operation Instruction exception type Program interrupt) - an attempt by a system program to access a Special Purpose Register that does not exist (boundedly undefined results) - the execution of a defined instruction using an invalid form (Illegal Instruction exception type Program interrupt, Unimplemented Operation exception type Program interrupt, or Privileged Instruction exception type Program interrupt) - an attempt to access a storage location that is either unavailable (Instruction TLB Error interrupt or Data TLB Error interrupt) or not permitted (Instruction Storage interrupt or Data Storage interrupt) - an attempt to access storage with an effective address alignment not supported by the implementation (Alignment interrupt) - the execution of a System Call instruction (System Call interrupt) - the execution of a Trap instruction whose trap condition is met (Trap type Program interrupt) - the execution of a floating-point instruction when floating-point instructions are unavailable (Floating-point Unavailable interrupt) - the execution of a floating-point instruction that causes a floating-point enabled exception to exist (Enabled exception type Program interrupt) - the execution of a defined instruction that is not implemented by the implementation (Illegal Instruction exception or Unimplemented Operation exception type of Program interrupt) - the execution of an instruction that is not implemented by the implementation (Illegal Instruction exception or Unimplemented Operation exception type of Program interrupt) - the execution of an auxiliary processor instruction when the auxiliary processor instruction is unavailable (Auxiliary Processor Unavailable interrupt) - the execution of an instruction that causes an auxiliary processor enabled exception (Enabled exception type Program interrupt) The invocation of an interrupt is precise, except that if one of the imprecise modes for invoking the Floating-point Enabled Exception type Program interrupt is in effect then the invocation of the Floating-point Enabled Exception type Program interrupt may be imprecise. When the interrupt is invoked imprecisely, the excepting instruction does not appear to complete before the next instruction starts (because one of the effects of the excepting instruction, namely the invocation of the interrupt, has not yet occurred). # 5.4 Interrupt Classification All interrupts, except for Machine Check, can be classified as either Asynchronous or Synchronous. Independent from this classification, all interrupts, including Machine Check, can be classified into one of the following classes: - Base - Critical - Machine Check - Debug[Category:Embedded.Enhanced Debug]. ## 5.4.1 Asynchronous Interrupts Asynchronous interrupts are caused by events that are independent of instruction execution. For asynchronous interrupts, the address reported to the exception handling routine is the address of the instruction that would have executed next, had the asynchronous interrupt not occurred. # 5.4.2 Synchronous Interrupts Synchronous interrupts are those that are caused directly by the execution (or attempted execution) of instructions, and are further divided into two classes, *precise* and *imprecise*. Synchronous, precise interrupts are those that *pre-cisely* indicate the address of the instruction causing the exception that generated the interrupt; or, for certain synchronous, precise interrupt types, the address of the immediately following instruction. Synchronous, imprecise interrupts are those that may indicate the address of the instruction causing the exception that generated the interrupt, or some instruction after the instruction causing the exception. # 5.4.2.1 Synchronous, Precise Interrupts When the execution or attempted execution of an instruction causes a synchronous, precise interrupt, the following conditions exist at the interrupt point. - SRR0, CSRR0, or DSRR0 [Category: Embedded.Enhanced Debug] addresses either the instruction causing the exception or the instruction immediately following the instruction causing the exception. Which instruction is addressed can be determined from the interrupt type and status bits. - An interrupt is generated such that all instructions preceding the instruction causing the exception appear to have completed with respect to the executing processor. However, some storage accesses associated with these preceding instructions may not have been performed with respect to other processors and mechanisms. - The instruction causing the exception may appear not to have begun execution (except for causing the exception), may have been partially executed, or may have completed, depending on the interrupt type. See Section 5.7 on page 686. - Architecturally, no subsequent instruction has executed beyond the instruction causing the exception. # 5.4.2.2 Synchronous, Imprecise Interrupts When the execution or attempted execution of an instruction causes an imprecise interrupt, the following conditions exist at the interrupt point. - SRR0 or CSRR0 addresses either the instruction causing the exception or some instruction following the instruction causing the exception that generated the interrupt. - An interrupt is generated such that all instructions preceding the instruction addressed by SRR0 or CSRR0 appear to have completed with respect to the executing processor. - If the imprecise interrupt is forced by the context synchronizing mechanism, due to an instruction that causes another exception that generates an interrupt (e.g., Alignment, Data Storage), then SRR0 addresses the interrupt-forcing instruction, and the interrupt-forcing instruction may have been partially executed (see Section 5.7 on page 686). - If the imprecise interrupt is forced by the execution synchronizing mechanism, due to executing an execution synchronizing instruction other than msync or isync, then SRR0 or CSRR0 addresses the interrupt-forcing instruction, and the interruptforcing instruction appears not to have begun exe- - cution (except for its forcing the imprecise interrupt). If the imprecise interrupt is forced by an *msync* or *isync* instruction, then SRR0 or CSRR0 may address either the *msync* or *isync* instruction, or the following instruction. - If the imprecise interrupt is not forced by either the context synchronizing mechanism or the execution synchronizing mechanism, then the instruction addressed by SRR0 or CSRR0 may have been partially executed (see Section 5.7 on page 686). - No instruction following the instruction addressed by SRR0 or CSRR0 has executed. ## 5.4.3 Interrupt Classes Interrupts can also be classified as base, critical, Machine Check, and Debug [Category: Embedded.Enhanced Debug]. Interrupt classes other than the base class may demand immediate attention even if another class of interrupt is currently being processed and software has not yet had the opportunity to save the state of the machine (i.e. return address and captured state of the MSR). For this reason, the interrupts are organized into a hierarchy (see Section 5.8). To enable taking a critical, Machine Check, or Debug [Category: Embedded.Enhanced Debug] interrupt immediately after a base class interrupt occurs (i.e. before software has saved the state of the machine), these interrupts use the Save/Restore Register pair CSRR0/CSRR1, MCSRR0/MCSRR1, or DSRR0/DSRR1 [Category: Embedded.Enhanced Debug], and base class interrupts use Save/Restore Register pair SRR0/SRR1. ## 5.4.4 Machine Check Interrupts Machine Check interrupts are a special case. They are typically caused by some kind of hardware or storage subsystem failure, or by an attempt to access an invalid address. A Machine Check may be caused indirectly by the execution of an instruction, but not be recognized and/or reported until long after the processor has executed past the instruction that caused the Machine Check. As such, Machine Check interrupts cannot properly be thought of as synchronous or asynchronous, nor as precise or imprecise. The following general rules apply to Machine Check interrupts: - No instruction after the one whose address is reported to the Machine Check interrupt handler in MCSRR0 has begun execution. - 2. The instruction whose address is reported to the Machine Check interrupt handler in MCSRR0, and all prior instructions, may or may not have completed successfully. All those instructions that are ever going to complete appear to have done so already, and have done so within the context existing prior to the Machine Check interrupt. No further interrupt (other than possible additional Machine Check interrupts) will occur as a result of those instructions. # 5.5 Interrupt Processing Associated with each kind of interrupt is an *interrupt vector*, that is the address of the initial instruction that is executed when the corresponding interrupt occurs. Interrupt processing consists of saving a small part of the processor's state in certain registers, identifying the cause of the interrupt in another register, and continuing execution at the corresponding interrupt vector location. When an exception exists that will cause an interrupt to be generated and it has been determined that the interrupt can be taken, the following actions are performed, in order: - SRR0, DSRR0 [Category: Embedded.Enhanced Debug], MCSRR0, or CSRR0 is loaded with an instruction address that depends on the interrupt; see the specific interrupt description for details. - The ESR is loaded with information specific to the exception. Note that many interrupts can only be caused by a single kind of exception event, and thus do not need nor use an ESR setting to indicate to the cause of the interrupt was. - SRR1, DSRR1 [Category: Embedded.Enhanced Debug], or MCSRR1, or CSRR1 is loaded with a copy of the contents of the MSR. - The MSR is updated as described below. The new values take effect beginning with the first instruction following the interrupt. MSR bits of particular interest are the following. - MSR<sub>WE,EE,PR,FP,FE0,FE1,IS,DS</sub> are set to 0 by all interrupts. - MSR<sub>ME</sub> is set to 0 by Machine Check interrupts and left unchanged by all other interrupts. - MSR<sub>CE</sub> is set to 0 by critical class interrupts, Debug interrupts, and Machine Check interrupts, and is left unchanged by all other interrupts. - MSR<sub>DE</sub> is set to 0 by critical class interrupts unless Category E.ED is supported, by Debug interrupts, and by Machine Check interrupts, and is left unchanged by all other interrupts. - MSR<sub>CM</sub> is set to MSR<sub>ICM</sub>. - Other supported MSR bits are left unchanged by all interrupts. See Section 2.2.1 for more detail on the definition of the MSR. Instruction fetching and execution resumes, using the new MSR value, at a location specific to the interrupt. The location is IVPR<sub>0:47</sub> || IVORi<sub>48:59</sub> || 0b0000 where IVPR is the Interrupt Vector Prefix Register and IVORi is the Interrupt Vector Offset Register for that interrupt (see Figure 13 on page 666). The contents of the Interrupt Vector Prefix Register and Interrupt Vector Offset Registers are indeterminate upon power-on reset, and must be initialized by system software using the *mtspr* instruction. Interrupts may not clear reservations obtained with Load and Reserve instructions. The operating system should do so at appropriate points, such as at process switch. At the end of an interrupt handling routine, execution of an *rfi, rfdi* [Category: Embedded.Enhanced Debug], *rfmci*, or *rfci* causes the MSR to be restored from the contents of SRR1, DSRR1 [Category: Embedded.Enhanced Debug], MCSRR1, or CSRR1, and instruction execution to resume at the address contained in SRR0, DSRR0 [Category: Embedded.Enhanced Debug], MCSRR0, or CSRR0, respectively. #### Programming Note – In general, at process switch, due to possible process interlocks and possible data availability requirements, the operating system needs to consider executing the following. - stwcx. or stdcx., to clear the reservation if one is outstanding, to ensure that a lwarx or ldarx in the "old" process is not paired with a stwcx. or stdcx. in the "new" process. - msync, to ensure that all storage operations of an interrupted process are complete with respect to other processors before that process begins executing on another processor. - isync, rfi, rfdi [Category: Embedded.Enhanced Debug], rfmci, or rfci to ensure that the instructions in the "new" process execute in the "new" context. #### **Programming Note** For instruction-caused interrupts, in some cases it may be desirable for the operating system to emulate the instruction that caused the interrupt, while in other cases it may be desirable for the operating system not to emulate the instruction. The following list, while not complete, illustrates criteria by which decisions regarding emulation should be made. The list applies to general execution environments; it does not necessarily apply to special environments such as program debugging, processor bring-up, etc. In general, the instruction should be emulated if: - The interrupt is caused by a condition for which the instruction description (including related material such as the introduction to the section describing the instruction) implies that the instruction works correctly. Example: Alignment interrupt caused by *Imw* for which the storage operand is not aligned, or by *dcbz* or *dcbzep* for which the storage operand is in storage that is Write Through Required or Caching Inhibited. - The instruction is an illegal instruction that should appear, to the program executing it, as if it were supported by the implementation. Example: Illegal Instruction type Program interrupt caused by an instruction that has been phased out of the architecture but is still used by some programs that the operating system supports, or by an instruction that is in a category that the implementation does not support but is used by some programs that the operating system supports. In general, the instruction should not be emulated if: - The purpose of the instruction is to cause an interrupt. Example: System Call interrupt caused by sc. - The interrupt is caused by a condition that is stated, in the instruction description, potentially to cause the interrupt. Example: Alignment interrupt caused by *Iwarx* for which the storage operand is not aligned. - The program is attempting to perform a function that it should not be permitted to perform. Example: Data Storage interrupt caused by *Iwz* for which the storage operand is in storage that the program should not be permitted to access. (If the function is one that the program should be permitted to perform, the conditions that caused the interrupt should be corrected and the program re-dispatched such that the instruction will be re-executed. Example: Data Storage interrupt caused by *Iwz* for which the storage operand is in storage that the program should be permitted to access but for which there currently is no TLB entry.) # **5.6 Interrupt Definitions** Table 15 provides a summary of each interrupt type, the various exception types that may cause that interrupt type, the classification of the interrupt, which ESR bits can be set, if any, which MSR bits can mask the interrupt type and which Interrupt Vector Offset Register is used to specify that interrupt type's vector address. | IVOR | Interrupt | Exception | Asynchronous | Synchronous, Precise | Synchronous, Imprecise | Critical | ESR<br>(See Note 5) | ☐ MSR Mask Bit(s) | DBCR0/TCR Mask Bit | □ Category<br>(Section 1.3.5 of Book I) | Notes (see page 673) | <b>Bage</b> 674 | |-----------|----------------|------------------------------------------------------------------------------------|--------------|----------------------|------------------------|----------|----------------------------------------------|-------------------|--------------------|-----------------------------------------|----------------------|-----------------| | IVOR0 | Critical Input | Critical Input | Х | | | Х | | | | | 1 | | | IVOR1 | Machine Check | Machine Check | | | | | | ME | | | 2,4 | 674 | | IVOR2 | Data Storage | Access | | х | | | [ST],[FP,AP,SPV]<br>[VLEMI], [EPID] | | | E | 9 | 675 | | | | Load and Reserve or Store<br>Conditional to 'write-thru<br>required' storage (W=1) | | Х | | | [ST], [VLEMI] | | | E | 9 | | | | | Cache Locking | | х | | | $\{DLK_0,DLK_1\},[ST]$<br>[VLEMI] | | | E | 8 | | | | | Byte Ordering | | х | | | BO, [ST],<br>[FP,AP,SPV],<br>[VLEMI], [EPID] | | | E | | | | IVOR3 | Inst Storage | Access | | х | | | | | | Е | | 676 | | | - | Byte Ordering | | х | | | BO, [VLEMI] | | | E<br>E<br>E, | | | | | | Mismatched Instruction<br>Storage (See Book VLE.)) | | х | | | BO, VLEMI | EE | | E,<br>VLE | 1 | | | | | Misaligned Instruction<br>Storage (See Book VLE.) | | х | | | MIF | EE | | E,<br>VLE | 1 | | | 1 | External Input | External Input | Х | | | | | EE | | E | 1 | 676 | | IVOR5 | Alignment | Alignment | | Х | | | [ST],[FP,AP,SPV]<br>[EPID],[VLEMI] | | | E | | 677 | | IVOR6 | Program | Illegal | | Х | | | PIL, [VLEMI] | | | Е | | 678 | | | | Privileged | | х | | | PPR,[AP],<br>[VLEMI] | | | E | | | | | | Trap | | Х | | | PTR,[VLEMI] | | | E | | | | | | FP Enabled | | Х | Х | | FP, [PIE] | FE0,<br>FE1 | | E | 6,7 | | | | | AP Enabled | | Х | Х | | AP | | | E<br>E | | | | | | Unimplemented Op | | Х | | | PUO, [VLEMI]<br>[FP,AP,SPV] | | | | 7 | | | | FP Unavailable | FP Unavailable | | Х | | | | | | E | | 679 | | | System Call | System Call | | Х | | | [VLEMI] | | | E | | 679 | | | AP Unavailable | AP Unavailable | | Х | | | | | D | E | | 679 | | | Decrementer | | Х | | | | | EE | DIE | E | | 680 | | IVOR11 | | | X | | | | | EE | | E | 4.0 | 680 | | | Watchdog | Data T. D. Mins | Х | | | Х | ICTLIED AD CDV | | WIE | | 10 | 680 | | | Data TLB Error | Data TLB Miss | | X | | | [ST],[FP,AP,SPV]<br>[VLEMI],[EPID] | | | E | | 681 | | 1 V UK 14 | Inst TLB Error | Inst TLB Miss | | Х | | | [MIF] | | | Е | | 681 | | | Interrupt | Exception | Asynchronous | Synchronous, Precise | Synchronous, Imprecise | Critical | ESR<br>(See Note 5) | □ MSR Mask Bit(s) | DBCR0/TCR Mask Bit | Category<br>(Section 1.3 | Notes (see page 673) | Page | |---------|-----------------------------------|----------------------------------------------------|--------------|----------------------|------------------------|----------|---------------------|-------------------|--------------------|--------------------------|----------------------|------| | IVOR15 | Debug | Trap | | Х | | Х | | | | E | 10 | 682 | | | | Inst Addr Compare | | Х | | Х | | DE | IDM | | 10 | | | | | Data Addr Compare | | Х | | Х | | DE | | E | 10 | | | | | Instruction Complete | | Х | | Х | | DE | | E | 3,10 | | | | | Branch Taken | | Х | | Х | | DE | | E | 3,10 | | | | | Return From Interrupt | | Х | | Х | | DE | | E | 10 | | | | | Interrupt Taken | Х | | | Х | | DE | | E | 10 | | | | | Uncond Debug Event | Х | | | Х | | DE<br>DE | | E.ED<br>E.ED | 10 | | | | | Critical Interrupt Taken Critical Interrupt Return | Х | х | | | | DE | | E.ED | | | | IV/OB32 | SPE/Embedded | SPE Unavailable | | ^<br>X | | | SPV, [VLEMI] | | IDIVI | SPE | | 683 | | IVOROZ | Floating-Point/Vector Unavailable | or E oriavaliable | | ^ | | | Or v, [veelvii] | | | 01 L | | 003 | | | Onavanable | Vector Unavailable | | | | | | SPV | | V | | | | IVOR33 | Embedded Floating-<br>Point Data | Embedded Floating-Point Data | | x | | | SPV, [VLEMI] | | | SP.F* | | 684 | | IVOR34 | Embedded Floating-<br>Point Round | Embedded Floating-Point Round | | x | | | SPV, [VLEMI] | | | SP.F* | | 684 | | IVOR35 | Embedded Perfor-<br>mance Monitor | Embedded Performance<br>Monitor | x | | | | | | | E.PM | | | | IVOR36 | Processor Doorbell | Processor Doorbell | х | | | | | EE | | E.PC | | | | IVOR37 | Processor Critical<br>Doorbell | Processor Critical Doorbell | х | | | х | | CE | | E.PC | | | Figure 15. Interrupt and Exception Types #### **Figure 15 Notes** - Although it is not specified, it is common for system implementations to provide, as part of the interrupt controller, independent mask and status bits for the various sources of Critical Input and External Input interrupts. - Machine Check interrupts are a special case and are not classified as asynchronous nor synchronous. See Section 5.4.4 on page 669. - 3. The Instruction Complete and Branch Taken debug events are only defined for MSR<sub>DE</sub>=1 when in Internal Debug Mode (DBCR0<sub>IDM</sub>=1). In other words, when in Internal Debug Mode with MSR<sub>DE</sub>=0, then Instruction Complete and Branch Taken debug events cannot occur, and no DBSR status bits are set and no subsequent imprecise Debug interrupt will occur (see Section 8.4 on page 704). - Machine Check status information is commonly provided as part of the system implementation, but is implementation-dependent. - 5. In general, when an interrupt causes a particular ESR bit or bits to be set (or cleared) as indicated in the table, it also causes all other ESR bits to be cleared. There may be special rules regarding the handling of implementation-specific ESR bits. #### Legend: [xxx] means ESR<sub>xxx</sub> could be set [xxx,yyy] means either ${\rm ESR}_{\rm xxx}$ or ${\rm ESR}_{\rm yyy}$ may be set, but never both (xxx,yyy) means either ${\sf ESR}_{\sf xxx}$ or ${\sf ESR}_{\sf yyy}$ will be set, but never both $\{xxx,yyy\}$ means either $\text{ESR}_{xxx}$ or $\text{ESR}_{yyy}$ will be set, or possibly both xxx means ESR<sub>xxx</sub> is set 6. The precision of the Floating-point Enabled Exception type Program interrupt is controlled by the MSR<sub>FF0,FF1</sub> bits. When MSR<sub>FF0,FF1</sub>=0b01 or 0b10, the interrupt may be imprecise. When such a Program interrupt is taken, if the address saved in SRR0 is not the address of the instruction that caused the exception (i.e. the instruction that caused FPSCR<sub>FEX</sub> to be set to 1), ESR<sub>PIE</sub> is set to 1. When MSR<sub>FE0,FE1</sub>=0b11, the interrupt is precise. When MSR<sub>FE0,FE1</sub>=0b00, the interrupt is masked, and the interrupt will subsequently occur imprecisely if and when Floating-point Enabled Exception type Program interrupts are enabled by setting either or both of MSR<sub>FE0.FE1</sub>, and will also cause ESR<sub>PIE</sub> to be set to 1. See Section 5.6.7. Also, exception status on the exact cause is available in the Floating-Point Status and Control Register (see Section 4.2.2 and Section 4.4 of Book I). The precision of the Auxiliary Processor Enabled Exception type Program interrupt is implementation-dependent. - 7. Auxiliary Processor exception status is commonly provided as part of the implementation. - 8. Cache locking and cache locking exceptions are implementation-dependent. - 9. Software must examine the instruction and the subject TLB entry to determine the exact cause of the interrupt. - 10. If the Embedded.Enhanced Debug category is enabled, this interrupt is not a critical interrupt. DSRR0 and DSRR1 are used instead of CSRR0 and CSRR1. #### **Critical Input Interrupt** 5.6.1 A Critical Input interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), a Critical Input exception is presented to the interrupt mechanism, and MSR<sub>CE</sub>=1. While the specific definition of a Critical Input exception is implementation-dependent, it would typically be caused by the activation of an asynchronous signal that is part of the system. Also, implementations may provide an alternative means (in addition to MSR<sub>CF</sub>) for masking the Critical Input inter- CSRR0, CSRR1, and MSR are updated as follows: CSRR0 Set to the effective address of the next instruction to be executed. CSRR1 Set to the contents of the MSR at the time of the interrupt. MSR MSR<sub>CM</sub> is set to MSR<sub>ICM</sub>. CM ME, ICM Unchanged. Unchanged if category E.ED is supported; DE otherwise set to 0 All other defined MSR bits set to 0. Instruction execution resumes at address IVPR<sub>0:47</sub> || IVOR0<sub>48:59</sub>||0b0000. #### Programming Note – Software is responsible for taking any action(s) that are required by the implementation in order to clear any Critical Input exception status prior to reenabling MSR<sub>CF</sub> in order to avoid another, redundant Critical Input interrupt. #### 5.6.2 Machine Check Interrupt A Machine Check interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), a Machine Check exception is presented to the interrupt mechanism, and MSR<sub>ME</sub>=1. The specific cause or causes of Machine Check exceptions are implementation-dependent, as are the details of the actions taken on a Machine Check interrupt. If the Machine Check Extension is implemented, MCSRR0, MCSRR1, and MCSR are set, otherwise CSRR0, CSRR1, and ESR are set. The registers are updated as follows: #### CSRR0/MCSRR0 Set to an instruction address. As closely as possible, set to the effective address of an instruction that was executing or about to be executed when the Machine Check exception occurred. #### CSRR1/MCSRR1 Set to the contents of the MSR at the time of the interrupt. #### **MSR** CM MSR<sub>CM</sub> is set to MSR<sub>ICM</sub>. DF Unchanged if category E.ED is supported; otherwise set to 0. All other defined MSR bits set to 0. #### ESR/MCSR Implementation-dependent. Instruction execution resumes at address IVPR<sub>0.47</sub> || IVOR1<sub>48:59</sub>||0b0000. #### Programming Note - If a Machine Check interrupt is caused by an error in the storage subsystem, the storage subsystem may return incorrect data, that may be placed into registers and/or on-chip caches. #### Programming Note - On implementations on which a Machine Check interrupt can be caused by referring to an invalid real address, executing a *dcbz*, *dcbzep*, or *dcba* instruction can cause a delayed Machine Check interrupt by establishing in the data cache a block that is associated with an invalid real address. See Section 3.3 of Book II. A Machine Check interrupt can eventually occur if and when a subsequent attempt is made to write that block to main storage, for example as the result of executing an instruction that causes a cache miss for which the block is the target for replacement or as the result of executing a *dcbst*, *dcbstep*, *dcbf*, or *dcbfep* instruction ### 5.6.3 Data Storage Interrupt A Data Storage interrupt may occur when no higher priority exception exists (see Section 5.9 on page 689) and a Data Storage exception is presented to the interrupt mechanism. A Data Storage exception is caused when any of the following exceptions arises during execution of an instruction: #### **Read Access Control exception** A Read Access Control exception is caused when one of the following conditions exist. - While in user mode (MSR<sub>PR</sub>=1), a Load or 'loadclass' Cache Management instruction attempts to access a location in storage that is not user mode read enabled (i.e. page access control bit UR=0). - While in supervisor mode (MSR<sub>PR</sub>=0), a Load or 'load-class' Cache Management instruction attempts to access a location in storage that is not supervisor mode read enabled (i.e. page access control bit SR=0). #### Write Access Control exception A Write Access Control exception is caused when one of the following conditions exist. - While in user mode (MSR<sub>PR</sub>=1), a Store or 'storeclass' Cache Management instruction attempts to access a location in storage that is not user mode write enabled (i.e. page access control bit UW=0). - While in supervisor mode (MSR<sub>PR</sub>=0), a *Store* or 'store-class' *Cache Management* instruction attempts to access a location in storage that is not supervisor mode write enabled (i.e. page access control bit SW=0). #### **Byte Ordering exception** A Byte Ordering exception may occur when the implementation cannot perform the data storage access in the byte order specified by the Endian storage attribute of the page being accessed. #### Cache Locking exception A Cache Locking exception may occur when the locked state of one or more cache lines has the potential to be altered. This exception is implementation-dependent. #### Storage Synchronization exception A Storage Synchronization exception will occur when an attempt is made to execute a *Load and Reserve* or *Store Conditional* instruction from or to a location that is Write Through Required or Caching Inhibited (if the interrupt does not occur then the instruction executes correctly: see Section 3.4.2 of Book II). If a **stwcx.** or **stdcx.** would not perform its store in the absence of a Data Storage interrupt, and either (a) the specified effective address refers to storage that is Write Through Required or Caching Inhibited, or (b) a non-conditional *Store* to the specified effective address would cause a Data Storage interrupt, it is implementation-dependent whether a Data Storage interrupt occurs. Instructions *Iswx* or *stswx* with a length of zero, *icbt*, *dcbtep*, *dcbtst*, *dcbtstep*, or *dcba* cannot cause a Data Storage interrupt, regardless of the effective address. #### **Programming Note -** The *icbi*, *icbiep*, and *icbt* instructions are treated as *Loads* from the addressed byte with respect to address translation and protection. These *Instruction Cache Management* instructions use MSR<sub>DS</sub>, not MSR<sub>IS</sub>, to determine translation for their operands. Instruction Storage exceptions and Instruction TLB Miss exceptions are associated with the 'fetching' of instructions not with the 'execution' of instructions. Data Storage exceptions and Data TLB Miss exceptions are associated with the 'execution' of *Instruction Cache Management* instructions When a Data Storage interrupt occurs, the processor suppresses the execution of the instruction causing the Data Storage exception. SRR0, SRR1, MSR, DEAR, and ESR are updated as follows: SRR0 Set to the effective address of the instruction causing the Data Storage interrupt. SRR1 Set to the contents of the MSR at the time of the interrupt. MSR CM MSR<sub>CM</sub> is set to MSR<sub>ICM</sub>. CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. **DEAR** Set to the effective address of a byte that is both within the range of the bytes being accessed by the *Storage Access* or *Cache Management* instruction, and within the page whose access caused the Data Storage exception. **ESR** FP Set to 1 if the instruction causing the inter- rupt is a floating-point load or store; other- wise set to 0. ST Set to 1 if the instruction causing the interrupt is a *Store* or 'store-class' *Cache Man-* agement instruction; otherwise set to 0. DLK<sub>0:1</sub> Set to an implementation-dependent value due to a Cache Locking exception causing the interrupt. AP Set to 1 if the instruction causing the interrupt is an Auxiliary Processor load or store; otherwise set to 0. BO Set to 1 if the instruction caused a Byte Ordering exception; otherwise set to 0. SPV Set to 1 if the instruction causing the inter- rupt is a SPE operation or a Vector operation; otherwise set to 0. VLEMI Set to 1 if the instruction causing the inter- rupt resides in VLE storage. EPID Set to 1 if the instruction causing the inter- rupt is an External Process ID instruction; otherwise set to 0. All other defined ESR bits are set to 0. #### Programming Note Read and Write Access Control and Byte Ordering exceptions are not mutually exclusive. Even if ESR<sub>BO</sub> is set, system software must also examine the TLB entry accessed by the data storage access to determine whether or not a Read Access Control or Write Access Control exception may have also occurred. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR $_{248:59}$ ||0b0000. ## 5.6.4 Instruction Storage Interrupt An Instruction Storage interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689) and an Instruction Storage exception is presented to the interrupt mechanism. An Instruction Storage exception is caused when any of the following exceptions arises during execution of an instruction: #### **Execute Access Control exception** An Execute Access Control exception is caused when one of the following conditions exist. While in user mode (MSR<sub>PR</sub>=1), an instruction fetch attempts to access a location in storage that - is not user mode execute enabled (i.e. page access control bit UX=0). - While in supervisor mode (MSR<sub>PR</sub>=0), an instruction fetch attempts to access a location in storage that is not supervisor mode execute enabled (i.e. page access control bit SX=0). #### **Byte Ordering exception** A Byte Ordering exception may occur when the implementation cannot perform the instruction fetch in the byte order specified by the Endian storage attribute of the page being accessed. When an Instruction Storage interrupt occurs, the processor suppresses the execution of the instruction causing the Instruction Storage exception. SRR0, SRR1, MSR, and ESR are updated as follows: **SRR0** Set to the effective address of the instruc- tion causing the Instruction Storage inter- rupt. SRR1 Set to the contents of the MSR at the time of the interrupt. **MSR** CM $MSR_{CM}$ is set to $MSR_{ICM}$ . CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. **ESR** BO Set to 1 if the instruction fetch caused a Byte Ordering exception; otherwise set to C VLEMI Set to 1 if the instruction causing the interrupt resides in VLE storage. All other defined ESR bits are set to 0. #### Programming Note Execute Access Control and Byte Ordering exceptions are not mutually exclusive. Even if ESR<sub>BO</sub> is set, system software must also examine the TLB entry accessed by the instruction fetch to determine whether or not an Execute Access Control exception may have also occurred. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR3 $_{48\cdot59}$ ||0b0000. # 5.6.5 External Input Interrupt An External Input interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), an External Input exception is presented to the interrupt mechanism, and MSR<sub>EE</sub>=1. While the specific definition of an External Input exception is implementation-dependent, it would typically be caused by the activation of an asynchronous signal that is part of the pro- cessing system. Also, implementations may provide an alternative means (in addition to ${\sf MSR}_{\sf EE}$ ) for masking the External Input interrupt. SRR0, SRR1, and MSR are updated as follows: SRR0 Set to the effective address of the next instruction to be executed. SRR1 Set to the contents of the MSR at the time of the interrupt. #### **MSR** CM $MSR_{CM}$ is set to $MSR_{ICM}$ . CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR4 $_{48:59}$ ||0b0000. #### Programming Note Software is responsible for taking whatever action(s) are required by the implementation in order to clear any External Input exception status prior to re-enabling MSR<sub>EE</sub> in order to avoid another, redundant External Input interrupt. ## 5.6.6 Alignment Interrupt An Alignment interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689) and an Alignment exception is presented to the interrupt mechanism. An Alignment exception may be caused when the implementation cannot perform a data storage access for one of the following reasons: - The operand of a *Load* or *Store* is not aligned. - The instruction is a Move Assist, Load Multiple or Store Multiple. - The operand of *dcbz* or *dcbzep* is in storage that is Write Through Required or Caching Inhibited, or one of these instructions is executed in an implementation that has either no data cache or a Write Through data cache or the line addressed by the instruction cannot be established in the cache because the cache is disabled or locked. - The operand of a Store, except Store Conditional, is in storage that is Write-Through Required. For *Imw* and *stmw* with an operand that is not wordaligned, and for *Load and Reserve* and *Store Conditional* instructions with an operand that is not aligned, an implementation may yield boundedly undefined results instead of causing an Alignment interrupt. A *Store Conditional* to Write Through Required storage may either cause a Data Storage interrupt, cause an Alignment interrupt, or correctly execute the instruction. For all other cases listed above, an implementation may execute the instruction correctly instead of causing an Alignment interrupt. (For *dcbz* or *dcbzep*, 'correct' execution means setting each byte of the block in main storage to 0x00.) #### Programming Note - The architecture does not support the use of an unaligned effective address by *Load and Reserve* and *Store Conditional* instructions. If an Alignment interrupt occurs because one of these instructions specifies an unaligned effective address, the Alignment interrupt handler must not attempt to emulate the instruction, but instead should treat the instruction as a programming error. When an Alignment interrupt occurs, the processor suppresses the execution of the instruction causing the Alignment exception. SRR0, SRR1, MSR, DEAR, and ESR are updated as follows: **SRR0** Set to the effective address of the instruc- tion causing the Alignment interrupt. SRR1 Set to the contents of the MSR at the time of the interrupt. #### MSR CM $MSR_{CM}$ is set to $MSR_{ICM}$ . CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. **DEAR** Set to the effective address of a byte that is both within the range of the bytes being accessed by the Storage Access or Cache Management instruction, and within the page whose access caused the Alignment exception. **ESR** FP Set to 1 if the instruction causing the interrupt is a floating-point load or store; otherwise set to 0. ST Set to 1 if the instruction causing the interrupt is a *Store*; otherwise set to 0. AP Set to 1 if the instruction causing the interrupt is an Auxiliary Processor load or store; otherwise set to 0. SPV Set to 1 if the instruction causing the interrupt is a SPE operation or a Vector operation; otherwise set to 0. VLEMI Set to 1 if the instruction causing the interrupt resides in VLE storage. EPID Set to 1 if the instruction causing the interrupt is an External Process ID instruction; otherwise set to 0. All other defined ESR bits are set to 0. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR5 $_{48:59}$ ||0b0000. ## 5.6.7 Program Interrupt A Program interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), a Program exception is presented to the interrupt mechanism, and, for Floating-point Enabled exception, MSR<sub>FE0,FE1</sub> are non-zero. A Program exception is caused when any of the following exceptions arises during execution of an instruction: #### Floating-point Enabled exception A Floating-point Enabled exception is caused when FPSCR<sub>FEX</sub> is set to 1 by the execution of a floating-point instruction that causes an enabled exception, including the case of a *Move To FPSCR* instruction that causes an exception bit and the corresponding enable bit both to be 1. Note that in this context, the term 'enabled exception' refers to the enabling provided by control bits in the Floating-Point Status and Control Register. See Section 4.2.2 of Book I. #### **Auxiliary Processor Enabled exception** The cause of an Auxiliary Processor Enabled exception is implementation-dependent. #### Illegal Instruction exception An Illegal Instruction exception does occur when execution is attempted of any of the following kinds of instructions. - a reserved-illegal instruction - when MSR<sub>PR</sub>=1 (user mode), an *mtspr* or *mfspr* that specifies an spr value with spr<sub>5</sub>=0 (user-mode accessible) that represents an unimplemented Special Purpose Register An Illegal Instruction exception *may* occur when execution is attempted of any of the following kinds of instructions. If the exception does not occur, the alternative is shown in parentheses. - an instruction that is in invalid form (boundedly undefined results) - an Iswx instruction for which register RA or register RB is in the range of registers to be loaded (boundedly undefined results) - a defined instruction that is not implemented by the implementation (Unimplemented Operation exception) #### **Privileged Instruction exception** A Privileged Instruction exception occurs when $MSR_{PR}$ =1 and execution is attempted of any of the following kinds of instructions. - a privileged instruction - an *mtspr* or *mfspr* instruction that specifies an spr value with spr<sub>5</sub>=1 #### Trap exception A Trap exception occurs when any of the conditions specified in a Trap instruction are met and the exception is not also enabled as a Debug interrupt. If enabled as a Debug interrupt (i.e. $DBCR0_{TRAP}=1$ , $DBCR0_{IDM}=1$ , and $MSR_{DE}=1$ ), then a Debug interrupt will be taken instead of the Program interrupt. #### **Unimplemented Operation exception** An Unimplemented Operation exception *may* occur when execution is attempted of a defined instruction that is not implemented by the implementation. Otherwise an Illegal Instruction exception occurs. An Unimplemented Operation exception *may* also occur when the processor is in 32-bit mode and execution is attempted of an instruction that is part of the 64-Bit category. Otherwise the instruction executes normally. SRR0, SRR1, MSR, and ESR are updated as follows: SRR0 For all Program interrupts except an Enabled exception when in one of the imprecise modes (see Section 2.2.1 on page 611) or when a disabled exception is subsequently enabled, set to the effective address of the instruction that caused the Program interrupt. For an imprecise Enabled exception, set to the effective address of the excepting instruction or to the effective address of some subsequent instruction. If it points to a subsequent instruction, that instruction has not been executed, and ESR<sub>PIE</sub> is set to 1. If a subsequent instruction is an *msync* or *isync*, SRR0 will point at the *msync* or *isync* instruction, or at the following instruction. If $FPSCR_{FEX}=1$ but both $MSR_{FE0}=0$ and $MSR_{FE1}=0$ , an Enabled exception type Program interrupt will occur imprecisely prior to or at the next synchronizing event if these MSR bits are altered by any instruction that can set the MSR so that the expression (MSR<sub>FE0</sub> | MSR<sub>FE1</sub>) & FPSCR<sub>FEX</sub> is 1. When this occurs, SRR0 is loaded with the address of the instruction that would have executed next, not with the address of the instruction that modified the MSR causing the interrupt, and ESR<sub>PIF</sub> is set to 1. SRR1 Set to the contents of the MSR at the time of the interrupt. **MSR** CM $MSR_{CM}$ is set to $MSR_{ICM}$ . CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. **ESR** PIL Set to 1 if an Illegal Instruction exception type Program interrupt; otherwise set to 0 PPR Set to 1 if a Privileged Instruction exception type Program interrupt; otherwise set to 0 PTR Set to 1 if a Trap exception type Program interrupt; otherwise set to 0 PUO Set to 1 if an Unimplemented Operation exception type Program interrupt; otherwise set to 0 FP Set to 1 if the instruction causing the interrupt is a floating-point instruction; otherwise set to 0. PIE Set to 1 if a Floating-point Enabled exception type Program interrupt, and the address saved in SRR0 is not the address of the instruction causing the exception (i.e. the instruction that caused FPSCR<sub>FEX</sub> to be set); otherwise set to 0. AP Set to 1 if the instruction causing the interrupt is an Auxiliary Processor instruction; otherwise set to 0. SPV Set to 1 if the instruction causing the interrupt is a SPE operation or a Vector operation; otherwise set to 0. VLEMI Set to 1 if the instruction causing the interrupt resides in VLE storage. All other defined ESR bits are set to 0. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR6 $_{48:59}$ ||0b0000. # 5.6.8 Floating-Point Unavailable Interrupt A Floating-Point Unavailable interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), an attempt is made to execute a floating-point instruction (i.e. any instruction listed in Section 4.6 of Book I), and $MSR_{FP}=0$ . When a Floating-Point Unavailable interrupt occurs, the processor suppresses the execution of the instruction causing the Floating-Point Unavailable interrupt. SRR0, SRR1, and MSR are updated as follows: **SRR0** Set to the effective address of the instruction that caused the interrupt. SRR1 Set to the contents of the MSR at the time of the interrupt. MSR CM $MSR_{CM}$ is set to $MSR_{ICM}$ . CE. ME. DE, ICM Unchanged. All other defined MSR bits set to 0. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR7 $_{48:59}$ ||0b0000. ## 5.6.9 System Call Interrupt A System Call interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689) and a System Call (sc) instruction is executed. SRR0, SRR1, and MSR are updated as follows: **SRR0** Set to the effective address of the instruction *after* the **sc** instruction. SRR1 Set to the contents of the MSR at the time of the interrupt. MSR CM $MSR_{CM}$ is set to $MSR_{ICM}$ . VLEMI Set to 1 if the instruction causing the inter- rupt resides in VLE storage. CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR8 $_{48:59}$ ||0b0000. ## 5.6.10 Auxiliary Processor Unavailable Interrupt An Auxiliary Processor Unavailable interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), an attempt is made to execute an Auxiliary Processor instruction (including Auxiliary Processor loads, stores, and moves), the target Auxiliary Processor is present on the implementation, and the Auxiliary Processor is configured as unavailable. Details of the Auxiliary Processor, its instruction set, and its configuration are implementation-dependent. See User's Manual for the implementation. When an Auxiliary Processor Unavailable interrupt occurs, the processor suppresses the execution of the instruction causing the Auxiliary Processor Unavailable interrupt. Registers SRR0, SRR1, and MSR are updated as follows: SRR0 Set to the effective address of the instruction that caused the interrupt. tion that dauged the interrupt. Set to the contents of the MSR at the time of the interrupt. MSR SRR1 CM $MSR_{CM}$ is set to $MSR_{ICM}$ . CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR9 $_{48:59}$ ||0b0000. ## 5.6.11 Decrementer Interrupt A Decrementer interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), a Decrementer exception exists (TSR $_{DIS}$ =1), and the interrupt is enabled (TCR $_{DIE}$ =1 and MSR $_{EE}$ =1). See Section 7.3 on page 697. #### Programming Note — MSR<sub>EE</sub> also enables the External Input and Fixed-Interval Timer interrupts. SRR0, SRR1, MSR, and TSR are updated as follows: SRR0 Set to the effective address of the next instruction to be executed. **SRR1** Set to the contents of the MSR at the time of the interrupt. **MSR** CM $MSR_{CM}$ is set to $MSR_{ICM}$ . CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. TSR (See Section 7.5.1 on page 700.) DIS Set to 1. Instruction execution resumes at address IVPR $_{0:47} \parallel$ IVOR10 $_{48:59} \parallel$ 0b0000. #### Programming Note — Software is responsible for clearing the Decrementer exception status prior to re-enabling the MSR<sub>EE</sub> bit in order to avoid another redundant Decrementer interrupt. To clear the Decrementer exception, the interrupt handling routine must clear TSR<sub>DIS</sub>. Clearing is done by writing a word to TSR using *mtspr* with a 1 in any bit position that is to be cleared and 0 in all other bit positions. The writedata to the TSR is not direct data, but a mask. A 1 causes the bit to be cleared, and a 0 has no effect. # 5.6.12 Fixed-Interval Timer Interrupt A Fixed-Interval Timer interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), a Fixed-Interval Timer exception exists (TSR $_{FIS}$ =1), and the interrupt is enabled (TCR $_{FIE}$ =1 and MSR $_{EE}$ =1). See Section 7.6 on page 700. #### Programming Note MSR<sub>EE</sub> also enables the External Input and Decrementer interrupts. SRR0, SRR1, MSR, and TSR are updated as follows: SRR0 Set to the effective address of the next instruction to be executed. **SRR1** Set to the contents of the MSR at the time of the interrupt. **MSR** CM MSR<sub>CM</sub> is set to MSR<sub>ICM</sub>. CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. TSR (See Section 7.5.1 on page 700.) FIS Set to 1 Instruction execution resumes at address IVPR $_{0:47}$ || IVOR11 $_{48:59}$ ||0b0000. #### Programming Note Software is responsible for clearing the Fixed-Interval Timer exception status prior to re-enabling the MSR<sub>EE</sub> bit in order to avoid another redundant Fixed-Interval Timer interrupt. To clear the Fixed-Interval Timer exception, the interrupt handling routine must clear TSR<sub>FIS</sub>. Clearing is done by writing a word to TSR using *mtspr* with a 1 in any bit position that is to be cleared and 0 in all other bit positions. The write-data to the TSR is not direct data, but a mask. A 1 causes the bit to be cleared, and a 0 has no effect. # 5.6.13 Watchdog Timer Interrupt A Watchdog Timer interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), a Watchdog Timer exception exists (TSR $_{WIS}$ =1), and the interrupt is enabled (i.e. TCR $_{WIE}$ =1 and MSR $_{CE}$ =1). See Section 7.7 on page 701. #### Programming Note — MSR<sub>CE</sub> also enables the Critical Input interrupt. CSRR0, CSRR1, MSR, and TSR are updated as follows: **CSRR0** Set to the effective address of the next instruction to be executed. **CSRR1** Set to the contents of the MSR at the time of the interrupt. MSR CM MSR<sub>CM</sub> is set to MSR<sub>ICM</sub>. ME, ICM, DE Unchanged. All other defined MSR bits set to 0. TSR (See Section 7.5.1 on page 700.) WIS Set to 1. Instruction execution resumes at address IVPR $_{0:47}$ | IVOR12 $_{48\cdot59}$ ||0b0000. #### Programming Note Software is responsible for clearing the Watchdog Timer exception status prior to re-enabling the MSR<sub>CE</sub> bit in order to avoid another redundant Watchdog Timer interrupt. To clear the Watchdog Timer exception, the interrupt handling routine must clear TSR<sub>WIS</sub>. Clearing is done by writing a word to TSR using *mtspr* with a 1 in any bit position that is to be cleared and 0 in all other bit positions. The write-data to the TSR is not direct data, but a mask. A 1 causes the bit to be cleared, and a 0 has no effect. ## 5.6.14 Data TLB Error Interrupt A Data TLB Error interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689) and any of the following Data TLB Error exceptions is presented to the interrupt mechanism. #### **TLB Miss exception** Caused when the virtual address associated with a data storage access does not match any valid entry in the TLB as specified in Section 4.7.2 on page 643. If a **stwcx.** or **stdcx.** would not perform its store in the absence of a Data Storage interrupt, and a non-conditional *Store* to the specified effective address would cause a Data Storage interrupt, it is implementation-dependent whether a Data Storage interrupt occurs. When a Data TLB Error interrupt occurs, the processor suppresses the execution of the instruction causing the Data TLB Error interrupt. SRR0, SRR1, MSR, DEAR and ESR are updated as follows: SRR0 Set to the effective address of the instruc- tion causing the Data TLB Error interrupt **SRR1** Set to the contents of the MSR at the time of the interrupt. **MSR** CM MSR<sub>CM</sub> is set to MSR<sub>ICM</sub>. CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. **DEAR** Set to the effective address of a byte that is both within the range of the bytes being accessed by the *Storage Access* or *Cache* Management instruction, and within the page whose access caused the Data TLB Error exception. **ESR** ST Set to 1 if the instruction causing the interrupt is a *Store*, *dcbi*, *dcbz*, or *dcbzep* instruction; otherwise set to 0. FP Set to 1 if the instruction causing the interrupt is a floating-point load or store; otherwise set to 0. AP Set to 1 if the instruction causing the interrupt is an Auxiliary Processor load or store; otherwise set to 0. SPV Set to 1 if the instruction causing the interrupt is a SPE operation or a Vector operation; otherwise set to 0. VLEMI Set to 1 if the instruction causing the interrupt resides in VLE storage. EPID Set to 1 if the instruction causing the interrupt is an External Process ID instruction; otherwise set to 0. All other defined ESR bits are set to 0. Instruction execution resumes at address IVPR $_{0:47}$ | IVOR13 $_{48:59}$ ||0b0000. # 5.6.15 Instruction TLB Error Interrupt An Instruction TLB Error interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689) and any of the following Instruction TLB Error exceptions is presented to the interrupt mechanism #### **TLB Miss exception** Caused when the virtual address associated with an instruction fetch does not match any valid entry in the TLB as specified in Section 4.7.2 on page 643. When an Instruction TLB Error interrupt occurs, the processor suppresses the execution of the instruction causing the Instruction TLB Miss exception. SRR0, SRR1, and MSR are updated as follows: SRR0 Set to the effective address of the instruction causing the Instruction TLB Error interrupt. SRR1 Set to the contents of the MSR at the time of the interrupt. MSR CM $MSR_{CM}$ is set to $MSR_{ICM}$ . CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR14 $_{48:59}$ ||0b0000. ## 5.6.16 Debug Interrupt A Debug interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), a Debug exception exists in the DBSR, and Debug interrupts are enabled (DBCR0<sub>IDM</sub>=1 and MSR<sub>DE</sub>=1). A Debug exception occurs when a Debug Event causes a corresponding bit in the DBSR to be set. See Section 8.5. If the Embedded.Enhanced Debug category is not supported or is supported and is not enabled, CSRR0, CSRR1, MSR, and DBSR are updated as follows. If the Embedded.Enhanced Debug category is supported and is enabled, DSRR0 and DSRR1 are updated as specified below and CSRR0 and CSRR1 are not changed. The means by which the Embedded.Enhanced Debug category is enabled is implementation-dependent. **CSRR0** or **DSRR0** [Category: Embedded.Enhanced Debug] For Debug exceptions that occur while Debug interrupts are enabled (DBCR0<sub>IDM</sub>=1 and MSR<sub>DE</sub>=1), CSRR0 is set as follows: - For Instruction Address Compare (IAC1, IAC2, IAC3, IAC4), Data Address Compare (DAC1R, DAC1W, DAC2R, DAC2W), Data Value Compare (DVC1, DVC2), Trap (TRAP), or Branch Taken (BRT) debug exceptions, set to the address of the instruction causing the Debug interrupt. - For Instruction Complete (ICMP) debug exceptions, set to the address of the instruction that would have executed after the one that caused the Debug interrupt. - For Unconditional Debug Event (UDE) debug exceptions, set to the address of the instruction that would have executed next if the Debug interrupt had not occurred. - For Interrupt Taken (IRPT) debug exceptions, set to the interrupt vector value of the interrupt that caused the Interrupt Taken debug event. - For Return From Interrupt (RET) debug exceptions, set to the address of the *rfi* instruction that caused the Debug interrupt. - For Critical Interrupt Taken (CRPT) debug exceptions, DSRR0 is set to the address of the first instruction of the critical interrupt handler. CSRR0 is unaffected. - For Critical Interrupt Return (CRET) debug exceptions, DSRR0 is set to the address of the rfci instruction that caused the Debug interrupt. See Section 8.4.10, "Critical Interrupt Return Debug Event [Category: Embedded.Enhanced Debug]". For Debug exceptions that occur while Debug interrupts disabled are (DBCR0<sub>IDM</sub>=0 or MSR<sub>DE</sub>=0), a Debug interrupt will occur at the next synchronizing event if DBCR0<sub>IDM</sub> and MSR<sub>DE</sub> are modified such that they are both 1 and if the Debug exception Status is still set in the DBSR. When this occurs, CSRR0 or DSRR0 [Category:Embedded.Enhanced Debug] is set to the address of the instruction that would have executed next, not with the address of the instruction that modified the Debug Control Register 0 or MSR and thus caused the interrupt. CSRR1 or DSRR1 [Category: Embedded.Enhanced Debug] Set to the contents of the MSR at the time of the interrupt. #### MSR CM MSR<sub>CM</sub> is set to MSR<sub>ICM</sub>. ME, ICM Unchanged. All other supported MSR bits set to 0. DBSR Set to indicate type of Debug Event (see Section 8.5.2) Instruction execution resumes at address IVPR $_{0:47}$ || IVOR15 $_{48:59}$ ||0b0000. # 5.6.17 SPE/Embedded Floating-Point/Vector Unavailable Interrupt [Categories: SPE.Embedded Float Scalar Double, SPE.Embedded Float Vector, Vector] The SPE/Embedded Floating-Point/Vector Unavailable interrupt occurs when no higher priority exception exists, and an attempt is made to execute an SPE, SPE.Embedded Float Scalar Double, SPE.Embedded Float Vector, or Vector instruction and MSR<sub>SPV</sub> = 0. When an Embedded Floating-Point Unavailable interrupt occurs, the processor suppresses the execution of the instruction causing the exception. SRR0, SRR1, MSR, and ESR are updated as follows: SRR0 Set to the effective address of the instruction causing the Embedded Floating-Point Unavailable interrupt. SRR1 Set to the contents of the MSR at the time of the interrupt. #### MSR CM $MSR_{CM}$ is set to $MSR_{ICM}$ . VLEMI Set to 1 if the instruction causing the inter- rupt resides in VLE storage. CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. #### **ESR** SPV Set to 1. VLEMI Set to 1 if the instruction causing the inter- rupt resides in VLE storage. All other defined ESR bits are set to 0. Instruction execution resumes at address IVPR $_{0:47} \parallel$ IVOR32 $_{48:59} \parallel$ 0b0000. #### Programming Note — This interrupt is also used by the Signal Processing Engine in the same manner. It should be used by software to determine if the application is using the upper 32 bits of the GPRs in a 32-bit implementation and thus be required to save and restore them on context switch. # 5.6.18 Embedded Floating-Point Data Interrupt ## [Categories: SPE.Embedded Float Scalar Double, SPE.Embedded Float Scalar Single, SPE.Embedded Float Vector] The Embedded Floating-Point Data interrupt occurs when no higher priority exception exists (see Section 5.9) and an Embedded Floating-Point Data exception is presented to the interrupt mechanism. The Embedded Floating-Point Data exception causing the interrupt is indicated in the SPEFSCR; these exceptions include Embedded Floating-Point Invalid Operation/Input Error (FINV, FINVH), Embedded Floating-Point Divide By Zero (FDBZ, FDBZH), Embedded Floating-Point Overflow (FOV, FOVH), and Embedded Floating-Point Underflow (FUNF, FUNFH) When an Embedded Floating-Point Data interrupt occurs, the processor suppresses the execution of the instruction causing the exception. SRR0, SRR1, MSR, and ESR are updated as follows: SRR0 Set to the effective address of the instruc- tion causing the Embedded Floating-Point Data interrupt. Data interrupt. SRR1 Set to the contents of the MSR at the time of the interrupt. **MSR** CM $MSR_{CM}$ is set to $MSR_{ICM}$ . VLEMI Set to 1 if the instruction causing the inter- rupt resides in VLE storage. CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. **ESR** SPV Set to 1. All other defined ESR bits are set to 0. Instruction execution resumes at address IVPR $_{0:47}$ || IVOR33 $_{48:59}$ ||0b0000. # 5.6.19 Embedded Floating-Point Round Interrupt [Categories: SPE.Embedded Float Scalar Double, SPE.Embedded Float Scalar Single, SPE.Embedded Float Vector] The Embedded Floating-Point Round interrupt occurs when no higher priority exception exists (see Section 5.9 on page 689), SPEFSCR<sub>FINXE</sub> is set to 1, and any of the following occurs: - the unrounded result of an Embedded Floating-Point operation is not exact - an overflow occurs and overflow exceptions are disabled (FOVF or FOVFH is set to 1 and FOVFE is set to 0) - an underflow occurs and underflow exceptions are disabled (FUNF is set to 1 and FUNFE is set to 0). The value of SPEFSCR<sub>FINXS</sub> is 1, indicating that one of the above exceptions has occurred, and additional information about the exception is found in SPEFSCR<sub>FGH FG FXH FX</sub>. When an Embedded Floating-Point Round interrupt occurs, the processor completes the execution of the instruction causing the exception and writes the result to the destination register prior to taking the interrupt. SRR0, SRR1, MSR, and ESR are updated as follows: SRR0 Set to the effective address of the instruction following the instruction causing the Embedded Floating-Point Round interrupt. SRR1 Set to the contents of the MSR at the time of the interrupt. **MSR** CM $MSR_{CM}$ is set to $MSR_{ICM}$ . CE, ME, DE, ICM Unchanged. All other defined MSR bits set to 0. **ESR** SPV Set to 1. VLEMI Set to 1 if the instruction causing the interrupt resides in VLE storage. All other defined ESR bits are set to 0. Instruction execution resumes at address IVPR $_{0:47} \parallel$ IVOR34 $_{48:59} \parallel$ 0b0000. #### - Programming Note If an implementation does not support ±Infinity rounding modes and the rounding mode is set to be +Infinity or -Infinity, an Embedded Floating-Point Round interrupt occurs after every *Embedded Floating-Point* instruction for which rounding might occur regardless of the value of FINXE, provided no higher priority exception exists. When an Embedded Floating-Point Round interrupt occurs, the unrounded (truncated) result of an inexact high or low element is placed in the target register. If only a single element is inexact, the other exact element is updated with the correctly rounded result, and the FG and FX bits corresponding to the other exact element will both be 0. The bits FG (FGH) and FX (FXH) are provided so that an interrupt handler can round the result as it desires. FG (FGH) is the value of the bit immediately to the right of the least significant bit of the destination format mantissa from the infinitely precise intermediate calculation before rounding. FX (FXH) is the value of the 'or' of all the bits to the right of the FG (FGH) of the destination format mantissa from the infinitely precise intermediate calculation before rounding. # 5.6.20 Performance Monitor Interrupt [Category: Embedded Performance Monitor] The Performance Monitor interrupt is part of the optional Performance Monitor facility; see Appendix E. ## 5.6.21 Processor Doorbell Interrupt [Category: Embedded Processor Control] A Processor Doorbell Interrupt occurs when no higher priority exception exists, a Processor Doorbell exception is present, and MSR<sub>EE</sub>=1. Processor Doorbell exceptions are generated when DBELL messages (see Chapter 9) are received and accepted by the processor. When a Processor Doorbell Interrupt occurs, SRR0 is set to the address of the next instruction to be executed and SRR1 is set to the contents of the MSR at the time of the interrupt. Instruction execution resumes at address IVPR $_{\!0:47}\parallel$ IVOR36 $_{\!48:59}\parallel$ 0b0000. # 5.6.22 Processor Doorbell Critical Interrupt [Category: Embedded.Processor Control] A Processor Doorbell Critical Interrupt occurs when no higher priority exception exists, a Processor Doorbell Critical exception is present, and MSR<sub>CE</sub>=1. Processor Doorbell Critical exceptions are generated when DBELL\_CRIT messages (see Chapter 9) are received and accepted by the processor. When a Processor Doorbell Critical Interrupt occurs, CSRR0 is set to the address of the next instruction to be executed and CSRR1 is set to the contents of the MSR at the time of the interrupt. Instruction execution resumes at address IVPR $_{0:47}\parallel$ IVOR37 $_{48:59}\parallel$ 0b0000. ## 5.7 Partially Executed Instructions In general, the architecture permits load and store instructions to be partially executed, interrupted, and then to be restarted from the beginning upon return from the interrupt. Unaligned Load and Store instructions, or Load Multiple, Store Multiple, Load String, and Store String instructions may be broken up into multiple, smaller accesses, and these accesses may be performed in any order. In order to guarantee that a particular load or store instruction will complete without being interrupted and restarted, software must mark the storage being referred to as Guarded, and must use an elementary (non-string or non-multiple) load or store that is aligned on an operand-sized boundary. In order to guarantee that *Load* and *Store* instructions can, in general, be restarted and completed correctly without software intervention, the following rules apply when an execution is partially executed and then interrupted: - For an elementary Load, no part of the target register RT or FRT, will have been altered. - For 'with update' forms of Load or Store, the update register, register RA, will not have been altered. On the other hand, the following effects are permissible when certain instructions are partially executed and then restarted: - For any *Store*, some of the bytes at the target storage location may have been altered (if write access to that page in which bytes were altered is permitted by the access control mechanism). In addition, for *Store Conditional* instructions, CR0 has been set to an undefined value, and it is undefined whether the reservation has been cleared. - For any Load, some of the bytes at the addressed storage location may have been accessed (if read access to that page in which bytes were accessed is permitted by the access control mechanism). - For Load Multiple or Load String, some of the registers in the range to be loaded may have been altered. Including the addressing registers (RA, and possibly RB) in the range to be loaded is a programming error, and thus the rules for partial execution do not protect against overwriting of these registers. In no case will access control be violated. As previously stated, the only load or store instructions that are guaranteed to not be interrupted after being partially executed are elementary, aligned, guarded loads and stores. All others may be interrupted after being partially executed. The following list identifies the specific instruction types for which interruption after partial execution may occur, as well as the specific interrupt types that could cause the interruption: Any Load or Store (except elementary, aligned, guarded): Any asynchronous interrupt Machine Check Program (Imprecise Mode Floating-Point Enabled) Program (Imprecise Mode Auxiliary Processor Enabled) Unaligned elementary Load or Store, or any multiple or string: All of the above listed under item 1, plus the following: Data Storage (if the access crosses a protection boundary) Debug (Data Address Compare, Data Value Compare) - mtcrf may also be partially executed due to the occurrence of any of the interrupts listed under item 1 at the time the mtcrf was executing. - All instructions prior to the *mtcrf* have completed execution. (Some storage accesses generated by these preceding instructions may not have completed.) - No subsequent instruction has begun execution. - The *mtcrf* instruction (the address of which was saved in SRR0/CSRR0/MCSRR0/DSRR0 [Category: Embedded.Enhanced Debug] at the occurrence of the interrupt), may appear not to have begun or may have partially executed. ## 5.8 Interrupt Ordering and Masking It is possible for multiple exceptions to exist simultaneously, each of which could cause the generation of an interrupt. Furthermore, for interrupts classes other than the Machine Check interrupt and critical interrupts, the architecture does not provide for reporting more than one interrupt of the same class (unless the Embedded.Enhanced Debug category is supported). Therefore, the architecture defines that interrupts are ordered with respect to each other, and provides a masking mechanism for certain persistent interrupt types. When an interrupt is masked (disabled), and an event causes an exception that would normally generate the interrupt, the exception *persists* as a *status* bit in a register (which register depends upon the exception type). However, no interrupt is generated. Later, if the interrupt is enabled (unmasked), and the exception status has not been cleared by software, the interrupt due to the original exception event will then finally be generated. All asynchronous interrupts can be masked. In addition, certain synchronous interrupts can be masked. An example of such an interrupt is the Floating-Point Enabled exception type Program interrupt. The execution of a floating-point instruction that causes the FPSCR<sub>FEX</sub> bit to be set to 1 is considered an exception event, regardless of the setting of MSR<sub>FE0,FE1</sub>. If MSR<sub>FE0,FE1</sub> are both 0, then the Floating-Point Enabled exception type of Program interrupt is masked, but the exception persists in the FPSCR<sub>FEX</sub> bit. Later, if the MSR<sub>FE0,FE1</sub> bits are enabled, the interrupt will finally be generated. The architecture enables implementations to avoid situations in which an interrupt would cause the state information (saved in Save/Restore Registers) from a previous interrupt to be overwritten and lost. In order to do this, the architecture defines interrupt classes in a hierarchical manner. At each interrupt class, hardware automatically disables any further interrupts associated with the interrupt class by masking the interrupt enable in the MSR when the interrupt is taken. In addition, each interrupt class masks the interrupt enable in the MSR for each lower class in the hierarchy. The hierar- chy of interrupt classes is as follows from highest to lowest: | Interrupt Class | MSR Enables<br>Cleared | Save/Restore<br>Registers | | | | | |--------------------|------------------------|---------------------------|--|--|--|--| | Machine Check | ME,DE, CE, EE | MSRR0/1 | | | | | | Debug <sup>1</sup> | DE,CE,EE | DSRR0/1 | | | | | | Critical | CE,EE | CSRR0/1 | | | | | | Base | EE | SRR0/1 | | | | | The Debug interrupt class is Category: E.ED. Note: MSR<sub>DE</sub> may be cleared when a critical interrupt occurs if Category: E.ED is not supported. #### Figure 16. Interrupt Hierarchy If the Embedded.Enhanced Debug category is not supported (or is supported and is not enabled), then the Debug interrupt becomes a Critical class interrupt and all critical class interrupts will clear DE, CE, and EE in the MSR. Base Class interrupts that occur as a result of precise exceptions are not masked by the EE bit in the MSR and any such exception that occurs prior to software saving the state of SRR0/1 in a base class exception handler will result in a situation that could result in the loss of state information. This first step of the hardware clearing the MSR enable bits lower in the hierarchy shown in Figure 16 prevents any subsequent asynchronous interrupts from overwriting the Save/Restore Registers (SRR0/SRR1, CSRR0/ CSRR1, MCSRR0/MCSRR1, or DSRR0/DSRR1 [Category: Embedded.Enhanced Debug]), prior to software being able to save their contents. Hardware also automatically interrupt, clears. any MSR<sub>WE,PR,FP,FE0,FE1,IS,DS</sub>. The clearing of these bits assists in the avoidance of subsequent interrupts of certain other types. However, guaranteeing that interrupt classes lower in the hierarchy do not occur and thus do not overwrite the Save/Restore Registers (SRR0/SRR1, CSRR0/CSRR1, DSRR0/DSRR1 [Category: Embedded.Enhanced Debug], or MCSRR0/ MCSRR1) also requires the cooperation of system software. Specifically, system software must avoid the execution of instructions that could cause (or enable) a subsequent interrupt, if the contents of the Save/ Restore Registers (SRR0/SRR1, CSRR0/CSRR1, DSRR0/DSRR1 [Category: Embedded.Enhanced Debug]), or MCSRR0/MCSRR1) have not yet been saved. # 5.8.1 Guidelines for System Software The following list identifies the actions that system software must avoid, prior to having saved the Save/Restore Registers' contents: - Re-enabling an interrupt class that is at the same or a lower level in the interrupt hierarchy. This includes the following actions: - Re-enabling of MSR<sub>EE</sub> - Re-enabling of MSR<sub>CE,EE</sub> in critical class interrupt handlers, and if the Embedded.Enhanced Debug category is not supported, re-enabling of MSR<sub>DE</sub>. - Category: Embedded.Enhanced Debug: Reenabling of MSR<sub>CE,EE,DE</sub> in Debug class interrupt handlers - Re-enabling of MSR<sub>EE,CE,DE,ME</sub> in Machine Check interrupt handlers. - Branching (or sequential execution) to addresses not mapped by the TLB, or mapped without UX=1 or SX=1 permission. This prevents Instruction Storage and Instruction TLB Error interrupts. Load, Store or Cache Management instructions to addresses not mapped by the TLB or not having required access permissions. This prevents Data Storage and Data TLB Error interrupts. Execution of System Call (sc) or Trap (tw, twi, td, tdi) instructions This prevents System Call and Trap exception type Program interrupts. Execution of any floating-point instruction This prevents Floating-Point Unavailable interrupts. Note that this interrupt would occur upon the execution of any floating-point instruction, due to the automatic clearing of MSR<sub>FP</sub>. However, even if software were to re-enable MSR<sub>FP</sub>, floating-point instructions must still be avoided in order to prevent Program interrupts due to various possible Program interrupt exceptions (Floating-Point Enabled, Unimplemented Operation). Re-enabling of MSR<sub>PR</sub> This prevents Privileged Instruction exception type Program interrupts. Alternatively, software could re-enable MSR<sub>PR</sub>, but avoid the execution of any privileged instructions. ■ Execution of any Auxiliary Processor instruction This prevents Auxiliary Processor Unavailable interrupts, and Auxiliary Processor Enabled type and Unimplemented Operation type Program interrupts. Execution of any Illegal instructions This prevents Illegal Instruction exception type Program interrupts. Execution of any instruction that could cause an Alignment interrupt This prevents Alignment interrupts. Included in this category are any string or multiple instructions, and any unaligned elementary load or store instructions. See Section 5.6.6 on page 677 for a complete list of instructions that may cause Alignment interrupts. It is not necessary for hardware or software to avoid interrupts higher in the interrupt hierarchy (see Figure 16) from within interrupt handlers (and hence, for example, hardware does not automatically clear MSR<sub>CE,ME,DE</sub> upon a base class interrupt), since interrupts at each level of the hierarchy use different pairs of Save/Restore Registers to save the instruction address and MSR (i.e. SRR0/SRR1 for base class interrupts, and MCSRR0/MCSRR1, DSRR0/DSRR1 [Category: Embedded.Enhanced Debug], or CSRR0/CSRR1 for non-base class interrupts). The converse, however, is not true. That is, hardware and software must cooperate in the avoidance of interrupts lower in the hierarchy from occurring within interrupt handlers, even though the these interrupts use different Save/Restore Register pairs. This is because the interrupt higher in the hierarchy may have occurred from within a interrupt handler for an interrupt lower in the hierarchy prior to the interrupt handler having saved the Save/Restore Registers. Therefore, within an interrupt handler, Save/ Restore Registers for all interrupts lower in the hierarchy may contain data that is necessary to the system software. ### 5.8.2 Interrupt Order The following is a prioritized listing of the various enabled interrupts for which exceptions might exist simultaneously: 1. Synchronous (Non-Debug) Interrupts: Data Storage Instruction Storage Alignment Program Floating-Point Unit Unavailable Auxiliary Processor Unavailable Embedded Floating-Point Unavailable [SP.Category: SP.Embedded Float\_\*] SPE/Embedded Floating-Point/Vector Unavailable Embedded Floating-Point Data [Category: SP.Embedded Float\_\*] Embedded Floating-Point Round [Category: SP.Embedded Float\_\*] System Call Data TLB Error Instruction TLB Error Only one of the above types of synchronous interrupts may have an existing exception generating it at any given time. This is guaranteed by the exception priority mechanism (see Section 5.9 on page 689) and the requirements of the Sequential Execution Model. - 2. Machine Check - 3. Debug - 4. Critical Input - 5. Watchdog Timer - 6. Processor Doorbell Critical - 7. External Input - 8. Fixed-Interval Timer - 9. Decrementer - 10. Processor Doorbell - 11. Embedded Performance Monitor Even though, as indicated above, the base, synchronous exception types listed under item 1 are generated with higher priority than the non-base interrupt classes listed in items 2-5, the fact is that these base class interrupts will immediately be followed by the highest priority existing interrupt in items 2-5, without executing any instructions at the base class interrupt handler. This is because the base interrupt classes do not automatically disable the MSR mask bits for the interrupts listed in 2-5. In all other cases, a particular interrupt class from the above list will automatically disable any subsequent interrupts of the same class, as well as all other interrupt classes that are listed below it in the priority order. ## 5.9 Exception Priorities All synchronous (precise and imprecise) interrupts are reported in program order, as required by the Sequential Execution Model. The one exception to this rule is the case of multiple synchronous imprecise interrupts. Upon a synchronizing event, all previously executed instructions are required to report any synchronous imprecise interrupt-generating exceptions, and the interrupt will then be generated with all of those exception types reported cumulatively, in both the ESR, and any status registers associated with the particular exception type (e.g. the Floating-Point Status and Control Register). For any single instruction attempting to cause multiple exceptions for which the corresponding synchronous interrupt types are enabled, this section defines the priority order by which the instruction will be permitted to cause a *single* enabled exception, thus generating a particular synchronous interrupt. Note that it is this exception priority mechanism, along with the requirement that synchronous interrupts be generated in program order, that guarantees that at any given time, there exists for consideration only one of the synchronous interrupt types listed in item 1 of Section 5.8.2 on page 689. The exception priority mechanism also prevents certain debug exceptions from existing in combination with certain other synchronous interrupt-generating exceptions. Because unaligned *Load* and *Store* instructions, or *Load Multiple*, *Store Multiple*, *Load String*, and *Store Sting* instructions may be broken up into multiple, smaller accesses, and these accesses may be performed in any order. The exception priority mechanism applies to each of the multiple storage accesses in the order they are performed by the implementation. This section does not define the permitted setting of multiple exceptions for which the corresponding interrupt types are disabled. The generation of exceptions for which the corresponding interrupt types are disabled will have no effect on the generation of other exceptions for which the corresponding interrupt types are enabled. Conversely, if a particular exception for which the corresponding interrupt type is enabled is shown in the following sections to be of a higher priority than another exception, it will prevent the setting of that other exception, independent of whether that other exception's corresponding interrupt type is enabled or disabled. Except as specifically noted, only one of the exception types listed for a given instruction type will be permitted to be generated at any given time. The priority of the exception types are listed in the following sections ranging from highest to lowest, within each instruction type. #### Programming Note - Some exception types may even be mutually exclusive of each other and could otherwise be considered the same priority. In these cases, the exceptions are listed in the order suggested by the sequential execution model. ## 5.9.1 Exception Priorities for Defined Instructions # 5.9.1.1 Exception Priorities for Defined Floating-Point Load and Store Instructions The following prioritized list of exceptions may occur as a result of the attempted execution of any defined *Floating-Point Load* and *Store* instruction. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) - 5. Floating-Point Unavailable - 6. Program (Unimplemented Operation) - 7. Data TLB Error - 8. Data Storage (all types) - 9. Alianment - Debug (Data Address Compare, Data Value Compare) - 11. Debug (Instruction Complete) If the instruction is causing both a Debug (Instruction Address Compare) and a Debug (Data Address Compare) or Debug (Data Value Compare), and is not causing any of the exceptions listed in items 2-9, it is permissible for both exceptions to be generated and recorded in the DBSR. A single Debug interrupt will result. # 5.9.1.2 Exception Priorities for Other Defined Load and Store Instructions and Defined Cache Management Instructions The following prioritized list of exceptions may occur as a result of the attempted execution of any other defined *Load* or *Store* instruction, or defined *Cache Management* instruction. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) - 5. Program (Privileged Instruction) (dcbi only) - 6. Program (Unimplemented Operation) - 7. Data TLB Error - 8. Data Storage (all types) - 9. Alignment - Debug (Data Address Compare, Data Value Compare) - 11. Debug (Instruction Complete) If the instruction is causing both a Debug (Instruction Address Compare) and a Debug (Data Address Compare) or Debug (Data Value Compare), and is not causing any of the exceptions listed in items 2-9, it is permissible for both exceptions to be generated and recorded in the DBSR. A single Debug interrupt will result. # 5.9.1.3 Exception Priorities for Other Defined Floating-Point Instructions The following prioritized list of exceptions may occur as a result of the attempted execution of any defined floating-point instruction other than a load or store. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) - 5. Floating-Point Unavailable - 6. Program (Unimplemented Operation) - 7. Program (Floating-point Enabled) - 8. Debug (Instruction Complete) # 5.9.1.4 Exception Priorities for Defined Privileged Instructions The following prioritized list of exceptions may occur as a result of the attempted execution of any defined privileged instruction, except *dcbi*, *rfi*, and *rfci* instructions. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) - 5. Program (Privileged Instruction) - 6. Program (Unimplemented Operation) - 7. Debug (Instruction Complete) For *mtmsr*, *mtspr* (DBCR0, DBCR1, DBCR2), *mtspr* (TCR), and *mtspr* (TSR), if they are not causing Debug (Instruction Address Compare) nor Program (Privileged Instruction) exceptions, it is possible that they are simultaneously enabling (via mask bits) multiple existing exceptions (and at the same time possibly causing a Debug (Instruction Complete) exception). When this occurs, the interrupts will be handled in the order defined by Section 5.8.2 on page 689. # 5.9.1.5 Exception Priorities for Defined Trap Instructions The following prioritized list of exceptions may occur as a result of the attempted execution of a defined *Trap* instruction. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) - 5. Program (Unimplemented Operation) - 6. Debug (Trap) - 7. Program (Trap) - 8. Debug (Instruction Complete) If the instruction is causing both a Debug (Instruction Address Compare) and a Debug (Trap), and is not causing any of the exceptions listed in items 2-5, it is permissible for both exceptions to be generated and recorded in the DBSR. A single Debug interrupt will result. # 5.9.1.6 Exception Priorities for Defined System Call Instruction The following prioritized list of exceptions may occur as a result of the attempted execution of a defined *System Call* instruction. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) - 5. Program (Unimplemented Operation) - 6. System Call - 7. Debug (Instruction Complete) # 5.9.1.7 Exception Priorities for Defined Branch Instructions The following prioritized list of exceptions may occur as a result of the attempted execution of any defined branch instruction. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) - 5. Program (Unimplemented Operation) - 6. Debug (Branch Taken) - 7. Debug (Instruction Complete) If the instruction is causing both a Debug (Instruction Address Compare) and a Debug (Branch Taken), and is not causing any of the exceptions listed in items 2-5, it is permissible for both exceptions to be generated and recorded in the DBSR. A single Debug interrupt will result. # 5.9.1.8 Exception Priorities for Defined Return From Interrupt Instructions The following prioritized list of exceptions may occur as a result of the attempted execution of an *rfi*, *rfci*, *rfmci*, *rfdi* [Category:Embedded.Enhanced Debug] instruction. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) - 5. Program (Privileged Instruction) - 6. Program (Unimplemented Operation) - 7. Debug (Return From Interrupt) - 8. Debug (Instruction Complete) If the *rfi* or *rfci*, *rfmci*, or *rfdi* [Category: Embedded.Enhanced Debug] instruction is causing both a Debug (Instruction Address Compare) and a Debug (Return From Interrupt), and is not causing any of the exceptions listed in items 2-5, it is permissible for both exceptions to be generated and recorded in the DBSR. A single Debug interrupt will result. # 5.9.1.9 Exception Priorities for Other Defined Instructions The following prioritized list of exceptions may occur as a result of the attempted execution of all other instructions not listed above. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) - 5. Program (Unimplemented Operation) - 6. Debug (Instruction Complete) ## 5.9.2 Exception Priorities for Reserved Instructions The following prioritized list of exceptions may occur as a result of the attempted execution of any reserved instruction. - 1. Debug (Instruction Address Compare) - 2. Instruction TLB Error - 3. Instruction Storage Interrupt (all types) - 4. Program (Illegal Instruction) # **Chapter 6. Reset and Initialization** | 6.1 | Background | 693 | 6.4 | Software Initialization Requirements | |-----|-----------------------------|-----|-----|--------------------------------------| | 6.2 | Reset Mechanisms | 693 | 694 | · | | 6.3 | Processor State After Reset | 693 | | | ## 6.1 Background This chapter describes the requirements for processor reset. This includes both the means of causing reset, and the specific initialization that is required to be performed automatically by the processor hardware. This chapter also provides an overview of the operations that should be performed by initialization software, in order to fully initialize the processor. In general, the specific actions taken by a processor upon reset are implementation-dependent. Also, it is the responsibility of system initialization software to initialize the majority of processor and system resources after reset. Implementations are required to provide a minimum processor initialization such that this system software may be fetched and executed, thereby accomplishing the rest of system initialization. ## 6.2 Reset Mechanisms This specification defines two processor mechanisms for internally invoking a reset operation using either the Watchdog Timer (see Section 7.7 on page 701) or the Debug facilities using DBCR0<sub>RST</sub> (see Section 8.5.1.1 on page 711). In addition, implementations will typically provide additional means for invoking a reset operation, via an external mechanism such as a signal pin which when activated will cause the processor to reset. ## 6.3 Processor State After Reset The initial processor state is controlled by the register contents after reset. In general, the contents of most registers are undefined after reset. The processor hardware is only guaranteed to initialize those registers (or specific bits in registers) which must be initialized in order for software to be able to reliably perform the rest of system initialization. The Machine State Register and Processor Version Register and a TLB entry are updated as follows: ### **Machine State Register** | Bit | Setting | Comments | |------|---------|----------------------------------------------------| | CM | 0 | Computation Mode (set to 32-bit mode) | | ICM | 0 | Interrupt Computation Mode (set to 32-bit) | | UCLE | 0 | User Cache Locking Enable | | SPV | 0 | SPE/Embedded Floating-Point/<br>Vector Unavailable | | WE | 0 | Wait State disabled | | CE | 0 | Critical Input interrupts disabled | | DE | 0 | Debug interrupts disabled | | EE | 0 | External Input interrupts disabled | | PR | 0 | Supervisor mode | | FP | 0 | FP unavailable | | ME | 0 | Machine Check interrupts disabled | | FE0 | 0 | FP exception type Program inter-<br>rupts disabled | | FE1 | 0 | FP exception type Program inter-<br>rupts disabled | | IS | 0 | Instruction Address Space 0 | | DS | 0 | Data Address Space 0 | | PMM | 0 | Performance Monitor Mark | Figure 17. Machine State Register Initial Values #### **Processor Version Register** Implementation-Dependent. (This register is read-only, and contains a value which identifies the specific implementation) #### TLB entry A TLB entry (which entry is implementation-dependent) is initialized in an implementation-dependent manner that maps the last 4KB page in the implemented effective storage address space, with the following field settings: | | | T- | |-------|--------------|-------------------------------------------------------------| | Field | Setting | Comments | | EPN | see<br>below | Represents the last 4K page in effective address space | | RPN | see<br>below | Represents the last 4K page in physical address space | | TS | 0 | translation address space 0 | | SIZE | 0b0001 | 4KB page size | | W | ? | implementation-dependent value | | I | ? | implementation-dependent value | | М | ? | implementation-dependent value | | G | ? | implementation-dependent value | | E | ? | implementation-dependent value | | U0 | ? | implementation-dependent value | | U1 | ? | implementation-dependent value | | U2 | ? | implementation-dependent value | | U3 | ? | implementation-dependent value | | TID | ? | implementation-dependent value, but page must be accessible | | UX | ? | implementation-dependent value | | UR | ? | implementation-dependent value | | UW | ? | implementation-dependent value | | SX | 1 | page is execute accessible in supervisor mode | | SR | 1 | page is read accessible in supervisor mode | | SW | 1 | page is write accessible in supervisor mode | | VLE | ? | implementation-dependent value | | ACM | ? | implementation-dependent value | Figure 18. TLB Initial Values The initial settings of EPN and RPN are dependent upon the number of bits implemented in the EPN and RPN fields and the minimum page size supported by the implementation. For example, an implementation that allows 1KB pages and 32 bits of effective address would implement a 22 bit EPN and set the initial value of the boot entry to 2<sup>22</sup>-4 (0x3FFC) while an implementation that supports only 4K pages as the smallest size and 32 bits of effective address would implement a 20 bit EPN and set the initial value of the boot entry to 2<sup>20</sup>-1 (0xFFFF). Instruction execution begins at the last word address of the page mapped by the boot TLB entry. Note that this address is different from the PowerPC Architecture System Reset interrupt vector. An implementation may provide additional methods for initializing the TLB entry used for initial boot by providing an implementation-dependent RPN, or initializing other TLB entries. # 6.4 Software Initialization Requirements When reset occurs, the processor is initialized to a minimum configuration to start executing initialization code. Initialization code is necessary to complete the processor and system configuration. The initialization code described in this section is the minimum recommended for configuring the processor to run application code. Initialization code should configure the following processor resources: - Invalidate the instruction cache and data cache (implementation-dependent). - Initialize system memory as required by the operating system or application code. - Initialize the Interrupt Vector Prefix Register and Interrupt Vector Offset Register. - Initialize other processor registers as needed by the system. - Initialize off-chip system facilities. - Dispatch the operating system or application code. ### **Chapter 7. Timer Facilities** | 7.1 Overview | · · · · · · · · · · · · · · · · · · · | |--------------------------------------|---------------------------------------| | 7.2.1 Writing the Time Base 696 | | | 7.3 Decrementer | 7.5.1 Timer Status Register700 | | 7.3.1 Writing and Reading the Decre- | 7.6 Fixed-Interval Timer 700 | | menter | 7.7 Watchdog Timer 701 | | 7.3.2 Decrementer Events 697 | 7.8 Freezing the Timer Facilities 702 | #### 7.1 Overview The Time Base, Decrementer, Fixed-interval Timer, and Watchdog Timer provide timing functions for the system. The remainder of this section describes these registers and related facilities. ### 7.2 Time Base (TB) The Time Base (TB) is a 64-bit register (see Figure 19) containing a 64-bit unsigned integer that is incremented periodically. Each increment adds 1 to the low-order bit (bit 63). The frequency at which the integer is updated is implementation-dependent. | Field | Description | |-------|----------------------------| | TBU | Upper 32 bits of Time Base | | TRI | Lower 32 hits of Time Base | #### Figure 19. Time Base The Time Base bits 0:59 increment until their value becomes $0xFFF\_FFFF\_FFFF\_FFFF$ ( $2^{59}$ - 1), at the next increment their value becomes $0x000\_0000\_0000\_0000$ . There is no interrupt or other indication when this occurs. Time base bits 60:63 may increment at a variable rate. When the value of bit 59 changes, bits 60:63 are set to zero; if bits 60:63 increment to 0xF before the value of bit 59 changes, they remain at 0xF until the value of bit 59 changes. The period of the Time Base depends on the driving frequency. As an order of magnitude example, suppose that the CPU clock is 1 GHz and that the Time Base is driven by this frequency divided by 32. Then the period of the Time Base would be $$T_{\text{TB}} = \frac{2^{64} \times 32}{1 \text{ GHz}} = 5.90 \times 10^{11} \text{ seconds}$$ which is approximately 18,700 years. The Time Base is implemented such that: - Loading a GPR from the Time Base has no effect on the accuracy of the Time Base. - Copying the contents of a GPR to the Time Base replaces the contents of the Time Base with the contents of the GPR. The Power ISA does not specify a relationship between the frequency at which the Time Base is updated and other frequencies, such as the CPU clock or bus clock in a Power ISA system. The Time Base update frequency is not required to be constant. What *is* required, so that system software can keep time of day and operate interval timers, is one of the following. - The system provides an (implementation-dependent) interrupt to software whenever the update frequency of the Time Base bits 0:59 changes, and a means to determine what the current update frequency is. - The update frequency of the Time Base bits 0:59 is under the control of the system software. Implementations must provide a means for either preventing the Time Base from incrementing or preventing it from being read in user mode (MSR<sub>PR</sub>=1). If the means is under software control, it must be privileged. There must be a method for getting all processors' Time Bases to start incrementing with values that are identical or almost identical in all processors. #### Programming Note If software initializes the Time Base on power-on to some reasonable value and the update frequency of the Time Base is constant, the Time Base can be used as a source of values that increase at a constant rate, such as for time stamps in trace entries. Even if the update frequency is not constant, values read from the Time Base are monotonically increasing (except when the Time Base wraps from 2<sup>64</sup>-1 to 0). If a trace entry is recorded each time the update frequency changes, the sequence of Time Base values can be post-processed to become actual time values. Successive readings of the Time Base may return identical values. See the description of the Time Base in Book II, for ways to compute time of day in POSIX format from the Time Base. ### 7.2.1 Writing the Time Base Writing the Time Base is privileged. Reading the Time Base is not privileged; it is discussed in Book II. It is not possible to write the entire 64-bit Time Base using a single instruction. The *mttbl* and *mttbu* extended mnemonics write the lower and upper halves of the Time Base (TBL and TBU), respectively, preserving the other half. These are extended mnemonics for the *mtspr* instruction; see Appendix B, "Assembler Extended Mnemonics" on page 733. The Time Base can be written by a sequence such as: ``` lwz Rx,upper # load 64-bit value for lwz Ry,lower # TB into Rx and Ry li Rz,0 mttbl Rz # set TBL to 0 mttbu Rx # set TBU mttbl Ry # set TBL ``` Provided that no interrupts occur while the last three instructions are being executed, loading 0 into TBL prevents the possibility of a carry from TBL to TBU while the Time Base is being initialized. #### Programming Note - The instructions for writing the Time Base are mode-independent. Thus code written to set the Time Base will work correctly in either 64-bit or 32-bit mode. ### 7.3 Decrementer The Decrementer (DEC) is a 32-bit decrementing counter that provides a mechanism for causing a Decrementer interrupt after a programmable delay. The contents of the Decrementer are treated as a signed integer. Figure 20. Decrementer Decrementer bits 32:59 count down until their value becomes 0x000\_0000, at the next increment their value becomes 0xFFF\_FFFF. Decrementer bits 60:63 may decrement at a variable rate. When the value of bit 59 changes, bits 60:63 are set to 0xF; if bits 60:63 decrement to 0x0 before the value of bit 59 changes, they remain at 0x0 until the value of bit 59 changes. The Decrementer is driven by the same frequency as the Time Base. The period of the Decrementer will depend on the driving frequency, but if the same values are used as given above for the Time Base (see Section 7.2), and if the Time Base update frequency is constant, the period would be $$T_{\rm DEC} = \frac{2^{32} \times 32}{1 \text{ GHz}} = 137 \text{ seconds.}$$ The Decrementer counts down. The operation of the Decrementer satisfies the following constraints. - The operation of the Time Base and the Decrementer is coherent, i.e., the counters are driven by the same fundamental time base. - Loading a GPR from the Decrementer has no effect on the accuracy of the Time Base. - Copying the contents of a GPR to the Decrementer replaces the contents of the Decrementer with the contents of the GPR. #### **Programming Note** In systems that change the Time Base update frequency for purposes such as power management, the Decrementer input frequency will also change. Software must be aware of this in order to set interval timers. If Decrementer bits 60:63 are used as part of a random number generator, software must account for the fact that these bits are set to 0xF only when bit 59 changes state regardless of whether or not they decremented to 0x0 since they were previously set to 0xF. ## 7.3.1 Writing and Reading the Decrementer The contents of the Decrementer can be read or written using the *mfspr* and *mtspr* instructions, both of which are privileged when they refer to the Decrementer. Using an extended mnemonic (see Appendix B, "Assembler Extended Mnemonics" on page 733), the Decrementer can be written from GPR Rx using: mtdec Rx The Decrementer can be read into GPR Rx using: mfdec Rx Copying the Decrementer to a GPR has no effect on the Decrementer contents or on the interrupt mechanism. #### 7.3.2 Decrementer Events A Decrementer event occurs when a decrement occurs on a Decrementer value of 0x0000\_0001. Upon the occurrence of a Decrementer event, the Decrementer may be reloaded from a 32-bit Decrementer Auto-Reload Register (DECAR). See Section 7.4. Upon the occurrence of a Decrementer event, the Decrementer has the following basic modes of operation. #### Decrement to one and stop on zero If TCR<sub>ARE</sub>=0, TSR<sub>DIS</sub> is set to 1, the value 0x0000\_0000 is then placed into the DEC, and the Decrementer stops decrementing. If enabled by $TCR_{DIE}=1$ and $MSR_{EE}=1$ , a Decrementer interrupt is taken. See Section 5.6.11, "Decrementer Interrupt" on page 680 for details of register behavior caused by the Decrementer interrupt. #### Decrement to one and auto-reload If $TCR_{ARE}$ =1, $TSR_{DIS}$ is set to 1, the contents of the Decrementer Auto-Reload Register is then placed into the DEC, and the Decrementer continues decrementing from the reloaded value. If enabled by $TCR_{DIE}=1$ and $MSR_{EE}=1$ , a Decrementer interrupt is taken. See Section 5.6.11, "Decrementer Interrupt" on page 680 for details of register behavior caused by the Decrementer interrupt. Forcing the Decrementer to 0 using the *mtspr* instruction will not cause a Decrementer exception; however, decrementing which was in progress at the instant of the *mtspr* may cause the exception. To eliminate the Decrementer as a source of exceptions, set TCR<sub>DIE</sub> to 0 (clear the Decrementer Interrupt Enable bit). If it is desired to eliminate all Decrementer activity, the procedure is as follows: - Write 0 to TCR<sub>DIE</sub>. This will prevent Decrementer activity from causing exceptions. - Write 0 to TCR<sub>ARE</sub> to disable the Decrementer auto-reload. - Write 0 to Decrementer. This will halt Decrementer decrementing. While this action will not cause a Decrementer exception to be set in TSR<sub>DIS</sub>, a near simultaneous decrement may have done so. - 4. Write 1 to TSR<sub>DIS</sub>. This action will clear TSR<sub>DIS</sub> to 0 (see Section 7.5.1 on page 700). This will clear any Decrementer exception which may be pending. Because the Decrementer is frozen at zero, no further Decrementer events are possible. If the auto-reload feature is disabled (TCR<sub>ARE</sub>=0), then once the Decrementer decrements to zero, it will stay there until software reloads it using the *mtspr* instruction. On reset, $TCR_{ARE}$ is set to 0. This disables the autoreload feature. ## 7.4 Decrementer Auto-Reload Register The Decrementer Auto-Reload Register is a 32-bit register as shown below. Figure 21. Decrementer Bits of the decrementer auto-reload register are numbered 32 (most-significant bit) to 63 (least-significant bit). The Decrementer Auto-Reload Register is provided to support the auto-reload feature of the Decrementer. See Section 7.3.2 The contents of the Decrementer Auto-Reload Register cannot be read. The contents of bits 32:63 of register RS can be written to the Decrementer Auto-Reload Register using the *mtspr* instruction. ### 7.5 Timer Control Register The Timer Control Register (TCR) is a 32-bit register. Timer Control Register bits are numbered 32 (most-significant bit) to 63 (least-significant bit). The Timer Control Register controls Decrementer (see Section 7.3), Fixed-Interval Timer (see Section 7.6), and Watchdog Timer (see Section 7.7) options. The relationship of the Timer facilities to the TCR and TB is shown in the figure below. Figure 22. Relationships of the Timer Facilities The contents of the Timer Control Register can be read using the *mfspr* instruction. The contents of bits 32:63 of register RS can be written to the Timer Control Register using the *mtspr* instruction. The contents of the TCR are defined below: #### Bit(s) Description 32:33 **Watchdog Timer Period** (WP) (see Section 7.7 on page 701) Specifies one of 4 bit locations of the Time Base used to signal a Watchdog Timer exception on a transition from 0 to 1. The 4 Time Base bits that can be specified to serve as the Watchdog Timer period are implementation-dependent. - 34:35 Watchdog Timer Reset Control (WRC) (see Section 7.7 on page 701) - 00 No Watchdog Timer reset will occur TCR<sub>WRC</sub> resets to 0b00. This field may be set by software, but cannot be cleared by software (except by a software-induced reset) 01-11 Force processor to be reset on second time-out of Watchdog Timer. The exact function of any of these settings is implementation-dependent. The Watchdog Timer Reset Control field is cleared to zero by processor reset. These bits are set only by software. Once a 1 has been written to one of these bits, that bit remains a 1 until a reset occurs. This is to prevent errant code from disabling the Watchdog reset function. - 36 **Watchdog Timer Interrupt Enable** (WIE) (see Section 7.7 on page 701) - 0 Disable Watchdog Timer interrupt - I Enable Watchdog Timer interrupt - 37 **Decrementer Interrupt Enable** (DIE) (see Section 7.3 on page 697) - 0 Disable Decrementer interrupt - 1 Enable Decrementer interrupt - 38:39 *Fixed-Interval Timer Period* (FP) (see Section 7.6 on page 700) Specifies one of 4 bit locations of the Time Base used to signal a Fixed-Interval Timer exception on a transition from 0 to 1. The 4 Time Base bits that can be specified to serve as the Fixed-Interval Timer period are implementation-dependent. 40 **Fixed-Interval Timer Interrupt Enable** (FIE) (see Section 7.6 on page 700 - 0 Disable Fixed-Interval Timer interrupt - 1 Enable Fixed-Interval Timer interrupt #### 41 **Auto-Reload Enable** (ARE) 0 Disable auto-reload of the Decrementer Decrementer exception is presented (i.e. $TSR_{DIS}$ is set to 1) when the Decrementer is decremented from a value of $0x0000\_0001$ . The next value placed in the Decrementer is the value $0x0000\_0000$ . The Decrementer then stops decrementing. If $MSR_{EE}$ =1, $TCR_{DIE}$ =1, and $TSR_{DIS}$ =1, a Decrementer interrupt is taken. Software must reset $TSR_{DIS}$ . 1 Enable auto-reload of the Decrementer Decrementer exception is presented (i.e. $TSR_{DIS}$ is set to 1) when the Decrementer is decremented from a value of $0x0000\_0001$ . The contents of the Decrementer Auto-Reload Register is placed in the Decrementer. The Decrementer resumes decrementing. If $MSR_{EE}=1$ , $TCR_{DIE}=1$ , and $TSR_{DIS}=1$ , a Decrementer interrupt is taken. Software must reset $TSR_{DIS}$ . - 42 Implementation-dependent - 43:63 Reserved ### 7.5.1 Timer Status Register The Timer Status Register (TSR) is a 32-bit register. Timer Status Register bits are numbered 32 (most-significant bit) to 63 (least-significant bit). The Timer Status Register contains status on timer events and the most recent Watchdog Timer-initiated processor reset. The Timer Status Register is set via hardware, and read and cleared via software. The contents of the Timer Status Register can be read using the *mfspr* instruction. Bits in the Timer Status Register can be cleared using the *mtspr* instruction. Clearing is done by writing bits 32:63 of a General Purpose Register to the Timer Status Register with a 1 in any bit position that is to be cleared and 0 in all other bit positions. The write-data to the Timer Status Register is not direct data, but a mask. A 1 causes the bit to be cleared, and a 0 has no effect. The contents of the TSR are defined below: #### Bit(s) Description - 32 **Enable Next Watchdog Timer** (ENW) (see Section 7.7 on page 701) - O Action on next Watchdog Timer time-out is to set TSRENW - Action on next Watchdog Timer time-out is governed by TSRWIS - 33 Watchdog Timer Interrupt Status (WIS) (see Section 7.7 on page 701) - A Watchdog Timer event has not occurred. - 1 A Watchdog Timer event has occurred. When MSR<sub>CE</sub>=1 and TCR<sub>WIE</sub>=1, a Watchdog Timer interrupt is taken. - 34:35 **Watchdog Timer Reset Status** (WRS) (see Section 7.7 on page 701) These two bits are set to one of three values when a reset is caused by the Watchdog Timer. These bits are undefined at power-up. - 00 No Watchdog Timer reset has occurred. - 01 Implementation-dependent reset information. - 10 Implementation-dependent reset information - 11 Implementation-dependent reset information. - 36 **Decrementer Interrupt Status** (DIS) (see Section 7.3.2 on page 697) - 0 A Decrementer event has not occurred. - 1 A Decrementer event has occurred. When MSR<sub>EE</sub>=1 and TCR<sub>DIE</sub>=1, a Decrementer interrupt is taken. - 37 Fixed-Interval Timer Interrupt Status (FIS) (see Section 7.6 on page 700) - A Fixed-Interval Timer event has not occurred. - 1 A Fixed-Interval Timer event has occurred. When MSR<sub>EE</sub>=1 and TCR<sub>FIE</sub>=1, a Fixed-Interval Timer interrupt is taken. 38:63 Reserved ### 7.6 Fixed-Interval Timer The Fixed-Interval Timer (FIT) is a mechanism for providing timer interrupts with a repeatable period, to facilitate system maintenance. It is similar in function to an auto-reload Decrementer, except that there are fewer selections of interrupt period available. The Fixed-Interval Timer exception occurs on 0 to 1 transitions of a selected bit from the Time Base (see Section 7.5). The Fixed-Interval Timer exception is logged by TSR-FIS. A Fixed-Interval Timer interrupt will occur if TCR<sub>FIE</sub> and MSR<sub>EE</sub> are enabled. See Section 5.6.12 on page 680 for details of register behavior caused by the Fixed-Interval Timer interrupt. Note that a Fixed-Interval Timer exception will also occur if the selected Time Base bit transitions from 0 to 1 due to an *mtspr* instruction that writes a 1 to the bit when its previous value was 0. ### 7.7 Watchdog Timer The Watchdog Timer is a facility intended to aid system recovery from faulty software or hardware. Watchdog time-outs occur on 0 to 1 transitions of selected bits from the Time Base (Section 7.5). When a Watchdog Timer time-out occurs while Watchdog Timer Interrupt Status is clear (TSR $_{WIS} = 0$ ) and the next Watchdog Time-out is enabled (TSR $_{ENW} = 1$ ), a Watchdog Timer exception is generated and logged by setting TSR $_{WIS}$ to 1. This is referred to as a Watchdog Timer First Time Out. A Watchdog Timer interrupt will occur if enabled by TCR $_{WIE}$ and MSR $_{CE}$ . See Section 5.6.13 on page 680 for details of register behavior caused by the Watchdog Timer interrupt. The purpose of the Watchdog Timer First time-out is to give an indication that there may be problem and give the system a chance to perform corrective action or capture a failure before a reset occurs from the Watchdog Timer Second time-out as explained further below. Note that a Watchdog Timer exception will also occur if the selected Time Base bit transitions from 0 to 1 due to an *mtspr* instruction that writes a 1 to the bit when its previous value was 0. When a Watchdog Timer time-out occurs while ${\sf TSR}_{\sf WIS} = 1$ and ${\sf TSR}_{\sf ENW} = 1$ , a processor reset occurs if it is enabled by a non-zero value of the Watchdog Reset Control field in the Timer Control Register (TCR-WRC). This is referred to as a Watchdog Timer Second Time Out. The assumption is that ${\sf TSR}_{\sf WIS}$ was not cleared because the processor was unable to execute the Watchdog Timer interrupt handler, leaving reset as the only available means to restart the system. Note that once ${\sf TCR}_{\sf WRC}$ has been set to a non-zero value, it cannot be reset by software; this feature prevents errant software from disabling the Watchdog Timer reset capability. A more complete view of Watchdog Timer behavior is afforded by Figure 23 and Table 24, which describe the Watchdog Timer state machine and Watchdog Timer controls. The numbers in parentheses in the figure refer to the discussion of modes of operation which follow the table. Figure 23. Watchdog State Machine | Enable<br>Next WDT<br>(TSR <sub>ENW</sub> ) | WDT Status<br>(TSR <sub>WIS</sub> ) | Action when timer interval expires | |---------------------------------------------|-------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 0 | 0 | Set Enable Next Watchdog Timer (TSR <sub>ENW</sub> =1). | | 0 | 1 | Set Enable Next Watchdog Timer (TSR <sub>ENW</sub> =1). | | 1 | 0 | Set Watchdog Timer interrupt status bit (TSR <sub>WIS</sub> =1). If Watchdog Timer interrupt is enabled (TCR <sub>WIE</sub> =1 and MSR <sub>CE</sub> =1), then interrupt. | | 1 | 1 | Cause Watchdog Timer reset action specified by TCR <sub>WRC</sub> . Reset will copy pre-reset TCR <sub>WRC</sub> into TSR <sub>WRS</sub> , then clear TCR <sub>WRC</sub> . | Figure 24. Watchdog Timer Controls The controls described in the above table imply three different modes of operation that a programmer might select for the Watchdog Timer. Each of these modes assumes that $TCR_{WRC}$ has been set to allow processor reset by the Watchdog facility: - Always take the Watchdog Timer interrupt when pending, and never attempt to prevent its occurrence. In this mode, the Watchdog Timer interrupt caused by a first time-out is used to clear TSR<sub>WIS</sub> so a second time-out never occurs. TSR<sub>ENW</sub> is not cleared, thereby allowing the next time-out to cause another interrupt. - 2. Always take the Watchdog Timer interrupt when pending, but avoid when possible. In this mode a recurring code loop of reliable duration (or perhaps a periodic interrupt handler such as the Fixed-Interval Timer interrupt handler) is used to repeatedly clear TSR<sub>ENW</sub> such that a first time-out exception is avoided, and thus no Watchdog Timer interrupt occurs. Once TSR<sub>FNW</sub> has been cleared, software has between one and two full Watchdog periods before a Watchdog exception will be posted in TSRWIS. If this occurs before the software is able to clear TSR<sub>ENW</sub> again, a Watchdog Timer interrupt will occur. In this case, the Watchdog Timer interrupt handler will then clear both TSR<sub>ENW</sub> and TSR<sub>WIS</sub>, in order to (hopefully) avoid the next Watchdog Timer interrupt. - 3. Never take the Watchdog Timer interrupt. In this mode, Watchdog Timer interrupts are disabled (via TCR<sub>WIE</sub>=0), and the system depends upon a recurring code loop of reliable duration (or perhaps a periodic interrupt handler such as the Fixed-Interval Timer interrupt handler) to repeatedly clear TSR<sub>WIS</sub> such that a second time-out is avoided, and thus no reset occurs. TSR<sub>ENW</sub> is not cleared, thereby allowing the next time-out to set TSR<sub>WIS</sub> again. The recurring code loop must have a period which is less than one Watchdog Timer period in order to guarantee that a Watchdog Timer reset will not occur. ### 7.8 Freezing the Timer Facilities The debug mechanism provides a means of temporarily freezing the timers upon a debug event. Specifically, the Time Base and Decrementer can be frozen and prevented from incrementing/decrementing, respectively, whenever a debug event is set in the Debug Status Register. Note that this also freezes the FIT and Watchdog timer. This allows a debugger to simulate the appearance of 'real time', even though the application has been temporarily 'halted' to service the debug event. See the description of bit 63 of the Debug Control Register 0 (Freeze Timers on Debug Event or DBCR0<sub>FT</sub>) in Section 8.5.1.1 on page 711. ### **Chapter 8. Debug Facilities** | 8.2 I<br>8.3 E | Overview | 8.4.10 Critica<br>Event [Categ<br>Debug]<br>8.5 Debug R | |----------------|--------------------------------------|---------------------------------------------------------| | | Debug Events 704 | 8.5.1 Debug | | | Instruction Address Compare Debug | 8.5.1.1 Debu | | | nt | 711 | | | Data Address Compare Debug | 8.5.1.2 Debu | | | nt | 712 | | | Trap Debug Event 708 | 8.5.1.3 Debu | | | Branch Taken Debug Event 708 | 714 | | | Instruction Complete Debug Event . | 8.5.2 Debug | | 709 | monutation complete zerog zrent : | 8.5.3 Instruc | | | Interrupt Taken Debug Event 709 | ters | | | 1 Causes of Interrupt Taken Debug | 8.5.4 Data A | | | nts | 716 | | | 2 Interrupt Taken Debug Event | 8.5.5 Data V | | | ription 709 | 8.6 Debugge | | | Return Debug Event 710 | [Category: E | | | Unconditional Debug Event 710 | 718 | | | Critical Interrupt Taken Debug Event | - | | | egory: Embedded.Enhanced Debug]. | | | 710 | . g . y | | | | | | | 8.4.10 Critical Interrupt Return Debug Event [Category: Embedded.Enhanced | |---------------------------------------------------------------------------| | Debug]711 | | 8.5 Debug Registers 711 | | 8.5.1 Debug Control Registers 711 | | 8.5.1.1 Debug Control Register 0 (DBCR0) | | 711 | | 8.5.1.2 Debug Control Register 1 (DBCR1) | | 712 | | 8.5.1.3 Debug Control Register 2 (DBCR2) | | 714 | | 8.5.2 Debug Status Register715 | | 8.5.3 Instruction Address Compare Regis- | | ters716 | | 8.5.4 Data Address Compare Registers | | 716 | | 8.5.5 Data Value Compare Registers . 717 | | 8.6 Debugger Notify Halt Instruction | | [Category: Embedded.Enhanced Debug] . | | 718 | ### 8.1 Overview Processors provide debug facilities to enable hardware and software debug functions, such as instruction and data breakpoints and program single stepping. The debug facilities consist of a set of Debug Control Registers (DBCR0, DBCR1, and DBCR2) (see Section 8.5.1 on page 711), a set of Address and Data Value Compare Registers (IAC1, IAC2, IAC3, IAC4, DAC1, DAC2, DVC1, and DVC2), (see Section 8.4.3, Section 8.4.4, and Section 8.4.5), a Debug Status Register (DBSR) (see Section 8.5.2) for enabling and recording various kinds of debug events, and a special Debug interrupt type built into the interrupt mechanism (see Section 5.6.16). The debug facilities also provide a mechanism for software-controlled processor reset, and for controlling the operation of the timers in a debug environment. The *mfspr* and *mtspr* instructions (see Section 3.4.1) provide access to the registers of the debug facilities. In addition to the facilities described here, implementations will typically include debug facilities, modes, and access mechanisms which are implementation-specific. For example, implementations will typically provide access to the debug facilities via a dedicated interface such as the IEEE 1149.1 Test Access Port (JTAG). ### 8.2 Internal Debug Mode Debug events include such things as instruction and data breakpoints. These debug events cause status bits to be set in the Debug Status Register. The existence of a set bit in the Debug Status Register is considered a Debug exception. Debug exceptions, if enabled, will cause Debug interrupts. There are two different mechanisms that control whether Debug interrupts are enabled. The first is the MSR<sub>DE</sub> bit, and this bit must be set to 1 to enable Debug interrupts. The second mechanism is an enable bit in the Debug Control Register 0 (DBCR0). This bit is the Internal Debug Mode bit (DBCR0<sub>IDM</sub>), and it must also be set to 1 to enable Debug interrupts. When DBCR0<sub>IDM</sub>=1, the processor is in Internal Debug Mode. In this mode, debug events will (if also enabled by MSR<sub>DE</sub>) cause Debug interrupts. Software at the Debug interrupt vector location will thus be given control upon the occurrence of a debug event, and can access (via the normal instructions) all architected processor resources. In this fashion, debug monitor software can control the processor and gather status, and interact with debugging hardware connected to the processor. When the processor is not in Internal Debug Mode (DBCR0<sub>IDM</sub>=0), debug events may still occur and be recorded in the Debug Status Register. These exceptions may be monitored via software by reading the Debug Status Register (using *mfspr*), or may eventually cause a Debug interrupt if later enabled by setting DBCR0<sub>IDM</sub>=1 (and MSR<sub>DE</sub>=1). Processor behavior when debug events occur while DBCR0<sub>IDM</sub>=0 is implementation-dependent. # 8.3 External Debug Mode [Category: Embedded.Enhanced Debug] The External Debug Mode is a mode in which facilities external to the processor can access processor resources and control execution. These facilities are defined as the *external debug* facilities and are not defined here, however some instructions and registers share internal and external debug roles and are briefly described as necessary. A *dnh* instruction is provided to stop instruction fetching and execution and allow the processor to be managed by an external debug facility. After the *dnh* instruction is executed, instructions are not fetched, interrupts are not taken, and the processor does not execute instructions. ### 8.4 Debug Events Debug events are used to cause Debug exceptions to be recorded in the Debug Status Register (see Section 8.5.2). In order for a debug event to be enabled to set a Debug Status Register bit and thereby cause a Debug exception, the specific event type must be enabled by a corresponding bit or bits in the Debug Control Register DBCR0 (see Section 8.5.1.1), DBCR1 (see Section 8.5.1.2), or DBCR2 (see Section 8.5.1.3), in most cases; the Unconditional Debug Event (UDE) is an exception to this rule. Once a Debug Status Register bit is set, if Debug interrupts are enabled by MSR<sub>DE</sub>, a Debug interrupt will be generated. Certain debug events are not allowed to occur when $MSR_{DE}=0$ . In such situations, no Debug exception occurs and thus no Debug Status Register bit is set. Other debug events may cause Debug exceptions and set Debug Status Register bits regardless of the state of $MSR_{DE}$ . The associated Debug interrupts that result from such Debug exceptions will be delayed until $MSR_{DE}=1$ , provided the exceptions have not been cleared from the Debug Status Register in the meantime. Any time that a Debug Status Register bit is allowed to be set while MSR<sub>DE</sub>=0, a special Debug Status Register bit, Imprecise Debug Event (DBSR<sub>IDE</sub>), will also be set. DBSR<sub>IDE</sub> indicates that the associated Debug exception bit in the Debug Status Register was set while Debug interrupts were disabled via the MMSR<sub>DE</sub> bit. Debug interrupt handler software can use this bit to determine whether the address recorded in CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] should be interpreted as the address associated with the instruction causing the Debug exception, or simply the address of the instruction after the one which set the MSR<sub>DE</sub> bit, thereby enabling the delayed Debug interrupt. Debug interrupts are ordered with respect to other interrupt types (see Section 7.8 on page 179). Debug exceptions are prioritized with respect to other exceptions (see Section 7.9 on page 183). There are eight types of debug events defined: - Instruction Address Compare debug events - 2. Data Address Compare debug events - 3. Trap debug events - 4. Branch Taken debug events - 5. Instruction Complete debug events - 6. Interrupt Taken debug events - 7. Return debug events - 8. Unconditional debug events #### **Programming Note** There are two classes of debug exception types: Type 1: exception before instruction Type 2: exception after instruction Almost all debug exceptions fall into the first type. That is, they all take the interrupt upon encountering an instruction having the exception without updating any architectural state (other than DBSR, CSRR0/DSRR0 [Category: Embedded.Enhanced Debug], CSRR1/DSRR1 [Category: Embedded.Enhanced Debug], MSR) for that instruction. The CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] for this type of exception points to the instruction that encountered the exception. This includes IAC, DAC, branch taken, etc. The only exception which fall into the second type is the instruction complete debug exception. This exception is taken upon completing and updating one instruction and then pointing CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] to the next instruction to execute. To make forward progress for any Type 1 debug exception one does the following: - Software sets up Type 1 exceptions (e.g. branch taken debug exceptions) and then returns to normal program operation - Hardware takes Debug interrupt upon the first branch taken Debug exception, pointing to the branch with CSRR0/DSRR0 [Category: Embedded.Enhanced Debug]. - Software, in the debug handler, sees the branch taken exception type, does whatever logging/anal- - ysis it wants to, then clears all debug event enables in the DBCR except for the instruction complete debug event enable. - Software does an *rfci* or *rfdi* [Category: Embedded.Enhanced Debug]. - Hardware would execute and complete one instruction (the branch taken in this case), and then take a Debug interrupt with CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] pointing to the target of the branch. - Software would see the instruction complete interrupt type. It clears the instruction complete event enable, then enables the branch taken interrupt event again. - Software does an *rfci* or *rfdi* [Category: Embedded.Enhanced Debug]. - 8. Hardware resumes on the target of the taken branch and continues until another taken branch, in which case we end up at step 2 again. This, at first, seems like a double tax (i.e. 2 debug interrupts for every instance of a Type 1 exception), but there doesn't seem like any other clean way to make forward progress on Type 1 debug exceptions. The only other way to avoid the double tax is to have the debug handler routine actually emulate the instruction pointed to for the Type 1 exceptions, determine the next instruction that would have been executed by the interrupted program flow and load the CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] with that address and do an *rfci/rfdi* [Category: Embedded.Enhanced Debug]; this is probably not faster. ## 8.4.1 Instruction Address Compare Debug Event One or more Instruction Address Compare debug events (IAC1, IAC2, IAC3 or IAC4) occur if they are enabled and execution is attempted of an instruction at an address that meets the criteria specified in the DBCR0, DBCR1, IAC1, IAC2, IAC3, and IAC4 Registers. #### Instruction Address Compare User/ Supervisor Mode DBCR1<sub>IAC1US</sub> specifies whether IAC1 debug events can occur in user mode or supervisor mode, or both. DBCR1<sub>IAC2US</sub> specifies whether IAC2 debug events can occur in user mode or supervisor mode, or both. DBCR1<sub>IAC3US</sub> specifies whether IAC3 debug events can occur in user mode or supervisor mode, or both. DBCR1<sub>IAC4US</sub> specifies whether IAC4 debug events can occur in user mode or supervisor mode, or both. #### Effective/Real Address Mode DBCR1<sub>IAC1ER</sub> specifies whether effective addresses, real addresses, effective addresses and MSR<sub>IS</sub>=0, or effective addresses and MSR<sub>IS</sub>=1 are used in determining an address match on IAC1 debug events. DBCR1<sub>IAC2ER</sub> specifies whether effective addresses, real addresses, effective addresses and MSR<sub>IS</sub>=0, or effective addresses and $MSR_{IS}$ =1 are used in determining an address match on IAC2 debug events. DBCR1 $_{\rm IAC3ER}$ specifies whether effective addresses, real addresses, effective addresses and MSR $_{\rm IS}$ =0, or effective addresses and MSR $_{\rm IS}$ =1 are used in determining an address match on IAC3 debug events. DBCR1 $_{\rm IAC4ER}$ specifies whether effective addresses, real addresses, effective addresses and MSR $_{\rm IS}$ =0, or effective addresses and MSR $_{\rm IS}$ =1 are used in determining an address match on IAC4 debug events. #### Instruction Address Compare Mode DBCR1<sub>IAC12M</sub> specifies whether all or some of the bits of the address of the instruction fetch must match the contents of the IAC1 or IAC2, whether the address must be inside a specific range specified by the IAC1 and IAC2 or outside a specific range specified by the IAC1 and IAC2 for an IAC1 or IAC2 debug event to occur DBCR1<sub>IAC34M</sub> specifies whether all or some of the bits of the address of the instruction fetch must match the contents of the IAC3 Register or IAC4 Register, whether the address must be inside a specific range specified by the IAC3 Register and IAC4 Register or outside a specific range specified by the IAC3 Register and IAC4 Register for an IAC3 or IAC4 debug event to occur. There are four instruction address compare modes. There are four instruction address compare modes. - Exact address compare mode If the address of the instruction fetch is equal to the value in the enabled IAC Register, an instruction address match occurs. For 64-bit implementations, the addresses are masked to compare only bits 32:63 when the processor is executing in 32-bit mode. - Address bit match mode For IAC1 and IAC2 debug events, if the address of the instruction fetch access, ANDed with the contents of the IAC2, are equal to the contents of the IAC1, also ANDed with the contents of the IAC2, an instruction For IAC3 and IAC4 debug events, if the address of the instruction fetch, ANDed with the contents of the IAC4, are equal to the contents of the IAC3, also ANDed with the contents of the IAC4, an instruction address match occurs. For 64-bit implementations, the addresses are masked to compare only bits 32:63 when the processor is executing in 32-bit mode. Inclusive address range compare mode For IAC1 and IAC2 debug events, if the 64-bit address of the instruction fetch is greater than or equal to the contents of the IAC1 and less than the contents of the IAC2, an instruction address match occurs. For IAC3 and IAC4 debug events, if the 64-bit address of the instruction fetch is greater than or equal to the contents of the IAC3 and less than the contents of the IAC4, an instruction address match occurs. - For 64-bit implementations, the addresses are masked to compare only bits 32:63 when the processor is executing in 32-bit mode. - Exclusive address range compare mode For IAC1 and IAC2 debug events, if the 64-bit address of the instruction fetch is less than the contents of the IAC1 or greater than or equal to the contents of the IAC2, an instruction address match occurs. For IAC3 and IAC4 debug events, if the 64-bit address of the instruction fetch is less than the contents of the IAC3 or greater than or equal to the contents of the IAC4, an instruction address match occurs. For 64-bit implementations, the addresses are masked to compare only bits 32:63 when the processor is executing in 32-bit mode. See the detailed description of DBCR0 (see Section 8.5.1.1, "Debug Control Register 0 (DBCR0)" on page 711) and DBCR1 (see Section 8.5.1.2, "Debug Control Register 1 (DBCR1)" on page 712) and the modes for detecting IAC1, IAC2, IAC3 and IAC4 debug events. Instruction Address Compare debug events can occur regardless of the setting of MSR<sub>DE</sub> or DBCR0<sub>IDM</sub>. When an Instruction Address Compare debug event occurs, the corresponding $DBSR_{IAC1}$ , $DBSR_{IAC2}$ , $DBSR_{IAC3}$ , or $DBSR_{IAC4}$ bit or bits are set to record the debug exception. If $MSR_{DE}$ =0, $DBSR_{IDE}$ is also set to 1 to record the imprecise debug event. If MSR<sub>DE</sub>=1 (i.e. Debug interrupts are enabled) at the time of the Instruction Address Compare debug exception, a Debug interrupt will occur immediately (provided there exists no higher priority exception which is enabled to cause an interrupt). The execution of the instruction causing the exception will be suppressed, and CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] will be set to the address of the excepting instruction. If MSR<sub>DE</sub>=0 (i.e. Debug interrupts are disabled) at the time of the Instruction Address Compare debug exception, a Debug interrupt will not occur, and the instruction will complete execution (provided the instruction is not causing some other exception which will generate an enabled interrupt). address match occurs. Later, if the debug exception has not been reset by clearing $\mathsf{DBSR}_{\mathsf{IAC1}},\;\;\mathsf{DBSR}_{\mathsf{IAC2}},\;\;\mathsf{DBSR}_{\mathsf{IAC3}},\;\;\mathsf{and}\;\;\mathsf{DBSR}_{\mathsf{IAC4}},\;\;\mathsf{and}\;\;\mathsf{MSR}_{\mathsf{DE}}\;\mathsf{is}\;\mathsf{set}\;\mathsf{to}\;\mathsf{1},\;\mathsf{a}\;\;\mathsf{delayed}\;\;\mathsf{Debug}\;\;\mathsf{interrupt}\;\;\mathsf{will}\;\;\mathsf{occur.}\;\;\mathsf{In}\;\;\mathsf{this}\;\;\mathsf{case},\;\;\mathsf{CSRR0/DSRR0}\;\;\mathsf{[Category:}\;\;\mathsf{Embedded.Enhanced}\;\;\mathsf{Debug}\;\;\mathsf{will}\;\;\;\mathsf{contain}\;\;\;\mathsf{the}\;\;\;\mathsf{address}\;\;\mathsf{of}\;\;\mathsf{the}\;\;\mathsf{instruction}\;\;\mathsf{after}\;\;\mathsf{the}\;\;\mathsf{one}\;\;\;\mathsf{which}\;\;\mathsf{enabled}\;\;\;\mathsf{the}\;\;\mathsf{Debug}\;\;\mathsf{interrupt}\;\;\mathsf{by}\;\;\mathsf{setting}\;\;\mathsf{MSR}_{\mathsf{DE}}\;\;\mathsf{to}\;\;\mathsf{1}.\;\;\mathsf{Software}\;\;\mathsf{in}\;\;\;\mathsf{the}\;\;\;\mathsf{Debug}\;\;\mathsf{interrupt}\;\;\mathsf{handler}\;\;\mathsf{can}\;\;\mathsf{observe}\;\;\mathsf{DBSR}_{\mathsf{IDE}}\;\;\mathsf{to}\;\;\;\mathsf{determine}\;\;\mathsf{how}\;\;\mathsf{to}\;\;\mathsf{interpret}\;\;\mathsf{the}\;\;\mathsf{value}\;\;\mathsf{in}\;\;\mathsf{CSRR0/DSRR0}\;\;\;\mathsf{[Category:}\;\;\;\mathsf{Embedded.Enhanced}\;\;\;\mathsf{Debug}\;\;\;$ ## 8.4.2 Data Address Compare Debug Event One or more Data Address Compare debug events (DAC1R, DAC1W, DAC2R, DAC2W) occur if they are enabled, execution is attempted of a data storage access instruction, and the type, address, and possibly even the data value of the data storage access meet the criteria specified in the Debug Control Register 0, Debug Control Register 2, and the DAC1, DAC2, DVC1, and DVC2 Registers. ## Data Address Compare Read/Write Enable DBCR0<sub>DAC1</sub> specifies whether DAC1R debug events can occur on read-type data storage accesses and whether DAC1W debug events can occur on write-type data storage accesses. DBCR0<sub>DAC2</sub> specifies whether DAC2R debug events can occur on read-type data storage accesses and whether DAC2W debug events can occur on write-type data storage accesses. Indexed-string instructions (*Iswx*, *stswx*) for which the XER field specifies zero bytes as the length of the string are treated as no-ops, and are not allowed to cause Data Address Compare debug events. All *Load* instructions are considered reads with respect to debug events, while all *Store* instructions are considered writes with respect to debug events. In addition, the *Cache Management* instructions, and certain special cases, are handled as follows. - dcbt, dcbtls, dcbtep, dcbtst, dcbtstls, dcbtstep, icbt, icbtls, icbtep, icbi, icblc, dcblc, and icbiep are all considered reads with respect to debug events. Note that dcbt, dcbtep, dcbtst, dcbtstep, icbt, and icbtep are treated as no-operations when they report Data Storage or Data TLB Miss exceptions, instead of being allowed to cause interrupts. However, these instructions are allowed to cause Debug interrupts, even when they would otherwise have been no-op'ed due to a Data Storage or Data TLB Miss exception. - dcbz, dcbzep, dcbi, dcbf, dcbfep, dcba, dcbst, and dcbstep are all considered writes with respect to debug events. Note that *dcbf*, *dcbfep*, *dcbst*, and *dcbstep* are considered reads with respect to Data Storage exceptions, since they do not actually change the data at a given address. However, since the execution of these instructions may result in write activity on the processor's data bus, they are treated as writes with respect to debug events. ## Data Address Compare User/Supervisor Mode DBCR2<sub>DAC1US</sub> specifies whether DAC1R and DAC1W debug events can occur in user mode or supervisor mode, or both. DBCR2<sub>DAC2US</sub> specifies whether DAC2R and DAC2W debug events can occur in user mode or supervisor mode, or both. #### Effective/Real Address Mode DBCR2<sub>DAC1ER</sub> specifies whether effective addresses, real addresses, effective addresses and MSR<sub>DS</sub>=0, or effective addresses and MSR<sub>DS</sub>=1 are used to in determining an address match on DAC1R and DAC1W debug events. DBCR2 $_{DAC2ER}$ specifies whether effective addresses, real addresses, effective addresses and MSR $_{DS}$ =0, or effective addresses and MSR $_{DS}$ =1 are used to in determining an address match on DAC2R and DAC2W debug events. #### **Data Address Compare Mode** DBCR2<sub>DAC12M</sub> specifies whether all or some of the bits of the address of the data storage access must match the contents of the DAC1 or DAC2, whether the address must be inside a specific range specified by the DAC1 and DAC2 or outside a specific range specified by the DAC1 and DAC2 for a DAC1R, DAC1W, DAC2R or DAC2W debug event to occur. There are four data address compare modes. Exact address compare mode If the 64-bit address of the data storage access is equal to the value in the enabled Data Address Compare Register, a data address match occurs. For 64-bit implementations, the addresses are masked to compare only bits 32:63 when the processor is executing in 32-bit mode. Address bit match mode If the address of the data storage access, ANDed with the contents of the DAC2, are equal to the contents of the DAC1, also ANDed with the contents of the DAC2, a data address match occurs. For 64-bit implementations, the addresses are masked to compare only bits 32:63 when the processor is executing in 32-bit mode. Inclusive address range compare mode If the 64-bit address of the data storage access is greater than or equal to the contents of the DAC1 and less than the contents of the DAC2, a data address match occurs. For 64-bit implementations, the addresses are masked to compare only bits 32:63 when the processor is executing in 32-bit mode. Exclusive address range compare mode If the 64-bit address of the data storage access is less than the contents of the DAC1 or greater than or equal to the contents of the DAC2, a data address match occurs. For 64-bit implementations, the addresses are masked to compare only bits 32:63 when the processor is executing in 32-bit mode. #### **Data Value Compare Mode** DBCR2<sub>DVC1M</sub> and DBCR2<sub>DVC1BE</sub> specify whether and how the data value being accessed by the storage access must match the contents of the DVC1 for a DAC1R or DAC1W debug event to occur. ${\sf DBCR2_{\sf DVC2M}}$ and ${\sf DBCR2_{\sf DVC2BE}}$ specify whether and how the data value being accessed by the storage access must match the contents of the DVC2 for a DAC2R or DAC2W debug event to occur. The description of DBCR0 (see Section 8.5.1.1) and DBCR2 (see Section 8.5.1.3) and the modes for detecting Data Address Compare debug events. Data Address Compare debug events can occur regardless of the setting of MSR<sub>DE</sub> or DBCR0<sub>IDM</sub>. When an Data Address Compare debug event occurs, the corresponding $\mathsf{DBSR}_{\mathsf{DAC1R}}, \; \mathsf{DBSR}_{\mathsf{DAC1W}}, \; \mathsf{DBSR}_{\mathsf{DAC2R}}, \; \mathsf{or} \; \mathsf{DBSR}_{\mathsf{DAC2W}} \; \mathsf{bit} \; \mathsf{or} \; \mathsf{bits} \; \mathsf{are} \; \mathsf{set} \; \mathsf{to} \; \mathsf{1} \; \mathsf{to} \; \mathsf{record} \; \mathsf{the} \; \mathsf{debug} \; \mathsf{exception}. \; \mathsf{If} \; \mathsf{MSR}_{\mathsf{DE}} = \mathsf{0}, \; \mathsf{DBSR}_{\mathsf{IDE}} \; \mathsf{is} \; \mathsf{also} \; \mathsf{set} \; \mathsf{to} \; \mathsf{1} \; \mathsf{to} \; \mathsf{record} \; \mathsf{the} \; \mathsf{imprecise} \; \mathsf{debug} \; \mathsf{event}.$ If MSR<sub>DE</sub>=1 (i.e. Debug interrupts are enabled) at the time of the Data Address Compare debug exception, a Debug interrupt will occur immediately (provided there exists no higher priority exception which is enabled to cause an interrupt), the execution of the instruction causing the exception will be suppressed, and CSRR0/DSRR0 [Category: Embedded.Enhanced Debug will be set to the address of the excepting instruction. Depending on the type of instruction and/or the alignment of the data access, the instruction causing the exception may have been partially executed (see Section 5.7). If MSR<sub>DE</sub>=0 (i.e. Debug interrupts are disabled) at the time of the Data Address Compare debug exception, a Debug interrupt will not occur, and the instruction will complete execution (provided the instruction is not causing some other exception which will generate an enabled interrupt). Also, DBSR<sub>IDE</sub> is set to indicate that the debug exception occurred while Debug interrupts were disabled by MSR<sub>DE</sub>=0. Later, if the debug exception has not been reset by clearing $\mathsf{DBSR}_{\mathsf{DAC2R}},\;\mathsf{DBSR}_{\mathsf{DAC2R}},\;\mathsf{DBSR}_{\mathsf{DAC2W}},\;\mathsf{and}\;\mathsf{MSR}_{\mathsf{DE}}$ is set to 1, a delayed Debug interrupt will occur. In this case, CSRR0/DSRR0 [Category: Embedded.Enhanced Debug will contain the address of the instruction after the one which enabled the Debug interrupt by setting $\mathsf{MSR}_{\mathsf{DE}}$ to 1. Software in the Debug interrupt handler can observe $\mathsf{DBSR}_{\mathsf{IDE}}$ to determine how to interpret the value in CSRR0/DSRR0 [Category: Embedded.Enhanced Debug. #### 8.4.3 Trap Debug Event A Trap debug event (TRAP) occurs if DBCR0<sub>TRAP</sub>=1 (i.e. Trap debug events are enabled) and a *Trap* instruction (*tw*, *twi*, *td*, *tdi*) is executed and the conditions specified by the instruction for the trap are met. The event can occur regardless of the setting of $MSR_{DF}$ or $DBCR0_{IDM}$ . When a Trap debug event occurs, DBSR<sub>TR</sub> is set to 1 to record the debug exception. If MSR<sub>DE</sub>=0, DBSR<sub>IDE</sub> is also set to 1 to record the imprecise debug event. If MSR<sub>DE</sub>=1 (i.e. Debug interrupts are enabled) at the time of the Trap debug exception, a Debug interrupt will occur immediately (provided there exists no higher priority exception which is enabled to cause an interrupt), and CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] will be set to the address of the excepting instruction. If MSR<sub>DE</sub>=0 (i.e. Debug interrupts are disabled) at the time of the Trap debug exception, a Debug interrupt will not occur, and a Trap exception type Program interrupt will occur instead if the trap condition is met. Later, if the debug exception has not been reset by clearing $\mathsf{DBSR}_{\mathsf{TR}}$ , and $\mathsf{MSR}_{\mathsf{DE}}$ is set to 1, a delayed Debug interrupt will occur. In this case, $\mathsf{CSRR0/DSRR0}$ [Category: Embedded.Enhanced Debug will contain the address of the instruction after the one which enabled the Debug interrupt by setting $\mathsf{MSR}_{\mathsf{DE}}$ to 1. Software in the debug interrupt handler can observe $\mathsf{DBSR}_{\mathsf{IDE}}$ to determine how to interpret the value in $\mathsf{CSRR0/DSRR0}$ [Category: Embedded.Enhanced Debug]. ### 8.4.4 Branch Taken Debug Event A Branch Taken debug event (BRT) occurs if DBCR0<sub>BRT</sub>=1 (i.e. Branch Taken Debug events are enabled), execution is attempted of a branch instruction whose direction will be taken (that is, either an unconditional branch, or a conditional branch whose branch condition is met), and MSR<sub>DF</sub>=1. Branch Taken debug events are not recognized if MSR<sub>DE</sub>=0 at the time of the execution of the branch instruction and thus DBSR<sub>IDE</sub> can not be set by a Branch Taken debug event. This is because branch instructions occur very frequently. Allowing these common events to be recorded as exceptions in the DBSR while debug interrupts are disabled via MSR<sub>DE</sub> would result in an inordinate number of imprecise Debug interrupts. When a Branch Taken debug event occurs, the DBSR-BRT bit is set to 1 to record the debug exception and a Debug interrupt will occur immediately (provided there exists no higher priority exception which is enabled to cause an interrupt). The execution of the instruction causing the exception will be suppressed, and CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] will be set to the address of the excepting instruction. ## 8.4.5 Instruction Complete Debug Event An Instruction Complete debug event (ICMP) occurs if DBCR0<sub>ICMP</sub>=1 (i.e. Instruction Complete debug events are enabled), execution of any instruction is completed, and MSR<sub>DE</sub>=1. Note that if execution of an instruction is suppressed due to the instruction causing some other exception which is enabled to generate an interrupt, then the attempted execution of that instruction does not cause an Instruction Complete debug event. The *sc* instruction does not fall into the type of an instruction whose execution is suppressed, since the instruction actually completes execution and then generates a System Call interrupt. In this case, the Instruction Complete debug exception will also be set. Instruction Complete debug events are not recognized if $MSR_{DE}=0$ at the time of the execution of the instruction, $DBSR_{IDE}$ can not be set by an ICMP debug event. This is because allowing the common event of Instruction Completion to be recorded as an exception in the DBSR while Debug interrupts are disabled via $MSR_{DE}$ would mean that the Debug interrupt handler software would receive an inordinate number of imprecise Debug interrupts every time Debug interrupts were reenabled via $MSR_{DE}$ . When an Instruction Complete debug event occurs, DBSR<sub>ICMP</sub> is set to 1 to record the debug exception, a Debug interrupt will occur immediately (provided there exists no higher priority exception which is enabled to cause an interrupt), and CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] will be set to the address of the instruction after the one causing the Instruction Complete debug exception. #### 8.4.6 Interrupt Taken Debug Event ## 8.4.6.1 Causes of Interrupt Taken Debug Events Only base class interrupts can cause an Interrupt Taken debug event. If the Embedded.Enhanced Debug category is not supported or is supported and not enabled, all other interrupts automatically clear MSR<sub>DE</sub>, and thus would always prevent the associated Debug interrupt from occurring precisely. If the Embedded.Enhanced Debug category is supported and enabled, then critical class interrupts do not automatically clear MSR<sub>DE</sub>, but they cause Critical Interrupt Taken debug events. Also, if the Embedded.Enhanced Debug category is not supported or is supported and not enabled, Debug interrupts themselves are critical class interrupts, and thus any Debug interrupt (for any other debug event) would always end up setting the additional exception of DBSR<sub>IRPT</sub> upon entry to the Debug interrupt handler. At this point, the Debug interrupt handler would be unable to determine whether or not the Interrupt Taken debug event was related to the original debug event. ## 8.4.6.2 Interrupt Taken Debug Event Description An Interrupt Taken debug event (IRPT) occurs if $DBCR0_{IRPT}$ =1 (i.e. Interrupt Taken debug events are enabled) and a base class interrupt occurs. Interrupt Taken debug events can occur regardless of the setting of $MSR_{DE}$ . When an Interrupt Taken debug event occurs, DBSR-IRPT is set to 1 to record the debug exception. If MSR<sub>DE</sub>=0, DBSR<sub>IDE</sub> is also set to 1 to record the imprecise debug event. If MSR<sub>DE</sub>=1 (i.e. Debug interrupts are enabled) at the time of the Interrupt Taken debug event, a Debug interrupt will occur immediately (provided there exists no higher priority exception which is enabled to cause an interrupt), and Critical Save/Restore Register 0/Debug Save/Restore Register 0 [Category: Embedded.Enhanced Debug] will be set to the address of the interrupt vector which caused the Interrupt Taken debug event. No instructions at the base interrupt handler will have been executed. If MSR<sub>DE</sub>=0 (i.e. Debug interrupts are disabled) at the time of the Interrupt Taken debug event, a Debug interrupt will not occur, and the handler for the interrupt which caused the Interrupt Taken debug event will be allowed to execute. Later, if the debug exception has not been reset by clearing DBSR<sub>IRPT</sub>, and MSR<sub>DE</sub> is set to 1, a delayed Debug interrupt will occur. In this case, CSRR0/DSRR0 [Category: Embedded.Enhanced Debug] will contain the address of the instruction after the one which enabled the Debug interrupt by setting MSR<sub>DE</sub> to 1. Software in the Debug interrupt handler can observe the DBSR<sub>IDE</sub> bit to determine how to interpret the value in CSRR0/DSRR0 [Category: Embedded.Enhanced Debug. #### 8.4.7 Return Debug Event A Return debug event (RET) occurs if DBCR0<sub>RET</sub>=1 and an attempt is made to execute an *rfi*. Return debug events can occur regardless of the setting of MSR<sub>DE</sub>. When a Return debug event occurs, DBSR<sub>RET</sub> is set to 1 to record the debug exception. If MSR<sub>DE</sub>=0, DBSR<sub>IDE</sub> is also set to 1 to record the imprecise debug event. If MSR<sub>DE</sub>=1 at the time of the Return Debug event, a Debug interrupt will occur immediately, and CSRR0/DSRR0 [Category: Embedded.Enhanced Debug will be set to the address of the *rfi*. If MSR<sub>DE</sub>=0 at the time of the Return Debug event, a Debug interrupt will not occur. Later, if the Debug exception has not been reset by clearing DBSR<sub>RET</sub>, and MSR<sub>DE</sub> is set to 1, a delayed imprecise Debug interrupt will occur. In this case, CSRR0/DSRR0 [Category: Embedded.Enhanced Debug will contain the address of the instruction after the one which enabled the Debug interrupt by setting MSR<sub>DE</sub> to 1. An imprecise Debug interrupt can be caused by executing an *rfi* when DBCR0<sub>RET</sub>=1 and MSR<sub>DE</sub>=0, and the execution of that *rfi* happens to cause MSR<sub>DE</sub> to be set to 1. Software in the Debug interrupt handler can observe the DBSR<sub>IDE</sub> bit to determine how to interpret the value in CSRR0/DSRR0 [Category: Embedded.Enhanced Debug]. ### 8.4.8 Unconditional Debug Event An Unconditional debug event (UDE) occurs when the Unconditional Debug Event (UDE) signal is activated by the debug mechanism. The exact definition of the UDE signal and how it is activated is implementation-dependent. The Unconditional debug event is the only debug event which does not have a corresponding enable bit for the event in DBCR0 (hence the name of the event). The Unconditional debug event can occur regardless of the setting of MSR<sub>DE</sub>. When an Unconditional debug event occurs, the $DBSR_{UDE}$ bit is set to 1 to record the Debug exception. If $MSR_{DE}$ =0, $DBSR_{IDE}$ is also set to 1 to record the imprecise debug event. If MSR<sub>DE</sub>=1 (i.e. Debug interrupts are enabled) at the time of the Unconditional Debug exception, a Debug interrupt will occur immediately (provided there exists no higher priority exception which is enabled to cause an interrupt), and CSRR0/DSRR0 [Category: Embed- ded.Enhanced Debug] will be set to the address of the instruction which would have executed next had the interrupt not occurred. If MSR<sub>DE</sub>=0 (i.e. Debug interrupts are disabled) at the time of the Unconditional Debug exception, a Debug interrupt will not occur. Later, if the Unconditional Debug exception has not been reset by clearing $\mathsf{DBSR}_{\mathsf{UDE}},$ and $\mathsf{MSR}_{\mathsf{DE}}$ is set to 1, a delayed Debug interrupt will occur. In this case, $\mathsf{CSRR0/DSRR0}$ [Category: Embedded.Enhanced Debug] will contain the address of the instruction after the one which enabled the Debug interrupt by setting $\mathsf{MSR}_{\mathsf{DE}}$ to 1. Software in the Debug interrupt handler can observe $\mathsf{DBSR}_{\mathsf{IDE}}$ to determine how to interpret the value in $\mathsf{CSRR0/DSRR0}$ [Category: Embedded.Enhanced Debug]. # 8.4.9 Critical Interrupt Taken Debug Event [Category: Embedded.Enhanced Debug] A Critical Interrupt Taken debug event (CIRPT) occurs if $DBCR0_{CIRPT} = 1$ (i.e. Critical Interrupt Taken debug events are enabled) and a critical interrupt occurs. A critical interrupt is any interrupt that saves state in CSRR0 and CSRR1 when the interrupt is taken. Critical Interrupt Taken debug events can occur regardless of the setting of $MSR_{DE}$ . When a Critical Interrupt Taken debug event occurs, DBSR<sub>CIRPT</sub> is set to 1 to record the debug event. If $MSR_{DE}=0$ , DBSR<sub>IDE</sub> is also set to 1 to record the imprecise debug event. If MSR<sub>DE</sub> = 1 (i.e. Debug Interrupts are enabled) at the time of the Critical Interrupt Taken debug event, a Debug Interrupt will occur immediately (provided there is no higher priority exception which is enabled to cause an interrupt), and DSRR0 will be set to the address of the first instruction of the critical interrupt handler. No instructions at the critical interrupt handler will have been executed. If $MSR_{DE} = 0$ (i.e. Debug Interrupts are disabled) at the time of the Critical Interrupt Taken debug event, a Debug Interrupt will not occur, and the handler for the critical interrupt which caused the debug event will be allowed to execute normally. Later, if the debug exception has not been reset by clearing $DBSR_{CIRPT}$ and $MSR_{DE}$ is set to 1, a delayed Debug Interrupt will occur. In this case DSRR0 will contain the address of the instruction after the one that set $MSR_{DE} = 1$ . Software in the Debug Interrupt handler can observe $DBSR_{IDE}$ to determine how to interpret the value in DSRR0. # 8.4.10 Critical Interrupt Return Debug Event [Category: Embedded.Enhanced Debug] A Critical Interrupt Return debug event (CRET) occurs if DBCR0<sub>CRET</sub> = 1 (i.e. Critical Interrupt Return debug events are enabled) and an attempt is made to execute an *rfci* instruction. Critical Interrupt Return debug events can occur regardless of the setting of MSR<sub>DE</sub>. When a Critical Interrupt Return debug event occurs, $DBSR_{CRET}$ is set to 1 to record the debug event. If $MSR_{DE}$ =0, $DBSR_{IDE}$ is also set to 1 to record the imprecise debug event. If $MSR_{DE} = 1$ (i.e. Debug Interrupts are enabled) at the time of the Critical Interrupt Return debug event, a Debug Interrupt will occur immediately (provided there is no higher priority exception which is enabled to cause an interrupt), and DSRR0 will be set to the address of the *rfci* instruction. If $MSR_{DE} = 0$ (i.e. Debug Interrupts are disabled) at the time of the Critical Interrupt Return debug event, a Debug Interrupt will not occur. Later, if the debug exception has not been reset by clearing $DBSR_{CRET}$ and $MSR_{DE}$ is set to 1, a delayed Debug Interrupt will occur. In this case DSRR0 will contain the address of the instruction after the one that set $MSR_{DE} = 1$ . An imprecise Debug Interrupt can be caused by executing an $\emph{rfci}$ when $DBCR0_{CRET} = 1$ and $MSR_{DE} = 0$ , and the execution of the $\emph{rfci}$ happens to cause $MSR_{DE}$ to be set to 1. Software in the Debug Interrupt handler can observe $DBSR_{IDE}$ to determine how to interpret the value in DSRR0. ### 8.5 Debug Registers This section describes debug-related registers that are accessible to software running on the processor. These registers are intended for use by special debug tools and debug software, and not by general application or operating system code. ### 8.5.1 Debug Control Registers Debug Control Register 0 (DBCR0), Debug Control Register 1 (DBCR1), and Debug Control Register 2 (DBCR2) are each 32-bit registers. Bits of DBCR0, DBCR1, and DBCR2 are numbered 32 (most-significant bit) to 63 (least-significant bit). DBCR0, DBCR1, and DBCR2 are used to enable debug events, reset the processor, control timer operation during debug events, and set the debug mode of the processor. ## 8.5.1.1 Debug Control Register 0 (DBCR0) The contents of the DBCR0 can be read into bits 32:63 of register RT using *mfspr RT,DBCR0*, setting bits 0:31 of RT to 0. The contents of bits 32:63 of register RS can be written to the DBCR0 using *mtspr DBCR0,RS*. The bit definitions for DBCR0 are shown below. #### Bit(s) Description ## 32 External Debug Mode (EDM) [Category: Embedded.Enhanced Debug] The EDM bit is a read-only bit that reflects whether the processor is controlled by an external debug facility. When EDM is set, internal debug mode is suppressed and the taking of debug interrupts does not occur. - The processor is not in external debug mode. - 1 The processor is in external debug mode. #### 33 Internal Debug Mode (IDM) - 0 Debug interrupts are disabled. - 1 If MSR<sub>DE</sub>=1, then the occurrence of a debug event or the recording of an earlier debug event in the Debug Status Register when MSR<sub>DE</sub>=0 or DBCR0<sub>IDM</sub>=0 will cause a Debug interrupt. #### 34:35 **Reset** (RST) - 00 No action - 01 Implementation-specific - 10 Implementation-specific - 11 Implementation-specific **Warning:** Writing 0b01, 0b10, or 0b11 to these bits may cause a processor reset to occur. ### 36 Instruction Completion Debug Event (ICMP) - 0 ICMP debug events are disabled - 1 ICMP debug events are enabled **Note:** Instruction Completion will not cause an ICMP debug event if MSR<sub>DE</sub>=0. #### 37 Branch Taken Debug Event Enable (BRT) - 0 BRT debug events are disabled - 1 BRT debug events are enabled **Note:** Taken branches will not cause a BRT debug event if MSR<sub>DE</sub>=0. #### 38 Interrupt Taken Debug Event Enable (IRPT) - 0 IRPT debug events are disabled - 1 IRPT debug events are enabled **Note:** Critical interrupts will not cause an IRPT Debug event even if MSR<sub>DE</sub>=0. If the Embed- ded.Enhanced Debug category is supported, see Section 8.4.9. #### 39 Trap Debug Event Enable (TRAP) - 0 TRAP debug events cannot occur - 1 TRAP debug events can occur #### 40 Instruction Address Compare 1 Debug Event Enable (IAC1) - 0 IAC1 debug events cannot occur - 1 IAC1 debug events can occur #### 41 Instruction Address Compare 2 Debug Event Enable (IAC2) - 0 IAC2 debug events cannot occur - 1 IAC2 debug events can occur #### 42 Instruction Address Compare 3 Debug Event Enable (IAC3) - 0 IAC3 debug events cannot occur - 1 IAC3 debug events can occur #### 43 Instruction Address Compare 4 Debug Event Enable (IAC4) - 0 IAC4 debug events cannot occur - 1 IAC4 debug events can occur ### 44:45 **Data Address Compare 1 Debug Event Enable** (DAC1) - 00 DAC1 debug events cannot occur - 01 DAC1 debug events can occur only if a store-type data storage access - 10 DAC1 debug events can occur only if a load-type data storage access - 11 DAC1 debug events can occur on any data storage access ## 46:47 **Data Address Compare 2 Debug Event Enable** (DAC2) - 00 DAC2 debug events cannot occur - 01 DAC2 debug events can occur only if a store-type data storage access - 10 DAC2 debug events can occur only if a load-type data storage access - 11 DAC2 debug events can occur on any data storage access #### 48 Return Debug Event Enable (RET) - 0 RET debug events cannot occur - 1 RET debug events can occur **Note:** Return From Critical Interrupt will not cause an RET debug event if MSR<sub>DE</sub>=0. If the Embedded.Enhanced Debug category is supported, see Section 8.4.10 #### 49:56 Reserved 57 Critical Interrupt Taken Debug Event (CIRPT) [Category: Embedded.Enhanced #### Debug] A Critical Interrupt Taken Debug Event occurs when DBCR0<sub>CIRPT</sub> = 1 and a critical interrupt (any interrupt that uses the critical class, i.e. uses CSRR0 and CSRR1) occurs. - Critical interrupt taken debug events are disabled. - Critical interrupt taken debug events are enabled. ## 58 Critical Interrupt Return Debug Event (CRET) [Category: Embedded.Enhanced Debug] A Critical Interrupt Return Debug Event occurs when DBCR0<sub>CRET</sub>= 1 and a return from critical interrupt (an *rfci* instruction is executed) occurs. - Critical interrupt return debug events are disabled. - Critical interrupt return debug events are enabled. #### 59:62 Implementation-dependent #### 63 Freeze Timers on Debug Event (FT) - 0 Enable clocking of timers - Disable clocking of timers if any DBSR bit is set (except MRR) ## 8.5.1.2 Debug Control Register 1 (DBCR1) The contents of the DBCR1 can be read into bits 32:63 a register RT using *mfspr* RT,DBCR1, setting bits 0:31 of RT to 0. The contents of bits 32:63 of register RS can be written to the DBCR1 using *mtspr* DBCR1,RS. The bit definitions for DBCR1 are shown below. #### Bit(s) Description #### 32:33 Instruction Address Compare 1 User/ Supervisor Mode(IAC1US) - 00 IAC1 debug events can occur - 01 Reserved - 10 IAC1 debug events can occur only if $\ensuremath{\mathsf{MSR}_{\mathsf{PR}}}\xspace=0$ - 11 IAC1 debug events can occur only if MSR<sub>PR</sub>=1 #### 34:35 Instruction Address Compare 1 Effective/ Real Mode (IAC1ER) - 00 IAC1 debug events are based on effective addresses - 01 IAC1 debug events are based on real addresses - 10 IAC1 debug events are based on effective addresses and can occur only if MSR<sub>IS</sub>=0 - 11 IAC1 debug events are based on effective addresses and can occur only if MSR<sub>IS</sub>=1 #### 36:37 Instruction Address Compare 2 User/ Supervisor Mode (IAC2US) - 00 IAC2 debug events can occur - 01 Reserved - 10 IAC2 debug events can occur only if MSR<sub>PR</sub>=0 - 11 IAC2 debug events can occur only if MSR<sub>PR</sub>=1 #### 38:39 Instruction Address Compare 2 Effective/ Real Mode (IAC2ER) - 00 IAC2 debug events are based on effective addresses - 01 IAC2 debug events are based on real addresses - 10 IAC2 debug events are based on effective addresses and can occur only if MSR<sub>IS</sub>=0 - 11 IAC2 debug events are based on effective addresses and can occur only if MSR<sub>IS</sub>=1 ## 40:41 Instruction Address Compare 1/2 Mode (IAC12M) 00 Exact address compare IAC1 debug events can occur only if the address of the instruction fetch is equal to the value specified in IAC1. IAC2 debug events can occur only if the address of the instruction fetch is equal to the value specified in IAC2. #### 01 Address bit match IAC1 and IAC2 debug events can occur only if the address of the instruction fetch, ANDed with the contents of IAC2 are equal to the contents of IAC1, also ANDed with the contents of IAC2. If IAC1US≠IAC2US or IAC1ER≠IAC2ER, results are boundedly undefined. #### 10 Inclusive address range compare IAC1 and IAC2 debug events can occur only if the address of the instruction fetch is greater than or equal to the value specified in IAC1 and less than the value specified in IAC2. If IAC1US≠IAC2US or IAC1ER≠IAC2ER, results are boundedly undefined. #### 11 Exclusive address range compare IAC1 and IAC2 debug events can occur only if the address of the instruction fetch is less than the value specified in IAC1 or is greater than or equal to the value specified in IAC2. If IAC1US≠AC2US or IAC1ER≠IAC2ER, results are boundedly undefined. #### 42:47 Reserved #### 48:49 Instruction Address Compare 3 User/ Supervisor Mode (IAC3US) - 00 IAC3 debug events can occur - 01 Reserved - 10 IAC3 debug events can occur only if $\ensuremath{\mathsf{MSR}_{\mathsf{PR}}}\xspace=0$ - 11 IAC3 debug events can occur only if MSR<sub>PR</sub>=1 #### 50:51 Instruction Address Compare 3 Effective/ Real Mode (IAC3ER) - 00 IAC3 debug events are based on effective addresses - 01 IAC3 debug events are based on real - 10 IAC3 debug events are based on effective addresses and can occur only if MSR<sub>IS</sub>=0 - 11 IAC3 debug events are based on effective addresses and can occur only if MSR<sub>IS</sub>=1 #### 52:53 Instruction Address Compare 4 User/ Supervisor Mode (IAC4US) - 00 IAC4 debug events can occur - 01 Reserved - 10 IAC4 debug events can occur only if MSR<sub>PR</sub>=0 - 11 IAC4 debug events can occur only if MSR<sub>PR</sub>=1 #### 54:55 Instruction Address Compare 4 Effective/ Real Mode (IAC4ER) - 00 IAC4 debug events are based on effective - 01 IAC4 debug events are based on real addresses - 10 IAC4 debug events are based on effective addresses and can occur only if MSR<sub>IS</sub>=0 - 11 IAC4 debug events are based on effective addresses and can occur only if MSR<sub>IS</sub>=1 ## 56:57 Instruction Address Compare 3/4 Mode (IAC34M) 00 Exact address compare IAC3 debug events can occur only if the address of the instruction fetch is equal to the value specified in IAC3. IAC4 debug events can occur only if the address of the instruction fetch is equal to the value specified in IAC4. #### 01 Address bit match IAC3 and IAC4 debug events can occur only if the address of the data storage access, ANDed with the contents of IAC4 are equal to the contents of IAC3, also ANDed with the contents of IAC4. If IAC3US≠IAC4US or IAC3ER≠IAC4ER, results are boundedly undefined. #### 10 Inclusive address range compare IAC3 and IAC4 debug events can occur only if the address of the instruction fetch is greater than or equal to the value specified in IAC3 and less than the value specified in IAC4. If IAC3US≠IAC4US or IAC3ER≠IAC4ER, results are boundedly undefined. #### 11 Exclusive address range compare IAC3 and IAC4 debug events can occur only if the address of the instruction fetch is less than the value specified in IAC3 or is greater than or equal to the value specified in IAC4. If IAC3US≠IAC4US or IAC3ER≠IAC4ER, results are boundedly undefined. #### 58:63 Reserved ## 8.5.1.3 Debug Control Register 2 (DBCR2) The contents of the DBCR2 can be copied into bits 32:63 register RT using *mfspr* RT,DBCR2, setting bits 0:31 of register RT to 0. The contents of bits 32:63 of a register RS can be written to the DBCR2 using *mtspr* DBCR2,RS. The bit definitions for DBCR2 are shown below. #### Bit(s) Description ## 32:33 Data Address Compare 1 User/Supervisor Mode (DAC1US) - 00 DAC1 debug events can occur - 01 Reserved - 10 DAC1 debug events can occur only if $MSR_{PR}$ =0 - 11 DAC1 debug events can occur only if MSR<sub>PR</sub>=1 ## 34:35 Data Address Compare 1 Effective/Real Mode (DAC1ER) - 00 DAC1 debug events are based on effective addresses - 01 DAC1 debug events are based on real addresses - 10 DAC1 debug events are based on effective addresses and can occur only if MSR<sub>DS</sub>=0 11 DAC1 debug events are based on effective addresses and can occur only if MSR<sub>DS</sub>=1 #### 36:37 Data Address Compare 2 User/Supervisor Mode (DAC2US) - 00 DAC2 debug events can occur - 01 Reserved - 10 DAC2 debug events can occur only if MSRPR=0 - 11 DAC2 debug events can occur only if MSRPR=1 ### 38:39 Data Address Compare 2 Effective/Real Mode (DAC2ER) - 00 DAC2 debug events are based on effective addresses - 01 DAC2 debug events are based on real addresses - 10 DAC2 debug events are based on effective addresses and can occur only if MSR<sub>DS</sub>=0 - 11 DAC2 debug events are based on effective addresses and can occur only if MSR<sub>DS</sub>=1 ## 40:41 **Data Address Compare 1/2 Mode** (DAC12M) #### 00 Exact address compare DAC1 debug events can occur only if the address of the data storage access is equal to the value specified in DAC1. DAC2 debug events can occur only if the address of the data storage access is equal to the value specified in DAC2. #### 01 Address bit match DAC1 and DAC2 debug events can occur only if the address of the data storage access, ANDed with the contents of DAC2 are equal to the contents of DAC1, also ANDed with the contents of DAC2. If DAC1US≠DAC2US or DAC1ER≠DAC2ER, results are boundedly undefined. #### 10 Inclusive address range compare DAC1 and DAC2 debug events can occur only if the address of the data storage access is greater than or equal to the value specified in DAC1 and less than the value specified in DAC2. If DAC1US $\neq$ DAC2US or DAC1ER $\neq$ DAC2ER, results are boundedly undefined. #### 11 Exclusive address range compare DAC1 and DAC2 debug events can occur only if the address of the data storage access is less than the value specified in DAC1 or is greater than or equal to the value specified in DAC2. If DAC1US $\neq$ DAC2US or DAC1ER $\neq$ DAC2ER, results are boundedly undefined. #### 42:43 Reserved #### 44:45 Data Value Compare 1 Mode (DVC1M) - 00 DAC1 debug events can occur - 01 DAC1 debug events can occur only when all bytes specified in DBCR2<sub>DVC1BE</sub> in the data value of the data storage access match their corresponding bytes in DVC1 - 10 DAC1 debug events can occur only when at least one of the bytes specified in DBCR2<sub>DVC1BE</sub> in the data value of the data storage access matches its corresponding byte in DVC1 - 11 DAC1 debug events can occur only when all bytes specified in DBCR2<sub>DVC1BE</sub> within at least one of the halfwords of the data value of the data storage access matches their corresponding bytes in DVC1 #### 46:47 **Data Value Compare 2 Mode** (DVC2M) - 00 DAC2 debug events can occur - 01 DAC2 debug events can occur only when all bytes specified in DBCR2<sub>DVC2BE</sub> in the data value of the data storage access match their corresponding bytes in DVC2 - 10 DAC2 debug events can occur only when at least one of the bytes specified in DBCR2<sub>DVC2BE</sub> in the data value of the data storage access matches its corresponding byte in DVC2 - 11 DAC2 debug events can occur only when all bytes specified in DBCR2<sub>DVC2BE</sub> within at least one of the halfwords of the data value of the data storage access matches their corresponding bytes in DVC2 ## 48:55 **Data Value Compare 1 Byte Enables** (DVC1BE) Specifies which bytes in the aligned data value being read or written by the storage access are compared to the corresponding bytes in DVC1. ## 56:63 **Data Value Compare 2 Byte Enables** (DVC2BE) Specifies which bytes in the aligned data value being read or written by the storage access are compared to the corresponding bytes in DVC2 #### 8.5.2 Debug Status Register The Debug Status Register (DBSR) is a 32-bit register and contains status on debug events and the most recent processor reset. The DBSR is set via hardware, and read and cleared via software. The contents of the DBSR can be read into bits 32:63 of a register RT using the *mfspr* instruction, setting bits 0:31 of RT to zero. Bits in the DBSR can be cleared using the *mtspr* instruction. Clearing is done by writing bits 32:63 of a register to the DBSR with a 1 in any bit position that is to be cleared and 0 in all other bit positions. The write-data to the DBSR is not direct data, but a mask. A 1 causes the bit to be cleared, and a 0 has no effect. The bit definitions for the DBSR are shown below: #### Bit(s) Description #### 32 Imprecise Debug Event (IDE) Set to 1 if MSR<sub>DE</sub>=0 and a debug event causes its respective Debug Status Register bit to be set to 1. #### 33 Unconditional Debug Event (UDE) Set to 1 if an Unconditional debug event occurred. See Section 8.4.8. #### 34:35 Most Recent Reset (MRR) Set to one of three values when a reset occurs. These two bits are undefined at power-up. - 00 No reset occurred since these bits last cleared by software - 01 Implementation-dependent reset information - 10 Implementation-dependent reset information - 11 Implementation-dependent reset information #### 36 Instruction Complete Debug Event (ICMP) Set to 1 if an Instruction Completion debug event occurred and DBCR0<sub>ICMP</sub>=1. See Section 8.4.5. #### 37 Branch Taken Debug Event (BRT) Set to 1 if a Branch Taken debug event occurred and DBCR0<sub>BRT</sub>=1. See Section 8.4.4. #### 38 Interrupt Taken Debug Event (IRPT) Set to 1 if an Interrupt Taken debug event occurred and DBCR0<sub>IRPT</sub>=1. See Section 8.4.6. #### 39 Trap Instruction Debug Event (TRAP) Set to 1 if a Trap Instruction debug event occurred and DBCR0<sub>TRAP</sub>=1. See Section 8.4.3. #### 40 Instruction Address Compare 1 Debug Event (IAC1) Set to 1 if an IAC1 debug event occurred and DBCR0<sub>IAC1</sub>=1. See Section 8.4.1. #### 41 Instruction Address Compare 2 Debug Event (IAC2) Set to 1 if an IAC2 debug event occurred and DBCR0<sub>IAC2</sub>=1. See Section 8.4.1. #### 42 Instruction Address Compare 3 Debug Event (IAC3) Set to 1 if an IAC3 debug event occurred and DBCR0<sub>IAC3</sub>=1. See Section 8.4.1. #### 43 Instruction Address Compare 4 Debug Event (IAC4) Set to 1 if an IAC4 debug event occurred and DBCR0<sub>IAC4</sub>=1. See Section 8.4.1. #### 44 Data Address Compare 1 Read Debug Event (DAC1R) Set to 1 if a read-type DAC1 debug event occurred and DBCR0<sub>DAC1</sub>=0b10 or DBCR0<sub>DAC1</sub>=0b11. See Section 8.4.2. #### 45 Data Address Compare 1 Write Debug Event (DAC1W) Set to 1 if a write-type DAC1 debug event occurred and DBCR0<sub>DAC1</sub>=0b01 or DBCR0<sub>DAC1</sub>=0b11. See Section 8.4.2. #### 46 Data Address Compare 2 Read Debug Event (DAC2R) Set to 1 if a read-type DAC2 debug event occurred and DBCR0<sub>DAC2</sub>=0b10 or DBCR0<sub>DAC2</sub>=0b11. See Section 8.4.2. #### 47 Data Address Compare 2 Write Debug Event (DAC2W) Set to 1 if a write-type DAC2 debug event occurred and DBCR0<sub>DAC2</sub>=0b01 or DBCR0<sub>DAC2</sub>=0b11. See Section 8.4.2. #### 48 Return Debug Event (RET) Set to 1 if a Return debug event occurred and DBCR0<sub>RET</sub>=1. See Section 8.4.2. 49:52 Reserved #### 53:56 Implementation-dependent #### 57 Critical Interrupt Taken Debug Event (CIRPT) [Category: Embedded.Enhanced Debug] A Critical Interrupt Taken Debug Event occurs when DBCR0<sub>CIRPT</sub>=1 and a critical interrupt (any interrupt that uses the critical class, i.e. uses CSRR0 and CSRR1) occurs. - Critical interrupt taken debug events are disabled. - Critical interrupt taken debug events are enabled. ## 58 Critical Interrupt Return Debug Event (CRET) [Category: Embedded.Enhanced Debug] A Critical Interrupt Return Debug Event occurs when DBCR0<sub>CRET</sub>=1 and a return from critical interrupt (an *rfci* instruction is executed) occurs. - Critical interrupt return debug events are disabled. - Critical interrupt return debug events are enabled. 59:63 Implementation-dependent ## 8.5.3 Instruction Address Compare Registers The Instruction Address Compare Register 1, 2, 3, and 4 (IAC1, IAC2, IAC3, and IAC4 respectively) are each 64-bits, with bit 63 being reserved. A debug event may be enabled to occur upon an attempt to execute an instruction from an address specified in either IAC1, IAC2, IAC3, or IAC4, inside or outside a range specified by IAC1 and IAC2 or, inside or outside a range specified by IAC3 and IAC4, or to blocks of addresses specified by the combination of the IAC1 and IAC2, or to blocks of addresses specified by the combination of the IAC3 and IAC4. Since all instruction addresses are required to be word-aligned, the two low-order bits of the Instruction Address Compare Registers are reserved and do not participate in the comparison to the instruction address (see Section 8.4.1 on page 705). The contents of the Instruction Address Compare i Register (where i={1,2,3, or 4}) can be read into register RT using mfspr RT,IACi. The contents of register RS can be written to the Instruction Address Compare i Register using mtspr IACi,RS. ## 8.5.4 Data Address Compare Registers The Data Address Compare Register 1 and 2 (DAC1 and DAC2 respectively) are each 64-bits. A debug event may be enabled to occur upon loads, stores, or cache operations to an address specified in either the DAC1 or DAC2, inside or outside a range specified by the DAC1 and DAC2, or to blocks of addresses specified by the combination of the DAC1 and DAC1 (see Section 8.4.2). The contents of the Data Address Compare i Register (where i={1 or 2}) can be read into register RT using mfspr RT, DACi. The contents of register RS can be written to the Data Address Compare i Register using mtspr DACi, RS. The contents of the DAC1 or DAC2 are compared to the address generated by a data storage access instruction. ## 8.5.5 Data Value Compare Registers The Data Value Compare Register 1 and 2 (DVC1 and DVC2 respectively) are each 64-bits. A DAC1R, DAC1W, DAC2R, or DAC2W debug event may be enabled to occur upon loads or stores of a specific data value specified in either or both of the DVC1 and DVC2. DBCR2<sub>DVC1M</sub> and DBCR2<sub>DVC1BE</sub> control how the contents of the DVC1 is compared with the value and DBCR2<sub>DVC2M</sub> and DBCR2<sub>DVC2BE</sub> control how the contents of the DVC2 is compared with the value (see Section 8.4.2 and Section 8.5.1.3). The contents of the Data Value Compare i Register (where i={1 or 2}) can be read into register RT using mfspr RT,DVCi. The contents of register RS can be written to the Data Value Compare i Register using mtspr DVCi,RS. ## 8.6 Debugger Notify Halt Instruction [Category: Embedded.Enhanced Debug] The *dnh* instruction provides the means for the transfer of information between the processor and an implementation-dependent external debug facility. *dnh* also causes the processor to stop fetching and executing instructions. #### Debugger Notify Halt XFX-form dnh **DUI, DUIS** | 19 | DUI | DUIS | 198 | / | |----|-----|------|-----|----| | 0 | 6 | 11 | 21 | 31 | if enabled by implementation-dependent means then $% \frac{\partial f}{\partial x} = \frac{\partial f}{\partial x} + \frac{\partial$ implementation-dependent register ← DUI halt processor else illegal instruction exception Execution of the *dnh* instruction causes the processor to stop fetching instructions and taking interrupts if execution of the instruction has been enabled. The contents of the DUI field are sent to the external debug facility to identify the reason for the halt. If execution of the *dnh* instruction has not been previously enabled, executing the *dnh* instruction produces an Illegal Instruction exception. The means by which execution of the *dnh* instruction is enabled is implementation-dependent. The current state of the processor debug facility, whether the processor is in IDM or EDM mode has no effect on the execution of the *dnh* instruction. The instruction is context synchronizing. #### Programming Note – The DUIS field in the instruction may be used to pass information to an external debug facility. After the *dnh* instruction has executed, the instruction itself can be read back by the Illegal Instruction Interrupt handler or the external debug facility if the contents of the DUIS field are of interest. If the processor entered the Illegal Instruction Interrupt handler, software can use SRR0 to obtain the address of the *dnh* instruction which caused the handler to be invoked. #### **Special Registers Altered:** None ## Chapter 9. Processor Control [Category: Embedded Processor Control] | 9.1 Overview | 9.2.1.2 Doorbell Critical Message Filtering | |----------------------------------------|---------------------------------------------| | 9.2.1 Processor Message Handling and | 9.3 Processor Control Instructions 721 | | Filtering | | | 9.2.1.1 Doorbell Message Filtering 720 | | #### 9.1 Overview The Processor Control facility provides a mechanism for processors within a coherence domain to send messages to all devices in the coherence domain. The facility provides a mechanism for sending interrupts that are not dependent on the interrupt controller to processors and allows message filtering by the processors that receive the message. The Processor Control facility is also useful for sending messages to a device that provides specialized services such as secure boot operations controlled by a security device. The Processor Control facility defines how processors send messages and what actions processors take on the receipt of a message. The actions taken by devices other than processors are not defined. ### 9.2 Programming Model Processors initiate a message by executing the **msg-snd** instruction and specifying a message type and message payload in a general purpose register. Sending a message causes the message to be sent to all the devices, including the sending processor, in the coherence domain in a reliable manner. Each device receives all messages that are sent. The actions that a device takes are dependent on the message type and payload. There are no restrictions on what messages a processor can send. To provide inter processor interrupt capability two message types are defined, *Processor Doorbell* and *Processor Doorbell Critical*. A *Processor Doorbell [Critical*] message causes an interrupt to occur on processors when the message is received and the processor determines through examination of the payload that the message should be accepted. The examination of the payload for this purpose is termed *filtering*. The acceptance of a *Processor Doorbell* [*Critical*] message causes an exception to be generated on the accepting processor. Processors accept and filter messages defined in Section 9.2.1. Processors may also accept other implementation-dependent defined messages. ## 9.2.1 Processor Message Handling and Filtering Processors filter, accept, and handle message types defined as follows. The message type is specified in the message and is determined by the contents of register $RB_{32:36}$ used as the operand in the **msgsnd** instruction. The message type is interpreted as follows: #### Value Description #### 0 Doorbell Interrupt (DBELL) A Processor Doorbell exception is generated on the processor when the processor has filtered the message based on the payload and has determined that it should accept the message. A Processor Doorbell Interrupt occurs when no higher priority exception exists, a Processor Doorbell exception exists, and MSR<sub>FF</sub>=1. # Doorbell Critical Interrupt (DBELL\_CRIT) A Processor Doorbell Critical exception is generated on the processor when the processor has filtered the message based on the payload and has determined that it should accept the message. A Processor Doorbell Critical Interrupt occurs when no higher priority exception exists, a Processor Doorbell Critical exception exists, and MSR<sub>CF</sub>=1. Message types other than these and their associated actions are implementation-dependent. #### 9.2.1.1 Doorbell Message Filtering A processor receiving a DBELL message type will filter the message and either ignore the message or accept the message and generate a Processor Doorbell exception based on the payload and the state of the processor at the time the message is received. The payload is specified in the message and is determined by the contents of register RB<sub>37:63</sub> used as the operand in the *msgsnd* instruction. The payload bits are defined below. #### Bit Description #### 37 Broadcast (BRDCAST) The message is accepted by all processors regardless of the value of the PIR register and the value of PIRTAG. - 0 If the value of PIR and PIRTAG are equal a Processor Doorbell exception is generated - 1 A Processor Doorbell exception is generated regardless of the value of PIRTAG and PIR. #### 38:41 Reserved #### 50:63 *PIR Tag* (PIRTAG) The contents of this field are compared with bits 50:63 of the PIR register. If a DBELL message is received by a processor and either payload<sub>BRDCAST</sub>=1 or PIR<sub>50:63</sub>=payload<sub>PIRTAG</sub> then a Processor Doorbell exception is generated. The exception condition remains until a Processor Doorbell Interrupt is taken, or a *msgclr* instruction is executed on the receiving processor with a message type of DBELL. A change to any of the filtering criteria (i.e. changing the PIR register) will not clear a pending Processor Doorbell exception. DBELL messages are not cumulative. That is, if a DBELL message is accepted and the interrupt is pended because MSR<sub>EE</sub>=0, further DBELL messages that would be accepted are ignored until the Processor Doorbell exception is cleared by taking the interrupt or cleared by executing a *msgclr* with a message type of DBELL on the receiving processor. The temporal relationship between when a DBELL message is sent and when it is received in a given processor is not defined. ## 9.2.1.2 Doorbell Critical Message Filtering A processor receiving a DBELL\_CRIT message type will filter the message and either ignore the message or accept the message and generate a Processor Doorbell Critical exception based on the payload and the state of the processor at the time the message is received. The payload is specified in the message and is determined by the contents of register RB<sub>37:63</sub> used as the operand in the *msgsnd* instruction. The payload bits are defined below. #### Bit Description #### 37 Broadcast (BRDCAST) The message is accepted by all processors regardless of the value of the PIR register and the value of PIRTAG. - If the value of PIR and PIRTAG are equal a Processor Doorbell Critical exception is generated. - 1 A Processor Doorbell Critical exception is generated regardless of the value of PIRTAG and PIR. #### 38:41 Reserved #### 50:63 **PIR Tag** (PIRTAG) The contents of this field are compared with bits 50:63 of the PIR register. If a DBELL\_CRIT message is received by a processor and either payload<sub>BRDCAST</sub>=1 or PIR<sub>50:63</sub>=payload-PIRTAG then a Processor Doorbell Critical exception is generated. The exception condition remains until a Processor Doorbell Critical Interrupt is taken, or a *msgclr* instruction is executed on the receiving processor with a message type of DBELL\_CRIT. A change to any of the filtering criteria (i.e. changing the PIR register) will not clear a pending Processor Doorbell Critical exception. DBELL\_CRIT messages are not cumulative. That is, if a DBELL\_CRIT message is accepted and the interrupt is pended because MSR<sub>CE</sub>=0, further DBELL\_CRIT messages that would be accepted are ignored until the Processor Doorbell Critical exception is cleared by taking the interrupt or cleared by executing a *msgclr* with a message type of DBELL\_CRIT on the receiving processor. The temporal relationship between when a DBELL\_CRIT message is sent and when it is received in a given processor is not defined. #### 9.3 Processor Control Instructions **msgsnd** and **msgclr** instructions are provided for sending and clearing messages to processors and other devices in the coherence domain. These instructions are privileged. In the instruction descriptions the statement "this instructions is treated as a *Store*" means that the instruction is treated as a *Store* with respect to the storage access ordering mechanism caused by memory barriers in Section 1.7.1 of Book II. | Message Send | | | | X-f | orm | |--------------|-------|----|----|-----|-----| | msgsnd | RB | | | | | | 31 | 6 /// | 11 | RB | 206 | 31 | ``` \begin{array}{l} \texttt{msgtype} \leftarrow \texttt{GPR} \, (\texttt{RB})_{\,32:36} \\ \texttt{payload} \leftarrow \texttt{GPR} \, (\texttt{RB})_{\,37:63} \\ \texttt{send} \, \, \texttt{msg} \, \, \texttt{to} \, \, \texttt{choherence} \, \, \texttt{domain} \, (\texttt{msgtype}, \, \, \texttt{payload}) \end{array} ``` $\emph{msgsnd}$ sends a message to all devices in the coherence domain. The message contains a type and a payload. The message type $(\emph{msgtype})$ is defined by the contents of RB<sub>32:36</sub> and the message payload is defined by the contents of RB<sub>37:63</sub>. Message delivery is reliable and guaranteed. Each device may perform specific actions based on the message type and payload or may ignore messages. Consult the implementation user's manual for specific actions taken based on message type and payload. For processors, actions taken on receipt of a message are defined in Section 9.2.1. For storage access ordering, *msgsnd* is treated as a *Store* with respect to memory barriers. This instruction is privileged. #### **Special Registers Altered:** None | Message Clear | | | | X-fo | orm | |---------------|-----|-----|----|------|-----| | msgclr | RB | | | | | | 31 | /// | /// | RB | 238 | / | ``` msgtype ← GPR(RB)<sub>32:36</sub> clear received message(msgtype) ``` **msgclr** clears a message of *msgtype* previously accepted by the processor executing the **msgclr**. *msgtype* is defined by the contents of RB<sub>32:36</sub>. A message is said to be cleared when a pending exception generated by an accepted message has not yet taken its associated interrupt. If a pending exception exists for *msgtype* that exception is cleared at the completion of the *msgclr* instruction. For processors, the types of messages that can be cleared are defined in Section 9.2.1. This instruction is privileged. #### **Special Registers Altered:** None #### Programming Note - Execution of a **msgclr** instruction that clears a pending exception when the associated interrupt is masked because the interrupt enable (MSR $_{\text{EE}}$ or MSR $_{\text{CE}}$ ) is not set to 1 will always clear the pending exception (and thus the interrupt will not occur) if a subsequent instruction causes MSR $_{\text{EE}}$ or MSR $_{\text{CE}}$ to be set to 1. ## **Chapter 10. Synchronization Requirements for Context Alterations** Changing the contents of certain System Registers, the contents of TLB entries, or the contents of other system resources that control the context in which a program executes can have the side effect of altering the context in which data addresses and instruction addresses are interpreted, and in which instructions are executed and data accesses are performed. For example, changing certain bits in the MSR has the side effect of changing how instruction addresses are calculated. These side effects need not occur in program order, and therefore may require explicit synchronization by software. (Program order is defined in Book II.) An instruction that alters the context in which data addresses or instruction addresses are interpreted, or in which instructions are executed or data accesses are performed, is called a *context-altering instruction*. This chapter covers all the context-altering instructions. The software synchronization required for them is shown in Table 5 (for data access) and Table 4 (for instruction fetch and execution). The notation "CSI" in the tables means any context synchronizing instruction (e.g., sc, isync, rfi, rfci, rfmci, or rfdi [Category: Embedded. Enhanced Debug]). A context synchronizing interrupt (i.e., any interrupt except non-recoverable System Reset or non-recoverable Machine Check) can be used instead of a context synchronizing instruction. If it is, phrases like "the synchronizing instruction", below, should be interpreted as meaning the instruction at which the interrupt occurs. If no software synchronization is required before (after) a context-altering instruction, "the synchronizing instruction before (after) the context-altering instruction" should be interpreted as meaning the context-altering instruction itself. The synchronizing instruction before the context-altering instruction ensures that all instructions up to and including that synchronizing instruction are fetched and executed in the context that existed before the alteration. The synchronizing instruction after the context-altering instruction ensures that all instructions after that synchronizing instruction are fetched and executed in the context established by the alteration. Instructions after the first synchronizing instruction, up to and including the second synchronizing instruction, may be fetched or executed in either context. If a sequence of instructions contains context-altering instructions and contains no instructions that are affected by any of the context alterations, no software synchronization is required within the sequence. #### **Programming Note** - Sometimes advantage can be taken of the fact that certain events, such as interrupts, and certain instructions that occur naturally in the program, such as an *rfi*, *rfci*, *rfmci*, or *rfdi* [Category:Embeddd.Enhanced Debug] that returns from an interrupt handler, provide the required synchronization. No software synchronization is required before or after a context-altering instruction that is also context synchronizing (e.g., *rfi*, etc.) or when altering the MSR in most cases (see the tables). No software synchronization is required before most of the other alterations shown in Table 4, because all instructions preceding the context-altering instruction are fetched and decoded before the context-altering instruction is executed (the processor must determine whether any of these preceding instructions are context synchronizing). Unless otherwise stated, the material in this chapter assumes a uniprocessor environment. | Instruction or<br>Event | Required<br>Before | Required<br>After | Notes | |---------------------------|--------------------|---------------------|-------| | interrupt | none | none | | | rfi | none | none | | | rfci | none | none | | | rfmci | none | none | | | rfdi[Category:E.ED] | none | none | | | sc | none | none | | | mtmsr (CM) | none | none | | | mtmsr (ICM) | none | CSI | | | mtmsr (UCLE) | none | none | | | mtmsr (SPV) | none | none | | | mtmsr (WE) | | | 4 | | mtmsr (CE) | none | none | 5 | | mtmsr (EE) | none | none | 5 | | mtmsr (PR) | none | CSI | | | mtmsr (FP) | none | CSI | | | mtmsr (DE) | none | CSI | | | mtmsr (ME) | none | CSI | 3 | | mtmsr (FE0) | none | CSI | | | mtmsr (FE1) | none | CSI | | | mtmsr (IS) | none | CSI | 2 | | mtspr (DEC) | none | none | 8 | | mtspr (PID) | none | CSI | 2 | | mtspr (IVPR) | none | none | | | mtspr (DBSR) | | | 6 | | mtspr | | | 6 | | (DBCR0,DBCR1) | | | | | mtspr<br>(IAC1,IAC2,IAC3, | | | 6 | | IAC4) | | | | | mtspr (IVORi) | none | none | | | mtspr (TSR) | none | none | 8 | | mtspr (TCR) | none | none | 8 | | tlbivax | none | CSI, or | 1,7 | | | | CSI and <b>sync</b> | ',' | | tlbwe | none | CSI, or | 1,7 | | | | CSI and <b>sync</b> | | | wrtee | none | none | 5 | | wrteei | none | none | 5 | Table 4: Synchronization requirements for instruction fetch and/or execution | Instruction or | Required | Required | Notes | |---------------------|----------|--------------------------------|-------| | Event | Before | After | | | interrupt | none | none | | | rfi | none | none | | | rfci | none | none | | | rfmci | none | none | | | rfdi[Category:E.ED] | none | none | | | sc | none | none | | | mtmsr (CM) | none | CSI | | | mtmsr (ICM) | none | none | | | mtmsr (PR) | none | CSI | | | mtmsr (ME) | none | CSI | 3 | | mtmsr (DS) | none | CSI | | | mtspr (PID) | CSI | CSI | | | mtspr (DBSR) | | | 6 | | mtspr | | | 6 | | (DBCR0,DBCR2) | | | | | mtspr | | | 6 | | (DAC1,DAC2, | | | | | DVC1,DVC2) | | | | | tlbivax | CSI | CSI, or CSI<br>and <b>sync</b> | 1,7 | | tlbwe | CSI | CSI, or CSI<br>and <b>sync</b> | 1,7 | Table 5: Synchronization requirements for data access #### Notes: - There are additional software synchronization requirements for this instruction in multiprocessor environments (e.g., it may be necessary to invalidate one or more TLB entries on all processors in the multiprocessor system and to be able to determine that the invalidations have completed and that all side effects of the invalidations have taken effect); it is also necessary to execute a *tlbsync* instruction. - The alteration must not cause an implicit branch in real address space. Thus the real address of the context-altering instruction and of each subsequent instruction, up to and including the next context synchronizing instruction, must be independent of whether the alteration has taken effect. - A context synchronizing instruction is required after altering MSR<sub>ME</sub> to ensure that the alteration takes effect for subsequent Machine Check interrupts, which may not be recoverable and therefore may not be context synchronizing. - Synchronization requirements for changing the Wait State Enable are implementation-dependent,. - The effect of changing MSR<sub>EE</sub> or MSR<sub>CE</sub> is immediate. If an *mtmsr*, *wrtee*, or *wrteei* instruction sets MSR<sub>EE</sub> to '0', an External Input, DEC or FIT interrupt does not occur after the instruction is executed. If an *mtmsr*, *wrtee*, or *wrteei* instruction changes MSR<sub>EE</sub> from '0' to '1' when an External Input, Decrementer, Fixed-Interval Timer, or higher priority enabled exception exists, the corresponding interrupt occurs immediately after the *mtmsr*, *wrtee*, or *wrteei* is executed, and before the next instruction is executed in the program that set MSR<sub>EF</sub> to '1'. If an *mtmsr* instruction sets MSR<sub>CE</sub> to '0', a Critical Input or Watchdog Timer interrupt does not occur after the instruction is executed. If an $\it{mtmsr}$ instruction changes MSR<sub>CE</sub> from '0' to '1' when a Critical Input, Watchdog Timer or higher priority enabled exception exists, the corresponding interrupt occurs immediately after the $\it{mtmsr}$ is executed, and before the next instruction is executed in the program that set MSR<sub>CE</sub> to '1'. - Synchronization requirements for changing any of the Debug Facility Registers are implementationdependent. - 7. For data accesses, the context synchronizing instruction before the *tlbwe* or *tlbivax* instruction ensures that all storage accesses due to preceding instructions have completed to a point at which they have reported all exceptions they will cause. The context synchronizing instruction after the *tlbwe* or *tlbivax* ensures that subsequent storage accesses (data and instruction) will use the updated value in the TLB entry(s) being affected. It does not ensure that all storage accesses previously translated by the TLB entry(s) being updated have completed with respect to storage; if these completions must be ensured, the *tlbwe* or *tlbivax* must be followed by an *sync* instruction as well as by a context synchronizing instruction. #### **Programming Note** - The following sequence illustrates why it is necessary, for data accesses, to ensure that all storage accesses due to instructions before the *tlbwe* or *tlbivax* have completed to a point at which they have reported all exceptions they will cause. Assume that valid TLB entries exist for the target storage location when the sequence starts. - A program issues a load or store to a page. - The same program executes a *tlbwe* or *tlbivax* that invalidates the corresponding TLB entry. - The *Load* or *Store* instruction finally executes, and gets a TLB Miss exception. - The TLB Miss exception is semantically incorrect. In order to prevent it, a context synchronizing instruction must be executed between steps 1 and 2. - 8. The elapsed time between the Decrementer reaching zero, or the transition of the selected Time Base bit for the Fixed-Interval Timer or the Watchdog Timer, and the signalling of the Decrementer, Fixed-Interval Timer or the Watchdog Timer exception is not defined. ### Appendix A. Implementation-Dependent Instructions This appendix documents architectural resources that are allocated for specific implementation-sensitive functions which have scope-limited utility. Implementa- tions may exercise reasonable flexibility in implementing these functions, but that flexibility should be limited to that allowed in this appendix. ## A.1 Embedded Cache Initialization [Category: Embedded.Cache Initialization] #### Data Cache Invalidate X-form #### Instruction Cache Invalidate X-form dci CT | 31 | / | CT | /// | /// | 454 | / | |----|---|----|-----|-----|-----|----| | 0 | 6 | 7 | 11 | 16 | 21 | 31 | If CT is not supported by the implementation, this instruction designates the primary data cache as the target data cache. If CT is supported by the implementation, let CT designate either the primary data cache or another level of the data cache hierarchy, as specified in Book II Section 3.2, as the target data cache. The contents of the target data cache of the processor executing the *dci* instruction are invalidated. Software must place a **sync** instruction before the **dci** to guarantee all previous data storage accesses complete before the **dci** is performed. Software must place a **sync** instruction after the **dci** to guarantee that the **dci** completes before any subsequent data storage accesses are performed. This instruction is privileged. #### **Special Registers Altered:** None #### **Extended Mnemonics:** Extended mnemonic for Data Cache Invalidate | Extended: | Equivalent to: | |-----------|----------------| | decei | dci 0 | If CT is not supported by the implementation, this instruction designates the primary instruction cache as the target instruction cache. If CT is supported by the implementation, let CT designate either the primary instruction cache or another level of the instruction cache hierarchy, as specified in Book II Section 3.2, as the target instruction cache. The contents of the target instruction cache of the processor executing the *ici* instruction are invalidated. Software must place a **sync** instruction before the **ici** to guarantee all previous instruction storage accesses complete before the **ici** is performed. Software must place an *isync* instruction after the *ici* to invalidate any instructions that may have already been fetched from the previous contents of the instruction cache after the *isync*. This instruction is privileged. #### **Special Registers Altered:** None #### **Extended Mnemonics:** Extended mnemonic for Instruction Cache Invalidate **Extended:** Equivalent to: iccci ici 0 31 ## A.2 Embedded Cache Debug Facility [Category: Embedded.Cache Debug] #### A.2.1 Embedded Cache Debug Registers ## A.2.1.1 Data Cache Debug Tag Register High The Data Cache Debug Tag Register High (DCDB-TRH) is a 32-bit Special Purpose Register. The Data Cache Debug Tag Register High is read using *mfspr* and is set by *dcread*. Figure 25. Data Cache Debug Tag Register High #### **Programming Note** An example implementation of DCDBTRH could have the following content and format. | nave | the following content and format. | |--------|--------------------------------------------------------------------------------------------------------------------------| | Bit(s) | Description | | 32:55 | <b>Tag Real Address</b> (TRA) Bits 0:23 of the lower 32 bits of the 36-bit real address associated with this cache block | | 56 | Valid (V) The valid indicator for the cache block (1 indicates valid) | | 57:59 | Reserved | | 60:63 | <b>Tag Extended Real Address</b> (TERA)<br>Upper 4 bits of the 36-bit real address<br>associated with this cache block | | Imple | mentations may support different content and | format based on their cache implementation. ## A.2.1.2 Data Cache Debug Tag Register Low The Data Cache Debug Tag Register Low (DCDBTRL) is a 32-bit Special Purpose Register. The Data Cache Debug Tag Register Low is read using *mfspr* and is set by *dcread*. | | DCDBTRL | | |----|---------|----| | 32 | | 63 | Figure 26. Data Cache Debug Tag Register Low | | ъ. | | ~. | | | | : | _ | N. | -+- | | |---|----|---|----|---|---|---|---|---|----|-----|--| | • | М | O | uı | a | Ш | Ш | m | u | IN | ote | | An example implementation of DCDBTRL could have the following content and format. | Bit(s) | Description | |--------|----------------------------------------------------------------------------------------------------------------------| | 32:44 | Reserved (TRA) | | 45 | U bit parity (UPAR) | | 46:47 | Tag parity (TPAR) | | 48:51 | Data parity (DPAR) | | 52:55 | Modified (dirty) parity (MPAR) | | 56:59 | <b>Dirty Indicators</b> (D)<br>The "dirty" (modified) indicators for each of the four doublewords in the cache block | | 60 | U0 Storage Attribute (U0) The U0 storage attribute for the page associated with this cache block | | 61 | U1 Storage Attribute (U1) The U1 storage attribute for the page associated with this cache block | | 62 | U2 Storage Attribute (U2) The U2 storage attribute for the page associated with this cache block | | 63 | U3 Storage Attribute (U3) The U3 storage attribute for the page associated with this cache block | Implementations may support different content and format based on their cache implementation. #### A.2.1.3 Instruction Cache Debug Data Register The Instruction Cache Debug Data Register (ICDBDR) is a read-only 32-bit Special Purpose Register. The Instruction Cache Debug Data Register can be read using *mfspr* and is set by *icread*. Figure 27. Instruction Cache Debug Data Register #### A.2.1.4 Instruction Cache Debug Tag Register High The Instruction Cache Debug Tag Register High (ICDBTRH) is a 32-bit Special Purpose Register. The Instruction Cache Debug Tag Register High is read using *mfspr* and is set by *icread*. Figure 28. Instruction Cache Debug Tag Register High #### Programming Note - | | ample implementation of ICDBTRH could e following content and format. | |------|---------------------------------------------------------------------------------------------------------------| | Bit( | Description | | 32:5 | <b>Tag Effective Address</b> (TEA) Bits 0:23 of the 32-bit effective address associated with this cache block | | 56 | Valid (V) The valid indicator for the cache block (1 indicates valid) | | 57:5 | Tag parity (TPAR) | Tag parity (TPAR) 59 Instruction Data parity (DPAR) 60:63 Reserved Implementations may support different content and format based on their cache implementation. #### A.2.1.5 Instruction Cache Debug Tag **Register Low** The Instruction Cache Debug Tag Register Low (ICDB-TRL) is a 32-bit Special Purpose Register. The Instruction Cache Debug Tag Register Low is read using mfspr and is set by icread. Figure 29. Instruction Cache Debug Tag Register #### Programming Note - An example implementation of ICDBTRL could have the following content and format. | Bit(s) | Description | |--------|------------------------------------------------------------------------------------------------------------------| | 32:53 | Reserved | | 54 | <b>Translation Space</b> (TS) The address space portion of the virtual address associated with this cache block. | | 55 | <b>Translation ID Disable</b> (TD) TID Disable field for the memory page associated with this cache block | | 56:63 | Translation ID (TID) TID field portion of the virtual address associated with this cache block | Other implementations may support different content and format based on their cache implementation. #### A.2.2 Embedded Cache Debug Instructions #### Data Cache Read X-form dcread RT,RA,RB | 31 | RT | RA | RB | 486 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | #### [Alternative Encoding] | 31 | RT | RA | RB | 326 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) C \leftarrow log<sub>2</sub>(cache size) B \leftarrow log<sub>2</sub>(cache block size) IDX\leftarrow EA<sub>64-C:63-B</sub> WD \leftarrow EA<sub>64-B:61</sub> RT<sub>0:31</sub>\leftarrow undefined RT<sub>32:63</sub>\leftarrow (data cache data) [IDX] WDx32: WDx32+31 DCDBTRH\leftarrow (data cache tag high) [IDX] DCDBTRL\leftarrow (data cache tag low) [IDX] ``` Let the effective address (EA) be the sum of the contents of register RA, or 0 if RA is equal to 0, and the contents of register RB. ``` Let C = log_2(cache size in bytes). Let B = log_2(cache block size in bytes). ``` EA<sub>64-C:63-B</sub> selects one of the 2<sup>C-B</sup> data cache blocks. $\mathsf{EA}_{\mathsf{64-B:61}}$ selects one of the data words in the selected data cache block. The selected word in the selected data cache block is placed into register RT. The contents of the data cache directory entry associated with the selected data cache block are placed into DCDBTRH and DCDBTRL (see Figure 25 and Figure 26). **dcread** requires software to guarantee execution synchronization before subsequent **mfspr** instructions can read the results of the **dcread** instruction into GPRs. In order to guarantee that the **mfspr** instructions obtain the results of the **dcread** instruction, a sequence such as the following must be used: | msync | <pre># ensure that all previous # cache operations have # completed</pre> | |--------|----------------------------------------------------------------------------| | dcread | <pre>regT,regA,regB# read cache information;</pre> | | isync | <pre># ensure dcread completes # before attempting to # read results</pre> | | mfspr | <pre>regD,dcdbtrh # move high portion of tag # into GPR D</pre> | | mfspr | <pre>regE,dcdbtrl # move low portion of tag # into GPR E</pre> | This instruction is privileged. #### **Special Registers Altered:** DCDBTRH DCDBTRL #### Programming Note - **dcread** can be used by a debug tool to determine the contents of the data cache, without knowing the specific addresses of the blocks which are currently contained within the cache. #### **Programming Note** Execution of *dcread* before the data cache has completed all cache operations associated with previously executed instructions (such as block fills and block flushes) is undefined. #### Instruction Cache Read #### X-form icread RA,RB | 31 | /// | RA | RB | 998 | / | |----|-----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | Let the effective address (EA) be the sum of the contents of register RA, or 0 if RA is equal to 0, and the contents of register RB. ``` Let C = log_2 (cache size in bytes). Let B = log_2 (cache block size in bytes). ``` EA<sub>64-C:63-B</sub> selects one of the 2<sup>C-B</sup> instruction cache blocks. EA<sub>64-B:61</sub> selects one of the data words in the selected instruction cache block. The selected word in the selected instruction cache block is placed into ICDBDR. The contents of the instruction cache directory entry associated with the selected cache block are placed into ICDBTRH and ICDBTRL (see Figure 28 and Figure 29). *icread* requires software to guarantee execution synchronization before subsequent *mfspr* instructions can read the results of the *icread* instruction into GPRs. In order to guarantee that the *mfspr* instructions obtain the results of the *icread* instruction, a sequence such as the following must be used: ``` icread reqA, reqB # read cache information isync # ensure icread completes # before attempting to # read results # move instruction mficdbdr regC # information into GPR C mficdbtrh regD # move high portion of # tag into GPR D mficdbtrl regE # move low portion of tag # into GPR E ``` This instruction is privileged. ### **Special Registers Altered:** ICDBDR ICDBTRH ICDBTRL #### Programming Note - *icread* can be used by a debug tool to determine the contents of the instruction cache, without knowing the specific addresses of the blocks which are currently contained within the cache. ## **Appendix B. Assembler Extended Mnemonics** In order to make assembler language programs simpler to write and easier to understand, a set of extended mnemonics and symbols is provided for certain instructions. This appendix defines extended mnemonics and symbols related to instructions defined in Book III. Assemblers should provide the extended mnemonics and symbols listed here, and may provide others. ### **B.1 Move To/From Special Purpose Register Mnemonics** This section defines extended mnemonics for the *mtspr* and *mfspr* instructions, including the Special Purpose Registers (SPRs) defined in Book I and certain privileged SPRs, and for the *Move From Time Base* instruction defined in Book II. The *mtspr* and *mfspr* instructions specify an SPR as a numeric operand; extended mnemonics are provided that represent the SPR in the mnemonic rather than requiring it to be coded as an operand. Similar extended mnemonics are provided for the *Move From* *Time Base* instruction, which specifies the portion of the Time Base as a numeric operand. Note: *mftb* serves as both a basic and an extended mnemonic. The Assembler will recognize an *mftb* mnemonic with two operands as the basic form, and an *mftb* mnemonic with one operand as the extended form. In the extended form the TBR operand is omitted and assumed to be 268 (the value that corresponds to TR) | Table 6: Extended mnemonics for moving to/from an SPR | | | | | | | |-------------------------------------------------------|-------------|----------------|---------------|----------------|--|--| | Special Durmone Degister | Move | To SPR | Move From SPR | | | | | Special Purpose Register | Extended | Equivalent to | Extended | Equivalent to | | | | Fixed-Point Exception Register | mtxer Rx | mtspr 1,Rx | mfxer Rx | mfspr Rx,1 | | | | Link Register | mtlr Rx | mtspr 8,Rx | mflr Rx | mfspr Rx,8 | | | | Count Register | mtctr Rx | mtspr 9,Rx | mfctr Rx | mfspr Rx,9 | | | | Decrementer | mtdec Rx | mtspr 22,Rx | mfdec Rx | mfspr Rx,22 | | | | Save/Restore Register 0 | mtsrr0 Rx | mtspr 26,Rx | mfsrr0 Rx | mfspr Rx,26 | | | | Save/Restore Register 1 | mtsrr1 Rx | mtspr 27,Rx | mfsrr1 Rx | mfspr Rx,27 | | | | Special Purpose Registers<br>G0 through G3 | mtsprg n,Rx | mtspr 272+n,Rx | mfsprg Rx,n | mfspr Rx,272+n | | | | Time Base [Lower] | mttbl Rx | mtspr 284,Rx | mftb Rx | mfspr Rx,268 | | | | Time Base Upper | mttbu Rx | mtspr 285,Rx | mftbu Rx | mfspr Rx,269 | | | | Processor Version Register | - | - | mfpvr Rx | mfspr Rx,287 | | | # Appendix C. Guidelines for 64-bit Implementations in 32-bit Mode and 32-bit Implementations ### C.1 Hardware Guidelines ### C.1.1 64-bit Specific Instructions The instructions in the Category: 64-Bit are considered restricted only to 64-bit processing. A 32-bit implementation need not implement the group; likewise, the 32-bit applications will not utilize any of these instructions. All other instructions shall either be supported directly by the implementation, or sufficient infrastructure will be provided to enable software emulation of the instructions. A 64-bit implementation that is executing in 32-bit mode may choose to take an Unimplemented Instruction Exception when these 64-bit specific instructions are executed. # C.1.2 Registers on 32-bit Implementations The Power ISA provides 32-bit and 64-bit registers. All 32-bit registers shall be supported as defined in the specification except the MSR. The MSR shall be supported as defined in the specification except that bits 32:33 (CM and ICM) are treated as reserved bits. Only bits 32:63 of the 64-bit registers are required to be implemented in hardware in a 32-bit implementation except for the 64-bit FPRs. Such 64-bit registers include the LR, the CTR, the XER, the 32 GPRs, SRR0 and CSRR0. Likewise, other than floating-point instructions, all instructions which are defined to return a 64-bit result shall return only bits 32:63 of the result on a 32-bit implementation. # C.1.3 Addressing on 32-bit Implementations Only bits 32:63 of the 64-bit instruction and data storage effective addresses need to be calculated and presented to main storage. Given that the only branch and data storage access instructions that are not included in Section C.1.1 are defined to prepend 32 0s to bits 32:63 of the effective address computation, a 32-bit implementation can simply bypass the prepending of the 32 0s when implementing these instructions. For Branch to Link Register and Branch to Count Register instructions, given the LR and CTR are implemented only as 32-bit registers, only concatenating 2 0s to the right of bits 32:61 of these registers is necessary to form the 32-bit branch target address. For next sequential instruction address computation, the behavior is the same as for 64-bit implementations in 32-bit mode. # C.1.4 TLB Fields on 32-bit Implementations 32-bit implementations should support bits 32:53 of the Effective Page Number (EPN) field in the TLB. This size provides support for a 32-bit effective address, which Power ISA ABIs may have come to expect to be available. 32-bit implementations may support greater than 32-bit real addresses by supporting more than bits 32:53 of the Real Page Number (RPN) field in the TLB. ### C.2 32-bit Software Guidelines #### C.2.1 32-bit Instruction Selection Any software that uses any of the instructions listed in Category: 64-Bit shall be considered 64-bit software, and correct execution cannot be guaranteed on 32-bit implementations. Generally speaking, 32-bit software should avoid using any instruction or instructions that depend on any particular setting of bits 0:31 of any 64-bit application-accessible system register, including General Purpose Registers, for producing the correct 32-bit results. Context switching may or may not preserve the upper 32 bits of application-accessible 64-bit system registers and insertion of arbitrary settings of those upper 32 bits at arbitrary times during the execution of the 32-bit application must not affect the final result. # Appendix D. Type FSL Storage Control [Category: Embedded.MMU Type FSL] # D.1 Type FSL Storage Control Overview The Embedded category provides two different memory management and TLB programming models from which an implementation may choose. Both models use the same definition of the general contents of a Translation Lookaside Buffer (TLB) entry, but differ on what methods and resources are used to manipulate the TLB itself. The programming model presented here is called Type FSL and it defines functions and structures that are visible to software. These are divided into the following areas: - The TLB itself. The TLB consists of one or more structures called TLB arrays each of which may have differing characteristics. - The address translation mechanism. - Methods and effects of changing and manipulating TLB arrays. - Configuration information available to the operating system that describes the structure and form of the TLB arrays and translation mechanism. The TLB structure and the methods of performing translations are called the Memory Management Unit (MMU). The programming model for reading and writing TLBs is software managed. Hardware page table formats are not defined and software is free to choose any form in which to hold information about address translation. Address translation is accomplished through a set of TLB arrays, PID registers, and address space identifiers from the MSR, all of which are software managed. TLB entries are used to translate both instruction and data memory references providing a unified memory management model. # D.2 Type FSL Storage Control Registers ### D.2.1 Process ID Registers (PIDn) Process ID Registers are used by system software to specify which TLB entries are used by the processor to accomplish address translation for loads, stores, and instruction fetches. Section 4.7.1.1 defines the PID register. The PID register is synonymous with PID0. In addition to PID0, 2 additional PID registers, PID1 and PID2 are defined. An implementation may choose to provide any number of PIDs up to a maximum of 3. The number of PIDs implemented is indicated by the value of MMUCFG<sub>NPIDS</sub> and the number of bits implemented in each PID register is indicated by the value of MMUCFG<sub>PIDSIZE</sub>. PID values are used to construct virtual addresses for accessing memory. Figure 30. Process ID Register (PID0-PID2) Bit Description 32:49 Reserved 50:63 Process ID Identifies the process #### Programming Note The suggested software convention for PID usage is to use PID0 to denote private mappings for a process and to use other PIDs to handle mappings that may be common to multiple processes. This method allows for processes sharing address space to also share TLB entries if the shared address space is mapped at the same virtual address in each process. # D.2.2 Translation Lookaside Buffer The MMU contains up to four TLB arrays. TLB arrays are on-chip storage areas for holding TLB entries. A TLB entry contains effective to real address mappings for loads, stores, and instruction fetches. A TLB array contains zero or more TLB entries. Each of the TLB entries has specific fields that can be accessed using the corresponding fields in the MMU Assist Registers (see Section D.2.4). Each TLB array that is implemented has a configuration register (TLB*n*CFG) associated with it describing the size and attributes of the TLB entries in that array (see Section D.2.5.2). A TLB entry contains the fields described in Section 4.7.1.2 as well as these additional fields: #### Field Description IPROT Invalidation protection. This entry is protected from all TLB invalidation mechanisms except the explicit writing of a 0 to the V bit. ACM The Alternate Coherency Mode (ACM) attribute allows an implementation to employ more than a single coherency method. This allows for a processor to participate in multiple coherency protocols. If the M attribute (Memory Coherence Required) is not set for a page (M=0), the page has no coherency associated with it and the ACM attribute is ignored. If the M attribute is set to 1 for a page (M=1), the ACM attribute is used to determine the coherence domain (or protocol) used. The values for ACM are implementation-dependent. ### **D.2.3 Address Space Identifiers** The address space identifier is called the AS bit. Thus there are two possible address spaces, 0 and 1. The value of the AS bit (see Section 4.7.2, Figure 8) is determined by the type of translation performed and from the contents of the MSR when an address is translated. If the type of translation performed is an instruction fetch, the value of the AS bit is taken from the contents of MSR $_{\rm IS}$ . If the type of translation performed is a load, store, or other data translation including target addresses of software initiated instruction fetch hints and locks the value of the AS bit is taken from the contents of MSR $_{\rm DS}$ . #### **Programming Note** While system software is free to use address space bits as it sees fit, it should be noted that on interrupt, the $MSR_{IS}$ and $MSR_{DS}$ bits are set to 0. This encourages software to use address space 0 for system software and address space 1 for user software. ### D.2.4 MMU Assist Registers The MMU Assist Registers (MAS) are used to transfer data to and from the TLB arrays. MAS registers can be read and written by software using *mfspr* and *mtspr* instructions. Execution of a *tlbre* instruction causes the TLB entry specified by MAS0<sub>TLBSEL</sub>, MAS0<sub>ESEL</sub>, and MAS2<sub>EPN</sub> to be copied to the MAS registers. Conversely, execution of a *tlbwe* instruction causes the TLB entry specified by MAS0<sub>TLBSEL</sub>, MAS0<sub>ESEL</sub>, and MAS2<sub>EPN</sub> to be written with contents of the MAS registers. MAS registers may also be updated by hardware on the occurrence of an Instruction or Data TLB Error interrupt or as the result of a *tlbsx* instruction. All MAS registers are privileged. All MAS registers with the exception of MAS7 must be implemented. MAS7 is not required to be implemented if the processor supports 32 bits or less of real address. Processors are only required to implement the necessary bits of any multi-bit field in a MAS register such that only the resources supplied by the processor are represented. Any non-implemented bits in a field should have no effect when writing and should always read as zero. For example, a processor that implements only 2 TLB arrays will likely only implement the lower-order bit of the MASO<sub>TLBSEL</sub> field. ### D.2.4.1 MAS0 Register The MAS0 register contains fields for identifying and selecting a TLB entry. Figure 31. MAS0 register Deceriation These bits are interpreted as follows: | BIT | Description | |-------|-------------------------------------------------------| | 32:33 | Reserved | | 34:35 | <b>TLB Select</b> (TLBSEL)<br>Selects TLB for access. | | | 00 TLB0<br>01 TLB1<br>10 TLB2<br>11 TLB3 | #### 36:47 Entry Select (ESEL) Identifies an entry in the selected array to be used for *tlbwe* and *tlbre*. Valid values for ESEL are from 0 to TLBnCFG<sub>ASSOC</sub> - 1. That is, ESEL selects the entry in the TLB array from the set of entries which can be used for translating addresses with the EPN specified by MAS2<sub>EPN</sub>. For fully-associative TLB arrays, ESEL ranges from 0 to TLBnCFG<sub>NENTRY</sub> - 1. ESEL is also updated on TLB error exceptions (misses), and *tlbsx* hit and miss cases. 48:51 Reserved #### 52:63 Next Victim (NV) NV is a hint to software to identify the next victim to be targeted for a TLB miss replacement operation for those TLBs that support the NV field. If the TLB selected by MASO<sub>TLBSEL</sub> does not support the NV field, then this field is undefined. The computation of this field is implementation-dependent. NV is updated on TLB error exceptions (misses), *tlbsx* hit and miss cases as shown in Table 7, and on execution of *tlbre* if the TLB array being accessed supports the NV field. When NV is updated by a supported TLB array, the NV field will always present a value that can be used in the MASO<sub>ESEL</sub> field. ### D.2.4.2 MAS1 Register The MAS1 register contains fields for selecting a TLB entry during translation. Figure 32. MAS1 register These bits are interpreted as follows: #### Bit Definition - 32 **TLB Valid Bit** (V) - 0 This TLB entry is invalid. - 1 This TLB entry is valid. #### 33 *Invalidate Protect* (IPROT) Indicates this TLB entry is protected from invalidate operations due to execution of *tlbivax*, *tlbivax* invalidations from another processor, or invalidate all operations. IPROT is only implemented for TLB entries in TLB arrays where TLB*n*CFG<sub>IPROT</sub> is indicated. - 0 Entry is not protected from invalidation - 1 Entry is protected from invalidation. #### 34:47 *Translation Identity* (TID) During translation, TID is compared with the current process IDs (PIDs) to select a TLB entry. A TID value of 0 defines an entry as global and matches with all process IDs. #### 48:50 Reserved #### 51 **Translation Space (TS)** During translation, TS is compared with AS (the IS or DS fields of the MSR depending on the type of access) to select a TLB entry. #### 52:55 Translation Size (TSIZE) TSIZE defines the page size of the TLB entry. For TLB arrays that contain fixed-size TLB entries, this field is ignored. For variable page size TLB arrays, the page size is 4<sup>TSIZE</sup> Kbytes. TSIZE must be a non-zero value between TLB*n*CFG<sub>MINSIZE</sub> and TLB*n*CFG<sub>MAXSIZE</sub>. Encodings for page size are defined in Section 4.7.1.2. 56:63 Reserved ### D.2.4.3 MAS2 Register The MAS2 register is a 64-bit register. The register contains fields for specifying the effective page address and the storage attributes for a TLB entry. Figure 33. MAS2 register These bits are interpreted as follows: #### Bit Description #### 0:51 **Effective Page Number** (EPN) Depending on page size, only the bits associated with a page boundary are valid. Bits that represent offsets within a page are ignored and should be zero. EPN<sub>0:31</sub> are accessible only in 64-bit implementations as the upper 32 bits of the effective address of the page. #### 52:55 Reserved #### 56:57 Alternate Coherency Mode (ACM) The ACM attribute allows an implementation to employ more than a single coherency method. This allows for a processor to participate in multiple coherency protocols. If the M attribute (Memory Coherence Required) is not set for a page (M=0), the page has no coherency associated with it and the ACM attribute is ignored. If the M attribute is set to 1 for a page (M=1), the ACM attribute is used to determine the coherence domain (or protocol) used. The values for ACM are implementation-dependent. #### **Programming Note** Some previous implementations may have a storage bit in the bit 57 position labeled as X0. #### 58 **VLE Mode** (VLE) [Category: VLE] Identifies pages which contain instructions to be decoded as VLE instructions (see Chapter 1 of Book VLE). Setting the VLE attribute to 1 and setting the E attribute to 1 is considered a programming error and an attempt to fetch instructions from a page so marked produces an Instruction Storage Interrupt Byte Ordering Exception and sets ESR<sub>BO</sub>. - 0 Instructions fetched from the page are decoded and executed as non-VLE instructions. - Instructions fetched from the page are decoded and executed as VLE instructions. #### Programming Note - Some previous implementations may have a storage bit in this position labeled as X1. Software should not use the presence of this bit (the ability to set to 1 and read a 1) to determine if the implementation supports the VLE. #### 59 Write Through (W) - This page is not Write-Through Required storage. - This page is Write-Through Required storage. #### 60 Caching Inhibited (I) - This page is not Caching Inhibited storage. - 1 This page is Caching Inhibited storage #### 61 Memory Coherence Required (M) - O This page is not Memory Coherence Required storage. - 1 This page is Memory Coherence Required storage. #### 62 Guarded (G) - 0 This page is not Guarded storage. - 1 This page is Guarded storage. #### 63 Endianness (E) - The page is accessed in Big-Endian byte order. - The page is accessed in Little-Endian byte order. #### D.2.4.4 MAS3 Register The MAS3 register contains fields for specifying the real page address, user defined attributes, and the permission attributes for a TLB entry. #### Figure 34. MAS3 register These bits are interpreted as follows: #### Bit Description ## 32:51 **Real Page Number (bits 32:51)** (RPNL or RPN<sub>32:51</sub>) Depending on page size, only the bits associated with a page boundary are valid. Bits that represent offsets within a page are ignored and should be zero. RPN<sub>0:31</sub> are accessed through MAS7. 52:53 Reserved #### 54:57 **User Bits** (U0:U3) These bits are associated with a TLB entry and can be used by system software. For example, these bits may be used to hold information useful to a page scanning algorithm or be used to mark more abstract page attributes. 58:63 **Permission Bits** (UX, SX, UW, SW, UR, SR). User and supervisor execute, write, and read permission bits. The effect of the Permission Bits are defined in Section 4.7.1.2. ### D.2.4.5 MAS4 Register The MAS4 register contains fields for specifying default information to be pre-loaded on certain MMU related exceptions. See Section D.4.5 for more information. #### Figure 35. MAS4 register Description **Bit** The MAS4 fields are described below. | | 2000 | | |-------|--------------------------------------------------------------------------------------------------------------------------|----| | 32:33 | Reserved | | | 34:35 | <b>TLBSEL Default Value</b> (TLBSELD) Specifies the default value loaded MAS0 <sub>TLBSEL</sub> on a TLB miss exception. | in | | 36:43 | Reserved | | #### 44:47 **TID Default Selection Value (TIDSELD)** Specifies which of the current PID registers should be used to load the MAS1<sub>TID</sub> field on a TLB miss exception. The PID registers are addressed as follows: 0000 = PID0 (PID) 0001 = PID1 0010 = PID2 A value that references a non-implemented PID register causes a value of 0 to be placed in ${\rm MAS1}_{\rm TID}$ . 48:51 Reserved ## 52:55 **Default TSIZE Value (TSIZED)** Specifies the default value loaded into MAS1<sub>TSIZE</sub> on a TLB miss exception. #### 56:57 **Default ACM Value** (ACMD) Specifies the default value loaded into $\ensuremath{\mathsf{MAS2}_{\mathsf{ACM}}}$ on a TLB miss exception. #### 58 **Default VLE Value** (VLED) Specifies the default value loaded into MAS2<sub>VLE</sub> on a TLB miss exception. #### 59 **Default W Value** (WD) Specifies the default value loaded into ${\sf MAS2}_{\sf W}$ on a TLB miss exception. #### 60 Default I Value (ID) Specifies the default value loaded into MAS21 on a TLB miss exception. #### 61 **Default M Value** (MD) Specifies the default value loaded into MAS2<sub>M</sub> on a TLB miss exception. #### 62 Default G Value (GD) Specifies the default value loaded into MAS2G on a TLB miss exception. #### 63 Default E Value (ED) Specifies the default value loaded into MAS2<sub>F</sub> on a TLB miss exception. #### D.2.4.6 MAS6 Register The MAS6 register contains fields for specifying PID and AS values to be used when searching TLB entries with the tlbsx instruction. | | MAS6 | | |----|------|----| | 32 | | 63 | Figure 36. MAS6 register These bits are interpreted as follows: | Bit | Description | |-------|-------------| | 32:33 | Reserved | Reserved #### 34:47 Search PID0 (SPID0) Specifies the value of PID0 used when searching the TLB during execution of tlbsx. This field is valid for only the number of bits implemented for PID registers. #### 48:62 Reserved #### 63 Address Space Value for Searches (SAS) Specifies the value of AS used when searching the TLB during execution of tlbsx. #### D.2.4.7 MAS7 Register The MAS7 register contains the high order address bits of the RPN for implementations that support more than 32 bits of physical address. Implementations that do not support more than 32 bits of physical addressing are not required to implement MAS7. Figure 37. MAS7 register These bits are interpreted as follows: #### Bit Description #### 32:63 Real Page Number (bits 0:31) (RPNU or $RPN_{0:31}$ ) RPN<sub>32:51</sub> are accessed through MAS3. | Table 7: MAS Reg | ister Update Summary | | | | |--------------------------------------|------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------| | MAC Field | | Value Load | ed on Event | | | MAS Field<br>Updated | Data or Instruction TLB Error Interrupt | <i>tlbsx</i> hit | tlbsx miss | tlbre | | MAS0 <sub>TLBSEL</sub> | MAS4 <sub>TLBSELD</sub> | TLB array that hit | MAS4 <sub>TLBSELD</sub> | _ | | MAS0 <sub>ESEL</sub> | if TLB array [MAS4 <sub>TLBSELD</sub> ] supports next victim then hardware hint, else undefined | Number of entry that hit | <pre>if TLB array [MAS4<sub>TLBSELD</sub>] sup- ports next victim then hardware hint, else undefined</pre> | _ | | MAS0 <sub>NV</sub> | if TLB array [MAS4 <sub>TLBSELD</sub> ] supports next victim then next hardware hint, else undefined | <pre>if TLB array [MAS4<sub>TLBSELD</sub>] sup- ports next victim then hardware hint, else undefined</pre> | <pre>if TLB array [MAS4<sub>TLBSELD</sub>] sup- ports next victim then next hardware hint, else undefined</pre> | if TLB array [MAS4 <sub>TLBSELD</sub> ] supports next victim then hardware hint, else undefined | | MAS1 <sub>V</sub> | 1 | 1 | 0 | TLB <sub>V</sub> | | MAS1 <sub>IPROT</sub> | 0 | TLB <sub>IPROT</sub> | 0 | TLB <sub>IPROT</sub> | | MAS1 <sub>TID</sub> | if PID[MAS4 <sub>TIDSELD</sub> ]<br>implemented then<br>PID[MAS4 <sub>TIDSELD</sub> ]<br>else 0 | TLB <sub>TID</sub> | MAS6 <sub>SPID0</sub> | TLB <sub>TID</sub> | | MAS1 <sub>TS</sub> | MSR <sub>IS</sub> or MSR <sub>DS</sub> | TLB <sub>TS</sub> | MAS6 <sub>SAS</sub> | TLB <sub>TS</sub> | | MAS1 <sub>TSIZE</sub> | MAS4 <sub>TSIZED</sub> | TLB <sub>SIZE</sub> | MAS4 <sub>TSIZED</sub> | TLB <sub>SIZE</sub> | | MAS2 <sub>EPN</sub> | EA <sub>0:51</sub> <sup>1</sup> | TLB <sub>EPN</sub> | undefined | TLB <sub>EPN</sub> | | MAS2 <sub>ACM</sub> | MAS4 <sub>ACMD</sub> | TLB <sub>ACM</sub> | MAS4 <sub>ACMD</sub> | TLB <sub>ACM</sub> | | MAS2 <sub>VLE</sub> | MAS4 <sub>VLED</sub> | TLB <sub>VLE</sub> | MAS4 <sub>VLED</sub> | TLB <sub>VLE</sub> | | MAS2 <sub>W</sub> | MAS4 <sub>WD</sub> | TLB <sub>W</sub> | MAS4 <sub>WD</sub> | $TLB_W$ | | MAS2 <sub>I</sub> | MAS4 <sub>ID</sub> | TLB <sub>I</sub> | MAS4 <sub>ID</sub> | TLB <sub>I</sub> | | MAS2 <sub>M</sub> | MAS4 <sub>MD</sub> | TLB <sub>M</sub> | MAS4 <sub>MD</sub> | TLB <sub>M</sub> | | MAS2 <sub>G</sub> | MAS4 <sub>GD</sub> | TLB <sub>G</sub> | MAS4 <sub>GD</sub> | TLB <sub>G</sub> | | MAS2 <sub>E</sub> | MAS4 <sub>ED</sub> | TLB <sub>E</sub> | MAS4 <sub>ED</sub> | TLB <sub>E</sub> | | MAS3 <sub>RPN</sub> | 0 | TLB <sub>RPN</sub><br>(bits 32:51) | 0 | TLB <sub>RPN</sub> (bits 32:51) | | MAS3 <sub>U0 U1 U2 U3</sub> | 0 | TLB <sub>U0 U1 U2 U3</sub> | 0 | TLB <sub>U0 U1 U2 U3</sub> | | MAS3 <sub>UX SX UW</sub><br>SW UR SR | 0 | TLB <sub>UX SX UW SW UR SR</sub> | 0 | TLB <sub>UX SX UW SW UR SR</sub> | | MAS4 | _ | _ | <del>-</del> | _ | | MAS6 <sub>SPID0</sub> | PID0 | _ | _ | _ | | MAS6 <sub>SAS</sub> | MSR <sub>IS</sub> or MSR <sub>DS</sub> | _ | _ | _ | | MAS7 <sub>RPN</sub> | 0 | TLB <sub>RPN</sub> (bits 0:31) | 0 | TLB <sub>RPN</sub> (bits 0:31) | <sup>1.</sup> If MSR $_{\text{CM}}\!\!=\!\!0$ (32-bit mode) at the time of the exception, EPN $_{0:31}$ are set to 0. # D.2.5 MMU Configuration and Control Registers ## D.2.5.1 MMU Configuration Register (MMUCFG) The read-only MMUCFG register is described as follows. #### Figure 38. MMU Configuration Register These bits are interpreted as follows: | Bit | Description | |-------|-----------------------------------------------------------------------------------------------------------| | 32:39 | Reserved | | 40:46 | <b>Real Address Size</b> (RASIZE)<br>Number of bits in a real address supported by<br>the implementation. | | 47:48 | Reserved | | 49:52 | <b>Number of PID Registers</b> (NPIDS) Indicates the number of PID registers provided by the processor. | | 53:57 | PID Register Size (PIDSIZE) | The value of PIDSIZE is one less than the number of bits implemented for each of the PID registers implemented by the processor. The processor implements only the least significant PIDSIZE+1 bits in the PID registers. The maximum number of PID register bits that may be implemented is 14. 58:59 Reserved #### 60:61 Number of TLBs (NTLBS) The value of NTLBS is one less than the number of software-accessible TLB structures that are implemented by the processor. NTLBS is set to one less than the number of TLB structures so that its value matches the maximum value of MAS0\_TLBSEL. 00 1 TLB 01 2 TLBs 10 3 TLBs 11 4 TLBs 62:63 MMU Architecture Version Number (MAVN) Indicates the version number of the architecture of the MMU implemented by the processor. 00 Version 1.0 01 Reserved 10 Reserved 11 Reserved ## D.2.5.2 TLB Configuration Registers (TLB*n*CFG) The TLBnCFG read-only registers provide information about each specific TLB that is implemented. There is one TLBnCFG register implemented for each TLB array that is implemented. TLB0CFG corresponds to TLB0, TLB1CFG corresponds to TLB1, etc. TLB*n*CFG provides configuration information for the corresponding TLB array. #### Figure 39. TLB Configuration Register These bits are interpreted as follows: #### Bit Description #### 32:39 **Associativity** (ASSOC) Total number of entries in a TLB array which can be used for translating addresses with a given EPN. This number is referred to as the associativity level of the TLB array. A value equal to NENTRY or 0 indicates the array is fully-associative. #### 40:43 *Minimum Page Size* (MINSIZE) Minimum page size of TLB array. Page size encoding is defined in Section 4.7.1.2. #### 44:47 Maximum Page Size (MAXSIZE) Maximum page size of TLB array. Page size encoding is defined in Section 4.7.1.2. #### 48 *Invalidate Protection* (IPROT) Invalidate protect capability of TLB array. - Indicates invalidate protection capability not supported. - Indicates invalidate protection capability supported. ### 49 **Page Size Availability** (AVAIL) Page size availability of TLB array. - Fixed selectable page size from MINSIZE to MAXSIZE (all TLB entries are the same size). - Variable page size from MINSIZE to MAX-SIZE (each TLB entry can be sized separately). 50:51 Reserved 52:63 **Number of Entries** (NENTRY) Number of entries in TLB array. ## D.2.5.3 MMU Control and Status Register (MMUCSR0) The MMUCSR0 register is used for general control of the MMU including invalidation of the TLB arrays and page sizes for programmable fixed size arrays. For TLB arrays that have programmable fixed sizes, the TLB*n\_*PS fields allow software to specify the page size. Figure 40. MMU Control and Status Register 0 These bits are interpreted as follows: ### Bit Description 32:40 Reserved #### 41:56 TLBn Array Page Size A 4-bit field specifies the page size for TLBn array. Page size encoding is defined in Section 4.7.1.2. For each TLB array n, the field is implemented only if TLBnCFG $_{AVAIL}$ =0 and TLBnCFG $_{MINSIZE}$ $\neq$ TLBnCFG $_{MAXSIZE}$ . If the value of TLBn\_PS is not between TLBnCFG $_{MINSIZE}$ and TLBnCFG $_{MINSIZE}$ the page size is set to TLBnCFG $_{MINSIZE}$ . - 41:44 **TLB3 Array Page Size** (TLB3\_PS) Page size of the TLB3 array. - 45:48 **TLB2 Array Page Size** (TLB2\_PS) Page size of the TLB2 array. - 49:52 **TLB1 Array Page Size** (TLB1\_PS) Page size of the TLB1 array. - 53:56 **TLB0 Array Page Size** (TLB0\_PS) Page size of the TLB0 array. #### 57:62 TLBn Invalidate All TLB invalidate all bit for the TLB*n* array. - 0 If this bit reads as a 1, an invalidate all operation for the TLBn array is in progress. Hardware will set this bit to 0 when the invalidate all operation is completed. Writing a 0 to this bit during an invalidate all operation is ignored. - 1 TLBn invalidation operation. Hardware initiates a TLBn invalidate all operation. When this operation is complete, this bit is cleared. Writing a 1 during an invalidate all operation produces an undefined result. If the TLB array supports IPROT, entries that have IPROT set will not be invalidated. - 57 **TLB2 Invalidate AII** (TLB2\_FI) TLB invalidate all bit for the TLB2 array. - 58 **TLB3 Invalidate AII** (TLB3\_FI) TLB invalidate all bit for the TLB3 array. - 59:60 Reserved - 61 TLB0 Invalidate AII (TLB0\_FI) TLB invalidate all bit for the TLB0 array. - 62 **TLB1 Invalidate AII** (TLB1\_FI) TLB invalidate all bit for the TLB1 array. - 63 Reserved #### **Programming Note** Changing the fixed page size of an entire array must be done with great care. If any entries in the array are valid, changing the page size may cause those entries to overlap, creating a serious programming error. It is suggested that the entire TLB array be invalidated and any entries with IPROT have their V bits set to zero before changing page size. # D.3 Page Identification and Address Translation Page Identification occurs as described in Section 4.7.2 except the matching TLB entry may be identified using more than one PID register. Accesses that would result in multiple matching entries are not allowed and are considered a serious programming error by system software and the results of such a translation are undefined. A PID register containing a 0 value (or the same value as another PID register) will form a non unique match and is permissible. Once a match occurs the matching TLB entry is used for access control, storage attributes, and effective to real address translation. ### **D.4 TLB Management** ### **D.4.1 Reading TLB Entries** TLB entries can be read by executing *tlbre* instructions. At the time of *tlbre* execution, the MAS registers are used to index a specific TLB entry and upon completion of the *tlbre* instruction, the MAS registers will contain the contents of the indexed TLB entry. Specifying invalid values for $MASO_{TLBSEL}$ and $MASO_{ESEL}$ produce undefined results. ## **D.4.2 Writing TLB Entries** TLB entries can be written by executing *tlbwe* instructions. At the time of *tlbwe* execution, the MAS registers are used to index a specific TLB entry and contain the contents to be written to the indexed TLB entry. Upon completion of the *tlbwe* instruction, the contents of the MAS registers corresponding to TLB entry fields will be written to the indexed TLB entry. Specifying invalid values for MAS0 $_{\mbox{TLBSEL}}$ ESEL produces undefined results. ### **D.4.3 Invalidating TLB Entries** TLB entries may be invalidated by three different methods. The TLB entry can be invalidated as the result of a *tlbwe* instruction that sets the MAS1<sub>V</sub> bit in the entry to 0. TLB entries may also be invalidated as a result of a *tlbivax* instruction or from an invalidation resulting from a *tlbivax* on another processor. Lastly, TLB entries may be invalidated as a result of an invalidate all operation specified through appropriate settings in the MMUCSR0. In both multiprocessor and uniprocessor systems, invalidations can occur on a wider set of TLB entries than intended. That is, a virtual address presented for invalidation may cause not only the intended TLB targeted for invalidation to be invalidated, but may also invalidate other TLB entries depending on the implementation. This is because parts of the translation mechanism may not be fully specified to the hardware at invalidate time. This is especially true in SMP systems, where the invalidation address must be supplied to all processors in the system, and there may be other limitations imposed by the hardware implementation. This phenomenon is known as generous invalidates. The architecture assures that the intended TLB will be invalidated, but does not guarantee that it will be the only one. A TLB entry invalidated by writing the V bit of the TLB entry to 0 by use of a tIbwe instruction is guaranteed to invalidate only the addressed TLB entry. Invalidates occurring from tlbivax instructions or from tlbivax instructions on another processor may cause generous invalidates. The architecture provides a method to protect against generous invalidations. This is important since there are certain virtual memory regions that must be properly mapped to make forward progress. To prevent this, the architecture specifies an IPROT bit for TLB entries. If the IPROT bit is set to 1 in a given TLB entry, that entry is protected from invalidations resulting from *tlbivax* instructions, or from invalidate all operations. TLB entries with the IPROT field set may only be invalidated by explicitly writing the TLB entry and specifying a 0 for the V (MAS1<sub>V</sub>) field. #### Programming Note The most obvious issue with generous invalidations is the code memory region that serves as the exception handler for MMU faults. If this region does not have a valid mapping, an MMU exception cannot be handled because the first address of the exception handler will result in another MMU exception. #### Programming Note – Not all TLB arrays in a given implementation will implement the IPROT attribute. It is likely that implementations that are suitable for demand page environments will implement it for only a single array, while not implementing it for other TLB arrays. #### **Programming Note** Operating systems need to use great care when using protected (IPROT) TLB entries, particularly in SMP systems. An SMP system that contains TLB entries on other processors will require a cross processor interrupt or some other synchronization mechanism to assure that each processor performs the required invalidation by writing its own TLB entries. #### Programming Note To ensure a TLB entry that is not protected by IPROT is invalidated if software does not know which TLB array the entry is in, software should issue a *tlbivax* instruction targeting each TLB in the implementation with the EA to be invalidated. #### **Programming Note** The preferred method of invalidating entire TLB arrays is invalidation using MMUCSR0. #### Programming Note Invalidations using MMUCSR0 only affect the TLB array on the processor that performs the invalidation. To perform invalidations in a multiprocessor system on all processors in a coherence domain, software should use *tlbivax*. ### D.4.4 Searching TLB Entries Software may search the MMU by using the *tlbsx* instruction. The *tlbsx* instruction uses PID values and an AS value from the MAS registers instead of the PID registers and the MSR. This allows software to search address spaces that differ from the current address space defined by the PID registers. This is useful for TLB fault handling. # D.4.5 TLB Replacement Hardware Assist The architecture provides mechanisms to assist software in creating and updating TLB entries when MMU related exceptions occur. This is called TLB Replacement Hardware Assist. Hardware will update the MAS registers on the occurrence of a Data TLB Error Interrupt or Instruction TLB Error interrupt. When a Data or Instruction TLB Error interrupt (miss) occurs, MAS0, MAS1, and MAS2 are automatically updated using the defaults specified in MAS4 as well as the AS and EPN values corresponding to the access that caused the exception. MAS6 is updated to set MAS6<sub>SPID0</sub> to the value of PID0 and MAS6<sub>SAS</sub> to the value of MSR<sub>DS</sub> or MSR<sub>IS</sub> depending on the type of access that caused the error. In addition, if MAS4<sub>TI BSFLD</sub> identifies a TLB array that supports NV (Next Victim), MASO<sub>ESEL</sub> is loaded with a value that hardware believes represents the best TLB entry to victimize to create a new TLB entry and $MASO_{NV}$ is updated with the TLB entry index of what hardware believes to be the next victim. Thus MAS0<sub>ESEL</sub> identifies the current TLB entry to be replaced, and MASO<sub>NV</sub> points to the next victim. When software writes the TLB entry, the MASO<sub>NV</sub> field is written to the TLB array. The algorithm used by the hardware to determine which TLB entry should be targeted for replacement is implementation-dependent. The automatic update of the MAS registers sets up all the necessary fields for creating a new TLB entry with the exception of RPN, the U0-U3 attribute bits, and the permission bits. With the exception of the upper 32 bits of RPN and the page attributes (should software desire to specify changes from the default attributes), all the remaining fields are located in MAS3, requiring only the single MAS register manipulation by software before writing the TLB entry. For Instruction Storage interrupt (ISI) and Data Storage interrupt (DSI) related exceptions, the MAS registers are not updated. Software must explicitly search the TLB to find the appropriate entry. The update of MAS registers through TLB Replacement Hardware Assist is summarized in Table 7. # D.5 32-bit and 64-bit Specific MMU Behavior MMU behavior is largely unaffected by whether the processor is in 32-bit computation mode (MSR $_{\rm CM}$ =0) or 64-bit computation mode (MSR $_{\rm CM}$ =1). The only differences occur in the EPN field of the TLB entry and the EPN field of MAS2. The differences are summarized here. - Executing a *tlbwe* instruction in 32-bit mode will set bits 0:31 of the TLB EPN field to 0, regardless of the value of bits 0:31 of the EPN field in MAS2. - Updates to MAS registers via TLB Replacement Hardware Assist (see Section D.4.5), update bits 0:51 of the EPN field regardless of the computation mode of the processor at the time of the exception or the interrupt computation mode in which the interrupt is taken. If the instruction caus- - ing the exception was executing in 32-bit mode, then bits 0:31 of the EPN field in MAS2 will be set to 0. - Executing a *tlbre* instruction in 32-bit mode will set bits 0:31 of the MAS2 EPN field to an undefined value. #### **Programming Note** This allows a 32-bit OS to operate seamlessly on a 64-bit implementation and a 64-bit OS to easily support 32-bit applications. ## D.6 Type FSL MMU Instructions The instructions described in this section, replace the instructions described in Section 4.9.4.1, "TLB Management Instructions". ## TLB Invalidate Virtual Address Indexed X-form tlbivax RA,RB ``` \begin{array}{lll} \mbox{if RA} = 0 \mbox{ then } b \leftarrow 0 \\ \mbox{else} & \mbox{b} \leftarrow (\mbox{RA}) \\ \mbox{EA} \leftarrow b + (\mbox{RB}) \\ \mbox{for each processor} \\ \mbox{for TLB array} = \mbox{EA}_{59:60} \\ \mbox{for each TLB entry} \\ \mbox{m} \leftarrow \neg \left( \left( 1 << \left( 2 \times (\mbox{entry}_{SIZE}^{-1}) \right) \right) - 1 \right) \\ \mbox{if } \left( (\mbox{EA}_{0:51} \& \mbox{m}) = (\mbox{entry}_{EPN} \& \mbox{m}) \right) \mid \mbox{EA}_{61} \\ \mbox{then if entry}_{IPROT} = 0 \\ \mbox{then entry}_{V} \leftarrow 0 \end{array} ``` Let the effective address (EA) be the sum(RA|0)+(RB). The EA is interpreted as show below. EA<sub>0:51</sub> EA<sub>0:51</sub> EA<sub>52:58</sub> Reserved EA<sub>59:60</sub> TLB array selector 00 TLB0 01 TLB1 10 TLB2 IU ILBZ 11 TLB3 EA<sub>61</sub> TLB Invalidate All EA<sub>62:63</sub> Reserved If EA $_{61}$ =0, then if the TLB array targeted by EA $_{59:60}$ contains an entry identified by EA $_{0:51}$ , that entry is made invalid unless the TLB entry is protected by the IPROT attribute. A TLB entry is identified if, for m = ¬((1 << (2×(TLB\_entry\_{size}-1))) - 1), EA $_{0:51}$ &m is equal to TLB\_entry\_EPN&m. The AS bit does not participate in the comparison. If $\rm EA_{61}$ =1, then all entries not protected by the IPROT attribute in the TLB array targeted by $\rm EA_{59:60}$ are made invalid. This instruction causes the target TLB entry to be invalidated in all processors. The operation performed by this instruction is ordered by the *mbar* (or *sync*) instruction with respect to a subsequent *tlbsync* instruction executed by the processor executing the *tlbivax* instruction. The operations caused by *tlbivax* and *tlbsync* are ordered by *mbar* as a set of operations which is independent of the other sets that *mbar* orders. The effects of the invalidation are not guaranteed to be visible to the programming model until the completion of a context synchronizing operation. Invalidations may occur for other TLB entries in the designated array, but in no case will any TLB entries with the IPROT attribute set be made invalid. In some implementations, if RA does not equal 0, it may produce an Illegal Instruction exception. This instruction is privileged. #### Special Registers Altered: None #### **Programming Note** - The use of $EA_{61}$ to invalidate TLB arrays may be phased out in future versions of the architecture. The preferred method of invalidating TLB arrays is invalidation using MMUCSR0. #### TLB Search Indexed #### X-form tlbsx RA,RB ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + (RB) pid \leftarrow MAS6_{SPID0} as ← MAS6<sub>SAS</sub> va ← as || pid || EA if Valid matching_entry_exists(va) then entry ← matching entry found array ← TLB array number where TLB entry found index ← index into TLB array of TLB entry found if TLB array supports Next Victim then hint ← hardware hint for Next Victim else hint ← undefined \texttt{rpn} \leftarrow \texttt{entry}_{\texttt{RPN}} \begin{array}{l} \text{MASO}_{\text{TLBSEL}} \leftarrow \text{array} \\ \text{MASO}_{\text{ESEL}} \leftarrow \text{index} \end{array} MASO<sub>NV</sub> ← hint \text{MAS1}_{\text{V}} \leftarrow 1 { m MAS1}_{ m IPROT\ TID\ TS\ TSIZE} \leftarrow { m entry}_{ m IPROT\ TID\ TS\ SIZE} \texttt{MAS2}_{\texttt{EPN VLE W I M G E ACM}} \leftarrow \texttt{entry}_{\texttt{EPN VLE W I M G E ACM}} MAS3_{RPNL} \leftarrow rpn_{32:51} MAS3<sub>U0:U3</sub> UX SX UW SW UR SR ← entry<sub>U0:U3</sub> UX SX UW SW UR SR MAS7<sub>RPNU</sub> ← rpn<sub>0:31</sub> MASO<sub>TLBSEL</sub> ← MAS4<sub>TLBSELD</sub> MASO<sub>ESEL</sub> ← hint MASO<sub>NV</sub> ← hint \text{MAS1}_{\text{V IPROT}} \leftarrow 0 \begin{array}{l} \text{MAS1}_{\text{TID TS}} \leftarrow \text{MAS6}_{\text{SPID0 SAS}} \\ \text{MAS1}_{\text{TSIZE}} \leftarrow \text{MAS4}_{\text{TSIZED}} \end{array} {\rm MAS2_{\rm VLE~W~I~M~G~E~ACM}} \leftarrow {\rm MAS4_{\rm VLED~WD~ID~MD~GD~ED~ACMD}} \texttt{MAS2}_{\texttt{EPN}} \, \leftarrow \, \texttt{undefined} MAS3_{RPNL} \leftarrow 0 \texttt{MAS3}_{\texttt{U0:U3}}~_{\texttt{UX}}~_{\texttt{SX}}~_{\texttt{UW}}~_{\texttt{SW}}~_{\texttt{UR}}~_{\texttt{SR}}~\leftarrow~0 MAS7<sub>RPNII</sub> ← 0 ``` Let the effective address (EA) be the sum(RA|0)+ (RB). If any valid TLB array contains an entry corresponding to the virtual address formed by MAS6<sub>SAS SPID0</sub> and EA, that entry as well as the index and array are read into the MAS registers. If no valid matching translation exists, MAS1 $_{\rm V}$ is set to 0 and the MAS registers are loaded with defaults to facilitate a TLB replacement. If the TLB array supports MAS0 $_{ m NV}$ , an implementation defined value, hint, specifying the index for the next entry to be replaced is loaded into MAS0 $_{ m NV}$ regardless of whether a match occurs; otherwise MAS0 $_{ m NV}$ is set to an undefined value. It is also loaded into MAS0 $_{ m ESEL}$ if no match occurs. In some implementations, if RA does not equal 0, it may produce an Illegal Instruction exception. This instruction is privileged. #### **Special Registers Altered:** MAS0 MAS1 MAS2 MAS3 MAS7 ### TLB Read Entry X-form tlbre ``` entry ← SelectTLB (MASO<sub>TLBSEL</sub>, MASO<sub>ESEL</sub>, MAS2<sub>EPN</sub>) rpn ← entry<sub>RPN</sub> if TLB array supports Next Victim then MASO<sub>NV</sub> ← hint else MASO<sub>NV</sub> ← undefined MAS1<sub>V</sub> IPROT TID TS TSIZE ← entry<sub>V</sub> IPROT TID TS SIZE MAS2<sub>EPN</sub> VLE W I M G E ACM ← entry<sub>EPN</sub> VLE W I M G E ACM MAS3<sub>RPNL</sub> ← rpn<sub>32:51</sub> MAS3<sub>UO:U3</sub> UX SX UW SW UR SR ← entry<sub>UO:U3</sub> UX SX UW SW UR SR MAS7<sub>RPNU</sub> ← rpn<sub>0:31</sub> ``` The contents of the TLB entry specified by $MASO_{TLBSEL}$ , $MASO_{ESEL}$ , and $MAS2_{EPN}$ are read and placed into the MAS registers. If the TLB array supports MAS0 $_{NV}$ , then an implementation defined value, hint, specifying the index for the next entry to be replaced is loaded into MAS0 $_{NV}$ ; otherwise MAS0 $_{NV}$ is set to an undefined value. If the specified entry does not exist, the results are undefined. This instruction is privileged. #### **Special Registers Altered:** MAS0 MAS1 MAS2 MAS3 MAS7 #### TLB Synchronize X-form #### TLB Write Entry X-form tlbsync | 31 | /// | /// | /// | 566 | / | |----|-----|-----|-----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The *tlbsync* instruction provides an ordering function for the effects of all *tlbivax* instructions executed by the processor executing the *tlbsync* instruction, with respect to the memory barrier created by a subsequent *sync* (*msync*) instruction executed by the same processor. Executing a *tlbsync* instruction ensures that all of the following will occur. - All TLB invalidations caused by tlbivax instructions preceding the tlbsync instruction will have completed on any other processor before any storage accesses associated with data accesses caused by instructions following the sync (msync) instruction are performed with respect to that processor. - All storage accesses by other processors for which the address was translated using the translations being invalidated will have been performed with respect to the processor executing the *sync* (*msync*) instruction, to the extent required by the associated Memory Coherence Required attributes, before the *sync* (*msync*) instruction's memory barrier is created. The operation performed by this instruction is ordered by the *mbar* or *sync* (*msync*) instruction with respect to preceding *tlbivax* instructions executed by the processor executing the *tlbsync* instruction. The operations caused by *tlbivax* and *tlbsync* are ordered by *mbar* as a set of operations, which is independent of the other sets that *mbar* orders. The *tlbsync* instruction may complete before operations caused by *tlbivax* instructions preceding the *tlb-sync* instruction have been performed. This instruction is privileged. #### **Special Registers Altered:** None #### tlbwe | 21 | /// | 111 | 111 | 070 | 1/ | |----|-----|-----|-----|-----|----| | 31 | /// | /// | /// | 970 | / | | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} \text{entry} \leftarrow \text{SelectTLB} \left( \text{MASO}_{\text{TLBSEL}}, \text{ MASO}_{\text{ESEL}}, \text{ MAS2}_{\text{EPN}} \right) \\ \text{rpn} \leftarrow \text{MAS7}_{\text{RPNU}} \mid \mid \text{MAS3}_{\text{RPNL}} \\ \text{hint} \leftarrow \text{MAS0}_{\text{NV}} \\ \text{entryv}_{\text{IPROT}} \text{ TiD TS SIZE} \leftarrow \text{MAS1}_{\text{V}}_{\text{IPROT}} \text{ TiD TS TSIZE} \\ \text{entry}_{\text{EPN}} \text{ VLE W I M G E ACM} \leftarrow \text{MAS2}_{\text{EPN}} \text{ VLE W I M G E ACM} \\ \text{entry}_{\text{U0:U3}} \text{ UX SX UW SW UR SR} \leftarrow \text{MAS3}_{\text{U0:U3}} \text{ UX SX UW SW UR SR} \\ \text{entry}_{\text{RPN}} \leftarrow \text{rpn} \end{array} ``` The contents of the MAS registers are written to the TLB entry specified by $\rm MASO_{TLBSEL}$ , $\rm MASO_{ESEL}$ , and $\rm MAS2_{EPN}$ . MASO<sub>NV</sub> provides a suggestion to hardware of where the next hardware hint for replacement should be given when the next Data or Instruction TLB Error Interrupt, *tlbsx*, or *tlbre* instruction occurs. If the specified entry does not exist, the results are undefined. A context synchronizing instruction is required after a *tlbwe* instruction to ensure any subsequent instructions that will use the updated TLB values execute in the new context. This instruction is privileged. #### **Special Registers Altered:** None # **Appendix E. Example Performance Monitor** [Category: Embedded Performance Monitor] ### **E.1 Overview** This appendix describes an example of a Performance Monitor facility. It defines an architecture suitable for performance monitoring facilities in the Embedded environment. The architecture itself presents only programming model visible features in conjunction with architecturally defined behavioral features. Much of the selection of events is by necessity implementation-dependent and is not described as part of the architecture; however, this document provides guidelines for some features of a performance monitor implementation that should be followed by all implementations. The example Performance Monitor facility provides the ability to monitor and count predefined events such as processor clocks, misses in the instruction cache or data cache, types of instructions decoded, or mispredicted branches. The count of such events can be used to trigger the Performance Monitor exception. While most of the specific events are not architected, the mechanism of controlling data collection is. The example Performance Monitor facility can be used to do the following: - Improve system performance by monitoring software execution and then recoding algorithms for more efficiency. For example, memory hierarchy behavior can be monitored and analyzed to optimize task scheduling or data distribution algorithms. - Characterize processors in environments not easily characterized by benchmarking. - Help system developers bring up and debug their systems. ## **E.2 Programming Model** The example Performance Monitor facility defines a set of Performance Monitor Registers (PMRs) that are used to collect and control performance data collection and an interrupt to allow intervention by software. The PMRs provide various controls and access to collected data. They are categorized as follows: - Counter registers. These registers are used for data collection. The occurrence of selected events are counted here. These registers are named PMC0..15. User and supervisor level access to these registers is through different PMR numbers allowing different access rights. - Global controls. This register control global settings of the Performance Monitor facility and affect all counters. This register is named PMGC0. User and supervisor level access to these registers is through different PMR numbers allowing different access rights. In addition, a bit in the MSR (MSR<sub>PMM</sub>) is defined to enable/disable counting. - Local controls. These registers control settings that apply only to a particular counter. These registers are named PMLCa0..15 and PMLCb0..15. User and supervisor level access to these registers is through different PMR numbers allowing different access rights. Each set of local control registers (PMLCan and PMLCbn) contains controls that apply to the associated same numbered counter register (e.g. PMLCa0 and PMLCb0 contain controls for PMC0 while PMLCa1 and PMLCb1 contain controls for PMC1). #### **Assembler Note** The counter registers, global controls, and local controls have alias names which cause the assembler to use different PMR numbers. The names PMC0...15, PMGC0, PMLCa0...15, and PMLCb0...15 cause the assembler to use the supervisor level PMR number, and the names UPMC0...15, UPMGC0, UPMLCa0...15, and UPMLCb0...15 cause the assembler to use the user-level PMR number. A given implementation may implement fewer counter registers (and their associated control registers) than are architected. Architected counter and counter control registers that are not implemented behave the same as unarchitected Performance Monitor Registers. PMRs are described in Section E.3. Software uses the global and local controls to select which events are counted in the counter registers, when such events should be counted, and what action should be taken when a counter overflows. Software can use the collected information to determine performance attributes of a given segment of code, a process, or the entire software system. PMRs can be read by software using the *mfpmr* instruction and PMRs can be written by using the *mtpmr* instruction. Both instructions are described in Section E.4. Since counters are defined as 32-bit registers, it is possible for the counting of some events to overflow. A Performance Monitor interrupt is provided that can be programmed to occur in the event of a counter overflow. The Performance Monitor interrupt is described in detail in Section E.2.5 and Section E.2.6. ### **E.2.1 Event Counting** Event counting can be configured in several different ways. This section describes configurability and specific unconditional counting modes. # **E.2.2 Processor Context Configurability** Counting can be enabled if conditions in the processor state match a software-specified condition. Because a software task scheduler may switch a processor's execution among multiple processes and because statistics on only a particular process may be of interest, a facility is provided to mark a process. The Performance Monitor mark bit, MSR<sub>PMM</sub>, is used for this purpose. System software may set this bit to 1 when a marked process is running. This enables statistics to be gathered only during the execution of the marked process. The states of MSR<sub>PR</sub> and MSR<sub>PMM</sub> together define a state that the processor (supervisor or user) and the process (marked or unmarked) may be in at any time. If this state matches an individual state specified by the PMLCan<sub>FCU</sub>, PMLCan<sub>FCS</sub>, PMLCan<sub>FCM1</sub> $PMLCan_{FCM0}$ fields in PMLCan (the state for which monitoring is enabled), counting is enabled for PMCn. Each event, on an implementation basis, may count regardless of the value of MSR<sub>PMM</sub>. The counting behavior of each event should be documented in the User's Manual. The processor states and the settings of the PMLCa $n_{\rm FCS}$ , PMLCa $n_{\rm FCU}$ , PMLCa $n_{\rm FCM0}$ and PMLCa $n_{\rm FCM0}$ fields in PMLCa $n_{\rm RCM0}$ necessary to enable monitoring of each processor state are shown in Figure 41. | Processor State | FCS | FCU | FCM1 | FCM0 | |---------------------------|-----|-----|------|------| | Marked | 0 | 0 | 0 | 1 | | Not marked | 0 | 0 | 1 | 0 | | Supervisor | 0 | 1 | 0 | 0 | | User | 1 | 0 | 0 | 0 | | Marked and supervisor | 0 | 1 | 0 | 1 | | Marked and user | 1 | 0 | 0 | 1 | | Not marked and supervisor | 0 | 1 | 1 | 0 | | Not mark and user | 1 | 0 | 1 | 0 | | All | 0 | 0 | 0 | 0 | | None | Х | Х | 1 | 1 | | None | 1 | 1 | Χ | Х | Figure 41. Processor States and PMLCan Bit Settings Two unconditional counting modes may be specified: - Counting is unconditionally enabled regardless of the states of MSR<sub>PMM</sub> and MSR<sub>PR</sub>. This can be accomplished by setting PMLCan<sub>FCS</sub>, PMLCan<sub>FCU</sub>, PMLCan<sub>FCM1</sub>, and PMLCan<sub>FCM0</sub> to 0 for each counter control. - Counting is unconditionally disabled regardless of the states of MSR<sub>PMM</sub> and MSR<sub>PR</sub>. This can be accomplished by setting PMGC0<sub>FAC</sub> to 1 or by setting PMLCan<sub>FC</sub> to 1 for each counter control. Alternatively, this can be accomplished by setting PMLCan<sub>FCM1</sub> to 1 and PMLCan<sub>FCM0</sub> to 1 for each counter control or by setting PMLCan<sub>FCS</sub> to 1 and PMLCan<sub>FCII</sub> to 1 for each counter control. #### Programming Note Events may be counted in a fuzzy manner. That is, events may not be counted precisely due to the nature of an implementation. Users of the Performance Monitor facility should be aware that an event may be counted even if it was precisely filtered, though it should not have been. In general such discrepancies are statistically unimportant and users should not assume that counts are explicitly accurate. #### E.2.3 Event Selection Events to count are determined by placing an implementation defined event value into the PMLCa0..15<sub>EVENT</sub> field. Which events may be programmed into which counter are implementation specific and should be defined in the User's Manual. In general, most events may be programmed into any of the implementation available counters. Programming a counter with an event that is not supported for that counter gives boundedly undefined results. #### Programming Note - Event name and event numbers will differ greatly across implementations and software should not expect that events and event names will be consistent. ### E.2.4 Thresholds Thresholds are values that must be exceeded for an event to be counted. Threshold values are programmed in the PMLCb0..15<sub>THRESHOLD</sub> field. The events which may be thresholded and the units of each event that may be thresholded are implementation-dependent. Programming a threshold value for an event that is not defined to use a threshold gives boundedly undefined results. # E.2.5 Performance Monitor Exception A Performance Monitor exception occurs when counter overflow detection is enabled and a counter overflows. More specifically, for each counter register n, if $PMGC0_{PMIE}=1$ and $PMLCan_{CE}=1$ and $PMCn_{OV}=1$ and $MSR_{EE}=1$ , a Performance Monitor exception is said to exist. The Performance Monitor exception condition will cause a Performance Monitor interrupt if the exception is the highest priority exception. The Performance Monitor exception is level sensitive and the exception condition may cease to exist if any of the required conditions fail to be met. Thus it is possible for a counter to overflow and continue counting events until PMC $n_{\rm OV}$ becomes 0 without taking a Performance Monitor interrupt if MSR<sub>EE</sub> = 0 during the overflow condition. To avoid this, software should program the counters to freeze if an overflow condition is detected (see Section E.3.4). # **E.2.6 Performance Monitor Interrupt** A Performance Monitor interrupt occurs when a Performance Monitor exception exists and no higher priority exception exists. When a Performance Monitor interrupt occurs, SRR0 and SRR1 record the current state of the NIA and the MSR, the MSR is set to handle the interrupt, and instruction execution resumes at $IVPR_{0.47} \parallel IVOR35_{48.59} \parallel 0b0000$ . The Performance Monitor interrupt is precise and asynchronous. #### Programming Note - When taking a Performance Monitor interrupt software should clear the overflow condition by reading the counter register and setting the counter register to a non-overflow value since the normal return from the interrupt will set MSR<sub>EE</sub> back to 1. # E.3 Performance Monitor Registers # E.3.1 Performance Monitor Global Control Register 0 The Performance Monitor Global Control Register 0 (PMGC0) controls all Performance Monitor counters. Figure 42. [User] Performance Monitor Global Control Register 0 These bits are interpreted as follows: #### Bit Description #### 32 Freeze All Counters (FAC) The FAC bit is sticky; that is, once set to 1 it remains set to 1 until it is set to 0 by an *mtpmr* instruction. - The PMCs can be incremented (if enabled by other Performance Monitor control fields). - 1 The PMCs can not be incremented. ## 33 **Performance Monitor Interrupt Enable** (PMIE) - Performance Monitor interrupts are disabled. - 1 Performance Monitor interrupts are enabled and occur when an enabled condition or event occurs. Enabled conditions and events are described in Section E.2.5. #### 34 Freeze Counters on Enabled Condition or Event (FCECE) Enabled conditions and events are described in Section E.2.5. - The PMCs can be incremented (if enabled by other Performance Monitor control fields). - The PMCs can be incremented (if enabled by other Performance Monitor control fields) only until an enabled condition or event occurs. When an enabled condition or event occurs, PMGC0<sub>FAC</sub> is set to 1. It is the user's responsibility to set PMGC0<sub>FAC</sub> to 0. 35:63 Reserved The UPMGC0 register is an alias to the PMGC0 register for user mode read only access. ### E.3.2 Performance Monitor Local Control A Registers The Performance Monitor Local Control A Registers 0 through 15 (PMLCa0..15) function as event selectors and give local control for the corresponding numbered Performance Monitor counters. PMLCa works with the corresponding numbered PMLCb register. Figure 43. [User] Performance Monitor Local **Control A Registers** PMLCa is set to 0 at reset. These bits are interpreted as follows: #### Bit Description #### 32 Freeze Counter (FC) - The PMC can be incremented (if enabled by other Performance Monitor control fields). - The PMC can not be incremented. #### 33 Freeze Counter in Supervisor State (FCS) - The PMC is incremented (if enabled by other Performance Monitor control fields). - The PMC can not be incremented if $MSR_{PR}$ is 0. #### Freeze Counter in User State (FCU) 34 - The PMC can be incremented (if enabled by other Performance Monitor control fields). - The PMC can not be incremented if MSR<sub>PR</sub> is 1. #### 35 Freeze Counter while Mark is Set (FCM1) - The PMC can be incremented (if enabled by other Performance Monitor control fields). - The PMC can not be incremented if MSR<sub>PMM</sub> is 1. #### 36 Freeze Counter while Mark is Cleared (FCM0) - The PMC can be incremented (if enabled by other Performance Monitor control fields). - The PMC can not be incremented if $MSR_{PMM}$ is 0. #### 37 Condition Enable (CE) - Overflow conditions for PMCn cannot occur (PMCn cannot cause interrupts, cannot freeze counters) - Overflow conditions occur when the mostsignificant-bit of PMCn is equal to 1. It is recommended that CE be set to 0 when counter PMCn is selected for chaining; see Section E.5.1. 38:40 Reserved #### 41:47 **Event Selector (EVENT)** Up to 128 events selectable; see Section E.2.3. 48:53 Setting is implementation-dependent. 54:63 Reserved The UPMLCa0..15 registers are aliases to the PMLCa0..15 registers for user mode read only access. ### E.3.3 Performance Monitor Local **Control B Registers** The Performance Monitor Local Control B Registers 0 through 15 (PMLCb0..15) specify a threshold value and a multiple to apply to a threshold event selected for the corresponding Performance Monitor counter. Threshold capability is implementation counter dependent. Not all events or all counters of an implementation are guaranteed to support thresholds. PMLCb works with the corresponding numbered PMLCa register. | | PMLCb015 | | |----|----------|----| | 32 | | 63 | Description Bit #### Figure 44. [User] Performance **Monitor Local Control B Register** PMLCb is set to 0 at reset. These bits are interpreted as follows: | 32:52 | Reserved | | | | | | |-------|----------------------------|----------|-----|------------|----|---| | 53:55 | Threshold Multi | iple (T⊦ | IRE | SHMUL) | | | | | 000 Threshold<br>(THRESHOL | | is | multiplied | by | 1 | | | 001 Threshold (THRESHOL | | is | multiplied | by | 2 | | | 010 Threshold<br>(THRESHOL | | is | multiplied | by | 4 | | | 011 Threshold | field | is | multiplied | by | 8 | $(THRESHOLD \times 8)$ 100 Threshold field is multiplied by 16 (THRESHOLD × 16) - 101 Threshold field is multiplied by 32 $(THRESHOLD \times 32)$ - 110 Threshold field is multiplied by 64 (THRESHOLD × 64) 111 Threshold field is multiplied by 128 (THRESHOLD × 128) #### 56:57 Reserved #### 58:63 Threshold (THRESHOLD) Only events that exceed the value THRESH-OLD multiplied as described by THRESHMUL are counted. Events to which a threshold value applies are implementation-dependent as are the unit (for example duration in cycles) and the granularity with which the threshold value is interpreted. #### Programming Note By varying the threshold value, software can obtain a profile of the event characteristics subject to thresholding. For example, if PMC1 is configured to count cache misses that last longer than the threshold value, software can measure the distribution of cache miss durations for a given program by monitoring the program repeatedly using a different threshold value each time. The UPMLCb0..15 registers are aliases to the PMLCb0..15 registers for user mode read only access. # E.3.4 Performance Monitor Counter Registers The Performance Monitor Counter Registers (PMC0..15) are 32-bit counters that can be programmed to generate interrupt signals when they overflow. Each counter is enabled to count up to 128 events. ## Figure 45. [User] Performance Monitor Counter Registers PMCs are set to 0 at reset. These bits are interpreted as follows: #### Bit Description #### 32 **Overflow** (OV) - 0 Counter has not reached an overflow state. - 1 Counter has reached an overflow state. #### 33:63 Counter Value (CV) Indicates the number of occurrences of the specified event. The minimum value for a counter is 0 (0x0000\_0000) and the maximum value is 4,294,967,295 (0xFFFF\_FFFF). A counter can increment up to the maximum value and then wraps to the minimum value. A counter enters the overflow state when the high-order bit is set to 1, which normally occurs only when the counter increments from a value below 2,147,483,648 (0x8000\_0000) to a value greater than or equal to 2,147,483,648 (0x8000\_0000). Several different actions may occur when an overflow state is reached, depending on the configuration: - If PMLCan<sub>CE</sub> is 0, no special actions occur on overflow: the counter continues incrementing, and no exception is signaled. - If PMLCan<sub>CE</sub> and PMGC0<sub>FCECE</sub> are 1, all counters are frozen when PMCn overflows. - If PMLCan<sub>CE</sub>, PMGC0<sub>PMIE</sub>, and MSR<sub>EE</sub> are 1, an exception is signalled when PMCn reaches overflow. Note that the interrupts are masked by setting MSR<sub>EE</sub> to 0. An overflow condition may be present while MSR<sub>EE</sub> is zero, but the interrupt is not taken until MSR<sub>EE</sub> is set to 1. If an overflow condition occurs while MSR $_{\rm EE}$ is 0 (the exception is masked), the exception is still signalled once MSR $_{\rm EE}$ is set to 1 if the overflow condition is still present and the configuration has not been changed in the meantime to disable the exception; however, if MSR $_{\rm EE}$ remains 0 until after the counter leaves the overflow state (MSB becomes 0), or if MSR $_{\rm EE}$ remains 0 until after PMLCa $_{\rm CE}$ or PMGC0 $_{\rm PMIE}$ are set to 0, the exception does not occur. #### Programming Note - Loading a PMC with an overflowed value can cause an immediate exception. For example, if PMLCan<sub>CE</sub>, PMGC0<sub>PMIE</sub>, and MSR<sub>EE</sub> are all 1, and an *mtpmr* loads an overflowed value into a PMCn that previously held a non-overflowed value, then an interrupt will be generated before any event counting has occurred. The following sequence is generally recommended for setting the counter values and configurations. - 1. Set PMGC0<sub>FAC</sub> to 1 to freeze the counters. - Perform a series of mtpmr operations to initialize counter values and configure the control registers - Release the counters by setting PMGC0<sub>FAC</sub> to 0 with a final *mtpmr*. ### **E.4 Performance Monitor Instructions** ## Move From Performance Monitor Register XFX-form mfpmr RT,PMRN | 31 | RT | pmrn | 334 | / | |----|----|------|-----|----| | 0 | 6 | 11 | 21 | 31 | ``` n \leftarrow pmrn_{5:9} \mid \mid pmrn_{0:4} if length(PMR(n)) = 64 then RT \leftarrow PMR(n) else RT \leftarrow ^{32}0 \mid \mid PMR(n)_{32:63} ``` Let PMRN denote a Performance Monitor Register number and PMR the set of Performance Monitor Registers. The contents of the designated Performance Monitor Register are placed into register RT. The list of defined Performance Monitor Registers and their privilege class is provided in Figure 46. Execution of this instruction specifying a defined and privileged Performance Monitor Register when MSR<sub>PR</sub>=1 will result in a Privileged Instruction exception Execution of this instruction specifying an undefined Performance Monitor Register will either result in an Illegal Instruction exception or will produce an undefined value for register RT. #### **Special Registers Altered:** None ## Move To Performance Monitor Register XFX-form mtpmr PMRN,RS | 31 | RS | pmrn | 462 | / | |----|----|------|-----|----| | 0 | 6 | 11 | 21 | 31 | $$\begin{array}{ll} n \leftarrow pmrn_{5;9} \mid \mid pmrn_{0;4} \\ \text{if length}(PMR(n)) = 64 \text{ then} \\ PMR(n) \leftarrow (RS) \\ \text{else} \\ PMR(n) \leftarrow (RS)_{32:63} \end{array}$$ Let PMRN denote a Performance Monitor Register number and PMR the set of Performance Monitor Registers. The contents of the register RS are placed into the designated Performance Monitor Register. The list of defined Performance Monitor Registers and their privilege class is provided in Figure 46. Execution of this instruction specifying a defined and privileged Performance Monitor Register when MSR<sub>PR</sub>=1 will result in a Privileged Instruction exception Execution of this instruction specifying an undefined Performance Monitor Register will either result in an Illegal Instruction exception or will perform no operation. #### **Special Registers Altered:** None | decimal | PMR <sup>1</sup> | Register Name | Privileged | | Cat | |-----------|-----------------------------------------|----------------------|------------|-------|------| | decimai | pmrn <sub>5:9</sub> pmrn <sub>0:4</sub> | Register Name | mtpmr | mfpmr | Cat | | 0-15 | 00000 0xxxx | PMC015 | - | no | E.PM | | 16-31 | 00000 1xxxx | PMC015 | yes | yes | E.PM | | 128-143 | 00100 0xxxx | PMLCA015 | - | no | E.PM | | 144-159 | 00100 1xxxx | PMLCA015 | yes | yes | E.PM | | 256-271 | 01000 0xxxx | PMLCB015 | - | no | E.PM | | 272-287 | 01000 1xxxx | PMLCB015 | yes | yes | E.PM | | 384 | 01100 00000 | PMGC0 | - | no | E.PM | | 400 | 01100 10000 | PMGC0 | yes | yes | E.PM | | - This re | gister is not defined | for this instruction | 1- | • | | Note that the order of the two 5-bit halves of the PMR number is reversed. Figure 46. Embedded.Peformance Monitor PMRs ## E.5 Performance Monitor Software Usage Notes ### **E.5.1 Chaining Counters** An implementation may contain events that are used to "chain" counters together to provide a larger range of event counts. This is accomplished by programming the desired event into one counter and programming another counter with an event that occurs when the first counter transitions from 1 to 0 in the most significant bit. The counter chaining feature can be used to decrease the processing pollution caused by Performance Monitor interrupts, (things like cache contamination, and pipeline effects), by allowing a higher event count than is possible with a single counter. Chaining two counters together effectively adds 32 bits to a counter register where the first counter's carry-out event acts like a carry-out feeding the second counter. By defining the event of interest to be another PMC's overflow generation, the chained counter increments each time the first counter rolls over to zero. Multiple counters may be chained together. Because the entire chained value cannot be read in a single instruction, an overflow may occur between counter reads, producing an inaccurate value. A sequence like the following is necessary to read the complete chained value when it spans multiple counters and the counters are not frozen. The example shown is for a two-counter case. ``` loop: mfpmr Rx,pmctr1 #load from upper counter mfpmr Ry,pmctr0 #load from lower counter mfpmr Rz,pmctr1 #load from upper counter cmp cr0,0,Rz,Rx #see if 'old' = 'new' bc 4,2,loop #loop if carry occurred between reads ``` The comparison and loop are necessary to ensure that a consistent set of values has been obtained. The above sequence is not necessary if the counters are frozen. ## E.5.2 Thresholding Threshold event measurement enables the counting of duration and usage events. Assume an example event, dLFB load miss cycles, requires a threshold value. A dLFB load miss cycles event is counted only when the number of cycles spent recovering from the miss is greater than the threshold. If the event is counted on two counters and each counter has an individual threshold, one execution of a performance monitor program can sample two different threshold values. Measuring code performance with multiple concurrent thresholds expedites code profiling significantly. ## **Book VLE:** Power ISA Operating Environment Architecture - Variable Length Encoding (VLE) Environment ## **Chapter 1. Variable Length Encoding Introduction** | 1.1 Overview | 1.4.6 R-form (16-bit Monadic Instructions) | |------------------------------------------------|------------------------------------------------| | 1.2 Documentation Conventions 762 | 763 | | 1.2.1 Description of Instruction Operation 762 | 1.4.7 RR-form (16-bit Dyadic Instructions) 763 | | 1.3 Instruction Mnemonics and Operands | 1.4.8 SD4-form (16-bit Load/Store Instruc- | | 762 | tions) | | 1.4 VLE Instruction Formats 762 | 1.4.9 BD15-form 763 | | 1.4.1 BD8-form (16-bit Branch Instruc- | 1.4.10 BD24-form 763 | | tions) | 1.4.11 D8-form 763 | | 1.4.2 C-form (16-bit Control Instructions). | 1.4.12 I16A-form 763 | | 762 | 1.4.13 I16L-form 763 | | 1.4.3 IM5-form (16-bit register + immediate | 1.4.14 M-form 763 | | Instructions) | 1.4.15 SCI8-form | | 1.4.4 OIM5-form (16-bit register + offset | 1.4.16 LI20-form 763 | | immediate Instructions) 762 | 1.4.17 Instruction Fields 763 | | 1.4.5 IM7-form (16-bit Load immediate | | | Instructions) | | This chapter describes computation modes, document conventions, a processor overview, instruction formats, storage addressing, and instruction addressing. ### 1.1 Overview Variable Length Encoding (VLE) is a code density optimized re-encoding of much of the instruction set defined by Books I, II, and III-E using both 16-bit and 32-bit instruction formats. VLE offers more efficient binary representations of applications for the embedded processor spaces where code density plays a major role in affecting overall system cost, and to a somewhat lesser extent, performance. VLE is a supplement to the instruction set defined by Book I-III and code pages using VLE encoding or non-VLE encoding can be intermingled in a system providing focus on both high performance and code density where most needed. VLE provides alternative encodings to instructions defined in Books I-III to enable reduced code footprint. This set of alternative encodings is selected on a page basis. A single storage attribute bit selects between standard instruction encodings and VLE instructions for that page of memory. Instruction encodings in pages marked as VLE are either 16 or 32 bits long, and are aligned on 16-bit boundaries. Because of this, all instruction pages marked as VLE are required to use Big-Endian byte ordering. The programming model uses the same register set with both instruction set encodings, although some registers are not accessible by VLE instructions using the 16-bit formats and not all condition register (CR) fields are used by *Conditional Branch* instructions or instructions that access the condition register executing from a VLE instruction page. In addition, immediate fields and displacements differ in size and use, due to the more restrictive encodings imposed by VLE instruction formats. VLE additional instruction fields are described in Section 1.4.17, "Instruction Fields". Other than the requirement of Big-Endian byte ordering for instruction pages and the additional storage attribute to identify whether the instruction page corresponds to a VLE section of code, VLE complies with the memory model, register model, timer facilities, debug facilities, and interrupt/exception model defined in Book I-III and therefore execute in the same environment as non-VLE instructions. ### 1.2 Documentation Conventions Book VLE adheres to the documentation conventions defined inSection 1.3 of Book I. Note however that this book defines instructions that apply to the User Instruction Set Architecture, the Virtual Environment Architecture, and the Operating Environment Architecture. # 1.2.1 Description of Instruction Operation The RTL (register transfer language) descriptions in Book VLE conform to the conventions described in Section 1.3.4 of Book I. # 1.3 Instruction Mnemonics and Operands The description of each instruction includes the mnemonic and a formatted list of operands. VLE instruction semantics are either identical or similar to those of other instructions in the architecture. Where the semantics, side-effects, and binary encodings are identical, the standard mnemonics and formats are used. Such unchanged instructions are listed and appropriately referenced, but the instruction definitions are not replicated in this book. Where the semantics are similar but the binary encodings differ, the standard mnemonic is typically preceded with an **e**\_ to denote a VLE instruction. To distinguish between similar instructions available in both 16- and 32-bit forms under VLE and standard instructions, VLE instructions encoded with 16 bits have an **se**\_ prefix. The following are examples: ``` stwx RS,RA,RB \hspace{0.1cm}//\hspace{0.1cm} standard Book I instruction e_stw RS,D(RA) \hspace{0.1cm}//\hspace{0.1cm} 32-bit VLE instruction se stw RZ,SD4(RX) \hspace{0.1cm}//\hspace{0.1cm} 16-bit VLE instruction ``` ### 1.4 VLE Instruction Formats All VLE instructions to be executed are either two or four bytes long and are halfword-aligned in storage. Thus, whenever instruction addresses are presented to the processor (as in *Branch* instructions), the low-order bit is treated as 0. Similarly, whenever the processor generates an instruction address, the low-order bit is zero. The format diagrams given below show horizontally all valid combinations of instruction fields. Only those formats that are unique to VLE-defined instructions are included here. Instruction forms that are available in VLE or non-VLE mode are described in Section 1.6 of Book I and are not repeated here. In some cases an instruction field must contain a particular value. If a field that must contain a particular value does not contain that value, the instruction form is invalid and the results are as described for invalid instruction forms in Book I. VLE instructions use split field notation as defined in Section 1.6 of Book I. # 1.4.1 BD8-form (16-bit Branch Instructions) | 0 | 5 | 6 | 7 | - | 8 | | | 15 | |------|------|--------|----|---|---|----|---|----| | OPCD | BO16 | В | 11 | 6 | | BD | 8 | | | OP | CD | X<br>O | LI | < | | BD | 8 | | Figure 1. BD8 instruction format # 1.4.2 C-form (16-bit Control Instructions) | 0 | 15 | |------|----| | OPCD | | | OPCD | LK | Figure 2. C instruction format # 1.4.3 IM5-form (16-bit register + immediate Instructions) | 0 | 6 7 | | 12 | 15 | |------|-----|-----|----|----| | OPCD | X | UI5 | R | ( | Figure 3. IM5 instruction format # 1.4.4 OIM5-form (16-bit register + offset immediate Instructions) | 0 | 6 7 | 7 | 12 | 15 | |------|--------|------|----|----| | OPCD | X | OIM5 | R) | Χ | | OPCD | R<br>C | OIM5 | R) | Χ | Figure 4. OIM5 instruction format # 1.4.5 IM7-form (16-bit Load immediate Instructions) | 0 | 5 | 12 | 15 | |------|-----|----|----| | OPCD | UI7 | RX | | Figure 5. IM7 instruction format # 1.4.6 R-form (16-bit Monadic Instructions) Figure 6. R instruction format # 1.4.7 RR-form (16-bit Dyadic Instructions) | 0 | | 6 7 | 8 | 12 15 | |---|------|------------|-----|-------| | | OPCD | ХО | RY | RX | | | OPCD | X R<br>O C | RY | RX | | | OPCD | ХО | ARY | RX | | Г | OPCD | ХО | RY | ARX | Figure 7. RR instruction format # 1.4.8 SD4-form (16-bit Load/Store Instructions) | 0 | 4 | 8 | 12 | 15 | |------|-----|----|----|----| | OPCD | SD4 | RZ | RX | | Figure 8. SD4 instruction format #### 1.4.9 BD15-form | 0 | | 10 | 12 | 16 | | 31 | |---|-----|------|------|----|------|----| | О | PCD | BO32 | BI32 | | BD15 | LK | Figure 9. BD15 instruction format #### 1.4.10 BD24-form | 0 | 6 7 | | 31 | |------|-----|------|----| | OPCD | 0 | BD24 | LK | Figure 10. BD24 instruction format #### 1.4.11 D8-form | 0 | 6 | 11 | 16 | 24 31 | |------|----|----|----|-------| | OPCD | RT | RA | XO | D8 | | OPCD | RS | RA | XO | D8 | Figure 11. D8 instruction format #### 1.4.12 I16A-form | 0 | 6 | 11 | 16 | 21 31 | |------|----|----|----|-------| | OPCD | si | RA | XO | si | | OPCD | ui | RA | XO | ui | Figure 12. I16A instruction format ### 1.4.13 **I16L-form** | 0 | 1 | 6 | 11 | 16 | 21 | 31 | |---|------|----|----|----|----|----| | | OPCD | RT | ui | XO | ui | | Figure 13. I16L instruction format ### 1.4.14 M-form | 0 | 6 | 11 | 16 | 21 | 26 | 31 | |------|----|----|----|----|----|----| | OPCD | RS | RA | SH | MB | ME | X | | OPCD | RS | RA | SH | MB | ME | X | Figure 14. M instruction format ### 1.4.15 SCI8-form | 0 | 6 9 | 11 | 16 20 | 21 22 | 24 31 | |------|----------|----|-------|-------|-------| | OPCD | RT | RA | XO Ro | FSCL | UI8 | | OPCD | RT | RA | XO | FSCL | UI8 | | OPCD | RS | RA | XO Ro | FSCL | UI8 | | OPCD | RS | RA | XO | FSCL | UI8 | | OPCD | 000 BF32 | RA | XO | FSCL | UI8 | | OPCD | 001 BF32 | RA | XO | FSCL | UI8 | | OPCD | XO | RA | XO | FSCL | UI8 | Figure 15. SC18 instruction format #### 1.4.16 LI20-form | 0 | 6 | 11 | 16 17 | 21 | 31 | |------|----|------|---------|------|----| | OPCD | RT | li20 | XO li20 | li20 | | Figure 16. LI20 instruction format ### 1.4.17 Instruction Fields VLE uses instruction fields defined in Section 1.6.24 of Book I as well as VLE-defined instruction fields defined below. ARX (12:15) Field used to specify an "alternate" General Purpose Register in the range R8:R23 to be used as a destination. #### ARY (8:11) Field used to specify an "alternate" General Purpose Register in the range R8:R23 to be used as a source. #### BD8 (8:15), BD15 (16:30), BD24 (7:30) Immediate field specifying a signed two's complement branch displacement which is concatenated on the right with 0b0 and signextended to 64 bits. BD15. (Used by 32-bit branch conditional class instructions) A 15-bit signed displacement that is sign-extended and shifted left one bit (concatenated with 0b0) and then added to the current instruction address to form the branch target address. BD24. (Used by 32-bit branch class instructions) A 24-bit signed displacement that is sign-extended and shifted left one bit (concatenated with 0b0) and then added to the current instruction address to form the branch target address. BD8. (Used by 16-bit branch and branch conditional class instructions) An 8-bit signed displacement that is sign-extended and shifted left one bit (concatenated with 0b0) and then added to the current instruction address to form the branch target address. #### BI16 (6:7), BI32 (12:15) Field used to specify one of the Condition Register fields to be used as a condition of a Branch Conditional instruction. #### BO16 (5), BO32 (10:11) Field used to specify whether to branch if the condition is true, false, or to decrement the Count Register and branch if the Count Register is not zero in a Branch Conditional instruction. #### BF32 (9:10) Field used to specify one of the Condition Register fields to be used as a target of a compare instruction. #### D8 (24:31) The D8 field is a 8-bit signed displacement which is sign-extended to 64 bits. F (21) Fill value used to fill the remaining 56 bits of a scaled-immediate 8 value. #### LI20 (17:20 || 11:15 || 21:31) A 20-bit signed immediate value which is signextended to 64 bits for the **e\_li** instruction. #### LK (7, 16, 31) LINK bit. 0 Do not set the Link Register. 1 Set the Link Register. The sum of the value 2 or 4 and the address of the *Branch* instruction is placed into the Link Register. #### OIM5 (7:11) Offset Immediate field used to specify a 5-bit unsigned fixed-point value in the range [1:32] encoded as [0:31]. Thus the binary encoding of 0b00000 represents an immediate value of 1, 0b00001 represents an immediate value of 2, and so on. OPCD (0:3, 0:4, 0:5, 0:9, 0:14, 0:15) Primary opcode field. #### Rc (6, 7, 20, 31) RECORD bit. - O Do not alter the Condition Register. - Set Condition Register Field 0. #### RX (12:15) Field used to specify a General Purpose Register in the ranges R0:R7 or R24:R31 to be used as a source or as a destination. R0 is encoded as 0b0000, R1 as 0b0001, etc. R24 is encoded as 0b1000, R25 as 0b1001, etc. #### RY (8:11) Field used to specify a General Purpose Register in the ranges R0:R7 or R24:R31 to be used as a source. R0 is encoded as 0b0000, R1 as 0b0001, etc. R24 is encoded as 0b1000, R25 as 0b1001, etc. #### RZ (8:11) Field used to specify a General Purpose Register in the ranges R0:R7 or R24:R31 to be used as a source or as a destination for load/store data. R0 is encoded as 0b0000, R1 as 0b0001, etc. R24 is encoded as 0b1000, R25 as 0b1001, etc. #### SCL (22:23) Field used to specify a scale amount in *Immediate* instructions using the SCI8-form. Scaling involves left shifting by 0, 8, 16, or 24 bits. #### SD4 (4:7) Used by 16-bit load and store class instructions. The SD4 field is a 4-bit unsigned immediate value zero-extended to 64 bits, shifted left according to the size of the operation, and then added to the base register to form a 64-bit EA. For byte operations, no shift is performed. For half-word operations, the immediate is shifted left one bit (concatenated with 0b0). For word operations, the immediate is shifted left two bits (concatenated with 0b00).SI (6:10 || 21:31, 11:15 || 21:31) A 16-bit signed immediate value signextended to 64 bits and used as one operand of the instruction. #### UI (6:10 || 21:31, 11:15 || 21:31) A 16-bit unsigned immediate value zeroextended to 64 bits or padded with 16 zeros and used as one operand of the instruction. The instruction encoding differs between the I16A and I16L instruction formats as shown in Section 1.4.12 and Section 1.4.13. #### UI5 (7:11) Immediate field used to specify a 5-bit unsigned fixed-point value. #### UI7 (5:11) Immediate field used to specify a 7-bit unsigned fixed-point value. #### UI8 (24:31) Immediate field used to specify an 8-bit unsigned fixed-point value. XO (6, 6:7, 6:10, 6:11, 16, 16:19,16:23) Extended opcode field. #### Assembler Note For scaled immediate instructions using the SCI8-form, the instruction assembly syntax requires a single immediate value, *sci8*, that the assembler will synthesize into the appropriate F, SCL, and UI8 fields. The F, SCL, and UI8 fields must be able to be formed correctly from the given *sci8* value or the assembler will flag the assembly instruction as an error. ### **Chapter 2. VLE Storage Addressing** - 2.1 Data Storage Addressing Modes . 7672.2 Instruction Storage Addressing Modes 768 - 2.2.1 Misaligned, Mismatched, and Byte Ordering Instruction Storage Exceptions.768 - 2.2.2 VLE Exception Syndrome Bits . . 768 A program references memory using the effective address (EA) computed by the processor when it executes a *Storage Access* or *Branch* instruction (or certain other instructions described in Book II and Book III-E), or when it fetches the next sequential instruction. ### 2.1 Data Storage Addressing Modes Table 1 lists data storage addressing modes supported by the VLE category. | Table 1: Data Storage Addressing Modes | | | | | |--------------------------------------------------------------------|----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--| | Mode | Form | Description | | | | Base+16-bit displacement (32-bit instruction format) | D-form | The 16-bit D field is sign-extended and added to the contents of the GPR designated by RA or to zero if RA = 0 to produce the EA. | | | | Base+8-bit displacement (32-bit instruction format) | D8-form | The 8-bit D8 field is sign-extended and added to the contents of the GPR designated by RA or to zero if RA = 0 to produce the EA. | | | | Base+scaled 4-bit displace-<br>ment<br>(16-bit instruction format) | SD4-form | The 4-bit SD4 field zero-extended, scaled (shifted left) according to the size of the operand, and added to the contents of the GPR designated by RX to produce the EA. (Note that RX = 0 is not a special case.) | | | | Base+Index (32-bit instruction format) | X-form | The GPR contents designated by RB are added to the GPR contents designated by RA or to zero if RA = 0 to produce the EA. | | | # 2.2 Instruction Storage Addressing Modes Table 2 lists instruction storage addressing modes supported by the VLE category. | Table 2: Instruction Storage Addressing Modes | | | | | | |--------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--|--| | Mode | Description | | | | | | Taken BD24-form Branch instructions (32-bit instruction format) | The 24-bit BD24 field is concatenated on the right with 0b0, sign-extended, and then added to the address of the branch instruction. | | | | | | Taken B15-form Branch instructions (32-bit instruction format) | The 15-bit BD15 field is concatenated on the right with 0b0, sign-extended, and then added to the address of the branch instruction to form the EA of the next instruction. | | | | | | Take BD8-form Branch instructions (16-bit instruction format) | The 8-bit BD8 field is concatenated on the right with 0b0, sign-extended, and then added to the address of the branch instruction to form the EA of the next instruction. | | | | | | Sequential instruction fetching (or non-taken branch instructions) | The value 4 [2] is added to the address of the current 32-bit [16-bit] instruction to form the EA of the next instruction. If the address of the current instruction is 0xFFFF_FFFF_FFFF_FFFC [0xFFFF_FFFFFFFFFFF] in 64-bit mode or 0xFFFF_FFFC [0xFFFFFFFFFFFFF] in 32-bit mode, the address of the next sequential instruction is undefined. | | | | | | Any Branch instruction with LK = 1 (32-bit instruction format) | The value 4 is added to the address of the current branch instruction and the result is placed into the LR. If the address of the current instruction is 0xFFFF_FFFF_FFFFC in 64-bit mode o r0xFFFF_FFFC in 32-bit mode, the result placed into the LR is undefined. | | | | | | Branch se_bl. se_blrl. se_bctrl instructions (16-bit instruction format) | The value 2 is added to the address of the current branch instruction and the result is placed into the LR. If the address of the current instruction is 0xFFFF_FFFF_FFFFE in 64-bit mode or 0xFFFF_FFFE in 32-bit mode, the result placed into the LR is undefined. | | | | | # 2.2.1 Misaligned, Mismatched, and Byte Ordering Instruction Storage Exceptions A Misaligned Instruction Storage Exception occurs when an implementation which supports VLE attempts to execute an instruction that is not 32-bit aligned and the VLE storage attribute is not set for the page that corresponds to the effective address of the instruction. The attempted execution can be the result of a Branch instruction which has bit 62 of the target address set to 1 or the result of an rfi, se\_rfi, rfci, se\_rfci, rfdi, se\_rfdi, rfmci, or se\_rfmci instruction which has bit 62 set in SRR0, SRR0, CSRR0, CSRR0, DSRR0, DSRR0, DSRR0, MCSRR0, or MCSRR0 respectively. If a Misaligned Instruction Storage Exception is detected and no higher priority exception exists, an Instruction Storage Interrupt will occur setting SRR0 to the misaligned address for which execution was attempted. A Mismatched Instruction Storage Exception occurs when an implementation which supports VLE attempts to execute an instruction that crosses a page boundary for which the first page has the VLE storage attribute set to 1 and the second page has the VLE storage attribute bit set to 0. If a Mismatched Instruction Stor- age Exception is detected and no higher priority exception exists, an Instruction Storage Interrupt will occur setting SRR0 to the misaligned address for which execution was attempted. A Byte Ordering Instruction Storage Exception occurs when an implementation which supports VLE attempts to execute an instruction that has the VLE storage attribute set to 1 and the E (Endian) storage attribute set to 1 for the page that corresponds to the effective address of the instruction. If a Byte Ordering Instruction Storage Exception is detected and no higher priority exception exists, an Instruction Storage Interrupt will occur setting SRR0 to the address for which execution was attempted. ## 2.2.2 VLE Exception Syndrome Bits Two bits in the Exception Syndrome Register (ESR) (see Section 5.2.9 of Book III-E) are provided to facilitate VLE exception handling, VLEMI and MIF. ESR<sub>VLEMI</sub> is set when an exception and subsequent interrupt is caused by the execution or attempted execution of an instruction that resides in memory with the VLE storage attribute set. ESR<sub>MIF</sub> is set when an Instruction Storage Interrupt is caused by a Misaligned Instruction Storage Exception or when an Instruction TLB Error Interrupt was caused by a TLB miss on the second half of a misaligned 32-bit instruction. ESR<sub>BO</sub> is set when an Instruction Storage Interrupt is caused by a Mismatched Instruction Storage Exception or a Byte Ordering Instruction Storage Exception. #### **Programming Note** When an Instruction TLB Error Interrupt occurs as the result of a Instruction TLB miss on the second half of a 32-bit VLE instruction that is aligned to only 16-bits, SRR0 will point to the first half of the instruction and ESR<sub>MIF</sub> will be set to 1. Any other status posted as a result of the TLB miss (such as MAS register updates described in TYPE-FSL Memory Management) will reflect the page corresponding to the second half of the instruction which caused the Instruction TLB miss. ### Chapter 3. VLE Compatibility with Books I–III | | 2.2 MMU Extensions | |------------|--------------------| | Extensions | | This chapter addresses the relationship between VLE and Books I-III. #### 3.1 Overview Category VLE uses the same semantics as Books I–III. Due to the limited instruction encoding formats, VLE instructions typically support reduced immediate fields and displacements, and not all operations defined by Books I–III are encoded in category VLE. The basic philosophy is to capture all useful operations, with most frequent operations given priority. Immediate fields and displacements are provided to cover the majority of ranges encountered in embedded control code. Instructions are encoded in either a 16- or 32-bit format, and these may be freely intermixed. VLE instructions cannot access floating-point registers (FPRs). VLE instructions use GPRs and SPRs with the following limitations: - VLE instructions using the 16-bit formats are limited to addressing GPR0–GPR7, and GPR24–GPR31 in most instructions. Move instructions are provided to transfer register contents between these registers and GPR8–GPR23. - VLE compare and bit test instructions using the 16bit formats implicitly set their results in CR0. VLE instruction encodings are generally different than instructions defined by Books I–III, except that most instructions falling within primary opcode 31 are encoded identically and have identical semantics unless they affect or access a resource not supported by category VLE. ## 3.2 VLE Processor and Storage Control Extensions This section describes additional functionality to support category VLE. #### 3.2.1 Instruction Extensions This section describes extensions to support VLE operations. Because instructions may reside on a half-word boundary, bit 62 is not masked by instructions that read an instruction address from a register, such as the LR, CTR, or a save/restore register 0, that holds an instruction address: The instruction set defined by Books I-III is modified to support halfword instruction addressing, as follows: - For *Return From Interrupt* instructions, such as *rfi*, *rfci*, *rfdi*, and *rfmci* no longer mask bit 62 of the respective save/restore register 0. The destination address is SRR0<sub>0:62</sub> || 0b0, CSRR0<sub>0:62</sub> || 0b0, DSRR0<sub>0:62</sub> || 0b0, MCSRR0<sub>0:62</sub> || 0b0 respectively. - For bclr, bclrl, bcctr, and bcctrl no longer mask bit 62 of the LR or CTR. The destination address is LR<sub>0:62</sub> || 0b0 or CTR<sub>0:62</sub> || 0b0. #### 3.2.2 MMU Extensions VLE operation is indicated by the VLE storage attribute. When the VLE storage attribute for a page is set to 1, instruction fetches from that page are decoded and processed as VLE instructions. See Section 4.8.3 of Book III-E. When instructions are executing from a page that has the VLE storage attribute set to 1, the processor is said to be in VLE mode. #### 3.3 VLE Limitations VLE instruction fetches are valid only when performed in a Big-Endian mode. Attempting to fetch an instruction in a Little-Endian mode from a page with the VLE storage attribute set causes an Instruction Storage Byte-ordering exception. Support for concurrent modification and execution of VLE instructions is implementation-dependent. ### **Chapter 4. Branch Operation Instructions** | 4.1 Branch Processor Registers 773 | 4.1.2 Link Register (LR) 774 | |--------------------------------------------|----------------------------------------| | 4.1.1 Condition Register (CR) 773 | 4.1.3 Count Register (CTR)77 | | 4.1.1.1 Condition Register Setting for | 4.2 Branch Instructions 77 | | Compare Instructions 774 | 4.3 System Linkage Instructions 778 | | 4.1.1.2 Condition Register Setting for the | 4.4 Condition Register Instructions 78 | | Bit Test Instruction | • | This section defines *Branch* instructions that can be executed when a processor is in VLE mode and the registers that support them. #### 4.1 Branch Processor Registers The registers that support branch operations are: - Section 4.1.1, "Condition Register (CR)" - Section 4.1.2, "Link Register (LR)" - Section 4.1.3, "Count Register (CTR)" #### 4.1.1 Condition Register (CR) The Condition Register (CR) is a 32-bit register which reflects the result of certain operations, and provides a mechanism for testing (and branching). The CR is more fully defined in Book I. Category VLE uses the entire CR, but some comparison operations and all *Branch* instructions are limited to using CR0–CR3. The full Book I condition register field and logical operations are provided however. Figure 17. Condition Register The bits in the Condition Register are grouped into eight 4-bit fields, CR Field 0 (CR0) ... CR Field 7 (CR7), which are set by VLE defined instructions in one of the following ways. - Specified fields of the condition register can be set by a move to the CR from a GPR (*mtcrf*, *mtocrf*). - A specified CR field can be set by a move to the CR from another CR field (e\_mcrf) or from XER<sub>32:35</sub> (mcrxr). - CR field 0 can be set as the implicit result of a fixed-point instruction. - A specified CR field can be set as the result of a fixed-point compare instruction. - CR field 0 can be set as the result of a fixed-point bit test instruction. Other instructions from implemented categories may also set bits in the CR in the same manner that they would when not in VLE mode. Instructions are provided to perform logical operations on individual CR bits and to test individual CR bits. For all fixed-point instructions in which the Rc bit is defined and set, and for **e\_add2i.**, **e\_and2i.**, and **e\_and2is.**, the first three bits of CR field 0 (CR<sub>32:34</sub>) are set by signed comparison of the result to zero, and the fourth bit of CR field 0 (CR<sub>35</sub>) is copied from the final state of XER<sub>SO</sub>. "Result" here refers to the entire 64-bit value placed into the target register in 64-bit mode, and to bits 32:63 of the value placed into the target register in 32-bit mode. ``` if (64-bit mode) then M \leftarrow 0 else M \leftarrow 32 if (target_register)<sub>M:63</sub> < 0 then c \leftarrow 0b100 else if (target_register)<sub>M:63</sub> > 0 then c \leftarrow 0b010 else c \leftarrow 0b001 CR0 \leftarrow c \parallel XER<sub>SO</sub> ``` If any portion of the result is undefined, the value placed into the first three bits of CR field 0 is undefined. The bits of CR field 0 are interpreted as shown below. #### CR Bit Description ``` 32 Negative (LT) The result is negative. 33 Positive (GT) The result is positive. 34 Zero (EQ) The result is 0. ``` #### 35 **Summary overflow** (SO) This is a copy of the contents of XER<sub>SO</sub> at the completion of the instruction. ## 4.1.1.1 Condition Register Setting for Compare Instructions For compare instructions, a CR field specified by the BF operand for the **e\_cmph**, **e\_cmphl**, **e\_cmpi**, and **e\_cmpli** instructions, or CR0 for the **se\_cmpl**, **e\_cmp16i**, **e\_cmph16i**, **e\_cmph16i**, **e\_cmph16i**, **e\_cmph16i**, **se\_cmp**, **se\_cmph**, **se\_cmph**, **se\_cmpi**, and **se\_cmpli** instructions, is set to reflect the result of the comparison. The CR field bits are interpreted as shown below. A complete description of how the bits are set is given in the instruction descriptions and Section 5.6, "Fixed-Point Compare and Bit Test Instructions". Condition register bits settings for compare instructions are interpreted as follows. (Note: **e\_cmpi**, and **e\_cmpli** instructions have a BF32 field instead of BF field; for these instructions, BF32 should be substituted for BF in the list below.) #### CR Bit Description #### 4×BF + 32 #### Less Than (LT) For signed fixed-point compare, (RA) or (RX) < sci8, SI, (RB), or (RY). For unsigned fixed-point compare, (RA) or (RX) < u sci8, UI, UI5, (RB), or (RY). #### 4×BF + 33 #### **Greater Than** (GT) For signed fixed-point compare, (RA) or (RX) > sci8, SI, (RB), or (RY). For unsigned fixed-point compare, (RA) or (RX) ><sup>u</sup> sci8, UI, UI5, (RB), or (RY). #### 4×BF + 34 #### Equal (EQ) For fixed-point compare, (RA) or (RX) = sci8, UI, UI5, SI, (RB), or (RY). #### 4×BF + 35 #### Summary Overflow (SO) For fixed-point compare, this is a copy of the contents of XER<sub>SO</sub> at the completion of the instruction. ## 4.1.1.2 Condition Register Setting for the Bit Test Instruction The Bit Test Immediate instruction, **se\_btsti**, also sets CR field 0. See the instruction description and also Section 5.6, "Fixed-Point Compare and Bit Test Instructions". #### 4.1.2 Link Register (LR) VLE instructions use the Link Register (LR) as defined in Book I, although category VLE defines a subset of all variants of Book I conditional branches involving the LR. #### 4.1.3 Count Register (CTR) VLE instructions use the Count Register (CTR) as defined in Book I, although category VLE defines a subset of the variants of Book I conditional branches involving the CTR. #### 4.2 Branch Instructions The sequence of instruction execution can be changed by the branch instructions. Because VLE instructions must be aligned on half-word boundaries, the low-order bit of the generated branch target address is forced to 0 by the processor in performing the branch. The branch instructions compute the EA of the target in one of the following ways, as described in Section 2.2, "Instruction Storage Addressing Modes" - Adding a displacement to the address of the branch instruction. - Using the address contained in the LR (Branch to Link Register [and Link]). - Using the address contained in the CTR (Branch to Count Register [and Link]). Branching can be conditional or unconditional, and the return address can optionally be provided. If the return address is to be provided (LK = 1), the EA of the instruction following the branch instruction is placed into the LR after the branch target address has been computed; this is done regardless of whether the branch is taken. In branch conditional instructions, the Bl32 or Bl16 instruction field specifies the CR bit to be tested. For 32-bit instructions using Bl32, $CR_{32:47}$ (corresponding to bits in CR0:CR3) may be specified. For 16-bit instructions using Bl16, only $CR_{32:35}$ (bits within CR0) may be specified. In branch conditional instructions, the BO32 or BO16 field specifies the conditions under which the branch is taken and how the branch is affected by or affects the CR and CTR. Note that VLE instructions also have different encodings for the BO32 and BO16 fields than in Book I's BO field. If the BO32 field specifies that the CTR is to be decremented, in 64-bit mode $CTR_{0:63}$ are decremented, and in 32-bit mode $CTR_{32:63}$ are decremented. If BO16 or BO32 specifies a condition that must be TRUE or FALSE, that condition is obtained from the contents of $CR_{Bl32+32}$ or $CR_{Bl16+32}$ . (Note that CR bits are numbered 32:63. Bl32 or Bl16 refers to the condition register bit field in the branch instruction encoding. For example, specifying Bl32 = 2 refers to $CR_{34}$ .) For Figure 18 let M = 0 in 64-bit mode and M = 32 in 32-bit mode. Encodings for the BO32 field for VLE are shown in Figure 18. | BO32 | Description | |------|----------------------------------------------------------------------------------------| | 00 | Branch if the condition is false. | | 01 | Branch if the condition is true. | | 10 | Decrement CTR <sub>M:63</sub> , then branch if the decremented CTR <sub>M:63</sub> ≠0 | | 11 | Decrement CTR <sub>M:63</sub> , then branch if the decremented CTR <sub>M:63</sub> =0. | Figure 18. BO32 field encodings Encodings for the BO16 field for VLE are shown in Figure 19. | BO16 | Description | |------|-----------------------------------| | 0 | Branch if the condition is false. | | 1 | Branch if the condition is true. | Figure 19. BO16 field encodings #### Branch [and Link] #### BD24-form #### Branch [and Link] #### BD8-form | e_b | target_addr | (LK=0) | |------|-------------|--------| | e_bl | target_addr | (LK=1) | | 30 | 0 | BD24 | LK | |----|---|------|----| | 0 | 6 | 7 | 31 | NIA $$\leftarrow_{\text{iea}}$$ CIA + EXTS(BD24 $\parallel$ 0b0) if LK then LR $\leftarrow_{\text{iea}}$ CIA + 4 target\_addr specifies the branch target address. The branch target address is the sum of BD24 || 0b0 sign-extended and the address of this instruction, with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If LK=1 then the effective address of the instruction following the Branch instruction is placed into the Link Register. #### **Special Registers Altered:** LR (if LK=1) #### Branch Conditional [and Link] BD15-form | e_bc | BO32,BI32,target_addr | (LK=0) | |-------|-----------------------|--------| | e_bcl | BO32,BI32,target_addr | (LK=1) | | 30 | 8 | BO32 | BI32 | BD15 | LK | |----|---|------|------|------|----| | 0 | 6 | 10 | 12 | 16 | 31 | ``` if (64\text{-bit mode}) then M \leftarrow 0 else M \leftarrow 32 if B032_0 then CTR_{M:63} \leftarrow CTR_{M:63} - 1 ctr_ok \leftarrow \neg B032_0 \mid ((CTR_{M:63} \neq 0) \oplus B032_1) cond_ok \leftarrow B032_0 \mid (CR_{B132+32} \equiv B032_1) if ctr_ok & cond_ok then NIA \leftarrow_{iea} (CIA + EXTS(BD15 \parallel 0b0)) else NIA \leftarrow_{iea} CIA + 4 if LK then LR \leftarrow_{iea} CIA + 4 ``` The Bl32 field specifies the Condition Register bit to be tested. The BO32 field is used to resolve the branch as described in Figure 18. *target\_addr* specifies the branch target address. The branch target address is the sum of BD15 || 0b0 sign-extended and the address of this instruction, with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If LK=1 then the effective address of the instruction following the *Branch* instruction is placed into the Link Register. #### **Special Registers Altered:** | CTR | (if BO32 <sub>0</sub> =1) | |-----|---------------------------| | LR | (if LK=1) | | se_b | target_addr | , | LK=0) | |-------|-------------|---|-------| | se_bl | target_addr | | LK=1) | | | | | | | 58 | 0 | LK | BD8 | | |----|---|----|-----|---| | 0 | 6 | 7 | 8 1 | Ę | NIA $$\leftarrow_{\text{iea}}$$ CIA + EXTS(BD8 || 0b0) if LK then LR $\leftarrow_{\text{iea}}$ CIA + 2 target\_addr specifies the branch target address. The branch target address is the sum of BD8 || 0b0 sign-extended and the address of this instruction, with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If LK=1 then the effective address of the instruction following the Branch instruction is placed into the Link Register. #### **Special Registers Altered:** LR (if LK=1) #### Branch Conditional Short Form BD8-form se\_bc BO16,BI16,target\_addr ``` \begin{array}{ll} {\tt cond\_ok} \; \leftarrow \; ({\tt CR_{BI16+32}} \; \equiv \; {\tt B016}) \\ {\tt if} \; \; {\tt cond\_ok} \; \; {\tt then} \\ & \qquad \qquad {\tt NIA} \; \leftarrow_{\tt iea} \; {\tt CIA} \; + \; {\tt EXTS} \, ({\tt BD8} \; \parallel \, {\tt 0b0}) \\ {\tt else} & \qquad {\tt NIA} \; \leftarrow_{\tt iea} \; {\tt CIA} \; + \; 2 \end{array} ``` The BI16 field specifies the Condition Register bit to be tested. The BO16 field is used to resolve the branch as described in Figure 19. *target\_addr* specifies the branch target address. The branch target address is the sum of BD8 || 0b0 sign-extended and the address of this instruction, with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. #### **Special Registers Altered:** #### Branch to Count Register [and Link] C-form NIA $$\leftarrow_{\text{iea}} \text{CTR}_{\text{0:62}} \parallel \text{0b0}$$ if LK then LR $\leftarrow_{\text{iea}} \text{CIA} + 2$ The branch target address is $CTR_{0:62} \parallel 0b0$ with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If LK=1 then the effective address of the instruction following the *Branch* instruction is placed into the Link Register. #### **Special Registers Altered:** #### Branch to Link Register [and Link]C-form NIA $$\leftarrow_{\rm iea}$$ LR $_{\rm 0:62}$ $||$ 0b0 if LK then LR $\leftarrow_{\rm iea}$ CIA + 2 The branch target address is $LR_{0:62} \parallel$ 0b0 with the high-order 32 bits of the branch target address set to 0 in 32-bit mode. If LK=1 then the effective address of the instruction following the *Branch* instruction is placed into the Link Register. #### **Special Registers Altered:** ### 4.3 System Linkage Instructions The System Linkage instructions enable the program to call upon the system to perform a service and provide a means by which the system can return from performing a service or from processing an interrupt. System Linkage instructions defined by the VLE category are identical in semantics to System Linkage instructions defined in Book I and Book III-E with the exception of the LEV field, but are encoded differently. se\_sc provides the same functionality as the Book I (and Book III-E) instruction sc without the LEV field. se\_rfi, se\_rfci, se\_rfdi, and se\_rfmci provide the same functionality as the Book III-E instructions rfi, rfci, rfdi, and rfmci respectively. #### System Call C-form se\_sc ``` \begin{array}{l} \text{SRR1} \leftarrow_{\text{iea}} \text{MSR} \\ \text{SRR0} \leftarrow \text{CIA+2} \\ \text{NIA} \leftarrow_{\text{iea}} \text{IVPR}_{0:47} \parallel \text{IVOR8}_{48:59} \parallel \text{0b0000} \\ \text{MSR} \leftarrow \text{new value (see below)} \end{array} ``` The effective address of the instruction following the *System Call* instruction is placed into SRR0. The contents of the MSR are copied into SRR1. Then a System Call interrupt is generated. The interrupt causes the MSR to be set as described in Section 5.6 of Book III-E. The interrupt causes the next instruction to be fetched from effective address IVPR<sub>0:47</sub> || IVOR8<sub>48:59</sub> || 0b0000. This instruction is context synchronizing. #### **Special Registers Altered:** SRR0 SRR MSR Illegal C-form se\_illegal **se\_illegal** is used to request an Illegal Instruction exception. The behavior is the same as if an illegal instruction was executed. This instruction is context synchronizing. #### **Special Registers Altered:** SRR0 SRR1 MSR ESR #### Return From Machine Check Interrupt Cform se\_rfmci $$\begin{aligned} & \text{MSR} \leftarrow \text{MCSRR1} \\ & \text{NIA} \leftarrow_{\text{iea}} & \text{MCSRR0}_{\text{0:62}} \parallel \text{0b0} \end{aligned}$$ The **se\_rfmci** instruction is used to return from a machine check class interrupt, or as a means of establishing a new context and synchronizing on that new context simultaneously. The contents of MCSRR1 are placed into the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched, under control of the new MSR value, from the address MCSRR0<sub>0:62</sub>||0b0. If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the values placed into the save/restore registers by the interrupt processing mechanism (see Chapter 5 of Book III-E) is the address and MSR value of the instruction that would have been executed next had the interrupt not occurred (that is, the address in MCSRR0 at the time of the execution of the se\_rfmci). This instruction is privileged and context synchronizing. #### **Special Registers Altered:** **MSR** #### Return From Critical Interrupt C-form se\_rfci $$\begin{aligned} & \text{MSR} \leftarrow \text{CSRR1} \\ & \text{NIA} \leftarrow_{\text{iea}} \text{CSRR0}_{0:62} \parallel \text{0b0} \end{aligned}$$ The **se\_rfci** instruction is used to return from a critical class interrupt, or as a means of establishing a new context and synchronizing on that new context simultaneously. The contents of CSRR1 are placed into the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched, under control of the new MSR value, from the address CSRR0<sub>0:62</sub>||0b0. If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the values placed into the save/restore registers by the interrupt processing mechanism (see Chapter 5 of Book III-E) is the address and MSR value of the instruction that would have been executed next had the interrupt not occurred (that is, the address in CSRR0 at the time of the execution of the **se\_rfci**). This instruction is privileged and context synchronizing. #### **Special Registers Altered:** **MSR** #### Return From Interrupt #### C-form #### Return From Debug Interrupt C-form se\_rfi $$\begin{aligned} & \text{MSR} & \leftarrow & \text{SRR1} \\ & \text{NIA} & \leftarrow_{\text{iea}} & \text{SRR0}_{\text{0:62}} \parallel \text{0b0} \end{aligned}$$ The **se\_rfi** instruction is used to return from a non-critical class interrupt, or as a means of establishing a new context and synchronizing on that new context simultaneously. The contents of SRR1 are placed into the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched under control of the new MSR value from the address SRR0<sub>0:62</sub>||0b0. If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the values placed into the save/restore registers by the interrupt processing mechanism (see Chapter 5 of Book III-E) is the address and MSR value of the instruction that would have been executed next had the interrupt not occurred (that is, the address in SRR0 at the time of the execution of the **se\_rfi**). This instruction is privileged and context synchronizing. #### **Special Registers Altered:** MSR se\_rfdi $$\begin{array}{l} \text{MSR} \leftarrow \text{DSRR1} \\ \text{NIA} \leftarrow_{\text{iea}} \text{DSRR0}_{32:62} \parallel \text{Ob0} \end{array}$$ The **se\_rfdi** instruction is used to return from a debug class interrupt, or as a means of establishing a new context and synchronizing on that new context simultaneously. The contents of DSRR1 are placed into the MSR. If the new MSR value does not enable any pending exceptions, then the next instruction is fetched, under control of the new MSR value, from the address DSRR0<sub>0:62</sub>||0b0. If the new MSR value enables one or more pending exceptions, the interrupt associated with the highest priority pending exception is generated; in this case the value placed into the save/restore registers by the interrupt processing mechanism (see Chapter 5 of Book III-E) is the address of the instruction that would have been executed next had the interrupt not occurred (that is, the address in DSRR0 at the time of the execution of **se\_rfdi**). This instruction is privileged and context synchronizing. #### **Special Registers Altered:** **MSR** #### **Corequisite Categories:** Embedded.Enhanced Debug ### 4.4 Condition Register Instructions Condition Register instructions are provided to transfer values to and from various portions of the CR. Category VLE does not introduce any additional functionality beyond that defined in Book I for CR operations, but does remap the CR-logical and *mcrf* instruction functionality into primary opcode 31. These instructions operate identically to the Book I instructions, but are encoded differently. #### Condition Register AND XL-form e\_crand BT,BA,BB $$CR_{BT+32} \leftarrow CR_{BA+32} \& CR_{BB+32}$$ The bit in the Condition Register specified by BA+32 is ANDed with the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. #### **Special Registers Altered:** CR<sub>BT+32</sub> ## Condition Register AND with Complement XL-form e\_crandc BT,BA,BB | Ī | 31 | BT | BA | BB | 129 | / | |---|----|----|----|----|-----|----| | | 0 | 6 | 11 | 16 | 21 | 31 | $$CR_{BT+32} \leftarrow CR_{BA+32} \& \neg CR_{BB+32}$$ The bit in the Condition Register specified by BA+32 is ANDed with the one's complement of the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. #### **Special Registers Altered:** CR<sub>BT+32</sub> #### Condition Register Equivalent XL-form e\_creqv BT,BA,BB | 31 | BT | BA | BB | 289 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$CR_{BT+32} \leftarrow CR_{BA+32} \equiv CR_{BB+32}$$ The bit in the Condition Register specified by BA+32 is XORed with the bit in the Condition Register specified by BB+32, and the complemented result is placed into the bit in the Condition Register specified by BT+32. #### **Special Registers Altered:** CR<sub>BT+32</sub> #### Condition Register NAND XL-form e\_crnand BT,BA,BB | 31 | BT | BA | BB | 225 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$\mathtt{CR}_{\mathtt{BT+32}} \; \leftarrow \; \neg \; (\mathtt{CR}_{\mathtt{BA+32}} \; \& \; \mathtt{CR}_{\mathtt{BB+32}})$$ The bit in the Condition Register specified by BA+32 is ANDed with the bit in the Condition Register specified by BB+32, and the complemented result is placed into the bit in the Condition Register specified by BT+32. #### **Special Registers Altered:** CR<sub>BT+32</sub> #### **Condition Register NOR** XL-form AL-IOIII e\_crnor BT,BA,BB | 31 | BT | BA | BB | 33 | / | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$CR_{BT+32} \leftarrow \neg (CR_{BA+32} \mid CR_{BB+32})$$ The bit in the Condition Register specified by BA+32 is ORed with the bit in the Condition Register specified by BB+32, and the complemented result is placed into the bit in the Condition Register specified by BT+32. #### **Special Registers Altered:** CR<sub>BT+32</sub> ## Condition Register OR with Complement XL-form e\_crorc BT,BA,BB | Ī | 31 | BT | BA | BB | 417 | / | |---|----|----|----|----|-----|----| | ŀ | 0 | 6 | 11 | 16 | 21 | 31 | $$CR_{BT+32} \leftarrow CR_{BA+32} \mid \neg CR_{BB+32}$$ The bit in the Condition Register specified by BA+32 is ORed with the complement of the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. #### **Special Registers Altered:** CR<sub>BT+32</sub> #### Move CR Field XL-form e\_mcrf BF,BFA | 31 | RF | // | BFA | ///// | 16 | / | |-----|----|-----------|-----|-------|----|----| | J - | 01 | <i>''</i> | | 11111 | 1 | | | 0 | 6 | 9 | 11 | 16 | 21 | 31 | $CR_{4xBF+32:4xBF+35} \leftarrow CR_{4xBFA+32:4xBFA+35}$ The contents of Condition Register field BFA are copied to Condition Register field BF. #### **Special Registers Altered:** CR field BF #### Condition Register OR XL-form e\_cror BT,BA,BB | 31 | BT | BA | BB | 449 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$CR_{BT+32} \leftarrow CR_{BA+32} \mid CR_{BB+32}$$ The bit in the Condition Register specified by BA+32 is ORed with the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. #### **Special Registers Altered:** $\mathsf{CR}_{\mathsf{BT+32}}$ #### **Condition Register XOR** XL-form e\_crxor BT,BA,BB | 31 | BT | BA | BB | 193 | / | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $CR_{BT+32} \leftarrow CR_{BA+32} \oplus CR_{BB+32}$ The bit in the Condition Register specified by BA+32 is XORed with the bit in the Condition Register specified by BB+32, and the result is placed into the bit in the Condition Register specified by BT+32. #### **Special Registers Altered:** CR<sub>BT+32</sub> ### **Chapter 5. Fixed-Point Instructions** | 5.1 Fixed-Point Load Instructions 783 | 5.7 Fixed-Point Trap Instructions 799 | |---------------------------------------------|--------------------------------------------| | 5.2 Fixed-Point Store Instructions 787 | 5.8 Fixed-Point Select Instruction 799 | | 5.3 Fixed-Point Load and Store with Byte | 5.9 Fixed-Point Logical, Bit, and Move | | Reversal Instructions 790 | Instructions 800 | | 5.4 Fixed-Point Load and Store Multiple | 5.10 Fixed-Point Rotate and Shift Instruc- | | Instructions | tions | | 5.5 Fixed-Point Arithmetic Instructions 791 | 5.11 Move To/From System Register | | 5.6 Fixed-Point Compare and Bit Test | Instructions | | Instructions | | | | | This section lists the fixed-point instructions supported by category VLE. #### 5.1 Fixed-Point Load Instructions The fixed-point *Load* instructions compute the effective address (EA) of the memory to be accessed as described in Section 2.1, "Data Storage Addressing Modes" The byte, halfword, word, or doubleword in storage addressed by EA is loaded into RT or RZ. Category VLE supports both Big- and Little-Endian byte ordering for data accesses. Some fixed-point load instructions have an update form in which RA is updated with the EA. For these forms, if RA≠0 and RA≠RT, the EA is placed into RA and the memory element (byte, halfword, word, or doubleword) addressed by EA is loaded into RT. If RA=0 or RA =RT, the instruction form is invalid. This is the same behavior as specified for load with update instructions in Book I. The fixed-point *Load* instructions from Book I, *Ibzx*, *Ibzux*, *Ihzx*, *Ihzxx*, *Iwzx*, and *Iwzux* are available while executing in VLE mode. The mnemonics, decoding, and semantics for these instructions are identical to those in Book I. See Section 3.3.2 of Book I for the instruction definitions. The fixed-point *Load* instructions from Book I, *Iwax*, *Iwaux*, *Idx*, and *Idux* are available while executing in VLE mode on 64-bit implementations. The mnemonics, decoding, and semantics for these instructions are identical to those in Book I. See Section 3.3.2 of Book Ifor the instruction definitions. #### Load Byte and Zero #### D-form #### Load Byte and Zero Short Form SD4-form e\_lbz RT,D(RA) | Ī | 12 | RT | RA | D | | |---|----|----|----|----|----| | ( | ) | 6 | 11 | 16 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) RT $\leftarrow$ 560 || MEM(EA, 1) Let the effective address (EA) be the sum (RA|0) + D. The byte in storage addressed by EA is loaded into $RT_{56:63}$ . $RT_{0:55}$ are set to 0. #### **Special Registers Altered:** None | 80 | SD4 | RZ | RX | |----|-----|----|----| RZ,SD4(RX) EA $$\leftarrow$$ (RX) + <sup>60</sup>0 || SD4 RZ $\leftarrow$ <sup>56</sup>0 || MEM(EA, 1) se\_lbz Let the effective address (EA) be the sum RX + SD4. The byte in storage addressed by EA is loaded into $RT_{56:63}$ . $RT_{0:55}$ are set to 0. #### **Special Registers Altered:** None #### Load Byte and Zero with Update D8-form #### e\_lbzu RT,D8(RA) | 06 | RT | RA | 0 | D8 | | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 24 | 31 | EA $$\leftarrow$$ (RA) + EXTS(D8) RT $\leftarrow$ <sup>56</sup>0 || MEM(EA, 1) RA $\leftarrow$ EA Let the effective address (EA) be the sum (RA) + D8. The byte in storage addressed by EA is loaded into $RT_{56:63}$ . $RT_{0:55}$ are set to 0. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. #### **Special Registers Altered:** None #### Load Halfword Algebraic D-form e\_lha RT,D(RA) | 14 | RT | RA | D | |----|----|----|-------| | 0 | 6 | 11 | 16 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) RT \leftarrow EXTS(MEM(EA, 2)) ``` Let the effective address (EA) be the sum (RA|0) + D. The halfword in storage addressed by EA is loaded into RT $_{48:63}$ . RT $_{0:47}$ are filled with a copy of bit 0 of the loaded halfword. #### **Special Registers Altered:** None #### Load Halfword and Zero D-form #### e\_lhz RT,D(RA) | 22 | RT | RA | D | |----|----|----|-------| | 0 | 6 | 11 | 16 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) RT $\leftarrow$ 480 || MEM(EA, 2) Let the effective address (EA) be the sum (RA $|0\rangle$ + D. The halfword in storage addressed by EA is loaded into RT<sub>48:63</sub>. RT<sub>0:47</sub> are set to 0. #### **Special Registers Altered:** None ## Load Halfword and Zero Short Form SD4-form se\_lhz RZ,SD4(RX) $$EA \leftarrow (RX) + (^{59}0 \parallel SD4 \parallel 0)$$ $RZ \leftarrow ^{48}0 \parallel MEM(EA, 2)$ Let the effective address (EA) be the sum (RX) + (SD4 $\parallel$ 0). The halfword in storage addressed by EA is loaded into RZ<sub>48:63</sub>. RZ<sub>0:47</sub> are set to 0. #### **Special Registers Altered:** ## Load Halfword Algebraic with Update D8-form #### e\_lhau RT,D8(RA) | 06 | RT | RA | 03 | D8 | ٦ | |----|----|----|----|------|---| | 0 | 6 | 11 | 16 | 24 3 | 1 | ``` EA \leftarrow (RA) + EXTS(D8) RT \leftarrow EXTS(MEM(EA, 2)) RA \leftarrow EA ``` Let the effective address (EA) be the sum (RA) + D8. The halfword in storage addressed by EA is loaded into RT $_{48:63}$ . RT $_{0:47}$ are filled with a copy of bit 0 of the loaded halfword. EA is placed into RA. If RA=0 or RA=RT, the instruction form is invalid. #### **Special Registers Altered:** None #### Load Word and Zero #### **D-form** #### e\_lwz RT,D(RA) ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) RT \leftarrow 320 || MEM(EA, 4) ``` Let the effective address (EA) be the sum (RA|0) + D. The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. #### **Special Registers Altered:** None #### Load Halfword and Zero with Update D8-form e\_lhzu RT,D8(RA) | 06 | RT | RΔ | 01 | אם | | |----|----|----|----|----|----| | 00 | 1 | KA | 01 | 00 | | | 0 | 6 | 11 | 16 | 24 | 31 | ``` EA \leftarrow (RA) + EXTS(D8) RT \leftarrow <sup>48</sup>0 || MEM(EA, 2)) RA \leftarrow EA ``` Let the effective address (EA) be the sum (RA) + D8. The halfword in storage addressed by EA is loaded into $RT_{48:63}$ . $RT_{0:47}$ are set to 0. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. #### **Special Registers Altered:** None #### Load Word and Zero Short FormSD4-form se\_lwz RZ,SD4(RX) | 12 | SD4 | RZ | RX | | |----|-----|----|----|----| | 0 | 4 | 8 | 12 | 15 | EA $$\leftarrow$$ (RX) + ( $^{58}$ 0 || SD4 || $^{2}$ 0) RZ $\leftarrow$ $^{32}$ 0 || MEM(EA, 2) Let the effective address (EA) be the sum (RX) + (SD4 $\parallel$ 00). The word in storage addressed by EA is loaded into RZ<sub>32:63</sub>. RZ<sub>0:31</sub> are set to 0. #### **Special Registers Altered:** ### Load Word and Zero with Update D8-form e\_lwzu RT,D8(RA) ``` EA \leftarrow (RA) + EXTS(D8) RT \leftarrow <sup>32</sup>0 || MEM(EA, 4)) RA \leftarrow EA ``` Let the effective address (EA) be the sum (RA) + D8. The word in storage addressed by EA is loaded into $RT_{32:63}$ . $RT_{0:31}$ are set to 0. EA is placed into register RA. If RA=0 or RA=RT, the instruction form is invalid. #### **Special Registers Altered:** #### 5.2 Fixed-Point Store Instructions The fixed-point *Store* instructions compute the EA of the memory to be accessed as described in Section 2.1, "Data Storage Addressing Modes". The contents of register RS or RZ are stored into the byte, halfword, word, or doubleword in storage addressed by EA. Category VLE supports both Big- and Little-Endian byte ordering for data accesses. Some fixed-point store instructions have an update form, in which register RA is updated with the effective address. For these forms, the following rules (from Book I) apply. If RA≠0, the effective address is placed into register RA. If RS=RA, the contents of register RS are copied to the target memory element and then EA is placed into register RA (RS). The fixed-point *Store* instructions from Book I, *stbx*, *stbux*, *sthx*, *sthux*, *stwx*, and *stwux* are available while executing in VLE mode. The mnemonics, decoding, and semantics for those instructions are identical to those in Book I; see Section 3.3.3 of Book I for the instruction definitions. The fixed-point *Store* instructions from Book I, *stdx* and *stdux* are available while executing in VLE mode on 64-bit implementations. The mnemonics, decoding, and semantics for these instructions are identical to those in Book I; see Section 3.3.3 of Book I for the instruction definitions. #### Store Byte #### **D-form** #### Store Byte Short Form SD4-form e\_stb RS,D(RA) ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D) MEM(EA, 1) \leftarrow (RS)<sub>56:63</sub> ``` Let the effective address (EA) be the sum (RA|0)+ D. (RS) $_{56:63}$ are stored in the byte in storage addressed by EA. #### **Special Registers Altered:** None $$EA \leftarrow (RX) + EXTS(SD4)$$ $MEM(EA, 1) \leftarrow (RZ)_{56:63}$ Let the effective address (EA) be the sum (RX) + SD4. $(RZ)_{56:63}$ are stored in the byte in storage addressed by EA. #### **Special Registers Altered:** #### Store Byte with Update D8-form e\_stbu RS,D8(RA) | 06 | RS | RA | 04 | D8 | | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 24 | 31 | $$EA \leftarrow (RA) + EXTS(D8)$$ $MEM(EA, 1) \leftarrow (RS)_{56:63}$ $RA \leftarrow EA$ Let the effective address (EA) be the sum (RA) + D8. $(RS)_{56:63}$ are stored in the byte in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** None #### Store Halfword **D-form** #### Store Halfword Short Form SD4-form e\_sth RS,D(RA) | 23 | RS | RA | D | | |----|----|----|----|----| | 0 | 6 | 11 | 16 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) MEM(EA, 2) $\leftarrow$ (RS)<sub>48:63</sub> Let the effective address (EA) be the sum (RA|0) + D. (RS)<sub>48:63</sub> are stored in the halfword in storage addressed by EA. #### **Special Registers Altered:** None ## RZ,SD4(RX) | 11 | SD4 | RZ | RX | |----|-----|----|----| EA $$\leftarrow$$ (RX) + (<sup>59</sup>0 || SD4 || 0) MEM(EA, 2) $\leftarrow$ (RZ)<sub>48:63</sub> Let the effective address (EA) be the sum (RX) + (SD4 $\parallel$ 0). (RZ)<sub>48:63</sub> are stored in the halfword in storage addressed by EA. #### **Special Registers Altered:** None se\_sth #### Store Halfword with Update D8-form e\_sthu RS,D8(RA) | 06 | RS | RA | 05 | D8 | | |----|----|----|----|------|----| | 0 | 6 | 11 | 16 | 24 : | 31 | EA $$\leftarrow$$ (RA) + EXTS(D8) MEM(EA, 2) $\leftarrow$ (RS)<sub>48:63</sub> Let the effective address (EA) be the sum (RA) + D8. (RS)<sub>48:63</sub> are stored in the halfword in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** #### Store Word #### D-form #### Store Word Short Form #### SD4-form e\_stw RS,D(RA) | Ī | 21 | RS | RA | D | П | |---|----|----|----|----|----| | ١ | 0 | 6 | 11 | 16 | 31 | if RA = 0 then b $$\leftarrow$$ 0 else b $\leftarrow$ (RA) EA $\leftarrow$ b + EXTS(D) MEM(EA, 4) $\leftarrow$ (RS)<sub>32:63</sub> Let the effective address (EA) be the sum (RA|0) + D. (RS) $_{32:63}$ are stored in the word in storage addressed by EA. #### **Special Registers Altered:** None #### Store Word with Update D8-form e\_stwu RS,D8(RA) | 06 | RS | RA | 06 | D8 | | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 24 | 31 | EA $$\leftarrow$$ (RA) + EXTS(D8) MEM(EA, 4) $\leftarrow$ (RS)<sub>32:63</sub> RA $\leftarrow$ EA Let the effective address (EA) be the sum (RA) + D8. (RS) $_{32:63}$ are stored in the word in storage addressed by EA. EA is placed into register RA. If RA=0, the instruction form is invalid. #### **Special Registers Altered:** None #### se\_stw RZ,SD4(RX) EA $$\leftarrow$$ (RX) + (<sup>58</sup>0 $\parallel$ SD4 $\parallel$ <sup>2</sup>0) MEM(EA, 4) $\leftarrow$ (RZ)<sub>32:63</sub> Let the effective address (EA) be the sum (RX)+ (SD4 $\parallel$ 00). (RZ)<sub>32:63</sub> are stored in the word in storage addressed by EA. #### **Special Registers Altered:** # 5.3 Fixed-Point Load and Store with Byte Reversal Instructions The fixed-point Load with Byte Reversal and Store with Byte Reversal instructions from Book I, Ihbrx, Iwbrx, sthbrx, and stwbrx are available while executing in VLE mode. The mnemonics, decoding, and semantics for these instructions are identical to those in Book I. See Section 3.3.4 of Book I for the instruction definitions. # 5.4 Fixed-Point Load and Store Multiple Instructions The *Load/Store Multiple* instructions have preferred forms; see Section 1.8.1 of Book I. In the preferred forms storage alignment satisfies the following rule. The combination of the EA and RT (RS) is such that the low-order byte of GPR 31 is loaded (stored) from (into) the last byte of an aligned quadword in storage. #### Load Multiple Word D8-form #### Store Multiple Word D8-form e\_lmw RT,D8(RA) | 06 | RT | RA | 08 | D8 | | |----|----|----|----|----|----| | o | 6 | 11 | 16 | 24 | 31 | | if $RA = 0$ then $b \leftarrow 0$ | |--------------------------------------------------| | else b $\leftarrow$ (RA) | | $EA \leftarrow b + EXTS(D8)$ | | $r \leftarrow RT$ | | do while r ≤31 | | $GPR(r) \leftarrow {}^{32}0 \parallel MEM(EA,4)$ | | $r \leftarrow r + 1$ | | $FA \leftarrow FA + 4$ | Let n = (32-RT). Let the effective address (EA) be the sum (RA|0) + D8. n consecutive words starting at EA are loaded into the low-order 32 bits of GPRs RT through 31. The high-order 32 bits of these GPRs are set to zero. If RA is in the range of registers to be loaded, including the case in which RA = 0, the instruction form is invalid. #### **Special Registers Altered:** None e\_stmw RS,D8(RA) | | 06 | RS | RA | 9 | D8 | | |---|----|----|----|----|----|----| | 0 | | 6 | 11 | 16 | 24 | 31 | ``` if RA = 0 then b \leftarrow 0 else b \leftarrow (RA) EA \leftarrow b + EXTS(D8) r \leftarrow RS do while r \le 31 MEM(EA,4) \leftarrow GPR(r)_{32:63} r \leftarrow r + 1 EA \leftarrow EA + 4 ``` Let n = (32-RS). Let the effective address (EA) be the sum (RA|0) + D8. n consecutive words starting at EA are stored from the low-order 32 bits of GPRs RS through 31. #### **Special Registers Altered:** ## 5.5 Fixed-Point Arithmetic Instructions The fixed-point *Arithmetic* instructions use the contents of the GPRs as source operands, and place results into GPRs, into status bits in the XER and into CR0. The fixed-point *Arithmetic* instructions treat source operands as signed integers unless the instruction is explicitly identified as performing an unsigned operation The **e\_add2i.** instruction and other *Arithmetic* instructions with Rc=1 set the first three bits of CR0 to characterize the result placed into the target register. In 64-bit mode, these bits are set by signed comparison of the result to 0. In 32-bit mode, these bits are set by signed comparison of the low-order 32 bits of the result to **e\_addic**[.] and **e\_subfic**[.] always set CA to reflect the carry out of bit 0 in 64-bit mode and out of bit 32 in 32-bit mode. The fixed-point *Arithmetic* instructions from Book I, add[.], addo[.], addc[.], addc[.], addco[.], adde[.], addeo[.], addme[.], addmeo[.], addmeo[.], addmeo[.], divwo[.], mulhw[.], mulhwo[.], mulhwo[.], mulhwo[.], subfo[.], subfo[.], subfo[.], subfo[.], subfoe[.], subfreo[.], subf The fixed-point *Arithmetic* instructions from Book I, *mulld[.]*, *mulld[.]*, *mulld[.]*, *divd[.]*, *divd[.]*, *divd[.]*, and *divduo[.]* are available while executing in VLE mode on 64-bit implementations. The mnemonics, decoding, and semantics for those instructions are identical to these in Book I; see Section 3.3.8 of Book I for the instruction definitions. #### Add Short Form #### RR-form #### Add Immediate e\_add16i 07 **D-form** SI se\_add RX,RY $$RX \leftarrow (RX) + (RY)$$ The sum (RX) + (RY) is placed into register RX. #### **Special Registers Altered:** None ## RT ← (RA) + EXTS(SI) The sum (RA) + SI is placed into register RT. RA #### **Special Registers Altered:** RT,RA,SI RT None ## Add (2 operand) Immediate and Record I16A-form e\_add2i. RA,si | 28 | si | RA | 17 | si | | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | The sum (RA) + si is placed into register RT. #### **Special Registers Altered:** CR0 ## Add (2 operand) Immediate Shifted I16A-form e\_add2is RA,si | 28 | si | RA | 18 | si | |----|----|----|----|------| | 0 | 6 | 11 | 16 | 21 3 | The sum (RA) + (si $\parallel$ 0x0000) is placed into register RA. #### **Special Registers Altered:** None #### Add Scaled Immediate #### SCI8-form | e_addi | RT,RA,sci8 | (Rc=0) | |---------|------------|--------| | e addi. | RT.RA.sci8 | (Rc=1) | | 06 | RT | RA | 8 | Rc | F | SCL | U | 18 | | |----|----|-----|----|----|----|-----|----|----|---| | 0 | c | | 16 | 20 | | | 24 | - | | | U | О | 1.1 | 10 | 20 | 21 | 22 | 24 | 3 | 1 | sci8 $$\leftarrow$$ <sup>56-SCL×8</sup>F || UI8 || SCL×8F RT $\leftarrow$ (RA) + sci8 The sum (RA) + sci8 is placed into register RT. #### **Special Registers Altered:** CR0 (if Rc=1) #### Add Immediate Short Form OIM5-form se\_addi RX,oimm oimm $$\leftarrow$$ (590 || OIM5) + 1 RX $\leftarrow$ (RX) + oimm The sum (RX) + oimm is placed into RX. The value of oimm must be in the range of 1 to 32. #### **Special Registers Altered:** #### Add Scaled Immediate Carrying #### SCI8-form sci8 $$\leftarrow$$ <sup>56-SCLX8</sup>F || UI8 || SCLX8F RT $\leftarrow$ (RA) + sci8 The sum (RA) + sci8 is placed into register RT. #### **Special Registers Altered:** #### Subtract #### RR-form #### Subtract From Short Form RR-form se\_sub RX,RY $$RX \leftarrow (RX) + \neg (RY) + 1$$ The sum $(RX) + \neg (RY) + 1$ is placed into register RX. #### **Special Registers Altered:** None ## RX,RY $$RX \leftarrow \neg (RX) + (RY) + 1$$ The sum $\neg(RX) + (RY) + 1$ is placed into register RX. #### **Special Registers Altered:** None se\_subf #### Subtract From Scaled Immediate Carrying SCI8-form | e_subfic | RT,RA,sci8 | (Rc=0) | |-----------|------------|--------| | e subfic. | RT.RA.sci8 | (Rc=1) | | | 06 | RT | RA | 11 | Rc | F | SCL | | UI8 | | |---|----|----|----|----|----|----|-----|----|-----|---| | 0 | | 6 | 11 | 16 | 20 | 21 | 22 | 24 | | 3 | sci8 $$\leftarrow$$ <sup>56-SCLX8</sup>F || UI8 || SCLX8F RT $\leftarrow$ $\neg$ (RA) + sci8 + 1 The sum $\neg$ (RA) + sci8 + 1 is placed into register RT. #### **Special Registers Altered:** | CR0 | (if Rc=1) | |-----|-----------| | CA | | #### Subtract Immediate OIM5-form | se_subi | RX,oimm | (Rc=0) | |----------|---------|--------| | se_subi. | RX,oimm | (Rc=1) | | 09 | Rc | OIM5 | RX | | |----|----|------|----|----| | 0 | 6 | 7 | 12 | 15 | $$\begin{array}{l} \text{oimm} \leftarrow (^{59}0 \mid \mid \text{OIM5}) + 1 \\ \text{RX} \leftarrow (\text{RX}) + \neg \text{oimm} + 1 \end{array}$$ The sum $(RA) + \neg oimm + 1$ is placed into register RX. The value of oimm must be in the range 1 to 32. #### **Special Registers Altered:** #### Multiply Low Scaled Immediate SCI8-form e\_mulli RT,RA,sci8 | 06 | RT | RA | 20 | F | SCL | | UI8 | | |----|----|----|----|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 22 | 24 | | 31 | $$\begin{array}{l} \text{sci8} \leftarrow ^{56\text{-SCLX8}}\text{F} \mid\mid \text{UI8} \mid\mid ^{\text{SCLX8}}\text{F} \\ \text{prod}_{0:127} \leftarrow \text{(RA)} \times \text{sci8} \\ \text{RT} \leftarrow \text{prod}_{64:127} \end{array}$$ The 64-bit first operand is (RA). The 64-bit second operand is the sci8 operand. The low-order 64-bits of the 128-bit product of the operands are placed into register RT. Both operands and the product are interpreted as signed integers. #### **Special Registers Altered:** None #### Multiply Low Word Short Form RR-form se\_mullw RX,RY $$RX \leftarrow (RX)_{32:63} \times (RY)_{32:63}$$ The 32-bit operands are the low-order 32-bits of RX and of RY. The 64-bit product of the operands is placed into register RX. Both operands and the product are interpreted as signed integers. #### **Special Registers Altered:** None ## Multiply (2 operand) Low Immediate I16A-form e\_mull2i RA,si $$\begin{array}{l} \operatorname{prod}_{0:127} \leftarrow (\operatorname{RA}) \ \times \ \operatorname{EXTS}(\operatorname{si}) \\ \operatorname{RA} \ \leftarrow \ \operatorname{prod}_{64:127} \end{array}$$ The 64-bit first operand is (RA). The 64-bit second operand is the sign-extended value of the si operand. The low-order 64-bits of the 128-bit product of the operands are placed into register RA. Both operands and the product are interpreted as signed integers. #### **Special Registers Altered:** None #### **Negate Short Form** R-form se\_neg RX $$RX \leftarrow \neg (RX) + 1$$ The sum $\neg(RX) + 1$ is placed into register RX If the processor is in 64-bit mode and register RX contains the most negative 64-bit number (0x8000\_0000\_0000\_0000), the result is the most negative 64-bit number. Similarly, if the processor is in 32-bit mode and register RX contains the most negative 32-bit number (0x8000\_0000), the result is the most negative 32-bit number. #### **Special Registers Altered:** ### 5.6 Fixed-Point Compare and Bit Test Instructions The fixed-point *Compare* instructions compare the contents of register RA or register RX with one of the following: - The value of the scaled immediate field *sci8* formed from the F, UI8, and SCL fields as: sci8 ← <sup>56-SCLX8</sup>F | UI8 | | SCLX8</sup>F - The zero-extended value of the UI field - The zero-extended value of the UI5 field - The sign-extended value of the SI field - The contents of register RB or register RY. The following comparisons are signed: **e\_cmph**, **e\_cmpi**, **e\_cmp16i**, **e\_cmph16i**, **se\_cmp**, **se\_cmph**, and **se\_cmpi**. The following comparisons are unsigned: **e\_cmpll**, **e\_cmpl16i**, **e\_cmpl16i**, **se\_cmpl**, and **se\_cmpl**. The fixed-point *Bit Test* instruction tests the bit specified by the UI5 instruction field and sets the CR0 field as follows. #### **Bit Name Description** - 0 LT Always set to 0 - 1 GT $RX_{ui5} = 1$ - 2 EQ $RX_{ui5} = 0$ - 3 SO Summary overflow from the XER The fixed-point *Compare* instructions from Book I, *cmp* and *cmpI* are available while executing in VLE mode. The mnemonics, decoding, and semantics for these instructions are identical to those in Book I; see Section 3.3.9 of Book I for the instruction definitions. #### Bit Test Immediate #### IM5-form #### Compare Immediate Word #### I16A-form se\_btsti RX,UI5 ``` a \leftarrow UI5 b \leftarrow a+320 || 1 || 31-a0 c \leftarrow (RX) & b if c = 0 then d \leftarrow 0b001 else d \leftarrow 0b010 CR0 \leftarrow d || XER<sub>SO</sub> ``` Bit UI5+32 of register RX is tested for equality to '0' and the result is recorded in CR0. EQ is set if the tested bit is 0, LT is cleared, and GT is set to the inverse value of EQ. #### Special Registers Altered: CR0 e\_cmp16i RA,si | 28 | si | RA | 19 | si | | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` b ← EXTS(si) if (RA)_{32:63} < b_{32:63} then c ← 0b100 if (RA)_{32:63} > b_{32:63} then c ← 0b010 if (RA)_{32:63} = b_{32:63} then c ← 0b001 CR0 ← c || XER_{SO} ``` The low-order 32 bits of register RA are compared with si, treating operands as signed integers. The result of the comparison is placed into CR0. #### **Special Registers Altered:** CR<sub>0</sub> ## Compare Scaled Immediate Word SCI8-form e\_cmpi BF32,RA,sci8 | | 06 | 000 | BF32 | RA | 21 | F | SCL | UI8 | | |---|----|-----|------|----|----|----|-----|-----|----| | 0 | | 6 | 9 | 11 | 16 | 21 | 22 | 24 | 31 | The low-order 32 bits of register RA are compared with sci8, treating operands as signed integers. The result of the comparison is placed into CR field BF32. #### **Special Registers Altered:** CR field BF32 ## Compare Immediate Word Short Form IM5-form se\_cmpi RX,UI5 ``` \begin{array}{l} b \leftarrow ^{59}0 \text{ || UI5} \\ \text{if (RX)}_{32:63} < b_{32:63} \text{ then c} \leftarrow \text{ Ob100} \\ \text{if (RX)}_{32:63} > b_{32:63} \text{ then c} \leftarrow \text{ Ob010} \\ \text{if (RX)}_{32:63} = b_{32:63} \text{ then c} \leftarrow \text{ Ob001} \\ \text{CRO} \leftarrow \text{c} \text{ || XER}_{SO} \end{array} ``` The low-order 32 bits of register RX are compared with UI5, treating operands as signed integers. The result of the comparison is placed into CR0. #### **Special Registers Altered:** CR0 #### Compare Word RR-form se\_cmp RX,RY ``` if (RX) _{32:63} < (RY) _{32:63} then c \leftarrow 0b100 if (RX) _{32:63} > (RY) _{32:63} then c \leftarrow 0b010 if (RX) _{32:63} = (RY) _{32:63} then c \leftarrow 0b001 CRO \leftarrow c \parallel XER<sub>SO</sub> ``` The low-order 32 bits of register RX are compared with the low-order 32 bits of register RY, treating operands as signed integers. The result of the comparison is placed into CR0. #### **Special Registers Altered:** CR0 ## Compare Logical Immediate Word I16A-form e\_cmpl16i RA,ui ``` \begin{array}{l} b \leftarrow ^{48}0 \parallel \text{ui} \\ \text{if (RA)}_{32:63} <^{u} \text{b}_{32:63} \text{ then c} \leftarrow \text{0b100} \\ \text{if (RA)}_{32:63} >^{u} \text{b}_{32:63} \text{ then c} \leftarrow \text{0b010} \\ \text{if (RA)}_{32:63} = \text{b}_{32:63} \text{ then c} \leftarrow \text{0b001} \\ \text{CRO} \leftarrow \text{c} \parallel \text{XER}_{SO} \end{array} ``` The low-order 32 bits of register RA are compared with ui, treating operands as unsigned integers. The result of the comparison is placed into CR0. #### Special Registers Altered: CR0 ## Compare Logical Scaled Immediate Word SCI8-form e\_cmpli BF32,RA,sci8 | | 06 | 01 | BF32 | RA | 21 | F | SCL | UI8 | | |---|----|----|------|----|----|----|-----|-----|----| | 0 | | 6 | 9 | 11 | 16 | 21 | 22 | 24 | 31 | The low-order 32 bits of register RA are compared with sci8, treating operands as unsigned integers. The result of the comparison is placed into CR field BF32. #### **Special Registers Altered:** CR field BF32 ## Compare Logical Immediate Word OIM5-form se\_cmpli RX,oimm The low-order 32 bits of register RX are compared with oimm, treating operands as unsigned integers. The result of the comparison is placed into CR0. The value of oimm must be in the range of 1 to 32. #### **Special Registers Altered:** CR0 #### Compare Logical Word RR-form se\_cmpl RX,RY ``` if (RX) _{32:63} < ^{u} (RY) _{32:63} then c \leftarrow 0b100 if (RX) _{32:63} > ^{u} (RY) _{32:63} then c \leftarrow 0b010 if (RX) _{32:63} = (RY) _{32:63} then c \leftarrow 0b001 CR0 \leftarrow c \parallel XER<sub>SO</sub> ``` The low-order 32 bits of register RX are compared with the low-order 32 bits of register RY, treating operands as unsigned integers. The result of the comparison is placed into CR0. #### **Special Registers Altered:** CR0 #### Compare Halfword X-form e\_cmph BF,RA,RB ``` a ← EXTS((RA)<sub>48:63</sub>) b ← EXTS((RB)<sub>48:63</sub>) if a < b then c ← 0b100 if a > b then c ← 0b010 if a = b then c ← 0b001 CR_{4\times BF+32:4\times BF+35} ← c || XER<sub>SO</sub> ``` The low-order 16 bits of register RA are compared with the low-order 16 bits of register RB, treating operands as signed integers. The result of the comparison is placed into CR field BF. #### **Special Registers Altered:** CR field BF #### Compare Halfword Short Form RR-form se\_cmph RX,RY a $\leftarrow$ EXTS((RX)<sub>48:63</sub>) b $\leftarrow$ EXTS((RY)<sub>48:63</sub>) if a < b then c $\leftarrow$ 0b100 if a > b then c $\leftarrow$ 0b010 if a = b then c $\leftarrow$ 0b001 CR0 $\leftarrow$ c $\parallel$ XER<sub>SO</sub> The low-order 16 bits of register RX are compared with the low-order 16 bits of register RY, treating operands as signed integers. The result of the comparison is placed into CR0. #### **Special Registers Altered:** CR0 #### Compare Halfword Logical e\_cmphl BF,RA,RB ``` \begin{array}{l} \textbf{a} \leftarrow \texttt{EXTZ}((\texttt{RA})_{48:63}) \\ \textbf{b} \leftarrow \texttt{EXTZ}((\texttt{RB})_{48:63}) \\ \textbf{if a} <^{\textbf{u}} \textbf{b} \textbf{ then c} \leftarrow \texttt{0b100} \\ \textbf{if a} >^{\textbf{u}} \textbf{b} \textbf{ then c} \leftarrow \texttt{0b010} \\ \textbf{if a} = \textbf{b} \textbf{ then c} \leftarrow \texttt{0b001} \\ \texttt{CR}_{4\times BF+32:4\times BF+35} \leftarrow \textbf{c} \parallel \texttt{XER}_{SO} \end{array} ``` The low-order 16 bits of register RA are compared with the low-order 16 bits of register RB, treating operands as unsigned integers. The result of the comparison is placed into CR field BF. #### **Special Registers Altered:** CR field BF #### Compare Halfword Immediate 116A-form e\_cmph16i RA,si ``` \begin{array}{l} \texttt{a} \leftarrow \texttt{EXTS}\left(\left(\text{RA}\right)_{48:63}\right) \\ \texttt{b} \leftarrow \texttt{EXTS}\left(\text{si}\right) \\ \texttt{if a < b then c} \leftarrow \texttt{0b100} \\ \texttt{if a > b then c} \leftarrow \texttt{0b010} \\ \texttt{if a = b then c} \leftarrow \texttt{0b001} \\ \texttt{CR0} \leftarrow \texttt{c} \parallel \texttt{XER}_{SO} \end{array} ``` The low-order 16 bits of register RA are compared with si, treating operands as signed integers. The result of the comparison is placed into CR0. #### **Special Registers Altered:** CR0 X-form ## Compare Halfword Logical Short Form RR-form se\_cmphl RX,RY ``` \begin{array}{l} a \leftarrow (RX)_{48:63} \\ b \leftarrow (RY)_{48:63} \\ \text{if a} <^u \text{ b then c} \leftarrow \text{0b100} \\ \text{if a} >^u \text{ b then c} \leftarrow \text{0b010} \\ \text{if a} = \text{b then c} \leftarrow \text{0b001} \\ \text{CRO} \leftarrow \text{c} \parallel \text{XER}_{SO} \end{array} ``` The low-order 16 bits of register RX are compared with the low-order 16 bits of register RY, treating operands as unsigned integers. The result of the comparison is placed into CR0. #### **Special Registers Altered:** CR0 #### Compare Halfword Logical Immediate I16A-form e\_cmphl16i RA,ui ``` \begin{array}{l} a \leftarrow ^{48}0 \parallel (RA)_{^{48:63}} \\ b \leftarrow ^{48}0 \parallel ui \\ \text{if } a <^u \text{ b then } c \leftarrow \text{Ob100} \\ \text{if } a >^u \text{ b then } c \leftarrow \text{Ob010} \\ \text{if } a = \text{b then } c \leftarrow \text{Ob001} \\ \text{CRO} \leftarrow c \parallel \text{XER}_{SO} \end{array} ``` The low-order 16 bits of register RA are compared with the ui field, treating operands as signed integers. The result of the comparison is placed into CR0. ## Special Registers Altered: CR0 # **5.7 Fixed-Point Trap Instructions** The fixed-point *Trap* instruction from Book I, *tw* is available while executing in VLE mode. The mnemonics, decoding, and semantics for this instruction is identical to that in Book I; see Section 3.3.10 of Book I for the instruction definition. The fixed-point *Trap* instruction from Book I, *td* is available while executing in VLE mode on 64-bit implementations. The mnemonic, decoding, and semantics for the *td* instruction are identical to those in Book I; see Section 3.3.10 of Book I for the instruction definitions. # 5.8 Fixed-Point Select Instruction The fixed-point *Select* instruction provides a means to select one of two registers and place the result in a destination register under the control of a predicate value supplied by a CR bit. The fixed-point *Select* instruction from Book I, *isel* is available while executing in VLE mode. The mnemonics, decoding, and semantics for this instruction is identical to that in Book I; see Section of Book I for the instruction definition. ### 5.9 Fixed-Point Logical, Bit, and Move Instructions The *Logical* instructions perform bit-parallel operations on 64-bit operands. The *Bit* instructions manipulate a bit, or create a bit mask, in a register. The *Move* instructions move a register or an immediate value into a register. The X-form Logical instructions with Rc=1, the SCI8-form Logical instructions with Rc=1, the RR-form Logical instructions with Rc=1, the **e\_and2i**. instruction, and the **e\_and2is**. instruction set the first three bits of CR field 0 as the arithmetic instructions described in Section 5.5, "Fixed-Point Arithmetic Instructions". (Also see Section 4.1.1.) The Logical instructions do not change the SO, OV, and CA bits in the XER. The fixed-point *Logical* instructions from Book I, *and[.]*, *or[.]*, *xor[.]*, *nand[.]*, *nor[.]*, *eqv[.]*, *andc[.]*, *orc[.]*, *extsb[.]*, *extsh[.]*, *cntlzw[.]*, and *popcntb* are available while executing in VLE mode. The mnemonics, decoding, and semantics for these instructions are identical to those in Book I; see Section 3.3.12 of Book I for the instruction definitions. The fixed-point *Logical* instructions from Book I, **extsw[.]** and **cntlzd[.]** are available while executing in VLE mode on 64-bit implementations. The mnemonics, decoding, and semantics for these instructions are identical to those in Book I; see Section 3.3.12 of Book I for the instruction definitions. #### AND (two operand) Immediate 116L-form e\_and2i. RT,ui RT $$\leftarrow$$ (RT) & (480 || ui) The contents of register RT are ANDed with <sup>48</sup>0 || ui and the result is placed into register RT. #### **Special Registers Altered:** CR0 ## AND Scaled Immediate Carrying SCI8-form | e_andi | RA,RS,sci8 | (Rc=0) | |---------|------------|--------| | e_andi. | RA,RS,sci8 | (Rc=1) | | 06 | RS | RA | 12 | R۲ | F | SCL | | UI8 | | |----|--------|----|----|----|---|-----|----|-----|----| | 0 | 6<br>6 | 11 | | 20 | | | 24 | 010 | 31 | sci8 $$\leftarrow$$ <sup>56-SCL×8</sup>F || UI8 || SCL×8F RA $\leftarrow$ (RS) & sci8 The contents of register RS are ANDed with sci8 and the result is placed into register RA. #### **Special Registers Altered:** CR0 (if Rc=1) ## AND (2 operand) Immediate Shifted I16L-form e\_and2is. RT,ui $$RT \leftarrow (RT) \& (^{32}0 \parallel ui \parallel^{16}0)$$ The contents of register RT are ANDed with $^{32}0 \parallel$ ui $\parallel$ $^{16}0$ and the result is placed into register RT. #### **Special Registers Altered:** CR0 #### AND Immediate Short Form IM5-form se\_andi RX,UI5 $$RX \leftarrow (RX) \& ^{59}0 \parallel UI5$$ The contents of register RX are ANDed with <sup>59</sup>0 || UI5 and the result is placed into register RX. #### **Special Registers Altered:** #### e\_or2i RT,ui $$RT \leftarrow (RT) | (^{48}0 | | ui)$$ The contents of register RT are ORed with <sup>48</sup>0 || ui and the result is placed into register RT. #### **Special Registers Altered:** None #### OR (2 operand) Immediate Shifted I16L-form e\_or2is RT,ui | Γ | 28 | RT | ui | 26 | ui | | |---|----|----|----|----|----|----| | 0 | ) | 6 | 11 | 16 | 21 | 31 | $$RT \leftarrow (RT) \mid (^{32}0 \parallel ui \parallel^{16}0)$$ The contents of register RT are ORed with $^{32}0 \parallel$ ui $\parallel$ $^{16}0$ and the result is placed into register RT. #### **Special Registers Altered:** None #### OR Scaled Immediate SCI8-form | e_ori | RA,RS,sci8 | (Rc=0) | |--------|------------|--------| | e_ori. | RA,RS,sci8 | (Rc=1) | | 06 | RS | RA | 13 | Rc | F | SCL | | UI8 | | |----|----|----|----|----|----|-----|----|-----|---| | 0 | 6 | 11 | 16 | 20 | 21 | 22 | 24 | | 3 | $$sci8 \leftarrow {}^{56-SCLX8}F \mid\mid UI8 \mid\mid {}^{SCLX8}F$$ $RA \leftarrow (RS) \mid sci8$ The contents of register RS are ORed with sci8 and the result is placed into register RA. #### **Special Registers Altered:** CR0 (if Rc=1) #### XOR Scaled Immediate SCI8-form | e_xori | RA,RS,sci8 | (Rc=0) | |---------|------------|--------| | e_xori. | RA,RS,sci8 | (Rc=1) | | 06 | RS | RA | 14 | Rc | F | SCL | | UI8 | | |----|----|----|----|----|----|-----|----|-----|----| | 0 | 6 | 11 | 16 | 20 | 21 | 22 | 24 | | 31 | $$sci8 \leftarrow {}^{56-SCL\times 8}F \mid\mid UI8\mid\mid {}^{SCL\times 8}F$$ $RA \leftarrow (RS) \oplus sci8$ The contents of register RS are XORed with sci8 and the result is placed into register RA. #### **Special Registers Altered:** CR0 (if Rc=1) #### AND Short Form RR-form | se_and | RX,RY | (Rc=0) | |---------|-------|--------| | se_and. | RX,RY | (Rc=1) | | | 17 | 1 | Rc | RY | RX | | |---|----|---|----|----|----|----| | 0 | | 6 | 7 | 8 | 12 | 15 | The contents of register RX are ANDed with the contents of register RY and the result is placed into register RY #### **Special Registers Altered:** CR0 (if Rc=1) ## AND with Complement Short Form RR-form se\_andc RX,RY $$RX \leftarrow (RX) \& \neg (RY)$$ The contents of register RX are ANDed with the complement of the contents of register RY and the result is placed into register RX. #### **Special Registers Altered:** #### **OR Short Form** #### RR-form #### **NOT Short Form** #### R-form se\_or RX,RY | I | 17 | 0 | RY | RX | | |---|----|---|----|----|----| | | 0 | 6 | 8 | 12 | 15 | $$RX \leftarrow (RX) \mid (RY)$$ The contents of register RX are ORed with the contents of register RY and the result is placed into register RX. #### **Special Registers Altered:** None | se_not | RX | |--------|----| |--------|----| $$RX \leftarrow \neg (RX)$$ The contents of RX are complemented and placed into register RX. #### **Special Registers Altered:** None #### Bit Clear Immediate IM5-form #### Bit Generate Immediate IM5-form se\_bclri RX,UI5 | 24 | 0 | UI5 | R | Χ | |----|-----|-----|----|----| | 0 | 6 7 | | 12 | 15 | a $$\leftarrow$$ UI5 RX $\leftarrow$ (RX) & ( $^{a+32}$ 1 || 0 || $^{31-a}$ 1) Bit UI5+32 of register RX is set to 0. #### **Special Registers Altered:** None a $$\leftarrow$$ UI5 RX $\leftarrow$ (a+320 || 1 || 31-a0) Bit UI5+32 of register RX is set to 1. All other bits in register RX are set to 0. #### **Special Registers Altered:** None #### Bit Mask Generate Immediate IM5-form | se_bmaski RX | ,UI5 | |--------------|------| |--------------|------| | 11 | 0 | | UI5 | RX | | |----|---|---|-----|----|----| | 0 | 6 | 7 | | 12 | 15 | If UI5 is not zero, the low-order UI5 bits are set to 1 in register RX and all other bits in register RX are set to 0. If UI5 is 0, all bits in register RX are set to 1. #### **Special Registers Altered:** None #### Bit Set Immediate IM5-form RX,UI5 se\_bseti | | 25 | 0 | UI5 | R | Χ | |---|----|-----|-----|----|----| | 0 | | 6 7 | | 12 | 15 | a $$\leftarrow$$ UI5 RX $\leftarrow$ (RX) | ( $^{a+32}$ 0 || 1 || $^{31-a}$ 0) Bit UI5+32 of register RX is set to 1. #### **Special Registers Altered:** #### Extend Sign Byte Short Form se\_extsb RX | 0 | 13 | RX | | |---|----|----|----| | 0 | 6 | 12 | 15 | $$\begin{array}{l} \mathtt{s} \leftarrow \left(\mathtt{RX}\right)_{56} \\ \mathtt{RX} \leftarrow \left.^{56}\mathtt{s} \right\| \left(\mathtt{RX}\right)_{56:63} \end{array}$$ (RX)<sub>56:63</sub> are placed into RX<sub>56:63</sub>. Bit 56 of register RX is placed into RX<sub>0:55</sub>. #### **Special Registers Altered:** None #### Extend Sign Halfword Short Form R-form se\_extsh RX | ı | 0 | 15 | RX | _ | |---|---|----|----|----| | | 0 | 6 | 12 | 15 | $$\begin{array}{c} \mathtt{s} \leftarrow (\mathtt{RX})_{48} \\ \mathtt{RX} \leftarrow {}^{48}\mathtt{s} \parallel (\mathtt{RX})_{48:63} \end{array}$$ (RX)<sub>48:63</sub> are placed into RX<sub>48:63</sub>. Bit 48 of register RX is placed into RX<sub>0:47</sub>. #### **Special Registers Altered:** None #### Extend Zero Byte R-form R-form R-form se\_extzb RX | 0 | 12 | RX | | |---|----|-------|---| | 0 | 6 | 12 15 | ś | $$RX \leftarrow ^{56}0 \parallel (RX)_{56:63}$$ $(RX)_{56:63}$ are placed into $RX_{56:63}.\ RX_{0:55}$ are set to 0. #### **Special Registers Altered:** None | Extend 2 | Zero Halfword | | |----------|---------------|--| | se_extzh | RX | | $$RX \leftarrow {}^{48}0 \parallel (RX)_{48:63}$$ $(RX)_{48:63}$ are placed into $RX_{48:63}$ . $RX_{0:47}$ are set to 0. #### **Special Registers Altered:** None #### Load Immediate LI20-form #### Load Immediate Short Form IM7-form e\_li RT,LI20 | 28 | RT | li20 <sub>4:8</sub> | 0 | li20 <sub>0:3</sub> | li20 <sub>9:19</sub> | | |----|----|---------------------|----|---------------------|----------------------|----| | 0 | 6 | 11 | 16 | 17 | 21 | 31 | RT $$\leftarrow \text{EXTS}(1i20_{5:8} \parallel 1i20_{0:4} \parallel 1i20_{9:19})$$ The sign-extended LI20 field is placed into RT. #### **Special Registers Altered:** None | se_li | RX,UI7 | | |-------|--------|--| $$RX \leftarrow 570 \parallel UI7$$ The zero-extended UI7 field is placed into RX. #### **Special Registers Altered:** None #### Load Immediate Shifted I16L-form | 28 | RT | ui | 28 | ui | | |----|----|----|----|------|---| | 0 | 6 | 11 | 16 | 21 3 | 1 | RT $$\leftarrow$$ 320 || ui || 160 The zero-extended value of ui shifted left 16 bits is placed into RT. #### **Special Registers Altered:** #### Move from Alternate Register RR-form se\_mfar RX,ARY | | 0 | 3 | ARY | RX | 1 | |---|---|---|-----|-------|---| | 0 | | 6 | 8 | 12 15 | 5 | $r \leftarrow ARY+8$ RX $\leftarrow GPR(r)$ The contents of register ARY+8 are placed into RX. ARY specifies a register in the range R8:R23. #### **Special Registers Altered:** None #### Move to Alternate Register RR-form se\_mtar ARX,RY $r \leftarrow ARX+8$ $GPR(r) \leftarrow (RY)$ The contents of register RY are placed into register ARX+8. ARX specifies a register in the range R8:R23. #### **Special Registers Altered:** None #### Move Register RR-form se\_mr RX,RY $RX \leftarrow (RY)$ The contents of register RY are placed into RX. #### **Special Registers Altered:** ## 5.10 Fixed-Point Rotate and Shift Instructions The fixed-point *Shift* instructions from Book I, *slw[.]*, *srw[.]*, *srw[.]*, *srawi[.]*, and *sraw[.]* are available while executing in VLE mode. The mnemonics, decoding, and semantics for those instructions are identical to those in Book I; see Section 3.3.13.2 of Book I for the instruction definitions. The fixed-point *Shift* instructions from Book I, *sId[.]*, *srd[.]*, *srd[.]*, and *srad[.]* are available while executing in VLE mode on 64-bit implementations. The mnemonics, decoding, and semantics for those instructions are identical to those in Book I; see Section 3.3.13.2 of Book I for the instruction definitions. #### Rotate Left Word #### Rotate Left Word Immediate X-form | e_rlw<br>e_rlw. | , | S,RB<br>S,RB | | (Rc=0)<br>(Rc=1) | |-----------------|---|--------------|------|------------------| | - 24 | | | <br> | | | 31 | RS | RA | RB | 280 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$n \leftarrow (RB)_{59:63}$$ $RA \leftarrow ROTL_{32}((RS)_{32:63}, n)$ The contents of register RS are rotated<sub>32</sub> left the number of bits specified by $(RB)_{59:63}$ and the result is placed into register RA. #### **Special Registers Altered:** | e_rlwi RA,RS,SH<br>e_rlwi. RA,RS,SH | | | | (Rc=0)<br>(Rc=1) | | | |-------------------------------------|----|----|----|------------------|-----|----| | | 31 | RS | RA | SH | 312 | Rc | | 0 | | 6 | 11 | 16 | 21 | 31 | $$\begin{array}{l} \text{n} \leftarrow \text{SH} \\ \text{RA} \leftarrow \text{ROTL}_{32}(\text{(RS)}_{32:63}, \text{n}) \end{array}$$ The contents of register RS are rotated<sub>32</sub> left SH bits and the result is placed into register RA. #### **Special Registers Altered:** #### Rotate Left Word Immediate then Mask Insert M-form e\_rlwimi RA,RS,SH,MB,ME | 29 | RS | RA | SH | MB | ME | 0 | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | ``` \begin{array}{l} \text{n} \leftarrow \text{SH} \\ \text{r} \leftarrow \text{ROTL}_{32}\left(\left(\text{RS}\right)_{32:63}, \ \text{n}\right) \\ \text{m} \leftarrow \text{MASK}\left(\text{MB+32}, \ \text{ME+32}\right) \\ \text{RA} \leftarrow \text{r\&m} \quad | \quad (\text{RA}) \& \neg \text{m} \end{array} ``` The contents of register RS are rotated<sub>32</sub> left SH bits. A mask is generated having 1-bits from bit MB+32 through bit ME+32 and 0-bits elsewhere. The rotated data is inserted into register RA under control of the generated mask. #### **Special Registers Altered:** None # Rotate Left Word Immediate then AND with Mask M-form e\_rlwinm RA,RS,SH,MB,ME | 29 | RS | RA | SH | MB | ME | 1 | |----|----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 26 | 31 | ``` \begin{array}{l} \texttt{n} \leftarrow \texttt{SH} \\ \texttt{r} \leftarrow \texttt{ROTL}_{32} (\,(\texttt{RS})_{\,32:63}, \,\, \texttt{n}) \\ \texttt{m} \leftarrow \texttt{MASK} (\texttt{MB+32}, \,\, \texttt{ME+32}) \\ \texttt{RA} \leftarrow \texttt{r} \,\, \texttt{\&} \,\, \texttt{m} \end{array} ``` The contents of register RS are rotated $_{32}$ left SH bits. A mask is generated having 1-bits from bit MB+32 through bit ME+32 and 0-bits elsewhere. The rotated data is ANDed with the generated mask and the result is placed into register RA. #### **Special Registers Altered:** #### Shift Left Word Immediate #### X-form # Shift Left Word Immediate Short Form IM5-form | 31 | RS | RA | SH | 56 | Rc | |----|----|----|----|----|----| | 0 | 6 | 11 | 16 | 21 | 31 | $$n \leftarrow SH$$ $r \leftarrow ROTL_{32}((RS)_{32:63}, n)$ $m \leftarrow MASK(32, 63-n)$ $RA \leftarrow r \& m$ The contents of the low-order 32 bits of register RS are shifted left SH bits. Bits shifted out of position 32 are lost. Zeros are supplied to the vacated positions on the right. The 32-bit result is placed into $RA_{32:63}$ . $RA_{0:31}$ are set to 0. #### **Special Registers Altered:** CR0 (if Rc=1) #### Shift Left Word #### RR-form ``` \begin{array}{l} n \leftarrow (RY)_{58:63} \\ r \leftarrow ROTL_{32}((RX)_{32:63}, \ n) \\ \text{if } (RY)_{58} = 0 \text{ then } m \leftarrow MASK(32, 63-n) \\ \text{else} \\ RX \leftarrow r \& m \end{array} ``` The contents of the low-order 32 bits of register RX are shifted left the number of bits specified by $(RY)_{58:63}$ . Bits shifted out of position 32 are lost. Zeros are supplied to the vacated positions on the right. The 32-bit result is placed into $RX_{32:63}$ . $RX_{0:31}$ are set to 0. Shift amounts from 32-63 give a zero result. #### **Special Registers Altered:** None RX,UI5 se\_slwi ``` \begin{array}{l} n \;\leftarrow\; \text{UI5} \\ r \;\leftarrow\; \text{ROTL}_{32}\left(\left(\text{RX}\right)_{32:63},\;\; n\right) \\ \mathfrak{m} \;\leftarrow\; \text{MASK}(32,\; 63\text{-}n) \\ \text{RX} \;\leftarrow\; r \;\&\; \mathfrak{m} \end{array} ``` The contents of the low-order 32 bits of register RX are shifted left UI5 bits. Bits shifted out of position 32 are lost. Zeros are supplied to the vacated positions on the right. The 32-bit result is placed into $RX_{32:63}$ . $RX_{0:31}$ are set to 0. #### **Special Registers Altered:** None #### Shift Right Algebraic Word Immediate IM5-form se\_srawi RX,UI5 ``` n \leftarrow \text{UI5} r \leftarrow \text{ROTL}_{32}((\text{RX})_{32:63}, 64-n) m \leftarrow \text{MASK}(n+32, 63) s \leftarrow (\text{RX})_{32} \text{RX} \leftarrow r\&m \mid (^{64}s)\&\lnot m \text{CA} \leftarrow s \& ((r\&\lnot m)_{32:63} \neq 0) ``` The contents of the low-order 32 bits of register RX are shifted right UI5 bits. Bits shifted out of position 63 are lost, and bit 32 of RX is replicated to fill the vacated positions on the left. Bit 32 of RX is replicated to fill RX $_{0:31}$ and the 32-bit result is placed into RX $_{32:63}$ . CA is set to 1 if the low-order 32 bits of register RX contain a negative value and any 1-bits are shifted out of bit position 63; otherwise CA is set to 0. A shift amount of zero causes RX to receive EXTS((RX) $_{32:63}$ ), and CA to be set to 0. #### **Special Registers Altered:** CA #### Shift Right Algebraic Word RR-form # Shift Right Word Immediate X-form se\_sraw RX,RY | | 16 | 1 | RY | R | X | |---|----|---|----|----|----| | 0 | | 6 | 8 | 12 | 15 | ``` \begin{array}{lll} n & \leftarrow & (RY)_{59:63} \\ r & \leftarrow & ROTL_{32} ((RX)_{32:63}, 64-n) \\ \text{if } & (RY)_{58} = 0 \text{ then } m \leftarrow MASK(n+32, 63) \\ \text{else} & m \leftarrow ^{64}0 \\ \text{s} & \leftarrow & (RX)_{32} \\ RX & \leftarrow & r\&m \mid (^{64}\text{s})\& \neg m \\ CA & \leftarrow & s\& ((r\& \neg m)_{32:63} \neq 0) \end{array} ``` The contents of the low-order 32 bits of register RX are shifted right the number of bits specified by (RY) $_{58:63}$ . Bits shifted out of position 63 are lost, and bit 32 of RX is replicated to fill the vacated positions on the left. Bit 32 of RX is replicated to fill RX $_{0:31}$ and the 32-bit result is placed into RX $_{32:63}$ . CA is set to 1 if the low-order 32 bits of register RX contain a negative value and any 1-bits are shifted out of bit position 63; otherwise CA is set to 0. A shift amount of zero causes RX to receive EXTS((RX) $_{32:63}$ ), and CA to be set to 0. Shift amounts from 32-63 give a result of 64 sign bits, and cause CA to receive the sign bit of (RX) $_{32:63}$ . #### **Special Registers Altered:** CA | e_srwi | RA,RS,SH | (Rc=0) | |---------|----------|--------| | e_srwi. | RA,RS,SH | (Rc=1) | | 31 | RS | RA | SH | 568 | Rc | |----|----|----|----|-----|----| | 0 | 6 | 11 | 16 | 21 | 31 | ``` \begin{array}{l} n \; \leftarrow \; \text{SH} \\ r \; \leftarrow \; \text{ROTL}_{32} \left( \left( \text{RS} \right)_{32:63}, \; 64\text{-}n \right) \\ \mathfrak{m} \; \leftarrow \; \text{MASK} \left( n\text{+}32 \,, \; 63 \right) \\ \text{RA} \; \leftarrow \; r \; \& \; \mathfrak{m} \end{array} ``` The contents of the low-order 32 bits of register RS are shifted right SH bits. Bits shifted out of position 63 are lost. Zeros are supplied to the vacated positions on the left. The 32-bit result is placed into $RA_{32:63}$ . $RA_{0:31}$ are set to 0. #### **Special Registers Altered:** CR0 (if Rc=1) # Shift Right Word Immediate Short Form IM5-form se\_srwi RX,UI5 ``` \begin{array}{l} n \; \leftarrow \; \text{UI5} \\ r \; \leftarrow \; \text{ROTL}_{32} \left( \; (\text{RX})_{\; 32:63}, \;\; 64\text{-}n \right) \\ \mathfrak{m} \; \leftarrow \; \text{MASK} \left( n\text{+}32, \;\; 63 \right) \\ \text{RX} \; \leftarrow \; r \; \& \; \mathfrak{m} \end{array} ``` The contents of the low-order 32 bits of register RX are shifted right UI5 bits. Bits shifted out of position 63 are lost. Zeros are supplied to the vacated positions on the left. The 32-bit result is placed into $RX_{32:63}$ . $RX_{0:31}$ are set to 0. #### **Special Registers Altered:** None ### Shift Right Word se\_srw RR-form 16 | 0 | RY | RX RX,RY The contents of the low-order 32 bits of register RX are shifted right the number of bits specified by (RY)<sub>58:63</sub>. Bits shifted out of position 63 are lost. Zeros are supplied to the vacated positions on the left. The 32-bit result is placed into RX<sub>32:63</sub>. RX<sub>0:31</sub> are set to 0. Shift amounts from 32 to 63 give a zero result. #### **Special Registers Altered:** # 5.11 Move To/From System Register Instructions The VLE category provides 16-bit forms of instructions to move to/from the LR and CTR. The fixed-point *Move To/From System Register* instructions from Book I, *mfspr*, *mtcrf*, *mfcr*, *mtocrf*, *mfocrf*, *mcrxr*, *mtdcrux*, *mfdcrux*, *mfapidi*, and *mtspr* are available while executing in VLE mode. The mnemonics, decoding, and semantics for these instructions are identical to those in Book I; see Section 3.3.14 of Book I for the instruction definitions. The fixed-point *Move To/From System Register* instructions from Book III-E, *mfspr*, *mtspr*, *mfdcr*, *mtdcr*, *mtmsr*, *mfmsr*, *wrtee*, and *wrteei* are available while executing in VLE mode. The mnemonics, decoding, and semantics for these instructions are identical to those in Book III-E; see Section 3.4.1 of Book III-E for the instruction definitions. # Move From Count RegisterR-formMove From Link RegisterR-formse\_mfctrRXse\_mflrRX010RX08RX061215 RX $\leftarrow$ CTR The CTR contents are placed into register RX. Special Registers Altered: Special Registers Altered: Special Registers Altered: | Move To Count Register | R-form | Move To Link Register | R-form | |------------------------|--------|-----------------------|--------| | se_mtctr RX | | se_mtlr RX | | | 0 11 RX<br>0 6 12 15 | | 0 9 RX<br>0 6 12 15 | | | CTR ← (RX) | | $LR \leftarrow (RX)$ | | The contents of register RX are placed into the CTR. ## Special Registers Altered: **CTR** None The contents of register RX are placed into the LR. #### **Special Registers Altered:** LR # **Chapter 6. Storage Control Instructions** | 6.1 Storage Synchronization Instructions . 809 | 6.4 TLB Management Instructions 810 6.5 Instruction Alignment and Byte Order- | |------------------------------------------------|-------------------------------------------------------------------------------| | 6.2 Cache Management Instructions . 810 | ing | | 6.3 Cache Locking Instructions 810 | | # 6.1 Storage Synchronization Instructions The memory synchronization instructions implemented by category VLE are identical in semantics to those defined in Book II and Book III-E. The **se\_isync** instruction is defined by category VLE, but has the same semantics as **isync**. The Load and Reserve and Store Conditional instructions from Book II, Iwarx and stwcx. are available while executing in VLE mode. The mnemonics, decoding, and semantics for those instructions are identical to those in Book II; see Section 3.4.2 of Book II for the instruction definitions. The Load and Reserve and Store Conditional instructions from Book II, Idarx and stdcx. are available while executing in VLE mode on 64-bit implementations. The mnemonics, decoding, and semantics for those instructions are identical to those in Book II; see Section 3.4.2 of Book II for the instruction definitions. The *Memory Barrier* instructions from Book II, **sync** (**msync**) and **mbar** are available while executing in VLE mode. The mnemonics, decoding, and semantics for those instructions are identical to those in Book II; see Section 3.4.3 of Book II for the instruction definitions. The *wait* instruction from Book II is available while executing in VLE mode if the category Wait is implemented. The mnemonics, decoding, and semantics for *wait* are identical to those in Book II; see Section 3.4 of Book II for the instruction definition. #### Instruction Synchronize C-form Executing an **se\_isync** instruction ensures that all instructions preceding the **se\_isync** instruction have completed before the **se\_isync** instruction completes, and that no subsequent instructions are initiated until after the **se\_isync** instruction completes. It also ensures that all instruction cache block invalidations caused by **icbi** instructions preceding the **se\_isync** instruction have been performed with respect to the processor executing the **se\_isync** instruction, and then causes any prefetched instructions to be discarded. Except as described in the preceding sentence, the **se\_isync** instruction may complete before storage accesses associated with instructions preceding the **se\_isync** instruction have been performed. This instruction is context synchronizing. The **se\_isync** instruction has identical semantics to the Book II **isync** instruction, but has a different encoding. #### **Special Registers Altered:** # **6.2 Cache Management Instructions** Cache management instructions implemented by category VLE are identical to those defined in Book II and Book III-E. The Cache Management instructions from Book II, dcba, dcbf, dcbst, dcbt, dcbtst, dcbz, icbi, and icbt are available while executing in VLE mode. The mnemonics, decoding, and semantics for these instructions are identical to those in Book II; see Section 3.3 of Book II for the instruction definitions. The Cache Management instruction from Book III-E, **dcbi** is available while executing in VLE mode. The mnemonics, decoding, and semantics for this instruction are identical to those in Book III-E; see Section 4.9.1 of Book III-E for the instruction definition. # 6.3 Cache Locking Instructions Cache locking instructions implemented by category VLE are identical to those defined in Book III-E. If the Cache Locking instructions are implemented in category VLE, the category Embedded Cache Locking must also be implemented. The *Cache Locking* instructions from Book III-E, *dcbtls*, *dcbtstls*, *dcbtc*, *icbtls*, and *icblc* are available while executing in VLE mode. The mnemonics, decoding, and semantics for these instructions are identical to those in Book III-E; see Section 4.9.2 of Book III-E for the instruction definitions. # 6.4 TLB Management Instructions The TLB management instructions implemented by category VLE are identical to those defined in Book III-E. The *TLB Management* instructions from Book III-E, *tlbre*, *tlbwe*, *tlbivax*, *tlbsync*, and *tlbsx* are available while executing in VLE mode. The mnemonics, decoding, and semantics for these instructions are identical to those in Book III-E. See Section 4.9.4.1 of Book III-E for the instruction definitions. Instructions and resources from category Embedded.MMU Type FSL are available if the appropriate category is implemented. # 6.5 Instruction Alignment and Byte Ordering Only Big-Endian instruction memory is supported when executing from a page of VLE instructions. Attempting to fetch VLE instructions from a page marked as Little-Endian generates an instruction storage interrupt byte-ordering exception. # Chapter 7. Additional Categories Available in VLE | 7.1 | Move Assist | 811 | 7.6 | External PID | 811 | |-----|--------------------------|-----|-----|--------------------------------|-----| | 7.2 | Vector | 811 | 7.7 | Embedded Performance Monitor . | 812 | | 7.3 | Signal Processing Engine | 811 | 7.8 | Processor Control | 812 | | 7.4 | Embedded Floating Point | 811 | | | | | 7.5 | Legacy Move Assist | 811 | | | | Instructions and resources from categories other than Base and Embedded are available in VLE. These include categories for which all the instructions in the category use primary opcode 4 or primary opcode 31. #### 7.1 Move Assist Move Assist instructions implemented by category VLE are identical to those defined in Book I. If the Move Assist instructions are implemented in category VLE, category Move Assist must also be implemented. The mnemonics, decoding, and semantics for those instructions are identical to those in Book I; see Section 3.3.6 of Book I for the instruction definitions. #### 7.2 Vector Vector instructions implemented by category VLE are identical to those defined in Book I. If the Vector instructions are implemented in category VLE, category Vector must also be implemented. The mnemonics, decoding, and semantics for those instructions are identical to those in Book I; see Chapter 6 of Book I for the instruction definitions. # 7.3 Signal Processing Engine Signal Processing Engine instructions implemented by category VLE are identical to those defined in Book I. If the Signal Processing Engine instructions are implemented in category VLE, category Signal Processing Engine must also be implemented. The mnemonics, decoding, and semantics for those instructions are identical to those in Book I; see Chapter 7 of Book Ifor the instruction definitions. ## 7.4 Embedded Floating Point Embedded Floating Point instructions implemented by category VLE are identical to those defined in Book I. If the Embedded Floating Point instructions are implemented in category VLE, the appropriate category SPE.Embedded Float Scalar Double, SPE.Embedded Float Scalar Single, or SPE.Embedded Float Vector must also be implemented. The mnemonics, decoding, and semantics for those instructions are identical to those in Book I; see Chapter 8 of Book I for the instruction definitions. # 7.5 Legacy Move Assist Legacy Move Assist instructions implemented by category VLE are identical to those defined in Book I. If the Legacy Move Assist instructions are implemented in category VLE, category Legacy Move Assist must also be implemented. The mnemonics, decoding, and semantics for those instructions are identical to those in Book I; see Chapter 9 of Book I for the instruction definitions. ## 7.6 External PID External Process ID instructions implemented by category VLE are identical to those defined in Book III-E. If the External Process ID instructions are implemented in category VLE, category Embedded.External PID must also be implemented. The mnemonics, decoding, and semantics for those instructions are identical to those in Book III-E; see Chapter 3.3.4 of Book III-E for the instruction definitions. # 7.7 Embedded Performance Monitor Embedded Performance Monitor instructions implemented by category VLE are identical to those defined in Book III-E. If the Embedded Performance Monitor instructions are implemented in category VLE, category Embedded.Performance Monitor must also be implemented. The mnemonics, decoding, and semantics for those instructions are identical to those in Book III-E; see Appendix E of Book III-E for the instruction definitions. ## 7.8 Processor Control Processor Control instructions implemented by category VLE are identical to those defined in Book III-E. If the Processor Control instructions are implemented in category VLE, category Embedded Processor Control must also be implemented. The mnemonics, decoding, and semantics for those instructions are identical to those in Book III-E; see Chapter 9 of Book III-E for the instruction definitions. # Appendix A. VLE Instruction Set Sorted by Mnemonic This appendix lists all the instructions available in VLE mode in the Power ISA, in order by mnemonic. Opcodes that are not defined below are treated as illegal by category VLE. | | Opcode | | | | | | |------|-------------------------------------------|------------|-----|------------------|-------------|---------------------------------------------------------| | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | ode<br>ep. | !≥! | Cat <sup>1</sup> | Mnemonic | Instruction | | F. | mal) <sup>2</sup> | ğ | Ъ | | | | | XO | 7C000214 | | | В | add[o][.] | Add | | XO | 7C000014 | | | В | addc[o][.] | Add Carrying | | XO | 7C000114 | SR | | В | adde[o][.] | Add Extended | | XO | 7C0001D4 | SR | | В | addme[o][.] | Add to Minus One Extended | | XO | 7C000194 | SR | | В | addze[o][.] | Add to Zero Extended | | Χ | 7C000038 | SR | | В | and[.] | AND | | X | 7C000078 | SR | | В | andc[.] | AND with Complement | | EVX | 1000020F | | | SP | brinc | Bit Reverse Increment | | X | 7C000000 | | | В | cmp | Compare | | X | 7C000040 | | | В | cmpl | Compare Logical | | Χ | 7C000074 | SR | | 64 | cntlzd[.] | Count Leading Zeros Doubleword | | X | 7C000034 | SR | | В | cntlzw[.] | Count Leading Zeros Word | | Χ | 7C0005EC | | | Ε | dcba | Data Cache Block Allocate | | Χ | 7C0000AC | | | В | dcbf | Data Cache Block Flush | | X | 7C0000FE | | Ρ | E.PD | dcbfep | Data Cache Block Flush by External Process ID | | Χ | 7C0003AC | | Ρ | Е | dcbi | Data Cache Block Invalidate | | X | 7C00030C | | Μ | ECL | dcblc | Data Cache Block Lock Clear | | Χ | 7C00006C | | | В | dcbst | Data Cache Block Store | | X | 7C00022C | | | В | dcbt | Data Cache Block Touch | | Χ | 7C00027E | | Ρ | E.PD | dcbtep | Data Cache Block Touch by External Process ID | | X | 7C00014C | | Μ | ECL | dcbtls | Data Cache Block Touch and Lock Set | | X | 7C0001EC | | | В | dcbtst | Data Cache Block Touch for Store | | X | 7C0001FE | | Р | E.PD | dcbtstep | Data Cache Block Touch for Store by External Process ID | | X | 7C00010C | | M | ECL | dcbtstls | Data Cache Block Touch for Store and Lock Set | | X | 7C0007EC | | | В | dcbz | Data Cache Block set to Zero | | X | 7C0007FE | | Р | E.PD | dcbzep | Data Cache Block set to Zero by External Process ID | | X | 7C00038C | | Р | E.CI | dci | Data Cache Invalidate | | X | 7C00028C | | Р | E.CD | dcread | Data Cache Read | | X | 7C0003CC | | Р | E.CD | dcread | Data Cache Read | | XO | 7C0003D2 | SR | | 64 | divd[o][.] | Divide Doubleword | | XO | 7C000392 | SR | | 64 | divdu[o][.] | Divide Doubleword Unsigned | | XO | 7C0003D6 | SR | | В | divw[o][.] | Divide Word | | XO | 7C000396 | SR | | В_ | divwu[o][.] | Divide Word Unsigned | | D | 1C000000 | | | VLE | e_add16i | Add Immediate | | 116A | 70008800 | SR | | VLE | e_add2i. | Add (2 operand) Immediate and Record | | 116A | 70009000 | | | VLE | e_add2is | Add (2 operand) Immediate Shifted | | SCI8 | 18008000 | SR | | VLE | e_addi[.] | Add Scaled Immediate | | SCI8 | 18009000 | SR | | VLE | e_addic[.] | Add Scaled Immediate Carrying | | I16L | 7000C800 | SR | | VLE | e_and2i. | AND (2 operand) Immediate | | I16L | 7000E800 | SR | | VLE | e_and2is. | AND (2 operand) Immediate Shifted | | SCI8 | 1800C000 | SR | | VLE | e_andi[.] | AND Scaled Immediate | | BD24 | 78000000 | | | VLE | e_b[l] | Branch [and Link] | | BD15 | 7A000000 | CT | | VLE | e_bc[l] | Branch Conditional [and Link] | | IA16 | 70009800 | | | VLE | e_cmp16i | Compare Immediate Word | | IA16 | 7000B000 | | | VLE | e_cmph16i | Compare Halfword Immediate | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep. <sup>1</sup> | Cat <sup>1</sup> | Mnemonic | Instruction | |----------|-------------------------------------------|---------------------------|------------------|-----------------------|-------------------------------------------------------------------------| | X | 7C00001C | | VLE | e_cmph | Compare Halfword | | IA16 | 7000B800 | | VLE | e_cmphl16i | Compare Halfword Logical Immediate | | X | 7C00005C | | VLE | e_cmphl | Compare Halfword Logical | | SCI8 | 1800A800 | | VLE | e_cmpi | Compare Scaled Immediate Word | | I16A | 7000A800 | | VLE | e_cmpl16i | Compare Logical Immediate Word | | SCI8 | 1880A800 | | VLE | e_cmpli | Compare Logical Scaled Immediate Word | | XL | 7C000202 | | VLE | e_crand | Condition Register AND | | XL | 7C000102 | | VLE | e_crandc | Condition Register AND with Complement | | XL | 7C000242 | | VLE<br>VLE | e_creqv | Condition Register Equivalent | | XL<br>XL | 7C0001C2 | | VLE | e_crnand | Condition Register NAND | | XL | 7C000042<br>7C000382 | | VLE | e_crnor<br>e_cror | Condition Register NOR Condition Register OR | | XL | 7C000382<br>7C000342 | | VLE | e_crorc | Condition Register OR with Complement | | XL | 7C000342<br>7C000182 | | VLE | e_crxor | Condition Register XOR | | D | 30000000 | | VLE | e lbz | Load Byte and Zero | | D8 | 18000000 | | VLE | e_lbzu | Load Byte and Zero with Update | | D | 38000000 | | VLE | e_lha | Load Halfword Algebraic | | D8 | 18000300 | | VLE | e_lhau | Load Halfword Algebraic with Update | | D | 58000000 | | VLE | e_lhz | Load Halfword and Zero | | D8 | 18000100 | | VLE | e_lhzu | Load Halfword and Zero with Update | | LI20 | 70000000 | | VLE | e_li | Load Immediate | | I16L | 7000E000 | | VLE | e_lis | Load Immediate Shifted | | D8 | 18000800 | | VLE | e_lmw | Load Multiple Word | | D | 50000000 | | VLE | e_lwz | Load Word and Zero | | D8 | 18000200 | | VLE | e_lwzu | Load Word and Zero with Update | | XL | 7C000020 | | VLE | e_mcrf | Move CR Field | | I16A | 7000A000 | | VLE | e_mull2i | Multiply (2 operand) Low Immediate | | SCI8 | 1800A000 | | VLE | e_mulli | Multiply Low Scaled Immediate | | I16L | 7000C000 | | VLE | e_or2i | OR (2operand) Immediate | | I16L | 7000D000 | 00 | VLE | e_or2is | OR (2 operand) Immediate Shifted | | SCI8 | 1800D000 | SR | VLE<br>VLE | e_ori[.] | OR Scaled Immediate | | X | 7C000230 | SR<br>SR | VLE | e_rlw[.] | Rotate Left Word Rotate Left Word Immediate | | M | 7C000270<br>74000000 | SK | VLE | e_rlwi[.]<br>e_rlwimi | Rotate Left Word Immediate Rotate Left Word Immediate then Mask Insert | | M | 74000000 | | VLE | e_rlwinm | Rotate Left Word Immediate then Mask insert | | X | 7000001 | SR | VLE | e_slwi[.] | Shift Left Word Immediate | | X | 7C000470 | SR | VLE | e_srwi[.] | Shift Right Word Immediate | | D | 34000000 | 0.1 | VLE | e_stb | Store Byte | | D8 | 18000400 | | VLE | e_stbu | Store Byte with Update | | D | 5C000000 | | VLE | e_sth | Store Halfword | | D8 | 18000500 | | VLE | e_sthu | Store Halfword with Update | | D8 | 18000900 | | VLE | e_stmw | Store Multiple Word | | D | 54000000 | | VLE | e_stw | Store Word | | D8 | 18000600 | | VLE | e_stwu | Store word with Update | | SCI8 | 1800B000 | SR | VLE | e_subfic[.] | Subtract From Scaled Immediate Carrying | | SCI8 | 1800E000 | SR | VLE | e_xori[.] | XOR Scaled Immediate | | EVX | 100002E4 | | SP.FD | efdabs | Floating-Point Double-Precision Absolute Value | | EVX | 100002E0 | | SP.FD | efdadd | Floating-Point Double-Precision Add | | EVX | 100002EF | | SP.FD | efdcfs | Floating-Point Double-Precision Convert from Single-Precision | | EVX | 100002F3 | | SP.FD | efdcfsf | Convert Floating-Point Double-Precision from Signed Fraction | | EVX | 100002F1 | | SP.FD | efdcfsi | Convert Floating-Point Double-Precision from Signed Integer | | EVX | 100002E3 | | SP.FD | efdcfsid | Convert Floating-Point Double-Precision from Signed Integer Doubleword | | EVX | 100002F2 | | SP.FD | efdcfuf | Convert Floating-Point Double-Precision from Unsigned Fraction | | EVX | 100002F0 | | SP.FD | efdcfui | Convert Floating-Point Double-Precision from Unsigned Integer | | _ | Opcode | 4.7 | | | | |---------|-------------------------------------------|----------|------------------|--------------------|----------------------------------------------------------------------------------------------------------------------| | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | e e | Cat <sup>1</sup> | Mnemonic | Instruction | | | mal) <sup>2</sup> | عَ مَ عَ | | | | | EVX | 100002E2 | | SP.FD | efdcfuid | Convert Floating-Point Double-Precision from Unsigned | | E) () ( | | | 00.50 | | Integer Doubleword | | EVX | 100002EE | | SP.FD | efdcmpeq | Floating-Point Double-Precision Compare Equal | | EVX | 100002EC | | SP.FD | efdcmpgt | Floating-Point Double-Precision Compare Greater Than | | EVX | 100002ED | | SP.FD<br>SP.FD | efdcmplt | Floating-Point Double-Precision Compare Less Than | | EVX | 100002F7<br>100002F5 | | SP.FD | efdctsf<br>efdctsi | Convert Floating-Point Double-Precision to Signed Fraction Convert Floating-Point Double-Precision to Signed Integer | | EVX | 100002F5 | | SP.FD | efdctsidz | Convert Floating-Point Double-Precision to Signed Integer | | LVX | 10000266 | | 31.10 | Glucisiuz | Doubleword with Round Towards Zero | | EVX | 100002FA | | SP.FD | efdctsiz | Convert Floating-Point Double-Precision to Signed Integer | | | 100002111 | | 0 5 | 0.00012 | with Round Towards Zero | | EVX | 100002F6 | | SP.FD | efdctuf | Convert Floating-Point Double-Precision to Unsigned Frac- | | | 10000210 | | 0 5 | ordota. | tion | | EVX | 100002F4 | | SP.FD | efdctui | Convert Floating-Point Double-Precision to Unsigned Inte- | | | | | | | ger | | EVX | 100002EA | | SP.FD | efdctuidz | Convert Floating-Point Double-Precision to Unsigned Inte- | | | | | | | ger Doubleword with Round Towards Zero | | EVX | 100002F8 | | SP.FD | efdctuiz | Convert Floating-Point Double-Precision to Unsigned Inte- | | | | | | | ger with Round Towards Zero | | EVX | 100002E9 | | SP.FD | efddiv | Floating-Point Double-Precision Divide | | EVX | 100002E8 | | SP.FD | efdmul | Floating-Point Double-Precision Multiply | | EVX | 100002E5 | | SP.FD | efdnabs | Floating-Point Double-Precision Negative Absolute Value | | EVX | 100002E6 | | SP.FD | efdneg | Floating-Point Double-Precision Negate | | EVX | 100002E1 | | SP.FD | efdsub | Floating-Point Double-Precision Subtract | | EVX | 100002FE | | SP.FD | efdtsteq | Floating-Point Double-Precision Test Equal | | EVX | 100002FC | | SP.FD | efdtstgt | Floating-Point Double-Precision Test Greater Than | | EVX | 100002FD | | SP.FD | efdtstlt | Floating-Point Double-Precision Test Less Than | | EVX | 100002E4 | | SP.FS | efsabs | Floating-Point Single-Precision Absolute Value | | EVX | 100002E0 | | SP.FS | efsadd | Floating-Point Single-Precision Add | | EVX | 100002CF | | SP.FD | efscfd | Floating-Point Single-Precision Convert from Double-Precision | | EVX | 100002F3 | | SP.FS | efscfsf | Convert Floating-Point Single-Precision from Signed Frac- | | EVA | 100002F3 | | 3F.F3 | eiscisi | tion | | EVX | 100002F1 | | SP.FS | efscfsi | Convert Floating-Point Single-Precision from Signed Integer | | EVX | 100002F1 | | SP.FS | efscfsid | Convert Floating-Point Single-Precision from Signed Integer | | | 100002E3 | | 010 | Ciscisia | Doubleword | | EVX | 100002F2 | | SP.FS | efscfuf | Convert Floating-Point Single-Precision from Unsigned | | | 10000212 | | 0 0 | Olooidi | Fraction | | EVX | 100002F0 | | SP.FS | efscfui | Convert Floating-Point Single-Precision from Unsigned Inte- | | | 10000110 | | 0 0 | 0.00.0. | ger | | EVX | 100002E2 | | SP.FS | efscfuid | Convert Floating-Point Single-Precision from Unsigned Inte- | | | | | | | ger Doubleword | | EVX | 100002EE | | SP.FS | efscmpeq | Floating-Point Single-Precision Compare Equal | | EVX | 100002EC | | SP.FS | efscmpgt | Floating-Point Single-Precision Compare Greater Than | | EVX | 100002ED | | SP.FS | efscmplt | Floating-Point Single-Precision Compare Less Than | | EVX | 100002F7 | | SP.FS | efsctsf | Convert Floating-Point Single-Precision to Signed Fraction | | EVX | 100002F5 | | SP.FS | efsctsi | Convert Floating-Point Single-Precision to Signed Integer | | EVX | 100002EB | | SP.FS | efsctsidz | Convert Floating-Point Single-Precision to Signed Integer | | | | | | | Doubleword with Round Towards Zero | | EVX | 100002FA | | SP.FS | efsctsiz | Convert Floating-Point Single-Precision to Signed Integer | | | | | | | with Round Towards Zero | | EVX | 100002F6 | | SP.FS | efsctuf | Convert Floating-Point Single-Precision to Unsigned Frac- | | | | | 05.50 | | tion | | EVX | 100002F4 | | SP.FS | efsctui | Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 100002EA | | SP.FS | efsctuidz | Convert Floating-Point Single-Precision to Unsigned Integer | | | | | 00.50 | - <b>f</b> | Doubleword with Round Towards Zero | | EVX | 100002F8 | | SP.FS | efsctuiz | Convert Floating-Point Single-Precision to Unsigned Integer | | EV./V | 100000 | | 00.50 | مان ، | with Round Towards Zero | | EVX | 100002E9 | | SP.FS | efsdiv | Floating-Point Single-Precision Divide | | EVX | 100002E8 | | SP.FS | efsmul | Floating-Point Single-Precision Multiply | | | Opcode | <b>⊕</b> ∠ | | | | |-------|---------------------------------|------------|--------------------|------------------------|-----------------------------------------------------------------------------------------------------------------| | Form | (hexadeci-<br>mal) <sup>2</sup> | ode<br>ep. | _ Cat <sup>1</sup> | Mnemonic | Instruction | | F. | ` mal) <sup>2</sup> | ĎĎ | ፈ | | | | EVX | 100002E5 | | SP.FS | | Floating-Point Single-Precision Negative Absolute Value | | EVX | 100002E6 | | SP.FS | efsneg | Floating-Point Single-Precision Negate | | EVX | 100002E1 | | SP.FS | efssub | Floating-Point Single-Precision Subtract | | EVX | 100002FE | | SP.FS | efststeq | Floating-Point Single-Precision Test Equal | | EVX | 100002FC | | SP.FS | efststgt | Floating-Point Single-Precision Test Greater Than | | EVX | 100002FD | | SP.FS | efststlt | Floating-Point Single-Precision Test Less Than | | X | 7C000238 | SR | В | eqv[.] | Equivalent | | EVX | 10000208 | | SP | evabs | Vector Absolute Value | | EVX | 10000202 | | SP<br>SP | evaddiw<br>evaddsmiaaw | Vector Add Immediate Word | | EVX | 100004C9<br>100004C1 | | SP | evaddssiaaw | Vector Add Signed, Modulo, Integer to Accumulator Word Vector Add Signed, Saturate, Integer to Accumulator Word | | EVX | 100004C1 | | SP | evaddumiaaw | Vector Add Unsigned, Modulo, Integer to Accumulator Word | | EVX | 100004C8 | | SP | evaddusiaaw | Vector Add Unsigned, Modalo, Integer to Accumulator Vector Add Unsigned, Saturate, Integer to Accumulator | | - ** | 10000100 | | 0. | Ovadadoladw | Word | | EVX | 10000200 | | SP | evaddw | Vector Add Word | | EVX | 10000211 | | SP | evand | Vector AND | | EVX | 10000212 | | SP | evandc | Vector AND with Complement | | EVX | 10000234 | | SP | evcmpeq | Vector Compare Equal | | EVX | 10000231 | | SP | evcmpgts | Vector Compare Greater Than Signed | | EVX | 10000230 | | SP | evcmpgtu | Vector Compare Greater Than Unsigned | | EVX | 10000233 | | SP | evcmplts | Vector Compare Less Than Signed | | EVX | 10000232 | | SP | evcmpltu | Vector Compare Less Than Unsigned | | EVX | 1000020E | | SP | evcntlsw | Vector Count Leading Sign Bits Word | | EVX | 1000020D | | SP | evcntlzw | Vector Count Leading Zeros Bits Word | | EVX | 100004C6 | | SP | evdivws | Vector Divide Word Signed | | EVX | 100004C7 | | SP | evdivwu | Vector Divide Word Unsigned | | EVX | 10000219 | | SP<br>SP | eveqv | Vector Equivalent | | EVX | 1000020A | | SP | evextsb<br>evextsh | Vector Extend Sign Byte Vector Extend Sign Halfword | | EVX | 1000020B<br>10000284 | | SP.FV | evfsabs | Vector Floating-Point Single-Precision Absolute Value | | EVX | 10000284 | | SP.FV | evfsadd | Vector Floating-Point Single-Precision Add | | EVX | 10000293 | | SP.FV | evfscfsf | Vector Convert Floating-Point Single-Precision from Signed | | | | | | | Fraction | | EVX | 10000291 | | SP.FV | evfscfsi | Vector Convert Floating-Point Single-Precision from Signed | | | | | | | Integer | | EVX | 10000292 | | SP.FV | evfscfuf | Vector Convert Floating-Point Single-Precision from | | | | | | | Unsigned Fraction | | EVX | 10000290 | | SP.FV | evfscfui | Vector Convert Floating-Point Single-Precision from | | | | | | | Unsigned Integer | | EVX | 1000028E | | SP.FV | evfscmpeq | Vector Floating-Point Single-Precision Compare Equal | | EVX | 1000028C | | SP.FV | evfscmpgt | Vector Floating-Point Single-Precision Compare Greater | | | | | | | Than | | EVX | 1000028D | | SP.FV | evfscmplt | Vector Floating-Point Single-Precision Compare Less Than | | EVX | 10000297 | | SP.FV | evfsctsf | Vector Convert Floating-Point Single-Precision to Signed | | | | | | | Fraction | | EVX | 10000295 | | SP.FV | evfsctsi | Vector Convert Floating-Point Single-Precision to Signed | | _, ,, | | | | | Integer | | EVX | 1000029A | | SP.FV | evfsctsiz | Vector Convert Floating-Point Single-Precision to Signed | | _, ,, | | | | | Integer with Round Towards Zero | | EVX | 10000296 | | SP.FV | evfsctuf | Vector Convert Floating-Point Single-Precision to Unsigned | | | | | 05 5 | | Fraction | | EVX | 10000294 | | SP.FV | evfsctui | Vector Convert Floating-Point Single-Precision to Unsigned | | EV/V | 1000000 | | CD EV | ovefo otroit | Integer | | EVX | 10000298 | | SP.FV | evfsctuiz | Vector Convert Floating-Point Single-Precision to Unsigned | | EVV | 1000000 | | CD EV | ov fodi: | Integer with Round Towards Zero | | EVX | 10000289 | | SP.FV | evfsdiv | Vector Floating-Point Single-Precision Divide | | EVX | 10000288 | | SP.FV<br>SP.FV | evfsmul<br>evfsnabs | Vector Floating-Point Single-Precision Multiply Vector Floating-Point Single-Precision Negative Absolute | | - ^ ^ | 10000285 | | 3F.FV | evioliano | Value | | EVX | 10000286 | | SP.FV | evfsneg | Vector Floating-Point Single-Precision Negate | | LVA | T0000200 | | JF.FV | eviolicy | vector i loating-i oint oingle-i lecision negate | | | Opcode | 4 | | | | | |------------|-------------------------------------------|------------|----------|------------------|-----------------------------------------|-------------------------------------------------------------------| | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | ode<br>ep. | <u>.</u> | Cat <sup>1</sup> | Mnemonic | Instruction | | | ` mal) <sup>2</sup> | ŽΔ | ₫. | | | | | EVX | 10000281 | | | SP.FV | evfssub | Vector Floating-Point Single-Precision Subtract | | EVX | 1000029E | | | SP.FV | evfststeq | Vector Floating-Point Single-Precision Test Equal | | EVX | 1000029C | | | SP.FV | evfststgt | Vector Floating-Point Single-Precision Test Greater Than | | EVX | 1000029D | | | SP.FV | evfststlt | Vector Floating-Point Single-Precision Test Less Than | | EVX | 10000301 | | | SP | evldd | Vector Load Doubleword into Doubleword | | EVX | 7C00011D | | Р | E.PD | evlddepx | Vector Load Doubleword into Doubleword by External Pro- | | | | | | | | cess ID Indexed | | EVX | 10000300 | | | SP | evlddx | Vector Load Doubleword into Doubleword Indexed | | EVX | 10000305 | | | SP | evldh | Vector Load Doubleword into 4 Halfwords | | EVX | 10000304 | | | SP | evldhx | Vector Load Doubleword into 4 Halfwords Indexed | | EVX | 10000303 | | | SP | evldw | Vector Load Doubleword into 2 Words | | EVX | 10000302 | | | SP | evldwx | Vector Load Doubleword into 2 Words Indexed | | EVX | 10000309 | | | SP | evlhhesplat | Vector Load Halfword into Halfwords Even and Splat | | EVX | 10000308 | | | SP | evlhhesplatx | Vector Load Halfword into Halfwords Even and Splat | | | | | | | | Indexed | | EVX | 1000030F | | | SP | evlhhossplat | Vector Load Halfword into Halfwords Odd and Splat | | EVX | 1000030E | | | SP | evlhhossplatx | Vector Load Halfword into Halfwords Odd Signed and Splat | | | | | | | • | Indexed | | EVX | 1000030D | | | SP | evlhhousplat | Vector Load Halfword into Halfwords Odd Unsigned and | | | | | | | | Splat | | EVX | 1000030C | | | SP | evlhhousplatx | Vector Load Halfword into Halfwords Odd Unsigned and | | | 10000300 | | | 0. | oviiiiodopiatx | Splat Indexed | | EVX | 10000311 | | | SP | evlwhe | Vector Load Word into Two Halfwords Even | | EVX | 10000311 | | | SP | evlwhex | Vector Load Word into Two Halfwords Even Indexed | | EVX | 10000310 | | | SP | evlwhos | Vector Load Word into Two Halfwords Odd Signed (with | | LVX | 10000317 | | | Oi | CVIWIIOS | sign extension) | | EVX | 10000316 | | | SP | evlwhosx | Vector Load Word into Two Halfwords Odd Signed Indexed | | LVX | 10000310 | | | 01 | CVIWIIOSX | (with sign extension) | | EVX | 10000315 | | | SP | evlwhou | Vector Load Word into Two Halfwords Odd Unsigned (zero- | | LVA | 10000315 | | | SF | eviwilou | extended) | | EVX | 10000314 | | | SP | evlwhoux | Vector Load Word into Two Halfwords Odd Unsigned | | EVA | 10000314 | | | SF | eviwiloux | | | EV/V | 10000015 | | | CD | alla a a l a t | Indexed (zero-extended) | | EVX | 1000031D | | | SP<br>SP | evlwhsplat | Vector Load Word into Two Halfwords and Splat | | | 1000031C | | | | evlwhsplatx | Vector Load Word into Two Halfwords and Splat Indexed | | EVX<br>EVX | 10000319 | | | SP<br>SP | evlwwsplat | Vector Load Word into Word and Splat | | EVX | 10000318 | | | SP | evlwwsplatx<br>evmergehi | Vector Load Word into Word and Splat Indexed<br>Vector Merge High | | EVX | 1000022C | | | SP | | Vector Merge High! our | | EVX | 1000022E | | | SP | evmergehilo | Vector Merge High/Low | | EVX | 1000022D | | | SP | evmergelo | Vector Merge Low | | | 1000022F | | | | evmergelohi | Vector Merge Low/High | | EVX | 1000052B | | | SP | evmhegsmfaa | Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, | | E) /)/ | | | | OD. | | Fractional and Accumulate | | EVX | 100005AB | | | SP | evmhegsmfan | Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, | | | | | | 0.0 | | Fractional and Accumulate Negative | | EVX | 10000529 | | | SP | evmhegsmiaa | Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, | | | | | | | | Integer and Accumulate | | EVX | 100005A9 | | | SP | evmhegsmian | Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, | | | | | | | | Integer and Accumulate Negative | | EVX | 10000528 | | | SP | evmhegumiaa | Vector Multiply Halfwords, Even, Guarded, Unsigned, Mod- | | | | | | | | ulo, Integer and Accumulate | | EVX | 100005A8 | | | SP | evmhegumian | Vector Multiply Halfwords, Even, Guarded, Unsigned, Mod- | | | | | | | | ulo, Integer and Accumulate Negative | | EVX | 1000040B | | | SP | evmhesmf | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | EVX | 1000042B | | | SP | evmhesmfa | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | to Accumulate | | EVX | 1000050B | | | SP | evmhesmfaaw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | and Accumulate into Words | | EVX | 1000058B | | | SP | evmhesmfanw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | and Accumulate Negative into Words | | EVX | 10000409 | | | SP | evmhesmi | Vector Multiply Halfwords, Even, Signed, Modulo, Integer | | _ ' ' ' | ±0000±00 | | <u> </u> | ٠, | - · · · · · · · · · · · · · · · · · · · | 1 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep. <sup>1</sup> | ≧ Cat <sup>1</sup> | Mnemonic | Instruction | |------------|-------------------------------------------|---------------------------|--------------------|-----------------------|--------------------------------------------------------------------------------------------------------------------------------| | EVX | 10000429 | | SP | evmhesmia | Vector Multiply Halfwords, Even, Signed, Modulo, Integer to | | EVX | 10000509 | | SP | evmhesmiaaw | Accumulator Vector Multiply Halfwords, Even, Signed, Modulo, Integer and Accumulate into Words | | EVX | 10000589 | | SP | evmhesmianw | Vector Multiply Halfwords, Even, Signed, Modulo, Integer | | EVX | 10000403 | | SP | evmhessf | and Accumulate Negative into Words Vector Multiply Halfwords, Even, Signed, Saturate, Fractional | | EVX | 10000423 | | SP | evmhessfa | Vector Multiply Halfwords, Even, Signed, Saturate, Fractional to Accumulator | | EVX | 10000503 | | SP | evmhessfaaw | Vector Multiply Halfwords, Even, Signed, Saturate, Fractional and Accumulate into Words | | EVX | 10000583 | | SP | evmhessfanw | Vector Multiply Halfwords, Even, Signed, Saturate, Frac- | | EVX | 10000501 | | SP | evmhessiaaw | tional and Accumulate Negative into Words Vector Multiply Halfwords, Even, Signed, Saturate, Integer and Accumulate into Words | | EVX | 10000581 | | SP | evmhessianw | Vector Multiply Halfwords, Even, Signed, Saturate, Integer and Accumulate Negative into Words | | EVX<br>EVX | 10000408<br>10000428 | | SP<br>SP | evmheumi<br>evmheumia | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer | | EVX | 10000508 | | SP | evmheumiaaw | to Accumulator Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer and Accumulate into Words | | EVX | 10000588 | | SP | evmheumianw | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer and Accumulate Negative into Words | | EVX | 10000500 | | SP | evmheusiaaw | Vector Multiply Halfwords, Even, Unsigned, Saturate Integer and Accumulate into Words | | EVX | 10000580 | | SP | evmheusianw | Vector Multiply Halfwords, Even, Unsigned, Saturate Integer and Accumulate Negative into Words | | EVX | 1000052F | | SP | evmhogsmfaa | Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Fractional and Accumulate | | EVX | 100005AF | | SP | evmhogsmfan | Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Fractional and Accumulate Negative | | EVX | 1000052D | | SP | evmhogsmiaa | Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Integer and Accumulate | | EVX | 100005AD | | SP | evmhogsmian | Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Integer and Accumulate Negative | | EVX | 1000052C | | SP | evmhogumiaa | Vector Multiply Halfwords, Odd, Guarded, Unsigned, Modulo, Integer and Accumulate | | EVX | 100005AC | | SP | evmhogumian | Vector Multiply Halfwords, Odd, Guarded, Unsigned, Modulo, Integer and Accumulate Negative | | EVX<br>EVX | 1000040F<br>1000042F | | SP<br>SP | evmhosmf<br>evmhosmfa | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional | | EVX | 1000050F | | SP | evmhosmfaaw | to Accumulator Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate into Words | | EVX | 1000058F | | SP | evmhosmfanw | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX | 1000040D | | SP | evmhosmi | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer | | EVX | 1000042D | | SP | evmhosmia | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer to Accumulator | | EVX | 1000050D | | SP | evmhosmiaaw | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer and Accumulate into Words | | EVX | 1000058D | | SP | evmhosmianw | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer and Accumulate Negative into Words | | EVX<br>EVX | 10000407<br>10000427 | | SP<br>SP | evmhossf<br>evmhossfa | Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional to Accumulator | | | Opcode | ω <del>-</del> | | | | |------------|-------------------------------------------|----------------|------------------|-----------------------|---------------------------------------------------------------------------------------------------------------------| | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mod<br>Dep. | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 10000507 | | SP | evmhossfaaw | Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate into Words | | EVX | 10000587 | | SP | evmhossfanw | Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate Negative into Words | | EVX | 10000505 | | SP | evmhossiaaw | Vector Multiply Halfwords, Odd, Signed, Saturate, Integer and Accumulate into Words | | EVX | 10000585 | | SP | evmhossianw | Vector Multiply Halfwords, Odd, Signed, Saturate, Integer and Accumulate Negative into Words | | EVX<br>EVX | 1000040C<br>1000042C | | SP<br>SP | evmhoumi<br>evmhoumia | Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer | | | | | | | to Accumulator | | EVX | 1000050C | | SP | evmhoumiaaw<br> | Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer and Accumulate into Words | | EVX | 1000058C | | SP | evmhoumianw | Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer and Accumulate Negative into Words | | EVX | 10000504 | | SP | evmhousiaaw | Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer and Accumulate into Words | | EVX | 10000584 | | SP | evmhousianw | Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer and Accumulate Negative into Words | | EVX | 100004C4 | | SP | evmra | Initialize Accumulator | | EVX | 1000044F | | SP | evmwhsmf | Vector Multiply Word High Signed, Modulo, Fractional | | EVX | 1000046F | | SP | evmwhsmfa | Vector Multiply Word High Signed, Modulo, Fractional to Accumulator | | EVX | 1000054F | | SP | evmwhsmfaaw | Vector Multiply Word High Signed, Modulo, Fractional and Accumulate into Words | | EVX | 100005CF | | SP | evmwhsmfanw | Vector Multiply Word High Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX<br>EVX | 1000044D<br>1000046D | | SP<br>SP | evmwhsmi<br>evmwhsmia | Vector Multiply Word High Signed, Modulo, Integer Vector Multiply Word High Signed, Modulo, Integer to Accu- | | EVX | 1000054D | | SP | evmwhsmiaaw | mulator Vector Multiply Word High Signed, Modulo, Integer and | | EVX | 100005CD | | SP | evmwhsmianw | Accumulate into Words Vector Multiply Word High Signed, Modulo, Integer and | | EVX | 10000447 | | SP | evmwhssf | Accumulate Negative into Words Vector Multiply Word High Signed, Fractional | | EVX | 10000467 | | SP | evmwhssfa | Vector Multiply Word High Signed, Fractional to Accumulator | | EVX | 10000547 | | SP | evmwhssfaaw | Vector Multiply Word High Signed, Fractional and Accumulate into Words | | EVX | 100005C7 | | SP | evmwhssfanw | Vector Multiply Word High Signed, Fractional and Accumulate Negative into Words | | EVX | 100005C5 | | SP | evmwhssianw | Vector Multiply Word High Signed, Integer and Accumulate Negative into Words | | EVX<br>EVX | 1000044C<br>1000046C | | SP<br>SP | evmwhumi<br>evmwhumia | Vector Multiply Word High Unsigned, Modulo, Integer Vector Multiply Word High Unsigned, Modulo, Integer to | | EVX | 1000010C | | SP | evmwhumiaaw | Accumulator Vector Multiply Word High Unsigned, Modulo, Integer and | | EVX | 100005CC | | SP | evmwhumianw | Accumulate into Words Vector Multiply Word High Unsigned, Modulo, Integer and | | EVX | 10000544 | | SP | evmwhusiaaw | Accumulate Negative into Words Vector Multiply Word High Unsigned, Integer and Accumu- | | EVX | 100005C4 | | SP | evmwhusianw | late into Words Vector Multiply Word High Unsigned, Integer and Accumu- | | EVX | 10000549 | | SP | evmwlsmiaaw | late Negative into Words Vector Multiply Word Low Signed, Modulo, Integer and | | EVX | 100005C9 | | SP | evmwlsmianw | Accumulate into Words Vector Multiply Word Low Signed, Modulo, Integer and Accumulate Negative into Words | | | Opcode | | 1 | | | |------------|-------------------------------------------|------|----------------------------------------------|---------------------|--------------------------------------------------------------------------------------------------------------------| | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Dep. | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 10000541 | | SP | evmwlssiaaw | Vector Multiply Word Low Signed, Saturate, Integer and | | EVX | 100005C1 | | SP | evmwlssianw | Accumulate into Words Vector Multiply Word Low Signed, Saturate, Integer and Accumulate Negative into Words | | EVX | 10000448 | | SP | evmwlumi | Vector Multiply Word Low Unsigned, Modulo, Integer | | EVX | 10000468 | | SP | evmwlumia | Vector Multiply Word Low Unsigned, Modulo, Integer to Accumulator | | EVX | 10000548 | | SP | evmwlumiaaw | Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words | | EVX | 100005C8 | | SP | evmwlumianw | Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate Negative into Words | | EVX | 10000540 | | SP | evmwlusiaaw | Vector Multiply Word Low Unsigned Saturate, Integer and Accumulate into Words | | EVX | 100005C0 | | SP | evmwlusianw | Vector Multiply Word Low Unsigned Saturate, Integer and Accumulate Negative into Words | | EVX | 1000045B<br>1000047B | | SP<br>SP | evmwsmf<br>evmwsmfa | Vector Multiply Word Signed, Modulo, Fractional Vector Multiply Word Signed, Modulo, Fractional to Accumulator | | EVX | 1000055B | | SP | evmwsmfaa | Vector Multiply Word Signed, Modulo, Fractional and Accumulate | | EVX | 100005DB | | SP | evmwsmfan | Vector Multiply Word Signed, Modulo, Fractional and Accumulate Negative | | EVX<br>EVX | 10000459 | | SP<br>SP | evmwsmi<br>evmwsmia | Vector Multiply Word Signed, Modulo, Integer<br>Vector Multiply Word Signed, Modulo, Integer to Accumula-<br>tor | | EVX | 10000559 | | SP | evmwsmiaa | Vector Multiply Word Signed, Modulo, Integer and Accumulate | | EVX | 100005D9 | | SP | evmwsmian | Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative | | EVX<br>EVX | 10000453<br>10000473 | | SP<br>SP | evmwssf<br>evmwssfa | Vector Multiply Word Signed, Saturate, Fractional Vector Multiply Word Signed, Saturate, Fractional to Accumulator | | EVX | 10000553 | | SP | evmwssfaa | Vector Multiply Word Signed, Saturate, Fractional and Accumulate | | EVX | 100005D3 | | SP | evmwssfan | Vector Multiply Word Signed, Saturate, Fractional and Accumulate Negative | | EVX<br>EVX | 10000458<br>10000478 | | SP<br>SP | evmwumi<br>evmwumia | Vector Multiply Word Unsigned, Modulo, Integer<br>Vector Multiply Word Unsigned, Modulo, Integer to Accumu- | | EVX | 10000558 | | SP | evmwumiaa | lator Vector Multiply Word Unsigned, Modulo, Integer and Accumulate | | EVX | 100005D8 | | SP | evmwumian | Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Negative | | EVX | 1000021E | | SP | evnand | Vector NAND | | EVX | 10000209 | | SP | evneg | Vector Negate | | EVX | 10000218 | | SP | evnor | Vector NOR | | EVX | 10000217 | | SP | evor | Vector OR | | EVX<br>EVX | 1000021B | | SP<br>SP | evorc | Vector OR with Complement | | EVX | 10000228<br>1000022A | | SP<br>SP | evrlwi<br>evrlwi | Vector Rotate Left Word<br>Vector Rotate Left Word Immediate | | EVX | 1000022A<br>1000020C | | SP | evrndw | Vector Round Word | | EVSE | 10000200 | | SP | evsel | Vector Select | | EVX | 10000224 | | SP | evslw | Vector Shift Left Word | | EVX | 10000226 | | SP | evslwi | Vector Shift Left Word Immediate | | EVX | 1000022B | | SP | evsplatfi | Vector Splat Fractional Immediate | | EVX | 10000229 | | SP | evsplati | Vector Splat Immediate | | EVX | 10000223 | | SP | evsrwis | Vector Shift Right Word Immediate Signed | | EVX<br>EVX | 10000222 | | SP<br>SP | evsrwiu<br>evsrws | Vector Shift Right Word Immediate Unsigned<br>Vector Shift Right Word Signed | | | | | <u>. </u> | 1 | | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep.¹ | Priv | Cat <sup>1</sup> | Mnemonic | Instruction | |------------|-------------------------------------------|---------------|------|------------------|---------------------|---------------------------------------------------------------------------------------------------| | EVX | 10000220 | | | SP | evsrwu | Vector Shift Right Word Unsigned | | EVX | 10000321 | | | SP | evstdd | Vector Store Doubleword of Doubleword | | EVX | 7C00019D | | Р | E.PD | evstddepx | Vector Store Doubleword into Doubleword by External Process ID Indexed | | EVX | 10000320 | | | SP | evstddx | Vector Store Doubleword of Doubleword Indexed | | EVX | 10000325 | | | SP | evstdh | Vector Store Doubleword of Four Halfwords | | EVX | 10000324 | | | SP | evstdhx | Vector Store Doubleword of Four Halfwords Indexed | | EVX | 10000323 | | | SP | evstdw | Vector Store Doubleword of Two Words | | EVX<br>EVX | 10000322 | | | SP<br>SP | evstdwx | Vector Store Doubleword of Two Words Indexed | | EVX | 10000331 | | | SP | evstwhe<br>evstwhex | Vector Store Word of Two Halfwords from Even Vector Store Word of Two Halfwords from Even Indexed | | EVX | 10000330 | | | SP | evstwho | Vector Store Word of Two Halfwords from Odd | | EVX | 10000333 | | | SP | evstwhox | Vector Store Word of Two Halfwords from Odd Indexed | | EVX | 10000334 | | | SP | evstwwe | Vector Store Word of Word from Even | | EVX | 10000338 | | | SP | evstwwex | Vector Store Word of Word from Even Indexed | | EVX | 1000033D | | | SP | evstwwo | Vector Store Word of Word from Odd | | EVX | 1000033C | | | SP | evstwwox | Vector Store Word of Word from Odd Indexed | | EVX | 100004CB | | | SP | evsubfsmiaaw | Vector Subtract Signed, Modulo, Integer to Accumulator Word | | EVX | 100004C3 | | | SP | evsubfssiaaw | Vector Subtract Signed, Saturate, Integer to Accumulator Word | | EVX | 100004CA | | | SP | evsubfumiaaw | Vector Subtract Unsigned, Modulo, Integer to Accumulator Word | | EVX | 100004C2 | | | SP | evsubfusiaaw | Vector Subtract Unsigned, Saturate, Integer to Accumulator Word | | EVX | 10000204 | | | SP | evsubfw | Vector Subtract from Word | | EVX | 10000206 | | | SP | evsubifw | Vector Subtract Immediate from Word | | EVX | 10000216 | | | SP | evxor | Vector XOR | | X | 7C000774 | SR | | В | extsb[.] | Extend Shign Byte | | X | 7C000734 | SR | | В | extsh[.] | Extend Sign Halfword | | X | 7C0007B4 | SR | | 64 | extsw[.] | Extend Sign Word | | X | 7C0007AC<br>7C0007BE | | Р | B<br>E.PD | icbi<br>icbiep | Instruction Cache Block Invalidate Instruction Cache Block Invalidate by External Process ID | | x | 7C0007BE | | М | ECL | icblc | Instruction Cache Block Lock Clear | | X | 7C0001CC | | IVI | E | icbt | Instruction Cache Block Touch | | X | 7C0003CC | | М | ECL | icbtls | Instruction Cache Block Touch and Lock Set | | Χ | 7C00078C | | Р | E.CI | ici | Instruction Cache Invalidate | | X | 7C0007CC | | Р | E.CD | icread | Instruction Cache Read | | Α | 7C00001E | | | B.in | isel | Integer Select | | X | 7C0000BE | | Р | E.PD | lbepx | Load Byte by External Process ID Indexed | | X | 7C0000EE | | | В | lbzux | Load Byte and Zero with Update Indexed | | X | 7C0000AE | | | В | lbzx | Load Byte and Zero Indexed | | X | 7C0000A8 | | Р | 64<br>E.D.D. | Idarx | Load Doubleword and Reserve Indexed | | X | 7C00003A<br>7C00006A | | | E.PD<br>64 | ldepx<br>ldux | Load Doubleword by External Process ID Indexed Load Doubleword with Update Indexed | | x | 7C00006A<br>7C00002A | | | 64 | ldx | Load Doubleword Indexed | | x | 7C00002A<br>7C0004BE | | Р | E.PD | Ifdepx | Load Floating-Point Double by External Process ID Indexed | | X | 7C0004BE<br>7C0002EE | | | В | Ihaux | Load Halfword Algebraic with Update Indexed | | X | 7C0002AE | | | В | lhax | Load Halfword Algebraic Indexed | | X | 7C00062C | | | В | lhbrx | Load Halfword Byte-Reversed Indexed | | X | 7C00023E | | Р | E.PD | lhepx | Load Halfword by External Process ID Indexed | | X | 7C00026E | | | В | lhzux | Load Halfword and Zero with Update Indexed | | X | 7C00022E | | | В | lhzx | Load Halfword and Zero Indexed | | X | 7C0004AA | | | MA | Iswi | Load String Word Immediate | | X | 7C00042A | | | MA | Iswx | Load String Word Indexed | | X | 7C00000E | | | V | lvebx | Load Vector Element Byte Indexed | | X | 7C00004E | | Р | V<br>E.PD | lvehx | Load Vector Element Halfword Indexed | | X | 7C00024E | | P | E.PD | lvepx<br>lvepxl | Load Vector by External Process ID Indexed Load Vector by External Process ID Indexed LRU | | x | 7C00020E<br>7C00008E | | - | E.P.D<br>V | lvepxi | Load Vector Element Word Indexed | | X | 7C00000E | | | V | Ivsl | Load Vector for Shift Left Indexed | | | , 5500000 | | | ٧ | 1 | | | Texas | | Opcode | | | | | | |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|-------------------|------------|------|------------------|-----------------|-------------------------------------------------| | V | L L | (hexadeci- | ode<br>ep. | ١٧١. | Cat <sup>1</sup> | Mnemonic | Instruction | | X | Ä | mal) <sup>2</sup> | ĎĎ | Ы | | | | | X 7C00028 7C00029 | X | | | | | | | | X 700002A | | | | | | | | | X | X | | | | | | | | X | X | | | | | | Load Word Algebraic with Update Indexed | | | X | | | | - | | | | X | X | | | Г | | - | | | X | \ \ \ \ | | | ۲ | | • | | | X | \<br>\<br>\ | | | | | | | | | Ŷ | | SR | | | | | | X | ^ | 10000138 | OIX | | LIIVI | macchw[o][.] | | | | X | 10000108 | SR | | LIM | macchws[o][ ] | | | X | ^ | 100001100 | 0.1 | | | maconwo[o][.] | | | | X | 10000198 | SR | | LIM | macchwsu[o][1 | | | X | | 10000130 | 0.1 | | | macon mod[o][.] | | | Name | X | 10000118 | SR | | LIM | macchwu[o][.] | | | X | | 10000110 | | | | | | | X | X | 10000058 | SR | | LIM | machhw[o][.] | | | X | | | | | | machhws[o][.] | | | X | | | | | | | | | Number N | Х | 10000098 | SR | | LIM | machhwsu[o][.] | | | X | | | | | | | | | | Χ | 10000018 | SR | | LIM | machhwu[o][.] | | | X | | | | | | | | | X | Χ | 10000358 | SR | | LIM | maclhw[o][.] | | | X | | | | | | | | | X | Χ | 10000398 | SR | | LIM | | | | New York | | | | | | | Unsigned | | XFX 7C0006AC<br>X R E mbar mcrx Memory Barrier XFX 7C000026<br>XFX P E mbar mcrx Move To Condition Register From XER XFX 7C000286<br>XFX P E mfdcr Move From Device Control Register User-mode Indexed XFX 7C000206<br>XFX P E mfdcrx Move From Device Control Register User-mode Indexed XFX 7C000026<br>XFX P E mfdcrx Move From Device Control Register User-mode Indexed XFX 7C100026<br>XFX D B mfmsr Move From Device Control Register Indexed XFX 7C00026<br>XFX D B mfspr Move From One Condition Register Field XFX 7C00026<br>XFX D B mfspr Move From Performance Monitor Register XFX 7C000126<br>XFX P E PEPC msgclr Message Clear XFX 7C000126<br>XFX P E Midcrux Move To Device Control Register XFX 7C000126<br>XFX P E midcrux Move To Condition Registe | Χ | 10000318 | SR | | LIM | maclhwu[o][.] | Multiply Accumulate Low Halfword to Word Modulo | | X 70000400 / XFX B mcxr mfcr mfcr Move To Condition Register From XER XFX / 70000286 / XFX / 70000246 / XFX / 70000246 / XFX / 7000026 70000290 / XFX / 70000290 / XFX / 70000290 / XFX / 70000100 / XFX / 70000100 / XFX / 70000100 / XFX / 70000100 / XFX / 70000120 / XFX / 70000120 / XFX / 70000346 | | | | | | | Unsigned | | XFX 7c000026 / XFX P P E E mfdcr mfdcr mfdcr Move From Device Control Register User-mode Indexed XFX 7c000246 / XFX P E E mfdcr mfdcr mfdcr Move From Device Control Register User-mode Indexed XFX 7c000206 / XFX P E E mfdcrx Move From Device Control Register Indexed XFX 7c000026 / XFX P B B mfsor Move From Device Control Register Indexed XFX 7c100026 / XFX P B B mfsor Move From Device Control Register Indexed XFX 7c00029C / XFX O B B mfspr Move From Device Control Register Field XFX 7c000126 / XFX P E PM Move From Device Control Register Field XFX 7c000120 / XFX P E PC Move From Device Control Register Move From Device Control Register Field XFX 7c000120 / XFX P E PC Move From Performance Monitor Register Move From Performance Monitor Register XFX 7c000346 / XFX P E MTdcr Move To Device Control Register Fields Move To Device Control Register User-mode Indexed XFX 7c000124 / XFX P E MTdcr Move To Device Control Register Fields Move To Device Control Register Move To Device Control Register XFX 7c000124 / XFX P E MTdcr Move To Move To Move To Move To Device Control Register Move To Device Control Register Move To Device | | 7C0006AC | | | | mbar | | | XFX 70000286 P P E mfdcr Move From Device Control Register XFX 70000246 P P E mfdcrux Move From Device Control Register User-mode Indexed XFX 7000026 P B mfmsr Move From Device Control Register Indexed XFX 7000026 P B mfmsr Move From Device Control Register Indexed XFX 7000026 D B mfmsr Move From Device Control Register Indexed XFX 7000026 D B mfmsr Move From Device Control Register Indexed VX 10000604 D B mfspr Move From Performance Monitor Register VX 10000604 D P E.PC mfspr Move From Device Control Register XFX 70000102 P E.PC mfspr Move To Device Control Register Fields XFX 7000036 P E mtdcr Move To Device Control Register User-mode Indexed XFX 70000124 P E E mtdcr | | 7C000400 | | | | - | | | XFX 7C000246 XFX P E E Mfdcrux Move From Device Control Register User-mode Indexed Move From Device Control Register User-mode Indexed Move From Device Control Register Indexed Move From Device Control Register Indexed Move From Device Control Register Indexed Move From Performance Monitor Register Move From Special Purpose Register Move From Special Purpose Register Move From Special Purpose Register Move From Vector Status and Control Register Move From Vector Status and Control Register Move To Device Control Register Indexed Move To Device Control Register Indexed Move To Device Control Register Fields X 700019C XFX 700019C XFX 7000346 X 70000346 X 70000346 X 7000032 XFX 70000346 X 7000032 XFX 70000346 X 7000032 XFX 7000346 X 7000032 XFX 7000346 X 7000032 XFX 7000346 X 70000346 700 | | | | _ | В | | | | XFX 7C0000206 P B mfdcrx Move From Device Control Register Indexed XFX 7C000026 O B mfmsr Move From Device Control Register Indexed XFX 7C00002A6 O E.PM Move From One Condition Register Field XFX 7C0001A6 O B mfspr Move From Device Control Register VX 10000604 V Move From Special Purpose Register X 7C0001DC P E.PC msgclr Message Clear X 7C000120 P E.PC msgclr Message Send XFX 7C000346 P E mtdcr Move To Device Control Register Fields X 7C000346 P E mtdcrx Move To Device Control Register User-mode Indexed XFX 7C000346 P E mtdcrx Move To Device Control Register Field XFX 7C000346 P E mtdcrx Move To Device Control Register Indexed XFX 7C000350 D E M Move | | | | | E | | | | X 7C0000A6 XFX P CO0029C B Mfmsr Mfmsr Move From Machine State Register Move From Machine State Register XFX 7C00029C O B EPM Mfmr Mfmr Move From Performance Monitor Register Move From Performance Monitor Register VX 7C0001DC X 7C0001DC X 7C00019C P EPC M Mfvscr Move From Performance Monitor Register Move From Performance Monitor Register XFX 7C0001DC X 7C00019C P EPC M Mfvscr Move From Vector Status and Control Register XFX 7C000120 M A 7C000386 P EPC M Mfvscr Move To Device Control Register Fields XFX 7C000346 M A 7C000346 M A 7C00039C P E Mtdcrx Move To Device Control Register User-mode Indexed XFX 7C00039C MARY TOWN TOWN TOWN TOWN TOWN TOWN TOWN TOWN | | | | | ᆫ | | | | XFX 7C100026 B mfoorf Move From One Condition Register Field XFX 7C0002A6 V mfspr Move From Performance Monitor Register VX 10000604 V mfspr Move From Special Purpose Register VX 7C0001DC P EPC msgclr Message Clear X 7C00019C P EPC msgsnd Message Send XFX 7C000386 P E mtdcr Move To Device Control Register Fields X 7C000346 P E mtdcrx Move To Device Control Register User-mode Indexed X 7C000306 P E mtdcrx Move To Device Control Register Indexed X 7C000306 P E mtdcrx Move To Device Control Register Indexed X 7C000306 P E mtmsr Move To One Condition Register Field XFX 7C00039C D B mtspr Move To One Condition Register Field XFX 7C000366 V B mtspr Move To One Condition Register Fields X | | | | | | | | | XFX 7c00029C Compose of the control | | | | ۲ | | _ | | | XFX 7C0002A6 V B mfspr mfvscr msgslr mfvscr Move From Special Purpose Register X 7C0001DC P EPC msgslr mfvscr Message Clear XFX 7C000120 AFT P EPC EPC msgsnd mtcrf Message Clear XFX 7C000386 AFT P E mtdcr mtdcr Move To Condition Register Fields X 7C000346 AFT P E mtdcrx mtdcrx Move To Device Control Register User-mode Indexed X 7C000346 AFT P E mtdcrx mtdcrx Move To Device Control Register Indexed X 7C000326 AFT P E mtmsr mtdcrx Move To Device Control Register Indexed XFX 7C000396 AFT P E mtmsr mtcrf Move To One Condition Register Field XFX 7C000396 AFT SR B mtmpr mtmpr Move To Performance Monitor Register X 10000150 AFT SR LIM mulchw[o][.] Multiply Cross Halfword to Word Signed X 10000150 AFT SR B H Multiply High Doubleword X | | | | | | | | | VX10000604<br>XPROCEDIAL<br>TOOO01DC<br>XVMfvscr<br>Msgsnd<br>Mcssage Clear<br>Message SendMove from Vector Status and Control Register<br>Message Clear<br>Message SendXFX70000120<br>XFXBmtcrf<br>BMove To Device Control Register FieldsXFX70000346<br>XPEmtdcrux<br>Move To Device Control Register User-mode IndexedX70000346<br>XPEmtdcrux<br>Move To Device Control Register User-mode IndexedX70000124<br>XFXPEmtdcrx<br>Move To Device Control Register IndexedXFX70000390<br>XFXOE.PM<br>Bmtpmr<br>Move To One Condition Register FieldXFX70003A6<br>VXOE.PM<br>Bmtpmr<br>Move To Performance Monitor Register<br>Move To Special Purpose Register<br>Move To Special Purpose Register<br>Move To Special Purpose Register<br>Move to Vector Status and Control Register<br>Move To Special Purpose Register<br>Move to Vector Status and Control RegisterX10000150<br>XOSR<br>TO000092<br>XOLIM<br>TO000010<br>SR<br>XLIM<br>Multiply High Doubleword<br>Multiply High Doubleword Unsigned<br>Multiply High Doubleword to Word Signed<br>Multiply High Halfword to Word Unsigned<br>Multiply High Halfword to Word UnsignedXO70000016<br>XOSR<br>TO000016B<br>R<br> | | | | | | | | | X 7C0001DC<br>7C00019C P<br>EPC<br>No. EPC<br>EPC<br>P<br>EPC msgclr<br>msgsnd<br>mtcrf Message Clear<br>Move To Condition Register Fields XFX 7C000386<br>7C000346 P<br>E<br>E<br>E<br>Mtdcrux Move To Device Control Register User-mode Indexed X 7C000306<br>7C000124 P<br>E<br>E<br>E<br>Mtdcrux Move To Device Control Register User-mode Indexed XFX 7C100120<br>7C00039C P<br>E<br>E<br>Mtdcrux Move To Device Control Register Indexed XFX 7C00039C<br>7C00039C D<br>E<br>B<br>Mtdcr Move To Machine State Register XFX 7C00039C<br>7C0000346 D<br>EPM<br>Mtmpmr Move To Performance Monitor Register Field XFX 7C0003A6<br>7C00003E D<br>EPM<br>Mtliply Cross Halfword to Word Signed X 10000150<br>7C000002 SR<br>A<br>TO0000050<br>SR<br>X SR<br>A<br>TO0000050<br>SR<br>A<br>TO000006 SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO0000060<br>SR<br>A<br>TO000060<br>SR<br>B<br>B<br>Multiply High Word Unsigned Multiply High Word Unsigned MUltiply High Word Unsigned Multiply High Word Unsigned Multiply High Word Unsigned | | | | | | _ | l | | X7C00019C<br>7C000120<br>XFXPE.P.C<br>B<br>Move To Condition Register Fields<br>Move To Device Control Register User-mode Indexed<br>Move To Device Control Register User-mode Indexed<br>Move To Device Control Register User-mode Indexed<br>Move To Device Control Register User-mode Indexed<br>Move To Device Control Register Indexed<br>Move To Device Control Register Indexed<br>Move To Device Control Register Indexed<br>Move To Device Control Register IndexedX 7C000124<br>XFX<br>7C00039C<br>XFX<br>7C00039C<br>XFX<br>X 10000644<br>X 10000150<br>X 10000110<br>X 10000110<br>X 10000010<br>X 1000010<br>X 1000010 <br< td=""><td></td><td></td><td></td><td>Р</td><td></td><td></td><td></td></br<> | | | | Р | | | | | XFX 7C000120 XFX 7C000386 X 7C000346 X 7C000346 X 7C000306 X 7C000124 X 7C000124 X 7C000390 XFX 7C000390 XFX 7C000390 XFX 7C000390 XFX 7C000346 X 10000644 X 10000644 X 10000644 X 10000644 X 10000110 X 10000644 X 10000110 X 10000644 X 10000110 X 10000644 X 10000110 X 10000010 10000000000 | | | | | | | | | XFX | | | | | | | | | X 7C000346 X 7C000306 X 7C000124 P E mtdcrx Move To Device Control Register User-mode Indexed Move To Device Control Register User-mode Indexed Move To Device Control Register Indexe | | | | Р | Е | | | | X 7C000306 P E mtdcrx Move To Device Control Register Indexed XFX 7C000124 P E mtmsr Move To Machine State Register XFX 7C00039C O E.PM mtpmr Move To One Condition Register Field XFX 7C0003A6 O E.PM mtpmr Move To Performance Monitor Register VX 10000644 V mtvscr Move To Special Purpose Register X 10000150 SR LIM mulchw[o][.] Multiply Cross Halfword to Word Signed X 10000110 SR LIM mulhd[.] Multiply High Doubleword XO 7C000092 SR 64 mulhd[.] Multiply High Doubleword Unsigned X 10000050 SR LIM mulhhw[o][.] Multiply High Halfword to Word Signed X 10000010 SR B B Multiply High Word XO 7C000096 SR B B Multiply High Word XO 7C0000016 SR B | | 7C000346 | | | E | mtdcrux | | | XFX 7C100120 B Mtocrf Move To One Condition Register Field XFX 7C000336 O B mtpmr Move To Performance Monitor Register XFX 7C0003A6 O B mtspr Move To Special Purpose Register VX 10000150 SR LIM mulchw[o][.] Multiply Cross Halfword to Word Signed X 10000110 SR LIM mulchw[o][.] Multiply High Doubleword XO 7C000092 SR 64 mulhd[.] Multiply High Doubleword XO 7C000012 SR LIM mulhhw[o][.] Multiply High Halfword to Word Signed X 10000050 SR LIM mulhhw[o][.] Multiply High Halfword to Word Unsigned XO 7C000096 SR B B mulhw[.] Multiply High Word XO 7C000016 SR B B mulhw[.] Multiply High Word Unsigned | Χ | 7C000306 | | Р | Е | mtdcrx | | | XFX 7C00039C Compose the control of | Χ | 7C000124 | | Р | E | mtmsr | Move To Machine State Register | | XFX 7C0003A6 VX 10000644 LIM V mtspr mtvscr mtvscr Move To Special Purpose Register Move to Vector Status and Control Multiply High Doubleword Unsigned Multiply High Doubleword Signed Multiply High Doubleword Unsigned Multiply High Halfword to Word Unsigned Multiply High Word Multiply High Word Unsigned Mult | | | | | | | | | VX 10000644 | | | | | | | | | X 10000150 SR 10000110 SR XO 70000092 SR XO XO 70000012 SR XO 10000010 SR XO 70000010 SR XO 70000010 SR XO 70000016 SR XO 70000016 SR XO 70000016 SR XO 70000016 SR XO 70000016 SR B B mulhwu[.] LIM mulchwu[o][.] mulc | | | | 0 | | | | | X 10000110 SR 70000092 LIM 64 mulchwu[0][.] mulhd[.] Multiply Cross Halfword to Word Unsigned Multiply High Doubleword XO 70000012 SR 10000050 SR LIM mulhhw[0][.] mulhhw[0][.] Multiply High Doubleword Unsigned Multiply High Halfword to Word Signed Multiply High Halfword to Word Signed Multiply High Halfword to Word Unsigned Multiply High Halfword to Word Unsigned Multiply High Word Multiply High Word Multiply High Word Multiply High Word Multiply High Word Unsigned XO 70000016 SR B B mulhwu[.] B mulhwu[.] Multiply High Word Unsigned Multiply High Word Unsigned | | | 0.5 | | | | | | XO 7c000092 SR 7c000012 64 64 Multiply High Doubleword Multiply High Doubleword Unsigned XO 7c000012 SR 10000050 SR LIM mulhhw[o][.] mulhhw[o][.] Multiply High Doubleword Unsigned X 10000010 SR SR V C000096 LIM mulhhw[o][.] mulhw[o][.] Multiply High Halfword to Word Unsigned XO 7c000016 SR B B mulhw[.] Multiply High Word XO 7c000016 SR B B mulhw[.] Multiply High Word Unsigned | X | | | | | | | | XO 7c000012 SR 10000050 SR 10000010 SR XO SR LIM mulhhw[o][.] mulhhw[o][.] mulhhw[o][.] mulhw[o][.] mul | | | | | | | | | X | | | | | | | | | X 10000010 SR LIM mulhhwu[0][.] Multiply High Halfword to Word Unsigned Multiply High Word Multiply High Word Multiply High Word Multiply High Word Multiply High Word Unsigned | | | | | | | | | XO 7C000096 SR B mulhw[.] Multiply High Word XO 7C000016 SR B mulhwu[.] Multiply High Word Unsigned | | | | | | | | | XO 7C000016 SR B mulhwu[.] Multiply High Word Unsigned | | | | | | | | | | | | | | | | | | | XO | | SR | | | mulld[o][.] | Multiply Low Doubleword | | | Oncode | | | | 1 | Т | |------------|-------------------|---------------------------|-----|------------------|-------------------|-----------------------------------------------------------------| | Form | Opcode (hexadeci- | Mode<br>Dep. <sup>1</sup> | _> | Cat <sup>1</sup> | Mnemonic | Instruction | | <u>R</u> | mal) <sup>2</sup> | Mo<br>De | Pri | Cat | Willemonic | ilistruction | | хо | 7C0001D6 | SR | | В | mullw[o][.] | Multiply Low Word | | X | 7C0001B0 | SR | | В | nand[.] | NAND | | X | 7C0000D0 | SR | | В | neg[o][.] | Negate | | X | 1000015C | SR | | LIM | nmacchw[o][.] | Negative Multiply Accumulate Cross Halfword to Word Mod- | | | | | | | [-][1 | ulo Signed | | Χ | 100001DC | SR | | LIM | nmacchws[o][.] | Negative Multiply Accumulate Cross Halfword to Word Sat- | | | | | | | | urate Signed | | Χ | 1000005C | SR | | LIM | nmachhw[o][.] | Negative Multiply Accumulate High Halfword to Word Mod- | | | | | | | | ulo Signed | | Χ | 100000DC | SR | | LIM | nmachhws[o][.] | Negative Multiply Accumulate High Halfword to Word Satu- | | | | | | | | rate Signed | | X | 1000035C | SR | | LIM | nmaclhw[o][.] | Negative Multiply Accumulate Low Halfword to Word Mod- | | | | | | | | ulo Signed | | X | 100003DC | SR | | LIM | nmaclhws[o][.] | Negative Multiply Accumulate Low Halfword to Word Satu- | | | | | | | | rate Signed | | X | 7C0000F8 | SR | | В | nor[.] | NOR | | X | 7C000378 | SR | | В | or[.] | OR | | X | 7C000338 | SR | | В | orc[.] | OR with Complement | | RR | 7C0000F4 | | | B<br>VLE | popcntb<br>se_add | Population Count Bytes Add Short Form | | OIM5 | 2000 | | | VLE | se_addi | Add Immediate Short Form | | RR | 4600 | SR | | VLE | se_addi | AND Short Form | | RR | 4500 | Oix | | VLE | se_andc | AND with Complement Short Form | | IM5 | 2E00 | | | VLE | se_andi | AND Immediate Short Form | | BD8 | E800 | | | VLE | se_b[l] | Branch [and Link] | | BD8 | E000 | | | VLE | se_bc | Branch Conditional Short Form | | IM5 | 6000 | | | VLE | se_bclri | Bit Clear Immediate | | С | 0006 | | | VLE | se_bctr | Branch To Count Register [and Link] | | IM5 | 6200 | | | VLE | se_bgeni | Bit Generate Immediate | | С | 0004 | | | VLE | se_blr | Branch To Link Register [and Link] | | IM5 | 2C00 | | | VLE | se_bmaski | Bit Mask Generate Immediate | | IM5 | 6400 | | | VLE | se_bseti | Bit Set Immediate | | IM5<br>RR | 6600 | | | VLE<br>VLE | se_btsti | Bit Test Immediate | | RR | 0C00<br>0E00 | | | VLE | se_cmp<br>se_cmph | Compare Word Compare Halfword Short Form | | RR | 0F00 | | | VLE | se_cmphl | Compare Halfword Logical Short Form | | IM5 | 2A00 | | | VLE | se_cmpi | Compare Immediate Word Short Form | | RR | 0D00 | | | VLE | se_cmpl | Compare Logical Word | | OIM5 | 2200 | | | VLE | se_cmpli | Compare Logical Immendiate Word | | R | 00D0 | | | VLE | se_extsb | Extend Sign Byte Short Form | | R | 00F0 | | | VLE | se_extsh | Extend Sign Halfword Short Form | | R | 00C0 | | | VLE | se_extzb | Extend Zero Byte | | R | 00E0 | | | VLE | se_extzh | Extend Zero Halfword | | С | 0000 | | | VLE | se_illegal | Illegal | | C<br>SD4 | 0001 | | | VLE | se_isync | Instruction Synchronize | | | 8000 | | | VLE | se_lbz | Load Byte and Zero Short Form Load Halfword and Zero Short Form | | SD4<br>IM7 | A000 | | | VLE<br>VLE | se_lhz | Load Immediate Short Form | | SD4 | 4800<br>C000 | | | VLE | se_li<br>se_lwz | Load Word and Zero Short Form | | RR | 0300 | | | VLE | se_iwz<br>se_mfar | Move from Alternate Register | | R | 00A0 | | | VLE | se_mfctr | Move From Count Register | | R | 0080 | | | VLE | se_mflr | Move From Link Register | | RR | 0100 | | | VLE | se_mr | Move Register | | RR | 0200 | | | VLE | se_mtar | Move To Alternate Register | | R | 00B0 | | | VLE | se_mtctr | Move To Count Register | | R | 0090 | | | VLE | se_mtlr | Move To Link Register | | RR | 0500 | | | VLE | se_mullw | Multiply Low Word Short Form | | R | 0030 | | | VLE | se_neg | Negate Short Form | | R<br>RR | 0020 | | | VLE<br>VLE | se_not | NOT Short Form OR SHort Form | | C | 4400<br>0009 | | Р | VLE | se_or | Return From Critical Interrupt | | | 0009 | | ľ | ٧LC | se_rfci | I verani i Toni Oniicai interrupt | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep.¹ | Priv | Cat <sup>1</sup> | Mnemonic | Instruction | |-----------|-------------------------------------------|---------------|------|------------------|----------------------------|------------------------------------------------------------------------------------| | С | 000A | | Р | VLE | se_rfdi | Return From Debug Interrupt | | C | 0008 | | Р | VLE | se_rfi | Return from Interrupt | | С | 000B | | Р | VLE | se_rfmci | Return From Machine Check Interrupt | | С | 0002 | | | VLE | se_sc | System Call | | RR | 4200 | | | VLE | se_slw | Shift Left Word | | IM5 | 6C00 | | | VLE | se_slwi | Shift Left Word Immediate Short Form | | RR | 4100 | SR | | VLE | se_sraw | Shift Right Algebraic Word | | IM5 | 6A00 | SR | | VLE | se_srawi | Shift Right Algebraic Immediate | | RR | 4000 | | | VLE | se_srw <sub>.</sub> | Shift Right Word | | IM5 | 6800 | | | VLE | se_srwi | Shift Right Word Immediate Short Form | | SD4 | 9000 | | | VLE | se_stb | Store Byte Short Form | | SD4 | B000 | | | VLE<br>VLE | se_sth | Store Halfword Short Form | | SD4<br>RR | D000 | | | VLE | se_stw<br>se_sub | Store Word Short Form Subtract | | RR | 0700 | | | VLE | se_subf | Subtract From Short Form | | OIM5 | 2400 | SR | | VLE | se_subi[.] | Subtract Immediate | | X | 7C000036 | SR | | 64 | sld[.] | Shift Left Doubleword | | X | 7C000030 | SR | | В | slw[.] | Shift Left Word | | X | 7C000634 | SR | | 64 | srad[.] | Shift Right Algebraic Doubleword | | X | 7C000674 | SR | | 64 | sradi[.] | Shift Right Algebraic Doubleword Immediate | | X | 7C000630 | SR | | В | sraw[.] | Shift Right Algebraic Word | | Χ | 7C000670 | SR | | В | srawi[.] | Shift Right Algebraic Word Immediate | | X | 7C000436 | SR | | 64 | srd[.] | Shift Right Doubleword | | Χ | 7C000430 | SR | | В | srw[.] | Shift Right Word | | X | 7C0001BE | | Р | E.PD | stbepx | Store Byte by External Process ID Indexed | | Χ | 7C0001EE | | | В | stbux | Store Byte with Update Indexed | | Χ | 7C0001AE | | | В | stbx | Store Bye Indexed | | X | 7C0001AD | | | 64 | stdcx. | Store Doubleword Conditional Indexed | | X | 7C00013A | | Р | E.PD | stdepx | Store Doubleword by External Process ID Indexed | | X | 7C00016A | | | 64 | stdux | Store Doubleword with Update Indexed | | X | 7C00012A | | _ | 64 | stdx | Store Doubleword Indexed | | X | 7C0005BE | | Р | E.PD | stfdepx | Store Floating-Point Double by External Process ID Indexed | | X | 7C00072C | | Р | B<br>E.PD | sthony | Store Halfword Byte-Reversed Indexed Store Halfword by External Process ID Indexed | | X | 7C00033E<br>7C00036E | | ' | В | sthepx<br>sthux | Store Halfword with Update Indexed | | X | 7C00036E | | | В | sthx | Store Halfword Indexed | | X | 7C0005AA | | | MA | stswi | Store String Word Immediate | | X | 7C00052A | | | MA | stswx | Store String Word Indexed | | VX | 7C00010E | | | V | stvebx | Store Vector Element Byte Indexed | | VX | 7C00014E | | | V | stvehx | Store Vector Element Halfword Indexed | | X | 7C00064E | | Р | E.PD | stvepx | Store Vector by External Process ID Indexed | | Χ | 7C00060E | | Р | E.PD | stvepxl | Store Vector by External Process ID Indexed LRU | | VX | 7C00018E | | | V | stvewx | Store Vector Element Word Indexed | | VX | 7C0001CE | | | V | stvx[l] | Store Vector Indexed [Last] | | X | 7C00052C | | | В | stwbrx | Store Word Byte-Reversed Indexed | | X | 7C00012D | | _ | В | stwcx. | Store Word Conditional Indexed | | X | 7C00013E | | Р | E.PD | stwepx | Store Word by External Process ID Indexed | | X | 7C00016E | | | В | stwux | Store Word with Update Indexed | | X | 7C00012E | en. | | В | stwx | Store Word Indexed | | XO<br>XO | 7C000050 | SR | | B<br>B | subf[o][.] | Subtract From Carrying | | XO | 7C000010<br>7C000110 | SR<br>SR | | В | subfc[o][.]<br>subfe[o][.] | Subtract From Carrying Subtract From Extended | | XO | 7C000110<br>7C0001D0 | SR | | В | subfme[o][.] | Subtract From Minus One Extended | | XO | 7C0001D0<br>7C000190 | SR | | В | subfine[0][.] | Subtract From Zero Extended | | X | 7C000190 | UI. | | В | sync | Synchronize | | X | 7C0004AC | | | 64 | td | Trap Doubleword | | X | 7C000624 | | Р | E | tlbivax | TLB Invalidate Virtual Address Indexed | | X | 7C000764 | | P | Ē | tlbre | TLB Read Entry | | X | 7C000724 | | P | Ē | tlbsx | TLB Search Indexed | | Χ | 7C00046C | | Р | Ε | tlbsync | TLB Synchronize | | X | 7C0007A4 | | Ρ | Е | tlbwe | TLB Write Entry | | | LINCAGA | 1 | | | | | |----------|-------------------------------------------|-----------|-----|------------------|-------------------------|-----------------------------------------------------------------------------------------| | ΙĘ | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | de<br>p.1 | _ | Cat <sup>1</sup> | Mnemonic | Instruction | | Form | mal) <sup>2</sup> | Mo<br>De | Pri | Cat | Willelliollic | ilisti uction | | X | 7C000008 | | | В | tw | Trap Word | | VX | 10000180 | | | ٧ | vaddcuw | Vector Add Carryout Unsigned Word | | VX | 10000100 | | | V | vaddfp | Vector Add Floating-Point | | VX | 1000000A | | | V | vaddsbs | Vector Add Signed Byte Saturate | | VX | 10000340 | | | V | vaddshs | Vector Add Signed Halfword Saturate | | VX | 10000380 | | | V | vaddsws | Vector Add Signed Word Saturate | | VX | 10000000 | | | V | vaddubm | Vector Add Unsigned Byte Modulo | | VX | 10000200 | | | V | vaddubs | Vector Add Unsigned Byte Saturate | | VX | 10000040 | | | V | vadduhm | Vector Add Unsigned Halfword Modulo | | VX | 10000240 | | | V | vadduhs | Vector Add Unsigned Halfword Saturate | | VX | 10000080 | | | V | vadduwm | Vector Add Unsigned Word Modulo | | VX | 10000280 | | | V | vadduws | Vector Add Unsigned Word Saturate | | VX | 10000404 | | | V | vand | Vector AND | | VX | 10000444 | | | V | vandc | Vector AND with Complement | | VX | 10000502 | | | V | vavgsb | Vector Average Signed Byte | | VX | 10000542 | | | V | vavgsh | Vector Average Signed Halfword | | VX | 10000582 | | | V | vavgsw | Vector Average Signed Word | | VX | 10000402 | | | V | vavgub | Vector Average Unsigned Byte | | VX | 10000442 | | | V | vavguh | Vector Average Unsigned Halfword | | VX | 10000482 | | | V | vavguw | Vector Average Unsigned Word | | VX | 100003CA | | | V | vcfpsxws | Vector Convert from Single-Precision to Signed Fixed-Point | | | | | | | | Word Saturate | | VX | 1000038A | | | V | vcfpuxws | Vector Convert from Single-Precision to Unsigned Fixed- | | | | | | | | Point Word Saturate | | VX | 100003C6 | | | V | vcmpbfp[.] | Vector Compare Bounds Single-Precision | | VC | 100000C6 | | | V | vcmpeqfp[.] | Vector Compare Equal To Single-Precision | | VC | 10000006 | | | V | vcmpequb[.] | Vector Compare Equal To Unsigned Byte | | VC | 10000046 | | | V | vcmpequh[.] | Vector Compare Equal To Unsigned Halfword | | VC | 10000086 | | | V | vcmpequw[.] | Vector Compare Equal To Unsigned Word | | VC | 100001C6 | | | V | vcmpgefp[.] | Vector Compare Greater Than or Equal To Single-Precision | | VC | 100002C6 | | | V | vcmpgtfp[.] | Vector Compare Greater Than Single-Precision | | VC | 10000306 | | | V | vcmpgtsb[.] | Vector Compare Greater Than Signed Byte | | VC<br>VC | 10000346 | | | V<br>V | vcmpgtsh[.] | Vector Compare Greater Than Signed Halfword | | VC | 10000386 | | | V<br>V | vcmpgtsw[.] | Vector Compare Greater Than Signed Word | | VC | 10000206 | | | V | vcmpgtub[.] vcmpgtuh[.] | Vector Compare Greater Than Unsigned Byte Vector Compare Greater Than Unsigned Halfword | | VC | 10000246 | | | V | vcmpgtun[.] | Vector Compare Greater Than Unsigned Hallword | | VX | 10000288 | | | V | vcmpgtaw[.] | Vector Convert from Signed Fixed-Point Word to Single- | | \ \ \ | 1000034A | | | V | VCSXWIP | Precision | | VX | 1000030A | | | V | vcuxwfp | Vector Convert from Unsigned Fixed-Point Word to Single- | | \ \ \ | 1000030A | | | V | VCUXWIP | Precision | | VX | 1000018A | | | V | vexptefp | Vector 2 Raised to the Exponent Estimate Floating-Point | | VX | 1000018A | | | V | vlogefp | Vector Log Base 2 Estimate Floating-Point | | VA | 100001CA | | | V | vmaddfp | Vector Multiply-Add Single-Precision | | VX | 1000002E | | | V | vmaxfp | Vector Maximum Single-Precision | | VX | 1000040A | | | V | vmaxsb | Vector Maximum Signed Byte | | VX | 10000102 | | | V | vmaxsh | Vector Maximum Signed Byte Vector Maximum Signed Halfword | | VX | 10000112 | | | V | vmaxsw | Vector Maximum Signed Word | | VX | 10000002 | | | V | vmaxub | Vector Maximum Unsigned Byte | | VX | 10000042 | | | V | vmaxuh | Vector Maximum Unsigned Halfword | | VX | 10000082 | | | V | vmaxuw | Vector Maximum Unsigned Word | | VA | 10000020 | | | V | vmhaddshs | Vector Multiply-High-Add Signed Halfword Saturate | | VA | 10000021 | | | V | vmhraddshs | Vector Multiply-High-Round-Add Signed Halfword Saturate | | VX | 1000044A | | | V | vminfp | Vector Minimum Single-Precision | | VX | 10000302 | | | V | vminsb | Vector Minimum Signed Byte | | VX | 10000342 | | | V | vminsh | Vector Minimum Signed Halfword | | VX | 10000382 | | | V | vminsw | Vector Minimum Signed Word | | VX | 10000202 | | | V | vminub | Vector Minimum Unsigned Byte | | VX | 10000242 | | | V | vminuh | Vector Minimum Unsigned Halfword | | VX | 10000282 | | | V | vminuw | Vector Minimum Unsigned Word | | VA | 10000022 | | | V | vmladduhm | Vector Multiply-Low-Add Unsigned Halfword Modulo | | | Opcode | 4- | | | | |----------|----------------------|-----------------------------------|------------------|----------------------|---------------------------------------------------------------------------| | Form | (hexadeci- | Mode<br>Dep. <sup>1</sup><br>Priv | Cat <sup>1</sup> | Mnemonic | Instruction | | 요 | mal) <sup>2</sup> | P De | Jul | I IIII CIII CIII C | mon donon | | VX | 1000000C | | V | vmrghb | Vector Merge High Byte | | VX | 1000004C | | V | vmrghh | Vector Merge High Halfword | | VX | 1000008C | | V | vmrghw | Vector Merge High Word | | VX | 1000010C | | V | vmrglb | Vector Merge Low Byte | | VX | 1000014C | | V | vmrglh | Vector Merge Low Halfword | | VX | 1000018C | | V | vmrglw | Vector Merge Low Word | | VA | 10000025 | | V | vmsummbm | Vector Multiply-Sum Mixed Byte Modulo | | VA | 10000028 | | V | vmsumshm | Vector Multiply-Sum Signed Halfword Modulo | | VA | 10000029 | | V | vmsumshs | Vector Multiply-Sum Signed Halfword Saturate | | VA | 10000024 | | V | vmsumubm | Vector Multiply-Sum Unsigned Byte Modulo | | VA | 10000026 | | V | vmsumuhm | Vector Multiply-Sum Unsigned Halfword Modulo | | VA | 10000027 | | V | vmsumuhs | Vector Multiply-Sum Unsigned Halfword Saturate | | VX | 10000308 | | V | vmulesb | Vector Multiply Even Signed Byte | | VX | 10000348 | | V | vmulesh | Vector Multiply Even Signed Halfword | | VX | 10000208 | | V | vmuleub | Vector Multiply Even Unsigned Byte | | VX | 10000248 | | V | vmuleuh | Vector Multiply Even Unsigned Halfword | | VX | 10000108 | | V | vmulosb | Vector Multiply Odd Signed Byte | | VX | 10000148 | | V | vmulosh | Vector Multiply Odd Signed Halfword | | VX | 10000008 | | V | vmuloub | Vector Multiply Odd Unsigned Byte | | VX | 10000048 | | V | vmulouh | Vector Multiply Odd Unsigned Halfword | | VA | 1000002F | | V | vnmsubfp | Vector Negative Multiply-Subtract Single-Precision | | VX | 10000504 | | V | vnor | Vector NOR<br>Vector OR | | VX<br>VA | 10000484 | | V | vor | Vector Permute | | VX | 1000002B<br>1000030E | | V | vperm<br>vpkpx | Vector Pack Pixel | | VX | 1000030E | | V | vpkshss | Vector Pack Signed Halfword Signed Saturate | | VX | 1000018E | | v | vpkshus | Vector Pack Signed Halfword Unsigned Saturate | | VX | 1000010E | | V | vpkswss | Vector Pack Signed Word Signed Saturate | | VX | 1000016E | | v | vpkswus | Vector Pack Signed Word Unsigned Saturate | | VX | 10000011E | | v | vpkuhum | Vector Pack Unsigned Halfword Unsigned Modulo | | VX | 1000008E | | V | vpkuhus | Vector Pack Unsigned Halfword Unsigned Saturate | | VX | 1000004E | | V | vpkuwum | Vector Pack Unsigned Word Unsigned Modulo | | VX | 100000CE | | V | vpkuwus | Vector Pack Unsigned Word Unsigned Saturate | | VX | 1000010A | | V | vrefp | Vector Reciprocal Estimate Single-Precision | | VX | 100002CA | | V | vrfim | Vector Round to Single-Precision Integer toward -Infinity | | VX | 1000020A | | V | vrfin | Vector Round to Single-Precision Integer Nearest | | VX | 1000028A | | V | vrfip | Vector Round to Single-Precision Integer toward +Infinity | | VX | 1000024A | | V | vrfiz | Vector Round to Single-Precision Integer toward Zero | | VX | 10000004 | | V | vrlb | Vector Rotate Left Byte | | VX | 10000044 | | V | vrlh | Vector Rotate Left Halfword | | VX | 10000084 | | V | vrlw | Vector Rotate Left Word | | VX | 1000014A | | V | vrsqrtefp | Vector Reciprocal Square Root Estimate Single-Precision | | VA | 1000002A | | V | vsel | Vector Select | | VX | 100001C4 | | V | vsl | Vector Shift Left | | VX | 10000104 | | V | vslb | Vector Shift Left Byte | | VA | 1000002C | | V | vsldoi | Vector Shift Left Double by Octet Immediate | | VX | 10000144 | | V | vslh | Vector Shift Left Halfword | | VX | 1000040C | | V | vslo | Vector Shift Left by Octet | | VX | 10000184 | | V | vslw | Vector Shift Left Word | | VX | 1000020C | | V | vspltb | Vector Splat Halfward | | VX | 1000024C | | V | vsplth | Vector Splat Immediate Signed Bute | | VX<br>VX | 1000030C | | V | vspltisb<br>vspltish | Vector Splat Immediate Signed Byte Vector Splat Immediate Signed Halfword | | VX | 1000034C<br>1000038C | | V | vspitism | Vector Splat Immediate Signed Hallword Vector Splat Immediate Signed Word | | VX | 1000038C | | V | vspitisw | Vector Splat Word | | VX | 1000028C | | V | vspitw | Vector Shift Right | | VX | 10000204 | | V | vsrab | Vector Shift Right Algebraic Word | | VX | 10000304 | | V | vsrah | Vector Shift Right Algebraic Word | | VX | 10000344 | | V | vsraw | Vector Shift Right Algebraic Word | | VX | 10000384 | | V | vsrb | Vector Shift Right Byte | | VX | 10000204 | | ľ | vsrh | Vector Shift Right Halfword | | ٧X | 10000244 | | V | vsrn | vector Snift Right Haifword | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | lode<br>ep.1 | 'riv' | Cat <sup>1</sup> | Mnemonic | Instruction | |------|-------------------------------------------|--------------|-------|------------------|----------|----------------------------------------------------| | | , | 2 | 4 | | | | | VX | 1000044C | | | V | vsro | Vector Shift Right by Octet | | VX | 10000284 | | | V | vsrw | Vector Shift Right Word | | VX | 10000580 | | | V | vsubcuw | Vector Subtract and Write Carry-Out Unsigned Word | | VX | 1000004A | | | V | vsubfp | Vector Subtract Single-Precision | | VX | 10000700 | | | V | vsubsbs | Vector Subtract Signed Byte Saturate | | VX | 10000740 | | | V | vsubshs | Vector Subtract Signed Halfword Saturate | | VX | 10000780 | | | V | vsubsws | Vector Subtract Signed Word Saturate | | VX | 10000400 | | | V | vsububm | Vector Subtract Unsigned Byte Modulo | | VX | 10000600 | | | V | vsububs | Vector Subtract Unsigned Byte Saturate | | VX | 10000440 | | | V | vsubuhm | Vector Subtract Unsigned Byte Modulo | | VX | 10000640 | | | V | vsubuhs | Vector Subtract Unsigned Halfword Saturate | | VX | 10000480 | | | V | vsubuwm | Vector Subtract Unsigned Word Modulo | | VX | 10000680 | | | V | vsubuws | Vector Subtract Unsigned Word Saturate | | VX | 10000688 | | | V | vsum2sws | Vector Sum across Half Signed Word Saturate | | VX | 10000708 | | | V | vsum4sbs | Vector Sum across Quarter Signed Byte Saturate | | VX | 10000648 | | | V | vsum4shs | Vector Sum across Quarter Signed Halfword Saturate | | VX | 10000608 | | | V | vsum4ubs | Vector Sum across Quarter Unsigned Byte Saturate | | VX | 10000788 | | | V | vsumsws | Vector Sum across Signed Word Saturate | | VX | 1000034E | | | V | vupkhpx | Vector Unpack High Pixel | | VX | 1000020E | | | V | vupkhsb | Vector Unpack High Signed Byte | | VX | 1000024E | | | V | vupkhsh | Vector Unpack High Signed Halfword | | VX | 100003CE | | | V | vupklpx | Vector Unpack Low Pixel | | VX | 1000028E | | | V | vupklsb | Vector Unpack Low Signed Byte | | VX | 100002CE | | | V | vupklsh | Vector Unpack Low Signed Halfword | | VX | 100004C4 | | | V | vxor | Vector XOR | | X | 7C00007C | | | WT | wait | Wait | | Χ | 7C000106 | | Р | Е | wrtee | Write MSR External Enable | | X | 7C000146 | | Ρ | Е | wrteei | Write MSR External Enable Immediate | | D | 7C000278 | SR | | В | xor[.] | XOR | See the key to the mode dependency and privilege columns on page 905 and the key to the category column in Section 1.3.5 of Book I. <sup>&</sup>lt;sup>2</sup> For 16-bit instructions, the "Opcode" column represents the 16-bit hexadecimal instruction encoding with the opcode and extended opcode in the corresponding fields in the instruction, and with 0's in bit positions which are not opcode bits; dashes are used following the opcode to indicate the form is a 16-bit instruction. For 32-bit instructions, the "Opcode" column represents the 32-bit hexadecimal instruction encoding with the opcode and extended opcode in the corresponding fields in the instruction, and with 0's in bit positions which are not opcode bits. # Appendix B. VLE Instruction Set Sorted by Opcode This appendix lists all the instructions available in VLE mode in the Power ISA , in order by opcode. Opcodes that are not defined below are treated as illegal by category VLE. | | Opcode | o | | | | |--------|-------------------------------------------|------------|--------------------|---------------|---------------------------------------------------------| | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mod<br>Dep | ≧ Cat <sup>1</sup> | Mnemonic | Instruction | | С | 0000 | | VLE | se_illegal | Illegal | | С | 0001 | | VLE | se_isync | Instruction Synchronize | | C<br>C | 0002 | | VLE | se_sc | System Call | | С | 0004 | | VLE | se_blr | Branch To Link Register [and Link] | | С | 0006 | | VLE | se_bctr | Branch To Count Register [and Link] | | С | 0008 | | P VLE | se_rfi | Return from Interrupt | | С | 0009 | | P VLE | se_rfci | Return From Critical Interrupt | | С | 000A | | P VLE | se_rfdi | Return From Debug Interrupt | | С | 000B | | P VLE | se_rfmci | Return From Machine Check Interrupt | | R | 0020 | | VLE | se_not | NOT Short Form | | R | 0030 | | VLE | se_neg | Negate Short Form | | R | 0080 | | VLE | se_mflr | Move From Link Register | | R | 0090 | | VLE | se_mtlr | Move To Link Register | | R | 00A0 | | VLE | se_mfctr | Move From Count Register | | R | 00B0 | | VLE | se_mtctr | Move To Count Register | | R | 00C0 | | VLE | se_extzb | Extend Zero Byte | | R | 00D0 | | VLE | se_extsb | Extend Sign Byte Short Form | | R | 00E0 | | VLE | se_extzh | Extend Zero Halfword | | R | 00F0 | | VLE | se_extsh | Extend Sign Halfword Short Form | | RR | 0100 | | VLE | se_mr | Move Register | | RR | 0200 | | VLE | se_mtar | Move To Alternate Register | | RR | 0300 | | VLE | se_mfar | Move from Alternate Register | | RR | 0400 | | VLE | se_add | Add Short Form | | RR | 0500 | | VLE | se_mullw | Multiply Low Word Short Form | | RR | 0600 | | VLE | se_sub | Subtract | | RR | 0700 | | VLE | se_subf | Subtract From Short Form | | RR | 0C00 | | VLE | se_cmp | Compare Word | | RR | 0D00 | | VLE | se_cmpl | Compare Logical Word | | RR | 0E00 | | VLE | se_cmph | Compare Halfword Short Form | | RR | 0F00 | | VLE | se_cmphl | Compare Halfword Logical Short Form | | VX | 10000000 | | V | vaddubm | Vector Add Unsigned Byte Modulo | | VX | 10000002 | | V | vmaxub | Vector Maximum Unsigned Byte | | VX | 10000004 | | V | vrlb | Vector Rotate Left Byte | | VC | 10000006 | | V | vcmpequb[.] | Vector Compare Equal To Unsigned Byte | | VX | 10000008 | | V | vmuloub | Vector Multiply Odd Unsigned Byte | | VX | 1000000A | | V | vaddfp | Vector Add Floating-Point | | VX | 1000000C | | V | vmrghb | Vector Merge High Byte | | VX | 1000000E | _ | V | vpkuhum | Vector Pack Unsigned Halfword Unsigned Modulo | | X | 10000010 | SR | LIM | mulhhwu[o][.] | Multiply High Halfword to Word Unsigned | | X | 10000018 | SR | LIM | machhwu[o][.] | Multiply Accumulate High Halfword to Word Modulo | | .,, | | | ., | | Unsigned | | VA | 10000020 | | V | vmhaddshs | Vector Multiply-High-Add Signed Halfword Saturate | | VA | 10000021 | | V | vmhraddshs | Vector Multiply-High-Round-Add Signed Halfword Saturate | | VA | 10000022 | | V | vmladduhm | Vector Multiply-Low-Add Unsigned Halfword Modulo | | VA | 10000024 | | V | vmsumubm | Vector Multiply-Sum Unsigned Byte Modulo | | | Opcode | ω <b>−</b> . ↓ | _ | | | |----------|----------------------|----------------|--------------------|----------------------|------------------------------------------------------------------------------------------------| | Form | (hexadeci- | od<br>obj | ≧ Cat <sup>1</sup> | Mnemonic | Instruction | | | , | 20 | | | | | VA | 10000025 | | \<br>\<br>\ | vmsummbm | Vector Multiply-Sum Mixed Byte Modulo | | VA<br>VA | 10000026<br>10000027 | | V | vmsumuhm<br>vmsumuhs | Vector Multiply-Sum Unsigned Halfword Modulo<br>Vector Multiply-Sum Unsigned Halfword Saturate | | VA | 10000027 | | ľ | vmsumshm | Vector Multiply-Sum Signed Halfword Modulo | | VA | 10000020 | | V | vmsumshs | Vector Multiply-Sum Signed Halfword Saturate | | VA | 10000023 | | V | vsel | Vector Select | | VA | 1000002B | | V | vperm | Vector Permute | | VA | 1000002C | | V | vsldoi | Vector Shift Left Double by Octet Immediate | | VA | 1000002E | | V | vmaddfp | Vector Multiply-Add Single-Precision | | VA | 1000002F | | V | vnmsubfp | Vector Negative Multiply-Subtract Single-Precision | | VX | 10000040 | | V | vadduhm | Vector Add Unsigned Halfword Modulo | | VX | 10000042 | | V | vmaxuh | Vector Maximum Unsigned Halfword | | VX | 10000044 | | V | vrlh | Vector Rotate Left Halfword | | VC | 10000046 | | V | vcmpequh[.] | Vector Compare Equal To Unsigned Halfword | | VX<br>VX | 10000048 | | V | vmulouh<br>vsubfp | Vector Multiply Odd Unsigned Halfword<br>Vector Subtract Single-Precision | | VX | 1000004A<br>1000004C | | V | vmrghh | Vector Merge High Halfword | | VX | 1000004C | | V | vpkuwum | Vector Pack Unsigned Word Unsigned Modulo | | X | 1000004E | SR | LIM | mulhhw[o][.] | Multiply High Halfword to Word Signed | | X | 10000058 | SR | LIM | machhw[o][.] | Multiply Accumulate High Halfword to Word Modulo Signed | | Χ | 1000005C | SR | LIM | nmachhw[o][.] | Negative Multiply Accumulate High Halfword to Word Mod- | | | | | | | ulo Signed | | VX | 10000080 | | V | vadduwm | Vector Add Unsigned Word Modulo | | VX | 10000082 | | V | vmaxuw | Vector Maximum Unsigned Word | | VX | 10000084 | | V | vrlw | Vector Rotate Left Word | | VC | 10000086 | | V | vcmpequw[.] | Vector Compare Equal To Unsigned Word | | VX | 1000008C | | V | vmrghw | Vector Merge High Word | | VX | 1000008E | CD | V | vpkuhus | Vector Pack Unsigned Halfword Unsigned Saturate | | X | 10000098 | SR | LIM | machhwsu[o][.] | Multiply Accumulate High Halfword to Word Saturate | | VC | 100000C6 | | V | vcmpeqfp[.] | Unsigned Vector Compare Equal To Single-Precision | | VX | 100000C6 | | V | vpkuwus | Vector Pack Unsigned Word Unsigned Saturate | | X | 1000000E | SR | LIM | machhws[o][.] | Multiply Accumulate High Halfword to Word Saturate | | | 10000000 | O. C | | maomwo[o][.] | Signed | | X | 100000DC | SR | LIM | nmachhws[o][.] | Negative Multiply Accumulate High Halfword to Word Satu- | | | | | | | rate Signed | | VX | 10000102 | | V | vmaxsb | Vector Maximum Signed Byte | | VX | 10000104 | | V | vslb | Vector Shift Left Byte | | VX | 10000108 | | V | vmulosb | Vector Multiply Odd Signed Byte | | VX | 1000010A | | V | vrefp | Vector Reciprocal Estimate Single-Precision | | VX | 1000010C | | V | vmrglb | Vector Merge Low Byte | | VX | 1000010E | CD. | V | vpkshus | Vector Pack Signed Halfword Unsigned Saturate | | X | 10000110 | SR | LIM<br>LIM | mulchwu[o][.] | Multiply Cross Halfword to Word Unsigned Multiply Accumulate Cross Halfword to Word Modulo | | ^ | 10000118 | SR | LIIVI | macchwu[o][.] | Multiply Accumulate Cross Halfword to Word Modulo Unsigned | | VX | 10000142 | | V | vmaxsh | Vector Maximum Signed Halfword | | VX | 10000142 | | V | vslh | Vector Shift Left Halfword | | VX | 10000144 | | V | vmulosh | Vector Multiply Odd Signed Halfword | | VX | 1000014A | | V | vrsqrtefp | Vector Reciprocal Square Root Estimate Single-Precision | | VX | 1000014C | | V | vmrglh | Vector Merge Low Halfword | | VX | 1000014E | | V | vpkswus | Vector Pack Signed Word Unsigned Saturate | | X | 10000150 | SR | LIM | mulchw[o][.] | Multiply Cross Halfword to Word Signed | | X | 10000158 | SR | LIM | macchw[o][.] | Multiply Accumulate Cross Halfword to Word Modulo | | _ | 10000150 | 6D | 1 18 4 | nmacchur-1f 1 | Signed | | X | 1000015C | SR | LIM | nmacchw[o][.] | Negative Multiply Accumulate Cross Halfword to Word Mod- | | VX | 10000180 | | V | vaddcuw | ulo Signed Vector Add Carryout Unsigned Word | | VX | 10000180 | | V | vaducuw<br>vmaxsw | Vector Maximum Signed Word | | VX | 10000182 | | ľ | vslw | Vector Shift Left Word | | VX | 10000184<br>1000018A | | V | vexptefp | Vector 2 Raised to the Exponent Estimate Floating-Point | | VX | 1000018C | | V | vmrglw | Vector Merge Low Word | | *** | T0000T0C | | v | 1 *···· 9: <b>**</b> | 1 . sette. morge con front | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep. <sup>1</sup> | Priv | Cat <sup>1</sup> | Mnemonic | Instruction | |------|-------------------------------------------|---------------------------|------|------------------|------------------------|---------------------------------------------------------------------| | VX | 1000018E | | | V | vpkshss | Vector Pack Signed Halfword Signed Saturate | | X | 10000198 | SR | | LIM | macchwsu[o][.] | Multiply Accumulate Cross Halfword to Word Saturate Unsigned | | VX | 100001C4 | | | V | vsl | Vector Shift Left | | VC | 100001C6 | | | V | vcmpgefp[.] | Vector Compare Greater Than or Equal To Single-Precision | | VX | 100001CA | | | V | vlogefp | Vector Log Base 2 Estimate Floating-Point | | VX | 100001CE | | | V | vpkswss | Vector Pack Signed Word Signed Saturate | | X | 100001D8 | SR | | LIM | macchws[o][.] | Multiply Accumulate Cross Halfword to Word Saturate Signed | | X | 100001DC | SR | | LIM | nmacchws[o][.] | Negative Multiply Accumulate Cross Halfword to Word Saturate Signed | | EVX | 10000200 | | | SP | evaddw | Vector Add Word | | VX | 10000200 | | | V | vaddubs | Vector Add Unsigned Byte Saturate | | EVX | 10000202 | | | SP | evaddiw | Vector Add Immediate Word | | VX | 10000202 | | | V | vminub | Vector Minimum Unsigned Byte | | EVX | 10000204 | | | SP | evsubfw | Vector Subtract from Word | | VX | 10000204 | | | V | vsrb | Vector Shift Right Byte | | EVX | 10000206 | | | SP | evsubifw | Vector Subtract Immediate from Word | | VC | 10000206 | | | V | vcmpgtub[.] | Vector Compare Greater Than Unsigned Byte | | EVX | 10000208 | | | SP | evabs | Vector Absolute Value | | VX | 10000208 | | | V | vmuleub | Vector Multiply Even Unsigned Byte | | EVX | 10000209 | | | SP | evneg | Vector Negate | | EVX | 1000020A | | | SP | evextsb | Vector Extend Sign Byte | | VX | 1000020A | | | V | vrfin | Vector Round to Single-Precision Integer Nearest | | EVX | 1000020B | | | SP | evextsh | Vector Extend Sign Halfword | | EVX | 1000020C | | | SP | evrndw | Vector Round Word | | VX | 1000020C | | | V | vspltb | Vector Splat Byte | | EVX | 1000020D | | | SP | evcntlzw | Vector Count Leading Zeros Bits Word | | EVX | 1000020E | | | SP | evcntlsw | Vector Count Leading Sign Bits Word | | VX | 1000020E | | | V | vupkhsb | Vector Unpack High Signed Byte | | EVX | 1000020F | | | SP | brinc | Bit Reverse Increment | | EVX | 10000211 | | | SP | evand | Vector AND | | EVX | 10000212 | | | SP | evandc | Vector AND with Complement | | EVX | 10000216 | | | SP | evxor | Vector XOR | | EVX | 10000217 | | | SP | evor | Vector OR | | EVX | 10000218 | | | SP | evnor | Vector NOR | | EVX | 10000219 | | | SP | eveqv | Vector Equivalent | | EVX | 1000021B | | | SP | evorc | Vector OR with Complement | | EVX | 1000021E | | | SP | evnand | Vector NAND | | EVX | 10000220 | | | SP | evsrwu | Vector Shift Right Word Unsigned | | EVX | 10000221 | | | SP | evsrws | Vector Shift Right Word Signed | | EVX | 10000222 | | | SP | evsrwiu | Vector Shift Right Word Immediate Unsigned | | EVX | 10000223 | | | SP | evsrwis | Vector Shift Right Word Immediate Signed | | EVX | 10000224 | | | SP | evslw | Vector Shift Left Word | | EVX | 10000226 | | | SP<br>SP | evslwi<br>evrlw | Vector Shift Left Word Immediate | | | 10000228 | | | SP<br>SP | - | Vector Rotate Left Word | | EVX | 10000229 | | | SP | evsplati<br>evrlwi | Vector Splat Immediate | | EVX | 1000022A | | | SP<br>SP | | Vector Rotate Left Word Immediate | | EVX | 1000022B | | | SP | evsplatfi | Vector Splat Fractional Immediate Vector Merge High | | EVX | 1000022C | | | SP | evmergehi<br>evmergelo | | | EVX | 1000022D | | | SP | evmergehilo | Vector Merge Low<br>Vector Merge High/Low | | EVX | 1000022E<br>1000022F | | | SP | evmergelohi | Vector Merge Low/High | | EVX | 1000022F | | | SP | evcmpgtu | Vector Compare Greater Than Unsigned | | EVX | 10000230 | | | SP | evempgts | Vector Compare Greater Than Signed | | EVX | 10000231 | | | SP | evempltu | Vector Compare Less Than Unsigned | | EVX | 10000232 | | | SP | evemplts | Vector Compare Less Than Onsigned Vector Compare Less Than Signed | | EVX | 10000233 | | | SP | evemped | Vector Compare Equal | | VX | 10000234 | | | V | vadduhs | Vector Add Unsigned Halfword Saturate | | VX | 10000240 | | | V | vminuh | Vector Minimum Unsigned Halfword | | VX | 10000242 | | | V | vsrh | Vector Shift Right Halfword | | ٧٨ | 10000244 | l | | ٧ | VOIII | V GOLOT OTHER MIGHT HALLWOLD | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mod<br>Dep<br>Priv | Cat <sup>1</sup> | Mnemonic | Instruction | |-----------|-------------------------------------------|--------------------|------------------|-------------|--------------------------------------------------------------------------------------------| | VC | 10000246 | | V | vcmpgtuh[.] | Vector Compare Greater Than Unsigned Halfword | | VX | 10000248 | | V | vmuleuh | Vector Multiply Even Unsigned Halfword | | VX | 1000024A | | V | vrfiz | Vector Round to Single-Precision Integer toward Zero | | VX | 1000024C | | V | vsplth | Vector Splat Halfword | | VX | 1000024E | | V | vupkhsh | Vector Unpack High Signed Halfword | | EVSE<br>L | 10000278 | | SP | evsel | Vector Select | | EVX | 10000280 | | SP.FV | evfsadd | Vector Floating-Point Single-Precision Add | | VX | 10000280 | | V | vadduws | Vector Add Unsigned Word Saturate | | EVX | 10000281 | | SP.FV | evfssub | Vector Floating-Point Single-Precision Subtract | | VX | 10000282 | | V | vminuw | Vector Minimum Unsigned Word | | EVX | 10000284 | | SP.FV | evfsabs | Vector Floating-Point Single-Precision Absolute Value | | VX | 10000284 | | V | vsrw | Vector Shift Right Word | | EVX | 10000285 | | SP.FV | evfsnabs | Vector Floating-Point Single-Precision Negative Absolute Value | | EVX | 10000286 | | SP.FV | evfsneg | Vector Floating-Point Single-Precision Negate | | VC | 10000286 | | V | vcmpgtuw[.] | Vector Compare Greater Than Unsigned Word | | EVX | 10000288 | | SP.FV | evfsmul | Vector Floating-Point Single-Precision Multiply | | EVX | 10000289 | | SP.FV | evfsdiv | Vector Floating-Point Single-Precision Divide | | VX | 1000028A | | V | vrfip | Vector Round to Single-Precision Integer toward +Infinity | | EVX | 1000028C | | SP.FV | evfscmpgt | Vector Floating-Point Single-Precision Compare Greater Than | | VX | 1000028C | | V | vspltw | Vector Splat Word | | EVX | 1000028D | | SP.FV | evfscmplt | Vector Floating-Point Single-Precision Compare Less Than | | EVX | 1000028E | | SP.FV | evfscmpeq | Vector Floating-Point Single-Precision Compare Equal | | VX | 1000028E | | V | vupklsb | Vector Unpack Low Signed Byte | | EVX | 10000290 | | SP.FV | evfscfui | Vector Convert Floating-Point Single-Precision from<br>Unsigned Integer | | EVX | 10000291 | | SP.FV | evfscfsi | Vector Convert Floating-Point Single-Precision from Signed Integer | | EVX | 10000292 | | SP.FV | evfscfuf | Vector Convert Floating-Point Single-Precision from Unsigned Fraction | | EVX | 10000293 | | SP.FV | evfscfsf | Vector Convert Floating-Point Single-Precision from Signed Fraction | | EVX | 10000294 | | SP.FV | evfsctui | Vector Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 10000295 | | SP.FV | evfsctsi | Vector Convert Floating-Point Single-Precision to Signed Integer | | EVX | 10000296 | | SP.FV | evfsctuf | Vector Convert Floating-Point Single-Precision to Unsigned Fraction | | EVX | 10000297 | | SP.FV | evfsctsf | Vector Convert Floating-Point Single-Precision to Signed Fraction | | EVX | 10000298 | | SP.FV | evfsctuiz | Vector Convert Floating-Point Single-Precision to Unsigned Integer with Round Towards Zero | | EVX | 1000029A | | SP.FV | evfsctsiz | Vector Convert Floating-Point Single-Precision to Signed Integer with Round Towards Zero | | EVX | 1000029C | | SP.FV | evfststgt | Vector Floating-Point Single-Precision Test Greater Than | | EVX | 1000029C | | SP.FV | eviststigt | Vector Floating-Point Single-Precision Test Less Than | | EVX | 1000029E | | SP.FV | evfststeg | Vector Floating-Point Single-Precision Test Equal | | VX | 1000023E | | V | vsr | Vector Shift Right | | VC | 100002C1 | | v | vcmpgtfp[.] | Vector Compare Greater Than Single-Precision | | VX | 100002CA | | V | vrfim | Vector Round to Single-Precision Integer toward -Infinity | | VX | 100002CE | | V | vupklsh | Vector Unpack Low Signed Halfword | | EVX | 100002CF | | SP.FD | efscfd | Floating-Point Single-Precision Convert from Double-Precision | | EVX | 100002E0 | | SP.FD | efdadd | Floating-Point Double-Precision Add | | EVX | 100002E0 | | SP.FS | efsadd | Floating-Point Single-Precision Add | | EVX | 100002E1 | | SP.FD | efdsub | Floating-Point Double-Precision Subtract | | EVX | 100002E1 | | SP.FS | efssub | Floating-Point Single-Precision Subtract | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep. <sup>1</sup><br>Briv | Cat <sup>1</sup> | Mnemonic | Instruction | |------------|-------------------------------------------|-----------------------------------|------------------|--------------------|-----------------------------------------------------------------------------------------------------------------------| | EVX | 100002E2 | | SP.FD | efdcfuid | Convert Floating-Point Double-Precision from Unsigned Integer Doubleword | | EVX | 100002E2 | | SP.FS | efscfuid | Convert Floating-Point Single-Precision from Unsigned Integer Doubleword | | EVX | 100002E3 | | SP.FD | efdcfsid | Convert Floating-Point Double-Precision from Signed Integer Doubleword | | EVX | 100002E3 | | SP.FS | efscfsid | Convert Floating-Point Single-Precision from Signed Integer Doubleword | | EVX | 100002E4 | | SP.FD | efdabs | Floating-Point Double-Precision Absolute Value | | EVX | 100002E4 | | SP.FS | efsabs | Floating-Point Single-Precision Absolute Value | | EVX | 100002E5 | | SP.FD | efdnabs | Floating-Point Double-Precision Negative Absolute Value | | EVX | 100002E5 | | SP.FS | efsnabs | Floating-Point Single-Precision Negative Absolute Value | | EVX | 100002E6 | | SP.FD | efdneg | Floating-Point Double-Precision Negate | | EVX | 100002E6 | | SP.FS | efsneg | Floating-Point Single-Precision Negate | | EVX | 100002E8 | | SP.FD | efdmul | Floating-Point Double-Precision Multiply | | EVX | 100002E8 | | SP.FS | efsmul | Floating-Point Single-Precision Multiply | | EVX | 100002E9 | | SP.FD | efddiv | Floating-Point Double-Precision Divide | | EVX | 100002E9 | | SP.FS | efsdiv | Floating-Point Single-Precision Divide | | EVX | 100002EA | | SP.FD | efdctuidz | Convert Floating-Point Double-Precision to Unsigned Inte- | | - • • • | TOUUUZEA | | 01.10 | Cidotalaz | ger Doubleword with Round Towards Zero | | EVX | 100002EA | | SP.FS | efsctuidz | Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 1000000 | | SP.FD | efdctsidz | Doubleword with Round Towards Zero | | | 100002EB | | | elucisiaz | Convert Floating-Point Double-Precision to Signed Integer Doubleword with Round Towards Zero | | EVX | 100002EB | | SP.FS | efsctsidz | Convert Floating-Point Single-Precision to Signed Integer | | | | | | | Doubleword with Round Towards Zero | | EVX | 100002EC | | SP.FD | efdcmpgt | Floating-Point Double-Precision Compare Greater Than | | EVX | 100002EC | | SP.FS | efscmpgt | Floating-Point Single-Precision Compare Greater Than | | EVX | 100002ED | | SP.FD | efdcmplt | Floating-Point Double-Precision Compare Less Than | | EVX | 100002ED | | SP.FS | efscmplt | Floating-Point Single-Precision Compare Less Than | | EVX | 100002EE | | SP.FD | efdcmpeq | Floating-Point Double-Precision Compare Equal | | EVX | 100002EE | | SP.FS | efscmpeq | Floating-Point Single-Precision Compare Equal | | EVX | 100002EF | | SP.FD | efdcfs | Floating-Point Double-Precision Convert from Single-Precision | | EVX | 100002F0 | | SP.FD | efdcfui | Convert Floating-Point Double-Precision from Unsigned Integer | | EVX | 100002F0 | | SP.FS | efscfui | Convert Floating-Point Single-Precision from Unsigned Integer | | EVX | 100002F1 | | SP.FD | efdcfsi | Convert Floating-Point Double-Precision from Signed Inte- | | EVX | 100002F1 | | SP.FS | efscfsi | ger Convert Floating-Point Single-Precision from Signed Integer | | EVX | 100002F1<br>100002F2 | | SP.FD | efdcfuf | Convert Floating-Point Double-Precision from Unsigned Fraction | | EVX | 100002F2 | | SP.FS | efscfuf | Convert Floating-Point Single-Precision from Unsigned Fraction | | EVX | 100002F3 | | SP.FD | efdcfsf | Convert Floating-Point Double-Precision from Signed Fraction | | EVX | 100002F3 | | SP.FS | efscfsf | Convert Floating-Point Single-Precision from Signed Fraction | | EVX | 100002F4 | | SP.FD | efdctui | Convert Floating-Point Double-Precision to Unsigned Integer | | EVX | 100002F4 | | SP.FS | efsctui | Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 100002F1 | | SP.FD | efdctsi | Convert Floating-Point Double-Precision to Signed Integer | | EVX | 100002F5 | | SP.FS | efsctsi | Convert Floating Fount Bouble Fredision to Signed Integer | | EVX | 100002F5<br>100002F6 | | SP.FD | efdctuf | Convert Floating-Point Single-Frecision to Unsigned Fraction | | EVX | 100002F6 | | SP.FS | efsctuf | Convert Floating-Point Single-Precision to Unsigned Fraction | | EVX<br>EVX | 100002F7<br>100002F7 | | SP.FD<br>SP.FS | efdctsf<br>efsctsf | Convert Floating-Point Double-Precision to Signed Fraction Convert Floating-Point Single-Precision to Signed Fraction | | EVX | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep. <sup>1</sup><br>Priv <sup>1</sup> | Cat <sup>1</sup> | Mnemonic | Instruction | |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|-------------------------------------------|------------------------------------------------|------------------|---------------|-------------------------------------------------------------| | EVX | EVX | , | | SP.FD | efdctuiz | | | EVX | EVX | 100002F8 | | SP.FS | efsctuiz | Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | EVX | 100002FA | | SP.FD | efdctsiz | Convert Floating-Point Double-Precision to Signed Integer | | EVX | EVX | 100002FA | | SP.FS | efsctsiz | Convert Floating-Point Single-Precision to Signed Integer | | EVX | | | | | | | | EVX | | | | | | | | EVX | | | | | | | | EVX | | | | | | | | EVX 10000300 V | | | | | · · | | | VX 10000300 EVX V vaddsbs evidd Vector Load Doubleword into Doubleword EVX 10000302 V V Vector Load Doubleword into Doubleword EVX 10000302 V V Vector Load Doubleword into 2 Words Indexed EVX 10000304 V V Vector Load Doubleword into 2 Words EVX 10000304 V V Vector Load Doubleword into 2 Words EVX 10000305 V V Vector Load Doubleword into 2 Words EVX 10000306 V V Vector Load Doubleword into 2 Words EVX 10000306 V V Vector Load Doubleword into 2 Words EVX 10000306 V V Vector Shift Right Algebraic Word Vector Load Halfword into Halfwords Indexed VX 10000308 V V Vector Load Halfword into Halfwords Even and Splat Indexed Vector Load Halfword into Halfwords Even and Splat Indexed VX 10000300 V V Vector Load Halfword into Halfwords Even and Splat Indexed Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed VX 10000300 V V V Vector Load Halfword into Halfwords Odd Unsigned Indexed | | | | | | | | EVX | | | | | | | | EVX 10000302 V | | | | | | | | VX 10000302 EVX V vminsb Vector Minimum Signed Byte VX 10000304 EVX SP evidw Vector Load Doubleword into 2 Words EVX 10000304 EVX V vsrab Vector Shift Right Algebraic Word VC 10000306 EVX V vsrab Vector Load Doubleword into 4 Halfwords VX 10000306 EVX V vsrab Vector Load Doubleword into 4 Halfwords VX 10000306 EVX V vrmlesb Vector Load Halfword into Halfwords Even and Splat Indexed VX 10000308 EVX V vmulesb Vector Load Halfword into Halfwords Even and Splat Indexed VX 10000300 EVX V vsplitisb Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed VX 1000030C EVX V vsplitisb Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed VX 1000030C EVX V vsplitisb Vector Load Halfword into Halfwords Odd Signed and Splat Indexed VX 1000030E EVX SP evilhhossplat Vector Load Word into Two Halfwords Odd Gigned Indexed VX 10000310 EVX | | | | | | | | EVX | | | | | | | | EVX 10000304 EVX 10000305 SP evidhx V vsrab vector Load Doubleword into 4 Halfwords Indexed V vsrab vector Strift Right Algebraic Word Vector Load Doubleword into 4 Halfwords V vsrab vector Load Doubleword into 4 Halfwords V vsrab vector Load Doubleword into 4 Halfwords V Vsrab vector Load Doubleword into 4 Halfwords V Vsrab vector Load Doubleword into 4 Halfwords V Vsrab vector Load Doubleword into 4 Halfwords V Vector Load Doubleword into 4 Halfwords V Vector Load Halfword into Halfwords Even and Splat Vector Load Halfword into Halfwords Even and Splat V V V V V V V V V | | | | | | | | VX 10000305 EVX V vsrab evldh Vector Shift Right Algebraic Word EVX 10000308 EVX V vsrab evldh Vector Load Doubleword into 4 Halfwords VX 10000308 EVX V vsrab evlhhesplatx Vector Compare Greater Than Signed Byte VX 10000309 VX V vmulesb Vector Load Halfword into Halfwords Even and Splat Indexed VX 10000300 VX SP evlhhousplatx Vector Convert from Unsigned Fixed-Point Word to Single-Precision EVX 10000300 SP SP evlhhousplatx Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed VX 10000300 SP SP evlhhossplatx Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed VX 10000300 SP SP evlhhossplatx Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed VX 10000300 SP V vpkpx Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed VX 10000301 SP SP evlhhossplat Vector Load Word into Two Halfwords Even Indexed Vector Load Word into Two Halfwords Even Indexed Vector Load Word into Two Halfwords Odd Unsigned Indexed Vector Load Word into Two Halfwords Odd Unsigned (with sign extension) EVX 10000316 SP | | | | | | | | EVX 10000305 V V V V V V V V V | | | | | | | | VC | | | | | | | | EVX 10000308 V | | | | | | | | EVX 10000309 SP evlhhesplat vcuxwfp Vector Load Halfword into Halfwords Even and Splat Vector Convert from Unsigned Fixed-Point Word to Single-Precision EVX 1000030C SP evlhhousplatx Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed VX 1000030D SP evlhhousplat Vector Splat Immediate Signed Byte EVX 1000030E SP evlhhossplatx Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed EVX 1000030E SP evlhhossplatx Vector Load Halfword into Halfwords Odd Signed and Splat Indexed EVX 1000030F SP evlhhossplat Vector Load Word into Two Halfwords Odd and Splat evlwhex EVX 10000310 SP evlwhex Vector Load Word into Two Halfwords Even Indexed EVX 10000311 SP evlwhou Vector Load Word into Two Halfwords Odd Unsigned Indexed (zero-extended) EVX 10000315 SP evlwhos Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) EVX 10000317 SP evlwhos Vector Load Word into Two Halfwords Odd Signed (with sign extension) EVX 100003 | | | | | | Vector Load Halfword into Halfwords Even and Splat | | EVX 1000030A SP evihhesplat vcuxwfp Vector Load Halfword into Halfwords Even and Splat Vector Convert from Unsigned Fixed-Point Word to Single-Precision EVX 1000030C V vspltisb vector Spalt Indexed Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed EVX 1000030E SP evihhousplat Vector Splat Immediate Signed Byte EVX 1000030E SP evihhossplatx Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed VX 1000030E SP evihhossplatx Vector Pack Pixel EVX 10000310 SP evihhossplat velwhex Vector Load Halfword into Halfwords Odd and Splat Indexed EVX 10000310 SP evihhossplat velwhex Vector Load Word into Two Halfwords Even Indexed EVX 10000311 SP eviwhoux Vector Load Word into Two Halfwords Odd Unsigned Indexed (Zero-extended) EVX 10000315 SP eviwhosx Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) EVX 10000317 SP eviwhosx Vector Load Word into Two Halfwords Odd Signed (with sign extension) EVX 10000318 | VX | 10000308 | | V | vmulesb | Vector Multiply Even Signed Byte | | EVX 1000030C | EVX | 10000309 | | SP | evlhhesplat | | | VX | VX | 1000030A | | V | vcuxwfp | | | VX | EVX | 1000030C | | SP | evlhhousplatx | | | EVX 1000030E EVX 1000030E V vpkpx Vector Load Halfword into Halfwords Odd Unsigned and Splat Indexed V vpkpx Vector Pack Pixel V vector Load Halfword into Halfwords Odd and Splat Indexed V vpkpx Vector Load Halfword into Halfwords Odd and Splat Vector Load Halfword into Halfwords Odd and Splat Vector Load Halfword into Halfwords Odd and Splat Vector Load Word into Two Halfwords Even Indexed Vector Load Word into Two Halfwords Even Indexed (zero-extended) EVX 10000315 EVX 10000315 EVX 10000316 EVX 10000317 EVX 10000317 EVX 10000318 X 10000318 X 10000318 X 10000318 X 10000319 EVX 10000310 100 | VX | 1000030C | | V | vspltisb | | | EVX 1000030E | EVX | | | SP | | | | VX 1000030E EVX 10000310 EVX 10000311 EVX 10000315 EVX 10000315 EVX 10000315 EVX 10000316 EVX 10000316 EVX 10000316 EVX 10000316 EVX 10000317 EVX 10000317 EVX 10000318 CEVX 10000319 CEVX 10000319 EVX 10000310 10000320 EVX 10000320 EVX 10000320 EVX 10000321 EVX 10000321 EVX 10000322 EVX 10000322 EVX 10000322 EVX 10000323 EVX 10000323 EVX 10000323 EVX 10000322 EVX 10000323 EVX 10000322 EVX 10000323 EVX 10000323 EVX 10000323 EVX 10000323 EVX 10000323 EVX 10000322 EVX 10000323 10000320 1000032 | | | | | · | Splat | | EVX 10000310 SP EVWhous Vector Load Word into Two Halfwords Even Indexed Vector Load Word into Two Halfwords Odd and Splat Vector Load Word into Two Halfwords Even Vector Load Word into Two Halfwords Odd Unsigned Indexed (zero-extended) Vector Load Word into Two Halfwords Odd Unsigned (zero-extended) Vector Load Word into Two Halfwords Odd Unsigned (zero-extended) Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) Vector Load Word into Two Halfwords Odd Signed Indexed Multiply Accumulate Low Halfword to Word Modulo Unsigned Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Vector Load Word into Two Halfwords and Splat Vector Load Word into Two Halfwords and Splat Vector Load Word into Two Halfwords and Splat Vector Load Word into Two Halfwords and Splat Vector Store Doubleword of Doubleword Doubleword SP evstddx Vector Store Doubleword of Doubleword Indexed Vector Store Doubleword of Two Words | | | | | · | Indexed | | EVX 10000311 | | | | | | | | EVX 10000314 EVX 10000315 EVX 10000315 EVX 10000316 EVX 10000316 EVX 10000317 EVX 10000317 EVX 10000317 EVX 10000318 X 10000318 X 10000318 X 10000318 X 10000319 EVX 10000310 10000320 EVX 10000320 EVX 10000321 EVX 10000321 EVX 10000321 EVX 10000322 EVX 10000323 10000320 100003 | | | | | | | | EVX 10000315 SP evlwhoux Vector Load Word into Two Halfwords Odd Unsigned Indexed (zero-extended) EVX 10000316 SP evlwhos Vector Load Word into Two Halfwords Odd Unsigned (zero-extended) EVX 10000317 SP evlwhos Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) EVX 10000318 SR EVIWHOS Vector Load Word into Two Halfwords Odd Signed (with sign extension) EVX 10000318 SR LIM EVIW maclhwu[o][.] Wector Load Word into Word and Splat Indexed Multiply Accumulate Low Halfwords to Word Modulo Unsigned EVX 10000319 SP evlwwsplat Vector Load Word into Word and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Store Doubleword of Doubleword Indexed Vector Store Doubleword of Doubleword Indexed Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words | | | | | | | | EVX 10000315 EVX 10000316 EVX 10000317 EVX 10000317 EVX 10000318 X 10000318 X 10000318 EVX 10000319 EVX 10000310 10000320 100000320 EVX 100000000000000 | | | | | | | | EVX 10000316 EVX 10000317 EVX 10000318 X 10000318 X 10000319 EVX 10000310 10000320 EVX 10000320 EVX 10000320 EVX 10000321 EVX 10000322 EVX 10000323 10000320 10000 | | 10000314 | | | eviwiloux | Indexed (zero-extended) | | EVX 10000318 SR SR LIM sign extension) EVX 10000318 SR LIM maclhwu[o][.] Vector Load Word into Two Halfwords Odd Signed (with sign extension) EVX 10000318 SR LIM maclhwu[o][.] Vector Load Word into Word and Splat Indexed Multiply Accumulate Low Halfword to Word Modulo Unsigned EVX 10000319 SP evlwwsplat Vector Load Word into Word and Splat Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Indexed Vector Load Word into Two Halfwords and Splat Vector Load Word into Two Halfwords and Splat Vector Store Doubleword of Doubleword Indexed Vector Store Doubleword of Doubleword SP evstddx Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words EVX 10000323 SP evstdw Vector Store Doubleword of Two Words | EVX | 10000315 | | SP | evlwhou | - · · · · · · · · · · · · · · · · · · · | | EVX 10000318 SR SP evlwhos Vector Load Word into Two Halfwords Odd Signed (with sign extension) EVX 10000318 SR LIM maclhwu[o][.] Vector Load Word into Word and Splat Indexed Multiply Accumulate Low Halfword to Word Modulo Unsigned EVX 10000319 EVX 1000031C EVX 1000031D EVX 10000320 EVX 10000321 EVX 10000321 EVX 10000322 EVX 10000323 SP evstddx Vector Store Doubleword of Doubleword EVX 10000323 SP evstdwx Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words EVX 10000323 SP evstdwx Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words | EVX | 10000316 | | SP | evlwhosx | | | EVX 10000318 X 10000318 X 10000318 X 10000318 X 10000318 X 10000318 X 10000319 EVX 10000310 EVX 10000310 EVX 10000310 EVX 10000320 EVX 10000320 EVX 10000321 EVX 10000322 EVX 10000323 SP evstdw Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words | EVX | 10000317 | | SP | evlwhos | Vector Load Word into Two Halfwords Odd Signed (with | | X 10000318 SR LIM maclhwu[o][.] Multiply Accumulate Low Halfword to Word Modulo Unsigned EVX 10000319 EVX 1000031C EVX 1000031D EVX 10000320 EVX 10000321 EVX 10000322 EVX 10000323 SP evides Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words | EVX | 10000318 | | SP | evlwwsplatx | | | EVX10000319SPevlwwsplat<br>evlwhsplatxVector Load Word into Word and Splat<br>Vector Load Word into Two Halfwords and Splat IndexedEVX1000031D<br>EVXSPevlwhsplatx<br>evlwhsplatVector Load Word into Two Halfwords and Splat<br>Vector Store Doubleword of Doubleword IndexedEVX10000320<br>EVXSPevstddx<br>evstddVector Store Doubleword of DoublewordEVX10000322<br>EVXSPevstdwx<br>evstdwxVector Store Doubleword of Two Words Indexed<br>Vector Store Doubleword of Two Words | | | SR | | • | Multiply Accumulate Low Halfword to Word Modulo | | EVX1000031CSPevlwhsplatxVector Load Word into Two Halfwords and Splat IndexedEVX1000031DSPevlwhsplatVector Load Word into Two Halfwords and SplatEVX10000320SPevstddxVector Store Doubleword of Doubleword IndexedEVX10000321SPevstddVector Store Doubleword of DoublewordEVX10000322SPevstdwxVector Store Doubleword of Two Words IndexedEVX10000323SPevstdwVector Store Doubleword of Two Words | EVV | 10000011 | | CD. | ovlunuos lat | | | EVX1000031DSPevlwhsplatVector Load Word into Two Halfwords and SplatEVX10000320SPevstddxVector Store Doubleword of Doubleword IndexedEVX10000321SPevstddVector Store Doubleword of DoublewordEVX10000322SPevstdwxVector Store Doubleword of Two Words IndexedEVX10000323SPevstdwVector Store Doubleword of Two Words | | | | | | | | EVX 10000320 SP evstddx Vector Store Doubleword of Doubleword Indexed EVX 10000321 SP evstdd Vector Store Doubleword of Doubleword of Doubleword EVX 10000322 SP evstdwx Vector Store Doubleword of Two Words Indexed EVX 10000323 SP evstdw Vector Store Doubleword of Two Words | | | | | | | | EVX10000321SPevstddVector Store Doubleword of DoublewordEVX10000322SPevstdwxVector Store Doubleword of Two Words IndexedEVX10000323SPevstdwVector Store Doubleword of Two Words | | | | | | | | EVX 10000322 SP evstdwx Vector Store Doubleword of Two Words Indexed Vector Store Doubleword of Two Words SP evstdwx Vector Store Doubleword of Two Words | | | | | | | | EVX 10000323 SP evstdw Vector Store Doubleword of Two Words | | | | | | | | | | | | | | | | TEVA TUUUUU324 OF EVSIUNX VECIOI SIGIE DOUDIEWOID OI FOUI MAIIWOIDS INDEXED | EVX | 10000323 | | SP | evstdhx | Vector Store Doubleword of Four Halfwords Indexed | | | _ | Opcode | ω <del>Γ</del> . | LT | | | | |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|------------|------------------|-----|------------------|----------------|-------------------------------------------------------------| | EVX | orn | (hexadeci- | ode<br>ep. | ri≤ | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | | . , | Q<br>W | Д. | | | | | EVX | | | | | | | | | EVX 10000334 SP evstwhox Vector Store Word of Two Halfwords from Odd Indexed EVX 10000335 SP evstwho Vector Store Word of Two Halfwords from Odd EVX 10000336 SP evstwwe Vector Store Word of Word from Even Indexed EVX 10000330 SP evstwwe Vector Store Word of Word from Ddd Indexed VX 10000340 V vaddshs Vector Store Word of Word from Odd VX 10000342 V vrininsh Vector Mord Sturate VX 10000346 V vrininsh Vector Minimum Signed Halfword VX 10000346 V vrongtshf.] V vrongtshf.] VX 10000346 V vspltish V vspltish V vspltish VX 10000346 V vspltish V vspltish V vspltish VX 10000386 SR LIM Mintliply Accumulate Low Halfword to Word Modulo Signed VX 10000386 V vspltish V vspltish <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> | | | | | | | | | EVX 10000338 SP evstwho Vector Store Word of Two Hallwords from Odd EVX 10000339 SP evstwwex Vector Store Word of Word from Even Indexed EVX 10000310 SP evstwwex Vector Store Word of Word from Odd VX 10000340 V vestwwox Vector Store Word of Word from Odd VX 10000342 V vwinish Vector Add Signed Halfword Sturate VX 10000346 V vwinish Vector Store Word of Word from Odd VX 10000346 V vwinish Vector Add Signed Halfword VX 10000346 V vwinish Vector Store Word of Word from Odd VX 10000346 V vwinish Vector Store Word of Word from Odd VX 10000346 V vwinish Vector Store Word of Word from Odd VX 10000346 V vwinish Vector Store Word of Word from Odd VX 10000346 V vwinish Vector Store Word of Word from Odd VX 10000346 V vwinish | | | | | | | | | EVX 10000338 SP SP evstwwex Vector Store Word of Word from Even Indexed EVX 10000330 SP SP evstwwex Vector Store Word of Word from Even Word EVX 10000340 VX V volume V volume VX 10000342 VX V volume V volume VX 10000344 VX V volume V volume VX 10000346 VX V volume V volume VX 10000346 VX V volume V volume VX 10000340 VX V volume V volume VX 10000340 VX V volume V volume VX 10000340 VX V volume V volume VX 10000352 VX V volume V volume VX 10000350 VX V volume V volume VX 10000380 VX V volume V volume VX 10000380 VX V volume V volume VX 10000380 VX V volume V volume VX 10000380 VX V volume V volume | | | | | | | | | EVX 10000332b SP evstwwe Vector Store Word of Word from Codd Indexed EVX 10000340b V vestwwo Vector Store Word of Word from Odd VX 10000342b V vestwwo Vector Store Word of Word from Odd VX 10000344b V vrongdsh[.] Vector Add Signed Halfword VX 10000346b V vrongdsh[.] Vector Compare Greater Than Signed Halfword VX 10000347b V vsplitish Vector Compare Greater Than Signed Halfword VX 10000342b V vsplitish Vector Compare Greater Than Signed Halfword VX 10000342b V vsplitish Vector Compare Greater Than Signed Halfword Vector Solat Immediate Signed Halfword VX 10000355b SR LIM LIM LIM LIM LIM LIM LIM LIM Vector Solat Immediate Signed Halfword Vector Solat Immediate Signed Halfword Vector Solat Immediate Signed Halfword Vector Modulo Vector Solat Immediate Signed Word Solatorate Vector Solatorate Vector Multiply Accumulate Low Halfword to Word Modulo Vector Multip | | | | | | | | | EVX 1000032D SP evstwwox Vector Store Word of Word from Odd Vector Store Word of Word from Odd VX 10000342 V vaddshs Vector Add Signed Halfword Saturate VX 10000344 V vrminsh Vector Minimum Signed Halfword VX 10000348 V vrminsh Vector Minimum Signed Halfword VX 10000340 V vrwingshl, I Vector Minimum Signed Halfword VX 10000340 V vsplish Vector Minimum Signed Halfword VX 10000340 V vsplish Vector Compace Greater Than Signed Halfword VX 10000355 SR LIM Multiply Accumulate Low Halfword to Word Modulo Signed VX 10000380 V vaddsws Vector Add Signed Word Sturate VX 10000380 V vaddsws Vector Convert from Signed Halfword VX 10000380 V vaddsws Vector Splat Immediate Signed Halfword VX 10000380 V vaddsws Vector Add Signed Halfword VX 1000038 | | | | | | | | | EVX | | | | | | | | | NX | | | | | | | | | VX | | | | | | | | | | | 10000342 | | | V | vminsh | | | VX | VX | 10000344 | | | V | vsrah | | | VX | VC | 10000346 | | | | vcmpgtsh[.] | | | VX 1000034E X 1000035E SR LIM machiny[0].] machiny[0].] machiny[0]. Waddsws V variaw Vector Splat Immediate Low Halfword to Word Modulo Signed Word 1000038C V vanisws Vector Multiply Accumulate Low Halfword to Word Modulo Signed VX 10000382 V vininsw Vector Multiply Accumulate Low Halfword to Word Modulo Signed VX 10000382 V vininsw Vector ASI Signed Word Vector Minimum Signed Word Vector Splat Immediate Signed Word Waddsws Vector ASI Signed Word Saturate Vector Minimum Signed Word Vector Minimum Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Highly Figure Vector Splat Immediate Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Splat Immediate Signed Word Vector Splat Immediate Signed Word Vector Option Splat Immediate Signed Word Vector Option Splat I | | 10000348 | | | | | | | VX | VX | 1000034A | | | V | vcsxwfp | | | XX | | | | | | | | | X | | | | | | | | | X | | | 0.0 | | | | | | VX 10000384 VV vaddsws Vector Add Signed Word Saturate Vector Minimum Signed Word Vector Minimum Signed Word Vector Minimum Signed Word Vector Minimum Signed Word Vector Minimum Signed Word Vector Shift Right Algebraic Word Vector Shift Right Algebraic Word Vector Shift Right Algebraic Word Vector Shift Right Algebraic Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Convert from Single-Precision to Unsigned Fixed-Point Word Saturate Unsigned Vector Compare Bounds Single-Precision to Signed Fixed-Point Word Saturate Vector Compare Bounds Single-Precision to Signed Fixed-Point Word Saturate Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Vector Unpack Low Pixel Vector Unpack Low Pixel Vector Subtract Unsigned Byte Modulo Vector Average Unsigned Byte Wector Subtract Unsigned Byte Modulo Vector Average Unsigned Byte Wector Multiply Halfwords, Even, Signed, Saturate, Fractional Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vec | | | | | | | | | VX 10000380 10000382 VX 10000386 VX 10000386 VX 10000386 VX V vminsw vsraw Vector Minimum Signed Word Vector Shift Right Algebraic Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Convert from Single-Precision to Unsigned Fixed-Point Word Saturate Vector Splat Immediate Signed Word Multiply Accumulate Low Halfword to Word Saturate Unsigned Vector Compare Bounds Single-Precision Shift words Subtract Vector Multiply Accumulate Low Halfword to Word Saturate Vector Multiply Accumulate Low Halfword to Word Saturate Signed Vector Average Unsigned Byte Modulo Vector Average Unsigned Byte Modulo Vector Average Unsigned Byte Modulo Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Dodd, Signed, Modulo, Integer Vector Multiply Halfwords, Dodd, Signed, Modulo, Integer Vector Multiply Halfwords, Dodd, Signed, Modulo, Integer Vector Multiply Halfwords, Dodd, Signed, Modulo, Integer Vector Multiply Halfwords, Dodd, | ^ | 10000350 | SK | | LIIVI | nmacinw[o][.] | | | VX 10000382 VC V vriinsw Vector Minimum Signed Word VX 10000384 VC V vsraw Vector Shift Right Algebraic Word VX 10000385 VC V vspltisw Vector Convert from Single-Precision to Unsigned Fixed-Point Word Saturate VX 10000365 VC V vspltisw V vspltisw Vector Convert from Single-Precision to Unsigned Word VC 10000366 VX V vspltisw V vcrpsxws Vector Splat Immediate Signed Word VX 10000366 VX V vspltisw Vector Compare Bounds Single-Precision to Signed Fixed-Point Word Saturate VX 10000308 SR Valud LIM maclhws[o][.] maclhws[o][.] maclhws[o][.] maclhws[o][.] maclhws[o][.] Vector Compare Bounds Single-Precision to Signed Fixed-Point Word Saturate VX 10000402 Valud V vsububm washes Vector Unpack Low Pixel VX 10000403 SR Valud V vsububm washes Vector Subtract Unsigned Byte Modulo VX 10000402 Valud V vand Vector Multiply Halfwords, Even, Signed, Saturate, Fractional VX 10000404 Valud V vand Vector Multiply Halfwords, Odd, Signed, Fractional VX 10000409 Valud SP evmhesmi Vector Multiply Halfwords, Ceven, Signed, Modulo, Integer Vector Multiply Halfwords, Codd, Signed, Modulo, Integer | \/\ | 10000000 | | | \/ | voddowo | Worter Add Signed Word Saturate | | VX 10000386 VC V vsraw vcmptsw[.] vctor Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Compare Greater Than Signed Word Vector Convert from Single-Precision to Unsigned Fixed-Point Word Saturate Vector Splat Immediate Signed Word Vector Compare Bounds Single-Precision Vector Compare Bounds Single-Precision Vector Compare Bounds Single-Precision to Signed Fixed-Point Word Saturate Unsigned Vector Compare Bounds Single-Precision to Signed Fixed-Point Word Saturate Unsigned Vector Unpack Low Pixel Multiply Accumulate Low Halfword to Word Saturate Vector Unpack Low Pixel Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Halfwords, Even, Signed, Saturate, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply H | | | | | | | | | VC 10000386 VX V vcmpgtsw[.] vcfpuxws Vector Compare Greater Than Signed Word Vector Convert from Single-Precision to Unsigned Fixed-Point Word Saturate Vector Splat Immediate Signed Word Multiply Accumulate Low Halfword to Word Saturate Unsigned Word Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Unsigned Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Unsigned Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Vector Unpack Low Pale Immediate Signed Word Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Vector Unpack Low Pixel Immediate Signed Word Vector Splat Immediate Signed Word Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Vector Unpack Low Pixel Immediate Signed Word Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Vector Unpack Low Pixel Vector Unpack Low Pixel Vector Unpack Low Pixel Vector Subtract Unsigned Byte Modulo Vector Subtract Unsigned Byte Modulo Vector Average Unsigned Byte Vector Average Unsigned Byte Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Codd, Signed, Fractional Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Sign | | | | | | _ | | | VX 1000038A V vcfpuxws Vector Convert from Single-Precision to Unsigned Fixed-Point Word Saturate VX 10000398 SR LIM Multiply Accumulate Low Halfword to Word Saturate VC 100003C6 V vcmpbfp[.] Vector Compare Bounds Single-Precision VX 100003CB V vcfpsxws Vector Compare Bounds Single-Precision to Signed Fixed-Point Word Saturate VX 100003DC SR LIM Maclhws[o][.] Multiply Accumulate Low Halfword to Word Saturate Vector Unpack Low Pixel X 100003DC SR LIM maclhws[o][.] Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Vector Average Unsigned Byte VX 10000402 V V vsubum Vector Splat Lim Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, M | | | | | | | | | VX 1000036C | | | | | | | | | VX 1000038C 1000038C Name (No. 100003CB) SR V LIM machwsu[0][.] machwsu[0][.] Multiply Accumulate Low Halfword to Word Saturate Unsigned Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate Vector Unpack Low Pixel Multiply Accumulate Low Halfword to Word Saturate Signed Vector Unpack Low Pixel Multiply Accumulate Low Halfword to Word Saturate Signed Nord Saturate Signed Nord Saturate Signed Nord Saturate Vector Unpack Low Pixel Multiply Accumulate Low Halfword to Word Saturate Signed Nord Nord Saturate Signed Saturate Signed Nord | | | | | | | | | VC 100003CA V vcmpbfp[.] Vector Compare Bounds Single-Precision to Signed Fixed-Point Word Saturate VX 100003DC SR LIM maclhws[o][.] nmaclhws[o][.] nmaclhws | VX | 1000038C | | | V | vspltisw | Vector Splat Immediate Signed Word | | VC 100003CA V vcmpbfp[.] vcfpsxws Vector Compare Bounds Single-Precision Vector Convert from Single-Precision to Signed Fixed-Point Word Saturate VX 100003CE X V vupklpx Vupklpx maclhws[o][.] maclhws[o][.] Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Halfwords, Even, Signed, Modulo Wector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator VX 10000427 SP evmhosmi Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX 10000428 SP evmhesmia Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator Vector Mu | Χ | 10000398 | SR | | LIM | maclhwsu[o][.] | Multiply Accumulate Low Halfword to Word Saturate | | VX 100003CE X 100003DE X 100003DE X 10000400 YX 100004 | | | | | | | | | VX 100003DE X 100003DE X 10000400 | | 100003C6 | | | | | | | VX100003CE<br>XSR<br>100003DCV<br>SRVupklpx<br>LIMVector Unpack Low Pixel<br>Multiply Accumulate Low Halfword to Word Saturate Signed<br>Negative Multiply Accumulate Low Halfword to Word Saturate Signed<br> | VX | 100003CA | | | V | vcfpsxws | | | X 100003DC SR LIM maclhws[o][.] nmaclhws[o][.] nmaclhws[o][.] Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed Negative Multiply Halfwords Signed Negative Multiply Halfwords, Signed Negative Multiply Halfwords, Even, Signed Negative Multiply Halfwords, Even, Signed Negative Multiply Halfwords, Seven, Unsigned, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX 10000429 SP evmhesmia Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX 10000429 SP evmhesmia Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX 10000429 | | | | | | | | | X 10000400 SR LIM nmaclhws[o][.] Negative Multiply Accumulate Low Halfword to Word Saturate Signed Vector Subtract Unsigned Byte Modulo Vector Average Unsigned Byte Wodulo Vector Average Unsigned Byte Vector Multiply Halfwords, Even, Signed, Saturate, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Codd, Unsigned, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional to Accumulator EVX 10000428 SP evmhesmia Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX 10000429 SP evmhesmia Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator Vector Multiply Halfwords, Even, Signed, Modulo, Fractional Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator | | | 0.0 | | | | | | rate Signed VX 10000400 VX 10000402 EVX 10000403 VV vaygub SP evmhessf Vector Multiply Halfwords, Even, Signed, Saturate, Fractional VX 10000404 EVX 10000407 EVX 10000408 EVX 10000409 VV vand EVX 10000409 VV vand Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer evmhesmi Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Odd, Signed, Fractional Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Odd, Signed, Fractional to Accumulator Vector Multiply Halfwords, Even, Signed, Modulo, Integer Halfwo | X | | | | | | | | VX10000400<br>VXVvsububm<br>VavgubVector Subtract Unsigned Byte ModuloVX10000403SPevmhessfVector Average Unsigned ByteVX10000404<br>EVXVvandVector Multiply Halfwords, Even, Signed, Saturate, FractionalVX10000407<br>EVXSPevmhossf<br>evmhesmiVector Multiply Halfwords, Odd, Signed, FractionalEVX10000408<br>EVXSPevmhesmi<br>evmhesmiVector Multiply Halfwords, Even, Unsigned, Modulo, IntegerEVX10000408<br>EVXSPevmhesmi<br>evmhesmfVector Multiply Halfwords, Even, Signed, Modulo, IntegerEVX10000408<br>EVXSPevmhosmi<br>evmhosmiVector Multiply Halfwords, Odd, Unsigned, Modulo, IntegerVX10000400<br>EVXSPevmhosmi<br>evmhosmiVector Multiply Halfwords, Odd, Signed, Modulo, IntegerEVX10000423SPevmhosmi<br>evmhessfaVector Multiply Halfwords, Odd, Signed, Modulo, FractionalEVX10000427SPevmhessfaVector Multiply Halfwords, Even, Signed, Saturate, Fractional to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | ^ | 100003DC | SK | | LIIVI | nmacinws[o][.] | | | VX10000402<br>EVXVvavgub<br>evmhessfVector Average Unsigned ByteVX10000404<br>EVXVV vand<br>SPVector Multiply Halfwords, Even, Signed, Saturate, FractionalEVX10000407<br>EVXSPevmhossf<br>evmheumiVector Multiply Halfwords, Odd, Signed, FractionalEVX10000408<br>EVXSPevmheumi<br>evmheumiVector Multiply Halfwords, Even, Unsigned, Modulo, IntegerEVX10000409<br>VXVvmaxfp<br>evmhesmi<br>Vector Multiply Halfwords, Even, Signed, Modulo, IntegerVX1000040A<br>SPSPevmhesmf<br>evmhoumiVector Multiply Halfwords, Odd, Unsigned, Modulo, IntegerVX1000040C<br>VXVvsloVector Multiply Halfwords, Odd, Signed, Modulo, IntegerEVX1000040D<br>EVXSPevmhosmi<br>evmhosmiVector Multiply Halfwords, Odd, Signed, Modulo, IntegerEVX10000427SPevmhosmf<br>evmhessfaVector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Modulo, IntegerEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | \/Y | 10000400 | | | \/ | veuhuhm | | | EVX10000403SPevmhessfVector Multiply Halfwords, Even, Signed, Saturate, FractionalVX10000404VvandVector ANDEVX10000407SPevmhossfVector Multiply Halfwords, Odd, Signed, FractionalEVX10000408SPevmheumiVector Multiply Halfwords, Even, Unsigned, Modulo, IntegerEVX10000409VvmaxfpVector Multiply Halfwords, Even, Signed, Modulo, IntegerVX1000040BSPevmhesmfVector Multiply Halfwords, Even, Signed, Modulo, FractionalEVX1000040CSPevmhoumiVector Multiply Halfwords, Odd, Unsigned, Modulo, IntegerVX1000040CSPevmhosmiVector Multiply Halfwords, Odd, Signed, Modulo, IntegerEVX1000040FSPevmhosmiVector Multiply Halfwords, Odd, Signed, Modulo, FractionalEVX10000423SPevmhessfaVector Multiply Halfwords, Even, Signed, Saturate, Fractional to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | | tional Vx 10000404 EVX 10000407 EVX 10000408 EVX 10000409 VX 10000409 VX 10000408 EVX 10000400 VX 10000400 EVX 10000423 EVX 10000423 EVX 10000423 EVX 10000427 EVX 10000428 10000429 EVX 10000428 E | | | | | | _ | | | VX10000404<br>EVXVVand<br>evmhossf<br>SPVector AND<br>vector Multiply Halfwords, Odd, Signed, Fractional<br>Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>Vector Multiply Halfwords, Even, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Even, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Even, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Even, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000427SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | 10000103 | | | O. | 0111110001 | | | EVX10000407<br>EVXSP<br>10000408<br>EVXevmhossf<br>evmheumi<br>SP<br>evmhesmiVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>Vector Multiply Halfwords, Even, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Even, Signed, Modulo, Integer<br>Vector Maximum Single-Precision<br>Vector Multiply Halfwords, Even, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Saturate, Fractional to AccumulatorEVX10000427SPevmhosmfVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer<br>to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX10000428SPevmhesmfaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | VX | 10000404 | | | V | vand | | | EVX10000408<br>EVX10000409<br>1000040A<br>VXSP<br>1000040A<br>EVXevmhesmi<br>V wmaxfp<br>evmhesmiVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>Vector Multiply Halfwords, Even, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Even, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo, Fractional<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000427SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmhessfaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer<br>to AccumulatorEVX1000042BSPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX1000042BSPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | | EVX10000409<br>VXSP<br>1000040A<br>EVXevmhesmi<br>V wmaxfpVector Multiply Halfwords, Even, Signed, Modulo, Integer<br>Vector Maximum Single-PrecisionEVX1000040B<br>EVXSP<br>1000040C<br>EVXSP<br>evmhoumi<br>V vsloVector Multiply Halfwords, Even, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer<br>Vector Shift Left by OctetEVX1000040B<br>EVXSP<br>evmhosmi<br>SP<br>evmhosmfVector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Saturate, Fractional to AccumulatorEVX10000423SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmheumiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX1000042BSPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX1000042BSPevmhesmfaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | SP | | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer | | EVX1000040B<br>EVXSP<br>1000040C<br>VXevmhesmf<br>evmhoumi<br>VsloVector Multiply Halfwords, Even, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer<br>Vector Shift Left by OctetEVX1000040D<br>EVXSP<br>evmhosmi<br>evmhosmf<br>EVXVector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Fractional to AccumulatorEVX10000427SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX1000042BSPevmhesmfaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | evmhesmi | Vector Multiply Halfwords, Even, Signed, Modulo, Integer | | EVX1000040C<br>VXSP<br>1000040C<br>EVXevmhoumi<br>VsloVector Multiply Halfwords, Odd, Unsigned, Modulo, Integer<br>Vector Shift Left by OctetEVX1000040F<br>EVXSP<br>evmhosmi<br>evmhosmf<br>evmhosmfVector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Saturate, Fractional to AccumulatorEVX10000427SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmhesmiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX1000042BSPevmhesmfaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | | VX1000040CVvsloVector Shift Left by OctetEVX1000040DSPevmhosmiVector Multiply Halfwords, Odd, Signed, Modulo, IntegerEVX10000423SPevmhosmfVector Multiply Halfwords, Odd, Signed, Modulo, FractionalEVX10000427SPevmhossfaVector Multiply Halfwords, Even, Signed, Saturate, Fractional to AccumulatorEVX10000428SPevmheumiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to AccumulatorEVX1000042BSPevmhesmfaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | | EVX1000040D<br>EVXSP<br>1000040F<br>EVXSP<br>sp<br>sp<br>evmhosmf<br>evmhosmfVector Multiply Halfwords, Odd, Signed, Modulo, Integer<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Saturate, Fractional to AccumulatorEVX10000427SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmheumiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer<br>to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to<br>AccumulatorEVX1000042BSPevmhesmfaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | | EVX1000040F<br>EVXSP<br>10000423evmhosmf<br>evmhessfaVector Multiply Halfwords, Odd, Signed, Modulo, Fractional<br>Vector Multiply Halfwords, Even, Signed, Saturate, Fractional to Accumulator<br>Vector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmheumiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to AccumulatorEVX1000042BSPevmhesmfaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | | EVX10000423SPevmhessfaVector Multiply Halfwords, Even, Signed, Saturate, Fractional to AccumulatorEVX10000427SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmheumiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to AccumulatorEVX1000042BSPevmhesmfaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | | tional to Accumulator Vector Multiply Halfwords, Odd, Signed, Fractional to Accumulator EVX 10000428 SP evmheumia Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer to Accumulator EVX 10000429 SP evmhesmia Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX 10000428 SP evmhesmia Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | | EVX10000427SPevmhossfaVector Multiply Halfwords, Odd, Signed, Fractional to AccumulatorEVX10000428SPevmheumiaVector Multiply Halfwords, Even, Unsigned, Modulo, Integer to AccumulatorEVX10000429SPevmhesmiaVector Multiply Halfwords, Even, Signed, Modulo, Integer to AccumulatorEVX1000042BSPevmhesmfaVector Multiply Halfwords, Even, Signed, Modulo, Fractional | LVA | 10000423 | | | OF- | GVIIIIIGOSIA | | | EVX 10000428 SP evmheumia Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer to Accumulator EVX 10000429 SP evmhesmia Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX 1000042B SP evmhesmfa Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | EVX | 10000427 | | | SP | evmhossfa | Vector Multiply Halfwords, Odd, Signed, Fractional to Accu- | | to Accumulator Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX 1000042B SP evmhesmfa Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | | | | | | | | EVX 10000429 SP evmhesmia Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator EVX 1000042B SP evmhesmfa Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | EVX | 10000428 | | | SP | evmheumia | | | EVX 1000042B SP evmhesmfa Accumulator Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | EVY | 10000430 | | | QD | evmheemia | | | EVX 1000042B SP evmhesmfa Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | - ۷ / | 10000429 | | | OF. | eviiiiesiiila | | | | EVX | 1000042B | | | SP | evmhesmfa | | | | | | | | | | | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep. <sup>1</sup><br>Priv | Cat <sup>1</sup> | Mnemonic | Instruction | |------|-------------------------------------------|-----------------------------------|------------------|--------------|---------------------------------------------------------------------------------------| | EVX | 1000042C | | SP | evmhoumia | Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer | | EVX | 1000042D | | SP | evmhosmia | to Accumulator Vector Multiply Halfwords, Odd, Signed, Modulo, Integer to Accumulator | | EVX | 1000042F | | SP | evmhosmfa | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional to Accumulator | | VX | 10000440 | | V | vsubuhm | Vector Subtract Unsigned Byte Modulo | | VX | 10000442 | | V | vavguh | Vector Average Unsigned Halfword | | VX | 10000444 | | V | vandc | Vector AND with Complement | | EVX | 10000447 | | SP | evmwhssf | Vector Multiply Word High Signed, Fractional | | EVX | 10000448 | | SP | evmwlumi | Vector Multiply Word Low Unsigned, Modulo, Integer | | VX | 1000044A | | V | vminfp | Vector Minimum Single-Precision | | EVX | 1000044C | | SP | evmwhumi | Vector Multiply Word High Unsigned, Modulo, Integer | | VX | 1000044C | | V | vsro | Vector Shift Right by Octet | | EVX | 1000044D | | SP | evmwhsmi | Vector Multiply Word High Signed, Modulo, Integer | | EVX | 1000044F | | SP | evmwhsmf | Vector Multiply Word High Signed, Modulo, Fractional | | EVX | 10000453 | | SP | evmwssf | Vector Multiply Word Signed, Saturate, Fractional | | EVX | 10000458 | | SP | evmwumi | Vector Multiply Word Unsigned, Modulo, Integer | | EVX | 10000459 | | SP | evmwsmi | Vector Multiply Word Signed, Modulo, Integer | | EVX | 1000045B | | SP | evmwsmf | Vector Multiply Word Signed, Modulo, Fractional | | EVX | 10000467 | | SP | evmwhssfa | Vector Multiply Word High Signed, Fractional to Accumulator | | EVX | 10000468 | | SP | evmwlumia | Vector Multiply Word Low Unsigned, Modulo, Integer to Accumulator | | EVX | 1000046C | | SP | evmwhumia | Vector Multiply Word High Unsigned, Modulo, Integer to Accumulator | | EVX | 1000046D | | SP | evmwhsmia | Vector Multiply Word High Signed, Modulo, Integer to Accumulator | | EVX | 1000046F | | SP | evmwhsmfa | Vector Multiply Word High Signed, Modulo, Fractional to Accumulator | | EVX | 10000473 | | SP | evmwssfa | Vector Multiply Word Signed, Saturate, Fractional to Accumulator | | EVX | 10000478 | | SP | evmwumia | Vector Multiply Word Unsigned, Modulo, Integer to Accumulator | | EVX | 10000479 | | SP | evmwsmia | Vector Multiply Word Signed, Modulo, Integer to Accumulator | | EVX | 1000047B | | SP | evmwsmfa | Vector Multiply Word Signed, Modulo, Fractional to Accumulator | | VX | 10000480 | | V | vsubuwm | Vector Subtract Unsigned Word Modulo | | VX | 10000482 | | V | vavguw | Vector Average Unsigned Word | | VX | 10000484 | | V | vor | Vector OR | | EVX | 100004C0 | | SP | evaddusiaaw | Vector Add Unsigned, Saturate, Integer to Accumulator Word | | EVX | 100004C1 | | SP | evaddssiaaw | Vector Add Signed, Saturate, Integer to Accumulator Word | | EVX | 100004C2 | | SP | evsubfusiaaw | Vector Subtract Unsigned, Saturate, Integer to Accumulator Word | | EVX | 100004C3 | | SP | evsubfssiaaw | Vector Subtract Signed, Saturate, Integer to Accumulator Word | | EVX | 100004C4 | | SP | evmra | Initialize Accumulator | | VX | 100004C4 | | V | vxor | Vector XOR | | EVX | 100004C6 | | SP | evdivws | Vector Divide Word Signed | | EVX | 100004C7 | | SP | evdivwu | Vector Divide Word Unsigned | | EVX | 100004C8 | | SP | evaddumiaaw | Vector Add Unsigned, Modulo, Integer to Accumulator Word | | EVX | 100004C9 | | SP | evaddsmiaaw | Vector Add Signed, Modulo, Integer to Accumulator Word | | EVX | 100004CA | | SP | evsubfumiaaw | Vector Subtract Unsigned, Modulo, Integer to Accumulator Word | | EVX | 100004CB | | SP | evsubfsmiaaw | Vector Subtract Signed, Modulo, Integer to Accumulator Word | | EVX | 10000500 | | SP | evmheusiaaw | Vector Multiply Halfwords, Even, Unsigned, Saturate Integer and Accumulate into Words | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep. <sup>1</sup><br>Priv | Cat <sup>1</sup> | Mnemonic | Instruction | |------|-------------------------------------------|-----------------------------------|------------------|-----------------|------------------------------------------------------------------------------------------------------------------------| | EVX | 10000501 | | SP | evmhessiaaw | Vector Multiply Halfwords, Even, Signed, Saturate, Integer | | VX | 10000502 | | V | vavgsb | and Accumulate into Words Vector Average Signed Byte | | EVX | 10000502 | | SP | evmhessfaaw | Vector Multiply Halfwords, Even, Signed, Saturate, Frac- | | EVX | 10000504 | | SP | evmhousiaaw | tional and Accumulate into Words Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer and Accumulate into Words | | VX | 10000504 | | V | vnor | Vector NOR | | EVX | 10000505 | | SP | evmhossiaaw | Vector Multiply Halfwords, Odd, Signed, Saturate, Integer | | EVX | 10000507 | | SP | evmhossfaaw | and Accumulate into Words Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate into Words | | EVX | 10000508 | | SP | evmheumiaaw | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer | | EVX | 10000509 | | SP | evmhesmiaaw | and Accumulate into Words Vector Multiply Halfwords, Even, Signed, Modulo, Integer and Accumulate into Words | | EVX | 1000050B | | SP | evmhesmfaaw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | EVX | 1000050C | | SP | evmhoumiaaw | and Accumulate into Words Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer | | | 10000300 | | | CVIIIIOdiiiidaW | and Accumulate into Words | | EVX | 1000050D | | SP | evmhosmiaaw | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer | | EVX | 1000050F | | SP | evmhosmfaaw | and Accumulate into Words Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate into Words | | EVX | 10000528 | | SP | evmhegumiaa | Vector Multiply Halfwords, Even, Guarded, Unsigned, Mod- | | EVX | 10000529 | | SP | evmhegsmiaa | ulo, Integer and Accumulate Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, | | EVX | 1000052B | | SP | evmhegsmfaa | Integer and Accumulate Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, | | EVX | 1000052C | | SP | evmhogumiaa | Fractional and Accumulate Vector Multiply Halfwords, Odd, Guarded, Unsigned, Mod- | | EVX | 1000052D | | SP | evmhogsmiaa | ulo, Integer and Accumulate Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, | | | | | | | Integer and Accumulate | | EVX | 1000052F | | SP | evmhogsmfaa | Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Fractional and Accumulate | | EVX | 10000540 | | SP | evmwlusiaaw | Vector Multiply Word Low Unsigned Saturate, Integer and | | EVX | 10000541 | | SP | evmwlssiaaw | Accumulate into Words Vector Multiply Word Low Signed, Saturate, Integer and Accumulate into Words | | VX | 10000542 | | V | vavgsh | Vector Average Signed Halfword | | EVX | 10000544 | | SP | evmwhusiaaw | Vector Multiply Word High Unsigned, Integer and Accumulate into Words | | EVX | 10000547 | | SP | evmwhssfaaw | Vector Multiply Word High Signed, Fractional and Accumulate into Words | | EVX | 10000548 | | SP | evmwlumiaaw | Vector Multiply Word Low Unsigned, Modulo, Integer and | | EVX | 10000549 | | SP | evmwlsmiaaw | Accumulate into Words Vector Multiply Word Low Signed, Modulo, Integer and Accumulate into Words | | EVX | 1000054C | | SP | evmwhumiaaw | Vector Multiply Word High Unsigned, Modulo, Integer and | | EVX | 1000054D | | SP | evmwhsmiaaw | Accumulate into Words Vector Multiply Word High Signed, Modulo, Integer and | | EVX | 1000054F | | SP | evmwhsmfaaw | Accumulate into Words Vector Multiply Word High Signed, Modulo, Fractional and | | EVX | 10000553 | | SP | evmwssfaa | Accumulate into Words Vector Multiply Word Signed, Saturate, Fractional and Accu- | | EVX | 10000558 | | SP | evmwumiaa | mulate Vector Multiply Word Unsigned, Modulo, Integer and Accumulate | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep. <sup>1</sup> | Cat <sup>1</sup> | Mnemonic | Instruction | |-----------|-------------------------------------------|---------------------------|------------------|------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | EVX | 10000559 | | SP | evmwsmiaa | Vector Multiply Word Signed, Modulo, Integer and Accumu- | | EVX | 1000055B | | SP | evmwsmfaa | late Vector Multiply Word Signed, Modulo, Fractional and Accumulate | | EVX | 10000580 | | SP | evmheusianw | Vector Multiply Halfwords, Even, Unsigned, Saturate Integer | | VX<br>EVX | 10000580<br>10000581 | | V<br>SP | vsubcuw<br>evmhessianw | and Accumulate Negative into Words Vector Subtract and Write Carry-Out Unsigned Word Vector Multiply Halfwords, Even, Signed, Saturate, Integer and Accumulate Negative into Words | | VX<br>EVX | 10000582<br>10000583 | | V<br>SP | vavgsw<br>evmhessfanw | Vector Average Signed Word<br>Vector Multiply Halfwords, Even, Signed, Saturate, Frac- | | EVX | 10000584 | | SP | evmhousianw | tional and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer | | EVX | 10000585 | | SP | evmhossianw | and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Signed, Saturate, Integer and Accumulate Negative into Words | | EVX | 10000587 | | SP | evmhossfanw | Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate Negative into Words | | EVX | 10000588 | | SP | evmheumianw | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer and Accumulate Negative into Words | | EVX | 10000589 | | SP | evmhesmianw | Vector Multiply Halfwords, Even, Signed, Modulo, Integer and Accumulate Negative into Words | | EVX | 1000058B | | SP | evmhesmfanw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | EVX | 1000058C | | SP | evmhoumianw | and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer | | EVX | 1000058D | | SP | evmhosmianw | and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Signed, Modulo, Integer | | EVX | 1000058F | | SP | evmhosmfanw | and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX | 100005A8 | | SP | evmhegumian | Vector Multiply Halfwords, Even, Guarded, Unsigned, Mod- | | EVX | 100005A9 | | SP | evmhegsmian | ulo, Integer and Accumulate Negative Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, | | EVX | 100005AB | | SP | evmhegsmfan | Integer and Accumulate Negative Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, | | EVX | 100005AC | | SP | evmhogumian | Fractional and Accumulate Negative Vector Multiply Halfwords, Odd, Guarded, Unsigned, Mod- | | EVX | 100005AD | | SP | evmhogsmian | ulo, Integer and Accumulate Negative Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, | | EVX | 100005AF | | SP | evmhogsmfan | Integer and Accumulate Negative Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, | | EVX | 100005C0 | | SP | evmwlusianw | Fractional and Accumulate Negative Vector Multiply Word Low Unsigned Saturate, Integer and Accumulate Negative into Words | | EVX | 100005C1 | | SP | evmwlssianw | Vector Multiply Word Low Signed, Saturate, Integer and Accumulate Negative into Words | | EVX | 100005C4 | | SP | evmwhusianw | Vector Multiply Word High Unsigned, Integer and Accumulate Negative into Words | | EVX | 100005C5 | | SP | evmwhssianw | Vector Multiply Word High Signed, Integer and Accumulate Negative into Words | | EVX | 100005C7 | | SP | evmwhssfanw | Vector Multiply Word High Signed, Fractional and Accumulate Negative into Words | | EVX | 100005C8 | | SP | evmwlumianw | Vector Multiply Word Low Unsigned, Modulo, Integer and | | EVX | 100005C9 | | SP | evmwlsmianw | Accumulate Negative into Words Vector Multiply Word Low Signed, Modulo, Integer and Accumulate Negative into Words | | EVX | 100005CC | | SP | evmwhumianw | Vector Multiply Word High Unsigned, Modulo, Integer and Accumulate Negative into Words | | | Opcode | | | | | | |--------------|-------------------------------------------|---------------|--------|------------------|--------------------------|------------------------------------------------------------------------------------------------------------------------| | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep.¹ | Priv ' | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 100005CD | | | SP | evmwhsmianw | Vector Multiply Word High Signed, Modulo, Integer and | | EVX | 100005CF | | | SP | evmwhsmfanw | Accumulate Negative into Words Vector Multiply Word High Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX | 100005D3 | | | SP | evmwssfan | Vector Multiply Word Signed, Saturate, Fractional and Accu- | | EVX | 100005D8 | | | SP | evmwumian | mulate Negative Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Negative | | EVX | 100005D9 | | | SP | evmwsmian | Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative | | EVX | 100005DB | | | SP | evmwsmfan | Vector Multiply Word Signed, Modulo, Fractional and Accumulate Negative | | VX | 10000600 | | | V | vsububs | Vector Subtract Unsigned Byte Saturate | | VX | 10000604 | | | V | mfvscr | Move from Vector Status and Control Register | | VX | 10000608 | | | V | vsum4ubs | Vector Sum across Quarter Unsigned Byte Saturate | | VX | 10000640 | | | V | vsubuhs | Vector Subtract Unsigned Halfword Saturate | | VX | 10000644 | | | V | mtvscr | Move to Vector Status and Control Register | | VX | 10000648 | | | V | vsum4shs | Vector Sum across Quarter Signed Halfword Saturate | | VX | 10000680 | | | V | vsubuws | Vector Subtract Unsigned Word Saturate | | VX | 10000688 | | | V | vsum2sws | Vector Sum across Half Signed Word Saturate | | VX<br>VX | 10000700 | | | V<br>V | vsubsbs<br>vsum4sbs | Vector Subtract Signed Byte Saturate Vector Sum across Quarter Signed Byte Saturate | | VX | 10000708<br>10000740 | | | V | vsubshs | Vector Subtract Signed Halfword Saturate | | VX | 10000740 | | | V | vsubsits | Vector Subtract Signed Word Saturate Vector Subtract Signed Word Saturate | | VX | 10000788 | | | V | vsumsws | Vector Sum across Signed Word Saturate | | D8 | 18000000 | | | VĽE | e_lbzu | Load Byte and Zero with Update | | D8 | 18000100 | | | VLE | e lhzu | Load Halfword and Zero with Update | | D8 | 18000200 | | | VLE | e_lwzu | Load Word and Zero with Update | | D8 | 18000300 | | | VLE | e_lhau | Load Halfword Algebraic with Update | | D8 | 18000400 | | | VLE | e_stbu | Store Byte with Update | | D8 | 18000500 | | | VLE | e_sthu | Store Halfword with Update | | D8 | 18000600 | | | VLE | e_stwu | Store word with Update | | D8 | 18000800 | | | VLE | e_lmw | Load Multiple Word | | D8 | 18000900 | | | VLE | e_stmw | Store Multiple Word | | SCI8 | 18008000 | SR | | VLE | e_addi[.] | Add Scaled Immediate | | SCI8 | 18009000 | SR | | VLE | e_addic[.] | Add Scaled Immediate Carrying | | SCI8 | 1800A000 | | | VLE | e_mulli | Multiply Low Scaled Immediate | | SCI8<br>SCI8 | 1800A800<br>1800B000 | SR | | VLE<br>VLE | e_cmpi | Compare Scaled Immediate Word | | SCI8 | 1800B000 | SR | | VLE | e_subfic[.]<br>e_andi[.] | Subtract From Scaled Immediate Carrying AND Scaled Immediate | | SCI8 | 1800C000 | | | VLE | e_aridi[.] | OR Scaled Immediate | | SCI8 | 1800E000 | SR | | VLE | e_xori[.] | XOR Scaled Immediate | | SCI8 | 1880A800 | 0.1 | | VLE | e_cmpli | Compare Logical Scaled Immediate Word | | D | 1C000000 | | | VLE | e_add16i | Add Immediate | | OIM5 | 2000 | | | VLE | se_addi | Add Immediate Short Form | | OIM5 | 2200 | | | VLE | se_cmpli | Compare Logical Immediate Word | | OIM5 | 2400 | SR | | VLE | se_subi[.] | Subtract Immediate | | IM5 | 2A00 | | | VLE | se_cmpi | Compare Immediate Word Short Form | | IM5 | 2C00 | | | VLE | se_bmaski | Bit Mask Generate Immediate | | IM5 | 2E00 | | | VLE | se_andi | AND Immediate Short Form | | D | 30000000 | | | VLE | e_lbz | Load Byte and Zero | | D | 34000000 | | | VLE | e_stb | Store Byte | | D<br>RR | 38000000 | | | VLE<br>VLE | e_lha | Load Halfword Algebraic | | RR | 4100 | SR | | VLE | Se_SrW | Shift Right Word Shift Right Algebraic Word | | RR | 4200 | SK | | VLE | se_sraw<br>se_slw | Shift Left Word | | RR | 4400 | | | VLE | se_siw | OR SHort Form | | RR | 4500 | | | VLE | se_andc | AND with Complement Short Form | | RR | 4600 | SR | | VLE | se_and[.] | AND Short Form | | IM7 | 4800 | | | VLE | se_li | Load Immediate Short Form | | D | 50000000 | | | VLE | e_lwz | Load Word and Zero | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep.¹ | Priv | Cat <sup>1</sup> | Mnemonic | Instruction | |--------------|-------------------------------------------|---------------|------|------------------|-------------------------|-----------------------------------------------------------------------| | D | 54000000 | | | VLE | e_stw | Store Word | | D | 58000000 | | | VLE | e_lhz | Load Halfword and Zero | | D | 5C000000 | | | VLE | e_sth | Store Halfword | | IM5 | 6000 | | | VLE | se_bclri | Bit Clear Immediate | | IM5 | 6200 | | | VLE | se_bgeni | Bit Generate Immediate | | IM5 | 6400 | | | VLE | se_bseti | Bit Set Immediate | | IM5 | 6600 | | | VLE<br>VLE | se_btsti | Bit Test Immediate | | IM5<br>IM5 | 6800<br>6A00 | SR | | VLE | se_srwi<br>se_srawi | Shift Right Word Immediate Short Form Shift Right Algebraic Immediate | | IM5 | 6C00 | SIX | | VLE | se_slwi | Shift Left Word Immediate Short Form | | LI20 | 70000000 | | | VLE | e_li | Load Immediate | | I16A | 70008800 | SR | | VLE | e_add2i. | Add (2 operand) Immediate and Record | | I16A | 70009000 | | | VLE | e_add2is | Add (2 operand) Immediate Shifted | | IA16 | 70009800 | | | VLE | e_cmp16i | Compare Immediate Word | | I16A | 7000A000 | | | VLE | e_mull2i | Multiply (2 operand) Low Immediate | | I16A | 7000A800 | | | VLE | e_cmpl16i | Compare Logical Immediate Word | | IA16 | 7000B000 | | | VLE | e_cmph16i | Compare Halfword Immediate | | IA16 | 7000B800 | | | VLE | e_cmphl16i | Compare Halfword Logical Immediate | | I16L | 7000C000 | CL | | VLE | e_or2i | OR (2operand) Immediate | | I16L | 7000C800 | SR | | VLE | e_and2i. | AND (2 operand) Immediate | | 116L<br>116L | 7000D000 | | | VLE<br>VLE | e_or2is | OR (2 operand) Immediate Shifted<br>Load Immediate Shifted | | I16L | 7000E000<br>7000E800 | SR | | VLE | e_lis<br>e_and2is. | AND (2 operand) Immediate Shifted | | M | 74000000 | SIX | | VLE | e_rlwimi | Rotate Left Word Immediate then Mask Insert | | M | 74000000 | | | VLE | e_rlwinm | Rotate Left Word Immediate then AND with Mask | | BD24 | 78000000 | | | VLE | e_b[l] | Branch [and Link] | | BD15 | 7A000000 | CT | | VLE | e_bc[l] | Branch Conditional [and Link] | | X | 7C000000 | | | В | cmp | Compare | | X | 7C000008 | | | В | tw | Trap Word | | X | 7C00000C | | | V | lvsl | Load Vector for Shift Left Indexed | | X | 7C00000E | 0.0 | | V | lvebx | Load Vector Element Byte Indexed | | XO | 7C000010 | SR | | B | subfc[o][.] | Subtract From Carrying | | XO<br>XO | 7C000012<br>7C000014 | SR | | 64<br>B | mulhdu[.]<br>addc[o][.] | Multiply High Doubleword Unsigned Add Carrying | | XO | 7C000014<br>7C000016 | SR | | В | mulhwu[.] | Multiply High Word Unsigned | | X | 7C00001C | 0.1 | | VLE | e_cmph | Compare Halfword | | A | 7C00001E | | | B.in | isel | Integer Select | | XL | 7C000020 | | | VLE | e_mcrf | Move CR Field | | XFX | 7C000026 | | | В | mfcr | Move From Condition Register | | X | 7C000028 | | | В | lwarx | Load Word and Reserve Indexed | | X | 7C00002A | | | 64 | ldx | Load Doubleword Indexed | | X | 7C00002C | | | E | icbt | Instruction Cache Block Touch | | X<br>X | 7C00002E<br>7C000030 | SR | | B<br>B | lwzx<br>slw[.] | Load Word and Zero Indexed<br>Shift Left Word | | X | 7C000030<br>7C000034 | SR | | В | cntlzw[.] | Count Leading Zeros Word | | X | 7C000034<br>7C000036 | SR | | 64 | sld[.] | Shift Left Doubleword | | X | 7C000038 | SR | | В | and[.] | AND | | Χ | 7C00003A | | Р | E.PD | Idepx | Load Doubleword by External Process ID Indexed | | Χ | 7C00003E | | Р | E.PD | lwepx | Load Word by External Process ID Indexed | | X | 7C000040 | | | В | cmpl | Compare Logical | | XL | 7C000042 | | | VLE | e_crnor | Condition Register NOR | | X | 7C00004C | | | V | lvsr | Load Vector for Shift Right Indexed | | X<br>XO | 7C00004E | CD. | | V | lvehx | Load Vector Element Halfword Indexed | | X | 7C000050<br>7C00005C | SR | | B<br>VLE | subf[o][.]<br>e_cmphl | Subtract From<br>Compare Halfword Logical | | x | 7C00005C | | | 64 | ldux | Load Doubleword with Update Indexed | | X | 7C00006A | | | В | dcbst | Data Cache Block Store | | X | 7C00006E | | | В | lwzux | Load Word and Zero with Update Indexed | | X | 7C000070 | SR | | VLE | e_slwi[.] | Shift Left Word Immediate | | X | 7C000074 | SR | | 64 | cntlzd[.] | Count Leading Zeros Doubleword | | X | 7C000078 | SR | | В | andc[.] | AND with Complement | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep.¹ | Priv | Cat <sup>1</sup> | Mnemonic | Instruction | |----------|-------------------------------------------|---------------|--------|------------------|---------------------------|-------------------------------------------------------------------------------| | X | 7C00007C | | | WT | wait | Wait | | X | 7C000088 | | | 64 | td | Trap Doubleword | | X<br>XO | 7C00008E | SR | | V<br>64 | lvewx | Load Vector Element Word Indexed Multiply High Doubleword | | XO | 7C000092<br>7C000096 | SR | | 64<br>B | mulhd[.]<br>mulhw[.] | Multiply High Word | | X | 7C000096 | Six | Р | В | mfmsr | Move From Machine State Register | | X | 7C0000A8 | | • | 64 | Idarx | Load Doubleword and Reserve Indexed | | Χ | 7C0000AC | | | В | dcbf | Data Cache Block Flush | | Χ | 7C0000AE | | | В | lbzx | Load Byte and Zero Indexed | | X | 7C0000BE | | Р | E.PD | lbepx | Load Byte by External Process ID Indexed | | X | 7C0000CE | 0.0 | | V | lvx[l] | Load Vector Indexed [Last] | | X | 7C0000D0<br>7C0000EE | SR | | B<br>B | neg[o][.]<br>lbzux | Negate Load Byte and Zero with Update Indexed | | x | 7C0000EE<br>7C0000F4 | | | В | popentb | Population Count Bytes | | X | 7C0000F4 | SR | | В | nor[.] | NOR | | X | 7C0000FE | 0.1 | Р | E.PD | dcbfep | Data Cache Block Flush by External Process ID | | XL | 7C000102 | | | VLE | e_crandc | Condition Register AND with Completement | | Χ | 7C000106 | | Р | Е | wrtee | Write MSR External Enable | | X | 7C00010C | | М | ECL | dcbtstls | Data Cache Block Touch for Store and Lock Set | | VX | 7C00010E | 0.0 | | V | stvebx | Store Vector Element Byte Indexed | | XO<br>XO | 7C000110<br>7C000114 | SR<br>SR | | B<br>B | subfe[o][.]<br>adde[o][.] | Subtract From Extended Add Extended | | EVX | 7C000114 | SK | Р | E.PD | evlddepx | Vector Load Doubleword into Doubleword by External Pro- | | LVX | 7000011D | | ' | LiD | eviddepx | cess ID Indexed | | XFX | 7C000120 | | | В | mtcrf | Move To Condition Register Fields | | X | 7C000124 | | Р | Ē | mtmsr | Move To Machine State Register | | X | 7C00012A | | | 64 | stdx | Store Doubleword Indexed | | X | 7C00012D | | | В | stwcx. | Store Word Conditional Indexed | | X | 7C00012E | | _ | В | stwx | Store Word Indexed | | X | 7C00013A | | Р | E.PD | stdepx | Store Doubleword by External Process ID Indexed | | X | 7C00013E<br>7C000146 | | P<br>P | E.PD<br>E | stwepx<br>wrteei | Store Word by External Process ID Indexed Write MSR External Enable Immediate | | x | 7C000146<br>7C00014C | | М | ECL | dcbtls | Data Cache Block Touch and Lock Set | | VX | 7C00011E | | | V | stvehx | Store Vector Element Halfword Indexed | | Χ | 7C00016A | | | 64 | stdux | Store Doubleword with Update Indexed | | Χ | 7C00016E | | | В | stwux | Store Word with Update Indexed | | XL | 7C000182 | | | VLE | e_crxor | Condition Register XOR | | VX | 7C00018E | 0.0 | | V | stvewx | Store Vector Element Word Indexed | | XO<br>XO | 7C000190<br>7C000194 | SR<br>SR | | B<br>B | subfze[o][.] | Subtract From Zero Extended<br>Add to Zero Extended | | X | 7C000194<br>7C00019C | SIN | Р | E.PC | addze[o][.]<br>msgsnd | Message Send | | EVX | 7C00019D | | P | E.PD | evstddepx | Vector Store Doubleword into Doubleword by External Pro- | | | | | - | | | cess ID Indexed | | Χ | 7C0001AD | | | 64 | stdcx. | Store Doubleword Conditional Indexed | | X | 7C0001AE | | | В | stbx | Store Bye Indexed | | X | 7C0001BE | | Р | E.PD | stbepx | Store Byte by External Process ID Indexed | | XL | 7C0001C2 | | | VLE | e_crnand | Condition Register NAND | | X<br>VX | 7C0001CC | | М | ECL<br>V | icblc | Instruction Cache Block Lock Clear | | XO | 7C0001CE<br>7C0001D0 | SR | | v<br>B | stvx[l]<br>subfme[o][.] | Store Vector Indexed [Last] Subtract From Minus One Extended | | XO | 7C0001D0<br>7C0001D2 | SR | | 64 | mulld[o][.] | Multiply Low Doubleword | | XO | 7C0001D2 | SR | | В | addme[o][.] | Add to Minus One Extended | | XO | 7C0001D6 | SR | | В | mullw[o][.] | Multiply Low Word | | X | 7C0001DC | | Р | E.PC | msgclr | Message Clear | | X | 7C0001EC | | | В | dcbtst | Data Cache Block Touch for Store | | X | 7C0001EE | | _ | В | stbux | Store Byte with Update Indexed | | X<br>XL | 7C0001FE | | Р | E.PD<br>VLE | dcbtstep | Data Cache Block Touch for Store by External Process ID | | XFX | 7C000202<br>7C000206 | | Р | VLE<br>E | e_crand<br>mfdcrx | Condition Register AND Move From Device Control Register Indexed | | X | 7C00020E | | Р | E.PD | Ivepxl | Load Vector by External Process ID Indexed LRU | | хо | 7C000201 | | • | В | add[o][.] | Add | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | Mode<br>Dep.¹ | Priv' | Cat <sup>1</sup> | Mnemonic | Instruction | |----------|-------------------------------------------|---------------|--------|------------------|----------------------|-------------------------------------------------------------------------------------| | X | 7C00022C | | | В | dcbt | Data Cache Block Touch | | X | 7C00022E | | | B_ | lhzx | Load Halfword and Zero Indexed | | X | 7C000230 | SR | | VLE | e_rlw[.] | Rotate Left Word | | X | 7C000238 | SR | Р | B<br>E.PD | eqv[.] | Equivalent | | XL | 7C00023E<br>7C000242 | | Г | VLE | lhepx<br>e_creqv | Load Halfword by External Process ID Indexed Condition Register Equivalent | | XFX | 7C000242 | | Р | E | mfdcrux | Move From Device Control Register User-mode Indexed | | X | 7C00024E | | P | E.PD | lvepx | Load Vector by External Process ID Indexed | | X | 7C00026E | | | В | lhzux | Load Halfword and Zero with Update Indexed | | X | 7C000270 | SR | | VLE | e_rlwi[.] | Rotate Left Word Immediate | | D | 7C000278 | SR | | В | xor[.] | XOR | | X | 7C00027E | | Р | E.PD | dcbtep | Data Cache Block Touch by External Process ID | | XFX<br>X | 7C000286 | | P<br>P | E<br>E.CD | mfdcr | Move From Device Control Register | | XFX | 7C00028C<br>7C00029C | | 0 | E.PM | dcread<br>mfpmr | Data Cache Read Move From Performance Monitor Register | | XFX | 7C00029C | | 0 | B E.F.W | mfspr | Move From Special Purpose Register | | X | 7C0002AA | | | 64 | lwax | Load Word Algebraic Indexed | | X | 7C0002AE | | | В | lhax | Load Halfword Algebraic Indexed | | X | 7C0002EA | | | 64 | lwaux | Load Word Algebraic with Update Indexed | | X | 7C0002EE | | | В | lhaux | Load Halfword Algebraic with Update Indexed | | X | 7C000306 | | Р | E | mtdcrx | Move To Device Control Register Indexed | | X | 7C00030C | | M | ECL | dcblc | Data Cache Block Lock Clear | | X<br>X | 7C00032E | CD | | В | sthx | Store Halfword Indexed | | X | 7C000338<br>7C00033E | SR | Р | B<br>E.PD | orc[.]<br>sthepx | OR with Complement Store Halfword by External Process ID Indexed | | XL | 7C00033E | | ' | VLE | e_crorc | Condition Register OR with Complement | | X | 7C000342 | | | E | mtdcrux | Move To Device Control Register User-mode Indexed | | X | 7C00036E | | | B | sthux | Store Halfword with Update Indexed | | X | 7C000378 | SR | | В | or[.] | OR . | | XL | 7C000382 | | | VLE | e_cror | Condition Register OR | | XFX | 7C000386 | | Р | E | mtdcr | Move To Device Control Register | | X | 7C00038C | 0.0 | Р | E.CI | dci | Data Cache Invalidate | | XO<br>XO | 7C000392 | SR<br>SR | | 64<br>B | divdu[o][.] | Divide Doubleword Unsigned Divide Word Unsigned | | XFX | 7C000396<br>7C00039C | SIX | 0 | E.PM | divwu[o][.]<br>mtpmr | Move To Performance Monitor Register | | XFX | 7C00033C | | Ö | B | mtspr | Move To Special Purpose Register | | X | 7C0003AC | | P | Ē | dcbi | Data Cache Block Invalidate | | X | 7C0003B8 | SR | | В | nand[.] | NAND | | X | 7C0003CC | | М | ECL | icbtls | Instruction Cache Block Touch and Lock Set | | X | 7C0003CC | | Р | E.CD | dcread | Data Cache Read | | XO | 7C0003D2 | SR | | 64 | divd[o][.] | Divide Doubleword | | XO<br>X | 7C0003D6 | SR | | B<br>B | divw[o][.]<br>mcrxr | Divide Word Move To Condition Register From XER | | x | 7C000400<br>7C00042A | | | MA | Iswx | Load String Word Indexed | | X | 7C00042A<br>7C00042C | | | B | lwbrx | Load Word Byte-Reversed Indexed | | X | 7C00042C | SR | | В | srw[.] | Shift Right Word | | Χ | 7C000436 | SR | | 64 | srd[.] | Shift Right Doubleword | | Χ | 7C00046C | | Р | E | tlbsync | TLB Synchronize | | X | 7C000470 | SR | | VLE | e_srwi[.] | Shift Right Word Immediate | | X | 7C0004AA | | | MA | Iswi | Load String Word Immediate | | X | 7C0004AC | | Г | В | sync | Synchronize | | X | 7C0004BE<br>7C00052A | | Р | E.PD<br>MA | Ifdepx<br>stswx | Load Floating-Point Double by External Process ID Indexed Store String Word Indexed | | X | 7C00052A<br>7C00052C | | | IVIA<br>B | stwbrx | Store Word Byte-Reversed Indexed | | X | 7C00052C | | | MA | stswi | Store String Word Immediate | | X | 7C0005BE | | Р | E.PD | stfdepx | Store Floating-Point Double by External Process ID Indexed | | X | 7C0005EC | | | Е | dcba | Data Cache Block Allocate | | X | 7C00060E | | Р | E.PD | stvepxl | Store Vector by External Process ID Indexed LRU | | X | 7C000624 | | Р | E | tlbivax | TLB Invalidate Virtual Address Indexed | | X | 7C00062C | 0.5 | | В | lhbrx | Load Halfword Byte-Reversed Indexed | | Χ | 7C000630 | SR | | В | sraw[.] | Shift Right Algebraic Word | | Form | Opcode<br>(hexadeci-<br>mal) <sup>2</sup> | | Priv' | Cat <sup>1</sup> | Mnemonic | Instruction | | |------|-------------------------------------------|----|-------|------------------|----------|-----------------------------------------------------------|--| | Χ | 7C000634 | SR | | 64 | srad[.] | Shift Right Algebraic Doubleword | | | X | 7C00064E | | Р | E.PD | stvepx | Store Vector by External Process ID Indexed | | | X | 7C000670 | SR | | В | srawi[.] | Shift Right Algebraic Word Immediate | | | X | 7C000674 | SR | | 64 | sradi[.] | Shift Right Algebraic Doubleword Immediate | | | XFX | 7C0006AC | | | Е | mbar | Memory Barrier | | | X | 7C000724 | | Р | Е | tlbsx | TLB Search Indexed | | | X | 7C00072C | | | В | sthbrx | Store Halfword Byte-Reversed Indexed | | | X | 7C000734 | SR | | В | extsh[.] | Extend Sign Halfword | | | X | 7C000764 | | Р | Е | tlbre | TLB Read Entry | | | X | 7C000774 | SR | | В | extsb[.] | Extend Shign Byte | | | X | 7C00078C | | Р | E.CI | ici | Instruction Cache Invalidate | | | X | 7C0007A4 | | Р | E | tlbwe | TLB Write Entry | | | X | 7C0007AC | | | В | icbi | Instruction Cache Block Invalidate | | | X | 7C0007B4 | SR | | 64 | extsw[.] | Extend Sign Word | | | X | 7C0007BE | | Р | E.PD | icbiep | Instruction Cache Block Invalidate by External Process ID | | | X | 7C0007CC | | Р | E.CD | icread | Instruction Cache Read | | | X | 7C0007EC | | | В | dcbz | Data Cache Block set to Zero | | | Χ | 7C0007FE | | Ρ | E.PD | dcbzep | Data Cache Block set to Zero by External Process ID | | | XFX | 7C100026 | | | В | mfocrf | Move From One Condition Register Field | | | XFX | 7C100120 | | | В | mtocrf | Move To One Condition Register Field | | | SD4 | 8000 | | | VLE | se_lbz | Load Byte and Zero Short Form | | | SD4 | 9000 | | | VLE | se_stb | Store Byte Short Form | | | SD4 | A000 | | | VLE | se_lhz | Load Halfword and Zero Short Form | | | SD4 | B000 | | | VLE | se_sth | Store Halfword SHort Form | | | SD4 | C000 | | | VLE | se_lwz | Load Word and Zero Short Form | | | SD4 | D000 | | | VLE | se_stw | Store Word Short Form | | | BD8 | E000 | | | VLE | se_bc | Branch Conditional Short Form | | | BD8 | E800 | | | VLE | se_b[l] | Branch [and Link] | | See the key to the mode dependency and privilege column below and the key to the category column in Section 1.3.5 of Book I. <sup>2</sup>For 16-bit instructions, the "Opcode" column represents the 16-bit hexadecimal instruction encoding with the opcode and extended opcode in the corresponding fields in the instruction, and with 0's in bit positions which are not opcode bits; dashes are used following the opcode to indicate the form is a 16-bit instruction. For 32-bit instructions, the "Opcode" column represents the 32-bit hexadecimal instruction encoding with the opcode and extended opcode in the corresponding fields in the instruction, and with 0's in bit positions which are not opcode bits. #### **Mode Dependency and Privilege Abbreviations** Except as described below and in Section 1.10.3, "Effective Address Calculation", in Book I, all instructions are independent of whether the processor is in 32-bit or 64-bit mode. | Mode Dep.<br>CT | Description If the instruction tests the Count Register, it tests the low-order 32 bits in 32-bit mode | |-----------------|--------------------------------------------------------------------------------------------------------| | | and all 64 bits in 64-bit mode. | | SR | The setting of status registers (such as XER and CR0) is mode-dependent. | | 32 | The instruction must be executed only in 32-bit mode. | | 64 | The instruction must be executed only in 64-bit mode. | #### **Key to Privilege Column** #### Priv. Description P Denotes a privileged instruction. #### Priv. Description - O Denotes an instruction that is treated as privileged or nonprivileged (or hypervisor, for *mtspr*), depending on the SPR number. - M Denotes an instruction that is treated as privileged or nonprivileged, depending on the value of the UCLE bit of the MSR. - H Denotes an instruction that can be executed only in hypervisor state. **Appendices:** **Power ISA Book I-III Appendices** # Appendix A. Incompatibilities with the POWER Architecture This appendix identifies the known incompatibilities that must be managed in the migration from the POWER Architecture to the Power ISA. Some of the incompatibilities can, at least in principle, be detected by the processor, which could trap and let software simulate the POWER operation. Others cannot be detected by the processor even in principle. In general, the incompatibilities identified here are those that affect a POWER application program; incompatibilities for instructions that can be used only by POWER system programs are not necessarily discussed. # A.1 New Instructions, Formerly Privileged Instructions Instructions new to Power ISA typically use opcode values (including extended opcode) that are illegal in POWER. A few instructions that are privileged in POWER (e.g., *dclz*, called *dcbz* in Power ISA) have been made nonprivileged in Power ISA. Any POWER program that executes one of these now-valid or now-nonprivileged instructions, expecting to cause the system illegal instruction error handler or the system privileged instruction error handler to be invoked, will not execute correctly on Power ISA. ## A.2 Newly Privileged Instructions The following instructions are nonprivileged in POWER but privileged in Power ISA. mfmsr mfsr # A.3 Reserved Fields in Instructions These fields are shown with "/"s in the instruction layouts. In both POWER and Power ISA these fields are ignored by the processor. The Power ISA states that these fields must contain zero. The POWER Architecture lacks such a statement, but it is expected that essentially all POWER programs contain zero in these fields. In several cases the Power ISA assumes that reserved fields in POWER instructions indeed contain zero. The cases include the following. - bclr[I] and bcctr[I] assume that bits 19:20 in the POWER instructions contain zero. - cmpi, cmp, cmpli, and cmpl assume that bit 10 in the POWER instructions contains zero. - mtspr and mfspr assume that bits 16:20 in the POWER instructions contain zero. - mtcrf and mfcr assume that bit 11 in the POWER instructions is contains zero. - Synchronize assumes that bits 9:10 in the POWER instruction (dcs) contain zero. (This assumption provides compatibility for application programs, but not necessarily for operating system programs; see Section A.22.) - mtmsr assumes that bit 15 in the POWER instruction contains zero. ## A.4 Reserved Bits in Registers Both POWER and Power ISA permit software to write any value to these bits. However in POWER reading such a bit always returns 0, while in Power ISA reading it may return either 0 or the value that was last written to it ### A.5 Alignment Check The POWER MSR AL bit (bit 24) is no longer supported; the corresponding Power ISA MSR bit, bit 56, is reserved. The low-order bits of the EA are always used. (Notice that the value 0 — the normal value for a reserved bit — means "ignore the low-order EA bits" in POWER, and the value 1 means "use the low-order EA bits".) POWER-compatible operating system code will probably write the value 1 to this bit. ### A.6 Condition Register The following instructions specify a field in the CR explicitly (via the BF field) and also, in POWER, use bit 31 as the Record bit. In Power ISA, bit 31 is a reserved field for these instructions and is ignored by the processor. In POWER, if bit 31 contains 1 the instructions execute normally (i.e., as if the bit contained 0) except as follows: $\begin{array}{ll} \textit{cmp} & \text{CR0 is undefined if Rc=1 and BF} \neq 0 \\ \textit{cmpl} & \text{CR0 is undefined if Rc=1 and BF} \neq 0 \\ \textit{mcrxr} & \text{CR0 is undefined if Rc=1 and BF} \neq 0 \\ \end{array}$ fcmpu CR1 is undefined if Rc=1 CR1 is undefined if Rc=1 mcrfs CR1 is undefined if Rc=1 and BF≠1 #### A.7 LK and Rc Bits For the instructions listed below, if bit 31 (LK or Rc bit in POWER) contains 1, in POWER the instruction executes as if the bit contained 0 except as follows: if LK=1, the Link Register is set (to an undefined value, except for **svc**); if Rc=1, Condition Register Field 0 or 1 is set to an undefined value. In Power ISA, bit 31 is a reserved field for these instructions and is ignored by the processor. Power ISA instructions for which bit 31 is the LK bit in POWER: sc (svc in POWER) the Condition Register Logical instructions mcrf isync (ics in POWER) Power ISA instructions for which bit 31 is the Rc bit in POWER: fixed-point X-form *Load* and *Store* instructions fixed-point X-form *Compare* instructions the X-form *Trap* instruction mtspr, mfspr, mtcrf, mcrxr, mfcr, mtocrf, mfo- floating-point X-form *Load* and *Store* instructions floating-point *Compare* instructions *mcrfs* dcbz (dclz in POWER) #### A.8 BO Field POWER shows certain bits in the BO field — used by *Branch Conditional* instructions — as "x". Although the POWER Architecture does not say how these bits are to be interpreted, they are in fact ignored by the processor. Power ISA shows these bits as "z", "a", or "t". The "z" bits are ignored, as in POWER. However, the "a" and "t" bits can be used by software to provide a hint about how the branch is likely to behave. If a POWER program has the "wrong" value for these bits, the program will produce the same results as on POWER but performance may be affected. #### A.9 BH Field Bits 19:20 of the *Branch Conditional to Link Register* and *Branch Conditional to Count Register* instructions are reserved in POWER but are defined as a branch hint (BH) field in Power ISA. Because these bits are hints, they may affect performance but do not affect the results of executing the instruction. # A.10 Branch Conditional to Count Register For the case in which the Count Register is decremented and tested (i.e., the case in which BO<sub>2</sub>=0), POWER specifies only that the branch target address is undefined, with the implication that the Count Register, and the Link Register if LK=1, are updated in the normal way. Power ISA specifies that this instruction form is invalid. ### A.11 System Call There are several respects in which Power ISA is incompatible with POWER for *System Call* instructions — which in POWER are called *Supervisor Call* instructions. - POWER provides a version of the Supervisor Call instruction (bit 30 = 0) that allows instruction fetching to continue at any one of 128 locations. It is used for "fast SVCs". Power ISA provides no such version: if bit 30 of the instruction is 0 the instruction form is invalid. - POWER provides a version of the Supervisor Call instruction (bits 30:31 = 0b11) that resumes instruction fetching at one location and sets the Link Register to the address of the next instruction. Power ISA provides no such version: bit 31 is a reserved field. - For POWER, information from the MSR is saved in the Count Register. For Power ISA this information is saved in SRR1. - In POWER bits 16:19 and 27:29 of the instruction comprise defined instruction fields or a portion thereof, while in Power ISA these bits comprise reserved fields. - In POWER bits 20:26 of the instruction comprise a portion of the SV field, while in Power ISA these bits comprise the LEV field. - POWER saves the low-order 16 bits of the instruction, in the Count Register. Power ISA does not save them. - The settings of MSR bits by the associated interrupt differ between POWER and Power ISA; see POWER Processor Architecture and Book III. # A.12 Fixed-Point Exception Register (XER) Bits 48:55 of the XER are reserved in Power ISA, while in POWER the corresponding bits (16:23) are defined and contain the comparison byte for the *Ischx* instruction (which Power ISA lacks). ## A.13 Update Forms of Storage Access Instructions Power ISA requires that RA not be equal to either RT (fixed-point *Load* only) or 0. If the restriction is violated the instruction form is invalid. POWER permits these cases, and simply avoids saving the EA. ### A.14 Multiple Register Loads Power ISArequires that RA, and RB if present in the instruction format, not be in the range of registers to be loaded, while POWER permits this and does not alter RA or RB in this case. (The Power ISA restriction applies even if RA=0, although there is no obvious benefit to the restriction in this case since RA is not used to compute the effective address if RA=0.) If the Power ISA restriction is violated, either the system illegal instruction error handler is invoked or the results are boundedly undefined. The instructions affected are: Imw (Im in POWER) Iswi (Isi in POWER) Iswx (Isx in POWER) For example, an *Imw* instruction that loads all 32 registers is valid in POWER but is an invalid form in Power ISA. # A.15 Load/Store Multiple Instructions There are two respects in which Power ISA is incompatible with POWER for *Load Multiple* and *Store Multiple* instructions. - If the EA is not word-aligned, in Power ISA either an Alignment exception occurs or the addressed bytes are loaded, while in POWER an Alignment interrupt occurs if MSR<sub>AL</sub>=1 (the low-order two bits of the EA are ignored if MSR<sub>AL</sub>=0). - In Power ISA the instruction may be interrupted by a system-caused interrupt, while in POWER the instruction cannot be thus interrupted. #### A.16 Move Assist Instructions There are several respects in which Power ISA is incompatible with POWER for *Move Assist* instructions. - In Power ISA an *Iswx* instruction with zero length leaves the contents of RT undefined (if RT≠RA and RT≠RB) or is an invalid instruction form (if RT=RA or RT=RB), while in POWER the corresponding instruction (*Isx*) is a no-op in these cases. - In Power ISA an *Iswx* instruction with zero length may alter the Reference bit, and a *stswx* instruction with zero length may alter the Reference and Change bits, while in POWER the corresponding instructions (*Isx* and *stsx*) do not alter the Reference and Change bits in this case. - In Power ISA a Move Assist instruction may be interrupted by a system-caused interrupt, while in POWER the instruction cannot be thus interrupted. #### A.17 Move To/From SPR There are several respects in which Power ISA is incompatible with POWER for *Move To/From Special Purpose Register* instructions. - The SPR field is ten bits long in Power ISA, but only five in POWER (see also Section A.3, "Reserved Fields in Instructions"). - mfspr can be used to read the Decrementer in problem state in POWER, but only in privileged state in Power ISA. - If the SPR value specified in the instruction is not one of the defined values, POWER behaves as follows. - If the instruction is executed in problem state and SPR<sub>0</sub>=1, a Privileged Instruction type Program interrupt occurs. No architected registers are altered except those set by the interrupt. - Otherwise no architected registers are altered. In this same case, Power ISA behaves as follows. If the instruction is executed in problem state and spr<sub>0</sub>=1, a Privileged Instruction type Program interrupt occurs. No architected registers are altered except those set by the interrupt. (See Section 4.4.5, "Move To/From System Register Instructions" in Book III-S.). ## A.18 Effects of Exceptions on FPSCR Bits FR and FI For the following cases, POWER does not specify how FR and FI are set, while Power ISA preserves them for Invalid Operation Exception caused by a *Compare* instruction, sets FI to 1 and FR to an undefined value for disabled Overflow Exception, and clears them otherwise. - Invalid Operation Exception (enabled or disabled) - Zero Divide Exception (enabled or disabled) - Disabled Overflow Exception # A.19 Store Floating-Point Single Instructions There are several respects in which Power ISA is incompatible with POWER for *Store Floating-Point Single* instructions. - POWER uses FPSCR<sub>UE</sub> to help determine whether denormalization should be done, while Power ISA does not. Using FPSCR<sub>UE</sub> is in fact incorrect: if FPSCR<sub>UE</sub>=1 and a denormalized single-precision number is copied from one storage location to another by means of *Ifs* followed by *stfs*, the two "copies" may not be the same. - For an operand having an exponent that is less than 874 (unbiased exponent less than -149), POWER stores a zero (if FPSCR<sub>UE</sub>=0) while Power ISA stores an undefined value. ### A.20 Move From FPSCR POWER defines the high-order 32 bits of the result of *mffs* to be 0xFFFF\_FFFF, while Power ISA copies the high-order 32-bits of the FPSCR. ## A.21 Zeroing Bytes in the Data Cache The *dclz* instruction of POWER and the *dcbz* instruction of Power ISA have the same opcode. However, the functions differ in the following respects. - dclz clears a line while dcbz clears a block. - dclz saves the EA in RA (if RA≠0) while dcbz does not - *dclz* is privileged while *dcbz* is not. ### A.22 Synchronization The *Synchronize* instruction (called *dcs* in POWER) and the *isync* instruction (called *ics* in POWER) cause more pervasive synchronization in Power ISA than in POWER. However, unlike *dcs*, *Synchronize* does not wait until data cache block writes caused by preceding instructions have been performed in main storage. Also, *Synchronize* has an L field while *dcs* does not, and some uses of the instruction by the operating system require L=2<S>. (The L field corresponds to reserved bits in *dcs* and hence is expected to be zero in POWER programs; see Section A.3.) # A.23 Move To Machine State Register Instruction The *mtmsr* instruction has an L field in Power ISA but not in POWER. The function of the variant of *mtmsr* with L=1 differs from the function of the instruction in the POWER architecture in the following ways. - In Power ISA, this variant of *mtmsr* modifies only the EE and RI bits of the MSR, while in the POWER *mtmsr* modifies all bits of the MSR. - This variant of *mtmsr* is execution synchronizing in Power ISA but is context synchronizing in POWER. (The POWER architecture lacks Power ISA's distinction between execution synchronization and context synchronization. The statement in the POWER architecture specification that *mtmsr* is "synchronizing" is equivalent to stating that the instruction is context synchronizing.) Also, *mtmsr* is optional in Power ISA but required in POWER. ### A.24 Direct-Store Segments POWER's direct-store segments are not supported in Power ISA. ### A.25 Segment Register Manipulation Instructions The definitions of the four Segment Register Manipulation instructions **mtsr**, **mtsrin**, **mfsr**, and **mfsrin** differ in two respects between POWER and Power ISA. Instructions similar to **mtsrin** and **mfsrin** are called **mtsri** and **mfsri** in POWER. privilege: **mfsr** and **mfsri** are problem state instructions in POWER, while **mfsr** and **mfsrin** are privileged in Power ISA. function: the "indirect" instructions (mtsri and *mfsri*) in POWER use an RA register in computing the Segment Register number, and the computed EA is stored into RA (if RA≠0 and RA≠RT), while in Power ISA *mtsrin* and *mfsrin* have no RA field and the EA is not stored. mtsr, mtsrin (mtsri), and mfsr have the same opcodes in Power ISA as in POWER. mfsri (POWER) and mfsrin (Power ISA) have different opcodes. Also, the Segment Register Manipulation instructions are required in POWER whereas they are optional in Power ISA. ### A.26 TLB Entry Invalidation The *tlbi* instruction of POWER and the *tlbie* instruction of Power ISA have the same opcode. However, the functions differ in the following respects. - tlbi computes the EA as (RA|0) + (RB), while tlbie lacks an RA field and computes the EA and related information as (RB). - tlbi saves the EA in RA (if RA≠0), while tlbie lacks an RA field and does not save the EA. - For tlbi the high-order 36 bits of RB are used in computing the EA, while for tlbie these bits contain additional information that is not directly related to the EA - *tlbie* has an L field, while *tlbi* does not. Also, *tlbi* is required in POWER whereas *tlbie* is optional in Power ISA. ### A.27 Alignment Interrupts Placing information about the interrupting instruction into the DSISR and the DAR when an Alignment interrupt occurs is optional in Power ISA but required in POWER. ### A.28 Floating-Point Interrupts POWER uses MSR bit 20 to control the generation of interrupts for floating-point enabled exceptions, and Power ISA uses the corresponding MSR bit, bit 52, for the same purpose. However, in Power ISA this bit is part of a two-bit value that controls the occurrence, precision, and recoverability of the interrupt, while in POWER this bit is used independently to control the occurrence of the interrupt (in POWER all floating-point interrupts are precise). ### A.29 Timing Facilities #### A.29.1 Real-Time Clock The POWER Real-Time Clock is not supported in Power ISA. Instead, Power ISA provides a Time Base. Both the RTC and the TB are 64-bit Special Purpose Registers, but they differ in the following respects. - The RTC counts seconds and nanoseconds, while the TB counts "ticks". The ticking rate of the TB is implementation-dependent. - The RTC increments discontinuously: 1 is added to RTCU when the value in RTCL passes 999\_999\_999. The TB increments continuously: 1 is added to TBU when the value in TBL passes 0xFFFF FFFF. - The RTC is written and read by the *mtspr* and *mfspr* instructions, using SPR numbers that denote the RTCU and RTCL. The TB is written and read by the same instructions using different SPR numbers. - The SPR numbers that denote POWER's RTCL and RTCU are invalid in Power ISA. - The RTC is guaranteed to increment at least once in the time required to execute ten Add Immediate instructions. No analogous guarantee is made for the TB. - Not all bits of RTCL need be implemented, while all bits of the TB must be implemented. #### A.29.2 Decrementer The Power ISA Decrementer differs from the POWER Decrementer in the following respects. - The Power ISA DEC decrements at the same rate that the TB increments, while the POWER DEC decrements every nanosecond (which is the same rate that the RTC increments). - Not all bits of the POWER DEC need be implemented, while all bits of the Power ISA DEC must be implemented. - The interrupt caused by the DEC has its own interrupt vector location in Power ISA, but is considered an External interrupt in POWER. #### A.30 Deleted Instructions The following instructions are part of the POWER Architecture but have been dropped from the Power ISA. | abs | Absolute | |---------|-----------------------------------------| | clcs | Cache Line Compute Size | | clf | Cache Line Flush | | cli (*) | Cache Line Invalidate | | dclst | Data Cache Line Store | | div | Divide | | divs | Divide Short | | doz | Difference Or Zero | | dozi | Difference Or Zero Immediate | | Iscbx | Load String And Compare Byte Indexed | | maskg | Mask Generate | | maskir | Mask Insert From Register | | mfsri | Move From Segment Register Indirect | | mul | Multiply | | nabs | Negative Absolute | | rac (*) | Real Address Compute | | rfi (*) | Return From Interrupt | | rfsvc | Return From SVC | | rlmi | Rotate Left Then Mask Insert | | rrib | Rotate Right And Insert Bit | | sle | Shift Left Extended | | sleq | Shift Left Extended With MQ | | sliq | Shift Left Immediate With MQ | | slliq | Shift Left Long Immediate With MQ | | sllq | Shift Left Long With MQ | | slq | Shift Left With MQ | | sraiq | Shift Right Algebraic Immediate With MQ | | sraq | Shift Right Algebraic With MQ | | sre | Shift Right Extended | | srea | Shift Right Extended Algebraic | | sreq | Shift Right Extended With MQ | | sriq | Shift Right Immediate With MQ | | srliq | Shift Right Long Immediate With MQ | | srlq | Shift Right Long With MQ | | srq | Shift Right With MQ | <sup>(\*)</sup> This instruction is privileged. **Note:** Many of these instructions use the MQ register. The MQ is not defined in the Power ISA. ## **A.31 Discontinued Opcodes** The opcodes listed below are defined in the POWER Architecture but have been dropped from the Power ISA. The list contains the POWER mnemonic (MNEM), the primary opcode (PRI), and the extended opcode (XOP) if appropriate. The corresponding instructions are reserved in Power ISA. | MNEM | PRI | XOP | |---------|-----|-----| | abs | 31 | 360 | | clcs | 31 | 531 | | clf | 31 | 118 | | cli (*) | 31 | 502 | | dclst | 31 | 630 | | div | 31 | 331 | | divs | 31 | 363 | | doz | 31 | 264 | | dozi | 09 | - | | Iscbx | 31 | 277 | | maskg | 31 | 29 | | maskir | 31 | 541 | | mfsri | 31 | 627 | | mul | 31 | 107 | | nabs | 31 | 488 | | rac (*) | 31 | 818 | | rfi (*) | 19 | 50 | | rfsvc | 19 | 82 | | rlmi | 22 | - | | rrib | 31 | 537 | | sle | 31 | 153 | | sleq | 31 | 217 | | sliq | 31 | 184 | | slliq | 31 | 248 | | sllq | 31 | 216 | | slq | 31 | 152 | | sraiq | 31 | 952 | | sraq | 31 | 920 | | sre | 31 | 665 | | srea | 31 | 921 | | sreq | 31 | 729 | | sriq | 31 | 696 | | srliq | 31 | 760 | | srlq | 31 | 728 | | srq | 31 | 664 | <sup>(\*)</sup> This instruction is privileged. #### Assembler Note — It might be helpful to current software writers for the Assembler to flag the discontinued POWER instructions. ### A.32 POWER2 Compatibility The POWER2 instruction set is a superset of the POWER instruction set. Some of the instructions added for POWER2 are included in the Power ISA. Those that have been renamed in the Power ISA are listed in this section, as are the new POWER2 instructions that are not included in the Power ISA. Other incompatibilities are also listed. ### A.32.1 Cross-Reference for Changed POWER2 Mnemonics The following table lists the new POWER2 instruction mnemonics that have been changed in the Power ISA User Instruction Set Architecture, sorted by POWER2 mnemonic. To determine the Power ISA mnemonic for one of these POWER2 mnemonics, find the POWER2 mnemonic in the second column of the table: the remainder of the line gives the Power ISA mnemonic and the page on which the instruction is described, as well as the instruction names. POWER2 mnemonics that have not changed are not listed. | Page | | POWER2 | Power ISA | | | |------|----------|-------------------------------------------------------|-----------|---------------------------------------------------------|--| | Page | Mnemonic | Instruction | Mnemonic | Instruction | | | 135 | fcir[.] | Floating Convert Double to Integer with Round | fctiw[.] | Floating Convert To Integer Word | | | 136 | fcirz[.] | Floating Convert Double to Integer with Round to Zero | fctiwz[.] | Floating Convert To Integer Word with round toward Zero | | ## A.32.2 Load/Store Floating-Point Double Several of the opcodes for the Load/Store Floating-Point Quad instructions of the POWER2 architecture have been reclaimed by the Load/Store Foating-Point Double [Indexed] instructions (entries with a '-' in the Power ISA column have not been reclaimed): #### **MNEMONIC** | POWER2 | <b>POWER ISA</b> | PRI | XOP | |--------|------------------|-----|-----| | Ifq | lq | 56 | - | | Ifqu | lfdp | 57 | 0 | | Ifqux | - | 31 | 823 | | Ifqx | lfdpx | 31 | 791 | | stfq | - | 60 | - | | stfqu | stfdp | 61 | - | | stfqux | - | 31 | 951 | | stfqx | stfdpx | 31 | 919 | Differences between the **I**/stfdp[x] instructions and the POWER2 **I**/stfq[u][x] instructions include the following. - The storage operand for the I/stfdp[x] instructions must be quadword aligned for optimal performance. - The register pairs for the \( \lambda \) stfdp[x] instructions must be even-odd pairs, instead of any consecutive pair. - The I/stfdp[x] instructions do not have update forms. ## A.32.3 Floating-Point Conversion to Integer The *fcir* and *fcirz* instructions of POWER2 have the same opcodes as do the *fctiw* and *fctiwz* instructions, respectively, of Power ISA. However, the functions differ in the following respects. - fcir and fcirz set the high-order 32 bits of the target FPR to 0xFFFF\_FFFF, while fctiw and fctiwz set them to an undefined value. - Except for enabled Invalid Operation Exceptions, fcir and fcirz set the FPRF field of the FPSCR based on the result, while fctiw and fctiwz set it to an undefined value. - fcir and fcirz do not affect the VXSNAN bit of the FPSCR, while fctiw and fctiwz do. - fcir and fcirz set FPSCR<sub>XX</sub> to 1 for certain cases of "Large Operands" (i.e., operands that are too large to be represented as a 32-bit signed fixedpoint integer), while fctiw and fctiwz do not alter it for any case of "Large Operand". (The IEEE standard requires not altering it for "Large Operands".) #### A.32.4 Floating-Point Interrupts POWER2 uses MSR bits 20 and 23 to control the generation of interrupts for floating-point enabled exceptions, and Power ISA uses the corresponding MSR bits, bits 52 and 55, for the same purpose. However, in Power ISA these bits comprise a two-bit value that controls the occurrence, precision, and recoverability of the interrupt, while in POWER2 these bits are used independently to control the occurrence (bit 20) and the precision (bit 23) of the interrupt. Moreover, in Power ISA all floating-point interrupts are considered Program interrupts, while in POWER2 imprecise floating-point interrupts have their own interrupt vector location. #### A.32.5 Trace The Trace interrupt vector location differs between the two architectures, and there are many other differences. #### A.33 Deleted Instructions The following instructions are new in POWER2 implementations of the POWER Architecture but have been dropped from the Power ISA. | lfq | Load Floating-Point Quad | |--------|-----------------------------------------------| | lfqu | Load Floating-Point Quad with Update | | Ifqux | Load Floating-Point Quad with Update Indexed | | lfqx | Load Floating-Point Quad Indexed | | stfq | Store Floating-Point Quad | | stfqu | Store Floating-Point Quad with Update | | stfqux | Store Floating-Point Quad with Update Indexed | | stfqx | Store Floating-Point Quad Indexed | ### A.33.1 Discontinued Opcodes The opcodes listed below are new in POWER2 implementations of the POWER Architecture but have been dropped from the Power ISA. The list contains the POWER2 mnemonic (MNEM), the primary opcode (PRI), and the extended opcode (XOP) if appropriate. The instructions are either illegal or reserved in Power ISA; see Appendix D. | MNEM | PRI | XOP | |-------|-----|-----| | Ifq | 56 | - | | Ifqx | 31 | 791 | | stfqx | 31 | 919 | ## **Appendix B. Platform Support Requirements** As described in Chapter 1 of Book I, the architecture is structured as a collection of categories. Each category is comprised of facilities and/or instructions that together provide a unit of functionality. The Server and Embedded categories are referred to as "special" because all implementations must support at least one of these categories. Each special category, when taken together with the Base category, is referred to as an "environment", and provides the minimum functionality required to develop operating systems and applications. Every processor implementation supports at least one of the environments, and may also support a set of categories chosen based on the target market for the implementation. However, a Server implementation supports only those categories designated as part of the Server platform in Figure 20. To facilitate the development of operating systems and applications for a well-defined purpose or customer set, usually embodied in a unique hardware platform, this appendix documents the association between a platform and the set of categories it requires. Adding a new platform may permit cost-performance optimization by clearly identifying a unique set of categories. However, this has the potential to fragment the application base. As a result, new platforms will be added only when the optimization benefit clearly outweighs the loss due to fragmentation. The platform support requirements are documented in Figure 20. An "x" in a column indicates that the category is required. A "+" in a column indicates that the requirement is being phased in. | Category | Server Plat-<br>form | Embedde<br>Platform | |----------------------------------------------------------------------------------------------------------------------|----------------------|---------------------| | Base | Х | Х | | Server | х | | | Embedded | | Х | | Alternate Time Base | | | | BCD Assistance | | | | Cache Specification | | | | Decimal Floating-Point | x <sup>2</sup> | | | Embedded.Cache Debug | | | | Embedded.Cache Initialization | | | | Embedded.Enhanced Debug | | | | Embedded.External PID | | | | Embedded.Little-Endian | | | | Embedded.MMU Type FSL | | | | Embedded.Performance Monitor | | | | Embedded.Processor Control | | | | Embedded Cache Locking | | | | External Control | | | | External Proxy | | | | Floating-Point Floating-Point.Record | x<br>x | | | Hypervisor Emulation Assistance | | | | Legacy Move Assist | | | | Legacy Integer Multiply-Accumulate | | | | Load/Store Quadword | x <sup>3</sup> | | | Memory Coherence | х | | | Move Assist | х | | | Processor Compatibility | | | | Server.Performance Monitor | х | | | Signal Processing Engine SPE.Embedded Float Scalar Double SPE.Embedded Float Scalar Single SPE.Embedded Float Vector | | | | Stream | х | | | Trace | х | | | Variable Length Encoding | | | | Vector<br>Vector.Little-Endian | +<br>+ <sup>1</sup> | | | Wait 1. If the Vector category is supported, V | actor Little Endian | is required | <sup>1.</sup> If the Vector category is supported, Vector.Little-Endian is required on the Server platform. Figure 20. Platform Support Requirements (Sheet 1 of 2) <sup>2.</sup> The Decimal Floating-Point category may be emulated through support for the BCD Assistance and Hypervisor Emulation Assistance categories. <sup>3.</sup> Optional for the Server Platform. | Category | Server Plat-<br>form | Embedded<br>Platform | |----------|----------------------|----------------------| | 64-Bit | х | | - 1. If the Vector category is supported, Vector.Little-Endian is required on the Server platform. - The Decimal Floating-Point category may be emulated through support for the BCD Assistance and Hypervisor Emulation Assistance categories. - 3. Optional for the Server Platform. Figure 20. Platform Support Requirements (Sheet 2 of 2) #### Programming Note - The requirement to support the Hypervisor Emulation Assistance and BCD Assistance categories if the Decimal Floating-Point category is emulated is to provide a minimum level of performance. ## **Appendix C. Complete SPR List** This appendix lists all the Special Purpose Registers in the Power ISA , $\,$ ordered by SPR number. | decimal SPR <sup>1</sup> | | Register | Privil | eged | Length | Cat <sup>2</sup> | | |--------------------------|---------------------------------------|-----------|-------------------|-------------------|--------|------------------|--| | decimai | spr <sub>5:9</sub> spr <sub>0:4</sub> | Name | mtspr | mfspr | (bits) | Cat | | | 1 | 00000 00001 | XER | no | no | 64 | В | | | 8 | 00000 01000 | LR | no | no | 64 | В | | | 9 | 00000 01001 | CTR | no | no | 64 | В | | | 17 | 00000 10001 | DSCR | yes | yes | 64 | S | | | 18 00000 10010 <b>[</b> | | DSISR | yes | yes | 32 | S | | | 19 | 00000 10011 | DAR | yes | yes | 64 | S | | | 22 | 00000 10110 | DEC | yes | yes | 32 | В | | | 25 | 00000 11001 | SDR1 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 26 | 00000 11010 | SRR0 | yes | yes | 64 | В | | | 27 | 00000 11011 | SRR1 | yes | yes | 64 | В | | | 28 | 00000 11100 | CFAR | yes | yes | 64 | S | | | 29 | 00000 11101 | AMR | yes | yes | 64 | S | | | 48 | 00001 10000 | PID | yes | yes | 32 | Е | | | 54 | 00001 10110 | DECAR | yes | yes | 32 | Е | | | 58 | 00001 11010 | CSRR0 | yes | yes | 64 | Е | | | 59 | 00001 11011 | CSRR1 | yes | yes | 32 | Е | | | 61 | 00001 11101 | DEAR | yes | yes | 64 | Е | | | 62 | 00001 11110 | ESR | yes | yes | 32 | Е | | | 63 | 00001 11111 | IVPR | yes | yes | 64 | Е | | | 136 | 00100 01000 | CTRL | - | no | 32 | S | | | 152 | 00100 11000 | CTRL | yes | - | 32 | S | | | 256 | 01000 00000 | VRSAVE | no | no | 32 | E,V | | | 259 | 01000 00011 | SPRG3 | - | no | 64 | В | | | 260-263 | 01000 001xx | SPRG[4-7] | - | no | 64 | Е | | | 268 | 01000 01100 | TB | - | no | 64 | В | | | 269 | 01000 01101 | TBU | - | no | 32 | В | | | 272-275 | 01000 100xx | SPRG[0-3] | yes | yes | 64 | В | | | 276-279 | 01000 101xx | SPRG[4-7] | yes | yes | 64 | Е | | | 282 | 01000 11010 | EAR | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | EC | | | 284 | 01000 11100 | TBL | hypv <sup>4</sup> | - | 32 | В | | | 285 | 01000 11101 | TBU | hypv <sup>4</sup> | - | 32 | В | | | 286 | 01000 11110 | TBU40 | hypv | - | 64 | S | | | 286 | 01000 11110 | PIR | - | yes | 32 | Ε | | | 287 | 01000 11111 | PVR | | yes | 32 | В | | | 304 | 01001 10000 | HSPRG0 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 304 | 01001 10000 | DBSR | yes <sup>5</sup> | yes | 32 | Е | | | 305 | 01001 10001 | HSPRG1 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 306 | 01001 10010 | HDSISR | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | В | | | 307 | 01001 10011 | HDAR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | В | | Figure 21. SPR Numbers (Sheet 1 of 3) | deelmed | SPR <sup>1</sup> | Register | Privileged | | Length | Cat <sup>2</sup> | | |---------|---------------------------------------|-------------|---------------------|-------------------|--------|------------------|--| | decimal | spr <sub>5:9</sub> spr <sub>0:4</sub> | Name | mtspr | mfspr | (bits) | Cat- | | | 308 | 01001 10100 | DBCR0 | yes | yes | 32 | Е | | | 308 | 01001 10100 | SPURR | hypv <sup>3</sup> | yes | 64 | S | | | 309 | 01001 10101 | PURR | hypv <sup>3</sup> | yes | 64 | S | | | 309 | 01001 10101 | DBCR1 | yes | yes | 32 | E | | | 310 | 01001 10110 | HDEC | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | S | | | 310 | 01001 10110 | DBCR2 | yes | yes | 32 | E | | | 312 | 01001 11000 | RMOR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 312 | 01001 11000 | IAC1 | yes | yes | 64 | E | | | 313 | 01001 11001 | HRMOR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 313 | 01001 11001 | IAC2 | yes | yes | 64 | E | | | 314 | 01001 11010 | HSRR0 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 314 | 01001 11010 | IAC3 | yes | yes | 64 | E | | | 315 | 01001 11010 | HSRR1 | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 315 | | IAC4 | | | 64 | E | | | | 01001 11011 | DAC1 | yes | yes | 64 | E | | | 316 | 01001 11100 | | yes | yes | | E | | | 317 | 01001 11101 | DAC2 | yes | yes | 64 | | | | 318 | 01001 11110 | LPCR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 318 | 01001 11110 | DVC1 | yes | yes | 64 | E | | | 319 | 01001 11111 | LPIDR | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | S | | | 319 | 01001 11111 | DVC2 | yes | yes | 64 | E | | | 336 | 01010 10000 | TSR | yes <sup>5</sup> | yes | 32 | E | | | 336 | 01010 10000 | HMER | hypv <sup>3,8</sup> | hypv <sup>3</sup> | 64 | S | | | 337 | 01010 10001 | HMEER | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 338 | 01010 10010 | PCR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | | 339 | 01010 10011 | HEIR | hypv <sup>3</sup> | hypv <sup>3</sup> | 32 | HEA | | | 340 | 01010 10100 | TCR | yes | yes | 32 | Е | | | 400-415 | 01100 1xxxx | IVOR[0-15] | yes | yes | 32 | Е | | | 512 | 10000 00000 | SPEFSCR | no | no | 32 | SP | | | 526 | 10000 01110 | ATB/ATBL | - | no | 64 | ATB | | | 527 | 10000 01111 | ATBU | - | no | 32 | ATB | | | 528 | 10000 10000 | IVOR32 | yes | yes | 32 | SP | | | 529 | 10000 10001 | IVOR33 | yes | yes | 32 | SP | | | 530 | 10000 10010 | IVOR34 | yes | yes | 32 | SP | | | 531 | 10000 10011 | IVOR35 | yes | yes | 32 | E.PM | | | 532 | 10000 10100 | IVOR36 | yes | yes | 32 | E.PC | | | 533 | 10000 10101 | IVOR37 | yes | yes | 32 | E.PC | | | 570 | 10001 11010 | MCSRR0 | yes | yes | 64 | E | | | 571 | 10001 11011 | MCSRR1 | yes | yes | 32 | E | | | 572 | 10001 11100 | MCSR | yes | yes | 64 | E | | | 574 | 10001 11110 | DSRR0 | yes | yes | 64 | E.ED | | | 575 | 10001 11111 | DSRR1 | yes | yes | 32 | E.ED | | | 604 | 10001 11111 | SPRG8 | yes | yes | 64 | E | | | 605 | 10010 11100 | SPRG9 | yes | yes | 64 | E.ED | | | 624 | 10010 11101 | MAS0 | yes | yes | 32 | E.MF | | | 625 | 10011 10000 | MAS1 | • | • | 32 | E.MF | | | 626 | 10011 10001 | MAS2 | yes | yes | 64 | E.MF | | | 627 | | MAS3 | yes | yes | 32 | E.MF | | | 628 | 10011 10011 | | yes | yes | 32 | E.MF | | | | 10011 10100 | MAS4 | yes | yes | 32 | E.MF | | | 630 | 10011 10110 | MAS6 | yes | yes | | | | | 633 | 10011 11001 | PID1 | yes | yes | 32 | E.MF | | | 634 | 10011 11010 | PID2 | yes | yes | 32 | E.MF | | | 688-691 | 10101 100xx | TLB[0-3]CFG | yes | yes | 32 | E.MF | | | 702 | 10101 11110 | EPR | - | yes | 32 | EXP | | | 768-783 | 11000 0xxxx | perf_mon | | | 64 | S.PM | | Figure 21. SPR Numbers (Sheet 2 of 3) | decimal | SPR <sup>1</sup> | Register | Privil | eged | Length | Cat <sup>2</sup> | |---------|---------------------------------------|----------|-------------------|-------------------|--------|------------------| | decimal | spr <sub>5:9</sub> spr <sub>0:4</sub> | Name | mtspr | mfspr | (bits) | Cat | | 784-799 | 11000 1xxxx | perf_mon | varies | yes | 64 | S.PM | | 896 | 11100 00000 | PPR | no | no | 64 | S | | 924 | 11100 11100 | DCDBTRL | _6 | yes | 32 | E.CD | | 925 | 11100 11101 | DCDBTRH | _6 | yes | 32 | E.CD | | 926 | 11100 11110 | ICDBTRL | _7 | yes | 32 | E.CD | | 927 | 11100 11111 | ICDBTRH | _7 | yes | 32 | E.CD | | 944 | 11101 10000 | MAS7 | yes | yes | 32 | E.MF | | 947 | 11101 10011 | EPLC | yes | yes | 32 | E.PD | | 948 | 11101 10100 | EPSC | yes | yes | 32 | E.PD | | 979 | 11110 10011 | ICDBDR | -7 | yes | 32 | E.CD | | 1012 | 11111 10100 | MMUCSR0 | yes | yes | 32 | E.MF | | 1013 | 11111 10101 | DABR | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | 1015 | 11111 10111 | DABRX | hypv <sup>3</sup> | hypv <sup>3</sup> | 64 | S | | 1015 | 11111 10111 | MMUCFG | yes | yes | 32 | E.MF | | 1023 | 11111 11111 | PIR | - | yes | 32 | S | - This register is not defined for this instruction. - Note that the order of the two 5-bit halves of the SPR number is reversed. - <sup>2</sup> See Section 1.3.5 of Book I. - This register is a hypervisor resource, and can be modified by this instruction only in hypervisor state (see Chapter 2 of Book III-S). - 4 <S>This register is a hypervisor resource, and can be modified by this instruction only in hypervisor state (see Chapter 2 of Book III-S). <E>This register is privileged. - This register cannot be directly written to. Instead, bits in the register corresponding to 1 bits in (RS) can be cleared using mtspr SPR,RS. - <sup>6</sup> The register can be written by the *dcread* instruction. - <sup>7</sup> The register can be written by the *icread* instruction. - <sup>8</sup> This register cannot be directly written. Instead, bits in the register corresponding to 0 bits in (RS) can be cleared using *mtspr SPR*,RS. All SPR numbers that are not shown above and are not implementation-specific are reserved. Figure 21. SPR Numbers (Sheet 3 of 3) ## **Appendix D. Illegal Instructions** With the exception of the instruction consisting entirely of binary 0s, the instructions in this class are available for future extensions of the Power ISA; that is, some future version of the Power ISA may define any of these instructions to perform new functions. The following primary opcodes are illegal. 1, 5, 6, 60 The following primary opcodes have unused extended opcodes. Their unused extended opcodes can be determined from the opcode maps in Appendix F of Book Appendices. All unused extended opcodes are illegal. 4, 19, 30, 31, 56, 57, 58, 59, 62, 63 An instruction consisting entirely of binary 0s is illegal, and is guaranteed to be illegal in all future versions of this architecture. ## **Appendix E. Reserved Instructions** The instructions in this class are allocated to specific purposes that are outside the scope of the Power ISA. The following types of instruction are included in this class. - The instruction having primary opcode 0, except the instruction consisting entirely of binary 0s (which is an illegal instruction; see Section 1.7.2, "Illegal Instruction Class" on page 21) and the extended opcode shown below. - 256 Service Processor "Attention" - 2. Instructions for the POWER Architecture that have not been included in the Power ISA. These are listed in Section A.31, "Discontinued Opcodes" and Section A.33.1, "Discontinued Opcodes". - 3. Implementation-specific instructions used to conform to the Power ISA specification. - 4. Any other implementation-dependent instructions that are not defined in the Power ISA. ## Appendix F. Opcode Maps This appendix contains tables showing the opcodes and extended opcodes. For the primary opcode table (Table 3 on page 868), each cell is in the following format. | Opcode in Decimal | Opcode in<br>Hexadecimal | |-------------------|--------------------------| | | Instruction<br>Mnemonic | | Category | Instruction<br>Format | The category abbreviations are shown on Section 1.3.5 of Book I. However, the categories "Phased-In", "Phased-Out", and floating-point "Record" are not listed in the opcode tables. The extended opcode tables show the extended opcode in decimal, the instruction mnemonic, the category, and the instruction format. These tables appear in order of primary opcode within three groups. The first group consists of the primary opcodes that have small extended opcode fields (2-4 bits), namely 30, 58, and 62. The second group consists of primary opcodes that have 11-bit extended opcode fields. The third group consists of primary opcodes that have 10-bit extended opcode fields. The tables for the second and third groups are rotated. In the extended opcode tables several special markings are used. - A prime (') following an instruction mnemonic denotes an additional cell, after the lowest-numbered one, used by the instruction. For example, *subfc* occupies cells 8 and 520 of primary opcode 31, with the former corresponding to OE=0 and the latter to OE=1. Similarly, *sradi* occupies cells 826 and 827, with the former corresponding to sh<sub>5</sub>=0 and the latter to sh<sub>5</sub>=1 (the 9-bit extended opcode 413, shown on page 91, excludes the sh<sub>5</sub> bit). - Two vertical bars (||) are used instead of primed mnemonics when an instruction occupies an entire column of a table. The instruction mnemonic is repeated in the last cell of the column. - For primary opcode 31, an asterisk (\*) in a cell that would otherwise be empty means that the cell is reserved because it is "overlaid", by a fixed-point or *Storage Access* instruction having only a primary opcode, by an instruction having an extended opcode in primary opcode 30, 58, or 62, or by a potential instruction in any of the categories just mentioned. The overlaying instruction, if any, is also shown. A cell thus reserved should not be assigned to an instruction having primary opcode 31. (The overlaying is a consequence of opcode decoding for fixed-point instructions: the primary opcode, and the extended opcode if any, are mapped internally to a 10-bit "compressed opcode" for ease of subsequent decoding.) - Parentheses around the opcode or extended opcode mean that the instruction was defined in earlier versions of the Power ISA but is no longer defined in the Power ISA. - Curly brackets around the opcode or extended opcode mean that the instruction will be defined in future versions of the Power ISA. - long is used as filler for mnemonics that are longer than a table cell. An empty cell, a cell containing only an asterisk, or a cell in which the opcode or extended opcode is parenthesized, corresponds to an illegal instruction. The instruction consisting entirely of binary 0s causes the system illegal instruction error handler to be invoked for all members of the POWER family, and this is likely to remain true in future models (it is guaranteed in the Power ISA). An instruction having primary opcode 0 but not consisting entirely of binary 0s is reserved except for the following extended opcode (instruction bits 21:30). 256 Service Processor "Attention" (Power ISA only) | Tab | le 3: Pri | mar | y opc | codes | | | | | | | | | |---------|---------------------|-----------|----------|----------|----------|----------|------------------|---------------|-----|---------------------|-----|-------------------------------------------------------------------------------------------------------| | 0 | | 00 | | | 01 | 2 | | 02 | 3 | | 03 | See primary opcode 0 extensions on page 867 | | F | Illegal,<br>Reserve | d | | | | 64 | tdi | D | В | twi | D | Trap Doubleword Immediate<br>Trap Word Immediate | | 4<br>Ve | ctor, LIV | 04<br>1A, | 5 | | 05 | 6 | | 06 | | mulli | 07 | See Table 8 and Table 9 | | | SP<br>MA, SF | | | | | | | | BD | | | Multiply Low Immediate | | 8 | IVIA, SF | 08 | 9 | | 09 | 10 | | 0A | | | 0B | Multiply Low Immediate Subtract From Immediate Carrying | | | subfic | | | | | | cmpli | | | cmpi | | , , | | В | | D | | | | В | | D | В | | D | Compare Logical Immediate<br>Compare Immediate | | 12 | addic | 0C | 13 | addic. | 0D | 14 | addi | 0E | 15 | addis | 0F | Add Immediate Carrying Add Immediate Carrying and Record | | В | | D | В | | D | В | | D | В | | D | Add Immediate<br>Add Immediate Shifted | | 16 | | 10 | 17 | | 11 | 18 | | 12 | 19 | | 13 | Branch Conditional | | | bc | | | SC | | | b | | | CR ops, etc. | | System Call<br>Branch | | В | | | В | | SC | | | I | | | XL | See Table 11 on page 880 | | 20 | rlwimi | 14 | 21 | rlwinm | 15 | 22 | | 16 | 23 | rlwnm | 17 | Rotate Left Word Imm. then Mask Insert<br>Rotate Left Word Imm. then AND with Mask | | В | | М | | | М | | | | В | | М | Rotate Left Word then AND with Mask | | 24 | ori | 18 | 25 | oris | 19 | 26 | xori | 1A | 27 | xoris | 1B | OR Immediate OR Immediate Shifted XOR Immediate | | В | | D | В | | D | В | | D | В | | D | XOR Immediate Shifted | | 28 | andi. | 1C | 29 | andis. | 1D | 30<br>FX | Dwd R | 1E<br>ot | 31 | FX | 1F | AND Immediate AND Immediate Shifted | | В | G G | D | В | a. raio. | D | . , , | | D[S] | Ext | tended C | )ps | See Table 4 on page 869<br>See Table 11 on page 880 | | 32 | lwz | 20 | 33 | lwzu | 21 | 34 | lbz | 22 | 35 | lbzu | 23 | Load Word and Zero Load Word and Zero with Update | | В | | D | В | | D | В | | D | В | | D | Load Byte and Zero<br>Load Byte and Zero with Update | | 36 | -4 | 24 | 37 | -4 | 25 | 38 | -41- | 26 | 39 | - 41 | 27 | Store Word | | | stw | | | stwu | | | stb | | | stbu | | Store Word with Update<br>Store Byte | | В | | | В | | | В | | D | | | D | Store Byte with Update | | 40 | lhz | | 41 | lhzu | 29 | 42 | lha | 2A | 43 | lhau | 2B | Load Half and Zero<br>Load Half and Zero with Update<br>Load Half Algebraic | | В | | | В | | D | | | | В | | D | Load Half Algebraic with Update | | 44 | sth | 2C | 45 | sthu | 2D | 46 | lmw | 2E | 47 | stmw | 2F | Store Half<br>Store Half with Update<br>Load Multiple Word | | В | | D | В | | D | В | | D | В | | D | Store Multiple Word | | 48 | lfs | 30 | 49 | lfsu | 31 | 50 | lfd | 32 | 51 | lfdu | 33 | Load Floating-Point Single<br>Load Floating-Point Single with Update<br>Load Floating-Point Double | | FP | | | FP | | | FP | | | FP | | D | Load Floating-Point Double with Update | | 52 | stfs | 34 | 53 | stfsu | 35 | 54 | stfd | 36 | 55 | stfdu | 37 | Store Floating-Point Single<br>Store Floating-Point Single with Update<br>Store Floating-Point Double | | FP | | | FP | | | FP | | | FP | | D | Store Floating-Point Double with Update | | 56 | lq | | 57 | | 39 | 58<br>FX | DS-for | m | | FP Single<br>DFP Op | | Load Quadword<br>See Table 5 on page 869<br>See Table 6 on page 869 | | LSC | ) | DQ | 04 | | 0.0 | 00 | | DS | 00 | | ٥٦ | See Table 16 on page 884 | | 60 | | 3C | 61<br>FP | stfdp | 3D<br>DS | FX | DS-for<br>Stores | 3E<br>m<br>DS | 8 | P Doubl | | Store Floating-Point Double Pair<br>See Table 7 on page 869<br>See Table 17 on page 886 | | | | | ı | | | | | | | | | DOD . GIOTO IT OIT Pago 000 | | | Table 4: Extended opcodes for primary opcode 30 (instruction bits 27:30) | | | | | | | | | | | | | |----|--------------------------------------------------------------------------|--------------------------------|--------------------------------|----------------------------|--|--|--|--|--|--|--|--|--| | (. | 00 01 10 11 | | | | | | | | | | | | | | 00 | 0<br><b>rldicl</b><br>64<br>MD | 1<br><i>rldicl</i> '<br>MD | 2<br><b>rldicr</b><br>64<br>MD | 3<br><i>rldicr'</i><br>MD | | | | | | | | | | | 01 | 4<br><b>rldic</b><br>64<br>MD | 5<br><b>rldic</b> '<br>MD | 6<br><b>rldimi</b><br>64<br>MD | 7<br><b>rldimi</b> '<br>MD | | | | | | | | | | | 10 | 8<br><b>rldcl</b><br>64<br>MDS | 9<br><b>rldcr</b><br>64<br>MDS | | | | | | | | | | | | | 11 | | | | | | | | | | | | | | | Table 5: Extended opcodes for primary opcode 57 | | | | | | | | | | | |-------------------------------------------------|------------------------------|--|--|--|--|--|--|--|--|--| | (instruction bits 30:31) | | | | | | | | | | | | | 1 | | | | | | | | | | | 0 | 0<br><i>Ifdp</i><br>FP<br>DS | | | | | | | | | | | 1 | | | | | | | | | | | | Table 6: Extended opcodes for primary opcode 58 (instruction bits 30:31) | | | | | | | | | | | |--------------------------------------------------------------------------|-----------------------------|-----------------------------|--|--|--|--|--|--|--|--| | 0 1 | | | | | | | | | | | | 0 | 0<br><b>Id</b><br>64<br>DS | 1<br><i>Idu</i><br>64<br>DS | | | | | | | | | | 1 | 2<br><b>Iwa</b><br>64<br>DS | | | | | | | | | | | Table 7: Extended opcodes for primary opcode 62 | | | | | | | | | | | |-------------------------------------------------|-----|------|--|--|--|--|--|--|--|--| | (instruction bits 30:31) | | | | | | | | | | | | 0 1 | | | | | | | | | | | | | 0 | 1 | | | | | | | | | | | std | stdu | | | | | | | | | | 0 | 64 | 64 | | | | | | | | | | | DS | DS | | | | | | | | | | | 2 | | | | | | | | | | | 1 | stq | | | | | | | | | | | ' | LSQ | | | | | | | | | | | | DS | | | | | | | | | | | Table | e 8: (Le | eft) Exte | ended o | opcodes for pi | imary o | pcode | 4 [Cate | gory: \ | / & LM | A] (inst | ruction | bits 21 | :31) | | | |-------|--------------------------------|-----------|-------------------------------|-------------------------------|---------|---------------------------------|---------|----------------------------------|--------|--------------------------------|---------|--------------------------------|--------|--------------------------------|--------| | | Λ | 000001 | 2 | 000011 000100 | 000101 | 6 | | 8 | | 10 | 001011 | | 001101 | 001110<br>14 | 001111 | | 00000 | vaddubm<br>V VX | | vmaxub<br>V VX | v <b>rlb</b><br>V VX<br>68 | | vcmpequb<br>V VC<br>70 | \ | vmuloub<br>V VX<br>72 | | vaddfp<br>V VX<br>74 | | 12<br><i>vmrghb</i><br>V VX | | vpkuhum<br>V VX<br>78 | | | 00001 | <i>vadduhm</i><br>V VX | | vmaxuh<br>V VX | vrlh<br>V VX | | vcmpequh<br>V VC | ! | vmulouh<br>V VX | | vsubfp<br>V VX | | vmrghh<br>V VX | | <i>vpkuwum</i><br>V VX | | | 00010 | 128<br><i>vadduwm</i><br>V VX | | 130<br><i>vmaxuw</i><br>V VX | 132<br><i>vrlw</i><br>V VX | | 134<br>vcmpequw<br>V VC | | | | | | 140<br><b>vmrghw</b><br>V VX | | 142<br><b>vpkuhus</b><br>V VX | | | 00011 | | | | | | 198<br><i>vcmpeqfp</i><br>V VC | | | | | | | | 206<br>V <b>pkuwus</b><br>V VX | | | 00100 | | | 258<br><b>vmaxsb</b><br>V VX | 260<br><i>vslb</i><br>V VX | | | 1 | 264<br><b>vmulosb</b><br>V VX | | 266<br><b>vrefp</b><br>V VX | | 268<br><b>vmrglb</b><br>V VX | | 270<br><b>vpkshus</b><br>V VX | | | 00101 | | | vmaxsh<br>V VX | 324<br><i>vslh</i><br>V VX | | | ļ | 328<br><b>vmulosh</b><br>V VX | | 330<br>vrsqrtefp<br>V VX | | vmrglh<br>V VX | | vpkswus<br>V VX | | | 00110 | 384<br><b>vaddcuw</b><br>V VX | | 386<br>vmaxsw<br>V VX | 388<br><i>vslw</i><br>V VX | | | | | | 394<br>vexptefp<br>V VX | | 396<br>vmrglw<br>V VX | | 398<br>vpkshss<br>V VX | | | 00111 | | | | 452<br><i>vsI</i><br>V VX | | 454<br>vcmpgefp<br>V VC | | | | 458<br>vlogefp<br>V VX | | | | 462<br>vpkswss<br>V VX | | | 01000 | 512<br><b>vaddubs</b><br>V VX | | 514<br><b>vminub</b><br>V VX | 516<br><i>vsrb</i><br>V VX | | 518<br>vcmpgtub<br>V VC | Į, | 520<br><b>vmuleub</b><br>V VX | | 522<br>vrfin<br>V VX | | 524<br>vspltb<br>V VX | | 526<br>vupkhsb<br>V VX | | | 01001 | 576<br><b>vadduhs</b><br>V VX | | 578<br><b>vminuh</b><br>V VX | 580<br><i>vsrh</i><br>V VX | | 582<br><b>vcmpgtuh</b><br>V VC | I. | 584<br><b>vmuleuh</b><br>V VX | | 586<br><b>vrfiz</b><br>V VX | | 588<br><b>vsplth</b><br>V VX | | 590<br><b>vupkhsh</b><br>V VX | | | 01010 | 640<br><b>vadduws</b><br>V VX | | 642<br><b>vminuw</b><br>V VX | 644<br><b>vsrw</b><br>V VX | | 646<br>vcmpgtuw<br>V VC | | | | 650<br><b>vrfip</b><br>V VX | | 652<br><b>vspltw</b><br>V VX | | 654<br><b>vupkisb</b><br>V VX | | | 01011 | | | | 708<br><i>vsr</i><br>V VX | | 710<br>vcmpgtfp<br>V VC | | | | 714<br><i>vrfim</i><br>V VX | | | | 718<br><b>vupkish</b><br>V VX | | | 01100 | 768<br><b>vaddsbs</b><br>V VX | | 770<br><b>vminsb</b><br>V VX | 772<br><i>vsrab</i><br>V VX | | 774<br><i>vcmpgtsb</i><br>V VC | ! | 776<br><b>vmulesb</b><br>V VX | | 778<br><i>vcuxwfp</i><br>V VX | | 780<br><b>vspltisb</b><br>V VX | | 782<br><b>vpkpx</b><br>V VX | | | 01101 | 832<br><b>vaddshs</b><br>V VX | | 834<br><b>vminsh</b><br>V VX | 836<br><i>vsrah</i><br>V VX | | 838<br><i>vcmpgtsh</i><br>V VC | ļ | 840<br><b>vmulesh</b><br>V VX | | 842<br>vcsxwfp<br>V VX | | 844<br><b>vspltish</b><br>V VX | | 846<br><b>vupkhpx</b><br>V VX | | | 01110 | 896<br><b>vaddsws</b><br>V VX | | 898<br><b>vminsw</b><br>V VX | 900<br><i>vsraw</i><br>V VX | | 902<br><i>vcmpgtsw</i><br>V VC | | | | 906<br><i>vcfpuxws</i><br>V VX | | 908<br><b>vspltisw</b><br>V VX | | | | | 01111 | | | | | | 966<br><i>vcmpbfp</i><br>V VC | | | | 970<br><i>vcfpsxws</i><br>V VX | | | | 974<br>vupklpx<br>V VX | | | 10000 | 1024<br><b>vsububm</b><br>V VX | | 1026<br><i>vavgub</i><br>V VX | 1028<br><i>vand</i><br>V VX | | 1030<br>vcmpequb.<br>V VC | | | | 1034<br><i>vmaxfp</i><br>V VX | | 1036<br><i>vslo</i><br>V VX | | | | | 10001 | 1088<br><b>vsubuhm</b><br>V VX | | 1090<br><i>vavgub</i><br>V VX | 1092<br><i>vandc</i><br>V VX | | 1094<br>vcmpequh.<br>V VC | | | | 1098<br><i>vminfp</i><br>V VX | | 1100<br><i>vsro</i><br>V VX | | | | | 10010 | 1152<br><i>vsubuwm</i><br>V VX | | 1154<br><i>vavgub</i><br>V VX | 1156<br><i>vor</i><br>V VX | | 1158<br>vcmpequw.<br>V VC | | | | | | | | | | | 10011 | | | | 1220<br><i>vxor</i><br>V VX | | 1222<br>vcmpeqfp.<br>V VC | | | | | | | | | | | 10100 | | | 1282<br><i>vavgsb</i><br>V VX | 1284<br><i>vnor</i><br>V VX | | | | | | | | | | | | | 10101 | | | 1346<br><i>vavgsb</i><br>V VX | | | | | | | | | | | | | | 10110 | 1408<br><i>vsubcuw</i><br>V VX | | 1410<br><i>vavgsb</i><br>V VX | | | | | | | | | | | | | | 10111 | | | | | | 1478<br><i>vcmpgefp</i><br>V VC | | | | | | | | | | | 11000 | 1536<br><i>vsububs</i><br>V VX | | | 1540<br><i>mfvscr</i><br>V VX | | 1542<br>vcmpgtub.<br>V VC | \ | 1544<br>vsum4ubs<br>V VX | | | | | | | | | | 1600<br><i>vsubuhs</i><br>V VX | | | 1604<br><i>mtvscr</i><br>V VX | | 1606<br>vcmpgtuh.<br>V VC | l. | 1608<br>/ <b>sum4shs</b><br>V VX | | | | | | | | | 11010 | 1664<br>vsubuws<br>V VX | | | | | 1670<br>vcmpgtuw.<br>V VC | | 1672<br>vsum2sws<br>V VX | | | | | | | | | 11011 | | | | | | 1734<br>vcmpgtfp.<br>V VC | | | | | | | | | | | 11100 | 1792<br>vsubsbs<br>V VX | | | | | 1798<br>vcmpgtsb.<br>V VC | ľ | 1800<br>/ <b>sum4sbs</b><br>V VX | | | | | | | | | 11101 | 1856<br>vsubshs<br>V VX | | | | | 1862<br>vcmpgtsh.<br>V VC | | | | | | | | | | | 11110 | 1920<br>vsubsws<br>V VX | | | | | 1926<br>vcmpgtsw.<br>V VC | | 1928<br>vsumsws<br>V VX | | | | | | | | | 11111 | | | | | | 1990<br>vcmpbfp.<br>V VC | | | | | | | | | | | Table | e 8 (Le | ft-Cent | er) Exte | ended o | pcodes | s for pr | marv o | pcode | 4 [Cate | aorv: V | / & LMA1 (inst | ruction | bits 21 | :31) | | |-------|--------------------------------|--------------------------------|----------|---------|--------|----------|--------|--------|-------------------------------|----------------------------------|-----------------|-------------------------------|-------------------------------|--------|--------| | | 010000 | 010001 | 010010 | 010011 | 010100 | 010101 | 010110 | 010111 | 011000 | 011001 | / & LMA] (insti | 011100 | 011101 | 011110 | 011111 | | | 16<br><i>mulhhwu</i><br>LMA XO | LMA XO | | | | | | | 24<br>machhwu<br>LMA XO | 24<br><b>long</b><br>LMA XO | | | | | | | 00001 | 80<br><i>mullhw</i><br>LMA XO | 81<br><i>mullhw.</i><br>LMA XO | | | | | | | 88<br>machhw | 89<br>machhw. | | 92<br>nmachhw<br>LMA XO | 93<br>Iong<br>LMA XO | | | | 00010 | | | | | | | | | 152<br>long<br>LMA XO | 153 | | | | | | | 00011 | | | | | | | | | 216<br>machhws | 217<br>long | | 220<br>long | 220<br><i>long</i><br>LMA XO | | | | 00100 | 272<br>mulchwu<br>LMA X | 273<br>mulchwu. | | | | | | | 280<br>macchwu<br>LMA XO | 281<br>Iong | | LIVIA XU | LIVIA XO | | | | 00101 | 336<br>mulchw<br>LMA X | 337<br>mulchw. | | | | | | | 344<br>macchw<br>LMA XO | 345<br>macchw. | | 348<br>nmacchw<br>LMA XO | 349<br>Iong | | | | 00110 | | | | | | | | | 408<br>long<br>LMA XO | 409 | | 2.0.7.7.0 | 2 | | | | 00111 | | | | | | | | | 472<br>macchws<br>LMA XO | 473<br>long | | 476<br><i>long</i><br>LMA XO | 477<br>Iong<br>LMA XO | | | | 01000 | | | | | | | | | 2.007 710 | | | 2.0.7.7.0 | Link XO | | | | 01001 | | | | | | | | | | | | | | | | | 01010 | | | | | | | | | | | | | | | | | 01011 | | | | | | | | | | | | | | | | | 01100 | 784<br><i>mullhwu</i><br>LMA X | 784<br>mullhwu.<br>LMA X | | | | | | | 792<br>maclhwu<br>LMA XO | 793<br><i>maclhwu.</i><br>LMA XO | | | | | | | 01101 | 848<br><i>mullhw</i><br>LMA X | 849<br><i>mullhw.</i><br>LMA X | | | | | | | 856<br>maclhw | 857<br>maclhw. | | 860<br>nmaclhw<br>LMA XO | 861<br>nmaclhw.<br>LMA XO | | | | 01110 | | | | | | | | | 920<br><i>long</i><br>LMA XO | 921<br><i>long</i><br>LMA XO | | | | | | | 01111 | | | | | | | | | 984<br>maclhws<br>LMA XO | 985<br>maclhws. | | 988<br><i>long</i><br>LMA XO | 989<br><i>Iong</i><br>LMA XO | | | | 10000 | 1040<br><i>long</i><br>LMA XO | 1041<br>Iong | | | | | | | 1048 | 1049<br><i>long</i><br>LMA XO | | | | | | | 10001 | 1104<br>mullhwo.<br>LMA XO | 1105<br>mullhwo. | | | | | | | 1112<br>machhwo<br>LMA XO | 1113<br><i>long</i> | | 1116<br><i>long</i><br>LMA XO | 1117<br><i>long</i><br>LMA XO | | | | 10010 | | | | | | | | | 1176<br><i>long</i><br>LMA XO | 1177 | | | | | | | 10011 | | | | | | | | | 1240<br><i>long</i><br>LMA XO | 1241 | | 1244<br><i>long</i><br>LMA XO | 1245<br><i>long</i><br>LMA XO | | | | 10100 | | | | | | | | | 1304<br><i>long</i><br>LMA XO | 1305 | | | | | | | 10101 | | | | | | | | | 1368<br>macchwo<br>LMA XO | 1369<br>Iong<br>LMA XO | | 1372<br><i>long</i><br>LMA XO | 1373<br><i>long</i><br>LMA XO | | | | 10110 | | | | | | | | | 1432<br>long<br>LMA XO | 1433 | | | | | | | 10111 | | | | | | | | | 1496<br>long<br>LMA XO | 1497 | | 1500<br><i>long</i><br>LMA XO | 1501<br><i>long</i><br>LMA XO | | | | 11000 | | | | | | | | | | 7.5 | | | 7.0 | | | | 11001 | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | 1816<br><i>long</i><br>LMA XO | 1817<br><i>long</i><br>LMA XO | | | | | | | 11101 | | | | | | | | | 1880<br>maclhwo<br>LMA XO | 1881<br>maclhwo. | | 1884<br><i>long</i><br>LMA XO | 1885<br><i>long</i><br>LMA XO | | | | 11110 | | | | | | | | | 1944<br>long<br>LMA XO | 1946 | | | | | | | 11111 | | | | | | | | | 2008<br>long<br>LMA XO | 2009 | | 2012<br>Iong<br>LMA XO | 2013<br><i>long</i><br>LMA XO | | | | Table | e 8 | (Ri | aht- | Cer | nter | ) Ex | tended | go | cod | es f | or r | orim | arv | OD | cod | e 4 | ГСа | tea | orv: | V 8 | & LN | ИAI | (ins | stru | ctior | n bits 2 | 1:3 | 1) | | | |-------|-----------|------------|-------|-------------|------------|------------|------------------|------------|-----------|-------|-----------|------|------------|------|-------------|-----------|------------|-----------|------------|-----------------|----------|-----|-------------------|------|----------|----------|-----|-----------|-----|-------------------| | | 100 | 000 | 100 | 001 | 100 | 010 | tended<br>100011 | 100 | 100 | 100 | 101 | 100 | )110 | 100 | )111 | 101 | 000 | 101 | 001 | 101 | 010 | 101 | 011 | 101 | 100 | 101101 | 101 | 110 | 101 | 111 | | 00000 | 3<br>vmha | 2<br>ddshs | ymhra | 2<br>addshs | 3<br>vmlad | 4<br>Iduhm | | vmsur<br>V | 6<br>nubm | ymsu. | 7<br>mmbm | vmsu | 88<br>muhm | vmsi | 19<br>Imuhs | 4<br>vmsu | ()<br>mshm | 4<br>vmsu | 1<br>Imshs | 4<br><b>v</b> s | 2<br>sel | vp. | 13<br>e <b>rm</b> | VS | 4<br>doi | | vma | 6<br>ddfp | vnm | 17<br>subfp<br>VA | | | ٧ | VA | ٧ | VA | ٧ | VA | | V | VA | ٧ | VA | ٧ | VA | ٧ | VA | ٧ | VA | ٧ | VA | ٧ | VA | ۷, | VA | ٧ | VA | | ٧ | VΆ | V | VÁ. | | 00001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00010 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00011 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00100 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00101 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01000 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01010 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01011 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01100 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01101 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10000 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10010 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10100 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10101 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11000 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11111 | vmha | ddshs | vmhra | addshs | vmlad | duhm | | vmsur | nubm | vmsu | mmbm | vmsu | muhm | vmsı | ımuhs | vmsu | mshm | vmsı | ımshs | VS | sel | vn | erm | VS | doi | | vma | ddfp | vnm | subfp | | Table | e 8 (Ric | ght) Ext | tended | opcode | es for p | rimary | opcode | 4 [Cat | egory: | V & LIV | 1A1 (ins | truction | bits 21 | 1:31) | | | |-------|----------|----------|--------|--------|----------|--------|--------|--------|--------|----------|----------|----------|---------|----------|----------|----------| | | 110000 | 110001 | 110010 | 110011 | 110100 | 110101 | 110110 | 110111 | 111000 | 111001 | 111010 | 111011 | 111100 | 111101 | 111110 | 111111 | | 00000 | | | | | | | | | | | | | | | | | | 00001 | | | | | | | | | | | | | | | | | | 00010 | | | | | | | | | | | | | | | | | | 00011 | | | | | | | | | | | | | | | | | | 00100 | | | | | | | | | | | | | | | | | | 00101 | | | | | | | | | | | | | | | | | | 00110 | | | | | | | | | | | | | | | | | | 00111 | | | | | | | | | | | | | | | | | | 01000 | | | | | | | | | | | | | | | | | | 01001 | | | | | | | | | | | | | | | | | | 01010 | | | | | | | | | | | | | | | | | | 01011 | | | | | | | | | | | | | | | | | | 01100 | | | | | | | | | | | | | | | | | | 01101 | | | | | | | | | | | | | | | | | | 01110 | | | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | 10000 | | | | | | | | | | | | | | | | | | 10001 | | | | | | | | | | | | | | | | | | 10010 | | | | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | 10100 | | | | | | | | | | | | | | | | | | 10101 | | | | | | | | | | | | | | | | | | 10110 | | | | | | | | | | | | | | | | | | 10111 | | | | | | | | | | | | | | | | | | 11000 | | | | | | | | | | | | | | | | | | 11001 | | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | 11111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | <u> </u> | <u> </u> | | | <u> </u> | <u> </u> | <u> </u> | | Table | e 9: (Lo | eft) Ext | ended ( | opcode | s for pr | imary o | opcode | 4 [Cate | egory: S | SP.*] (ir | nstructi | on bits | 21:31) | | | | |-------|-------------------------------|-------------------------------|--------------------------------|-------------------------------|--------------------------------|-------------------------------------|-----------------------------|-------------------------------|-------------------------------|------------------------------------|------------------------|-------------------------------|---------------------------------|--------------------------------------|------------------------------|-------------------------------| | | 000000 | 000001 | 000010 | 000011 | 000100 | 000101 | 000110 | 000111 | 001000 | 001001 | 001010 | 001011 | 001100 | 001101 | 001110 | 001111 | | 00000 | | | | | | | | | | | | | | | | | | 00001 | | | | | | | | | | | | | | | | | | 00010 | | | | | | | | | | | | | | | | | | 00011 | | | | | | | | | | | | | | | | | | 00100 | | | | | | | | | | | | | | | | | | 00101 | | | | | | | | | | | | | | | | | | 00110 | | | | | | | | | | | | | | | | | | 00111 | | | | | | | | | | | | | | | | | | 01000 | 512<br>evaddw<br>SP EVX | | 514<br>evaddiw<br>SP EVX | | 516<br>evsubfw<br>SP EVX | | 518<br>evsubifw<br>SP EVX | | 520<br>evabs | 521<br>evneg<br>SP EVX | 522<br>evextsb | 523<br>evextsh | 524<br>evrndw | 525<br>evcntlzw | 526<br>evcntlsw | 527<br>brinc | | 01001 | OF LVA | | SF LVX | | SF LVX | | SF LVX | | SF LVA | SF LVX | SF LVX | SF LVA | SF LVX | SF LVX | SF LVX | SF LVX | | 01010 | | 641<br>evssub<br>sp.fv EVX | | | 644<br>evfsabs | 645<br><b>evfsnabs</b><br>sp.fv EVX | 646<br>evfsneg<br>sp.fv EVX | | 648<br>evfsmul | 649<br><i>evfsdiv</i><br>sp.fv EVX | | | 652<br><i>long</i><br>sp.fv EVX | 653<br><b>evfscmplt</b><br>sp.fv EVX | 654<br>long | | | 01011 | 704<br>efsadd<br>sp.fs EVX | 705<br>efssub | | | 708<br>efsabs | 709<br>efsnabs<br>sp.fs EVX | 710<br>efsneg<br>sp.fs EVX | | 712<br>efsmul | 713<br>efsdiv<br>sp.fs EVX | | | 716<br>efscmpat | 717<br>efscmplt | 718<br>efscmpeq<br>sp.fs EVX | 719<br>efscfd<br>sp fd EVX | | 01100 | 768<br>evlddx<br>SP EVX | 769<br>evldd<br>SP EVX | 770<br><b>evldwx</b><br>SP EVX | 771<br>evldw<br>SP EVX | 772 | 773 | | | 776 | 777<br>Iong<br>SP EVX | | | 780 | 781<br>Iong<br>SP EVX | 782 | 783 | | 01101 | | | | | _ | | | | | | | | | | | | | 01110 | | | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | 10000 | | | | 1027<br>evmhessf<br>SP EVX | | | | 1031<br>evmhossf<br>SP EVX | 1032<br><i>long</i><br>SP EVX | 1033<br>Iong<br>SP EVX | | 1035<br><i>long</i><br>SP EVX | 1036<br><i>long</i><br>SP EVX | 1037<br><i>long</i><br>SP EVX | | 1039<br><i>long</i><br>SP EVX | | 10001 | | | | | | | | 1095<br>long<br>SP EVX | 1096 | | | | 1100 | 1101<br>long<br>SP EVX | | 1103<br>long<br>SP EVX | | 10010 | | | | | | | | | | | | | | | | | | 10011 | 1216<br><i>long</i><br>SP EVX | 1217<br>Iong<br>SP EVX | 1218<br>Iong<br>SP EVX | 1219<br><i>long</i><br>SP EVX | 1220<br><b>evmra</b><br>SP EVX | | 1222<br>evdivws | 1223<br>evdivwu<br>SP EVX | 1224<br>Iong<br>SP EVX | 1225<br>Iong<br>SP EVX | 1226<br>Iong<br>SP EVX | 1227<br>Iong<br>SP EVX | | | | | | | 1280<br>long<br>SP EVX | 1281<br>long<br>SP EVX | 01 27% | 1283<br>long<br>SP EVX | OI EVX | 1285<br><i>long</i><br>SP EVX | OI LVX | 1287<br>long<br>SP EVX | 1288 | 1289<br>long<br>SP EVX | OI LVX | 1291 | 1292<br>Iong<br>SP EVX | 1293<br><i>long</i><br>SP FVX | | 1295<br><i>long</i><br>SP EVX | | 10101 | 1344<br><i>long</i><br>SP EVX | 1345<br><i>long</i> | | 01 27% | | OI LVX | | OI LVX | 1352<br><i>Iona</i> | 1353<br>long<br>SP EVX | | OI LVX | OI LVX | OI LVX | | OI LVX | | | 1408 | 1409<br>long<br>SP EVX | | 1411<br><i>long</i><br>SP EVX | 1412<br>long<br>SP EVX | 1413<br><i>long</i><br>SP EVX | | 1415<br><i>long</i><br>SP EVX | 1416 | 1417<br>long<br>SP EVX | | 1419<br><i>long</i><br>SP EVX | 1420<br><i>long</i><br>SP EVX | 1421<br><i>long</i><br>SP EVX | | 1423<br><i>long</i><br>SP EVX | | 10111 | 1472<br>Iona | 1473<br><i>long</i><br>SP EVX | | | | | | | 1480<br><i>long</i> | 1481<br><i>long</i><br>SP EVX | | | | | | | | 11000 | | | | | | | | | | | | | | | | | | 11001 | | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | 11111 | | | | | | | | | | | | | | | | | | Tabl | <u> </u> | ft-Cent | er) Exte | ended o | ncode | s for nr | imary o | ncode | 4 [Cate | aory: S | SP *1 (in | structio | n hits | 21.31) | | | |-------|--------------------------|-------------------------------|--------------------------------|-------------------------------|------------------------------|------------------------------|------------------------------|-----------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|--------------------------------|--------| | Tabl | 010000 | 010001 | 010010 | 010011 | 010100 | 010101 | 010110 | 010111 | 011000 | 011001 | 011010 | 011011 | 011100 | 011101 | 011110 | 011111 | | 00000 | | | | | | | | | | | | | | | | | | 00001 | | | | | | | | | | | | | | | | | | 00010 | | | | | | | | | | | | | | | | | | 00011 | | | | | | | | | | | | | | | | | | 00100 | | | | | | | | | | | | | | | | | | 00101 | | | | | | | | | | | | | | | | | | 00110 | | | | | | | | | | | | | | | | | | 00111 | | | | | | | | | | | | | | | | | | 01000 | | 529<br><b>evand</b><br>SP EVX | 530<br><b>evandc</b><br>SP EVX | | | | 534<br>evxor<br>SP EVX | 535<br>evor<br>SP EVX | 536<br>evnor<br>SP EVX | 537<br><b>eveqv</b><br>SP EVX | | 539<br>evorc<br>SP EVX | | | 542<br><b>evnand</b><br>SP EVX | | | 01001 | | | | | | | | | | | | | | | | | | 01010 | sp.fv EVX | 657<br>evfscfsi<br>sp.fv EVX | 658<br>evfscfuf<br>sp.fv EVX | 659<br>evfscfsf<br>sp.fv EVX | 660<br>evfsctui<br>sp.fv EVX | 661<br>evfsctsi<br>sp.fv EVX | 662<br>evfsctuf<br>sp.fv EVX | sp.fv EVX | SD.TV EVX | | 666<br>evfsctsiz<br>sp.fv EVX | | 668<br>evfststgt<br>sp.fv EVX | 669<br>evfststlt<br>sp.fv EVX | 670<br>evfststeq<br>sp.fv EVX | | | 01011 | sp.fs EVX | 721<br>efscfsi<br>sp.fs EVX | 722<br>efscfuf<br>sp.fs EVX | 723<br>efscfsf<br>sp.fs EVX | efsctui<br>sp.fs EVX | efsctsi<br>sp.fs EVX | 726<br>efsctuf<br>sp.fs EVX | efsctsf<br>sp.fs EVX | efsctuiz<br>sp.fs EVX | 700 | 730<br>efsctsiz<br>sp.fs EVX | | efststgt<br>sp.fs EVX | 733<br>efststlt<br>sp.fs EVX | efststeq<br>sp.fs EVX | | | 01100 | 784<br>evlwhex<br>SP EVX | evlwhe<br>SP EVX | | | evlwhoux<br>SP EVX | evlwhou<br>SP EVX | evlwhosx<br>SP EVX | evlwhos<br>SP EVX | 792<br>long<br>SP EVX | long<br>SP EVX | | | 796<br>long<br>SP EVX | long<br>SP EVX | | | | 01101 | | | | | | | | | | | | | | | | | | 01110 | | | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | 10000 | | | | | | | | | 1112 | 1113 | | 1115 | | | | | | 10001 | | | | | | | | | long<br>SP EVX | long<br>SP EVX | | long<br>SP EVX | | | | | | 10010 | | | | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | 10100 | | | | 1363 | | | | | 1368 | 1369 | | 1371 | | | | | | 10101 | | | | long<br>SP EVX | | | | | long<br>SP EVX | long<br>SP EVX | | long<br>SP EVX | | | | | | 10110 | | | | 1491 | | | | | 1496 | 1497 | | 1499 | | | | | | 10111 | | | | 1491<br><i>long</i><br>SP EVX | | | | | 1496<br><i>long</i><br>SP EVX | long<br>SP EVX | | 1499<br><i>long</i><br>SP EVX | | | | | | 11000 | | | | | | | | | | | | | | | | | | 11001 | | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | 11111 | | | | | | | | | | | | | | | | | | Tabl | 2 0 (Rid | nht-Cer | nter) Ex | tended | oncod | es for r | rimary | oncod | <u> </u> | teaory: | SP *1 / | instruc | tion hits | 21:31 | 1 | | |-------|---------------------------|----------------------------------|-----------------------------|------------------------------------|---------------------------|----------------------------|----------------------------------|-------------------------------|----------------------------------|----------------------------------|--------------------------------|-----------------------------------|-------------------------------|-------------------------------|------------------------------------|----------------------------------| | Table | 100000 | 100001 | 100010 | 100011 | 100100 | 100101 | 100110 | 100111 | 101000 | 101001 | 101010 | 101011 | 101100 | 101101 | 101110 | 101111 | | 00000 | | | | | | | | | | | | | | | | | | 00001 | | | | | | | | | | | | | | | | | | 00010 | | | | | | | | | | | | | | | | | | 00011 | | | | | | | | | | | | | | | | | | 00100 | | | | | | | | | | | | | | | | | | 00101 | | | | | | | | | | | | | | | | | | 00110 | | | | | | | | | | | | | | | | | | 00111 | | | | | | | | | | | | | | | | | | 01000 | 544<br>evsrwu<br>SP EVX | 545<br>evsrws<br>SP EVX | 546<br>evsrwiu<br>SP EVX | 547<br><b>evsrwis</b><br>SP EVX | 548<br>evslw<br>SP EVX | | 550<br><b>evslwi</b><br>SP EVX | | 552<br>evrlw<br>SP EVX | 553<br><b>evsplati</b><br>SP EVX | 554<br><b>evrlwi</b><br>SP EVX | 555<br><b>evsplatfi</b><br>SP EVX | 556<br>long<br>SP EVX | 557<br>long<br>SP EVX | 558<br>long<br>SP EVX | 559<br><i>long</i><br>SP EVX | | 01001 | | | | | | | | | | | | | | | | | | 01010 | | | | | | | | | | | | | | | | | | 01011 | 736<br>efdadd<br>sp.fdEVX | 737<br><b>efdsub</b><br>sp.fdEVX | 738<br>efdcfuid<br>sp.fdEVX | 739<br><b>efdcfsid</b><br>sp.fdEVX | 740<br>efdabs<br>sp.fdEVX | 741<br>efdnabs<br>sp.fdEVX | 742<br><b>efdneg</b><br>sp.fdEVX | | 744<br><b>efdmul</b><br>sp.fdEVX | 745<br><i>efddiv</i><br>sp.fdEVX | 746<br>efdctuidz<br>sp.fdEVX | 747<br>efdctsidz<br>sp.fdEVX | 748<br>efdcmpgt<br>sp.fdEVX | 749<br>efdcmplt<br>sp.fdEVX | 750<br><b>efdcmpeg</b><br>sp.fdEVX | 751<br><i>efdcfs</i><br>sp.fdEVX | | 01100 | 800<br>evstddx<br>SP EVX | 801<br>evstdd<br>SP EVX | 802<br>evstdwx | | 804<br>evstdhx | 805<br>evstdh<br>SP EVX | | | | • | • | • | | | | | | 01101 | | | | | | | | | | | | | | | | | | 01110 | | | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | 10000 | | | | 1059<br><i>long</i><br>SP EVX | | | | 1063<br><i>long</i><br>SP EVX | 1064<br><i>long</i><br>SP EVX | 1065<br><i>long</i><br>SP EVX | | 1067<br><i>long</i><br>SP EVX | 1068<br><i>long</i><br>SP EVX | 1069<br><i>long</i><br>SP EVX | | 1071<br><i>long</i><br>SP EVX | | 10001 | | | | | | | | 1127<br><i>long</i><br>SP EVX | 1128<br><i>long</i><br>SP EVX | | | | 1132 | 1133<br><i>long</i><br>SP EVX | | 1135<br><i>long</i><br>SP EVX | | 10010 | | | | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | 10100 | | | | | | | | | 1320<br><i>long</i><br>SP EVX | 1321<br><i>long</i><br>SP EVX | | 1323<br><i>long</i><br>SP EVX | 1324<br><i>long</i><br>SP EVX | 1325<br><i>long</i><br>SP EVX | | 1327<br><i>long</i><br>SP EVX | | 10101 | | | | | | | | | | | | | | | | | | 10110 | | | | | | | | | 1448<br><i>long</i><br>SP EVX | 1449<br><i>long</i><br>SP EVX | | 1451<br><i>long</i><br>SP EVX | 1452<br><i>long</i><br>SP EVX | 1453<br><i>long</i><br>SP EVX | | 1455<br><i>long</i><br>SP EVX | | 10111 | | | | | | | | | | | | | | | | | | 11000 | | | | | | | | | | | | | | | | | | 11001 | | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | 11111 | | | | | | | | | | | | | | | | | | Tabl | e 9 (Rio | aht) Ext | tended | opcode | es for p | rimarv | opcode | 4 [Cat | egory: | SP.*1 (i | nstruct | ion bits | 21:31) | | | | |-------|----------------------------------|-----------------------------------|-----------------------------------|-----------------------------------|-----------------------------------|-----------------------------------|-----------------------------------|-----------------------------------|------------------------------------|---------------------------------|------------------------------------|---------------------------------|------------------------------------|------------------------------------|------------------------------------|-------------------------| | | 110000 | 110001 | 110010 | 110011 | 110100 | 110101 | 110110 | 110111 | 111000 | 111001 | 111010 | 111011 | 111100 | 111101 | 111110 | 111111 | | 00000 | | | | | | | | | | | | | | | | | | 00001 | | | | | | | | | | | | | | | | | | 00010 | | | | | | | | | | | | | | | | | | 00011 | | | | | | | | | | | | | | | | | | 00100 | | | | | | | | | | | | | | | | | | 00101 | | | | | | | | | | | | | | | | | | 00110 | | | | | | | | | | | | | | | | | | 00111 | | | | | | | | | | | | | | | | | | 01000 | 560<br><b>evcmpgtu</b><br>SP EVX | 561<br><b>evcmpgts</b><br>SP EVX | 562<br><b>evcmpltu</b><br>SP EVX | 563<br><b>evcmplts</b><br>SP EVX | 564<br><b>evcmpeq</b><br>SP EVX | | | | | | | | | | | | | 01001 | | | | | | | | | 632<br>evsel<br>SP EVS | 633<br><b>evsel</b> '<br>SP EVS | 634<br>evsel'<br>SP EVS | 635<br><b>evsel</b> '<br>SP EVS | 636,<br><b>evsel</b> '<br>SP EVS | 637<br>evsel'<br>SP EVS | 638<br>evsel'<br>SP EVS | 639<br>evsel'<br>SP EVS | | 01010 | | | | | | | | | | | | | | | | | | 01011 | sp.fdEVX | 753<br><b>efdcfsi</b><br>sp.fdEVX | 754<br><b>efdcfuf</b><br>sp.fdEVX | 755<br><b>efdcfsf</b><br>sp.fdEVX | 756<br><b>efdctui</b><br>sp.fdEVX | 757<br><b>efdctsi</b><br>sp.fdEVX | 758<br><b>efdctuf</b><br>sp.fdEVX | 759<br><b>efdctsf</b><br>sp.fdEVX | 760<br><b>efdctuiz</b><br>sp.fdEVX | | 762<br><b>efdctsiz</b><br>sp.fdEVX | | 764<br><b>efdtstgt</b><br>sp.fdEVX | 765<br><i>efdtstlt</i><br>sp.fdEVX | 766<br><b>efdtsteq</b><br>sp.fdEVX | | | 01100 | 816<br>evstwhex<br>SP EVX | 817<br>evstwhe<br>SP EVX | | | 820<br>evstwhox<br>SP EVX | 821<br>evstwho<br>SP EVX | | | 824<br>evstwwex<br>SP EVX | 825<br>evstwwe<br>SP EVX | | | 828<br>evstwwox<br>SP EVX | 829<br>evstwwo<br>SP EVX | | | | 01101 | | | | | | | | | | | | | | | | | | 01110 | | | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | 10000 | | | | | | | | | | | | | | | | | | 10001 | | | | 1139<br><i>long</i><br>SP EVX | | | | | | 1145<br><i>long</i><br>SP EVX | | 1147<br><i>long</i><br>SP EVX | | | | | | 10010 | | | | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | 10100 | | | | | | | | | | | | | | | | | | 10101 | | | | | | | | | | | | | | | | | | 10110 | | | | | | | | | | | | | | | | | | 10111 | | | | | | | | | | | | | | | | | | 11000 | | | | | | | | | | | | | | | | | | 11001 | | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | 11111 | | | | | | | | | | | | | | | | | | | | <u> </u> | | <u> </u> | | | | | | | | | | <u> </u> | | | | Table | o 10· / | Loft\ E | vtondo | donoor | loo for | nrimor | , on oo | lo 10 /ii | actructi | on hita | 21.20\ | | | | | | |-------|---------|-----------------------------------------------|--------|--------|---------|--------|------------------------------|-----------------------------|----------|---------|--------|-------|-------|-------|-------|-------| | Table | 00000 | 00001 | 00010 | 00011 | 00100 | 00101 | 00110 | de 19 (ii<br>00111 | 01000 | 01001 | 01010 | 01011 | 01100 | 01101 | 01110 | 01111 | | 00000 | 0 | | | | | | | | | | | | | | | | | 00001 | | 33<br><i>crnor</i><br>B XL | | | | | 38<br><i>rfmci</i><br>E XL | 39<br><i>rfdi</i><br>E.ED X | | | | | | | | | | 00010 | | | | | | | | | | | | | | | | | | 00011 | | | | | | | | | | | | | | | | | | 00100 | | 129<br><i>crandc</i><br>B XL | | | | | | | | | | | | | | | | 00101 | | | | | | | | | | | | | | | | | | 00110 | | 193<br>crxor<br>B XL<br>225<br>crnand<br>B XL | | | | | 198<br><i>dnh</i><br>E.EDXFX | | | | | | | | | | | 00111 | | 225<br><i>crnand</i><br>B XL | | | | | | | | | | | | | | | | 01000 | | crand<br>B XL | | | | | | | | | | | | | | | | 01001 | | 289<br><i>creqv</i><br>B XL | | | | | | | | | | | | | | | | 01010 | | | | | | | | | | | | | | | | | | 01011 | | | | | | | | | | | | | | | | | | 01100 | | | | | | | | | | | | | | | | | | 01101 | | 417<br>crorc<br>B XL<br>449<br>cror<br>B XL | | | | | | | | | | | | | | | | 01110 | | 449<br><i>cror</i><br>B XL | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | 10000 | | | | | | | | | | | | | | | | | | 10001 | | | | | | | | | | | | | | | | | | 10010 | | | | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | 10100 | | | | | | | | | | | | | | | | | | 10101 | | | | | | | | | | | | | | | | | | 10110 | | | | | | | | | | | | | | | | | | 10111 | | | | | | | | | | | | | | | | | | 11000 | | | | | | | | | | | | | | | | | | 11001 | | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | 11111 | | | | | | | | | | | | | | | | | | Tabl | e 10. (Right | t) Extende<br>01 10010 | ed opco | des for | primar<br>10101 | y opco<br>10110 | de 19 ( | instruct | tion bits | 21:30)<br>11010 | 11011 | 11100 | 11101 | 11110 | 11111 | |-------|-----------------------------|-------------------------------------------------|---------------------------|---------|-----------------|-----------------------------|---------|----------|-----------|-----------------|-------|-------|-------|-------|-------| | 00000 | 16 | 18<br><b>rfid</b><br>S XL | | | | | | | | | | | | | | | 00001 | D AL | 50<br>rfi | 51<br><i>rfci</i><br>E XL | | | | | | | | | | | | | | 00010 | | 50<br>rfi<br>E XL<br>(82)<br>rfsvc<br>XL | E AL | | | | | | | | | | | | | | 00011 | | , AL | | | | | | | | | | | | | | | 00100 | | | | | | 150<br><i>isync</i><br>B XL | | | | | | | | | | | 00101 | | | | | | D AL | | | | | | | | | | | 00110 | | | | | | | | | | | | | | | | | 00111 | | | | | | | | | | | | | | | | | 01000 | | 274<br><i>hrfid</i><br>S XL | | | | | | | | | | | | | | | 01001 | | J AL | | | | | | | | | | | | | | | 01010 | | | | | | | | | | | | | | | | | 01011 | | | | | | | | | | | | | | | | | 01100 | | 402<br>doze<br>B XI | | | | | | | | | | | | | | | 01101 | | 402<br>doze<br>B XL<br>434<br>nap<br>B XL | | | | | | | | | | | | | | | 01110 | | 466<br>sleep<br>B XL<br>498<br>rvwinkle<br>B XL | | | | | | | | | | | | | | | 01111 | | 498<br>rvwinkle<br>B XI | | | | | | | | | | | | | | | 10000 | 528<br><i>bcctr</i><br>B XL | | | | | | | | | | | | | | | | 10001 | | | | | | | | | | | | | | | | | 10010 | | | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | 10100 | | | | | | | | | | | | | | | | | 10101 | | | | | | | | | | | | | | | | | 10110 | | | | | | | | | | | | | | | | | 10111 | | | | | | | | | | | | | | | | | 11000 | | | | | | | | | | | | | | | | | 11001 | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | 11111 | | | | | | | | | | | | | | | | | Table | | | | | | primary | | | | on bits | | | | | | | | |-------|----------------------------|---------------------|-------|-----------------------------|-------------------------|---------|------------------------------|-------------------------------|-------------------------------|--------------------------------|--------------------------------|-------------------------------|-------|-------|-------------------------------|------|----------------| | 00000 | 00000<br>0 | 00001 | 00010 | 00011 | 00100<br>4<br>tw | 00101 | 00110<br>6 | 00111<br>7<br>Ivebx | 01000<br>8 | 9<br>mulhdu<br>64 XO | 01010<br>10<br>addc | 01011<br>11<br>mulhwu | 01100 | 01101 | 01110<br>14<br>Res'd | 011 | 111<br>5<br>ee | | 00000 | <i>cmp</i><br>B X | 33 | | | B X | | V X | V X | 8<br>subfc<br>B XO | 64 XO | B XO | B XO | | | VLE | Tabl | е<br>е 15 | | 00001 | cmpl<br>B X | 33<br>Res'd<br>VLE | | | | | V X | <i>Ivehx</i><br>V X | 40<br>subf<br>B XO | | | | | | 46<br>Res'd<br>VLE | | | | 00010 | | | | | 68<br><b>td</b><br>64 X | | | 71<br><i>Ivewx</i><br>V X | | 73<br><b>mulhd</b><br>64 XO | 74<br><b>addg6s</b><br>BCDA XO | 75<br><b>mulhw</b><br>B XO | | | 78<br><i>dlmzb</i><br>LMA X | | | | 00011 | | | | | | | | 103<br><i>Ivx</i><br>V X | 104<br><b>neg</b><br>B XO | | | | | | | | | | 00100 | | 129<br>Res'd<br>VLE | | 131<br><i>wrtee</i><br>E X | | | 134<br>dcbtstls<br>ECL X | 135<br>stvehy | 136<br>subfe<br>B XO | | 138<br><b>add</b><br>B XO | | | | | | | | 00101 | | | | 163<br><b>wrteei</b><br>E X | | | 166<br>dcbtls<br>ECL X | 167<br><b>stvehx</b><br>V X | | | | | | | | | | | 00110 | | 193<br>Res'd<br>VLE | | | | | | 199<br>stvewx<br>V X | 200<br>subfze<br>B X0 | | 202<br><b>addz</b><br>B XO | | | | 206<br>msgsnd<br>E.PC X | | | | 00111 | | 225<br>Res'd<br>VLE | | | | | 230<br><i>icblc</i><br>ECL X | 231<br><i>stvx</i> | 232<br>subfme<br>B X0 | 233<br><b>mulld</b><br>64 XO | 234<br>addme<br>B XO | 235<br><b>mullw</b><br>B XO | | | 238<br>msgclr<br>E.PC X | | | | 01000 | | 257<br>Res'd<br>VLE | | 259<br><b>mfdrx</b><br>E X | | | 262<br>Res'd<br>AP | 263<br>IvepxI<br>E.PD X | В ХО | 04 70 | 266<br><b>add</b><br>B XO | В ХО | | | 2 0 X | | | | 01001 | | 289<br>Res'd<br>VLE | | 291<br><b>mfdrux</b><br>E X | | | 711 | 295<br><i>Ivepx</i><br>E.PD X | | | B AG | | | | | | | | 01010 | | V L.L. | | 323<br>mfdcr<br>E X | | | 326<br>dcread<br>E.CD X | D X | | | | | | | 334<br><i>mfpmr</i><br>E.PM X | | П | | 01011 | | | | | | | | {359}<br><i>IvxI</i><br>V X | | | | | | | | | | | 01100 | | | | 387<br>mtdcrx<br>E X | | | 390<br><i>dcblc</i><br>ECL X | | | | | | | | | | | | 01101 | | 417<br>Res'd<br>VLE | | 419<br>mtdcrux<br>E X | | | _ | | | | | | | | | | | | 01110 | | 449<br>Res'd<br>VLE | | 451<br>mtdcr<br>E X | | | 454<br><b>dci</b><br>E.Cl X | | | 457<br><b>divdu</b><br>64 XO | | 459<br><b>divwu</b><br>B XO | | | 462<br><i>mtpmr</i><br>E.PM X | | | | 01111 | | | | | | | 486<br>Res'd<br>AP | {487}<br><b>stvx1</b><br>V X | | 489<br><b>divd</b><br>64 XO | | 491<br><b>divw</b><br>B XO | | | | | | | 10000 | 512<br><b>mcrxr</b><br>E X | | | | | | 7.0 | {519}<br><i>IvIx</i><br>V X | 520<br><b>subfc</b> '<br>B XO | 521<br><b>mulhdu</b> ′<br>64XO | 522<br><b>addc</b> '<br>B XO | 523<br>mulhwu'<br>B XO | | | | | | | 10001 | LX | | | | | | | {551}<br><i>Ivrx</i><br>V X | 552<br><b>subf</b> '<br>B XO | 01/10 | B AG | B NO | | | | | | | 10010 | | | | | | | | V X | В хо | 585<br><b>mulhd</b> '<br>64 XO | | 587<br><b>mulhw</b> '<br>B XO | | | | | | | 10011 | | | | | | | | | 616<br><b>neg</b> '<br>B XO | 04 70 | | В ХО | | | | | | | 10100 | | | | | | | | {647}<br><b>stvlx</b><br>V X | 648<br><b>subfe</b> '<br>B X0 | | 650<br><b>adde</b> '<br>B XO | | | | | | | | 10101 | | | | | | | | {679}<br>stvrx<br>V X | В ХО | | В ХО | | | | | | | | 10110 | | | | | | | | | 712<br>subfze'<br>B XO | | 714<br><b>addze</b> ′<br>B XO | | | | | | | | 10111 | | | | | | | | | 744<br>subfme'<br>B XO | 745<br><b>mulld</b> '<br>64 XO | 746<br>addme' | 747<br><b>mullw</b> '<br>B XO | | | | | | | 11000 | | | | | | | | 775<br>stvepxl<br>E.PD X | טא מ | υ+ ΛΟ | 778<br><i>add'</i><br>B XO | טאמ | | | | | | | 11001 | | | | | | | | 807<br>stvepx<br>E.PD X | | | D AU | | | | | | | | 11010 | | | | | | | | L.1 D X | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | | | 11100 | | | | | | | | 903<br>stvlxl<br>V X | | | | | | | | | | | 11101 | | | | | | | | 935<br><i>stvrxI</i><br>V X | | | | | | | | | П | | 11110 | | | | | | | 966<br><i>ici</i><br>E.CI X | v ^ | | 969<br><b>divdu</b> '<br>64 XO | | 971<br><b>divwu</b> ′<br>B XO | | | | | | | 11111 | | | | | | | 998<br>icread<br>E.CD X | | | 1001<br>divd'<br>64 XO | | 1003<br><i>divw'</i><br>B XO | | | | _ Se | ee<br>e 15 | | Table | | | xtende | | | | | | | | | | 11100 | 11101 | 11110 | 1 11111 | |-------|-----------------------|-------|--------------------------------|------------------------------|-----------------------------|-----------------------------|-------------------------------|------------------------------------|----------------------------------|-------|--------------------------------|---------------------------|---------------------------|------------------------------------|--------------------------------|---------------------------------| | 00000 | 10000<br>16<br>Res'd | 10001 | 10010 | 10011<br>19<br>mfcr | 10100<br>20<br><i>Iwarx</i> | 10101<br>21<br><i>Idx</i> | 10110<br>22<br>icbt | 10111<br>23<br><i>Iwzx</i> | 11000<br>24<br><i>slw</i><br>B X | 11001 | 11010<br>26<br>cntlzw | 11011<br>27<br>sld | 28<br>and | 11101<br>29<br>Idepx | 30<br>rldic!* | 31<br>/wepx<br>E.PD X | | 00001 | VLE | | | B XFX | ВХ | 53<br><i>Idux</i><br>64 X | 54<br><b>dcbst</b><br>B X | 55<br><i> wzux</i><br>B X | 56<br>Res'd | | B X<br>58<br>cntlzd | 64 X | 60<br>andc<br>B X | E.PD X | 64 MD<br>62<br>See<br>Table 12 | E.PD X | | 00010 | | | (82)<br>mtsrd | 83<br>mfmsr<br>B X | 84<br><i>Idarx</i><br>64 X | 64 X | 86<br><b>dcbf</b><br>B X | 87<br><i>Ibzx</i><br>B X | VLE | | 64 X | | ВХ | | 94<br><b>rldicr*</b><br>64 MD | 95<br><i>Ibepx</i><br>E.PD X | | 00011 | | | (114)<br>mtsrdin<br>X | υх | 04 A | | 118<br>clf<br>X | 119<br><i>Ibzux</i><br>B X | | | 122<br>popcntb<br>B X | | 124<br><i>nor</i><br>B X | | 126<br><b>rldicr*</b><br>64 MD | 127<br>dcbfep<br>E.PD X | | 00100 | 144<br>mtcrf<br>B XFX | | 146<br>mtmsr<br>B X | | | 149<br><i>stdx</i><br>64 X | 150<br>stwcx.<br>B X | 151<br>stwx<br>B X | | | 154<br>prtyw<br>B X | | Б Х | 157<br>stdepx<br>E.PD X | 158<br><i>rldic*</i><br>64 MD | 159<br>See<br>Table 14 | | 00101 | D XIX | | 178<br>mtmsrd<br>S X | | | 181<br><i>stdux</i><br>64 X | D A | 183<br><i>stwux</i><br>B X | | | 186<br><b>prtyd</b><br>64 X | | | L.I D X | 190<br><i>rldic*</i><br>64 MD | 191<br>rlwinm*<br>B M | | 00110 | | | 210<br><i>mtsr</i><br>S X | | | 01 X | 214<br><b>stdcx</b> .<br>64 X | 215<br><b>stbx</b><br>B X | | | 01 X | | | | 222<br><b>rldimi*</b><br>64 MD | 223<br>stbepx<br>E.PD X | | 00111 | | | 242<br>mtsrin<br>S X | | | | 246<br>dcbtst<br>B X | 247<br>stbux<br>B X | | | | | | | 254<br><i>rldimi*</i><br>64 MD | 255<br>See<br>Table 14 | | 01000 | | | 274<br><b>tlbiel</b><br>S X | 275<br><b>mfapidi</b><br>E X | | | 278<br><b>dcbt</b><br>B X | 279<br><i>Ihzx</i><br>B X | 280<br>Res'd<br>VLE | | 282<br><b>cdtbcd</b><br>BCDA X | | 284<br><i>eqv</i><br>B X | 285<br><b>evlddepx</b><br>E.PD evx | 286<br><i>rldcl*</i><br>64 MDS | 286<br>See<br>Table 14 | | 01001 | | | 306<br><b>tlbie</b><br>S X | | 308<br>Res'd | | 310<br><i>eciwx</i><br>EC X | 311<br><i>Ihzux</i><br>B X | 312<br>Res'd<br>VLE | | 314<br><b>cbcdtd</b><br>BCDA X | | 316<br><b>xor</b><br>B X | | 318<br><i>rldcr*</i><br>64 MDS | 319<br>See<br>Table 14 | | 01010 | | | | 339<br><b>mfspr</b><br>B XFX | | 341<br><i>Iwax</i><br>64 X | 342<br>Res'd<br>AP | 343<br><i>Ihax</i><br>B X | | | | | | | 350 | 351<br><b>xori*</b><br>B D | | 01011 | | | 370<br><b>tlbia</b><br>S X | 371<br><i>mftb</i><br>S XFX | | 373<br><i>Iwaux</i><br>64 X | 374<br>Res'd<br>AP | 375<br><i>Ihaux</i><br>B X | | | | | | | 382 | 383<br><b>xoris*</b><br>B D | | 01100 | | | 402<br><b>slbmte</b><br>S X | | | | | 407<br><i>sthx</i><br>B X | | | | | 412<br><i>orc</i><br>B X | 413<br>evstddepx<br>E.PD evx | 414 | 415<br>See<br>Table 14 | | 01101 | | | 434<br><b>slbie</b><br>S X | | | | 438<br><i>ecowx</i><br>EC X | 439<br><b>sthux</b><br>B X | | | | | 444<br><b>or</b><br>B X | | 446<br>* | 447<br><b>andis.</b> *<br>B D | | 01110 | | | | 467<br><b>mtspr</b><br>B XFX | | 469 | 470<br><b>dcbi</b><br>E X | | | | | | 476<br><b>nand</b><br>B X | | 478<br>* | | | 01111 | | | 498<br><b>slbia</b><br>S X | | F00 | 501 | 504 | 503<br><b>stmw*</b><br>All D | F2/ | | | F00 | 508<br><b>cmpb</b><br>B X | | 510 | | | 10000 | | | (530)<br>no-op | | 532<br>Res'd | 533<br><i>Iswx</i><br>B MA | 534<br>Iwbrx<br>B X | 535<br><b>Ifsx</b><br>FP X | 536<br><b>srw</b><br>B X | | | 539<br><b>srd</b><br>64 X | | | | | | 10001 | | | (562)<br><b>no-op</b> | FOF | | F07 | 566<br>tlbsync<br>S X | 567<br><b>Ifsux</b><br>FP X | 568<br>Res'd<br>VLE | | | | | | | 007 | | 10010 | | | (594)<br><b>no-op</b> | 595<br><b>mfsr</b><br>S X | | 597<br><i>Iswi</i><br>B MA | 598<br><b>sync</b><br>B X | 599<br><i>Ifdx</i><br>FP X | | | | | | | | 607<br><i>Ifdepx</i><br>E.PD X | | 10011 | | | (626)<br><b>no-op</b><br>(658) | 659 | 660 | 661 | 662 | 631<br><b>Ifdux</b><br>FP X<br>663 | | | | | | | | | | 10100 | | | no-op<br>(690) | mfsrin<br>S X | Res'd | stswx<br>B MA | stwbrx<br>B X | stfsx<br>FP X | | | | | | | | | | 10101 | | | no-op<br>(722) | | | 725 . | | stfsux<br>FP X | | | | | | | | 735 | | 10110 | | | no-op<br>(754) | | | stswi<br>B MA | 758 | stfdx<br>FP X<br>759 | | | | | | | | 735<br><b>stfdepx</b><br>E.PD X | | 10111 | | | <b>no-op</b><br>786 | | | 789 | E X | stfdux<br>FP X<br>791 | 792 | | 794 | | | | | | | 11000 | | | tlbivax<br>E X<br>818 | | | S X<br>821 | Ihbrx<br>B X<br>822 | FP X | B X | | <b>srad</b><br>64 X<br>826 | 827 | | | | | | 11001 | | | rac<br>X | .851 | | S X<br>853 | Res'd | Res'd | <i>srawi</i><br>B X | | <b>sradi</b><br>64 XS | <i>sradi′</i><br>64 XS | | | | | | 11010 | | | | slbmfev<br>S X | | S X | 854<br>See<br>Table 13 | <b>Ifiwax</b><br>S.PI X | | | | | | | | | | 11011 | | | 914 | 915 | | S X | 918 | 919 | | | 922 | | | | | | | 11100 | | | E X | slbmfee<br>S X | | stwcix<br>S X | sthbrx<br>B X | stfdpx<br>FP X | | | extsh<br>B X | | | | | | | 11101 | | | E X | 979 | | sthcix<br>S X | 982 | Res'd<br>AP<br>983 | | | extsb<br>B X<br>986 | | | | | 991<br>icbiep | | 11110 | | | E X | 979<br><b>slbfee</b><br>S X | | stbcix<br>S X | B X | stfiwx<br>FP X | | | extsw<br>64 X | | | | | 1023 | | 11111 | | | Res'd | | | stdcix<br>S X | dcbz<br>B X | | | | <u> </u> | | | | | dcbzep<br>E.PD X | Table 12: Opcode: 31, Extended Opcode: 62 | | — | | , | |-------|-------------------------------|---------------------------|---| | 0 | 000 | 001 | | | 00001 | 62<br><i>rldicl*</i><br>64 MD | 62<br><b>wait</b><br>WT X | | Table 13: Opcode: 31, Extended Opcode: 854 | | | 10 | 110 | | |-------|---|-------------------|-----|------------------| | 11010 | s | 854<br>eieio<br>X | E | 854<br>mbar<br>X | Table 14: Opcode: 31, Extended Opcode: 159 | · abi | <i>,</i> , , , , | poodo | |-------|------------------------------|---------------------------------| | | | 111 | | 00100 | 159<br><i>rlwimi*</i><br>B M | 159<br><b>stwepx</b><br>E.PD X | | 00101 | 191<br><i>rlwinm*</i><br>B M | | | 00110 | | 223<br>stbepx<br>E.PD X | | 00111 | 255<br><b>rlwnm*</b><br>B M | 255<br><i>dcbstep</i><br>E.PD X | | 01000 | 287<br><b>ori*</b><br>B D | 287<br><i>Ihepx</i><br>E.PD X | | 01001 | 319<br><i>oris*</i><br>B D | 319<br><i>dcbtep</i><br>E.PD X | | 01010 | 351<br><b>xori*</b><br>B D | | | 01011 | 383<br><b>xoris*</b><br>B D | | | 01100 | 415<br><b>andi.*</b><br>B D | 415<br>sthepx<br>E.PD X | Table 15: Opcode: 31, Extended Opcode: 15 | Table | 9 15: C | )pc | ode | |-------|------------------------------|-----|---------------------| | 00000 | | | 5<br><b>el</b><br>A | | 00001 | 47<br>* | | | | 00010 | 79<br><b>tdi*</b><br>64 D | | | | 00011 | 111<br><i>twi*</i><br>B D | | | | 00100 | 143 | | | | 00101 | 175 | | | | 00110 | 207 | | | | 00111 | 239<br><b>mulli*</b><br>B D | | | | 01000 | 271<br><b>subfic*</b><br>B D | | | | 01001 | | | | | 01010 | 335<br><b>cmpli*</b><br>B D | | | | 01011 | 367<br><b>cmpi*</b><br>B D | | | | 01100 | 399<br><b>addic*</b><br>B D | | | | 01101 | 431<br><b>addic.*</b><br>B D | | | | 01110 | 463<br><b>addi*</b><br>B D | | | | 01111 | 495<br><b>addis*</b><br>B D | | | | 10000 | | | | | 10001 | | | | | 10010 | | | | | 10011 | | | | | 10100 | | | | | 10101 | | | | | 10110 | | | | | 10111 | | | | | 11000 | | | | | 11001 | | | | | 11010 | | | | | 11011 | | | | | 11100 | | | | | 11101 | | | | | 11110 | | | | | 11111 | _ | is | el | | | | | | | Tabl | e 16:(L | eft) Ext | ended | opcode<br>00011 | es for p | rimary | opcode | 59 (ins | structio | n bits 2 | 1:30) | | | | | | |-------|---------|----------|---------------------------------|-------------------------------------------------|----------|--------|--------|---------|----------|----------|-------|-------|-------|-------|-------|-------| | 00000 | 00000 | 00001 | 00010<br>2<br>dadd | 3<br><b>dgua</b><br>DFP Z | 00100 | 00101 | 00110 | 00111 | 01000 | 01001 | 01010 | 01011 | 01100 | 01101 | 01110 | 01111 | | | | | DFP X | DFP Z | | | | | | | | | | | | | | 00001 | | | DFP X | 35<br><b>drrnd</b><br>DFP Z | | | | | | | | | | | | | | 00010 | | | dscli<br>DFP Z22<br>98 | 67<br>dquai<br>DFP Z<br>99<br>drintx<br>DFP Z23 | | | | | | | | | | | | | | 00011 | | | dscri<br>DFP Z | <i>drintx</i><br>DFP Z23 | | | | | | | | | | | | | | 00100 | | | 130<br><b>dcmpo</b><br>DFP X | | | | | | | | | | | | | | | 00101 | | | 162<br><b>dtstex</b><br>DFP X | | | | | | | | | | | | | | | 00110 | | | 194<br>dtstdc<br>DFP Z23 | | | | | | | | | | | | | | | 00111 | | | 226<br><b>dtstdg</b><br>DFP Z23 | 227<br><b>drintn</b><br>DFP Z23 | | | | | | | | | | | | | | 01000 | | | 258<br><b>dctdps</b><br>DFP X | | | | | | | | | | | | | | | 01001 | | | 290<br>dctfix<br>DFP X | | | | | | | | | | | | | | | 01010 | | | 322<br><b>ddedpd</b><br>DFP X | | | | | | | | | | | | | | | 01011 | | | 354<br><b>dxex</b><br>DFP X | | | | | | | | | | | | | | | 01100 | | | DII X | | | | | | | | | | | | | | | 01101 | | | | | | | | | | | | | | | | | | 01110 | | | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | 10000 | | | 514<br><i>dsub</i><br>DFP X | | | | | | | | | | | | | | | 10001 | | | 546<br><i>ddiv</i><br>DFP X | | | | | | | | | | | | | | | 10010 | | | DFP X | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | 10100 | | | 642<br><b>dcmpu</b><br>DFP X | | | | | | | | | | | | | | | 10101 | | | DFP X<br>674<br>dtstsf | | | | | | | | | | | | | | | | | | DFP X | | | | | | | | | | | | | | | 10110 | | | | | | | | | | | | | | | | | | 10111 | | | 7,70 | | | | | | | | | | | | | | | 11000 | | | 770<br><b>drsp</b><br>DFP X | | | | | | | | | | | | | | | 11001 | | | 834 | | | | | | | | | | | | | | | 11010 | | | 834<br>denbcd<br>DFP X | | | | | | | | | | | | | | | 11011 | | | 866<br><i>diex</i><br>DFP X | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | 11111 | | | | | | | | | | | | | | | | | | 100000 | 10000 | 10001 | 1000<br>11 fdui<br>FP | 010<br>8<br>8<br>ivs<br>A | ed opco<br>10011 | fsi<br>fsi<br>FP | 100<br>20<br>20<br>A | 100<br>22 fauch<br>FP | 101<br>11<br>1dds<br>A | 10<br>2<br>fsc<br>FP | 110<br>22<br>grts<br>A | 10111 | 110<br>22<br>fr<br>FP | 000<br>4<br>es<br>A | 110<br>2<br>fm<br>FP | 5<br>uls<br>A | 110<br>2<br>frsq<br>FP | o10<br>6<br>rtes<br>A | 11011 | 2 fm:<br>FP | 8<br>Sub<br>A | 111<br>2<br>fma<br>FP | 9<br>dds<br>A | 111<br>3<br>fnms<br>FP | 0<br>Subs<br>A | fnma<br>FP | 1111<br>adds<br>A | |-------------------------------------------------------------------------------------------------------------------------------------|-------|-------|-----------------------|---------------------------|------------------|------------------|----------------------|-----------------------|------------------------|----------------------|------------------------|-------|-----------------------|---------------------|----------------------|---------------|------------------------|-----------------------|-------|-------------|---------------|-----------------------|---------------|------------------------|----------------|---------------|-------------------| | 00001<br>00010<br>00011<br>00100<br>00101<br>00110<br>00111<br>01000<br>01011<br>01100<br>01101<br>01110<br>01111<br>01110<br>01111 | | | FP | A | | FP | A | FP | A | FP | A | | FP | A A | FP | A | FP | A | | FP | A | FP | A | FP | A | FP | A | | 00010<br>00011<br>00100<br>00101<br>00101<br>00111<br>01000<br>01001<br>01101<br>01110<br>01111<br>01110<br>01111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00011<br>00100<br>00101<br>00110<br>00111<br>01000<br>01001<br>01010<br>01101<br>01110<br>01111<br>01110<br>01111<br>01000<br>01001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00100<br>00101<br>00110<br>00111<br>01000<br>01001<br>01010<br>01101<br>01110<br>01111<br>01110<br>01111<br>01000<br>01001 | | | | | | | | | | | | | | | 1 | | | | | | | | - | | | | 1 | | 00101<br>00110<br>00111<br>01000<br>01001<br>01010<br>01011<br>01110<br>01111<br>01110<br>01111<br>011111 | | | | | | | | | | | | | | | | | | | | | | | | | $\perp$ | $\rightarrow$ | | | 00110<br>00111<br>01000<br>01001<br>01010<br>01011<br>01110<br>01111<br>011110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00111<br>01000<br>01001<br>01010<br>01011<br>01100<br>01111<br>01110<br>01111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01000<br>01001<br>01010<br>01011<br>01100<br>01101<br>01110<br>01111<br>011110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01001<br>01010<br>01011<br>01100<br>01101<br>01110<br>01111<br>011110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01010<br>01011<br>01100<br>01101<br>01101<br>011110<br>011111<br>100000 | | | | | 1 | | | | | | | | | | | | | | | | | | | | | | | | 01011<br>01100<br>01101<br>01110<br>01111<br>01111<br>10000 | | | | 1 | | | | | | | | | | | | | | | | | | | | | | | | | 01100<br>01101<br>01110<br>01111<br>01111<br>10000 | | | ŀ | | | | | | | | | | | | | | | | | | | | | | | | | | 01101<br>01110<br>01111<br>011111<br>10000 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01110<br>01111<br>10000<br>10001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10000 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10010 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10100 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10101 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11000 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11010 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11011 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11111 | | | fdi | | | | ubs | | dds | fsc | | | | | $\vdash$ | | | | | | | H | | | | - | | | Table | e 17:(L<br>00000 | eft) Ext | ended<br>00010 | opcode<br>00011 | es for p | rimary<br>00101 | opcode | 63 (ins | structio<br>01000 | n bits 2 | 21:30) | 01011 | 01100 | 01101 | 01110 | 01111 | |-------|----------------------------|----------|----------------------------------|----------------------------------|----------|-----------------|------------------------------|-------------------------------|-----------------------------|----------|--------|-------|---------------------------|-------|-----------------------------|------------------------------| | 00000 | 0<br>fcmpu<br>FP X | | daddq<br>DFP X | dquaq<br>DFP Z | | | | | fcpsgn<br>FP X | | | | 12<br><b>frsp</b><br>FP X | | 14<br>fctiw<br>FP X | 15<br><i>fctiwz</i><br>FP X | | 00001 | 32<br>fcmpo<br>FP X | | 34<br><b>dmulqq</b><br>DFP X | 35<br><b>drrndg</b><br>DFP Z23 | | | 38<br>mtfsb1<br>FP X | | 40<br>fneg | | | | | | | | | 00010 | 64<br><b>mcrfs</b><br>FP X | | 66<br>dsclig<br>DEP 722 | dquaiq<br>DEP 7 | | | 70<br>mtfsb0<br>FP X | | 72<br><i>fmr</i><br>FP X | | | | | | | | | 00011 | | | 98<br><b>dscriq</b><br>DEP 7 | 99<br><b>drintxq</b><br>DEP 723 | | | | | | | | | | | | | | 00100 | | | 130<br>dcmpoq<br>DFP X | | | | 134<br><i>mtfsfi</i><br>FP X | | 136<br><b>fnabs</b><br>FP X | | | | | | | | | 00101 | | | 162<br><b>dtstexq</b><br>DFP X | | | | | | | | | | | | | | | 00110 | | | 194<br>dtstdcq<br>DFP 722 | | | | | | | | | | | | | | | 00111 | | | 226<br><b>dtstdgq</b><br>DFP Z22 | 227<br><b>drintnq</b><br>DFP Z23 | | | | | | | | | | | | | | 01000 | | | 258<br><b>dctqpq</b><br>DFP X | | | | | | 264<br><b>fabs</b><br>FP X | | | | | | | | | 01001 | | | 290<br><b>dctfixq</b><br>DEP X | | | | | | | | | | | | | | | 01010 | | | 322<br><b>ddedpdg</b><br>DFP X | | | | | | | | | | | | | | | 01011 | | | 354<br><b>dxexq</b><br>DFP X | | | | | | | | | | | | | | | 01100 | | | | | | | | | 392<br><i>frin</i><br>FP X | | | | | | | | | 01101 | | | | | | | | | 424<br><i>friz</i><br>FP X | | | | | | | | | 01110 | | | | | | | | | 456<br><i>frip</i><br>FP X | | | | | | | | | 01111 | | | | | | | | | 488<br><i>frim</i><br>FP X | | | | | | | | | 10000 | | | 514<br><b>dsubq</b><br>DFP X | | | | | | | | | | | | | | | 10001 | | | 546<br><b>ddivq</b><br>DFP X | | | | | | | | | | | | | | | 10010 | | | | | | | | 583<br><b>mffs</b><br>FP X | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | 10100 | | | 642<br><b>dcmpuq</b><br>DFP X | | | | | | | | | | | | | | | 10101 | | | 674<br><b>dtstsfq</b><br>DFP X | | | | | | | | | | | | | | | 10110 | | | | | | | | 711<br><i>mtfsf</i><br>FP XFL | | | | | | | | | | 10111 | | | | | | | | | | | | | | | | | | 11000 | | | 770<br><b>drdpq</b><br>DFP X | | | | | | | | | | | | | | | 11001 | | | 802<br><b>dcffixq</b><br>DFP X | | | | | | | | | | | | 814<br><i>fctid</i><br>FP X | 815<br><i>fctidz</i><br>FP > | | 11010 | | | 834<br><b>denbcdg</b><br>DFP X | | | | | | | | | | | | 846<br><i>fcfid</i><br>FP X | | | 11011 | | | 866<br><b>diexq</b><br>DFP X | | | | | | | | | | | | | | | 11100 | | | | | | | | | | | | | | | | | | 11101 | | | | | | | | | | | | | | | | | | 11110 | | | | | | | | | | | | | | | | | | 11111 | | | | | | | | | | | | | | | | | | Table | 2 17. (F | Right) E | xte | ende | ed opco | des | for | pri | mar | уо | рсо | de 6 | 63 ( | inst | ruct | tion | bits | 21 | :30) | ) | | | | | | | | | |-------|----------|----------|-----|-----------------------|---------|-----|----------------------|-----|----------------------|----------|----------------------|----------------------|----------|------|----------------------|----------|-----------------------|-------------------|------------------|-------|-----|------------------------------|----------|----------------------------------------------|-----|-------------------------------|-----|------------------------| | 00000 | 10000 | 10001 | 10 | 010<br>18<br>div<br>A | 10011 | 10 | 100<br>20<br>20<br>A | 10° | 101<br>21<br>dd<br>A | 10<br>fs | 110<br>2<br>qrt<br>A | 101<br>2<br>fs<br>FP | 111<br>3 | 110 | 000<br>24<br>re<br>A | 11<br>fr | 001<br>25<br>nul<br>A | 110<br>frsc<br>FP | 010<br>6<br>arte | 11011 | 11° | 100<br>28<br><b>sub</b><br>A | 11<br>fm | 101<br>29<br>add<br>A | 11° | 110<br>0<br>1 <b>sub</b><br>A | 11° | 111<br>31<br>nadd<br>A | | | | | FP | A | | FP | A | FP | A | FP | A<br>A | FP | A | FP | A | FP | <u>А</u> | FP | Ä | | FP | A | FP | A | FP | A | FP | A | | 00001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00010 | | | | | | | | | | | | | | | | | | | | | | | | <u> </u> | | | | | | 00011 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00100 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00101 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 00111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01000 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01010 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01011 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01100 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01101 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 01111 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10000 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10001 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10010 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10011 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10100 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10101 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 10110 | | | | | | | | | | | | | | | <u> </u><br> <br> | | | | | | | | | <u> </u><br> | | | | | | 10111 | | | | | | | | | | | | | | | | | | H | | | | | | | | | | | | 11000 | | | | | | | | | | | | | | | | | | H | | | | | | | | | H | | | 11001 | | | | | | | | | | | | | | | | | | $\vdash$ | | | | | | | | | H | | | 11010 | | | | <u> </u> | | | | | | | | | | | <u> </u><br> | | <u> </u> | | | | | | | <u> </u><br> | | | | | | 11011 | | | | - | | | | | | | | | | | | | | H | | | | | | - | | | Н | | | | | | | | | | | | | | | | | | | | | | | | | | | <u> </u> | | | | | | 11100 | | | | <u> </u> | | | | | | | | | | | <u> </u><br> | | | | | | | | | <u> </u><br> | | | | | | 11101 | | | | | | | | | | | | | | | <u> </u><br> | | | | | | | | | <u> </u> | | | | | | 11110 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 11111 | | | f | div | | fs | ub | fa | dd | fs | qrt | fs | el | f | re | fn | nul | frs | qrte | | fm | sub | fm | add | fnn | isub | fnn | nadd | ## **Appendix G. Power ISA Instruction Set Sorted by Mnemonic** This appendix lists all the instructions in the Power ISA, in order by mnemonic. | | _ | Op | code | o ⁻. | ١, | | | | | |---|-----------|---------|------------|---------------------------|------|------------|------------------|-----------------|---------------------------------------------------------------------------------| | | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | | XO | 31 | 266 | SR | | 63 | В | add[o][.] | Add | | | XO | 31 | 10 | SR | | 64 | В | addc[o][.] | Add Carrying | | | XO | 31 | 138 | SR | | 65 | В | adde[o][.] | Add Extended | | | XO | 31 | 74 | SR | Н | 495 | BCDA | addg6s | Add and Generate Sixes | | | D | 14 | | | | 62 | В | addi | Add Immediate | | | D | 12 | | SR | | 63 | В | addic | Add Immediate Carrying | | | D | 13 | | SR | | 63 | В | addic. | Add Immediate Carrying and Record | | | D | 15 | | | | 62 | В | addis | Add Immediate Shifted | | | XO | 31 | 234 | SR | | 65 | В | addme[o][.] | Add to Minus One Extended | | | XO | 31 | 202 | SR | | 66 | В | addze[o][.] | Add to Zero Extended | | | X | 31 | 28 | SR | | 77 | В | and[.] | AND | | | X | 31 | 60 | SR | | 78 | В | andc[.] | AND with Complement | | | D | 28 | | SR | | 75 | В | andi. | AND Immediate | | | D | 29 | | SR | | 75 | В | andis. | AND Immediate Shifted | | | _ | 18 | | <b>ОТ</b> | | 35 | В | b[l][a] | Branch Conditional | | | В | 16 | <b>500</b> | CT | | 35 | В | bc[l][a] | Branch Conditional | | | XL | 19 | 528 | CT | | 36 | В | bcctr[l] | Branch Conditional to Count Register | | | XL<br>EVX | 19<br>4 | 16<br>527 | СТ | | 36 | B<br>SP | bclr[l] | Branch Conditional to Link Register | | | X | 31 | 314 | | Н | 268<br>494 | BCDA | brinc<br>cbcdtd | Bit Reversed Increment | | | x | 31 | 282 | | Н | 494<br>494 | BCDA | cdtbcd | Convert Binary Coded Decimal to Declets Convert Declets To Binary Coded Decimal | | | x | 31 | 0 | | ''' | 71 | В | cmp | Compare | | | X | 31 | 508 | | | 79 | В | cmpb | Compare Bytes | | | D | 11 | 300 | | | 71 | В | cmpi | Compare Immediate | | | X | 31 | 32 | | | 72 | В | cmpl | Compare Logical | | | D D | 10 | 02 | | | 72 | В | cmpli | Compare Logical Immediate | | | X | 31 | 58 | SR | | 81 | 64 | cntlzd[.] | Count Leading Zeros Doubleword | | | X | 31 | 26 | SR | | 79 | В | cntlzw[.] | Count Leading Zeros Word | | | XL | 19 | 257 | | | 37 | В | crand | Condition Register AND | | | XL | 19 | 129 | | | 38 | В | crandc | Condition Register AND with Complement | | | XL | 19 | 289 | | | 38 | В | creqv | Condition Register Equivalent | | | XL | 19 | 225 | | | 37 | В | crnand | Condition Register NAND | | | XL | 19 | 33 | | | 38 | В | crnor | Condition Register NOR | | | XL | 19 | 449 | | | 37 | В | cror | Condition Register OR | | | XL | 19 | 417 | | | 38 | В | crorc | Condition Register OR with Complement | | | XL | 19 | 193 | | | 37 | В | crxor | Condition Register XOR | | | Χ | 59 | 2 | | | 163 | DFP | dadd | DFP Add | | | Χ | 63 | 2 | | | 163 | DFP | daddq | DFP Add Quad | | | Χ | 31 | 758 | | | 433 | Е | dcba | Data Cache Block Allocate | | | Χ | 31 | 86 | | | 437 | В | dcbf | Data Cache Block Flush | | | X | 31 | 127 | | Р | 632 | E.PD | dcbfep | Data Cache Block Flush by External PID | | _ | X | 31 | 470 | | Р | 652 | Ε. | dcbi | Data Cache Block Invalidate | | | X | 31 | 390 | | М | 656 | ECL | dcblc | Data Cache Block Lock Clear | | | X | 31 | 54 | | | 436 | В | dcbst | Data Cache Block Store | | | X | 31 | 63 | | Р | 631 | E.PD | dcbstep | Data Cache Block Store by External PID | | | Χ | 31 | 278 | ] | | 434 | В | dcbt | Data Cache Block Touch | | ۶ | Ор | code | e | ۲, | | - 1 | | | |------------|----------|------------|---------------------------|--------|------------|------------------|---------------------------|----------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | X | 31<br>31 | 319<br>166 | | P<br>M | 631<br>655 | E.PD<br>ECL | dcbtep<br>dcbtls | Data Cache Block Touch by External PID Data Cache Block Touch and Lock Set | | ı ^ | 31 | 246 | | IVI | 435 | B | dcbtst | Data Cache Block Touch for Store | | ı Î | 31 | 255 | | Р | 633 | E.PD | dcbtstep | Data Cache Block Touch for Store by External PID | | X | 31 | 134 | | M | 655 | ECL | dcbtstls | Data Cache Block Touch for Store and Lock Set | | X | 31 | 1014 | | | 436 | B | dcbz | Data Cache Block set to Zero | | X | 31 | 1023 | | Р | 634 | E.PD | dcbzep | Data Cache Block set to Zero by External PID | | Χ | 63 | 802 | | | 185 | DFP | dcffixq | DFP Convert From Fixed Quad | | Χ | 31 | 454 | | Р | 727 | E.CI | dci | Data Cache Invalidate | | X | 59 | 130 | | | 169 | DFP | dcmpo | DFP Compare Ordered | | X | 63 | 130 | | | 169 | DFP | dcmpoq | DFP Compare Ordered Quad | | X | 59 | 642 | | | 168 | DFP | dcmpu | DFP Compare Unordered | | X | 63 | 642 | | _ | 169 | DFP | dcmpuq | DFP Compare Unordered Quad | | X | 31 | 326 | | Р | 730 | E.CD | dcread | Data Cache Read [Alternative Encoding] | | X | 31 | 486 | | Р | 730 | E.CD | dcread | Data Cache Read | | X | 59<br>59 | 258<br>290 | | | 183<br>185 | DFP<br>DFP | dctdp<br>dctfix | DFP Convert To DFP Long DFP Convert To Fixed | | x | 63 | 290 | | | 185 | DFP | dctfixq | DFP Convert To Fixed Quad | | X | 63 | 258 | | | 183 | DFP | dctqpq | DFP Convert To DFP Extended | | X | 59 | 322 | | | 187 | DFP | ddedpd | DFP Decode DPD To BCD | | X | 63 | 322 | | | 187 | DFP | ddedpdq | DFP Decode DPD To BCD Quad | | X | 59 | 546 | | | 166 | DFP | ddiv | DFP Divide | | X | 63 | 546 | | | 166 | DFP | ddivq | DFP Divide Quad | | X | 59 | 834 | | | 187 | DFP | denbcd | DFP Encode BCD To DPD | | X | 63 | 834 | | | 187 | DFP | denbcdq | DFP Encode BCD To DPD Quad | | X | 59 | 866 | | | 188 | DFP | diex | DFP Insert Biased Exponent | | X X | 63 | 866 | 0.0 | | 188 | DFP | diexq | DFP Insert Biased Exponent Quad | | XO | 31 | 489 | SR<br>SR | | 70 | 64 | divd[o][.] | Divide Doubleword | | XO<br>XO | 31<br>31 | 457<br>491 | SR | | 70<br>68 | 64<br>B | divdu[o][.]<br>divw[o][.] | Divide Doubleword Unsigned Divide Word | | XO | 31 | 459 | SR | | 68 | В | divwu[o][.] | Divide Word Unsigned | | X | 31 | 78 | 0.1 | | 349 | LMV | dlmzb[.] | Determine Leftmost Zero Byte | | X | 59 | 34 | | | 165 | DFP | dmul | DFP Multiply | | X | 63 | 34 | | | 165 | DFP | dmulq | DFP Multiply Quad | | XFX | 19 | 198 | | | 718 | E.ED | dnh | Debugger Notify Halt | | XL | 19 | 402 | | Н | 482 | S | doze | Doze | | Z | 59 | 3 | | | 174 | DFP | dqua | DFP Quantize | | Z23 | 59 | 67 | | | 173 | DFP | dquai[.] | DFP Quantize Immediate | | Z23<br>Z23 | 63<br>63 | 67<br>3 | | | 173<br>174 | DFP<br>DFP | dquaiq[.] | DFP Quantize Immediate Quad DFP Quantize Quad | | X | 63 | 770 | | | 184 | DFP | dquaq[.]<br>drdpq | DFP Round To DFP Long | | Z23 | 59 | 227 | | | 181 | DFP | drintn[.] | DFP Round To FP Integer Without Inexact | | Z23 | 63 | 227 | | | 181 | DFP | drintnq[.] | DFP Round To FP Integer Without Inexact Quad | | Z23 | 59 | 99 | | | 179 | DFP | drintx[.] | DFP Round To FP Integer With Inexact | | Z23 | 63 | 99 | | | 179 | DFP | drintxq[.] | DFP Round To FP Integer With Inexact Quad | | Z | 59 | 35 | | | 176 | DFP | drrnd | DFP Reround | | Z23 | 63 | 35 | | | 176 | DFP | drrndq[.] | DFP Reround Quad | | X | 59 | 770 | | | 184 | DFP | drsp | DFP Round To DFP Short | | Z23 | 59 | 66 | | | 190 | DFP | dscli[.] | DFP Shift Significand Left Immediate | | Z23 | 63 | 66 | | | 190 | DFP | dscliq[.] | DFP Shift Significand Left Immediate Quad | | Z<br>Z | 59<br>63 | 98<br>98 | | | 190<br>190 | DFP<br>DFP | dscri<br>dscriq | DFP Shift Significand Right Immediate DFP Shift Significand Right Immediate Quad | | X | 59 | 514 | | | 163 | DFP | dsub | DFP Subtract | | X | 63 | 514 | | | 163 | DFP | dsubq | DFP Subtract Quad | | Z23 | 59 | 194 | | | 170 | DFP | dtstdc | DFP Test Data Class | | Z23 | 63 | 194 | | | 170 | DFP | dtstdcq | DFP Test Data Class Quad | | Z23 | 59 | 226 | | | 170 | DFP | dtstdg | DFP Test Data Group | | Z23 | 63 | 226 | | | 170 | DFP | dtstdgq | DFP Test Data Group Quad | | X | 59 | 162 | | | 171 | DFP | dtstex | DFP Test Exponent | | X | 63 | 162 | | | 171 | DFP | dtstexq | DFP Test Exponent Quad | | Χ | 59 | 674 | | | 172 | DFP | dtstsf | DFP Test Significance | | ٦ | Op | code | o ⁻. | _ | | | | | |------|-----|------|---------------------------|------|------|------------------|-----------|-----------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | X | 63 | 674 | | | 172 | DFP | dtstsfq | DFP Test Significance Quad | | X | 59 | 354 | | | 188 | DFP | dxex | DFP Extract Biased Exponent | | X | 63 | 354 | | | 188 | DFP | dxexq | DFP Extract Biased Exponent Quad | | X | 31 | 310 | | | 456 | EC | eciwx | External Control In Word Indexed | | Χ | 31 | 438 | | | 456 | EC | ecowx | External Control Out Word Indexed | | EVX | 4 | 740 | | | 335 | SP.FD | efdabs | Floating-Point Double-Precision Absolute Value | | EVX | 4 | 736 | | | 336 | SP.FD | efdadd | Floating-Point Double-Precision Add | | EVX | 4 | 751 | | | 342 | SP.FD | efdcfs | Floating-Point Double-Precision Convert from Single-Precision | | EVX | 4 | 755 | | | 340 | SP.FD | efdcfsf | Convert Floating-Point Double-Precision from Signed Fraction | | EVX | 4 | 753 | | | 339 | SP.FD | efdcfsi | Convert Floating-Point Double-Precision from Signed Integer | | EVX | 4 | 739 | | | 340 | SP.FD | efdcfsid | Convert Floating-Point Double-Precision from Signed Integer Doubleword | | EVX | 4 | 754 | | | 340 | SP.FD | efdcfuf | Convert Floating-Point Double-Precision from Unsigned Fraction | | EVX | 4 | 752 | | | 339 | SP.FD | efdcfui | Convert Floating-Point Double-Precision from Unsigned Integer | | EVX | 4 | 738 | | | 340 | SP.FD | efdcfuid | Convert Floating-Point Double-Precision from Unsigned Integer Doubleword | | EVX | 4 | 750 | | | 337 | SP.FD | efdcmpeq | Floating-Point Double-Precision Compare Equal | | EVX | 4 | 748 | | | 337 | SP.FD | efdcmpgt | Floating-Point Double-Precision Compare Greater Than | | EVX | 4 | 749 | | | 337 | SP.FD | efdcmplt | Floating-Point Double-Precision Compare Less Than | | EVX | 4 | 759 | | | 342 | SP.FD | efdctsf | Convert Floating-Point Double-Precision to Signed Fraction | | EVX | 4 | 757 | | | 340 | SP.FD | efdctsi | Convert Floating-Point Double-Precision to Signed Integer | | EVX | 4 | 747 | | | 341 | SP.FD | efdctsidz | Convert Floating-Point Double-Precision to Signed Integer Doubleword with Round toward Zero | | EVX | 4 | 762 | | | 342 | SP.FD | efdctsiz | Convert Floating-Point Double-Precision to Signed Integer with Round toward Zero | | EVX | 4 | 758 | | | 342 | SP.FD | efdctuf | Convert Floating-Point Double-Precision to Unsigned Fraction | | EVX | 4 | 756 | | | 340 | SP.FD | efdctui | Convert Floating-Point Double-Precision to Unsigned Integer | | EVX | 4 | 746 | | | 341 | SP.FD | efdctuidz | Convert Floating-Point Double-Precision to Unsigned Integer Doubleword with Round toward Zero | | EVX | 4 | 760 | | | 342 | SP.FD | efdctuiz | Convert Floating-Point Double-Precision to Unsigned Integer with Round toward Zero | | EVX | 4 | 745 | | | 336 | SP.FD | efddiv | Floating-Point Double-Precision Divide | | EVX | 4 | 744 | | | 336 | SP.FD | efdmul | Floating-Point Double-Precision Multiply | | EVX | 4 | 741 | | | 335 | SP.FD | efdnabs | Floating-Point Double-Precision Negative Absolute Value | | EVX | 4 | 742 | | | 335 | SP.FD | efdneg | Floating-Point Double-Precision Negate | | EVX | 4 | 737 | | l | 336 | SP.FD | efdsub | Floating-Point Double-Precision Subtract | | EVX | 4 | 766 | | l | 338 | SP.FD | efdtsteg | Floating-Point Double-Precision Test Equal | | EVX | 4 | 764 | | | 337 | SP.FD | efdtstgt | Floating-Point Double-Precision Test Greater Than | | EVX | 4 | 765 | | | 338 | SP.FD | efdtstlt | Floating-Point Double-Precision Test Less Than | | EVX | 4 | 708 | | l | 328 | SP.FS | efsabs | Floating-Point Single-Precision Absolute Value | | EVX | 4 | 704 | | | 329 | SP.FS | efsadd | Floating-Point Single-Precision Add | | EVX | 4 | 719 | | | 343 | SP.FD | efscfd | Floating-Point Single-Precision Convert from Double-Precision | | EVX | 4 | 723 | | | 333 | SP.FS | efscfsf | Convert Floating-Point Single-Precision from Signed Fraction | | EVX | 4 | 721 | | | 333 | SP.FS | efscfsi | Convert Floating-Point Single-Precision from Signed | | EVX | 4 | 722 | | | 333 | SP.FS | efscfuf | Integer Convert Floating-Point Single-Precision from Unsigned Fraction | | _ | Ор | code | <u>o</u> ⁻. | | | | | | |--------------------------------------------------------------------|---------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------|---------------------------|------|----------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 720 | | | 333 | SP.FS | efscfui | Convert Floating-Point Single-Precision from Unsigned Integer | | EVX<br>EVX<br>EVX | 4<br>4<br>4 | 718<br>716<br>717<br>727 | | | 331<br>330<br>330<br>334 | SP.FS<br>SP.FS<br>SP.FS<br>SP.FS | efscmpeq<br>efscmpgt<br>efscmplt<br>efsctsf | Floating-Point Single-Precision Compare Equal<br>Floating-Point Single-Precision Compare Greater Than<br>Floating-Point Single-Precision Compare Less Than<br>Convert Floating-Point Single-Precision to Signed Frac- | | EVX | 4 | 725 | | | 333 | SP.FS | efsctsi | tion Convert Floating-Point Single-Precision to Signed Inte- | | EVX | 4 | 730 | | | 334 | SP.FS | efsctsiz | ger Convert Floating-Point Single-Precision to Signed Integer with Round toward Zero | | EVX | 4 | 726 | | | 334 | SP.FS | efsctuf | Convert Floating-Point Single-Precision to Unsigned Fraction | | EVX | 4 | 724 | | | 333 | SP.FS | efsctui | Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 4 | 728 | | | 334 | SP.FS | efsctuiz | Convert Floating-Point Single-Precision to Unsigned Integer with Round toward Zero | | EVX<br>EVX | 4<br>4<br>4 | 713<br>712<br>709 | | | 329<br>329<br>328 | SP.FS<br>SP.FS<br>SP.FS | efsdiv<br>efsmul<br>efsnabs | Floating-Point Single-Precision Divide Floating-Point Single-Precision Multiply Floating-Point Single-Precision Negative Absolute Value | | EVX<br>EVX<br>EVX<br>EVX<br>X | 4<br>4<br>4<br>4<br>31<br>31 | 710<br>705<br>734<br>732<br>733<br>854<br>284 | SR | | 328<br>329<br>332<br>331<br>332<br>448<br>78 | SP.FS<br>SP.FS<br>SP.FS<br>SP.FS<br>SP.FS<br>B | efsneg<br>efssub<br>efststeq<br>efststgt<br>efststlt<br>eieio | Floating-Point Single-Precision Negate Floating-Point Single-Precision Subtract Floating-Point Single-Precision Test Equal Floating-Point Single-Precision Test Greater Than Floating-Point Single-Precision Test Less Than Enforce In-order Execution of I/O Equivalent | | EVX<br>EVX<br>EVX | 4<br>4<br>4 | 520<br>514<br>1225 | SK | | 268<br>268<br>268 | SP<br>SP<br>SP | eqv[.]<br>evabs<br>evaddiw<br>evaddsmiaaw | Vector Absolute Value Vector Add Immediate Word Vector Add Signed, Modulo, Integer to Accumulator Word | | EVX | 4 | 1217 | | | 269 | SP | evaddssiaaw | Vector Add Signed, Saturate, Integer to Accumulator Word | | EVX | 4 | 1224 | | | 269 | SP | evaddumiaaw | Vector Add Unsigned, Modulo, Integer to Accumulator Word | | EVX | 4 | 1216 | | | 269 | SP | evaddusiaaw | Vector Add Unsigned, Saturate, Integer to Accumulator Word | | EVX<br>EVX<br>EVX<br>EVX<br>EVX<br>EVX<br>EVX<br>EVX<br>EVX<br>EVX | 4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4<br>4 | 512<br>529<br>530<br>564<br>561<br>560<br>563<br>562<br>525<br>1222<br>1223<br>537<br>522<br>523<br>644<br>640<br>659 | | | 269<br>270<br>270<br>270<br>271<br>271<br>271<br>271<br>272<br>272<br>272<br>273<br>273<br>273<br>273<br>273<br>273<br>273 | SP<br>SP<br>SP<br>SP<br>SP<br>SP<br>SP<br>SP<br>SP<br>SP<br>SP<br>SP<br>SP<br>S | evaddw evand evandc evandc evcmpeq evcmpgts evcmpltu evcmpltu evcntlsw evcntlzw evdivwu eveqv evextsb evextsb evfsabs evfsadd evfscfsi | Vector Add Word Vector AND Vector AND with Complement Vector Compare Equal Vector Compare Greater Than Signed Vector Compare Greater Than Unsigned Vector Compare Less Than Signed Vector Compare Less Than Unsigned Vector Compare Less Than Unsigned Vector Count Leading Signed Bits Word Vector Count Leading Signed Bits Word Vector Divide Word Signed Vector Divide Word Unsigned Vector Divide Word Unsigned Vector Equivalent Vector Extend Sign Byte Vector Extend Sign Halfword Vector Floating-Point Single-Precision Absolute Value Vector Floating-Point Single-Precision from Signed Fraction Vector Convert Floating-Point Single-Precision from | | | · | 30, | | | J_ 1 | J V | | Signed Integer | | _ | Ор | code | o۲. | F | | | | | |------|---------|------------|---------------------------|------|------------|------------------|-----------------------------|---------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 658 | | | 324 | SP.FV | evfscfuf | Vector Convert Floating-Point Single-Precision from<br>Unsigned Fraction | | EVX | 4 | 656 | | | 324 | SP.FV | evfscfui | Vector Convert Floating-Point Single-Precision from<br>Unsigned Integer | | EVX | 4 | 654<br>652 | | | 322<br>321 | SP.FV<br>SP.FV | evfscmpeq<br>evfscmpgt | Vector Floating-Point Single-Precision Compare Equal Vector Floating-Point Single-Precision Compare Greater Than | | EVX | 4 | 653 | | | 321 | SP.FV | evfscmplt | Vector Floating-Point Single-Precision Compare Less Than | | EVX | 4 | 663 | | | 326 | SP.FV | evfsctsf | Vector Convert Floating-Point Single-Precision to Signed Fraction | | EVX | 4 | 661 | | | 325 | SP.FV | evfsctsi | Vector Convert Floating-Point Single-Precision to Signed Integer | | EVX | 4 | 666 | | | 325 | SP.FV | evfsctsiz | Vector Convert Floating-Point Single-Precision to<br>Signed Integer with Round toward Zero | | EVX | 4 | 662 | | | 326 | SP.FV | evfsctuf | Vector Convert Floating-Point Single-Precision to Unsigned Fraction | | EVX | 4 | 660 | | | 325 | SP.FV | evfsctui | Vector Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 4 | 664 | | | 325 | SP.FV | evfsctuiz | Vector Convert Floating-Point Single-Precision to Unsigned Integer with Round toward Zero | | EVX | 4 | 649 | | | 320 | SP.FV | evfsdiv | Vector Floating-Point Single-Precision Divide | | EVX | 4 | 648<br>645 | | | 320<br>319 | SP.FV<br>SP.FV | evfsmul<br>evfsnabs | Vector Floating-Point Single-Precision Multiply Vector Floating-Point Single-Precision Negative Absolute Value | | EVX | 4 | 646 | | | 319 | SP.FV | evfsneg | Vector Floating-Point Single-Precision Negate | | EVX | 4 | 641<br>670 | | | 320<br>323 | SP.FV<br>SP.FV | evfssub<br>evfststeg | Vector Floating-Point Single-Precision Subtract<br>Vector Floating-Point Single-Precision Test Equal | | EVX | 4 | 668 | | | 322 | SP.FV | evfststgt | Vector Floating-Point Single-Precision Test Greater Than | | EVX | 4 | 669 | | | 323 | SP.FV | evfststlt | Vector Floating-Point Single-Precision Test Less Than | | EVX | 4<br>31 | 769<br>285 | | Р | 274<br>636 | SP<br>E.PD | evldd<br>evlddepx | Vector Load Double Word into Double Word Vector Load Doubleword into Doubleword by External | | EVX | 4 | 768 | | | 274 | SP | evlddx | Process ID Indexed Vector Load Double Word into Double Word Indexed | | EVX | 4 | 773 | | | 274 | SP | evldh | Vector Load Double into Four Halfwords | | EVX | 4 | 772 | | | 274 | SP | evldhx | Vector Load Double into Four Halfwords Indexed | | EVX | 4 | 771 | | | 275 | SP | evldw | Vector Load Double into Two Words | | EVX | 4 | 770 | | | 275 | SP | evldwx | Vector Load Double into Two Words Indexed | | EVX | 4 | 777<br>776 | | | 275<br>275 | SP<br>SP | evlhhesplat<br>evlhhesplatx | Vector Load Halfword into Halfwords Even and Splat<br>Vector Load Halfword into Halfwords Even and Splat<br>Indexed | | EVX | 4 | 783 | | | 276 | SP | evlhhossplat | Vector Load Halfword into Halfword Odd Signed and Splat | | EVX | 4 | 782 | | | 276 | SP | evlhhossplatx | Vector Load Halfword into Halfword Odd Signed and Splat Indexed | | EVX | 4 | 781 | | | 276 | SP | evlhhousplat | Vector Load Halfword into Halfword Odd Unsigned and Splat | | EVX | 4 | 780 | | | 276 | SP | evlhhousplatx | Vector Load Halfword into Halfword Odd Unsigned and Splat Indexed | | EVX | 4 | 785 | | | 277 | SP | evlwhe | Vector Load Word into Two Halfwords Even | | EVX | 4 | 784 | | | 277 | SP | evlwhex | Vector Load Word into Two Halfwords Even Indexed | | EVX | 4 | 791 | | | 277 | SP | evlwhos | Vector Load Word into Two Halfwords Odd Signed (with sign extension) | | EVX | 4 | 790 | | | 277 | SP | evlwhosx | Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) | | EVX | 4 | 789 | | | 278 | SP | evlwhou | Vector Load Word into Two Halfwords Odd Unsigned (zero-extended) | | EVX | 4 | 788 | | | 278 | SP | evlwhoux | Vector Load Word into Two Halfwords Odd Unsigned Indexed (zero-extended) | | ٤ | Op | code | .₁e | ۲, | | - 1 | | | |------|-----|--------------|--------------|------|------------|------------------|---------------------------|-----------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. | Pri∖ | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 797<br>796 | | | 278<br>278 | SP<br>SP | evlwhsplat<br>evlwhsplatx | Vector Load Word into Two Halfwords and Splat<br>Vector Load Word into Two Halfwords and Splat<br>Indexed | | EVX | 4 | 793 | | | 279 | SP | evlwwsplat | Vector Load Word into Word and Splat | | EVX | 4 | 792 | | | 279 | SP | evlwwsplatx | Vector Load Word into Word and Splat Indexed | | EVX | 4 | 556 | | | 279 | SP | evmergehi | Vector Merge High | | EVX | 4 | 558<br>557 | | | 280<br>279 | SP<br>SP | evmergehilo<br>evmergelo | Vector Merge High/Low<br>Vector Merge Low | | EVX | 4 | 559 | | | 280 | SP | evmergelohi | Vector Merge Low/High | | EVX | 4 | 1323 | | | 280 | SP | evmhegsmfaa | Vector Multiply Halfwords, Even, Guarded, Signed, | | | | | | | | | | Modulo, Fractional and Accumulate | | EVX | 4 | 1451 | | | 280 | SP | evmhegsmfan | Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, Fractional and Accumulate Negative | | EVX | 4 | 1321 | | | 281 | SP | evmhegsmiaa | Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, Integer and Accumulate | | EVX | 4 | 1449 | | | 281 | SP | evmhegsmian | Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1320 | | | 281 | SP | evmhegumiaa | Vector Multiply Halfwords, Even, Guarded, Unsigned, Modulo, Integer and Accumulate | | EVX | 4 | 1448 | | | 281 | SP | evmhegumian | Vector Multiply Halfwords, Even, Guarded, Unsigned, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1035 | | | 282 | SP | evmhesmf | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | EVX | 4 | 1067 | | | 282 | SP | evmhesmfa | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional to Accumulator | | EVX | 4 | 1291 | | | 282 | SP | evmhesmfaaw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional and Accumulate into Words | | EVX | 4 | 1419 | | | 282 | SP | evmhesmfanw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX | 4 | 1033<br>1065 | | | 283 | SP<br>SP | evmhesmi<br>evmhesmia | Vector Multiply Halfwords, Even, Signed, Modulo, Integer Vector Multiply Halfwords, Even, Signed, Modulo, Inte- | | EVX | 4 | 1289 | | | 283 | SP | evmhesmiaaw | ger to Accumulator Vector Multiply Halfwords, Even, Signed, Modulo, Inte- | | | | | | | | | | ger and Accumulate into Words | | EVX | 4 | 1417 | | | 283 | SP | evmhesmianw | Vector Multiply Halfwords, Even, Signed, Modulo, Integer and Accumulate Negative into Words | | EVX | 4 | 1027 | | | 284 | SP | evmhessf | Vector Multiply Halfwords, Even, Signed, Saturate, Fractional | | EVX | 4 | 1059 | | | 284 | SP | evmhessfa | Vector Multiply Halfwords, Even, Signed, Saturate, Fractional to Accumulator | | EVX | 4 | 1283 | | | 285 | SP | evmhessfaaw | Vector Multiply Halfwords, Even, Signed, Saturate, Fractional and Accumulate into Words | | EVX | 4 | 1411 | | | 285 | SP | evmhessfanw | Vector Multiply Halfwords, Even, Signed, Saturate,<br>Fractional and Accumulate Negative into Words | | EVX | 4 | 1281 | | | 286 | SP | evmhessiaaw | Vector Multiply Halfwords, Even, Signed, Saturate, Integer and Accumulate into Words | | EVX | 4 | 1409 | | | 286 | SP | evmhessianw | Vector Multiply Halfwords, Even, Signed, Saturate, Integer and Accumulate Negative into Words | | EVX | 4 | 1032 | | | 287 | SP | evmheumi | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer | | EVX | 4 | 1064 | | | 287 | SP | evmheumia | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer to Accumulator | | EVX | 4 | 1288 | | | 287 | SP | evmheumiaaw | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer and Accumulate into Words | | EVX | 4 | 1416 | | | 287 | SP | evmheumianw | Vector Multiply Halfwords, Even, Unsigned, Modulo,<br>Integer and Accumulate Negative into Words | | EVX | 4 | 1280 | | | 288 | SP | evmheusiaaw | Vector Multiply Halfwords, Even, Unsigned, Saturate,<br>Integer and Accumulate into Words | | ٤ | Ор | code | e | ر ا | | - 1 | | | |------|-----|--------------|--------------|-----|------------|------------------|-----------------------|----------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 1408 | | | 288 | SP | evmheusianw | Vector Multiply Halfwords, Even, Unsigned, Saturate, | | EVX | 4 | 1327 | | | 289 | SP | evmhogsmfaa | Integer and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Guarded, Signed, Mod- ulo, Fractional and Accumulate | | EVX | 4 | 1455 | | | 289 | SP | evmhogsmfan | Vector Multiply Halfwords, Odd, Guarded, Signed, Mod- | | EVX | 4 | 1325 | | | 289 | SP | evmhogsmiaa | ulo, Fractional and Accumulate Negative Vector Multiply Halfwords, Odd, Guarded, Signed, Mod- ulo, Integer and Accumulate | | EVX | 4 | 1453 | | | 289 | SP | evmhogsmian | Vector Multiply Halfwords, Odd, Guarded, Signed, Mod- | | EVX | 4 | 1324 | | | 290 | SP | evmhogumiaa | ulo, Integer and Accumulate Negative Vector Multiply Halfwords, Odd, Guarded, Unsigned, Modulo, Integer and Accumulate | | EVX | 4 | 1452 | | | 290 | SP | evmhogumian | Vector Multiply Halfwords, Odd, Guarded, Unsigned, | | EVX | 4 | 1039 | | | 290 | SP | evmhosmf | Modulo, Integer and Accumulate Negative<br>Vector Multiply Halfwords, Odd, Signed, Modulo, Frac-<br>tional | | EVX | 4 | 1071 | | | 290 | SP | evmhosmfa | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional to Accumulator | | EVX | 4 | 1295 | | | 291 | SP | evmhosmfaaw | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate into Words | | EVX | 4 | 1423 | | | 291 | SP | evmhosmfanw | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX | 4 | 1037 | | | 291 | SP | evmhosmi | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer | | EVX | 4 | 1069 | | | 291 | SP | evmhosmia | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer to Accumulator | | EVX | 4 | 1293 | | | 292 | SP | evmhosmiaaw | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer and Accumulate into Words | | EVX | 4 | 1421 | | | 291 | SP | evmhosmianw | Vector Multiply Halfwords, Odd, Signed, Modulo, Inte- | | EVX | 4 | 1031 | | | 293 | SP | evmhossf | ger and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional | | EVX | 4 | 1063 | | | 293 | SP | evmhossfa | Vector Multiply Halfwords, Odd, Signed, Saturate, Frac- | | EVX | 4 | 1287 | | | 294 | SP | evmhossfaaw | tional to Accumulator Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate into Words | | EVX | 4 | 1415 | | | 294 | SP | evmhossfanw | Vector Multiply Halfwords, Odd, Signed, Saturate, Frac- | | EVX | 4 | 1285 | | | 295 | SP | evmhossiaaw | tional and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Signed, Saturate, Integer and Accumulate into Words | | EVX | 4 | 1413 | | | 295 | SP | evmhossianw | Vector Multiply Halfwords, Odd, Signed, Saturate, Inte- | | EVX | 4 | 1036 | | | 295 | SP | evmhoumi | ger and Accumulate Negative into Words<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo,<br>Integer | | EVX | 4 | 1068 | | | 295 | SP | evmhoumia | Vector Multiply Halfwords, Odd, Unsigned, Modulo, | | EVX | 4 | 1292 | | | 296 | SP | evmhoumiaaw | Integer to Accumulator Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer and Accumulate into Words | | EVX | 4 | 1420 | | | 292 | SP | evmhoumianw | Vector Multiply Halfwords, Odd, Unsigned, Modulo, | | EVX | 4 | 1284 | | | 296 | SP | evmhousiaaw | Integer and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer and Accumulate into Words | | EVX | 4 | 1412 | | | 296 | SP | evmhousianw | Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer and Accumulate Negative into Words | | EVX | 4 | 1220 | | | 297 | SP | evmra | Initialize Accumulator | | EVX | 4 | 1103<br>1135 | | | 297<br>297 | SP<br>SP | evmwhsmf<br>evmwhsmfa | Vector Multiply Word High Signed, Modulo, Fractional Vector Multiply Word High Signed, Modulo, Fractional to | | | | | | | | | | Accumulator | | EVX | 4 | 1101 | | | 297 | SP | evmwhsmi | Vector Multiply Word High Signed, Modulo, Integer | | EVX 4 1133 S 297 SP evmwhsmia Vector Multiply Word High Signed, Modulo, Integer Accumulator EVX 4 1109 298 SP evmwhssfa vector Multiply Word High Signed, Saturate, Frac Vector Multiply Word High Unsigned, Modulo, Integer Accumulator EVX 4 1100 298 SP evmwhumia vector Multiply Word High Unsigned, Modulo, Integer Accumulator EVX 4 1132 298 SP evmwhumia vector Multiply Word High Unsigned, Modulo, Integer Accumulator EVX 4 1353 299 SP evmwlsmiaaw vector Multiply Word Low Signed, Modulo, Integer Accumulate Negative in Words EVX 4 1481 299 SP evmwlssiaaw vector Multiply Word Low Signed, Modulo, Integer Accumulate Negative in Words vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1352 300 SP evmwlumiaw vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1344 301 SP evmwlumiaw vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative | _ | Ор | code | <u>o</u> ⊤. | | | | | | |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|-----|------|-------------|------|------|------------------|-------------|---------------------------------------------------------------------------------------------------------------| | EVX 4 1133 Perminsmia Vector Multiply Word High Signed, Modulo, Integer Accumulator EVX 4 1127 298 SP evmwhsfa vector Multiply Word High Signed, Saturate, Frac Locumulator EVX 4 1100 298 SP evmwhumia vector Multiply Word High Unsigned, Modulo, Integer Accumulator EVX 4 11353 299 SP evmwhamiaaw vector Multiply Word High Unsigned, Modulo, Integer Accumulate EVX 4 1345 299 SP evmwlssiaaw vector Multiply Word Low Signed, Modulo, Integer Accumulate into Words vector Multiply Word Low Signed, Modulo, Integer Accumulate into Words vector Multiply Word Low Signed, Saturate, Integer Accumulate Negative in Words vector Multiply Word Low Signed, Saturate, Integer Accumulate Negative in Words vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words vector Multiply Word Low Unsigned, Modulo, Integer Accumulate into Words vector Multiply Word Low Unsigned, Modulo, Integer Accumulate into Words vector Multiply Word Low Unsigned, Modulo, Integer Accumulate into Words vector Multiply Word Low Unsigned, Modulo, Integer Accumulate into Words vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words | Forn | Pri | Ext | Mod<br>Dep | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX 4 1095 298 SP evmwhssfa evmwhsfa Vector Multiply Word High Signed, Saturate, Fractor Accumulator EVX 4 1100 298 SP evmwhsfa Vector Multiply Word High Unsigned, Modulo, Integer Accumulator EVX 4 11353 299 SP evmwlsmiaaw Vector Multiply Word High Unsigned, Modulo, Integer Accumulate into Words EVX 4 1345 299 SP evmwlsmiaaw Vector Multiply Word Low Signed, Modulo, Integer Accumulate into Words EVX 4 1345 299 SP evmwlssiaaw Vector Multiply Word Low Signed, Saturate, Integer Accumulate Negative in Words EVX 4 1473 299 SP evmwlsiaaw Vector Multiply Word Low Signed, Saturate, Integer Accumulate Negative in Words EVX 4 1728 300 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1344 301 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 13472 301 SP evmwsmfa Vector Multiply Word Low Unsigned, Modulo, Integer Negative in Words Ve | EVX | 4 | 1133 | | | 297 | SP | evmwhsmia | Vector Multiply Word High Signed, Modulo, Integer to | | EVX 4 1102 298 SP evmwhumi Vector Multiply Word High Unsigned, Modulo, Integer Accumulate EVX 4 1353 299 SP evmwlsmiaaw Vector Multiply Word Low Signed, Modulo, Integer Accumulate EVX 4 1481 299 SP evmwlsniaaw Vector Multiply Word Low Signed, Modulo, Integer Accumulate in Words EVX 4 1481 299 SP evmwlsiaaw Vector Multiply Word Low Signed, Modulo, Integer Accumulate in Words EVX 4 1473 299 SP evmwlsiaaw Vector Multiply Word Low Signed, Saturate, Integer Accumulate Negative in Words EVX 4 1096 300 SP evmwlumi Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1352 300 SP evmwlumiaww Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate in Words EVX 4 1480 301 SP evmwlumiaww Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1472 301 SP evmwsmfa <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> <td>Vector Multiply Word High Signed, Saturate, Fractional Vector Multiply Word High Signed, Saturate, Fractional</td> | | | | | | | | | Vector Multiply Word High Signed, Saturate, Fractional Vector Multiply Word High Signed, Saturate, Fractional | | EVX 4 1353 299 SP evmwlsmiaaw Vector Multiply Word Low Signed, Modulo, Integer Accumulate into Words EVX 4 1345 299 SP evmwlsmianw Vector Multiply Word Low Signed, Modulo, Integer Accumulate Negative in Words EVX 4 1473 299 SP evmwlssiaaw Vector Multiply Word Low Signed, Saturate, Integer Accumulate Negative in Words EVX 4 1096 300 SP evmwlumi Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1128 300 SP evmwlumia Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1352 300 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1480 300 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate into Words EVX 4 1472 301 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate into Words Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate into Words <t< td=""><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td>Vector Multiply Word High Unsigned, Modulo, Integer<br/>Vector Multiply Word High Unsigned, Modulo, Integer to</td></t<> | | | | | | | | | Vector Multiply Word High Unsigned, Modulo, Integer<br>Vector Multiply Word High Unsigned, Modulo, Integer to | | EVX 4 1481 299 SP evmwlsmianw Vector Multiply Word Low Signed, Modulo, Integer Accumulate Negative in Words EVX 4 1473 299 SP evmwlssiaaw Vector Multiply Word Low Signed, Saturate, Intege Accumulate into Words EVX 4 1096 300 SP evmwlumi Vector Multiply Word Low Signed, Saturate, Intege Accumulate Negative in Words EVX 4 1128 300 SP evmwlumia Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1352 300 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1480 300 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate into Words EVX 4 1472 301 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1175 301 SP evmwsmf Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1173 302< | EVX | 4 | 1353 | | | 299 | SP | evmwlsmiaaw | Vector Multiply Word Low Signed, Modulo, Integer and | | EVX 4 1345 299 SP evmwlssiaaw Vector Multiply Word Low Signed, Saturate, Integer Accumulate into Words Yector Multiply Word Low Signed, Saturate, Integer Accumulate Negative in Words EVX 4 1096 300 SP evmwlumi Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words EVX 4 1128 300 SP evmwlumia Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words EVX 4 1480 300 SP evmwlumianw Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words EVX 4 1480 301 SP evmwlumianw Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words EVX 4 1447 301 SP evmwlusiaaw Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words EVX 4 1472 301 SP evmwsmf Vector Multiply Word Low Unsigned, Saturate, Int and Accumulate into Words EVX 4 1115 301 SP evmwsmf Vector Multiply Word Signed, Modulo, Fractional Accumulate Negative EVX | EVX | 4 | 1481 | | | 299 | SP | evmwlsmianw | Vector Multiply Word Low Signed, Modulo, Integer and | | EVX 4 1473 299 SP evmwlssianw Vector Multiply Word Low Signed, Saturate, Integer Accumulate Negative in Words EVX 4 1096 300 SP evmwlumi Vector Multiply Word Low Unsigned, Modulo, Integer Accumulator EVX 4 1352 300 SP evmwlumiaw Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words EVX 4 1480 301 SP evmwlusiaaw Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words EVX 4 1344 301 SP evmwlusiaaw Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words EVX 4 1472 301 SP evmwlusiaaw Vector Multiply Word Low Unsigned, Saturate, Int and Accumulate into Words EVX 4 1115 301 SP evmwsmfa Vector Multiply Word Low Unsigned, Saturate, Int and Accumulate into Words EVX 4 1117 302 SP evmwsmfa Vector Multiply Word Signed, Modulo, Fractional Accumulate Vector Multiply Word Signed, Modulo, Fractional Accumulate Vector Multiply Word Signed, Modulo, Integer and Accumulate Vector Multiply Word Si | EVX | 4 | 1345 | | | 299 | SP | evmwlssiaaw | Vector Multiply Word Low Signed, Saturate, Integer and | | EVX 4 1096 300 SP evmwlumia Vector Multiply Word Low Unsigned, Modulo, Interest Accumulator EVX 4 1352 300 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Interest Accumulate into Words EVX 4 1480 300 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Interest and Accumulate into Words EVX 4 1344 301 SP evmwlusiaaw Vector Multiply Word Low Unsigned, Modulo, Interest and Accumulate into Words EVX 4 1472 301 SP evmwlusiaaw Vector Multiply Word Low Unsigned, Modulo, Interest and Accumulate into Words EVX 4 1147 301 SP evmwlusiaaw Vector Multiply Word Low Unsigned, Saturate, Interest and Accumulate Negative in Words EVX 4 1115 301 SP evmwsmfa Vector Multiply Word Signed, Modulo, Fractional stance Accumulate Negative in Words EVX 4 1371 302 SP evmwsmfa Vector Multiply Word Signed, Modulo, Fractional stance Accumulate Negative in Words Vector Multiply Word Signed, Modulo, Integer of Accumulate Negative in Words <td>EVX</td> <td>4</td> <td>1473</td> <td></td> <td></td> <td>299</td> <td>SP</td> <td>evmwlssianw</td> <td>Vector Multiply Word Low Signed, Saturate, Integer and</td> | EVX | 4 | 1473 | | | 299 | SP | evmwlssianw | Vector Multiply Word Low Signed, Saturate, Integer and | | EVX 4 1480 300 SP evmwlumiaaw Vector Multiply Word Low Unsigned, Modulo, Integer Accumulate Negative in Words Spreams (Synthyside) Spre | | | | | | | | | Vector Multiply Word Low Unsigned, Modulo, Integer Vector Multiply Word Low Unsigned, Modulo, Integer to | | EVX 4 1480 300 SP evmwlumianw Vector Multiply Word Low Unsigned, Modulo, Integrand Accumulate Negative in Words EVX 4 1472 301 SP evmwlusiaaw Vector Multiply Word Low Unsigned, Saturate, Intand Accumulate Negative in Words EVX 4 1115 301 SP evmwsmf Vector Multiply Word Low Unsigned, Saturate, Intand Accumulate Negative in Words EVX 4 1147 301 SP evmwsmf Vector Multiply Word Signed, Modulo, Fractional Accumulate EVX 4 1371 302 SP evmwsmfa Vector Multiply Word Signed, Modulo, Fractional Accumulate EVX 4 1499 302 SP evmwsmia Vector Multiply Word Signed, Modulo, Integer Overor Multiply Word Signed, Modulo, Integer to Accumulate Negative EVX 4 1145 302 SP evmwsmiaa Vector Multiply Word Signed, Modulo, Integer to Accumulate Negative EVX 4 1369 302 SP evmwsmiaa Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative EVX 4 1497 303 SP evmwssfa Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative <td>EVX</td> <td>4</td> <td>1352</td> <td></td> <td></td> <td>300</td> <td>SP</td> <td>evmwlumiaaw</td> <td>Vector Multiply Word Low Unsigned, Modulo, Integer</td> | EVX | 4 | 1352 | | | 300 | SP | evmwlumiaaw | Vector Multiply Word Low Unsigned, Modulo, Integer | | EVX 4 1344 301 SP evmwlusiaaw Vector Multiply Word Low Unsigned, Saturate, Int and Accumulate into Words EVX 4 1115 301 SP evmwsmf Vector Multiply Word Low Unsigned, Saturate, Int and Accumulate Negative in Words EVX 4 1147 301 SP evmwsmf Vector Multiply Word Signed, Modulo, Fractional Vector Multiply Word Signed, Modulo, Fractional Vector Multiply Word Signed, Modulo, Fractional Accumulate Negative EVX 4 1371 302 SP evmwsmfa Vector Multiply Word Signed, Modulo, Fractional Accumulate EVX 4 1499 302 SP evmwsmfa Vector Multiply Word Signed, Modulo, Fractional Accumulate Negative EVX 4 1113 302 SP evmwsmia Vector Multiply Word Signed, Modulo, Integer evmwsmia Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative EVX 4 1369 302 SP evmwsmia Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative EVX 4 1497 302 SP evmwsmia Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative EVX 4 1107 303 SP evmwssfa Vector Multiply Word Signed, Saturate, Fractional Accumulate EVX 4 1139 303 SP evmwssfa Vector Multiply Word Signed, Saturate, Fractional Accumulate EVX 4 1491 304 SP evmwssfa Vector Multiply Word Signed, Saturate, Fractional Accumulate EVX 4 1491 304 SP evmwssfa Vector Multiply Word Signed, Saturate, Fractional Accumulate EVX 4 1491 304 SP evmwssfa Vector Multiply Word Signed, Saturate, Fractional Accumulate EVX 4 1491 304 SP evmwssfa Vector Multiply Word Signed, Saturate, Fractional Accumulate EVX 4 1491 304 SP evmwumia Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Negative EVX 4 1368 305 SP evmwumia Vector Multiply Word Unsigned, Modulo, Integer and Accumulate EVX 4 1496 305 SP evmwumia Vector Multiply Word Unsigned, Modulo, Integer and Accumulate EVX 4 1496 305 SP evmwumia Vector Multiply Word Unsigned, Modulo, Integer and Accumulate EVX 4 1496 305 SP evmwumia Vector Multiply Word Unsigned, Modulo, Integer and Accumulate EVX 4 1496 305 SP evmwumia Vector Multiply Word Unsigned, Modulo, Integer and Accumulate EVX 4 1496 305 SP evmwumia Ve | EVX | 4 | 1480 | | | 300 | SP | evmwlumianw | Vector Multiply Word Low Unsigned, Modulo, Integer | | EVX 4 1115 EVX 4 1116 EVX 4 1117 EVX 4 1117 EVX 4 1117 EVX 4 1117 EVX 4 1117 EVX 4 1118 EVX 4 1118 EVX 4 1371 EVX 4 1119 BVX 4 1119 BVX 4 1111 1112 BVX 4 1112 BVX 4 1112 BVX 4 1114 BVX 4 1114 BVX 4 1114 BVX 4 1111 | EVX | 4 | 1344 | | | 301 | SP | evmwlusiaaw | Vector Multiply Word Low Unsigned, Saturate, Integer | | EVX 4 1115 EVX 4 1147 EVX 4 1147 EVX 4 1147 EVX 4 1147 EVX 4 1371 SP evmwsmfa EVX 4 1371 SP evmwsmfa EVX 4 1499 SP evmwsmfa EVX 4 1499 SP evmwsmfa EVX 4 1113 SO2 SP evmwsmfa EVX 4 1113 SO2 SP evmwsmfa EVX 4 11145 SO2 SP evmwsmfa EVX 4 1115 EVX 4 1116 SO2 SP evmwsmfa EVX 4 1117 EVX 4 1118 SO2 SP evmwsmia EVX 4 1119 SO2 SP evmwsmia EVX 4 1119 SO2 SP evmwsmia EVX 4 1145 SO2 SP evmwsmia EVX 4 1145 SO2 SP evmwsmia EVX 4 1369 EVX 4 1369 SP evmwsmia EVX 4 1107 EVX 4 1107 EVX 4 1107 EVX 4 1107 EVX 4 1139 SO3 SP evmwsmfa EVX 4 1139 SO3 SP evmwsmfa EVX 4 1149 SO3 SP evmwsmfa EVX 4 11491 SO3 SP evmwsmfa EVX 4 1363 SP evmwsmfa EVX 4 1491 SO3 SP evmwsmfa EVX 4 1491 SO4 SP evmwsmfa EVX 4 1491 SO5 SP evmwsmfa EVX 4 1491 SO5 SP evmwsmfa EVX 4 1491 SO6 SP evmwsmfa EVX 4 1491 SO6 SP evmwsmfa EVX 4 1496 EVX 4 1496 SO6 SP evmwmia EVX 4 1496 SO7 SP evmwmia EVX 4 1496 SO7 SP evmwmia EVX 4 1496 SO7 SP evmwmia EVX 4 1496 SO8 EV | EVX | 4 | 1472 | | | 301 | SP | evmwlusianw | Vector Multiply Word Low Unsigned, Saturate, Integer | | EVX 4 1499 302 SP evmwsmfaa Vector Multiply Word Signed, Modulo, Fractional at Accumulate Vector Multiply Word Signed, Modulo, Fractional at Accumulate Negative Vector Multiply Word Signed, Modulo, Integer Vector Multiply Word Signed, Modulo, Integer Vector Multiply Word Signed, Modulo, Integer Vector Multiply Word Signed, Modulo, Integer Negative Vector Multiply Word Signed, Modulo, Integer and Accumulate Vector Multiply Word Signed, Modulo, Integer and Accumulate Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative Vector Multiply Word Signed, Saturate, Fractional Accumulator Vector Multiply Word Signed, Saturate, Fractional Accumulator Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Unsigned, Modulo, Integer Service Accumulate Vector Multiply Word Unsigned, Modulo, Integer Service Accumulate Vector Multiply Word Unsigned, Modulo, Integer Service Accumulate Vector Multiply Word Unsigned, Modulo, Integer Service Accumulate Vector Multiply Word Unsigned, Modulo, Integer Service Mo | | | | | | | _ | | Vector Multiply Word Signed, Modulo, Fractional Vector Multiply Word Signed, Modulo, Fractional to | | EVX 4 1113 302 SP evmwsmfan Vector Multiply Word Signed, Modulo, Fractional and Accumulate Negative Vector Multiply Word Signed, Modulo, Integer Vector Multiply Word Signed, Modulo, Integer to Accumulate Negative Vector Multiply Word Signed, Modulo, Integer to Accumulate Negative Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative Vector Multiply Word Signed, Saturate, Fractional Accumulator Vector Multiply Word Signed, Saturate, Fractional Accumulator Vector Multiply Word Signed, Saturate, Fractional Accumulator Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate Negative Vector Multiply Word Unsigned, Modulo, In | EVX | 4 | 1371 | | | 302 | SP | evmwsmfaa | Vector Multiply Word Signed, Modulo, Fractional and | | EVX 4 1145 302 SP evmwsmia Vector Multiply Word Signed, Modulo, Integer Vector Multiply Word Signed, Modulo, Integer of Accumulate Negative EVX 4 1369 302 SP evmwsmia Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative EVX 4 1497 302 SP evmwsmian Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative EVX 4 1107 303 SP evmwssfa Vector Multiply Word Signed, Saturate, Fractional Accumulator EVX 4 1139 303 SP evmwssfa Vector Multiply Word Signed, Saturate, Fractional Accumulator EVX 4 1363 303 SP evmwssfaa Vector Multiply Word Signed, Saturate, Fractional Accumulate EVX 4 1491 304 SP evmwssfan Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative EVX 4 1112 304 SP evmwumi Vector Multiply Word Unsigned, Modulo, Integer Accumulator EVX 4 1368 305 SP evmwumian Vector Multiply Word Unsigned, Modulo, Integer Accumulate EVX 4 1496 305 SP evmwumian Vector Multiply Word Unsigned, Modulo, Integer Accumulate EVX 4 542 305 SP evmwumian Vector Multiply Word Unsigned, Modulo, Integer Accumulate Negative Vector NAND | EVX | 4 | 1499 | | | 302 | SP | evmwsmfan | Vector Multiply Word Signed, Modulo, Fractional and | | EVX41369302SPevmwsmiaaVector Multiply Word Signed, Modulo, Integer and AccumulateEVX41497302SPevmwsmianVector Multiply Word Signed, Modulo, Integer and Accumulate NegativeEVX41107303SPevmwssfVector Multiply Word Signed, Saturate, Fractional AccumulatorEVX41139303SPevmwssfaaVector Multiply Word Signed, Saturate, Fractional AccumulateEVX41491304SPevmwssfanVector Multiply Word Signed, Saturate, Fractional AccumulateEVX41112304SPevmwumiVector Multiply Word Signed, Saturate, Fractional Accumulate NegativeEVX41114304SPevmwumiaVector Multiply Word Unsigned, Modulo, Integer to AccumulatorEVX41368305SPevmwumiaaVector Multiply Word Unsigned, Modulo, Integer accumulateEVX41496305SPevmwumianVector Multiply Word Unsigned, Modulo, Integer accumulateEVX41496305SPevmwumianVector Multiply Word Unsigned, Modulo, Integer accumulateEVX41496305SPevmwumianVector Multiply Word Unsigned, Modulo, Integer accumulateEVX41496305SPevmwumianVector Multiply Word Unsigned, Modulo, Integer accumulateEVX41496305SPevmwumianVector Multiply Word Unsigned, Modulo, Integer accumulateEVX41496 <t< td=""><td></td><td></td><td></td><td></td><td></td><td></td><td>_</td><td></td><td>Vector Multiply Word Signed, Modulo, Integer Vector Multiply Word Signed, Modulo, Integer to Accu-</td></t<> | | | | | | | _ | | Vector Multiply Word Signed, Modulo, Integer Vector Multiply Word Signed, Modulo, Integer to Accu- | | EVX 4 1107 EVX 4 1107 EVX 4 1139 SP evmwssfa EVX 4 1363 SP evmwssfa EVX 4 1491 SP evmwssfa EVX 4 1491 SP evmwssfa EVX 4 1112 EVX 4 1112 EVX 4 1144 SP evmwumia EVX 4 1368 SP evmwumia EVX 4 1368 EVX 4 1496 EVX 4 1496 EVX 4 1496 EVX 4 1496 EVX 4 1496 EVX 4 542 SP evmwsmian Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Signed, Saturate, Fractional Accumulate Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer of Accumulate | EVX | 4 | 1369 | | | 302 | SP | evmwsmiaa | Vector Multiply Word Signed, Modulo, Integer and | | EVX41107303SPevmwssf<br>evmwssfaVector Multiply Word Signed, Saturate, Fractional<br>Vector Multiply Word Signed, Saturate, Fractional<br>AccumulatorEVX41363303SPevmwssfaaVector Multiply Word Signed, Saturate, Fractional<br>AccumulateEVX41491304SPevmwssfanVector Multiply Word Signed, Saturate, Fractional<br>AccumulateEVX41112304SPevmwumiVector Multiply Word Unsigned, Modulo, Integer<br>Vector Multiply Word Unsigned, Modulo, Integer to<br> | EVX | 4 | 1497 | | | 302 | SP | evmwsmian | Vector Multiply Word Signed, Modulo, Integer and | | EVX 4 1491 304 SP evmwssfaa Vector Multiply Word Signed, Saturate, Fractional Accumulate EVX 4 1491 304 SP evmwssfan Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative EVX 4 1112 304 SP evmwumi evmwumia Vector Multiply Word Unsigned, Modulo, Integer Vector Multiply Word Unsigned, Modulo, Integer to Accumulate EVX 4 1368 305 SP evmwumian Vector Multiply Word Unsigned, Modulo, Integer and Accumulate EVX 4 1496 305 SP evmwumian Vector Multiply Word Unsigned, Modulo, Integer and Accumulate EVX 4 542 305 SP evnand Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Negative Vector NAND | | | | | | | | | Vector Multiply Word Signed, Saturate, Fractional Vector Multiply Word Signed, Saturate, Fractional to | | EVX 4 1491 304 SP evmwssfan Vector Multiply Word Signed, Saturate, Fractional Accumulate Negative Vector Multiply Word Unsigned, Modulo, Integer Vector Multiply Word Unsigned, Modulo, Integer Vector Multiply Word Unsigned, Modulo, Integer to Accumulator Accumulator Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Negative Vector NAND | EVX | 4 | 1363 | | | 303 | SP | evmwssfaa | Vector Multiply Word Signed, Saturate, Fractional and | | EVX41112304SP<br>spevmwumi<br>evmwumiaVector Multiply Word Unsigned, Modulo, Integer Vector Multiply Word Unsigned, Modulo, Integer Vector Multiply Word Unsigned, Modulo, Integer AccumulatorEVX41368305SPevmwumiaaVector Multiply Word Unsigned, Modulo, Integer AccumulateEVX41496305SPevmwumianVector Multiply Word Unsigned, Modulo, Integer AccumulateEVX4542305SPevnandVector Multiply Word Unsigned, Modulo, Integer AccumulateEVX4542305SPevnandVector NAND | EVX | 4 | 1491 | | | 304 | SP | evmwssfan | Vector Multiply Word Signed, Saturate, Fractional and | | EVX 4 1368 305 SP evmwumiaa Accumulator Vector Multiply Word Unsigned, Modulo, Integer a Accumulate Vector Multiply Word Unsigned, Modulo, Integer a Accumulate Vector Multiply Word Unsigned, Modulo, Integer a Accumulate Negative Vector NAND | | | | | | | | | Vector Multiply Word Unsigned, Modulo, Integer | | EVX 4 1368 305 SP evmwumiaa Vector Multiply Word Unsigned, Modulo, Integer a Accumulate Vector Multiply Word Unsigned, Modulo, Integer a Accumulate Negative Vector NAND | EVA | 4 | 1144 | | | 304 | 37 | eviniwuniia | | | EVX 4 1496 305 SP evmwumian Vector Multiply Word Unsigned, Modulo, Integer a Accumulate Negative Vector NAND | EVX | 4 | 1368 | | | 305 | SP | evmwumiaa | Vector Multiply Word Unsigned, Modulo, Integer and | | EVX 4 542 305 SP evnand Vector NAND | EVX | 4 | 1496 | | | 305 | SP | evmwumian | Vector Multiply Word Unsigned, Modulo, Integer and | | EVX 4 521 | | | | | | | | | Vector NAND | | | EVX | 4 | 521 | | | 305 | SP | evneg | Vector Negate | | EVX 4 536 305 SP evnor Vector NOR EVX 4 535 306 SP evor Vector OR | | | | | | | | | | | EVX 4 535 306 SP evor Vector OR EVX 4 539 306 SP evorc Vector OR with Complement | | | | | | | | | | | EVX | | E | Ор | code | ₽Ţ. | ر1 | | - 1 | | | |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---|------------------------------------------------|-----|------|------------|------|------|------------------|--------------|--------------------------------------------------------| | EVX 4 554 307 SP evrlwi Vector Rotate Left Word Immediate Vector Splet | | Form | Pri | Ext | Мос<br>Dep | Pri∖ | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX 4 524 590 307 SP evsel Vector Round Word Vector Select Vector Shift Left Word Right Word Immediate Signed Vector Shift Right Word Signed Vector Shift Right Word Signed Vector Shift Right Word Signed Vector Shift Right Word Unsigned Unsi | | | | | | | | | - | | | EVX 4 79 | | | - | | | | | | _ | | | EVX 4 548 Sol 308 SP evslw Vector Shift Left Word Immediate EVX 4 555 308 SP evslw Vector Shift Left Word Immediate Vector Splat Fractional Immediate Vector Splat Fractional Immediate Vector Splat Store Double Of Double Vector Store Double Of Double Vector Store Double Of Double Vector Store Double Of Double Vector Store Double Of Double Vector Store Double Of Doubleword Vector Store Double Of Doubleword Vector Store Double Of Doubleword Vector Store Double Of Two Halfwords Of Two Halfwords Vector Store Word Of Two Halfwords Halfwo | | | | | | | | | | | | EVX 4 555 308 SP evsluf Vector Shift Left Word Immediate Vector Shift Left Word Immediate Vector Shift Left Word Immediate Vector Shift Right Isinged Vector Shift Right Word Immediate | | | - | | | | | | | | | EVX 4 555 Service 508 SP evsplatifi Vector Splat Fractional Immediate Vector Splat Fractional Immediate Vector Splat Fractional Immediate Signed Signed Vector Splat Fractional Immediate Immediat | | | - | | | | | | | | | EVX 4 543 546 308 SP evsplati Vector Spital Immediate | | | | | | | | | | | | EVX 4 547 | | | | | | | | | | | | EVX 4 546 S44 S45 | | | | | | | | | | | | EVX 4 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 544 | | EVX | 4 | | | | | | | | | EVX 4 801 | | EVX | 4 | 545 | | | 309 | SP | evsrws | | | EVX 4 800 309 SP evstddx Vector Store Double of Double Indexed Vector Store Double of Four Halfwords Indexed Vector Store Double of Four Halfwords Two Words Vector Store Double of Two Words Vector Store Double of Two Words Vector Store Double of Two Words Vector Store Word of Two Halfwords from Even Vector Store Word of Two Halfwords from Even Vector Store Word of Two Halfwords from Odd Vector Store Word of Two Halfwords from Odd Vector Store Word of Two Halfwords from Odd Vector Store Word of Two Halfwords from Odd Vector Store Word of Two Halfwords from Odd Vector Store Word of Word from Even Vector Store Word of Word from Odd Vector Store Word of Word from Even Vector Store Word of Word from Even Vector Store Word of Word from Even Vector Store Word of Word from Odd Vector Store Word of Word from Odd Vector Store Word of Word from Odd Vector Store Word of Word from Odd Vector Store Word of Word from Odd Vector Store Word of Word from Odd Vector Subtract Signed, Modulo, Integer to Acc Vector Subtract Unsigned, Saturate, Doubleword Vector Subtract Inmediate from Word Vector Sub | | | | | | | | | | | | EVX | _ | | - | | | _ | | | | | | EVX | I | EVX | 31 | 413 | | Р | 636 | E.PD | evstddepx | | | EVX | | EVX | 4 | 800 | | | 309 | SP | evstddx | | | EVX | | | | | | | | | | | | EVX | | EVX | 4 | | | | | | evstdhx | | | EVX | | EVX | 4 | | | | 310 | | evstdw | Vector Store Double of Two Words | | EVX | | | | | | | | | evstdwx | | | EVX | | | | | | | _ | | | | | EVX | | | | | | | | | | | | EVX 4 825 | | | | | | | | | | | | EVX 4 824 829 311 SP evstwwc Vector Store Word of Word from Even Indexed EVX 4 829 812 SP evstwwox Vector Store Word of Word from Odd Vector Store Word of Word from Odd Vector Store Word of Word from Odd Indexed Vector Store Word of Word from Odd Indexed Vector Store Word of Word from Odd Indexed Vector Store Word of Word from Odd Indexed Vector Store Word of Word from Odd Indexed Vector Subtract Signed, Modulo, Integer to Accum Word Vector Subtract Signed, Saturate, Integer to Accum Vector Subtract Unsigned, Modulo, Integer to Accum Vector Subtract Unsigned, Modulo, Integer to Accum Vector Subtract Unsigned, Saturate, Integer to Accum Vector Subtract Unsigned, Saturate, Integer to Accum Vector Subtract Unsigned, Saturate, Integer to Accum Vector Subtract Unsigned, Saturate, Integer to Accum Vector Subtract Unsigned, Saturate, Integer to Accum Vector Subtract Immediate from Word Subtrac | | | | | | | _ | | | | | EVX | | | | | | | | | | | | EVX | | | | | | | | | | | | EVX | | | | | | | - | | | | | EVX 4 1219 312 SP evsubfssiaaw Vector Subtract Signed, Saturate, Integer to Acctor Word EVX 4 1226 313 SP evsubfumiaaw Vector Subtract Unsigned, Modulo, Integer to Accidator Word EVX 4 1218 313 SP evsubfusiaaw Vector Subtract Unsigned, Saturate, Integer to Accidator Word EVX 4 516 313 SP evsubfw Vector Subtract Unsigned, Saturate, Integer to Accidate Word EVX 4 516 313 SP evsubfw Vector Subtract Unsigned, Modulo, Integer to Accidate Word Integer to Accidate Word Vector Subtract Unsigned, Modulo, Integer to Accidate Word Vector Subtract Unsigned, Modulo, Integer to Accidate Word EVX 4 516 313 SP evsubfw Vector Subtract Unsigned, Modulo, Integer to Accidate Word EVX 4 516 313 SP evsubfw Vector Subtract Unsigned, Modulo, Integer to Accidate Word EVX 4 518 313 SP evsubfw Vector Subtract Unsigned, Mod EVX 4 5 | | EVX | 4 | | | | 312 | SP | evsubfsmiaaw | Vector Subtract Signed, Modulo, Integer to Accumulator | | EVX 4 1226 313 SP evsubfumiaaw tor Word Vector Subtract Unsigned, Modulo, Integer to Actilator Word EVX 4 1218 313 SP evsubfusiaaw Vector Subtract Unsigned, Saturate, Integer to Actilator Word EVX 4 516 313 SP evsubfw Vector Subtract from Word EVX 4 518 313 SP evsubfw Vector Subtract Immediate from Word EVX 4 518 313 SP evsubfw Vector Subtract Immediate from Word EVX 4 518 313 SP evsubfw Vector Subtract Immediate from Word EVX 4 518 313 SP evsubfw Vector Subtract Immediate from Word Vector Subtract Immediate from Word Vector Subtract Immediate from Word Vector Subtract Immediate from Word X 31 954 SR 79 B extself.] Extend Sign Byte X 31 986 SR 81 64 extsw[.] Extend Sign Halfword | | <b>-</b> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | | 4040 | | | 040 | 0.0 | | | | EVX 4 1226 313 SP evsubfumiaaw Vector Subtract Unsigned, Modulo, Integer to Accellator Word EVX 4 1218 313 SP evsubfusiaaw Vector Subtract Unsigned, Saturate, Integer to Accellator Word EVX 4 516 313 SP evsubfw Vector Subtract from Word EVX 4 518 313 SP evsubfw Vector Subtract Immediate from Word EVX 4 518 313 SP evsubfw Vector Subtract Unsigned, Modulo, Integer to Accellator Word EVX 4 516 313 SP evsubfw Vector Subtract Unsigned, Modulo, Integer to Accellator Word EVX 4 518 313 SP evsubfw Vector Subtract Unsigned, Saturate, Integer to Accellator Word EVX 4 518 313 SP evsubfw Vector Subtract Unsigned, Saturate, Integer to Accellator Word EVX 4 518 313 SP evsubfw Vector Subtract Unsigned, Saturate, Integer to Accellator Word X 63 58 | | EVX | 4 | 1219 | | | 312 | 5P | evsubtssiaaw | _ | | EVX 4 1218 313 SP evsubfusiaaw Vector Subtract Unsigned, Saturate, Integer to Adlator Word EVX 4 516 313 SP evsubfw Vector Subtract from Word EVX 4 518 313 SP evsubifw Vector Subtract Immediate from Word Vex 4 534 313 SP evsubifw Vector Subtract Immediate from Word Vex 4 518 313 SP evsubifw Vector Subtract Immediate from Word Vex 4 514 518 313 SP evsubifw Vector Subtract Immediate from Word Vex 4 514 518 81 64 evsubifw Vector Subtract Immediate from Word Vex 4 518 82 82 82 evsubifw Vector Subtract Immediate from Word Vex 63 192 SR 79 B evsubifw Vector Subtract Immediate from Word Vex 63 192 SR 79 B extsb | | EVX | 4 | 1226 | | | 313 | SP | evsubfumiaaw | Vector Subtract Unsigned, Modulo, Integer to Accumu- | | EVX | | <b>-</b> \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | | 4040 | | | 0.40 | 0.0 | | | | EVX | | EVX | 4 | 1218 | | | 313 | SP | evsubfusiaaw | | | EVX | | EVX | 4 | 516 | | | 313 | SP | evsubfw | | | EVX | | | | | | | | | | | | X 31 922 SR 79 B extsh[.] Extend Sign Halfword X 31 986 SR 81 64 extsw[.] Extend Sign Halfword X 63 264 126 FP[R] fabs[.] Floating Absolute Value A 63 21 127 FP[R] fadds[.] Floating Add Single X 63 846 136 FP[R] fcfid[.] Floating Convert From Integer Doubleword X 63 32 138 FP fcmpo Floating Compare Ordered X 63 8 126 FP[R] fcpsgn[.] Floating Compare Unordered X 63 814 134 FP[R] fctid[.] Floating Convert To Integer Doubleword X 63 815 135 FP[R] fctidz[.] Floating Convert To Integer Word X 63 14 135 FP[R] fctiw[.] Floating Convert To Integer Word with round tow X 63 18 </th <th></th> <td>EVX</td> <td>4</td> <td>534</td> <td></td> <td></td> <td>313</td> <td></td> <td>evxor</td> <td>Vector XOR</td> | | EVX | 4 | 534 | | | 313 | | evxor | Vector XOR | | X 31 986 SR 126 FP[R] FP[R] Floating Absolute Value A 63 21 127 FP[R] Fp[R] Floating Add A 59 21 127 FP[R] Fodd[.] Floating Add X 63 846 136 FP[R] Fcfid[.] Floating Convert From Integer Doubleword X 63 32 138 FP fcmpo Floating Compare Ordered X 63 84 126 FP[R] Fcmpu Floating Compare Unordered X 63 81 134 FP[R] Fctid[.] Floating Convert To Integer Doubleword X 63 814 135 FP[R] Fctide[.] Floating Convert To Integer Doubleword with rou toward Zero X 63 14 135 FP[R] Fctiw[.] Floating Convert To Integer Word X 63 15 136 FP[R] Fctiw[.] Floating Convert To Integer Word with round tow Zero X 63 18 128 FP[R] Fdiv[.] Floating Divide 128 FP[R] Fdiv[.] Floating Divide X 63 128 FP[R] Fdiv[.] Floating Add Single Floating Add Floating Add F | | | | | | | - | | | | | X 63 264 126 FP[R] fabs[.] Floating Absolute Value A 63 21 127 FP[R] fadd[.] Floating Add A 59 21 127 FP[R] fadds[.] Floating Add Single X 63 846 136 FP[R] fcfid[.] Floating Convert From Integer Doubleword X 63 32 138 FP fcmpo Floating Compare Ordered X 63 8 126 FP[R] fcpsgn[.] Floating Compare Unordered X 63 814 134 FP[R] fctid[.] Floating Convert To Integer Doubleword X 63 815 135 FP[R] fctidz[.] Floating Convert To Integer Doubleword with round tow toward Zero X 63 14 135 FP[R] fctiw[.] Floating Convert To Integer Word with round tow Zero A 63 18 FP[R] fdiv[.] Floating Convert To Integer Word with round tow Zero | | | | | | | - | | extsh[.] | Extend Sign Halfword | | A 63 21 127 FP[R] fadd[.] Floating Add A 59 21 127 FP[R] fadds[.] Floating Add Single X 63 846 136 FP[R] fcfid[.] Floating Convert From Integer Doubleword X 63 32 138 FP fcmpo Floating Compare Ordered X 63 8 126 FP[R] fcpsgn[.] Floating Compare Unordered X 63 814 134 FP[R] fctid[.] Floating Convert To Integer Doubleword X 63 815 135 FP[R] fctidz[.] Floating Convert To Integer Doubleword with round tow toward Zero X 63 14 135 FP[R] fctiw[.] Floating Convert To Integer Word X 63 15 136 FP[R] fctiwz[.] Floating Convert To Integer Word with round tow Zero A 63 18 128 FP[R] fdiv[.] Floating Convert To Integer Word with round tow Zero | | | | | SR | | | | | | | A 59 21 | | _ | | | | | | | | | | X 63 846 | | | | | | | | | | | | X 63 32 | | | | | | | | | | | | X 63 8 14 126 FP[R] fcpsgn[.] Floating Copy Sign Floating Copy Sign Floating Copy Sign Floating Convert To Integer Doubleword Floating Convert To Integer Doubleword with rou toward Zero X 63 14 135 FP[R] fctiw[.] Floating Convert To Integer Doubleword with rou toward Zero Floating Copy Sign Floating Convert To Integer Doubleword with rou toward Zero Floating Copy Sign Floating Convert To Integer Doubleword with round tow Zero Floating Copy Sign Floating Convert To Integer Doubleword with round tow Zero Floating Copy Sign Floating Convert To Integer Doubleword Floating Convert To Integer Word with round tow Zero Floating Copy Sign Floating Copy Sign Floating Convert To Integer Doubleword Floating Convert To Integer Doubleword Floating Convert To Integer Doubleword Floating Convert To Integer Word Con | | | | 32 | | | | | | | | X 63 814 134 FP[R] fctid[.] Floating Convert To Integer Doubleword Floating Convert To Integer Doubleword with rou toward Zero X 63 14 135 FP[R] fctiw[.] Floating Convert To Integer Doubleword with rou toward Zero Floating Convert To Integer Word Floating Convert To Integer Word Floating Convert To Integer Word Floating Convert To Integer Word with round tow Zero A 63 18 14 FP[R] fdiv[.] Floating Divide | | | | | | | | | | | | X 63 815 135 FP[R] fctidZ[.] Floating Convert To Integer Doubleword with rou toward Zero X 63 14 135 FP[R] fctiw[.] Floating Convert To Integer Word X 63 15 136 FP[R] fctiwz[.] Floating Convert To Integer Word with round tow Zero A 63 18 128 FP[R] fdiv[.] Floating Convert To Integer Word with round tow Zero Floating Divide | | | | - | | | | | | | | X 63 14 135 FP[R] fctiw[.] toward Zero Floating Convert To Integer Word Floating Convert To Integer Word with round tow Zero A 63 18 128 FP[R] fdiv[.] Floating Divide | | | | | | | | | | Floating Convert To Integer Doubleword | | X 63 14 135 FP[R] fctiw[.] Floating Convert To Integer Word X 63 15 136 FP[R] fctiwz[.] Floating Convert To Integer Word with round tow Zero A 63 18 128 FP[R] fdiv[.] Floating Divide | | ^ | 63 | 815 | | | 135 | FP[K] | ictidz[.] | | | X 63 15 136 FP[R] fctiwz[.] Floating Convert To Integer Word with round tow Zero A 63 18 128 FP[R] fdiv[.] Floating Divide | | x | 63 | 14 | | | 135 | FPIRI | fctiw[ ] | | | | | | | | | | | | | Floating Convert To Integer Word with round toward | | | | | | | | | | | | Zero | | | | | | | | | | | | | | | | A | 59 | 18 | | | 128 | FP[R] | fdivs[.] | Floating Divide Single | | A 63 29 132 FP[R] fmadd[.] Floating Multiply-Add A 59 29 132 FP[R] fmadds[.] Floating Multiply-Add Single | | | | | | | | | | | | X 63 72 126 FP[R] fmr[.] Floating Multiply-Add Single | | | | | | | | | | | | A 63 28 132 FP[R] fmsub[.] Floating Multiply-Subtract | | | | | | | | | | | | | : OF | code | de<br>.1 | ١, | _ | - 1 | | | |------|------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | | | 28<br>25<br>25<br>25<br>136<br>40<br>31<br>31<br>30<br>30<br>24<br>488<br>392<br>456<br>424<br>12<br>26<br>23<br>22<br>20<br>274<br>982<br>991<br>230<br>22<br>486<br>966<br>998<br>15<br>150<br>95<br>87<br>0<br>84<br>885<br>29<br>1<br>53<br>21<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>60<br>791<br>791<br>791<br>791<br>791<br>791<br>791<br>791<br>791<br>791 | | H PM MPP P H HP P | 132<br>128<br>128<br>126<br>126<br>133<br>133<br>133<br>133<br>129<br>129<br>137<br>137<br>137<br>137<br>137<br>137<br>137<br>480<br>428<br>634<br>657<br>428<br>656<br>727<br>731<br>74<br>440<br>627<br>440<br>45<br>45<br>46<br>50<br>119<br>635<br>125<br>119<br>119<br>119<br>119<br>119<br>120<br>121<br>122<br>122<br>122<br>122 | | frin[.]<br>frip[.] | Floating Multiply-Subtract Single Floating Multiply Single Floating Negative Absolute Value Floating Negative Multiply-Add Floating Negative Multiply-Add Single Floating Negative Multiply-Subtract Floating Negative Multiply-Subtract Floating Negative Multiply-Subtract Floating Negative Multiply-Subtract Floating Reciprocal Estimate Floating Reciprocal Estimate Floating Round to Integer Minus Floating Round to Integer Minus Floating Round to Integer Plus Floating Round to Integer Plus Floating Round to Integer Plus Floating Round to Integer Poward Zero Floating Round to Integer Poward Zero Floating Round to Single-Precision Floating Reciprocal Square Root Estimate Floating Reciprocal Square Root Estimate Floating Square Root Floating Square Root Floating Square Root Floating Square Root Floating Square Root Single Floating Subtract Floati | | | _ | Opcode<br>Pri Ext | | o ⁻. | ۲, | | | | | |---|---------|-------------------|------------|---------------------------|------|------------|------------------|----------------------|---------------------------------------------------------------------------------| | | Form | Pri | | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | | X | 31 | 375 | | | 47 | В | Ihaux | Load Halfword Algebraic with Update Indexed | | | X | 31 | 343 | | | 47 | В | lhax | Load Halfword Algebraic Indexed | | | X<br>X | 31<br>31 | 790<br>287 | | Р | 55<br>627 | B<br>E.PD | Ihbrx | Load Halfword Byte-Reverse Indexed | | I | Ď | 40 | 201 | | Г | 46 | В | lhepx<br>lhz | Load Halfword by External Process ID Indexed<br>Load Halfword and Zero | | | X | 31 | 821 | | Н | 491 | S | Ihzcix | Load Halfword and Zero Caching Inhibited Indexed | | I | Ď | 41 | 021 | | ''' | 46 | В | Ihzu | Load Halfword and Zero with Update | | | X | 31 | 311 | | | 46 | В | lhzux | Load Halfword and Zero with Update Indexed | | | X | 31 | 279 | | | 46 | В | lhzx | Load Halfword and Zero Indexed | | | D | 46 | | | | 56 | В | lmw | Load Multiple Word | | | DQ | 56 | | | Р | 493 | LSQ | lq | Load Quadword | | | X | 31 | 597 | | | 59 | MA | Iswi | Load String Word Immediate | | | X | 31 | 533 | | | 59 | MA | Iswx | Load String Word Indexed | | | X | 31 | 7 | | | 206 | V | lvebx | Load Vector Element Byte Indexed | | | X<br>X | 31<br>31 | 39<br>295 | | Р | 203<br>637 | V<br>E.PD | lvehx<br>lvepx | Load Vector Element Halfword Indexed Load Vector by External Process ID Indexed | | | X | 31 | 263 | | Р | 637 | E.PD | Ivepx | Load Vector by External Process ID Indexed LRU | | ı | X | 31 | 71 | | • | 203 | V V | Ivewx | Load Vector Element Word Indexed | | | X | 31 | 6 | | | 208 | V | Ivsl | Load Vector for Shift Left Indexed | | | Χ | 31 | 38 | | | 208 | V | lvsr | Load Vector for Shift Right Indexed | | | Χ | 31 | 103 | | | 204 | V | lvx | Load Vector Indexed | | | Х | 31 | 359 | | | 204 | V | lvxl | Load Vector Indexed LRU | | | DS | 58 | 2 | | | 49 | 64 | lwa | Load Word Algebraic | | | X | 31 | 20<br>373 | | | 442 | B | lwarx | Load Word Algebraic with Undeted | | | X | 31<br>31 | 341 | | | 49<br>49 | 64<br>64 | lwaux<br>lwax | Load Word Algebraic with Update Indexed<br>Load Word Algebraic Indexed | | | X | 31 | 534 | | | 55 | B | lwbrx | Load Word Byte-Reverse Indexed | | I | X | 31 | 31 | | Р | 628 | E.PD | lwepx | Load Word by External Process ID Indexed | | • | D | 32 | | | | 48 | В | lwz | Load Word and Zero | | I | X | 31 | 789 | | Н | 491 | S | lwzcix | Load Word and Zero Caching Inhibited Indexed | | | D | 33 | | | | 48 | В | lwzu | Load Word and Zero with Update | | | X | 31 | 55 | | | 48 | В | lwzux | Load Word and Zero with Update Indexed | | | X<br>XO | 31<br>4 | 23<br>172 | | | 48<br>351 | B<br>LMA | lwzx<br>macchw[o][.] | Load Word and Zero Indexed Multiply Accumulate Cross Halfword to Word Modulo | | | ٨٥ | 7 | 172 | | | 331 | LIVIA | macchw[o][.] | Signed | | | хо | 4 | 236 | | | 351 | LMA | macchws[o][.] | Multiply Accumulate Cross Halfword to Word Saturate | | | | | | | | | | | Signed | | | XO | 4 | 204 | | | 352 | LMA | macchwsu[o][.] | Multiply Accumulate Cross Halfword to Word Saturate | | | | | | | | | | | Unsigned | | | XO | 4 | 140 | | | 352 | LMA | macchwu[o][.] | Multiply Accumulate Cross Halfword to Word Modulo | | | V ~ | 4 | | | | 050 | 1 8 4 4 | mana alakuurururu | Unsigned | | | ХО | 4 | 44 | | | 353 | LMA | machhw[o][.] | Multiply Accumulate High Halfword to Word Modulo Signed | | | ХО | 4 | 108 | | | 353 | LMA | machhws[o][.] | Multiply Accumulate High Halfword to Word Saturate | | | хо | 4 | 76 | | | 354 | LMA | machhwsu[o][.] | Signed Multiply Accumulate High Halfword to Word Saturate | | | | | | | | | | | Unsigned | | | ХО | 4 | 12 | | | 354 | LMA | machhwu[o][.] | Multiply Accumulate High Halfword to Word Modulo Unsigned | | | хо | 4 | 428 | | | 355 | LMA | maclhw[o][.] | Multiply Accumulate Low Halfword to Word Modulo Signed | | | хо | 4 | 492 | | | 355 | LMA | maclhws[o][.] | Multiply Accumulate Low Halfword to Word Saturate | | | хо | 4 | 460 | | | 356 | LMA | maclhwsu[o][.] | Signed Multiply Accumulate Low Halfword to Word Saturate Unsigned | | | хо | 4 | 396 | | | 356 | LMA | maclhwu[o][.] | Multiply Accumulate Low Halfword to Word Modulo | | | Х | 31 | 854 | | | 448 | Е | mbar | Unsigned<br>Memory Barrier | | | XL | 19 | 004 | | | 38 | В | mcrf | Move Condition Register Field | | | X | 63 | 64 | | | 140 | FP | mcrfs | Move to Condition Register from FPSCR | | | Χ | 31 | 512 | | | 97 | Е | mcrxr | Move to Condition Register from XER | | | | | | | | | | • | | | | E | Op | code | e | ٦, | | - 1 | | | |---|------------|----------|-------------|---------------------------|--------|-------------|------------------|-------------------------|---------------------------------------------------------------------------------| | | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri∨ | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | | X | 31 | 275 | | | 97<br>95 | E | mfapidi | Move From APID Indirect | | ı | XFX<br>XFX | 31<br>31 | 19<br>323 | | Р | 95<br>625 | B<br>E | mfcr<br>mfdcr | Move From Condition Register Move From Device Control Register | | | X | 31 | 291 | | P | 97 | Ē | mfdcrux | Move From Device Control Register User-mode Indexed | | | Χ | 31 | 259 | | Ρ | 625 | Е | mfdcrx | Move From Device Control Register Indexed | | | X | 63 | 583 | | | 140 | FP[R] | mffs[.] | Move From FPSCR | | | X | 31 | 83 | | Р | 503,<br>625 | В | mfmsr | Move From Machine State Register | | | XFX | 31 | 19 | | | 96 | B.in | mfocrf | Move From One Condition Register Field | | I | XFX | 31 | 334 | | 0 | 756 | E.PM | mfpmr | Move From Performance Monitor Register | | | XFX | 31 | 339 | | 0 | 94,4<br>51 | В | mfspr | Move From Special Purpose Register | | | Χ | 31 | 595 | 32 | Ρ | 538 | S | mfsr | Move From Segment Register | | | X<br>XFX | 31 | 659 | 32 | Р | 538 | S | mfsrin | Move From Segment Register Indirect | | | VX | 31<br>4 | 371<br>1540 | | | 451<br>259 | S.out<br>V | mftb<br>mfvscr | Move From Time Base Move From Vector Status and Control Register | | ı | X | 31 | 238 | | Р | 721 | E.PC | msgclr | Message Clear | | | X | 31 | 206 | | Р | 721 | E.PC | msgsnd | Message Send | | | XFX<br>XFX | 31<br>31 | 144<br>451 | | Р | 95<br>624 | B<br>E | mtcrf<br>mtdcr | Move To Condition Register Fields Move To Device Control Register | | | X | 31 | 419 | | | 97 | Ē | mtdcrux | Move To Device Control Register User-mode Indexed | | | Χ | 31 | 387 | | Р | 624 | Е | mtdcrx | Move To Device Control Register Indexed | | | X | 63<br>63 | 70<br>38 | | | 142<br>142 | FP[R]<br>FP[R] | mtfsb0[.]<br>mtfsb1[.] | Move To FPSCR Bit 0<br>Move To FPSCR Bit 1 | | | XFL | 63 | 711 | | | 141 | FP[R] | mtfsf[.] | Move To FPSCR Fields | | | X | 63 | 134 | | | 141 | FP[R] | mtfsfi[.] | Move To FPSCR Field Immediate | | | X | 31 | 146 | | Р | 625 | E | mtmsr | Move To Machine State Register | | | X | 31<br>31 | 146<br>178 | | P<br>P | 501<br>502 | S<br>S | mtmsr<br>mtmsrd | Move To Machine State Register Move To Machine State Register Doubleword | | | XFX | 31 | 144 | | | 96 | B.in | mtocrf | Move To One Condition Register Field | | I | XFX | 31 | 462 | | 0 | 756 | E.PM | mtpmr | Move To Performance Monitor Register | | | XFX<br>X | 31<br>31 | 467<br>210 | 32 | O<br>P | 93<br>537 | B<br>S | mtspr<br>mtsr | Move To Special Purpose Register Move To Segment Register | | | X | 31 | 242 | 32 | P | 537 | S | mtsrin | Move To Segment Register Indirect | | | VX | 4 | 1604 | | | 259 | V | mtvscr | Move To Vector Status and Control Register | | | X | 4 | 168<br>136 | | | 356<br>356 | LMA<br>LMA | mulchw[.]<br>mulchwu[.] | Multiply Cross Halfword to Word Signed Multiply Cross Halfword to Word Unsigned | | | хо | 31 | 73 | SR | | 69 | 64 | mulhd[.] | Multiply High Doubleword | | | XO | 31 | 9 | SR | | 69 | 64 | mulhdu[.] | Multiply High Doubleword Unsigned | | | X<br>X | 4 | 40<br>8 | | | 357<br>357 | LMA<br>LMA | mulhhw[.] | Multiply High Halfword to Word Signed Multiply High Halfword to Word Unsigned | | | хо | 4<br>31 | o<br>75 | SR | | 67 | В | mulhhwu[.]<br>mulhw[.] | Multiply High Word | | | XO | 31 | 11 | SR | | 67 | В | mulhwu[.] | Multiply High Word Unsigned | | | XO<br>X | 31<br>4 | 233 | SR | | 69 | 64<br>LMA | mulld[o][.] | Multiply Low Doubleword Multiply Low Halfword to Word Signed | | | X | 4 | 424<br>392 | | | 357<br>357 | LMA | mullhw[.]<br>mullhwu[.] | Multiply Low Halfword to Word Signed Multiply Low Halfword to Word Unsigned | | | D | 7 | 002 | | | 67 | В | mulli | Multiply Low Immediate | | | XO | 31 | 235 | SR | | 67 | В | mullw[o][.] | Multiply Low Word | | | X<br>XL | 31<br>19 | 476<br>434 | SR | Н | 77<br>482 | B<br>S | nand[.]<br>nap | NAND<br>Nap | | 1 | XO | 31 | 104 | SR | • | 66 | В | neg[o][.] | Negate | | | XO | 4 | 174 | | | 358 | LMA | nmacchw[o][.] | Negative Multiply Accumulate Cross Halfword to Word | | | хо | 4 | 238 | | | 358 | LMA | nmacchws[o][.] | Modulo Signed Negative Multiply Accumulate Cross Halfword to Word | | | | | | | | | | | Saturate Signed | | | XO | 4 | 46 | | | 359 | LMA | nmachhw[o][.] | Negative Multiply Accumulate High Halfword to Word Modulo Signed | | | ХО | 4 | 110 | | | 359 | LMA | nmachhws[o][.] | Negative Multiply Accumulate High Halfword to Word | | | | | | | | | | | Saturate Signed | | | _ | Op | code | -7 e | ٦, | | 4 | | | |---|------------|----------|------------|---------------------------|--------|----------------|------------------|------------------------|-----------------------------------------------------------------------------------------------| | | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | | XO | 4 | 430 | | | 360 | LMA | nmaclhw[o][.] | Negative Multiply Accumulate Low Halfword to Word | | | хо | 4 | 494 | | | 360 | LMA | nmaclhws[o][.] | Modulo Signed Negative Multiply Accumulate Low Halfword to Word Saturate Signed | | | Χ | 31 | 124 | SR | | 78 | В | nor[.] | NOR | | | X | 31 | 444 | SR | | 77 | В | or[.] | OR | | | X<br>D | 31<br>24 | 412 | SR | | 78<br>75 | B<br>B | orc[.]<br>ori | OR with Complement OR Immediate | | | D | 25 | | | | 76 | В | oris | OR Immediate OR Immediate Shifted | | | Χ | 31 | 122 | | | 81 | B.in | popentb | Population Count Bytes | | 1 | Χ | 31 | 186 | | | 80 | 64 | prtyd | Parity Doubleword | | | X<br>XL | 31<br>19 | 154<br>51 | | Р | 80<br>614 | B<br>E | prtyw<br>rfci | Parity Word Return From Critical Interrupt | | | X | 19 | 39 | | Р | 614 | E.ED | rfdi | Return From Debug Interrupt | | | XL | 19 | 50 | | Р | 613 | E | rfi | Return From Interrupt | | | XL | 19 | 18 | | Р | 480 | S | rfid | Return From Interrupt Doubleword | | | XL | 19 | 38 | 00 | Р | 614 | E | rfmci | Return From Machine Check Interrupt | | | MDS<br>MDS | 30<br>30 | 8<br>9 | SR<br>SR | | 86<br>87 | 64<br>64 | rldcl[.]<br>rldcr[.] | Rotate Left Doubleword then Clear Left Rotate Left Doubleword then Clear Right | | | MD | 30 | 2 | SR | | 86 | 64 | rldic[.] | Rotate Left Doubleword Immediate then Clear | | | MD | 30 | 0 | SR | | 85 | 64 | rldicl[.] | Rotate Left Doubleword Immediate then Clear Left | | | MD | 30 | 1 | SR | | 85 | 64 | rldicr[.] | Rotate Left Doubleword Immediate then Clear Right | | | MD<br>M | 30<br>20 | 3 | SR<br>SR | | 87<br>84 | 64<br>B | rldimi[.]<br>rlwimi[.] | Rotate Left Doubleword Immediate then Mask Insert Rotate Left Word Immediate then Mask Insert | | | M | 21 | | SR | | 82 | В | rlwinn[.] | Rotate Left Word Immediate then AND with Mask | | | M | 23 | | SR | | 83 | В | rlwnm[.] | Rotate Left Word then AND with Mask | | | XL | 19 | 498 | | Н | 483 | S | rvwinkle | Rip Van Winkle | | | SC | 17 | | | | 39, | В | sc | System Call | | | | | | | | 479,<br>613 | | | | | Ī | Х | 31 | 979 | SR | Р | 535 | S | slbfee. | SLB Find Entry ESID | | • | Х | 31 | 498 | | Р | 532 | S | slbia | SLB Invalidate All | | | X | 31 | 434 | | Р | 531 | S | slbie | SLB Invalidate Entry | | | X<br>X | 31<br>31 | 915<br>851 | | P<br>P | 534<br>534 | S<br>S | slbmfee<br>slbmfev | SLB Move From Entry ESID<br>SLB Move From Entry VSID | | | X | 31 | 402 | | Р | 533 | S | slbmte | SLB Move To Entry | | | Χ | 31 | 27 | SR | | 90 | 64 | sld[.] | Shift Left Doubleword | | | XL | 19 | 466 | | Н | 483 | S | sleep | Sleep | | | X | 31 | 24<br>704 | SR | | 88 | B | slw[.] | Shift Left Word | | | X<br>XS | 31<br>31 | 794<br>413 | SR<br>SR | | 91<br>91 | 64<br>64 | srad[.]<br>sradi[.] | Shift Right Algebraic Doubleword Shift Right Algebraic Doubleword Immediate | | | X | 31 | 792 | SR | | 89 | В | sraw[.] | Shift Right Algebraic Word | | | X | 31 | 824 | SR | | 89 | В | srawi[.] | Shift Right Algebraic Word Immediate | | | X | 31<br>31 | 539<br>536 | SR<br>SR | | 90<br>88 | 64<br>B | srd[.] | Shift Right Doubleword | | | Ď | 38 | 556 | SK | | 51 | В | srw[.]<br>stb | Shift Right Word<br>Store Byte | | ı | X | 31 | 981 | | Н | 492 | Š | stbcix | Store Byte Caching Inhibited Indexed | | | X | 31 | 223 | | Р | 629 | E.PD | stbepx | Store Byte by External Process ID Indexed | | | D<br>X | 39 | 247 | | | 51<br>51 | В | stbu | Store Byte with Update | | | X | 31<br>31 | 247<br>215 | | | 51<br>51 | B<br>B | stbux<br>stbx | Store Byte with Update Indexed<br>Store Byte Indexed | | | DS | 62 | 0 | | | 54 | 64 | std | Store Doubleword | | | Χ | 31 | 1013 | | Н | 492 | S | stdcix | Store Doubleword Caching Inhibited Indexed | | - | X | 31 | 214 | | _ | 444 | 64<br>E.D.D. | stdcx. | Store Doubleword Conditional Indexed | | | X<br>DS | 31<br>62 | 157<br>1 | | Р | 630<br>54 | E.PD<br>64 | stdepx<br>stdu | Store Doubleword by External Process ID Indexed<br>Store Doubleword with Update | | | X | 31 | 181 | | | 5 <del>4</del> | 64 | stdux | Store Doubleword with Opdate Store Doubleword with Update Indexed | | | X | 31 | 149 | | | 54 | 64 | stdx | Store Doubleword Indexed | | _ | D | 54 | 70- | | _ | 123 | FP | stfd | Store Floating-Point Double | | | Х | 31 | 735 | | Р | 635 | E.PD | stfdepx | Store Floating-Point Double by External Process ID | | | | | | | | | | | Indexed | | [ | _ | Ор | code | <u>o</u> ⁻. | F. | | | | | |--------------|---------|----------|------------|---------------------------|------|------------|------------------|-----------------|-----------------------------------------------------------------------------------------| | E C | 5 | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | ı 🖯 | DS | 61 | - | | | 125 | FP.out | stfdp | Store Floating-Point Double Pair | | | X | 31<br>55 | 919 | | | 125<br>123 | FP.out<br>FP | stfdpx | Store Floating-Point Double Pair Indexed | | | D<br>X | 31 | 759 | | | 123 | FP<br>FP | stfdu<br>stfdux | Store Floating-Point Double with Update Store Floating-Point Double with Update Indexed | | | X | 31 | 727 | | | 123 | FP | stfdx | Store Floating-Point Double Indexed | | | X | 31 | 983 | | | 123 | FP | stfiwx | Store Floating-Point Bodble Indexed Store Floating-Point as Integer Word Indexed | | | Ď | 52 | 303 | | | 122 | FP | stfs | Store Floating-Point as integer word indexed | | | D | 53 | | | | 122 | FP | stfsu | Store Floating-Point Single with Update | | | Χ | 31 | 695 | | | 122 | FP | stfsux | Store Floating-Point Single with Update Indexed | | 2 | Χ | 31 | 663 | | | 122 | FP | stfsx | Store Floating-Point Single Indexed | | | D | 44 | | | | 52 | В | sth | Store Halfword | | | Χ | 31 | 918 | | | 55 | В | sthbrx | Store Halfword Byte-Reverse Indexed | | | X | 31 | 949 | | Н | 492 | S | sthcix | Store Halfword Caching Inhibited Indexed | | | X | 31 | 415 | | Р | 629 | E.PD | sthepx | Store Halfword by External Process ID Indexed | | | D | 45 | 400 | | | 52 | В | sthu | Store Halfword with Update | | | X | 31 | 439 | | | 52 | В | sthux | Store Halfword with Update Indexed | | | X | 31 | 407 | | | 52 | В | sthx | Store Halfword Indexed | | | D<br>DS | 47<br>62 | 2 | | Р | 57<br>493 | B<br>LSQ | stmw | Store Multiple Word<br>Store Quadword | | | X | 31 | 725 | | Г | 60 | MA | stq<br>stswi | Store String Word Immediate | | | X | 31 | 661 | | | 60 | MA | stswx | Store String Word Infinediate Store String Word Indexed | | | X | 31 | 135 | | | 206 | V | stvebx | Store Vector Element Byte Indexed | | | X | 31 | 167 | | | 206 | V | stvehx | Store Vector Element Halfword Indexed | | | X | 31 | 807 | | Р | 638 | E.PD | stvepx | Store Vector by External Process ID Indexed | | | Χ | 31 | 775 | | P | 638 | E.PD | stvepxl | Store Vector by External Process ID Indexed LRU | | • | Χ | 31 | 199 | | | 207 | V | stvewx | Store Vector Element Word Indexed | | ) | Χ | 31 | 231 | | | 204 | V | stvx | Store Vector Indexed | | | X | 31 | 487 | | | 207 | V | stvxl | Store Vector Indexed LRU | | | D | 36 | | | | 53 | В | stw | Store Word | | | X | 31 | 662 | | ١ | 55 | В | stwbrx | Store Word Byte-Reverse Indexed | | 1 | X | 31 | 917 | | Н | 492 | S | stwcix | Store Word Caching Inhibited Indexed | | | X | 31 | 150 | | Р | 442 | В | stwcx. | Store Word Conditional Indexed | | | X<br>D | 31<br>37 | 159 | | Р | 630<br>53 | E.PD<br>B | stwepx<br>stwu | Store Word by External Process ID Indexed Store Word with Update | | | X | 31 | 183 | | | 53 | В | stwux | Store Word with Opdate Store Word with Update Indexed | | | X | 31 | 151 | | | 53 | В | stwx | Store Word Indexed | | | χo | 31 | 40 | SR | | 63 | В | subf[o][.] | Subtract From | | | XO | 31 | 8 | SR | | 64 | В | subfc[o][.] | Subtract From Carrying | | | ΧO | 31 | 136 | SR | | 65 | В | subfe[o][.] | Subtract From Extended | | | D | 8 | | SR | | 64 | В | subfic | Subtract From Immediate Carrying | | | XO | 31 | 232 | SR | | 65 | В | subfme[o][.] | Subtract From Minus One Extended | | | XO | 31 | 200 | SR | | 66 | В | subfze[o][.] | Subtract From Zero Extended | | | X | 31 | 598 | | | 446 | В | sync | Synchronize | | | X | 31 | 68 | | | 74 | 64 | td | Trap Doubleword | | | D | 2 | 270 | | ы | 74<br>542 | 64 | tdi | Trap Doubleword Immediate TLB Invalidate All | | • <u>;</u> | X<br>X | 31<br>31 | 370<br>306 | 64 | H | 542<br>539 | S<br>S | tlbia<br>tlbie | TLB Invalidate All TLB Invalidate Entry | | | x X | 31 | 274 | 64 | Н | | S | tlbiel | TLB Invalidate Entry TLB Invalidate Entry Local | | ; | x X | 31 | 786 | 04 | Р | 658, | S<br>E | tlbivax | TLB Invalidate Virtual Address Indexed | | ' | `` | 01 | , 50 | | ' | 747 | _ | | 1.25 III andato viitaai / taarooo iiiaa/ou | | | X | 31 | 946 | | Р | 658, | Е | tlbre | TLB Read Entry | | | | | • | | | 748 | - | | , | | | X | 31 | 914 | | Р | 659, | Е | tlbsx | TLB Search Indexed | | | | | | | | 748 | - | . , | | | | X | 31 | 566 | | Н | | В | tlbsync | TLB Synchronize | | | | | | | • | 659, | - | | | | | | | | | | 749 | | | | | | X | 31 | 978 | | Р | 660, | Е | tlbwe | TLB Write Entry | | | • | ٠. | | | • | 749 | _ | | | | | X | 31 | 4 | | | 73 | В | tw | Trap Word | | | D | 3 | - | | | 73 | В | twi | Trap Word Immediate | | L. | • | | | ı | L | | | 1 | 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 | | E | Op | code | . 7 e | ٦, | | - 1 | | | |---------|-----|------|---------------------------|------|------|------------------|-------------|-------------------------------------------------------------| | AX Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri∖ | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | VX | 4 | 384 | | | 220 | V | vaddcuw | Vector Add and Write Carry-Out Unsigned Word | | VX | 4 | 10 | | | 249 | V | vaddfp | Vector Add Single-Precision | | VX | 4 | 768 | | | 220 | V | vaddsbs | Vector Add Signed Byte Saturate | | VX | 4 | 832 | | | 220 | V | vaddshs | Vector Add Signed Halfword Saturate | | VX | 4 | 896 | | | 220 | V | vaddsws | Vector Add Signed Word Saturate | | VX | 4 | 0 | | | 221 | V | vaddubm | Vector Add Unsigned Byte Modulo | | VX | 4 | 512 | | | 222 | V | vaddubs | Vector Add Unsigned Byte Saturate | | VX | 4 | 64 | | | 221 | V | vadduhm | Vector Add Unsigned Halfword Modulo | | VX | 4 | 576 | | | 222 | V | vadduhs | Vector Add Unsigned Halfword Saturate | | VX | 4 | 128 | | | 221 | V | vadduwm | Vector Add Unsigned Word Modulo | | VX | 4 | 640 | | | 222 | V | vadduws | Vector Add Unsigned Word Saturate | | VX | 4 | 1028 | | | 244 | V | vand | Vector Logical AND | | VX | 4 | 1092 | | | 244 | V | vandc | Vector Logical AND with Complement | | VX | 4 | 1282 | | | 235 | V | vavgsb | Vector Average Signed Byte | | VX | 4 | 1346 | | | 235 | V | vavgsh | Vector Average Signed Halfword | | VX | 4 | 1410 | | | 235 | V | vavgsw | Vector Average Signed Word | | VX | 4 | 1026 | | | 236 | V | vavgub | Vector Average Unsigned Byte | | VX | 4 | 1090 | | | 236 | V | vavguh | Vector Average Unsigned Halfword | | VX | 4 | 1154 | | | 236 | V | vavguw | Vector Average Unsigned Word | | VX | 4 | 842 | | | 253 | V | vcfsx | Vector Convert From Signed Fixed-Point Word | | VX | 4 | 778 | | | 253 | V | vcfux | Vector Convert From Unsigned Fixed-Point Word | | VC | 4 | 966 | | | 255 | V | vcmpbfp[.] | Vector Compare Bounds Single-Precision | | VC | 4 | 198 | | | 255 | V | vcmpeqfp[.] | Vector Compare Equal To Single-Precision | | VC | 4 | 6 | | | 241 | V | vcmpequb[.] | Vector Compare Equal To Unsigned Byte | | VC | 4 | 70 | | | 241 | V | vcmpequh[.] | Vector Compare Equal To Unsigned Halfword | | VC | 4 | 134 | | | 242 | V | vcmpequw[.] | Vector Compare Equal To Unsigned Word | | VC | 4 | 454 | | | 256 | V | vcmpgefp[.] | Vector Compare Greater Than or Equal To Single-Precision | | VC | 4 | 710 | | | 256 | V | vcmpgtfp[.] | Vector Compare Greater Than Single-Precision | | VC | 4 | 774 | | | 242 | V | vcmpgtsb[.] | Vector Compare Greater Than Signed Byte | | VC | 4 | 838 | | | 242 | V | vcmpgtsh[.] | Vector Compare Greater Than Signed Halfword | | VC | 4 | 902 | | | 242 | V | vcmpgtsw[.] | Vector Compare Greater Than Signed Word | | VC | 4 | 518 | | | 243 | V | vcmpgtub[.] | Vector Compare Greater Than Unsigned Byte | | VC | 4 | 582 | | | 243 | V | vcmpgtuh[.] | Vector Compare Greater Than Unsigned Halfword | | VC | 4 | 646 | | | 243 | V | vcmpgtuw[.] | Vector Compare Greater Than Unsigned Word | | VX | 4 | 970 | | | 252 | V | vctsxs | Vector Convert To Signed Fixed-Point Word Saturate | | VX | 4 | 906 | | | 252 | V | vctuxs | Vector Convert To Unsigned Fixed-Point Word Saturate | | VX | 4 | 394 | | | 257 | V | vexptefp | Vector 2 Raised to the Exponent Estimate Floating-<br>Point | | VX | 4 | 458 | | | 257 | V | vlogefp | Vector Log Base 2 Estimate Floating-Point | | VA | 4 | 46 | | | 250 | V | vmaddfp | Vector Multiply-Add Single-Precision | | VX | 4 | 1034 | | | 251 | V | vmaxfp | Vector Maximum Single-Precision | | VX | 4 | 258 | | | 237 | V | vmaxsb | Vector Maximum Signed Byte | | VX | 4 | 322 | | | 237 | V | vmaxsh | Vector Maximum Signed Halfword | | VX | 4 | 386 | | | 237 | V | vmaxsw | Vector Maximum Signed Word | | VX | 4 | 2 | | | 238 | V | vmaxub | Vector Maximum Unsigned Byte | | VX | 4 | 66 | | | 238 | V | vmaxuh | Vector Maximum Unsigned Halfword | | VX | 4 | 130 | | | 238 | V | vmaxuw | Vector Maximum Unsigned Word | | VA | 4 | 32 | | | 228 | V | vmhaddshs | Vector Multiply-High-Add Signed Halfword Saturate | | VA | 4 | 33 | | | 228 | V | vmhraddshs | Vector Multiply-High-Round-Add Signed Halfword Saturate | | VX | 4 | 1098 | | | 251 | V | vminfp | Vector Minimum Single-Precision | | VX | 4 | 770 | | | 239 | V | vminsb | Vector Minimum Signed Byte | | VX | 4 | 834 | | | 239 | V | vminsh | Vector Minimum Signed Halfword | | VX | 4 | 898 | | | 239 | V | vminsw | Vector Minimum Signed Word | | VX | 4 | 514 | | | 240 | V | vminub | Vector Minimum Unsigned Byte | | VX | 4 | 578 | | | 240 | V | vminuh | Vector Minimum Unsigned Halfword | | VX | 4 | 642 | | | 240 | V | vminuw | Vector Minimum Unsigned Word | | VA | 4 | 34 | | | 229 | V | vmladduhm | Vector Multiply-Low-Add Unsigned Halfword Modulo | | VX | 4 | 12 | | | 214 | V | vmrghb | Vector Merge High Byte | | VX | 4 | 76 | | | 214 | V | vmrghh | Vector Merge High Halfword | | _ | Ор | code | <u>, 7 e</u> | _ | | 4 | | | |-----------------------------------------|-----|------|---------------------------|------|------|------------------|-----------|---------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | VX | 4 | 140 | | | 214 | V | vmrghw | Vector Merge High Word | | VX | 4 | 268 | | | 215 | V | vmrglb | Vector Merge Low Byte | | VX | 4 | 332 | | | 215 | V | vmrglh | Vector Merge Low Halfword | | VX | 4 | 396 | | | 215 | V | vmrglw | Vector Merge Low Word | | VA | 4 | 37 | | | 230 | V | vmsummbm | Vector Multiply-Sum Mixed Byte Modulo | | VA | 4 | 40 | | | 230 | V | vmsumshm | Vector Multiply-Sum Signed Halfword Modulo | | VA | 4 | 41 | | | 231 | V | vmsumshs | Vector Multiply-Sum Signed Halfword Saturate | | VA | 4 | 36 | | | 229 | V | vmsumubm | Vector Multiply-Sum Unsigned Byte Modulo | | VA | 4 | 38 | | | 231 | V | vmsumuhm | Vector Multiply-Sum Unsigned Halfword Modulo | | VA | 4 | 39 | | | 232 | V | vmsumuhs | Vector Multiply-Sum Unsigned Halfword Saturate | | VX | 4 | 776 | | | 226 | V | vmulesb | Vector Multiply Even Signed Byte | | VX | 4 | 840 | | | 226 | V | vmulesh | Vector Multiply Even Signed Halfword | | VX | 4 | 520 | | | 226 | V | vmuleub | Vector Multiply Even Unsigned Byte | | VX | 4 | 584 | | | 226 | V | vmuleuh | Vector Multiply Even Unsigned Halfword | | VX | 4 | 264 | | | 227 | V | vmulosb | Vector Multiply Odd Signed Byte | | VX | 4 | 328 | | | 227 | V | vmulosh | Vector Multiply Odd Signed Halfword | | VX | 4 | 8 | | | 227 | V | vmuloub | Vector Multiply Odd Unsigned Byte | | VX | 4 | 72 | | | 227 | V | vmulouh | Vector Multiply Odd Unsigned Halfword | | VA | 4 | 47 | | | 250 | V | vnmsubfp | Vector Negative Multiply-Subtract Single-Precision | | VX | 4 | 1284 | | | 244 | V | vnor | Vector Logical NOR | | VX | 4 | 1156 | | | 244 | V | vor | Vector Logical OR | | VA | 4 | 43 | | | 217 | V | vperm | Vector Permute | | VX | 4 | 782 | | | 209 | V | vpkpx | Vector Pack Pixel | | VX | 4 | 398 | | | 210 | V | vpkshss | Vector Pack Signed Halfword Signed Saturate | | VX | 4 | 270 | | | 210 | V | vpkshus | Vector Pack Signed Halfword Unsigned Saturate | | VX | 4 | 462 | | | 210 | V | vpkswss | Vector Pack Signed Word Signed Saturate | | VX | 4 | 334 | | | 210 | V | vpkswus | Vector Pack Signed Word Unsigned Saturate | | VX | 4 | 14 | | | 211 | V | vpkuhum | Vector Pack Unsigned Halfword Unsigned Modulo | | VX | 4 | 142 | | | 211 | V | vpkuhus | Vector Pack Unsigned Halfword Unsigned Saturate | | VX | 4 | 78 | | | 211 | V | vpkuwum | Vector Pack Unsigned Word Unsigned Modulo | | VX | 4 | 206 | | | 211 | V | vpkuwus | Vector Pack Unsigned Word Unsigned Saturate | | VX | 4 | 266 | | | 258 | V | vrefp | Vector Reciprocal Estimate Single-Precision | | VX | 4 | 714 | | | 254 | V | vrfim | Vector Round to Single-Precision Integer toward -Infin- | | | | | | | | | | ity | | VX | 4 | 522 | | | 254 | V | vrfin | Vector Round to Single-Precision Integer Nearest | | VX | 4 | 650 | | | 254 | V | vrfip | Vector Round to Single-Precision Integer toward +Infin- | | | | | | | | | | ity | | VX | 4 | 586 | | | 254 | V | vrfiz | Vector Round to Single-Precision Integer toward Zero | | VX | 4 | 4 | | | 245 | V | vrlb | Vector Rotate Left Byte | | VX | 4 | 68 | | | 245 | V | vrlh | Vector Rotate Left Halfword | | VX | 4 | 132 | | | 245 | V | vrlw | Vector Rotate Left Word | | VX | 4 | 330 | | | 258 | V | vrsqrtefp | Vector Reciprocal Square Root Estimate Single-Preci- | | | | | | | | | ' ' | sion | | VA | 4 | 42 | | l | 217 | V | vsel | Vector Select | | VX | 4 | 452 | | | 218 | V | vsl | Vector Shift Left | | VX | 4 | 260 | | | 246 | V | vslb | Vector Shift Left Byte | | VA | 4 | 44 | | | 218 | V | vsldoi | Vector Shift Left Double by Octet Immediate | | VX | 4 | 324 | | | 246 | V | vslh | Vector Shift Left Halfword | | VX | 4 | 1036 | | | 218 | V | vslo | Vector Shift Left by Octet | | VX | 4 | 388 | | | 246 | V | vslw | Vector Shift Left Word | | VX | 4 | 524 | | | 216 | V | vspltb | Vector Splat Byte | | VX | 4 | 588 | | | 216 | V | vsplth | Vector Splat Halfword | | VX | 4 | 780 | | | 216 | V | vspltisb | Vector Splat Immediate Signed Byte | | VX | 4 | 844 | | | 216 | V | vspltish | Vector Splat Immediate Signed Halfword | | VX | 4 | 908 | | l | 216 | V | vspltisw | Vector Splat Immediate Signed Word | | VX | 4 | 652 | | | 216 | V | vspltw | Vector Splat Word | | VX | 4 | 708 | | | 219 | V | vsr | Vector Shift Right | | VX | 4 | 772 | | | 248 | V | vsrab | Vector Shift Right Algebraic Byte | | VX | 4 | 836 | | l | 248 | V | vsrah | Vector Shift Right Algebraic Byte | | VX | 4 | 900 | | | 248 | V | vsraw | Vector Shift Right Algebraic Word | | VX | 4 | 516 | | l | 247 | V | vsrb | Vector Shift Right Byte | | • • • • • • • • • • • • • • • • • • • • | | 0.0 | l | | '' | , v | | 1. Color Crime raight Dylo | | _ | Op | code | o۲. | _ | | | | | |------|-----|------|-------------|-------------------|------|------------------|----------|----------------------------------------------------| | Form | Pri | Ext | Mod<br>Dep. | Priv <sup>1</sup> | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | VX | 4 | 580 | | | 247 | V | vsrh | Vector Shift Right Halfword | | VX | 4 | 1100 | | | 219 | V | vsro | Vector Shift Right by Octet | | VX | 4 | 644 | | | 247 | V | vsrw | Vector Shift Right Word | | VX | 4 | 1408 | | | 223 | V | vsubcuw | Vector Subtract and Write Carry-Out Unsigned Word | | VX | 4 | 74 | | | 249 | V | vsubfp | Vector Subtract Single-Precision | | VX | 4 | 1792 | | | 223 | V | vsubsbs | Vector Subtract Signed Byte Saturate | | VX | 4 | 1856 | | | 223 | V | vsubshs | Vector Subtract Signed Halfword Saturate | | VX | 4 | 1920 | | | 223 | V | vsubsws | Vector Subtract Signed Word Saturate | | VX | 4 | 1024 | | | 224 | V | vsububm | Vector Subtract Unsigned Byte Modulo | | VX | 4 | 1536 | | | 225 | V | vsububs | Vector Subtract Unsigned Byte Saturate | | VX | 4 | 1088 | | | 224 | V | vsubuhm | Vector Subtract Unsigned Halfword Modulo | | VX | 4 | 1600 | | | 224 | V | vsubuhs | Vector Subtract Unsigned Halfword Saturate | | VX | 4 | 1152 | | | 224 | V | vsubuwm | Vector Subtract Unsigned Word Modulo | | VX | 4 | 1664 | | | 225 | V | vsubuws | Vector Subtract Unsigned Word Saturate | | VX | 4 | 1672 | | | 233 | V | vsum2sws | Vector Sum across Half Signed Word Saturate | | VX | 4 | 1800 | | | 234 | V | vsum4sbs | Vector Sum across Quarter Signed Byte Saturate | | VX | 4 | 1608 | | | 234 | V | vsum4shs | Vector Sum across Quarter Signed Halfword Saturate | | VX | 4 | 1544 | | | 234 | V | vsum4ubs | Vector Sum across Quarter Unsigned Byte Saturate | | VX | 4 | 1928 | | | 233 | V | vsumsws | Vector Sum across Signed Word Saturate | | VX | 4 | 846 | | | 212 | V | vupkhpx | Vector Unpack High Pixel | | VX | 4 | 526 | | | 212 | V | vupkhsb | Vector Unpack High Signed Byte | | VX | 4 | 590 | | | 212 | V | vupkhsh | Vector Unpack High Signed Halfword | | VX | 4 | 974 | | | 213 | V | vupklpx | Vector Unpack Low Pixel | | VX | 4 | 654 | | | 213 | V | vupklsb | Vector Unpack Low Signed Byte | | VX | 4 | 718 | | | 213 | V | vupklsh | Vector Unpack Low Signed Halfword | | VX | 4 | 1220 | | | 244 | V | vxor | Vector Logical XOR | | X | 31 | 62 | | | 449 | WT | wait | Wait | | Χ | 31 | 131 | | Р | 626 | Е | wrtee | Write MSR External Enable | | Χ | 31 | 163 | | Р | 626 | Ε | wrteei | Write MSR External Enable Immediate | | Χ | 31 | 316 | SR | | 77 | В | xor[.] | XOR | | D | 26 | | | | 76 | В | xori | XOR Immediate | | D | 27 | | | | 76 | В | xoris | XOR Immediate Shifted | See the key to the mode dependency and privilege columns on page 905 and the key to the category column in Section 1.3.5 of Book I. ### **Mode Dependency and Privilege Abbreviations** Except as described below and in Section 1.10.3, "Effective Address Calculation", in Book I, all instructions are independent of whether the processor is in 32-bit or 64-bit mode. #### **Key to Mode Dependency Column** | Mode Dep.<br>CT | <b>Description</b> If the instruction tests the Count Register, it tests the low-order 32 bits in 32-bit mode and all 64 bits in 64-bit mode. | |-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------| | SR | The setting of status registers (such as XER and CR0) is mode-dependent. | | 32 | The instruction can be executed only in 32-bit mode. | | 64 | The instruction can be executed only in 64-bit mode. | #### **Key to Privilege Column** #### Priv. Description P Denotes a privileged instruction. #### Priv. Description - O Denotes an instruction that is treated as privileged or nonprivileged (or hypervisor, for *mtspr*), depending on the SPR or PMR numher - H Denotes an instruction that can be executed only in hypervisor state - M Denotes an instruction that is treated as privileged or nonprivileged, depending on the value of the UCLE bit in the MSR. # **Appendix H. Power ISA Instruction Set Sorted by Category** This appendix lists all the instructions in the Power ISA, grouped by category, and in order by mnemonic within category. . | _ | Op | code | e<br>.1 | ٦, | | | | | |-----------|-----|------|---------------------------|------|------|------------------|-------------|---------------------------------------------------| | X<br>Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | X | 31 | 58 | SR | | 81 | 64 | cntlzd[.] | Count Leading Zeros Doubleword | | XO | 31 | 489 | SR | | 70 | 64 | divd[o][.] | Divide Doubleword | | XO | 31 | 457 | SR | | 70 | 64 | divdu[o][.] | Divide Doubleword Unsigned | | X | 31 | 986 | SR | | 81 | 64 | extsw[.] | Extend Sign Word | | DS | 58 | 0 | | | 50 | 64 | ld | Load Doubleword | | X | 31 | 84 | | | 444 | 64 | Idarx | Load Doubleword And Reserve Indexed | | DS | 58 | 1 | | | 50 | 64 | ldu | Load Doubleword with Update | | X | 31 | 53 | | | 50 | 64 | ldux | Load Doubleword with Update Indexed | | X | 31 | 21 | | | 50 | 64 | ldx | Load Doubleword Indexed | | DS | 58 | 2 | | | 49 | 64 | lwa | Load Word Algebraic | | X | 31 | 373 | | | 49 | 64 | lwaux | Load Word Algebraic with Update Indexed | | Χ | 31 | 341 | | | 49 | 64 | lwax | Load Word Algebraic Indexed | | XO | 31 | 73 | SR | | 69 | 64 | mulhd[.] | Multiply High Doubleword | | XO | 31 | 9 | SR | | 69 | 64 | mulhdu[.] | Multiply High Doubleword Unsigned | | XO | 31 | 233 | SR | | 69 | 64 | mulld[o][.] | Multiply Low Doubleword | | X | 31 | 186 | | | 80 | 64 | prtyd | Parity Doubleword | | MDS | 30 | 8 | SR | | 86 | 64 | rldcl[.] | Rotate Left Doubleword then Clear Left | | MDS | 30 | 9 | SR | | 87 | 64 | rldcr[.] | Rotate Left Doubleword then Clear Right | | MD | 30 | 2 | SR | | 86 | 64 | rldic[.] | Rotate Left Doubleword Immediate then Clear | | MD | 30 | 0 | SR | | 85 | 64 | rldicl[.] | Rotate Left Doubleword Immediate then Clear Left | | MD | 30 | 1 | SR | | 85 | 64 | rldicr[.] | Rotate Left Doubleword Immediate then Clear Right | | MD | 30 | 3 | SR | | 87 | 64 | rldimi[.] | Rotate Left Doubleword Immediate then Mask Insert | | X | 31 | 27 | SR | | 90 | 64 | sld[.] | Shift Left Doubleword | | Χ | 31 | 794 | SR | | 91 | 64 | srad[.] | Shift Right Algebraic Doubleword | | XS | 31 | 413 | SR | | 91 | 64 | sradi[.] | Shift Right Algebraic Doubleword Immediate | | X | 31 | 539 | SR | | 90 | 64 | srd[.] | Shift Right Doubleword | | DS | 62 | 0 | | | 54 | 64 | std | Store Doubleword | | X | 31 | 214 | | | 444 | 64 | stdcx. | Store Doubleword Conditional Indexed | | DS | 62 | 1 | | | 54 | 64 | stdu | Store Doubleword with Update | | X | 31 | 181 | | | 54 | 64 | stdux | Store Doubleword with Update Indexed | | X | 31 | 149 | | | 54 | 64 | stdx | Store Doubleword Indexed | | X | 31 | 68 | | | 74 | 64 | td | Trap Doubleword | | D | 2 | | | | 74 | 64 | tdi | Trap Doubleword Immediate | | XO | 31 | 266 | SR | | 63 | В | add[o][.] | Add | | XO | 31 | 10 | SR | | 64 | В | addc[o][.] | Add Carrying | | XO | 31 | 138 | SR | | 65 | В | adde[o][.] | Add Extended | | D | 14 | | | | 62 | В | addi | Add Immediate | | D | 12 | | SR | | 63 | В | addic | Add Immediate Carrying | | D | 13 | | SR | | 63 | В | addic. | Add Immediate Carrying and Record | | D | 15 | | | | 62 | В | addis | Add Immediate Shifted | | XO | 31 | 234 | SR | | 65 | В | addme[o][.] | Add to Minus One Extended | | XO | 31 | 202 | SR | | 66 | В | addze[o][.] | Add to Zero Extended | | Χ | 31 | 28 | SR | | 77 | В | and[.] | AND | | _ | Ор | code | <u>o</u> ⊤. | ۲, | | | | | |-------------|----------|------------|---------------------------|------|----------------------|------------------|---------------|-------------------------------------------------------------------------------| | X<br>Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | | 31 | 60 | SR | | 78 | В | andc[.] | AND with Complement | | D | 28 | | SR | | 75 | В | andi. | AND Immediate | | D | 29 | | SR | | 75 | В | andis. | AND Immediate Shifted | | | 18 | | ОТ | | 35 | В | b[l][a] | Branch | | В | 16 | <b>500</b> | CT | | 35 | В | bc[l][a] | Branch Conditional | | XL | 19 | 528 | CT | | 36 | В | bcctr[l] | Branch Conditional to Count Register | | XL | 19<br>31 | 16<br>0 | СТ | | 36 | B<br>B | bclr[l] | Branch Conditional to Link Register | | X | 31 | 508 | | | 71<br>79 | В | cmp | Compare Compare Bytes | | ĥ | 11 | 506 | | | 79 | В | cmpb<br>cmpi | Compare Immediate | | X | 31 | 32 | | | 72 | В | cmpl | Compare Logical | | Ď | 10 | 32 | | | 72 | В | cmpli | Compare Logical Immediate | | X | 31 | 26 | SR | | 79 | В | cntlzw[.] | Count Leading Zeros Word | | XL | 19 | 257 | 0.1 | | 37 | В | crand | Condition Register AND | | XL | 19 | 129 | | | 38 | В | crandc | Condition Register AND with Complement | | XL | 19 | 289 | | | 38 | В | creqv | Condition Register Equivalent | | XL | 19 | 225 | | | 37 | В | crnand | Condition Register NAND | | XL | 19 | 33 | | | 38 | В | crnor | Condition Register NOR | | XL | 19 | 449 | | | 37 | B | cror | Condition Register OR | | XL | 19 | 417 | | | 38 | В | crorc | Condition Register OR with Complement | | XL | 19 | 193 | | | 37 | В | crxor | Condition Register XOR | | X | 31 | 86 | | | 437 | В | dcbf | Data Cache Block Flush | | Х | 31 | 54 | | | 436 | В | dcbst | Data Cache Block Store | | Х | 31 | 278 | | | 434 | В | dcbt | Data Cache Block Touch | | Χ | 31 | 246 | | | 435 | В | dcbtst | Data Cache Block Touch for Store | | Χ | 31 | 1014 | | | 436 | В | dcbz | Data Cache Block set to Zero | | XO | 31 | 491 | SR | | 68 | В | divw[o][.] | Divide Word | | XO | 31 | 459 | SR | | 68 | В | divwu[o][.] | Divide Word Unsigned | | Χ | 31 | 284 | SR | | 78 | В | eqv[.] | Equivalent | | Χ | 31 | 954 | SR | | 79 | В | extsb[.] | Extend Sign Byte | | Χ | 31 | 922 | SR | | 79 | В | extsh[.] | Extend Sign Halfword | | Χ | 31 | 982 | | | 428 | В | icbi | Instruction Cache Block Invalidate | | XL | 19 | 150 | | | 440 | В | isync | Instruction Synchronize | | D | 34 | | | | 45 | В | lbz | Load Byte and Zero | | D | 35 | | | | 45 | В | lbzu | Load Byte and Zero with Update | | X | 31 | 119 | | | 45 | В | lbzux | Load Byte and Zero with Update Indexed | | X | 31 | 87 | | | 46 | В | lbzx | Load Byte and Zero Indexed | | D | 42 | | | | 47 | В | lha<br> | Load Halfword Algebraic | | D | 43 | | | | 47 | В | lhau | Load Halfword Algebraic with Update | | X | 31 | 375 | | | 47 | В | lhaux | Load Halfword Algebraic with Update Indexed | | X | 31 | 343 | | | 47 | В | lhax | Load Halfword Algebraic Indexed | | X | 31 | 790 | | | 55<br>46 | В | Ihbrx | Load Halfword and Zoro | | D | 40<br>41 | | | | 46<br>46 | В | lhz | Load Halfword and Zero | | D<br>X | 31 | 311 | | l | 46<br>46 | B<br>B | lhzu | Load Halfword and Zero with Update Load Halfword and Zero with Update Indexed | | X | 31 | 279 | | l | 46<br>46 | В | lhzux<br>lhzx | Load Halfword and Zero With Opdate Indexed | | Ď | 46 | 213 | | l | <del>4</del> 6<br>56 | В | Imw | Load Multiple Word | | X | 31 | 20 | | | 442 | В | lwarx | Load World And Reserve Indexed | | X | 31 | 534 | | | 55 | В | lwbrx | Load Word Byte-Reverse Indexed | | Ď | 32 | JJ- | | l | 48 | В | lwz | Load Word and Zero | | D | 33 | | | l | 48 | В | lwzu | Load Word and Zero with Update | | X | 31 | 55 | | l | 48 | В | lwzux | Load Word and Zero with Opdate Load Word and Zero with Update Indexed | | X | 31 | 23 | | | 48 | В | lwzx | Load Word and Zero Indexed | | XL | 19 | 0 | | | 38 | В | mcrf | Move Condition Register Field | | XFX | 31 | 19 | | l | 95 | В | mfcr | Move From Condition Register | | X | 31 | 83 | | Р | 503, | В | mfmsr | Move From Machine State Register | | <b> </b> ^` | | 55 | | | 625 | | | | | XFX | 31 | 339 | | 0 | 94,4 | В | mfspr | Move From Special Purpose Register | | `` ` ` ` | | 555 | | | 51 | | | | | XFX | 31 | 144 | | | 95 | В | mtcrf | Move To Condition Register Fields | | XFX | 31 | 467 | | 0 | 93 | В | mtspr | Move To Special Purpose Register | | _ | Ор | code | <u>o</u> ⁻. | ٦, | | 4 | | | |-----------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------|-------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | XO<br>XO<br>D<br>XO<br>X<br>X<br>X<br>X<br>D<br>D<br>X<br>M<br>M<br>M<br>SC | 31<br>31<br>31<br>31<br>31<br>31<br>31<br>24<br>25<br>31<br>20<br>21<br>23 | 75<br>11<br>235<br>476<br>104<br>124<br>444<br>412 | SR<br>SR<br>SR<br>SR<br>SR<br>SR<br>SR<br>SR<br>SR | | 67<br>67<br>67<br>67<br>77<br>66<br>78<br>77<br>78<br>75<br>76<br>80<br>84<br>82<br>83<br>39,<br>479, | 888888888888888 | mulhw[.] mulhwu[.] mulli mullw[o][.] nand[.] neg[o][.] nor[.] or[.] ori oris prtyw rlwimi[.] rlwinm[.] sc | Multiply High Word Multiply High Word Unsigned Multiply Low Immediate Multiply Low Word NAND Negate NOR OR OR With Complement OR Immediate OR Immediate Shifted Parity Word Rotate Left Word Immediate then Mask Insert Rotate Left Word then AND with Mask System Call | | X | 31<br>31<br>31<br>31<br>33<br>39<br>31<br>31<br>44<br>31<br>31<br>47<br>36<br>31<br>31<br>31<br>31<br>31<br>31<br>31<br>31<br>31<br>31<br>31<br>31<br>31 | 24<br>792<br>824<br>536<br>247<br>215<br>918<br>439<br>407<br>662<br>150<br>183<br>151<br>40<br>8<br>136<br>232<br>200<br>598<br>566 | SR SR SR SR SR SR SR SR SR | I | 613<br>88<br>89<br>89<br>88<br>51<br>51<br>51<br>52<br>55<br>52<br>52<br>57<br>53<br>53<br>54<br>54<br>54<br>54<br>65<br>66<br>66<br>65<br>65<br>65<br>65<br>65<br>65<br>65<br>65<br>65<br>65 | 888888888888888888888888888888888888888 | slw[.] sraw[.] sraw[.] srw[.] stb stbu stbux stbx sth sthbrx sthu sthux sthx stmw stw stw stwbrx stwcx. stwu stwux subf[o][.] subfe[o][.] subfe[o][.] subfeze[o][.] subfze[o][.] sync tlbsync | Shift Left Word Shift Right Algebraic Word Shift Right Algebraic Word Immediate Shift Right Word Store Byte Store Byte Store Byte with Update Store Byte with Update Indexed Store Byte Indexed Store Halfword Store Halfword Byte-Reverse Indexed Store Halfword with Update Store Halfword with Update Store Halfword Indexed Store Halfword Indexed Store Word Indexed Store Word Store Word Store Word Conditional Indexed Store Word with Update Store Word with Update Store Word with Update Store Word Indexed Store Word Indexed Subtract From Subtract From Carrying Subtract From Extended Subtract From Immediate Carrying Subtract From Jero Extended Subtract From Zero Extended Synchronize TLB Synchronize | | X<br>D<br>X<br>D<br>D<br>A<br>XFX<br>XFX<br>X<br>XO<br>X<br>X | 31<br>3<br>31<br>26<br>27<br>31<br>31<br>31<br>31<br>31<br>31<br>31<br>59 | 4<br>316<br>15<br>19<br>144<br>122<br>74<br>314<br>282<br>2 | SR<br>SR | H H H | 749<br>73<br>73<br>77<br>76<br>76<br>74<br>96<br>96<br>81<br>495<br>494<br>494<br>163 | B<br>B<br>B<br>B.in<br>B.in<br>B.in<br>BCDA<br>BCDA<br>BCDA<br>DFP | tw twi xor[.] xori xoris isel mfocrf mtocrf popcntb addg6s cbcdtd cdtbcd dadd | Trap Word Trap Word Immediate XOR XOR Immediate XOR Immediate Shifted Integer Select Move From One Condition Register Field Move To One Condition Register Field Population Count Bytes Add and Generate Sixes Convert Binary Coded Decimal to Declets Convert Declets To Binary Coded Decimal DFP Add | | E Pri | 1 [ | <b>~</b> ~ ~ | | 4 | | | |------------------|------------|------------------------------------------------|------------|------------------|-------------------------|--------------------------------------------------------------------------------| | V | Ext | Mode<br>Dep. <sup>1</sup><br>Priv <sup>1</sup> | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | X 63 | 2 | | 163 | DFP | daddq | DFP Add Quad | | X 63 | 802 | | 185 | DFP | dcffixq | DFP Convert From Fixed Quad | | X 59<br>X 63 | 130 | | 169 | DFP | dcmpo | DFP Compare Ordered | | X 63<br>X 59 | 130<br>642 | | 169 | DFP<br>DFP | dcmpoq | DFP Compare Ordered Quad DFP Compare Unordered | | X 63 | 642 | | 168<br>169 | DFP | dcmpu<br>dcmpuq | DFP Compare Unordered Quad | | X 59 | 258 | | 183 | DFP | dctdp | DFP Convert To DFP Long | | X 59 | 290 | | 185 | DFP | dctfix | DFP Convert To Fixed | | X 63 | 290 | | 185 | DFP | dctfixq | DFP Convert To Fixed Quad | | X 63 | 258 | | 183 | DFP | dctqpq | DFP Convert To DFP Extended | | X 59 | 322 | | 187 | DFP | ddedpd | DFP Decode DPD To BCD | | X 63 | 322 | | 187 | DFP | ddedpdq | DFP Decode DPD To BCD Quad | | X 59 | 546 | | 166 | DFP | ddiv | DFP Divide | | X 63 | 546 | | 166 | DFP | ddivq | DFP Divide Quad | | X 59 | 834 | | 187 | DFP | denbcd | DFP Encode BCD To DPD | | X 63 | 834 | | 187 | DFP | denbcdq | DFP Encode BCD To DPD Quad | | X 59<br>X 63 | 866<br>866 | | 188<br>188 | DFP<br>DFP | diex | DFP Insert Biased Exponent Quad | | X 59 | 34 | | 165 | DFP | diexq<br>dmul | DFP Insert Biased Exponent Quad DFP Multiply | | X 63 | 34 | | 165 | DFP | dmulq | DFP Multiply Quad | | Z 59 | 3 | | 174 | DFP | dqua | DFP Quantize | | Z23 59 | 67 | | 173 | DFP | dquai[.] | DFP Quantize Immediate | | Z23 63 | 67 | | 173 | DFP | dquaiq[.] | DFP Quantize Immediate Quad | | Z23 63 | 3 | | 174 | DFP | dquaq[.] | DFP Quantize Quad | | X 63 | 770 | | 184 | DFP | drdpq | DFP Round To DFP Long | | Z23 59 | 227 | | 181 | DFP | drintn[.] | DFP Round To FP Integer Without Inexact | | Z23 63 | 227 | | 181 | DFP | drintnq[.] | DFP Round To FP Integer Without Inexact Quad | | Z23 59<br>Z23 63 | 99<br>99 | | 179<br>179 | DFP<br>DFP | drintx[.]<br>drintxq[.] | DFP Round To FP Integer With Inexact DFP Round To FP Integer With Inexact Quad | | Z 59 | 35 | | 176 | DFP | drrnd | DFP Reround | | Z23 63 | 35 | | 176 | DFP | drrndq[.] | DFP Reround Quad | | X 59 | 770 | | 184 | DFP | drsp | DFP Round To DFP Short | | Z23 59 | 66 | | 190 | DFP | dscli[.] | DFP Shift Significand Left Immediate | | Z23 63 | 66 | | 190 | DFP | dscliq[.] | DFP Shift Significand Left Immediate Quad | | Z 59 | 98 | | 190 | DFP | dscri | DFP Shift Significand Right Immediate | | Z 63 | 98 | | 190 | DFP | dscriq | DFP Shift Significand Right Immediate Quad | | X 59<br>X 63 | 514 | | 163 | DFP<br>DFP | dsubg | DFP Subtract DFP Subtract Quad | | Z23 59 | 514<br>194 | | 163<br>170 | DFP | dsubq<br>dtstdc | DFP Test Data Class | | Z23 63 | 194 | | 170 | DFP | dtstdcq | DFP Test Data Class Quad | | Z23 59 | 226 | | 170 | DFP | dtstdg | DFP Test Data Group | | Z23 63 | 226 | | 170 | DFP | dtstdgq | DFP Test Data Group Quad | | X 59 | 162 | | 171 | DFP | dtstex | DFP Test Exponent | | X 63 | 162 | | 171 | DFP | dtstexq | DFP Test Exponent Quad | | X 59 | 674 | | 172 | DFP | dtstsf | DFP Test Significance | | X 63 | 674 | | 172 | DFP | dtstsfq | DFP Test Significance Quad | | X 59 | 354 | | 188 | DFP | dxex | DFP Extract Biased Exponent | | X 63<br>X 31 | 354<br>758 | | 188<br>433 | DFP<br>E | dxexq<br>dcba | DFP Extract Biased Exponent Quad Data Cache Block Allocate | | X 31 | 470 | Р | 652 | Ē | dcba | Data Cache Block Invalidate | | X 31 | 22 | | 428 | Ē | icbt | Instruction Cache Block Touch | | X 31 | 854 | | 448 | Ē | mbar | Memory Barrier | | X 31 | 512 | | 97 | Е | mcrxr | Move to Condition Register from XER | | X 31 | 275 | | 97 | Ε | mfapidi | Move From APID Indirect | | XFX 31 | 323 | P | 625 | E | mfdcr | Move From Device Control Register | | X 31 | 291 | Р | 97 | E | mfdcrux | Move From Device Control Register User-mode | | X 31 | 259 | Р | 625 | Е | mfdcrx | Indexed Move From Device Control Register Indexed | | XFX 31 | 451 | P | 624 | Ē | mtdcr | Move To Device Control Register | | X 31 | 419 | ' | 97 | Ē | mtdcrux | Move To Device Control Register User-mode Indexed | | X 31 | 387 | Р | 624 | Ε | mtdcrx | Move To Device Control Register Indexed | | X 31 | 146 | Р | 625 | Е | mtmsr | Move To Machine State Register | | _ | Opcode | | o ⁻. | ٦, | Page | ge Cat <sup>1</sup> | Mnomonic | Instruction | |------|--------|------|---------------------------|------|-------------|---------------------|-----------|------------------------------------------------------------------------| | Form | Pri | | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | XL | 19 | 51 | | Ρ | 614 | E | rfci | Return From Critical Interrupt | | XL | 19 | 50 | | Р | 613 | Ε | rfi | Return From Interrupt | | XL | 19 | 38 | | Р | 614 | E | rfmci | Return From Machine Check Interrupt | | X | 31 | 786 | | Р | 658,<br>747 | Е | tlbivax | TLB Invalidate Virtual Address Indexed | | Х | 31 | 946 | | Р | 658,<br>748 | Е | tlbre | TLB Read Entry | | Х | 31 | 914 | | Р | 659,<br>748 | Е | tlbsx | TLB Search Indexed | | Х | 31 | 978 | | Р | 660,<br>749 | Е | tlbwe | TLB Write Entry | | X | 31 | 131 | | Р | 626 | Е | wrtee | Write MSR External Enable | | X | 31 | 163 | | P | 626 | Ē | wrteei | Write MSR External Enable Immediate | | X | 31 | 326 | | Р | 730 | E.CD | dcread | Data Cache Read [Alternative Encoding] | | X | 31 | 486 | | Р | 730 | E.CD | dcread | Data Cache Read | | Χ | 31 | 998 | | Р | 731 | E.CD | icread | Instruction Cache Read | | Χ | 31 | 454 | | Р | 727 | E.CI | dci | Data Cache Invalidate | | X | 31 | 966 | | Р | 727 | E.CI | ici | Instruction Cache Invalidate | | XFX | 19 | 198 | | | 718 | E.ED | dnh | Debugger Notify Halt | | X | 19 | 39 | | Р | 614 | E.ED | rfdi | Return From Debug Interrupt | | Χ | 31 | 238 | | Р | 721 | E.PC | msgclr | Message Clear | | Χ | 31 | 206 | | Р | 721 | E.PC | msgsnd | Message Send | | X | 31 | 127 | | Р | 632 | E.PD | dcbfep | Data Cache Block Flush by External PID | | X | 31 | 63 | | Р | 631 | E.PD | dcbstep | Data Cache Block Store by External PID | | X | 31 | 319 | | Р | 631 | E.PD | dcbtep | Data Cache Block Touch by External PID | | Χ | 31 | 255 | | Р | 633 | E.PD | dcbtstep | Data Cache Block Touch for Store by External PID | | X | 31 | 1023 | | Р | 634 | E.PD | dcbzep | Data Cache Block set to Zero by External PID | | EVX | 31 | 285 | | Р | 636 | E.PD | evlddepx | Vector Load Doubleword into Doubleword by External | | | | | | _ | | | | Process ID Indexed | | EVX | 31 | 413 | | Р | 636 | E.PD | evstddepx | Vector Store Doubleword into Doubleword by External Process ID Indexed | | X | 31 | 991 | | Р | 634 | E.PD | icbiep | Instruction Cache Block Invalidate by External PID | | X | 31 | 95 | | P | 627 | E.PD | lbepx | Load Byte by External Process ID Indexed | | X | 31 | 29 | | P | 628 | E.PD | ldepx | Load Doubleword by External Process ID Indexed | | X | 31 | 607 | | Р | 635 | E.PD | lfdepx | Load Floating-Point Double by External Process ID Indexed | | X | 31 | 287 | | Р | 627 | E.PD | lhepx | Load Halfword by External Process ID Indexed | | X | 31 | 295 | | P | 637 | E.PD | lvepx | Load Vector by External Process ID Indexed | | X | 31 | 263 | | P | 637 | E.PD | lvepxl | Load Vector by External Process ID Indexed LRU | | X | 31 | 31 | | Р | 628 | E.PD | lwepx | Load Word by External Process ID Indexed | | X | 31 | 223 | | Р | 629 | E.PD | stbepx | Store Byte by External Process ID Indexed | | X | 31 | 157 | | Р | 630 | E.PD | stdepx | Store Doubleword by External Process ID Indexed | | Х | 31 | 735 | | Р | 635 | E.PD | stfdepx | Store Floating-Point Double by External Process ID Indexed | | X | 31 | 415 | | Р | 629 | E.PD | sthepx | Store Halfword by External Process ID Indexed | | Χ | 31 | 807 | | Р | 638 | E.PD | stvepx | Store Vector by External Process ID Indexed | | X | 31 | 775 | | Р | 638 | E.PD | stvepxl | Store Vector by External Process ID Indexed LRU | | X | 31 | 159 | | Р | 630 | E.PD | stwepx | Store Word by External Process ID Indexed | | XFX | 31 | 334 | | 0 | 756 | E.PM | mfpmr | Move From Performance Monitor Register | | XFX | 31 | 462 | | 0 | 756 | E.PM | mtpmr | Move To Performance Monitor Register | | X | 31 | 310 | | | 456 | EC | eciwx | External Control In Word Indexed | | X | 31 | 438 | | | 456 | EC | ecowx | External Control Out Word Indexed | | X | 31 | 390 | | М | 656 | ECL | dcblc | Data Cache Block Lock Clear | | Χ | 31 | 166 | | М | 655 | ECL | dcbtls | Data Cache Block Touch and Lock Set | | X | 31 | 134 | | М | 655 | ECL | dcbtstls | Data Cache Block Touch for Store and Lock Set | | X | 31 | 230 | | М | 657 | ECL | icblc | Instruction Cache Block Lock Clear | | X | 31 | 486 | | М | 656 | ECL | icbtls | Instruction Cache Block Touch and Lock Set | | X | 63 | 32 | | | 138 | FP | fcmpo | Floating Compare Ordered | | X | 63 | 0 | | | 138 | FP | fcmpu | Floating Compare Unordered | | D | 50 | - | | | 119 | FP | lfd | Load Floating-Point Double | | D | 51 | | | | 119 | FP | lfdu | Load Floating-Point Double with Update | | | | | 1 | | | | Ī | | | | E | Op | code | . 7 ge | ۲, | | - 1 | | | |---|---------|----------|------------|---------------------------|-----|------------|------------------|------------------------|-------------------------------------------------------------------------------------------| | | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | | X | 31 | 631 | | | 119 | FP | Ifdux | Load Floating-Point Double with Update Indexed | | | X<br>X | 31<br>31 | 599<br>855 | | | 119<br>120 | FP<br>FP | Ifdx<br>Ifiwax | Load Floating-Point Double Indexed Load Floating-Point as Integer Word Algebraic Indexed | | | Ď | 48 | 000 | | | 122 | FP | Ifs | Load Floating-Point as integer word Algebraic indexed | | | D | 49 | | | | 122 | FP | lfsu | Load Floating-Point Single with Update | | | X | 31 | 567 | | | 122 | FP | Ifsux | Load Floating-Point Single with Update Indexed | | | X | 31 | 535 | | | 122 | FP | Ifsx | Load Floating-Point Single Indexed | | | Χ | 63 | 64 | | | 140 | FP | mcrfs | Move to Condition Register from FPSCR | | | D | 54 | | | | 123 | FP | stfd | Store Floating-Point Double | | | D<br>X | 55<br>31 | 759 | | | 123<br>123 | FP<br>FP | stfdu<br>stfdux | Store Floating-Point Double with Update Store Floating-Point Double with Update Indexed | | | X | 31 | 727 | | | 123 | FP | stfdx | Store Floating-Point Double Indexed | | | X | 31 | 983 | | | 124 | FP | stfiwx | Store Floating-Point as Integer Word Indexed | | | D | 52 | | | | 122 | FP | stfs | Store Floating-Point Single | | | D | 53 | | | | 122 | FP | stfsu | Store Floating-Point Single with Update | | | Χ | 31 | 695 | | | 122 | FP | stfsux | Store Floating-Point Single with Update Indexed | | | X | 31 | 663 | | | 122 | FP | stfsx | Store Floating-Point Single Indexed | | | DS<br>X | 57<br>31 | 0<br>791 | | | 125<br>125 | FP.out | lfdp<br>lfdpy | Load Floating-Point Double Pair Load Floating-Point Double Pair Indexed | | | DS | 61 | 791 | | | 125 | FP.out<br>FP.out | Ifdpx<br>stfdp | Store Floating-Point Double Pair | | | X | 31 | 919 | | | 125 | FP.out | stfdpx | Store Floating-Point Double Pair Indexed | | • | X | 63 | 264 | | | 126 | FP[R] | fabs[.] | Floating Absolute Value | | | Α | 63 | 21 | | | 127 | FP[R] | fadd[.] | Floating Add | | | Α | 59 | 21 | | | 127 | FP[R] | fadds[.] | Floating Add Single | | | X | 63 | 846 | | | 136 | FP[R] | fcfid[.] | Floating Convert From Integer Doubleword | | I | X<br>X | 63<br>63 | 8<br>814 | | | 126<br>134 | FP[R]<br>FP[R] | fcpsgn[.]<br>fctid[.] | Floating Copy Sign Floating Convert To Integer Doubleword | | | X | 63 | 815 | | | 135 | FP[R] | fctidz[.] | Floating Convert To Integer Doubleword Floating Convert To Integer Doubleword with round | | | ^ | 00 | 010 | | | 100 | [] | Tottuz[.] | toward Zero | | | Χ | 63 | 14 | | | 135 | FP[R] | fctiw[.] | Floating Convert To Integer Word | | | Χ | 63 | 15 | | | 136 | FP[R] | fctiwz[.] | Floating Convert To Integer Word with round toward | | | | | 4.0 | | | 400 | ED/D1 | | Zero | | | A | 63<br>59 | 18<br>18 | | | 128<br>128 | FP[R] | fdiv[.] | Floating Divide | | | A<br>A | 63 | 29 | | | 132 | FP[R]<br>FP[R] | fdivs[.]<br>fmadd[.] | Floating Divide Single<br>Floating Multiply-Add | | | A | 59 | 29 | | | 132 | FP[R] | fmadds[.] | Floating Multiply-Add Single | | | X | 63 | 72 | | | 126 | FP[R] | fmr[.] | Floating Move Register | | | Α | 63 | 28 | | | 132 | FP[R] | fmsub[.] | Floating Multiply-Subtract | | | A | 59 | 28 | | | 132 | FP[R] | fmsubs[.] | Floating Multiply-Subtract Single | | | A | 63 | 25 | | | 128 | FP[R] | fmul[.] | Floating Multiply | | | A<br>X | 59<br>63 | 25<br>136 | | | 128<br>126 | FP[R]<br>FP[R] | fmuls[.]<br>fnabs[.] | Floating Multiply Single Floating Negative Absolute Value | | | X | 63 | 40 | | | 126 | FP[R] | fneg[.] | Floating Negative Absolute Value | | | A | 63 | 31 | | | 133 | FP[R] | fnmadd[.] | Floating Negative Multiply-Add | | | Α | 59 | 31 | | | 133 | FP[R] | fnmadds[.] | Floating Negative Multiply-Add Single | | | Α | 63 | 30 | | | 133 | FP[R] | fnmsub[.] | Floating Negative Multiply-Subtract | | | A | 59 | 30 | | | 133 | FP[R] | fnmsubs[.] | Floating Negative Multiply-Subtract Single | | | A | 63 | 24 | | | 129 | FP[R] | fre[.] | Floating Reciprocal Estimate | | | A<br>X | 59<br>63 | 24<br>12 | | | 129<br>134 | FP[R]<br>FP[R] | fres[.]<br>frsp[.] | Floating Reciprocal Estimate Single Floating Round to Single-Precision | | | A | 63 | 23 | | | 139 | FP[R] | fsel[.] | Floating Select | | | A | 63 | 22 | | | 129 | FP[R] | fsqrt[.] | Floating Square Root | | | Α | 59 | 22 | | | 129 | FP[R] | fsqrts[.] | Floating Square Root Single | | | A | 63 | 20 | | | 127 | FP[R] | fsub[.] | Floating Subtract | | | A | 59 | 20 | | | 127 | FP[R] | fsubs[.] | Floating Subtract Single | | | X<br>X | 63<br>63 | 583 | | | 140 | FP[R] | mffs[.] | Move From FPSCR | | | X | 63 | 70<br>38 | | | 142<br>142 | FP[R]<br>FP[R] | mtfsb0[.]<br>mtfsb1[.] | Move To FPSCR Bit 0 Move To FPSCR Bit 1 | | | XFL | 63 | 711 | | | 141 | FP[R] | mtfsf[.] | Move To FPSCR Fields | | | X | 63 | 134 | | | 141 | FP[R] | mtfsfi[.] | Move To FPSCR Field Immediate | | | X | 63 | 488 | | | 137 | FP[R].in | | Floating Round to Integer Minus | | _ | Ор | code | o۲. | 1 | | | | | |------|-----|------|---------------------------|------|------|------------------|----------------|---------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | X | 63 | 392 | | | 137 | FP[R].in | | Floating Round to Integer Nearest | | X | 63 | 456 | | | 137 | FP[R].in | | Floating Round to Integer Plus | | X | 63 | 424 | | | 137 | FP[R].in | | Floating Round to Integer Toward Zero | | Α | 63 | 26 | | | 130 | | frsqrte[.] | Floating Reciprocal Square Root Estimate | | A | 59 | 26 | | | 130 | | frsqrtes[.] | Floating Reciprocal Square Root Estimate Single | | XO | 4 | 172 | | | 351 | LMA | macchw[o][.] | Multiply Accumulate Cross Halfword to Word Modulo Signed | | хо | 4 | 236 | | | 351 | LMA | macchws[o][.] | Multiply Accumulate Cross Halfword to Word Saturate Signed | | хо | 4 | 204 | | | 352 | LMA | macchwsu[o][.] | Multiply Accumulate Cross Halfword to Word Saturate Unsigned | | ХО | 4 | 140 | | | 352 | LMA | macchwu[o][.] | Multiply Accumulate Cross Halfword to Word Modulo Unsigned | | хо | 4 | 44 | | | 353 | LMA | machhw[o][.] | Multiply Accumulate High Halfword to Word Modulo Signed | | хо | 4 | 108 | | | 353 | LMA | machhws[o][.] | Multiply Accumulate High Halfword to Word Saturate Signed | | хо | 4 | 76 | | | 354 | LMA | machhwsu[o][.] | Multiply Accumulate High Halfword to Word Saturate Unsigned | | хо | 4 | 12 | | | 354 | LMA | machhwu[o][.] | Multiply Accumulate High Halfword to Word Modulo Unsigned | | хо | 4 | 428 | | | 355 | LMA | maclhw[o][.] | Multiply Accumulate Low Halfword to Word Modulo | | хо | 4 | 492 | | | 355 | LMA | maclhws[o][.] | Signed Multiply Accumulate Low Halfword to Word Saturate | | хо | 4 | 460 | | | 356 | LMA | maclhwsu[o][.] | Signed Multiply Accumulate Low Halfword to Word Saturate | | хо | 4 | 396 | | | 356 | LMA | maclhwu[o][.] | Unsigned Multiply Accumulate Low Halfword to Word Modulo Unsigned | | X | 4 | 168 | | | 356 | LMA | mulchw[.] | Multiply Cross Halfword to Word Signed | | Χ | 4 | 136 | | | 356 | LMA | mulchwu[.] | Multiply Cross Halfword to Word Unsigned | | Χ | 4 | 40 | | | 357 | LMA | mulhhw[.] | Multiply High Halfword to Word Signed | | Χ | 4 | 8 | | | 357 | LMA | mulhhwu[.] | Multiply High Halfword to Word Unsigned | | X | 4 | 424 | | | 357 | LMA | mullhw[.] | Multiply Low Halfword to Word Signed | | X | 4 | 392 | | | 357 | LMA | mullhwu[.] | Multiply Low Halfword to Word Unsigned | | хо | 4 | 174 | | | 358 | LMA | nmacchw[o][.] | Negative Multiply Accumulate Cross Halfword to Word Modulo Signed | | хо | 4 | 238 | | | 358 | LMA | nmacchws[o][.] | Negative Multiply Accumulate Cross Halfword to Word Saturate Signed | | хо | 4 | 46 | | | 359 | LMA | nmachhw[o][.] | Negative Multiply Accumulate High Halfword to Word Modulo Signed | | хо | 4 | 110 | | | 359 | LMA | nmachhws[o][.] | Negative Multiply Accumulate High Halfword to Word Saturate Signed | | хо | 4 | 430 | | | 360 | LMA | nmaclhw[o][.] | Negative Multiply Accumulate Low Halfword to Word Modulo Signed | | хо | 4 | 494 | | | 360 | LMA | nmaclhws[o][.] | Negative Multiply Accumulate Low Halfword to Word Saturate Signed | | X | 31 | 78 | | | 349 | LMV | dlmzb[.] | Determine Leftmost Zero Byte | | DQ | 56 | , 0 | | Р | 493 | LSQ | lq | Load Quadword | | DS | 62 | 2 | | P | 493 | LSQ | stq | Store Quadword | | X | 31 | 597 | | | 59 | MA | Iswi | Load String Word Immediate | | X | 31 | 533 | | | 59 | MA | Iswx | Load String Word Indexed | | X | 31 | 725 | | | 60 | MA | stswi | Store String Word Immediate | | Х | 31 | 661 | | | 60 | MA | stswx | Store String Word Indexed | | XL | 19 | 402 | | Н | 482 | S | doze | Doze | | X | 31 | 854 | | | 448 | S | eieio | Enforce In-order Execution of I/O | | XL | 19 | 274 | | Н | 480 | S | hrfid | Hypervisor Return From Interrupt Doubleword | | X | 31 | 853 | | Н | 491 | S | Ibzcix | Load Byte and Zero Caching Inhibited Indexed | | X | 31 | 885 | | Н | 491 | S | Idcix | Load Doubleword Caching Inhibited Indexed | | X | 31 | 821 | 1 | Н | 491 | S | Ihzcix | Load Halfword and Zero Caching Inhibited Indexed | | 2 | - | Op | code | e Ţ. | ٦, | | | | | |-----|-----|-----|------|---------------------------|------|------|------------------|--------------|------------------------------------------------------------| | E | 5 | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | 7 | X | 31 | 789 | | Н | 491 | S | lwzcix | Load Word and Zero Caching Inhibited Indexed | | | Χ | 31 | 595 | 32 | Р | 538 | S | mfsr | Move From Segment Register | | | Χ | 31 | 659 | 32 | Р | 538 | S | mfsrin | Move From Segment Register Indirect | | | Χ | 31 | 146 | | Р | 501 | S | mtmsr | Move To Machine State Register | | | Χ | 31 | 178 | | Р | 502 | S | mtmsrd | Move To Machine State Register Doubleword | | | Χ | 31 | 210 | 32 | Р | 537 | S | mtsr | Move To Segment Register | | | X | 31 | 242 | 32 | Р | 537 | S | mtsrin | Move To Segment Register Indirect | | 7 | XL | 19 | 434 | | Н | 482 | S | nap | Nap | | | XL | 19 | 18 | | Р | 480 | S | rfid | Return From Interrupt Doubleword | | | XL | 19 | 498 | | Н | 483 | S | rvwinkle | Rip Van Winkle | | - 2 | Χ | 31 | 979 | SR | | 535 | S | slbfee. | SLB Find Entry ESID | | ) | Χ | 31 | 498 | | Р | 532 | S | slbia | SLB Invalidate All | | | Χ | 31 | 434 | | Р | 531 | S | slbie | SLB Invalidate Entry | | | Χ | 31 | 915 | | Р | 534 | S | slbmfee | SLB Move From Entry ESID | | | Χ | 31 | 851 | | Р | 534 | S | slbmfev | SLB Move From Entry VSID | | | Χ | 31 | 402 | | Р | 533 | S | slbmte | SLB Move To Entry | | 2 | XL | 19 | 466 | | Н | 483 | S | sleep | Sleep | | 2 | X | 31 | 981 | | Н | 492 | S | stbcix | Store Byte Caching Inhibited Indexed | | ) | X | 31 | 1013 | | Н | 492 | S | stdcix | Store Doubleword Caching Inhibited Indexed | | 1 | X | 31 | 949 | | Н | 492 | S | sthcix | Store Halfword Caching Inhibited Indexed | | 2 | X | 31 | 917 | | Н | 492 | S | stwcix | Store Word Caching Inhibited Indexed | | 1 | Χ | 31 | 370 | | Н | 542 | S | tlbia | TLB Invalidate All | | 2 | Χ | 31 | 306 | 64 | Н | 539 | S | tlbie | TLB Invalidate Entry | | 2 | Χ | 31 | 274 | 64 | Н | 541 | S | tlbiel | TLB Invalidate Entry Local | | 2 | XFX | 31 | 371 | | | 451 | S.out | mftb | Move From Time Base | | I | EVX | 4 | 527 | | | 268 | SP | brinc | Bit Reversed Increment | | ı | EVX | 4 | 520 | | | 268 | SP | evabs | Vector Absolute Value | | I | EVX | 4 | 514 | | | 268 | SP | evaddiw | Vector Add Immediate Word | | ı | EVX | 4 | 1225 | | | 268 | SP | evaddsmiaaw | Vector Add Signed, Modulo, Integer to Accumulator Word | | ļ | EVX | 4 | 1217 | | | 269 | SP | evaddssiaaw | Vector Add Signed, Saturate, Integer to Accumulator Word | | I | EVX | 4 | 1224 | | | 269 | SP | evaddumiaaw | Vector Add Unsigned, Modulo, Integer to Accumulator Word | | I | EVX | 4 | 1216 | | | 269 | SP | evaddusiaaw | Vector Add Unsigned, Saturate, Integer to Accumulator Word | | H | EVX | 4 | 512 | | | 269 | SP | evaddw | Vector Add Word | | | EVX | 4 | 529 | | | 270 | SP | evand | Vector AND | | I | EVX | 4 | 530 | | | 270 | SP | evandc | Vector AND with Complement | | | EVX | 4 | 564 | | | 270 | SP | evcmpeq | Vector Compare Equal | | | EVX | 4 | 561 | | | 270 | SP | evcmpgts | Vector Compare Greater Than Signed | | | EVX | 4 | 560 | | | 271 | SP | evcmpgtu | Vector Compare Greater Than Unsigned | | I | EVX | 4 | 563 | | | 271 | SP | evcmplts | Vector Compare Less Than Signed | | | EVX | 4 | 562 | | | 271 | SP | evcmpltu | Vector Compare Less Than Unsigned | | I | EVX | 4 | 526 | | | 272 | SP | eventisw | Vector Count Leading Signed Bits Word | | | EVX | 4 | 525 | | | 272 | SP | evcntlzw | Vector Count Leading Zeros Word | | | EVX | 4 | 1222 | | | 272 | SP | evdivws | Vector Divide Word Signed | | | EVX | 4 | 1223 | | | 273 | SP | evdivwu | Vector Divide Word Unsigned | | | EVX | 4 | 537 | | | 273 | SP | eveqv | Vector Equivalent | | | EVX | 4 | 522 | | | 273 | SP | evextsb | Vector Extend Sign Byte | | | EVX | 4 | 523 | | | 273 | SP | evextsh | Vector Extend Sign Halfword | | | EVX | 4 | 769 | | | 274 | SP | evldd | Vector Load Double Word into Double Word | | | EVX | 4 | 768 | | | 274 | SP | evlddx | Vector Load Double Word into Double Word Indexed | | | EVX | 4 | 773 | | | 274 | SP | evldh | Vector Load Double into Four Halfwords | | | EVX | 4 | 772 | | | 274 | SP | evldhx | Vector Load Double into Four Halfwords Indexed | | | EVX | 4 | 771 | | | 275 | SP | evldw | Vector Load Double into Two Words | | | EVX | 4 | 770 | | | 275 | SP | evldwx | Vector Load Double into Two Words Indexed | | | EVX | 4 | 777 | | | 275 | SP | evlhhesplat | Vector Load Halfword into Halfwords Even and Splat | | | EVX | 4 | 776 | | | 275 | SP | evlhhesplatx | Vector Load Halfword into Halfwords Even and Splat | | | | | | | | | | | Indexed | | E | Ор | Opcode o - | | | | - 1 | | | |------|-----|------------|--------------|------------|-------|------------------|---------------|------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. | ≧ Pa<br>L | ige ( | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 783 | | 2 | 76 | SP | evlhhossplat | Vector Load Halfword into Halfword Odd Signed and Splat | | EVX | 4 | 782 | | 2 | 76 | SP | evlhhossplatx | Vector Load Halfword into Halfword Odd Signed and Splat Indexed | | EVX | 4 | 781 | | 2 | 76 | SP | evlhhousplat | Vector Load Halfword into Halfword Odd Unsigned and Splat | | EVX | 4 | 780 | | 2 | 76 | SP | evlhhousplatx | Vector Load Halfword into Halfword Odd Unsigned and Splat Indexed | | EVX | 4 | 785 | | | 77 | SP | evlwhe | Vector Load Word into Two Halfwords Even | | EVX | 4 | 784 | | | 77 | SP | evlwhex | Vector Load Word into Two Halfwords Even Indexed | | EVX | 4 | 791 | | 2 | 77 | SP | evlwhos | Vector Load Word into Two Halfwords Odd Signed (with sign extension) | | EVX | 4 | 790 | | 2 | 77 | SP | evlwhosx | Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) | | EVX | 4 | 789 | | 2 | 78 | SP | evlwhou | Vector Load Word into Two Halfwords Odd Unsigned (zero-extended) | | EVX | 4 | 788 | | 2 | 78 | SP | evlwhoux | Vector Load Word into Two Halfwords Odd Unsigned Indexed (zero-extended) | | EVX | 4 | 797 | | 2 | 78 | SP | evlwhsplat | Vector Load Word into Two Halfwords and Splat | | EVX | 4 | 796 | | | 78 | SP | evlwhsplatx | Vector Load Word into Two Halfwords and Splat Indexed | | EVX | 4 | 793 | | 2 | 79 | SP | evlwwsplat | Vector Load Word into Word and Splat | | EVX | 4 | 792 | | 2 | 79 | SP | evlwwsplatx | Vector Load Word into Word and Splat Indexed | | EVX | 4 | 556 | | 2 | 79 | SP | evmergehi | Vector Merge High | | EVX | 4 | 558 | | | 80 | SP | evmergehilo | Vector Merge High/Low | | EVX | 4 | 557 | | | 79 | SP | evmergelo | Vector Merge Low | | EVX | 4 | 559 | | | 80 | SP | evmergelohi | Vector Merge Low/High | | EVX | 4 | 1323 | | 2 | 80 | SP | evmhegsmfaa | Vector Multiply Halfwords, Even, Guarded, Signed, | | EVX | 4 | 1451 | | 28 | 80 | SP | evmhegsmfan | Modulo, Fractional and Accumulate Vector Multiply Halfwords, Even, Guarded, Signed, | | EVX | 4 | 1321 | | 28 | 81 | SP | evmhegsmiaa | Modulo, Fractional and Accumulate Negative Vector Multiply Halfwords, Even, Guarded, Signed, | | | | | | | | | | Modulo, Integer and Accumulate | | EVX | 4 | 1449 | | | 81 | SP | evmhegsmian | Vector Multiply Halfwords, Even, Guarded, Signed,<br>Modulo, Integer and Accumulate Negative | | EVX | 4 | 1320 | | 28 | 81 | SP | evmhegumiaa | Vector Multiply Halfwords, Even, Guarded, Unsigned,<br>Modulo, Integer and Accumulate | | EVX | 4 | 1448 | | 28 | 81 | SP | evmhegumian | Vector Multiply Halfwords, Even, Guarded, Unsigned, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1035 | | 28 | 82 | SP | evmhesmf | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional | | EVX | 4 | 1067 | | 28 | 82 | SP | evmhesmfa | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional to Accumulator | | EVX | 4 | 1291 | | 28 | 82 | SP | evmhesmfaaw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional and Accumulate into Words | | EVX | 4 | 1419 | | 28 | 82 | SP | evmhesmfanw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX | 4 | 1033 | | 28 | 83 | SP | evmhesmi | Vector Multiply Halfwords, Even, Signed, Modulo, Inte- | | EVX | 4 | 1065 | | 28 | 83 | SP | evmhesmia | ger Vector Multiply Halfwords, Even, Signed, Modulo, Integer to Accumulator | | EVX | 4 | 1289 | | 28 | 83 | SP | evmhesmiaaw | Vector Multiply Halfwords, Even, Signed, Modulo, Integer and Accumulate into Words | | EVX | 4 | 1417 | | 28 | 83 | SP | evmhesmianw | Vector Multiply Halfwords, Even, Signed, Modulo, Inte- | | EVX | 4 | 1027 | | 28 | 84 | SP | evmhessf | ger and Accumulate Negative into Words Vector Multiply Halfwords, Even, Signed, Saturate, | | EVX | 4 | 1059 | | 28 | 84 | SP | evmhessfa | Fractional Vector Multiply Halfwords, Even, Signed, Saturate, Fractional to Accumulator | | ٤ | Op | code | a⊤. | ۲, | | - 1 | | | |------|----------|----------|--------------|-----|------|------------------|-------------|---------------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 1283 | | | 285 | SP | evmhessfaaw | Vector Multiply Halfwords, Even, Signed, Saturate, | | EVX | 4 | 1411 | | | 285 | SP | evmhessfanw | Fractional and Accumulate into Words Vector Multiply Halfwords, Even, Signed, Saturate, Fractional and Accumulate Negative into Words | | EVX | 4 | 1281 | | | 286 | SP | evmhessiaaw | Vector Multiply Halfwords, Even, Signed, Saturate, Inte- | | EVX | 4 | 1409 | | | 286 | SP | evmhessianw | ger and Accumulate into Words Vector Multiply Halfwords, Even, Signed, Saturate, Integer and Accumulate Negative into Words | | EVX | 4 | 1032 | | | 287 | SP | evmheumi | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer | | EVX | 4 | 1064 | | | 287 | SP | evmheumia | Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer to Accumulator | | EVX | 4 | 1288 | | | 287 | SP | evmheumiaaw | Vector Multiply Halfwords, Even, Unsigned, Modulo, | | EVX | 4 | 1416 | | | 287 | SP | evmheumianw | Integer and Accumulate into Words Vector Multiply Halfwords, Even, Unsigned, Modulo, Integer and Accumulate Negative into Words | | EVX | 4 | 1280 | | | 288 | SP | evmheusiaaw | Vector Multiply Halfwords, Even, Unsigned, Saturate, Integer and Accumulate into Words | | EVX | 4 | 1408 | | | 288 | SP | evmheusianw | Vector Multiply Halfwords, Even, Unsigned, Saturate, Integer and Accumulate Negative into Words | | EVX | 4 | 1327 | | | 289 | SP | evmhogsmfaa | Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Fractional and Accumulate | | EVX | 4 | 1455 | | | 289 | SP | evmhogsmfan | Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Fractional and Accumulate Negative | | EVX | 4 | 1325 | | | 289 | SP | evmhogsmiaa | Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Integer and Accumulate | | EVX | 4 | 1453 | | | 289 | SP | evmhogsmian | Vector Multiply Halfwords, Odd, Guarded, Signed, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1324 | | | 290 | SP | evmhogumiaa | Vector Multiply Halfwords, Odd, Guarded, Unsigned, Modulo, Integer and Accumulate | | EVX | 4 | 1452 | | | 290 | SP | evmhogumian | Vector Multiply Halfwords, Odd, Guarded, Unsigned, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1039 | | | 290 | SP | evmhosmf | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional | | EVX | 4 | 1071 | | | 290 | SP | evmhosmfa | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional to Accumulator | | EVX | 4 | 1295 | | | 291 | SP | evmhosmfaaw | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate into Words | | EVX | 4 | 1423 | | | 291 | SP | evmhosmfanw | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX | 4 | 1037 | | | 291 | SP | evmhosmi | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer | | EVX | 4 | 1069 | | | 291 | SP | evmhosmia | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer to Accumulator | | EVX | 4 | 1293 | | | 292 | SP | evmhosmiaaw | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer and Accumulate into Words | | EVX | 4 | 1421 | | | 291 | SP | evmhosmianw | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer and Accumulate Negative into Words | | EVX | 4 | 1031 | | | 293 | SP | evmhossf | Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional | | EVX | 4 | 1063 | | | 293 | SP | evmhossfa | Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional to Accumulator | | EVX | 4 | 1287 | | | 294 | SP | evmhossfaaw | Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate into Words | | EVX | 4 | 1415 | | | 294 | SP | evmhossfanw | Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate Negative into Words | | EVX | 4 | 1285 | | | 295 | SP | evmhossiaaw | Vector Multiply Halfwords, Odd, Signed, Saturate, Integer and Accumulate into Words | | | <b>l</b> | <u> </u> | <u> </u> | | | | l . | ger and Accumulate into Words | | _ | Ор | code | o ⁻. | Ļ | | | | | |------|-----|--------------|---------------------------|------|------------|------------------|-----------------------|--------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 1413 | | | 295 | SP | evmhossianw | Vector Multiply Halfwords, Odd, Signed, Saturate, Inte- | | EVX | 4 | 1036 | | | 295 | SP | evmhoumi | ger and Accumulate Negative into Words<br>Vector Multiply Halfwords, Odd, Unsigned, Modulo,<br>Integer | | EVX | 4 | 1068 | | | 295 | SP | evmhoumia | Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer to Accumulator | | EVX | 4 | 1292 | | | 296 | SP | evmhoumiaaw | Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer and Accumulate into Words | | EVX | 4 | 1420 | | | 292 | SP | evmhoumianw | Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer and Accumulate Negative into Words | | EVX | 4 | 1284 | | | 296 | SP | evmhousiaaw | Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer and Accumulate into Words | | EVX | 4 | 1412 | | | 296 | SP | evmhousianw | Vector Multiply Halfwords, Odd, Unsigned, Saturate, Integer and Accumulate Negative into Words | | EVX | 4 | 1220 | | | 297 | SP | evmra | Initialize Accumulator | | EVX | 4 | 1103 | | | 297 | SP | evmwhsmf | Vector Multiply Word High Signed, Modulo, Fractional | | EVX | 4 | 1135 | | | 297 | SP | evmwhsmfa | Vector Multiply Word High Signed, Modulo, Fractional to Accumulator | | EVX | 4 | 1101<br>1133 | | | 297<br>297 | SP<br>SP | evmwhsmi<br>evmwhsmia | Vector Multiply Word High Signed, Modulo, Integer Vector Multiply Word High Signed, Modulo, Integer to | | EVX | 4 | 1095 | | | 298 | SP | a | Accumulator Vector Multiply Word High Signed, Saturate, Fractional | | EVX | 4 | 1127 | | | 298 | SP | evmwhssf<br>evmwhssfa | Vector Multiply Word High Signed, Saturate, Fractional | | LVX | - | 1127 | | | 290 | OI | eviiiwiissia | to Accumulator | | EVX | 4 | 1100 | | | 298 | SP | evmwhumi | Vector Multiply Word High Unsigned, Modulo, Integer | | EVX | 4 | 1132 | | | 298 | SP | evmwhumia | Vector Multiply Word High Unsigned, Modulo, Integer to | | | 7 | 1102 | | | 200 | Oi | CVIIIWIIdiilla | Accumulator | | EVX | 4 | 1353 | | | 299 | SP | evmwlsmiaaw | Vector Multiply Word Low Signed, Modulo, Integer and Accumulate into Words | | EVX | 4 | 1481 | | | 299 | SP | evmwlsmianw | Vector Multiply Word Low Signed, Modulo, Integer and Accumulate Negative in Words | | EVX | 4 | 1345 | | | 299 | SP | evmwlssiaaw | Vector Multiply Word Low Signed, Saturate, Integer and Accumulate into Words | | EVX | 4 | 1473 | | | 299 | SP | evmwlssianw | Vector Multiply Word Low Signed, Saturate, Integer and Accumulate Negative in Words | | EVX | 4 | 1096 | | | 300 | SP | evmwlumi | Vector Multiply Word Low Unsigned, Modulo, Integer | | EVX | 4 | 1128 | | | 300 | SP | evmwlumia | Vector Multiply Word Low Unsigned, Modulo, Integer to Accumulator | | EVX | 4 | 1352 | | | 300 | SP | evmwlumiaaw | Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate into Words | | EVX | 4 | 1480 | | | 300 | SP | evmwlumianw | Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate Negative in Words | | EVX | 4 | 1344 | | | 301 | SP | evmwlusiaaw | Vector Multiply Word Low Unsigned, Saturate, Integer and Accumulate into Words | | EVX | 4 | 1472 | | | 301 | SP | evmwlusianw | Vector Multiply Word Low Unsigned, Saturate, Integer and Accumulate Negative in Words | | EVX | 4 | 1115 | | | 301 | SP | evmwsmf | Vector Multiply Word Signed, Modulo, Fractional | | EVX | 4 | 1147 | | | 301 | SP | evmwsmfa | Vector Multiply Word Signed, Modulo, Fractional to | | | | | | | | | | Accumulator | | EVX | 4 | 1371 | | | 302 | SP | evmwsmfaa | Vector Multiply Word Signed, Modulo, Fractional and Accumulate | | EVX | 4 | 1499 | | | 302 | SP | evmwsmfan | Vector Multiply Word Signed, Modulo, Fractional and Accumulate Negative | | EVX | 4 | 1113 | | l | 302 | SP | evmwsmi | Vector Multiply Word Signed, Modulo, Integer | | EVX | 4 | 1145 | | | 302 | SP | evmwsmia | Vector Multiply Word Signed, Modulo, Integer to Accumulator | | EVX | 4 | 1369 | | | 302 | SP | evmwsmiaa | Vector Multiply Word Signed, Modulo, Integer and Accumulate | | _ | Ор | code | <u>o</u> ~. | ۲, | | 4 | | | |------|-----|-------------|--------------|------|------------|------------------|--------------------------|--------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 1497 | | | 302 | SP | evmwsmian | Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1107 | | | 303 | SP | evmwssf | Vector Multiply Word Signed, Saturate, Fractional | | EVX | 4 | 1139 | | | 303 | SP | evmwssfa | Vector Multiply Word Signed, Saturate, Fractional to Accumulator | | EVX | 4 | 1363 | | | 303 | SP | evmwssfaa | Vector Multiply Word Signed, Saturate, Fractional and Accumulate | | EVX | 4 | 1491 | | | 304 | SP | evmwssfan | Vector Multiply Word Signed, Saturate, Fractional and Accumulate Negative | | EVX | 4 | 1112 | | | 304 | SP | evmwumi | Vector Multiply Word Unsigned, Modulo, Integer | | EVX | 4 | 1144 | | | 304 | SP | evmwumia | Vector Multiply Word Unsigned, Modulo, Integer to Accumulator | | EVX | 4 | 1368 | | | 305 | SP | evmwumiaa | Vector Multiply Word Unsigned, Modulo, Integer and Accumulate | | EVX | 4 | 1496 | | | 305 | SP | evmwumian | Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Negative | | EVX | 4 | 542 | | | 305 | SP | evnand | Vector NAND | | EVX | 4 | 521 | | | 305 | SP | evneg | Vector Negate | | EVX | 4 | 536 | | | 305 | SP | evnor | Vector NOR | | EVX | 4 | 535 | | | 306 | SP | evor | Vector OR | | EVX | 4 | 539 | | | 306 | SP | evorc | Vector OR with Complement | | EVX | 4 | 552 | | | 306 | SP | evrlw | Vector Rotate Left Word | | EVX | 4 | 554<br>524 | | | 307<br>307 | SP<br>SP | evrlwi<br>evrndw | Vector Rotate Left Word Immediate Vector Round Word | | EVS | 4 | 79 | | | 307 | SP | evsel | Vector Select | | EVX | 4 | 548 | | | 308 | SP | evslw | Vector Shift Left Word | | EVX | 4 | 550 | | | 308 | SP | evslwi | Vector Shift Left Word Immediate | | EVX | 4 | 555 | | | 308 | SP | evsplatfi | Vector Splat Fractional Immediate | | EVX | 4 | 553 | | | 308 | SP | evsplati | Vector Splat Immediate | | EVX | 4 | 547 | | | 308 | SP | evsrwis | Vector Shift Right Word Immediate Signed | | EVX | 4 | 546 | | | 308 | SP | evsrwiu | Vector Shift Right Word Immediate Unsigned | | EVX | 4 | 545 | | | 309 | SP | evsrws | Vector Shift Right Word Signed | | EVX | 4 | 544 | | | 309 | SP | evsrwu | Vector Shift Right Word Unsigned | | EVX | 4 | 801 | | | 309 | SP | evstdd | Vector Store Double of Double | | EVX | 4 | 800<br>805 | | | 309<br>310 | SP<br>SP | evstddx<br>evstdh | Vector Store Double of Double Indexed<br>Vector Store Double of Four Halfwords | | EVX | 4 | 804 | | | 310 | SP | evstdhx | Vector Store Double of Four Halfwords Indexed | | EVX | 4 | 803 | | | 310 | SP | evstdw | Vector Store Double of Two Words | | EVX | 4 | 802 | | | 310 | SP | evstdwx | Vector Store Double of Two Words Indexed | | EVX | 4 | 817 | | | 311 | SP | evstwhe | Vector Store Word of Two Halfwords from Even | | EVX | 4 | 816 | | | 311 | SP | evstwhex | Vector Store Word of Two Halfwords from Even Indexed | | EVX | 4 | 821 | | | 311 | SP | evstwho | Vector Store Word of Two Halfwords from Odd | | EVX | 4 | 820 | | | 311 | SP | evstwhox | Vector Store Word of Two Halfwords from Odd Indexed | | EVX | 4 | 825 | | | 311 | SP | evstwwe | Vector Store Word of Word from Even | | EVX | 4 | 824 | | | 311 | SP | evstwwex | Vector Store Word of Word from Even Indexed | | EVX | 4 | 829 | | | 312 | SP | evstwwo | Vector Store Word of Word from Odd | | EVX | 4 | 828<br>1227 | | | 312<br>312 | SP<br>SP | evstwwox<br>evsubfsmiaaw | Vector Store Word of Word from Odd Indexed<br>Vector Subtract Signed, Modulo, Integer to Accumulator<br>Word | | EVX | 4 | 1219 | | | 312 | SP | evsubfssiaaw | Vector Subtract Signed, Saturate, Integer to Accumulator Word | | EVX | 4 | 1226 | | | 313 | SP | evsubfumiaaw | Vector Subtract Unsigned, Modulo, Integer to Accumulator Word | | EVX | 4 | 1218 | | | 313 | SP | evsubfusiaaw | Vector Subtract Unsigned, Saturate, Integer to Accumulator Word | | EVX | 4 | 516 | | | 313 | SP | evsubfw | Vector Subtract from Word | | EVX | 4 | 518 | | | 313 | SP | evsubifw | Vector Subtract Immediate from Word | | EVX | 4 | 534 | | | 313 | SP | evxor | Vector XOR | | EVX | 4 | 740 | | | 335 | SP.FD | efdabs | Floating-Point Double-Precision Absolute Value | | EVX | 4 | 736 | | | 336 | SP.FD | efdadd | Floating-Point Double-Precision Add | | ٤ | Ор | code | . ر | ۲, | | - 1 | | | |-------------------|------------------|--------------------------|---------------------------|-----|--------------------------|----------------------------------|---------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 751 | | | 342 | SP.FD | efdcfs | Floating-Point Double-Precision Convert from Single-<br>Precision | | EVX | 4 | 755 | | | 340 | SP.FD | efdcfsf | Convert Floating-Point Double-Precision from Signed Fraction | | EVX | 4 | 753 | | | 339 | SP.FD | efdcfsi | Convert Floating-Point Double-Precision from Signed | | EVX | 4 | 739 | | | 340 | SP.FD | efdcfsid | Integer Convert Floating-Point Double-Precision from Signed Integer Doubleword | | EVX | 4 | 754 | | | 340 | SP.FD | efdcfuf | Convert Floating-Point Double-Precision from Unsigned Fraction | | EVX | 4 | 752 | | | 339 | SP.FD | efdcfui | Convert Floating-Point Double-Precision from Unsigned Integer | | EVX | 4 | 738 | | | 340 | SP.FD | efdcfuid | Convert Floating-Point Double-Precision from Unsigned Integer Doubleword | | EVX<br>EVX<br>EVX | 4<br>4<br>4<br>4 | 750<br>748<br>749<br>759 | | | 337<br>337<br>337<br>342 | SP.FD<br>SP.FD<br>SP.FD<br>SP.FD | efdcmpeq<br>efdcmpgt<br>efdcmplt<br>efdctsf | Floating-Point Double-Precision Compare Equal Floating-Point Double-Precision Compare Greater Than Floating-Point Double-Precision Compare Less Than Convert Floating-Point Double-Precision to Signed Fraction | | EVX | 4 | 757 | | | 340 | SP.FD | efdctsi | Convert Floating-Point Double-Precision to Signed Integer | | EVX | 4 | 747 | | | 341 | SP.FD | efdctsidz | Convert Floating-Point Double-Precision to Signed Integer Doubleword with Round toward Zero | | EVX | 4 | 762 | | | 342 | SP.FD | efdctsiz | Convert Floating-Point Double-Precision to Signed Integer with Round toward Zero | | EVX | 4 | 758 | | | 342 | SP.FD | efdctuf | Convert Floating-Point Double-Precision to Unsigned Fraction | | EVX | 4 | 756 | | | 340 | SP.FD | efdctui | Convert Floating-Point Double-Precision to Unsigned Integer | | EVX | 4 | 746 | | | 341 | SP.FD | efdctuidz | Convert Floating-Point Double-Precision to Unsigned Integer Doubleword with Round toward Zero | | EVX | 4 | 760 | | | 342 | SP.FD | efdctuiz | Convert Floating-Point Double-Precision to Unsigned Integer with Round toward Zero | | EVX | 4 | 745 | | | 336 | SP.FD | efddiv | Floating-Point Double-Precision Divide | | EVX | 4 | 744<br>741 | | | 336<br>335 | SP.FD<br>SP.FD | efdmul<br>efdnabs | Floating-Point Double-Precision Multiply Floating-Point Double-Precision Negative Absolute Value | | EVX | 4 | 742 | | | 335 | SP.FD | efdneg | Floating-Point Double-Precision Negate | | EVX | 4 | 737 | | | 336 | SP.FD | efdsub | Floating-Point Double-Precision Subtract | | EVX | 4 | 766<br>764 | | | 338<br>337 | SP.FD<br>SP.FD | efdtsteq<br>efdtstgt | Floating-Point Double-Precision Test Equal Floating-Point Double-Precision Test Greater Than | | EVX | 4 | 765 | | | 338 | SP.FD | efdtstlt | Floating-Point Double-Precision Test Greater Than Floating-Point Double-Precision Test Less Than | | EVX | 4 | 719 | | | 343 | SP.FD | efscfd | Floating-Point Single-Precision Convert from Double-Precision | | EVX | 4 | 708 | | | 328 | SP.FS | efsabs | Floating-Point Single-Precision Absolute Value | | EVX | 4 | 704<br>723 | | | 329<br>333 | SP.FS<br>SP.FS | efsadd<br>efscfsf | Floating-Point Single-Precision Add Convert Floating-Point Single-Precision from Signed | | EVX | 4 | 721 | | | 333 | SP.FS | efscfsi | Fraction Convert Floating-Point Single-Precision from Signed Integer | | EVX | 4 | 722 | | | 333 | SP.FS | efscfuf | Convert Floating-Point Single-Precision from Unsigned | | EVX | 4 | 720 | | | 333 | SP.FS | efscfui | Fraction Convert Floating-Point Single-Precision from Unsigned Integer | | EVX | 4 | 718 | | | 331 | SP.FS | efscmpeq | Floating-Point Single-Precision Compare Equal | | EVX | 4 | 716 | | | 330 | SP.FS | efscmpgt | Floating-Point Single-Precision Compare Greater Than | | EVX | 4 | 717<br>727 | | | 330 | SP.FS<br>SP.FS | efscmplt<br>efsctsf | Floating-Point Single-Precision Compare Less Than Convert Floating-Point Single-Precision to Signed Frac- | | EVA | 4 | 121 | | | 334 | 0F.F0 | C190191 | tion | | _ | Ope | code | <u>o</u> −. | | | | | | |------------|-----|------------|---------------------------|------|------------|------------------|------------------------|------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri∨ | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 725 | | | 333 | SP.FS | efsctsi | Convert Floating-Point Single-Precision to Signed Integer | | EVX | 4 | 730 | | | 334 | SP.FS | efsctsiz | Convert Floating-Point Single-Precision to Signed Integer with Round toward Zero | | EVX | 4 | 726 | | | 334 | SP.FS | efsctuf | Convert Floating-Point Single-Precision to Unsigned Fraction | | EVX | 4 | 724 | | | 333 | SP.FS | efsctui | Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 4 | 728 | | | 334 | SP.FS | efsctuiz | Convert Floating-Point Single-Precision to Unsigned Integer with Round toward Zero | | EVX | 4 | 713 | | | 329 | SP.FS | efsdiv | Floating-Point Single-Precision Divide | | EVX | 4 | 712 | | | 329 | SP.FS | efsmul | Floating-Point Single-Precision Multiply | | EVX | 4 | 709 | | | 328 | SP.FS | efsnabs | Floating-Point Single-Precision Negative Absolute Value | | EVX | 4 | 710 | | | 328 | SP.FS | efsneg | Floating-Point Single-Precision Negate | | EVX | 4 | 705 | | | 329 | SP.FS | efssub | Floating-Point Single-Precision Subtract | | EVX | 4 | 734 | | | 332 | SP.FS | efststeq | Floating-Point Single-Precision Test Equal | | EVX | 4 | 732 | | | 331 | SP.FS | efststgt | Floating-Point Single-Precision Test Greater Than | | EVX | 4 | 733 | | | 332 | SP.FS | efststlt | Floating-Point Single-Precision Test Less Than | | EVX | 4 | 644<br>640 | | | 319 | SP.FV<br>SP.FV | evfsabs | Vector Floating-Point Single-Precision Absolute Value | | EVX | 4 | 659 | | | 320<br>324 | SP.FV<br>SP.FV | evfsadd<br>evfscfsf | Vector Floating-Point Single-Precision Add Vector Convert Floating-Point Single-Precision from Signed Fraction | | EVX | 4 | 657 | | | 324 | SP.FV | evfscfsi | Vector Convert Floating-Point Single-Precision from Signed Integer | | EVX | 4 | 658 | | | 324 | SP.FV | evfscfuf | Vector Convert Floating-Point Single-Precision from Unsigned Fraction | | EVX | 4 | 656 | | | 324 | SP.FV | evfscfui | Vector Convert Floating-Point Single-Precision from Unsigned Integer | | EVX<br>EVX | 4 | 654<br>652 | | | 322<br>321 | SP.FV<br>SP.FV | evfscmpeq<br>evfscmpgt | Vector Floating-Point Single-Precision Compare Equal<br>Vector Floating-Point Single-Precision Compare<br>Greater Than | | EVX | 4 | 653 | | | 321 | SP.FV | evfscmplt | Vector Floating-Point Single-Precision Compare Less Than | | EVX | 4 | 663 | | | 326 | SP.FV | evfsctsf | Vector Convert Floating-Point Single-Precision to Signed Fraction | | EVX | 4 | 661 | | | 325 | SP.FV | evfsctsi | Vector Convert Floating-Point Single-Precision to Signed Integer | | EVX | 4 | 666 | | | 325 | SP.FV | evfsctsiz | Vector Convert Floating-Point Single-Precision to Signed Integer with Round toward Zero | | EVX | 4 | 662 | | | 326 | SP.FV | evfsctuf | Vector Convert Floating-Point Single-Precision to Unsigned Fraction | | EVX | 4 | 660 | | | 325 | SP.FV | evfsctui | Vector Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 4 | 664 | | | 325 | SP.FV | evfsctuiz | Vector Convert Floating-Point Single-Precision to Unsigned Integer with Round toward Zero | | EVX | 4 | 649 | | | 320 | SP.FV | evfsdiv | Vector Floating-Point Single-Precision Divide | | EVX | 4 | 648 | | | 320 | SP.FV | evfsmul | Vector Floating-Point Single-Precision Multiply | | EVX | 4 | 645 | | | 319 | SP.FV | evfsnabs | Vector Floating-Point Single-Precision Negative Absolute Value | | EVX | 4 | 646 | | | 319 | SP.FV | evfsneg | Vector Floating-Point Single-Precision Negate | | EVX | 4 | 641 | | | 320 | SP.FV | evfssub | Vector Floating-Point Single-Precision Subtract | | EVX | 4 | 670 | | | 323 | SP.FV | evfststeq | Vector Floating-Point Single-Precision Test Equal | | EVX | 4 | 668 | | | 322 | SP.FV | evfststgt | Vector Floating-Point Single-Precision Test Greater Than | | EVX | 4 | 669 | | | 323 | SP.FV | evfststlt | Vector Floating-Point Single-Precision Test Less Than | | X | 31 | 7 | | | 206 | V | lvebx | Load Vector Element Byte Indexed | | X | 31 | 39 | | | 203 | V | lvehx | Load Vector Element Halfword Indexed | | X | 31 | 71 | | | 203 | V | lvewx | Load Vector Element Word Indexed | | Χ | 31 | 6 | | | 208 | V | Ivsl | Load Vector for Shift Left Indexed | | _ | Ор | code | o – . | _ | | | | | |-----------|---------|--------------|---------------------------|-----|------------|------------------|-------------------------|---------------------------------------------------------------------------------------------------------| | X<br>Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | riv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | X | 31 | 38 | 2 ⊔ | F | 208 | V | lvsr | Load Vector for Shift Right Indexed | | X | 31 | 103 | | | 204 | V | lvx | Load Vector Indexed | | X | 31 | 359 | | | 204 | V | lvxl | Load Vector Indexed LRU | | VX | 4 | 1540 | | | 259 | V | mfvscr | Move From Vector Status and Control Register | | VX | 4 | 1604 | | | 259 | V | mtvscr | Move To Vector Status and Control Register | | Χ | 31 | 135 | | | 206 | V | stvebx | Store Vector Element Byte Indexed | | X | 31 | 167 | | | 206 | V | stvehx | Store Vector Element Halfword Indexed | | X | 31 | 199 | | | 207 | V | stvewx | Store Vector Element Word Indexed | | X | 31 | 231 | | | 204 | V | stvx | Store Vector Indexed | | X<br>VX | 31<br>4 | 487<br>384 | | | 207<br>220 | V<br>V | stvxl<br>vaddcuw | Store Vector Indexed LRU | | VX | 4 | 10 | | | 249 | V | vaddcuw<br>vaddfp | Vector Add and Write Carry-Out Unsigned Word Vector Add Single-Precision | | VX | 4 | 768 | | | 220 | V | vaddsbs | Vector Add Signed Byte Saturate | | VX | 4 | 832 | | | 220 | V | vaddshs | Vector Add Signed Halfword Saturate | | VX | 4 | 896 | | | 220 | V | vaddsws | Vector Add Signed Word Saturate | | VX | 4 | 0 | | | 221 | V | vaddubm | Vector Add Unsigned Byte Modulo | | VX | 4 | 512 | | | 222 | V | vaddubs | Vector Add Unsigned Byte Saturate | | VX | 4 | 64 | | | 221 | V | vadduhm | Vector Add Unsigned Halfword Modulo | | VX | 4 | 576 | | | 222 | V | vadduhs | Vector Add Unsigned Halfword Saturate | | VX | 4 | 128 | | | 221 | V | vadduwm | Vector Add Unsigned Word Modulo | | VX | 4 | 640 | | | 222 | V | vadduws | Vector Add Unsigned Word Saturate | | VX | 4 | 1028 | | | 244 | V | vand | Vector Logical AND | | VX<br>VX | 4 | 1092<br>1282 | | | 244<br>235 | V<br>V | vandc<br>vavgsb | Vector Logical AND with Complement | | VX | 4 | 1346 | | | 235 | V | vavgsb<br>vavgsh | Vector Average Signed Byte Vector Average Signed Halfword | | VX | 4 | 1410 | | | 235 | V | vavgsii | Vector Average Signed Word | | VX | 4 | 1026 | | | 236 | V | vavgub | Vector Average Unsigned Byte | | VX | 4 | 1090 | | | 236 | V | vavguh | Vector Average Unsigned Halfword | | VX | 4 | 1154 | | | 236 | V | vavguw | Vector Average Unsigned Word | | VX | 4 | 842 | | | 253 | V | vcfsx | Vector Convert From Signed Fixed-Point Word | | VX | 4 | 778 | | | 253 | V | vcfux | Vector Convert From Unsigned Fixed-Point Word | | VC | 4 | 966 | | | 255 | V | vcmpbfp[.] | Vector Compare Bounds Single-Precision | | VC | 4 | 198 | | | 255 | V | vcmpeqfp[.] | Vector Compare Equal To Single-Precision | | VC<br>VC | 4 | 6<br>70 | | | 241<br>241 | V | vcmpequb[.] | Vector Compare Equal To Unsigned Byte | | VC | 4 | 134 | | | 241 | V | vcmpequh[.] vcmpequw[.] | Vector Compare Equal To Unsigned Halfword Vector Compare Equal To Unsigned Word | | VC | 4 | 454 | | | 256 | V | vcmpgefp[.] | Vector Compare Equal 10 Onsigned Word Vector Compare Greater Than or Equal To Single-Pre- | | ' ' | • | .0. | | | 200 | • | vompgorp[i] | cision | | VC | 4 | 710 | | | 256 | V | vcmpgtfp[.] | Vector Compare Greater Than Single-Precision | | VC | 4 | 774 | | | 242 | V | vcmpgtsb[.] | Vector Compare Greater Than Signed Byte | | VC | 4 | 838 | | | 242 | V | vcmpgtsh[.] | Vector Compare Greater Than Signed Halfword | | VC | 4 | 902 | | | 242 | V | vcmpgtsw[.] | Vector Compare Greater Than Signed Word | | VC | 4 | 518 | | | 243 | V | vcmpgtub[.] | Vector Compare Greater Than Unsigned Byte | | VC | 4 | 582 | | | 243 | V | vcmpgtuh[.] | Vector Compare Greater Than Unsigned Halfword | | VC | 4 | 646 | | | 243 | V | vcmpgtuw[.] | Vector Compare Greater Than Unsigned Word | | VX<br>VX | 4 | 970<br>906 | | | 252<br>252 | V<br>V | vctsxs | Vector Convert To Signed Fixed-Point Word Saturate Vector Convert To Unsigned Fixed-Point Word Saturate | | VX | 4 | 394 | | | 25Z<br>257 | V | vexptefp | Vector Convert 16 Unsigned Fixed-Point Word Saturate Vector 2 Raised to the Exponent Estimate Floating- | | ** | • | 001 | | | 201 | v | νολρισιρ | Point Point | | VX | 4 | 458 | | | 257 | V | vlogefp | Vector Log Base 2 Estimate Floating-Point | | VA | 4 | 46 | | | 250 | V | vmaddfp | Vector Multiply-Add Single-Precision | | VX | 4 | 1034 | | | 251 | V | vmaxfp | Vector Maximum Single-Precision | | VX | 4 | 258 | | | 237 | V | vmaxsb | Vector Maximum Signed Byte | | VX | 4 | 322 | | | 237 | V | vmaxsh | Vector Maximum Signed Halfword | | VX | 4 | 386 | | | 237 | V | vmaxsw | Vector Maximum Signed Word | | VX | 4 | 2 | | | 238 | V | vmaxub | Vector Maximum Unsigned Byte | | VX<br>VX | 4 | 66<br>130 | | | 238<br>238 | V<br>V | vmaxuh | Vector Maximum Unsigned Halfword Vector Maximum Unsigned Word | | VA | 4 | 32 | | | 238 | V | vmaxuw<br>vmhaddshs | Vector Multiply-High-Add Signed Halfword Saturate | | VA | 4 | 33 | | | 228 | V | vmhraddshs | Vector Multiply-High-Round-Add Signed Halfword Satu- | | | | 33 | | | | • | | rate | | | | | 1 | | | | l | 1 1410 | | _ | Ор | code | o⊤. | _ | | | | | |----------|-----|------------|---------------------------|------|------------|------------------|--------------------|-----------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | VX | 4 | 1098 | | | 251 | V | vminfp | Vector Minimum Single-Precision | | VX | 4 | 770 | | | 239 | V | vminsb | Vector Minimum Signed Byte | | VX | 4 | 834 | | | 239 | V | vminsh | Vector Minimum Signed Halfword | | VX | 4 | 898 | | | 239 | V | vminsw | Vector Minimum Signed Word | | VX | 4 | 514 | | | 240 | V | vminub | Vector Minimum Unsigned Byte | | VX | 4 | 578 | | | 240 | V | vminuh | Vector Minimum Unsigned Halfword | | VX | 4 | 642 | | | 240 | V | vminuw | Vector Minimum Unsigned Word | | VA | 4 | 34 | | | 229 | V | vmladduhm | Vector Multiply-Low-Add Unsigned Halfword Modulo | | VX<br>VX | 4 | 12<br>76 | | | 214<br>214 | V<br>V | vmrghb<br>vmrghh | Vector Merge High Byte | | VX | 4 | 140 | | | 214 | V | vmrghw | Vector Merge High Halfword<br>Vector Merge High Word | | VX | 4 | 268 | | | 215 | V | vmrglb | Vector Merge Low Byte | | VX | 4 | 332 | | | 215 | V | vmrglh | Vector Merge Low Byte Vector Merge Low Halfword | | VX | 4 | 396 | | | 215 | V | vmrglw | Vector Merge Low Word | | VA | 4 | 37 | | | 230 | V | vmsummbm | Vector Multiply-Sum Mixed Byte Modulo | | VA | 4 | 40 | | | 230 | V | vmsumshm | Vector Multiply-Sum Signed Halfword Modulo | | VA | 4 | 41 | | | 231 | V | vmsumshs | Vector Multiply-Sum Signed Halfword Saturate | | VA | 4 | 36 | | | 229 | V | vmsumubm | Vector Multiply-Sum Unsigned Byte Modulo | | VA | 4 | 38 | | | 231 | V | vmsumuhm | Vector Multiply-Sum Unsigned Halfword Modulo | | VA | 4 | 39 | | | 232 | V | vmsumuhs | Vector Multiply-Sum Unsigned Halfword Saturate | | VX | 4 | 776 | | | 226 | V | vmulesb | Vector Multiply Even Signed Byte | | VX | 4 | 840 | | | 226 | V | vmulesh | Vector Multiply Even Signed Halfword | | VX | 4 | 520 | | | 226 | V | vmuleub | Vector Multiply Even Unsigned Byte | | VX | 4 | 584 | | | 226 | V | vmuleuh | Vector Multiply Even Unsigned Halfword | | VX | 4 | 264 | | | 227 | V | vmulosb | Vector Multiply Odd Signed Byte | | VX | 4 | 328 | | | 227 | V | vmulosh | Vector Multiply Odd Signed Halfword | | VX | 4 | 8 | | | 227 | V | vmuloub | Vector Multiply Odd Unsigned Byte | | VX | 4 | 72 | | | 227 | V | vmulouh | Vector Multiply Odd Unsigned Halfword | | VA | 4 | 47 | | | 250 | V | vnmsubfp | Vector Negative Multiply-Subtract Single-Precision | | VX | 4 | 1284 | | | 244 | V | vnor | Vector Logical NOR | | VX | 4 | 1156 | | | 244 | V | vor | Vector Logical OR | | VA | 4 | 43 | | | 217 | V | vperm | Vector Permute | | VX | 4 | 782 | | | 209 | V | vpkpx | Vector Pack Pixel | | VX | 4 | 398 | | | 210 | V | vpkshss | Vector Pack Signed Halfword Signed Saturate | | VX<br>VX | 4 | 270<br>462 | | | 210<br>210 | V<br>V | vpkshus | Vector Pack Signed Halfword Unsigned Saturate | | VX | 4 | 334 | | | 210 | V | vpkswss<br>vpkswus | Vector Pack Signed Word Signed Saturate Vector Pack Signed Word Unsigned Saturate | | VX | 4 | 14 | | | 211 | V | vpkswus | Vector Pack Unsigned Halfword Unsigned Modulo | | VX | 4 | 142 | | | 211 | V | vpkuhus | Vector Pack Unsigned Halfword Unsigned Saturate | | VX | 4 | 78 | | | 211 | V | vpkuwum | Vector Pack Unsigned Word Unsigned Modulo | | VX | 4 | 206 | | | 211 | V | vpkuwus | Vector Pack Unsigned Word Unsigned Saturate | | VX | 4 | 266 | | | 258 | V | vrefp | Vector Reciprocal Estimate Single-Precision | | VX | 4 | 714 | | | 254 | V | vrfim | Vector Round to Single-Precision Integer toward -Infinity | | VX | 4 | 522 | | | 254 | V | vrfin | Vector Round to Single-Precision Integer Nearest | | VX | 4 | 650 | | | 254 | V | vrfip | Vector Round to Single-Precision Integer toward +Infinity | | VX | 4 | 586 | | | 254 | V | vrfiz | Vector Round to Single-Precision Integer toward Zero | | VX | 4 | 4 | | | 245 | V | vrlb | Vector Rotate Left Byte | | VX | 4 | 68 | | | 245 | V | vrlh | Vector Rotate Left Halfword | | VX | 4 | 132 | | ĺ | 245 | V | vrlw | Vector Rotate Left Word | | VX | 4 | 330 | | | 258 | V | vrsqrtefp | Vector Reciprocal Square Root Estimate Single-Precision | | VA | 4 | 42 | | | 217 | V | vsel | Vector Select | | VX | 4 | 452 | | | 218 | V | vsl | Vector Shift Left | | VX | 4 | 260 | | | 246 | V | vslb | Vector Shift Left Byte | | VA | 4 | 44 | | | 218 | V | vsldoi | Vector Shift Left Double by Octet Immediate | | VX | 4 | 324 | | | 246 | V | vslh | Vector Shift Left Halfword | | VX | 4 | 1036 | | | 218 | V | vslo | Vector Shift Left by Octet | | VX | 4 | 388 | | | 246 | V | vslw | Vector Shift Left Word | | VX | 4 | 524 | | | 216 | V | vspltb | Vector Splat Byte | | | Ор | code | ω | L | | | | | |------|-----|------|---------------------------|-------------------|------|------------------|----------|----------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv <sup>1</sup> | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | VX | 4 | 588 | | | 216 | V | vsplth | Vector Splat Halfword | | VX | 4 | 780 | | | 216 | V | vspltisb | Vector Splat Immediate Signed Byte | | VX | 4 | 844 | | | 216 | V | vspltish | Vector Splat Immediate Signed Halfword | | VX | 4 | 908 | | | 216 | V | vspltisw | Vector Splat Immediate Signed Word | | VX | 4 | 652 | | | 216 | V | vspltw | Vector Splat Word | | VX | 4 | 708 | | | 219 | V | vsr | Vector Shift Right | | VX | 4 | 772 | | | 248 | V | vsrab | Vector Shift Right Algebraic Byte | | VX | 4 | 836 | | | 248 | V | vsrah | Vector Shift Right Algebraic Halfword | | VX | 4 | 900 | | | 248 | V | vsraw | Vector Shift Right Algebraic Word | | VX | 4 | 516 | | | 247 | V | vsrb | Vector Shift Right Byte | | VX | 4 | 580 | | | 247 | V | vsrh | Vector Shift Right Halfword | | VX | 4 | 1100 | | | 219 | V | vsro | Vector Shift Right by Octet | | VX | 4 | 644 | | | 247 | V | vsrw | Vector Shift Right Word | | VX | 4 | 1408 | | | 223 | V | vsubcuw | Vector Subtract and Write Carry-Out Unsigned Word | | VX | 4 | 74 | | | 249 | V | vsubfp | Vector Subtract Single-Precision | | VX | 4 | 1792 | | | 223 | V | vsubsbs | Vector Subtract Signed Byte Saturate | | VX | 4 | 1856 | | | 223 | V | vsubshs | Vector Subtract Signed Halfword Saturate | | VX | 4 | 1920 | | | 223 | V | vsubsws | Vector Subtract Signed Word Saturate | | VX | 4 | 1024 | | | 224 | V | vsububm | Vector Subtract Unsigned Byte Modulo | | VX | 4 | 1536 | | | 225 | V | vsububs | Vector Subtract Unsigned Byte Saturate | | VX | 4 | 1088 | | | 224 | V | vsubuhm | Vector Subtract Unsigned Halfword Modulo | | VX | 4 | 1600 | | | 224 | V | vsubuhs | Vector Subtract Unsigned Halfword Saturate | | VX | 4 | 1152 | | | 224 | V | vsubuwm | Vector Subtract Unsigned Word Modulo | | VX | 4 | 1664 | | | 225 | V | vsubuws | Vector Subtract Unsigned Word Saturate | | VX | 4 | 1672 | | | 233 | V | vsum2sws | Vector Sum across Half Signed Word Saturate | | VX | 4 | 1800 | | | 234 | V | vsum4sbs | Vector Sum across Quarter Signed Byte Saturate | | VX | 4 | 1608 | | | 234 | V | vsum4shs | Vector Sum across Quarter Signed Halfword Saturate | | VX | 4 | 1544 | | | 234 | V | vsum4ubs | Vector Sum across Quarter Unsigned Byte Saturate | | VX | 4 | 1928 | | | 233 | V | vsumsws | Vector Sum across Signed Word Saturate | | VX | 4 | 846 | | | 212 | V | vupkhpx | Vector Unpack High Pixel | | VX | 4 | 526 | | | 212 | V | vupkhsb | Vector Unpack High Signed Byte | | VX | 4 | 590 | | | 212 | V | vupkhsh | Vector Unpack High Signed Halfword | | VX | 4 | 974 | | | 213 | V | vupklpx | Vector Unpack Low Pixel | | VX | 4 | 654 | | | 213 | V | vupklsb | Vector Unpack Low Signed Byte | | VX | 4 | 718 | | | 213 | V | vupklsh | Vector Unpack Low Signed Halfword | | VX | 4 | 1220 | | | 244 | V | vxor | Vector Logical XOR | | Χ | 31 | 62 | | | 449 | WT | wait | Wait | <sup>1</sup> See the key to the mode dependency and privilege columns on page 905 and the key to the category column in Section 1.3.5 of Book I. ## **Appendix I. Power ISA Instruction Set Sorted by Opcode** This appendix lists all the instructions in the Power ISA, in order by opcode | _ | Op | code | оr. | 1 | | | | | |------|-----|-------------|------------|------|------|------------------|----------------|------------------------------------------------------------------| | Form | Pri | code<br>Ext | Mod<br>Dep | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | D | 2 | | | | 74 | 64 | tdi | Trap Doubleword Immediate | | D | 3 | | | | 73 | В | twi | Trap Word Immediate | | VX | 4 | 0 | | | 221 | V | vaddubm | Vector Add Unsigned Byte Modulo | | VX | 4 | 2 | | | 238 | V | vmaxub | Vector Maximum Unsigned Byte | | VX | 4 | 4 | | | 245 | V | vrlb | Vector Rotate Left Byte | | VC | 4 | 6 | | | 241 | V | vcmpequb[.] | Vector Compare Equal To Unsigned Byte | | Χ | 4 | 8 | | | 357 | LMA | mulhhwu[.] | Multiply High Halfword to Word Unsigned | | VX | 4 | 8 | | | 227 | V | vmuloub | Vector Multiply Odd Unsigned Byte | | VX | 4 | 10 | | | 249 | V | vaddfp | Vector Add Single-Precision | | XO | 4 | 12 | | | 354 | LMA | machhwu[o][.] | Multiply Accumulate High Halfword to Word Modulo Unsigned | | VX | 4 | 12 | | | 214 | V | vmrghb | Vector Merge High Byte | | VX | 4 | 14 | | | 211 | V | vpkuhum | Vector Pack Unsigned Halfword Unsigned Modulo | | VA | 4 | 32 | | | 228 | V | vmhaddshs | Vector Multiply-High-Add Signed Halfword Saturate | | VA | 4 | 33 | | | 228 | V | vmhraddshs | Vector Multiply-High-Round-Add Signed Halfword Saturate | | VA | 4 | 34 | | | 229 | V | vmladduhm | Vector Multiply-Low-Add Unsigned Halfword Modulo | | VA | 4 | 36 | | | 229 | V | vmsumubm | Vector Multiply-Sum Unsigned Byte Modulo | | VA | 4 | 37 | | | 230 | V | vmsummbm | Vector Multiply-Sum Mixed Byte Modulo | | VA | 4 | 38 | | | 231 | V | vmsumuhm | Vector Multiply-Sum Unsigned Halfword Modulo | | VA | 4 | 39 | | | 232 | V | vmsumuhs | Vector Multiply-Sum Unsigned Halfword Saturate | | Χ | 4 | 40 | | | 357 | LMA | mulhhw[.] | Multiply High Halfword to Word Signed | | VA | 4 | 40 | | | 230 | V | vmsumshm | Vector Multiply-Sum Signed Halfword Modulo | | VA | 4 | 41 | | | 231 | V | vmsumshs | Vector Multiply-Sum Signed Halfword Saturate | | VA | 4 | 42 | | | 217 | V | vsel | Vector Select | | VA | 4 | 43 | | | 217 | V | vperm | Vector Permute | | ХО | 4 | 44 | | | 353 | LMA | machhw[o][.] | Multiply Accumulate High Halfword to Word Modulo<br>Signed | | VA | 4 | 44 | | | 218 | V | vsldoi | Vector Shift Left Double by Octet Immediate | | ХО | 4 | 46 | | | 359 | LMA | nmachhw[o][.] | Negative Multiply Accumulate High Halfword to Word Modulo Signed | | VA | 4 | 46 | | | 250 | V | vmaddfp | Vector Multiply-Add Single-Precision | | VA | 4 | 47 | | | 250 | V | vnmsubfp | Vector Negative Multiply-Subtract Single-Precision | | VX | 4 | 64 | | | 221 | V | vadduhm | Vector Add Unsigned Halfword Modulo | | VX | 4 | 66 | | | 238 | V | vmaxuh | Vector Maximum Unsigned Halfword | | VX | 4 | 68 | | | 245 | V | vrlh | Vector Rotate Left Halfword | | VC | 4 | 70 | | | 241 | V | vcmpequh[.] | Vector Compare Equal To Unsigned Halfword | | VX | 4 | 72 | | | 227 | V | vmulouh | Vector Multiply Odd Unsigned Halfword | | VX | 4 | 74 | | | 249 | V | vsubfp | Vector Subtract Single-Precision | | ХО | 4 | 76 | | | 354 | LMA | machhwsu[o][.] | Multiply Accumulate High Halfword to Word Saturate Unsigned | | VX | 4 | 76 | | | 214 | V | vmrghh | Vector Merge High Halfword | | VX | 4 | 78 | | | 211 | V | vpkuwum | Vector Pack Unsigned Word Unsigned Modulo | | EVS | 4 | 79 | | | 307 | SP | evsel | Vector Select | | хо | 4 | 108 | | | 353 | LMA | machhws[o][.] | Multiply Accumulate High Halfword to Word Saturate Signed | | E | Opcode | | e | ۲, | | - 1 | | | |-------|--------|-----|---------------------------|------|------|------------------|----------------|---------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | XO | 4 | 110 | | | 359 | LMA | nmachhws[o][.] | Negative Multiply Accumulate High Halfword to Word Saturate Signed | | VX | 4 | 128 | | | 221 | V | vadduwm | Vector Add Unsigned Word Modulo | | VX | 4 | 130 | | | 238 | V | vmaxuw | Vector Maximum Unsigned Word | | VX | 4 | 132 | | | 245 | V | vrlw | Vector Rotate Left Word | | VC | 4 | 134 | | | 242 | V | vcmpequw[.] | Vector Compare Equal To Unsigned Word | | X | 4 | 136 | | | 356 | LMA | mulchwu[.] | Multiply Cross Halfword to Word Unsigned | | XO | 4 | 140 | | | 352 | LMA | macchwu[o][.] | Multiply Accumulate Cross Halfword to Word Modulo | | | | | | | | | | Unsigned | | VX | 4 | 140 | | | 214 | V | vmrghw | Vector Merge High Word | | VX | 4 | 142 | | | 211 | V | vpkuhus | Vector Pack Unsigned Halfword Unsigned Saturate | | X | 4 | 168 | | | 356 | LMA | mulchw[.] | Multiply Cross Halfword to Word Signed | | хо | 4 | 172 | | | 351 | LMA | macchw[o][.] | Multiply Accumulate Cross Halfword to Word Modulo Signed | | ХО | 4 | 174 | | | 358 | LMA | nmacchw[o][.] | Negative Multiply Accumulate Cross Halfword to Word Modulo Signed | | VC | 4 | 198 | | | 255 | V | vcmpeqfp[.] | Vector Compare Equal To Single-Precision | | хо | 4 | 204 | | | 352 | LMA | macchwsu[o][.] | Multiply Accumulate Cross Halfword to Word Saturate Unsigned | | VX | 4 | 206 | | | 211 | V | vpkuwus | Vector Pack Unsigned Word Unsigned Saturate | | XO | 4 | 236 | | | 351 | LMA | macchws[o][.] | Multiply Accumulate Cross Halfword to Word Saturate | | ' ' ' | • | | | | | | | Signed | | хо | 4 | 238 | | | 358 | LMA | nmacchws[o][.] | Negative Multiply Accumulate Cross Halfword to Word Saturate Signed | | VX | 4 | 258 | | | 237 | V | vmaxsb | Vector Maximum Signed Byte | | VX | 4 | 260 | | | 246 | V | vslb | Vector Shift Left Byte | | VX | 4 | 264 | | | 227 | V | vmulosb | Vector Multiply Odd Signed Byte | | VX | 4 | 266 | | | 258 | V | vrefp | Vector Reciprocal Estimate Single-Precision | | VX | 4 | 268 | | | 215 | V | vmrglb | Vector Merge Low Byte | | VX | 4 | 270 | | | 210 | V | vpkshus | Vector Pack Signed Halfword Unsigned Saturate | | VX | 4 | 322 | | | 237 | V | vmaxsh | Vector Maximum Signed Halfword | | VX | 4 | 324 | | | 246 | V | vslh | Vector Shift Left Halfword | | VX | 4 | 328 | | | 227 | V | vmulosh | Vector Multiply Odd Signed Halfword | | VX | 4 | 330 | | | 258 | V | | | | | | | | | 200 | | vrsqrtefp | Vector Reciprocal Square Root Estimate Single-Precision | | VX | 4 | 332 | | | 215 | V | vmrglh | Vector Merge Low Halfword | | VX | 4 | 334 | | | 210 | V | vpkswus | Vector Pack Signed Word Unsigned Saturate | | VX | 4 | 384 | | | 220 | V | vaddcuw | Vector Add and Write Carry-Out Unsigned Word | | VX | 4 | 386 | | | 237 | V | vmaxsw | Vector Maximum Signed Word | | VX | 4 | 388 | | | 246 | V | vslw | Vector Shift Left Word | | X | 4 | 392 | | | 357 | LMA | mullhwu[.] | Multiply Low Halfword to Word Unsigned | | VX | 4 | 394 | | | 257 | V | vexptefp | Vector 2 Raised to the Exponent Estimate Floating-<br>Point | | хо | 4 | 396 | | | 356 | LMA | maclhwu[o][.] | Multiply Accumulate Low Halfword to Word Modulo Unsigned | | VX | 4 | 396 | | | 215 | V | vmrglw | Vector Merge Low Word | | VX | 4 | 398 | | | 210 | V | vpkshss | Vector Pack Signed Halfword Signed Saturate | | X | 4 | 424 | | l | 357 | LMA | mullhw[.] | Multiply Low Halfword to Word Signed | | ХО | 4 | 428 | | | 355 | LMA | maclhw[o][.] | Multiply Accumulate Low Halfword to Word Modulo Signed | | хо | 4 | 430 | | | 360 | LMA | nmaclhw[o][.] | Negative Multiply Accumulate Low Halfword to Word Modulo Signed | | VX | 4 | 452 | | | 218 | V | vsl | Vector Shift Left | | VC | 4 | 454 | | | 256 | V | vcmpgefp[.] | Vector Compare Greater Than or Equal To Single-Pre- | | | | | | | | • | . 3692,6[,] | cision | | VX | 4 | 458 | | | 257 | V | vlogefp | Vector Log Base 2 Estimate Floating-Point | | XO | 4 | 460 | | | 356 | LMA | maclhwsu[o][.] | Multiply Accumulate Low Halfword to Word Saturate | | | | .50 | | l | | 717 ( | | Unsigned | | VX | 4 | 462 | | | 210 | V | vpkswss | Vector Pack Signed Word Signed Saturate | | L.,, | | | 1 | | | • | 1 - 12 | | | _ | Ор | code | <u>o</u> −. | _ | | | | | |-----------|-----|------------|---------------------------|------|------------|------------------|----------------------|----------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | XO | 4 | 492 | | | 355 | LMA | maclhws[o][.] | Multiply Accumulate Low Halfword to Word Saturate | | хо | 4 | 494 | | | 360 | LMA | nmaclhws[o][.] | Signed Negative Multiply Accumulate Low Halfword to Word | | EVX | 4 | 512 | | | 269 | SP | evaddw | Saturate Signed<br>Vector Add Word | | VX | 4 | 512 | | | 222 | V | vaddubs | Vector Add Unsigned Byte Saturate | | EVX | 4 | 514 | | | 268 | SP | evaddiw | Vector Add Immediate Word | | VX | 4 | 514 | | | 240 | V | vminub | Vector Minimum Unsigned Byte | | EVX | 4 | 516 | | | 313 | SP | evsubfw | Vector Subtract from Word | | VX | 4 | 516 | | | 247 | V | vsrb | Vector Shift Right Byte | | EVX | 4 | 518 | | | 313 | SP | evsubifw | Vector Subtract Immediate from Word | | VC<br>EVX | 4 | 518<br>520 | | | 243<br>268 | V<br>SP | vcmpgtub[.] | Vector Compare Greater Than Unsigned Byte Vector Absolute Value | | VX | 4 | 520 | | | 226 | ۶۲<br>۷ | evabs<br>vmuleub | Vector Multiply Even Unsigned Byte | | EVX | 4 | 521 | | | 305 | SP | evneg | Vector Negate | | EVX | 4 | 522 | | | 273 | SP | evextsb | Vector Extend Sign Byte | | VX | 4 | 522 | | | 254 | V | vrfin | Vector Round to Single-Precision Integer Nearest | | EVX | 4 | 523 | | | 273 | SP | evextsh | Vector Extend Sign Halfword | | EVX | 4 | 524 | | | 307 | SP | evrndw | Vector Round Word | | VX | 4 | 524 | | | 216 | V | vspltb | Vector Splat Byte | | EVX | 4 | 525 | | | 272 | SP | evcntlzw | Vector Count Leading Zeros Word | | EVX | 4 | 526 | | | 272 | SP | evcntlsw | Vector Count Leading Signed Bits Word | | VX | 4 | 526 | | | 212 | V | vupkhsb | Vector Unpack High Signed Byte | | EVX | 4 | 527 | | | 268 | SP | brinc | Bit Reversed Increment | | EVX | 4 | 529<br>530 | | | 270<br>270 | SP<br>SP | evand<br>evandc | Vector AND Vector AND with Complement | | EVX | 4 | 534 | | | 313 | SP | evando | Vector XOR | | EVX | 4 | 535 | | | 306 | SP | evor | Vector OR | | EVX | 4 | 536 | | | 305 | SP | evnor | Vector NOR | | EVX | 4 | 537 | | | 273 | SP | eveqv | Vector Equivalent | | EVX | 4 | 539 | | | 306 | SP | evorc | Vector OR with Complement | | EVX | 4 | 542 | | | 305 | SP | evnand | Vector NAND | | EVX | 4 | 544 | | | 309 | SP | evsrwu | Vector Shift Right Word Unsigned | | EVX | 4 | 545 | | | 309 | SP | evsrws | Vector Shift Right Word Signed | | EVX | 4 | 546 | | | 308 | SP | evsrwiu | Vector Shift Right Word Immediate Unsigned | | EVX | 4 | 547 | | | 308 | SP | evsrwis | Vector Shift Right Word Immediate Signed | | EVX | 4 | 548<br>550 | | | 308<br>308 | SP<br>SP | evslw<br>evslwi | Vector Shift Left Word Vector Shift Left Word Immediate | | EVX | 4 | 552 | | | 306 | SP | evrlw | Vector Rotate Left Word | | EVX | 4 | 553 | | | 308 | SP | evsplati | Vector Splat Immediate | | EVX | 4 | 554 | | | 307 | SP | evrlwi | Vector Rotate Left Word Immediate | | EVX | 4 | 555 | | | 308 | SP | evsplatfi | Vector Splat Fractional Immediate | | EVX | 4 | 556 | | | 279 | SP | evmergehi | Vector Merge High | | EVX | 4 | 557 | | | 279 | SP | evmergelo | Vector Merge Low | | EVX | 4 | 558 | | | 280 | SP | evmergehilo | Vector Merge High/Low | | EVX | 4 | 559 | | | 280 | SP | evmergelohi | Vector Merge Low/High | | EVX | 4 | 560 | | | 271 | SP | evcmpgtu | Vector Compare Greater Than Unsigned | | EVX | 4 | 561<br>562 | | | 270<br>271 | SP<br>SP | evempgts | Vector Compare Less Than Linsigned | | EVX | 4 | 562<br>563 | | | 271 | SP<br>SP | evcmpltu<br>evcmplts | Vector Compare Less Than Unsigned<br>Vector Compare Less Than Signed | | EVX | 4 | 564 | | | 271 | SP | evemped | Vector Compare Less Than Signed Vector Compare Equal | | VX | 4 | 576 | | | 222 | V | vadduhs | Vector Add Unsigned Halfword Saturate | | VX | 4 | 578 | | | 240 | V | vminuh | Vector Minimum Unsigned Halfword | | VX | 4 | 580 | | | 247 | V | vsrh | Vector Shift Right Halfword | | VC | 4 | 582 | | | 243 | V | vcmpgtuh[.] | Vector Compare Greater Than Unsigned Halfword | | VX | 4 | 584 | | | 226 | V | vmuleuh | Vector Multiply Even Unsigned Halfword | | VX | 4 | 586 | | | 254 | V | vrfiz | Vector Round to Single-Precision Integer toward Zero | | VX | 4 | 588 | | | 216 | V | vsplth | Vector Splat Halfword | | VX | 4 | 590 | | | 212 | V<br>0DEV | vupkhsh | Vector Unpack High Signed Halfword | | EVX | 4 | 640 | | | 320 | SP.FV | evfsadd | Vector Floating-Point Single-Precision Add | | VX | 4 | 640 | | | 222 | V | vadduws | Vector Add Unsigned Word Saturate | | _ | Ор | code | <u>o</u> ⊤. | | | | | | |-----------|-----|------------|---------------------------|------|-----------------|------------------|--------------------|---------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri≺ | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 641 | | | 320 | SP.FV | evfssub | Vector Floating-Point Single-Precision Subtract | | VX | 4 | 642 | | | 240 | V | vminuw | Vector Minimum Unsigned Word | | EVX | 4 | 644<br>644 | | | 319<br>247 | SP.FV<br>V | evfsabs<br>vsrw | Vector Floating-Point Single-Precision Absolute Value Vector Shift Right Word | | EVX | 4 | 645 | | | 319 | SP.FV | evfsnabs | Vector Floating-Point Single-Precision Negative Abso- | | | - | | | | | | | lute Value | | EVX | 4 | 646 | | | 319 | SP.FV | evfsneg | Vector Floating-Point Single-Precision Negate | | VC | 4 | 646 | | | 243 | V | vcmpgtuw[.] | Vector Compare Greater Than Unsigned Word | | EVX | 4 | 648<br>649 | | | 320<br>320 | SP.FV<br>SP.FV | evfsmul<br>evfsdiv | Vector Floating-Point Single-Precision Multiply | | VX | 4 | 650 | | | 254 | 3F.FV<br>V | vrfip | Vector Floating-Point Single-Precision Divide Vector Round to Single-Precision Integer toward +Infin- | | ** | | 000 | | | 201 | • | VIIIP | ity | | EVX | 4 | 652 | | | 321 | SP.FV | evfscmpgt | Vector Floating-Point Single-Precision Compare | | | | | | | | | | Greater Than | | VX | 4 | 652 | | | 216 | V | vspltw | Vector Splat Word | | EVX | 4 | 653 | | | 321 | SP.FV | evfscmplt | Vector Floating-Point Single-Precision Compare Less Than | | EVX | 4 | 654 | | | 322 | SP.FV | evfscmpeq | Vector Floating-Point Single-Precision Compare Equal | | VX | 4 | 654 | | | 213 | V | vupklsb | Vector Unpack Low Signed Byte | | EVX | 4 | 656 | | | 324 | SP.FV | evfscfui | Vector Convert Floating-Point Single-Precision from | | E) ()( | | 057 | | | 004 | 0D EV | <b></b> | Unsigned Integer | | EVX | 4 | 657 | | | 324 | SP.FV | evfscfsi | Vector Convert Floating-Point Single-Precision from<br>Signed Integer | | EVX | 4 | 658 | | | 324 | SP.FV | evfscfuf | Vector Convert Floating-Point Single-Precision from | | | | | | | 0 | O | 5 1.5 5. W. | Unsigned Fraction | | EVX | 4 | 659 | | | 324 | SP.FV | evfscfsf | Vector Convert Floating-Point Single-Precision from | | E) 04 | | | | | 005 | 0D EV | | Signed Fraction | | EVX | 4 | 660 | | | 325 | SP.FV | evfsctui | Vector Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 4 | 661 | | | 325 | SP.FV | evfsctsi | Vector Convert Floating-Point Single-Precision to | | LVX | 7 | 001 | | | 020 | O1 .1 V | CVISCISI | Signed Integer | | EVX | 4 | 662 | | | 326 | SP.FV | evfsctuf | Vector Convert Floating-Point Single-Precision to | | | | | | | | | | Unsigned Fraction | | EVX | 4 | 663 | | | 326 | SP.FV | evfsctsf | Vector Convert Floating-Point Single-Precision to | | EVX | 4 | 664 | | | 325 | SP.FV | evfsctuiz | Signed Fraction Vector Convert Floating-Point Single-Precision to | | | 7 | 004 | | | 020 | O1 .1 V | CVISCICIE | Unsigned Integer with Round toward Zero | | EVX | 4 | 666 | | | 325 | SP.FV | evfsctsiz | Vector Convert Floating-Point Single-Precision to | | | | | | | | | | Signed Integer with Round toward Zero | | EVX | 4 | 668 | | | 322 | SP.FV | evfststgt | Vector Floating-Point Single-Precision Test Greater | | EVX | 4 | 669 | | | 323 | SP.FV | evfststlt | Than Vector Floating-Point Single-Precision Test Less Than | | EVX | 4 | 670 | | | 323 | SP.FV | eviststeq | Vector Floating-Point Single-Precision Test Less Than Vector Floating-Point Single-Precision Test Equal | | EVX | 4 | 704 | | | 329 | SP.FS | efsadd | Floating-Point Single-Precision Add | | EVX | 4 | 705 | | | 329 | SP.FS | efssub | Floating-Point Single-Precision Subtract | | EVX | 4 | 708 | | | 328 | SP.FS | efsabs | Floating-Point Single-Precision Absolute Value | | VX | 4 | 708 | | | 219 | V | vsr | Vector Shift Right | | EVX | 4 | 709 | | | 328 | SP.FS | efsnabs | Floating-Point Single-Precision Negative Absolute Value | | EVX | 4 | 710 | | | 328 | SP.FS | efsneg | Floating-Point Single-Precision Negate | | VC | 4 | 710 | | | 256 | V | vcmpgtfp[.] | Vector Compare Greater Than Single-Precision | | EVX | 4 | 712 | | | 329 | SP.FS | efsmul | Floating-Point Single-Precision Multiply | | EVX<br>VX | 4 | 713<br>714 | | | 329<br>254 | SP.FS<br>V | efsdiv<br>vrfim | Floating-Point Single-Precision Divide Vector Round to Single-Precision Integer toward -Infin- | | \ \^ | 4 | 114 | | | 2J <del>4</del> | V | VIIIIII | ity | | EVX | 4 | 716 | | | 330 | SP.FS | efscmpgt | Floating-Point Single-Precision Compare Greater Than | | EVX | 4 | 717 | | | 330 | SP.FS | efscmplt | Floating-Point Single-Precision Compare Less Than | | EVX | 4 | 718 | | | 331 | SP.FS | efscmpeq | Floating-Point Single-Precision Compare Equal | | VX | 4 | 718 | | | 213 | V | vupklsh | Vector Unpack Low Signed Halfword | | E | Ор | code | e. | ۱, | | - 1 | | | |---------------------------------|-----------------------|----------------------------------------|---------------------------|-----|----------------------------------------|-------------------------------------------|------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 719 | | | 343 | SP.FD | efscfd | Floating-Point Single-Precision Convert from Double-<br>Precision | | EVX | 4 | 720 | | | 333 | SP.FS | efscfui | Convert Floating-Point Single-Precision from Unsigned Integer | | EVX | 4 | 721 | | | 333 | SP.FS | efscfsi | Convert Floating-Point Single-Precision from Signed | | EVX | 4 | 722 | | | 333 | SP.FS | efscfuf | Integer Convert Floating-Point Single-Precision from Unsigned Fraction | | EVX | 4 | 723 | | | 333 | SP.FS | efscfsf | Convert Floating-Point Single-Precision from Signed Fraction | | EVX | 4 | 724 | | | 333 | SP.FS | efsctui | Convert Floating-Point Single-Precision to Unsigned Integer | | EVX | 4 | 725 | | | 333 | SP.FS | efsctsi | Convert Floating-Point Single-Precision to Signed Inte- | | EVX | 4 | 726 | | | 334 | SP.FS | efsctuf | ger Convert Floating-Point Single-Precision to Unsigned Fraction | | EVX | 4 | 727 | | | 334 | SP.FS | efsctsf | Convert Floating-Point Single-Precision to Signed Fraction | | EVX | 4 | 728 | | | 334 | SP.FS | efsctuiz | Convert Floating-Point Single-Precision to Unsigned Integer with Round toward Zero | | EVX | 4 | 730 | | | 334 | SP.FS | efsctsiz | Convert Floating-Point Single-Precision to Signed Integer with Round toward Zero | | EVX<br>EVX<br>EVX<br>EVX<br>EVX | 4<br>4<br>4<br>4<br>4 | 732<br>733<br>734<br>736<br>737<br>738 | | | 331<br>332<br>332<br>336<br>336<br>340 | SP.FS<br>SP.FS<br>SP.FD<br>SP.FD<br>SP.FD | efststgt<br>efststlt<br>efststeq<br>efdadd<br>efdsub<br>efdcfuid | Floating-Point Single-Precision Test Greater Than Floating-Point Single-Precision Test Less Than Floating-Point Single-Precision Test Equal Floating-Point Double-Precision Add Floating-Point Double-Precision Subtract Convert Floating-Point Double-Precision from Unsigned | | EVX | 4 | 739 | | | 340 | SP.FD | efdcfsid | Integer Doubleword Convert Floating-Point Double-Precision from Signed Integer Doubleword | | EVX<br>EVX | 4<br>4 | 740<br>741 | | | 335<br>335 | SP.FD<br>SP.FD | efdabs<br>efdnabs | Floating-Point Double-Precision Absolute Value Floating-Point Double-Precision Negative Absolute Value | | EVX<br>EVX<br>EVX | 4<br>4<br>4<br>4 | 742<br>744<br>745<br>746 | | | 335<br>336<br>336<br>341 | SP.FD<br>SP.FD<br>SP.FD<br>SP.FD | efdneg<br>efdmul<br>efddiv<br>efdctuidz | Floating-Point Double-Precision Negate Floating-Point Double-Precision Multiply Floating-Point Double-Precision Divide Convert Floating-Point Double-Precision to Unsigned | | EVX | 4 | 747 | | | 341 | SP.FD | efdctsidz | Integer Doubleword with Round toward Zero Convert Floating-Point Double-Precision to Signed Inte- | | EVX<br>EVX<br>EVX | 4<br>4<br>4<br>4 | 748<br>749<br>750<br>751 | | | 337<br>337<br>337<br>342 | SP.FD<br>SP.FD<br>SP.FD<br>SP.FD | efdcmpgt<br>efdcmplt<br>efdcmpeq<br>efdcfs | ger Doubleword with Round toward Zero Floating-Point Double-Precision Compare Greater Than Floating-Point Double-Precision Compare Less Than Floating-Point Double-Precision Compare Equal Floating-Point Double-Precision Convert from Single- Precision | | EVX | 4 | 752 | | | 339 | SP.FD | efdcfui | Convert Floating-Point Double-Precision from Unsigned Integer | | EVX | 4 | 753 | | | 339 | SP.FD | efdcfsi | Convert Floating-Point Double-Precision from Signed Integer | | EVX | 4 | 754 | | | 340 | SP.FD | efdcfuf | Convert Floating-Point Double-Precision from Unsigned Fraction | | EVX | 4 | 755 | | | 340 | SP.FD | efdcfsf | Convert Floating-Point Double-Precision from Signed Fraction | | EVX | 4 | 756 | | | 340 | SP.FD | efdctui | Convert Floating-Point Double-Precision to Unsigned | | EVX | 4 | 757 | | | 340 | SP.FD | efdctsi | Integer Convert Floating-Point Double-Precision to Signed Integer ger | | EVX | 4 | 758 | | | 342 | SP.FD | efdctuf | Convert Floating-Point Double-Precision to Unsigned Fraction | | _ | Op | code | o | | | 4 | | | |------|--------|------------|--------------|------|------------|------------------|--------------------|---------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. | Pri≻ | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 759 | | | 342 | SP.FD | efdctsf | Convert Floating-Point Double-Precision to Signed Fraction | | EVX | 4 | 760 | | | 342 | SP.FD | efdctuiz | Convert Floating-Point Double-Precision to Unsigned Integer with Round toward Zero | | EVX | 4 | 762 | | | 342 | SP.FD | efdctsiz | Convert Floating-Point Double-Precision to Signed Integer with Round toward Zero | | EVX | 4 | 764 | | | 337 | SP.FD | efdtstgt | Floating-Point Double-Precision Test Greater Than | | EVX | 4<br>4 | 765 | | | 338 | SP.FD<br>SP.FD | efdtstlt | Floating-Point Double-Precision Test Less Than | | EVX | 4 | 766<br>768 | | | 338<br>274 | SP.FD<br>SP | efdtsteq<br>evlddx | Floating-Point Double-Precision Test Equal Vector Load Double Word into Double Word Indexed | | VX | 4 | 768 | | | 220 | V | vaddsbs | Vector Add Signed Byte Saturate | | EVX | 4 | 769 | | | 274 | SP | evldd | Vector Load Double Word into Double Word | | EVX | 4 | 770 | | | 275 | SP | evldwx | Vector Load Double into Two Words Indexed | | VX | 4 | 770 | | | 239 | V | vminsb | Vector Minimum Signed Byte | | EVX | 4 | 771 | | | 275 | SP | evldw | Vector Load Double into Two Words | | EVX | 4 | 772 | | | 274 | SP | evldhx | Vector Load Double into Four Halfwords Indexed | | VX | 4 | 772 | | | 248 | V | vsrab | Vector Shift Right Algebraic Byte | | EVX | 4 | 773 | | | 274 | SP | evldh | Vector Load Double into Four Halfwords | | VC | 4 | 774 | | | 242 | V | vcmpgtsb[.] | Vector Compare Greater Than Signed Byte | | EVX | 4 | 776 | | | 275 | SP | evlhhesplatx | Vector Load Halfword into Halfwords Even and Splat Indexed | | VX | 4 | 776 | | | 226 | V | vmulesb | Vector Multiply Even Signed Byte | | EVX | 4 | 777 | | | 275 | SP | evlhhesplat | Vector Load Halfword into Halfwords Even and Splat | | VX | 4 | 778 | | | 253 | V | vcfux | Vector Convert From Unsigned Fixed-Point Word | | EVX | 4 | 780 | | | 276 | SP | evlhhousplatx | Vector Load Halfword into Halfword Odd Unsigned and Splat Indexed | | VX | 4 | 780 | | | 216 | V | vspltisb | Vector Splat Immediate Signed Byte | | EVX | 4 | 781 | | | 276 | SP | evlhhousplat | Vector Load Halfword into Halfword Odd Unsigned and Splat | | EVX | 4 | 782 | | | 276 | SP | evlhhossplatx | Vector Load Halfword into Halfword Odd Signed and Splat Indexed | | VX | 4 | 782 | | | 209 | V | vpkpx | Vector Pack Pixel | | EVX | 4 | 783 | | | 276 | SP | evlhhossplat | Vector Load Halfword into Halfword Odd Signed and Splat | | EVX | 4 | 784 | | | 277 | SP | evlwhex | Vector Load Word into Two Halfwords Even Indexed | | EVX | 4 | 785 | | | 277 | SP | evlwhe | Vector Load Word into Two Halfwords Even | | EVX | 4 | 788 | | | 278 | SP | evlwhoux | Vector Load Word into Two Halfwords Odd Unsigned Indexed (zero-extended) | | EVX | 4 | 789 | | | 278 | SP | evlwhou | Vector Load Word into Two Halfwords Odd Unsigned (zero-extended) | | EVX | 4 | 790 | | | 277 | SP | evlwhosx | Vector Load Word into Two Halfwords Odd Signed Indexed (with sign extension) | | EVX | 4 | 791 | | | 277 | SP | evlwhos | Vector Load Word into Two Halfwords Odd Signed (with sign extension) | | EVX | 4 | 792 | | | 279 | SP | evlwwsplatx | Vector Load Word into Word and Splat Indexed | | EVX | 4 | 793 | | | 279 | SP | evlwwsplat | Vector Load Word into Word and Splat | | EVX | 4 | 796 | | | 278 | SP | evlwhsplatx | Vector Load Word into Two Halfwords and Splat Indexed | | EVX | 4 | 797 | | | 278 | SP | evlwhsplat | Vector Load Word into Two Halfwords and Splat | | EVX | 4 | 800 | | | 309 | SP | evstddx | Vector Store Double of Double Indexed | | EVX | 4 | 801 | | | 309 | SP | evstdd | Vector Store Double of Double | | EVX | 4 | 802 | | | 310 | SP | evstdwx<br>evstdw | Vector Store Double of Two Words Indexed | | EVX | 4<br>4 | 803<br>804 | | | 310<br>310 | SP<br>SP | evstaw<br>evstdhx | Vector Store Double of Two Words Vector Store Double of Four Halfwords Indexed | | EVX | 4 | 804<br>805 | | | 310 | SP<br>SP | evstanx | Vector Store Double of Four Hallwords Indexed | | EVX | 4 | 816 | | | 311 | SP | evstun | Vector Store Word of Two Halfwords from Even Indexed | | EVX | 4 | 817 | | | 311 | SP | evstwhe | Vector Store Word of Two Halfwords from Even | | EVX | 4 | 820 | | | 311 | SP | evstwhox | Vector Store Word of Two Halfwords from Odd Indexed | | EVX | 4 | 821 | | | 311 | SP | evstwho | Vector Store Word of Two Halfwords from Odd | | EVX | 4 | 824 | | | 311 | SP | evstwwex | Vector Store Word of Word from Even Indexed | | | | | | | | | | - | | _ | Ор | code | <u>o</u> ⁻. | ۲, | | 4 | | | |-----------|-----|--------------|---------------------------|-----|------------|------------------|---------------------|----------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 825 | | | 311 | SP | evstwwe | Vector Store Word of Word from Even | | EVX | 4 | 828 | | | 312 | SP | evstwwox | Vector Store Word of Word from Odd Indexed | | EVX | 4 | 829 | | | 312 | SP | evstwwo | Vector Store Word of Word from Odd | | VX | 4 | 832 | | | 220 | V | vaddshs | Vector Add Signed Halfword Saturate | | VX | 4 | 834 | | | 239 | V | vminsh | Vector Minimum Signed Halfword | | VX | 4 | 836 | | | 248 | V | vsrah | Vector Shift Right Algebraic Halfword | | VC | 4 | 838 | | | 242 | V | vcmpgtsh[.] | Vector Compare Greater Than Signed Halfword | | VX | 4 | 840 | | | 226 | V | vmulesh | Vector Multiply Even Signed Halfword | | VX | 4 | 842 | | | 253 | V | vcfsx | Vector Convert From Signed Fixed-Point Word | | VX<br>VX | 4 | 844<br>846 | | | 216<br>212 | V<br>V | vspltish<br>vupkhpx | Vector Splat Immediate Signed Halfword<br>Vector Unpack High Pixel | | VX | 4 | 896 | | | 220 | V | vaddsws | Vector Add Signed Word Saturate | | VX | 4 | 898 | | | 239 | V | vminsw | Vector Minimum Signed Word | | VX | 4 | 900 | | | 248 | V | vsraw | Vector Shift Right Algebraic Word | | VC | 4 | 902 | | | 242 | V | vcmpgtsw[.] | Vector Compare Greater Than Signed Word | | VX | 4 | 906 | | | 252 | V | vctuxs | Vector Convert To Unsigned Fixed-Point Word Saturate | | VX | 4 | 908 | | | 216 | V | vspltisw | Vector Splat Immediate Signed Word | | VC | 4 | 966 | | | 255 | V | vcmpbfp[.] | Vector Compare Bounds Single-Precision | | VX | 4 | 970 | | | 252 | V | vctsxs | Vector Convert To Signed Fixed-Point Word Saturate | | VX | 4 | 974 | | | 213 | V | vupklpx | Vector Unpack Low Pixel | | VX | 4 | 1024 | | | 224 | V | vsububm | Vector Subtract Unsigned Byte Modulo | | VX | 4 | 1026 | | | 236 | V | vavgub | Vector Average Unsigned Byte | | EVX | 4 | 1027 | | | 284 | SP | evmhessf | Vector Multiply Halfwords, Even, Signed, Saturate, | | | | | | | | | | Fractional | | VX | 4 | 1028 | | | 244 | V | vand | Vector Logical AND | | EVX | 4 | 1031 | | | 293 | SP | evmhossf | Vector Multiply Halfwords, Odd, Signed, Saturate, Frac- | | | | | | | | | | tional | | EVX | 4 | 1032 | | | 287 | SP | evmheumi | Vector Multiply Halfwords, Even, Unsigned, Modulo, | | | | | | | | | | Integer | | EVX | 4 | 1033 | | | 283 | SP | evmhesmi | Vector Multiply Halfwords, Even, Signed, Modulo, Inte- | | | | | | | | | _ | ger | | VX | 4 | 1034 | | | 251 | V | vmaxfp | Vector Maximum Single-Precision | | EVX | 4 | 1035 | | | 282 | SP | evmhesmf | Vector Multiply Halfwords, Even, Signed, Modulo, Frac- | | E) ()( | | 4000 | | | 005 | C D | | tional | | EVX | 4 | 1036 | | | 295 | SP | evmhoumi | Vector Multiply Halfwords, Odd, Unsigned, Modulo, | | \/\ | 4 | 4000 | | | 040 | 17 | ala | Integer | | VX<br>EVX | 4 | 1036<br>1037 | | | 218<br>291 | V<br>SP | vslo<br>evmhosmi | Vector Shift Left by Octet Vector Multiply Halfwords, Odd, Signed, Modulo, Inte- | | EVA | 4 | 1037 | | | 291 | <b>3</b> P | eviiiiosiiii | 1 | | EVX | 4 | 1039 | | | 290 | SP | evmhosmf | ger Vector Multiply Halfwords, Odd, Signed, Modulo, Frac- | | LVA | 4 | 1039 | | | 290 | 31 | eviiiiosiiii | _ · · · · · · · · · · · · · · · · · · · | | EVX | 4 | 1059 | | | 284 | SP | evmhessfa | tional Vector Multiply Halfwords, Even, Signed, Saturate, | | LVA | 4 | 1039 | | | 204 | 51 | eviillessia | Fractional to Accumulator | | EVX | 4 | 1063 | | | 293 | SP | evmhossfa | Vector Multiply Halfwords, Odd, Signed, Saturate, Frac- | | LVA | 4 | 1003 | | | 293 | 51 | eviiiiossia | tional to Accumulator | | EVX | 4 | 1064 | | | 287 | SP | evmheumia | Vector Multiply Halfwords, Even, Unsigned, Modulo, | | LVA | 4 | 1004 | | | 201 | 31 | eviilleuiilla | Integer to Accumulator | | EVX | 4 | 1065 | | | 283 | SP | evmhesmia | Vector Multiply Halfwords, Even, Signed, Modulo, Inte- | | LVA | 4 | 1003 | | | 203 | 51 | eviillesiilla | ger to Accumulator | | EVX | 4 | 1067 | | | 282 | SP | evmhesmfa | Vector Multiply Halfwords, Even, Signed, Modulo, Frac- | | | _ | 1007 | | | 202 | 0 | eviillesiilla | tional to Accumulator | | EVX | 4 | 1068 | | | 295 | SP | evmhoumia | Vector Multiply Halfwords, Odd, Unsigned, Modulo, | | | т . | 1000 | | | 200 | 01 | Sviiiiodiiia | Integer to Accumulator | | EVX | 4 | 1069 | | | 291 | SP | evmhosmia | Vector Multiply Halfwords, Odd, Signed, Modulo, Inte- | | | + | 1003 | | | ا ت کے | OI. | CVIIIIOSIIIIA | ger to Accumulator | | EVX | 4 | 1071 | | | 290 | SP | evmhosmfa | Vector Multiply Halfwords, Odd, Signed, Modulo, Frac- | | LVA | 7 | 10/1 | | | 290 | OI. | CVIIIIOSIIIIA | tional to Accumulator | | VX | 4 | 1088 | | | 224 | V | vsubuhm | Vector Subtract Unsigned Halfword Modulo | | VX | 4 | 1090 | | | 236 | V | vavguh | Vector Average Unsigned Halfword | | VX | 4 | 1090 | | | 244 | V | vandc | Vector Logical AND with Complement | | ٠,, | | .002 | 1 | | - ' ' | ٧ | 1.31.00 | 1 . cete. Logical / a to man complement | | _ | Ор | code | o⊤. | Ļ. | | | | | |-------|-----|--------------|---------------------------|------|------------|------------------|---------------------|--------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 1095 | | | 298 | SP | evmwhssf | Vector Multiply Word High Signed, Saturate, Fractional | | EVX | 4 | 1096 | | | 300 | SP | evmwlumi | Vector Multiply Word Low Unsigned, Modulo, Integer | | VX | 4 | 1098 | | | 251 | V | vminfp | Vector Minimum Single-Precision | | EVX | 4 | 1100 | | | 298 | SP | evmwhumi | Vector Multiply Word High Unsigned, Modulo, Integer | | VX | 4 | 1100 | | | 219 | V | vsro | Vector Shift Right by Octet | | EVX | 4 | 1101 | | | 297 | SP | evmwhsmi | Vector Multiply Word High Signed, Modulo, Integer | | EVX | 4 | 1103<br>1107 | | | 297<br>303 | SP<br>SP | evmwhsmf<br>evmwssf | Vector Multiply Word High Signed, Modulo, Fractional Vector Multiply Word Signed, Saturate, Fractional | | EVX | 4 | 1112 | | | 303 | SP | evmwumi | Vector Multiply Word Signed, Saturate, Fractional Vector Multiply Word Unsigned, Modulo, Integer | | EVX | 4 | 1113 | | | 302 | SP | evmwsmi | Vector Multiply Word Orisigned, Modulo, Integer | | EVX | 4 | 1115 | | | 301 | SP | evmwsmf | Vector Multiply Word Signed, Modulo, Fractional | | EVX | 4 | 1127 | | | 298 | SP | evmwhssfa | Vector Multiply Word High Signed, Saturate, Fractional | | | • | | | | 200 | O. | oviiiwiiooia | to Accumulator | | EVX | 4 | 1128 | | | 300 | SP | evmwlumia | Vector Multiply Word Low Unsigned, Modulo, Integer to | | | | | | | | - | | Accumulator | | EVX | 4 | 1132 | | | 298 | SP | evmwhumia | Vector Multiply Word High Unsigned, Modulo, Integer to | | | | | | | | | | Accumulator | | EVX | 4 | 1133 | | | 297 | SP | evmwhsmia | Vector Multiply Word High Signed, Modulo, Integer to | | | | | | | | | | Accumulator | | EVX | 4 | 1135 | | | 297 | SP | evmwhsmfa | Vector Multiply Word High Signed, Modulo, Fractional to | | | | | | | | | | Accumulator | | EVX | 4 | 1139 | | | 303 | SP | evmwssfa | Vector Multiply Word Signed, Saturate, Fractional to | | | | | | | | | | Accumulator | | EVX | 4 | 1144 | | | 304 | SP | evmwumia | Vector Multiply Word Unsigned, Modulo, Integer to | | | | | | | | | | Accumulator | | EVX | 4 | 1145 | | | 302 | SP | evmwsmia | Vector Multiply Word Signed, Modulo, Integer to Accu- | | | | | | | | | | mulator | | EVX | 4 | 1147 | | | 301 | SP | evmwsmfa | Vector Multiply Word Signed, Modulo, Fractional to | | | | | | | | | | Accumulator | | VX | 4 | 1152 | | | 224 | V | vsubuwm | Vector Subtract Unsigned Word Modulo | | VX | 4 | 1154 | | | 236 | V | vavguw | Vector Average Unsigned Word | | VX | 4 | 1156 | | | 244 | V | vor | Vector Logical OR | | EVX | 4 | 1216 | | | 269 | SP | evaddusiaaw | Vector Add Unsigned, Saturate, Integer to Accumulator | | | | | | | | | | Word | | EVX | 4 | 1217 | | | 269 | SP | evaddssiaaw | Vector Add Signed, Saturate, Integer to Accumulator | | | | | | | | | | Word | | EVX | 4 | 1218 | | | 313 | SP | evsubfusiaaw | Vector Subtract Unsigned, Saturate, Integer to Accumu- | | | | | | | | | | lator Word | | EVX | 4 | 1219 | | | 312 | SP | evsubfssiaaw | Vector Subtract Signed, Saturate, Integer to Accumula- | | | | | | | | | | tor Word | | EVX | 4 | 1220 | | | 297 | SP | evmra | Initialize Accumulator | | VX | 4 | 1220 | | | 244 | V | vxor | Vector Logical XOR | | EVX | 4 | 1222 | | | 272 | SP | evdivws | Vector Divide Word Signed | | EVX | 4 | 1223 | | | 273 | SP | evdivwu | Vector Divide Word Unsigned | | EVX | 4 | 1224 | | | 269 | SP | evaddumiaaw | Vector Add Unsigned, Modulo, Integer to Accumulator | | EV/V | 4 | 4005 | | | 200 | CD | | Word | | EVX | 4 | 1225 | | | 268 | SP | evaddsmiaaw | Vector Add Signed, Modulo, Integer to Accumulator | | EVX | 4 | 1226 | | | 313 | SP | evsubfumiaaw | Word Vector Subtract Unsigned, Modulo, Integer to Accumu- | | E v ^ | 4 | 1220 | | | 313 | SF | CASUDIUIIIIQAM | | | EVX | 4 | 1227 | | | 312 | SP | evsubfsmiaaw | lator Word<br>Vector Subtract Signed, Modulo, Integer to Accumulator | | - ^ ^ | + | 1221 | | | 312 | OF. | EvaubiaiiiaaW | Word | | EVX | 4 | 1280 | | | 288 | SP | evmheusiaaw | Vector Multiply Halfwords, Even, Unsigned, Saturate, | | | 4 | 1200 | | | 200 | 36 | eviiiieusiaaW | Integer and Accumulate into Words | | EVX | 4 | 1281 | | | 286 | SP | evmhessiaaw | Vector Multiply Halfwords, Even, Signed, Saturate, Inte- | | - *^ | _ | 1201 | | | 200 | OI | Sviiiiossiaaw | ger and Accumulate into Words | | VX | 4 | 1282 | | | 235 | V | vavgsb | Vector Average Signed Byte | | EVX | 4 | 1283 | | | 285 | SP | evmhessfaaw | Vector Multiply Halfwords, Even, Signed, Saturate, | | " | т | 1200 | | | 200 | <b>O</b> 1 | VIIIIOOSIAAW | Fractional and Accumulate into Words | | L | | | l | | | | I | i radional and Addunidate Into Words | | _ | Ор | code | o ⁻. | _ | | | | | |-----------|-----|--------------|---------------------------|------|------------|------------------|------------------------|------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 1284 | | | 296 | SP | evmhousiaaw | Vector Multiply Halfwords, Odd, Unsigned, Saturate, | | VX | 4 | 1284 | | | 244 | V | vnor | Integer and Accumulate into Words Vector Logical NOR | | EVX | 4 | 1285 | | | 295 | SP | evmhossiaaw | Vector Multiply Halfwords, Odd, Signed, Saturate, Inte- | | EV/V | 4 | 4007 | | | 294 | SP | | ger and Accumulate into Words | | EVX | 4 | 1287 | | | 294 | 5P | evmhossfaaw | Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate into Words | | EVX | 4 | 1288 | | | 287 | SP | evmheumiaaw | Vector Multiply Halfwords, Even, Unsigned, Modulo, | | EVX | 4 | 1289 | | | 283 | SP | evmhesmiaaw | Integer and Accumulate into Words Vector Multiply Halfwords, Even, Signed, Modulo, Inte- | | | | | | | | | | ger and Accumulate into Words | | EVX | 4 | 1291 | | | 282 | SP | evmhesmfaaw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional and Accumulate into Words | | EVX | 4 | 1292 | | | 296 | SP | evmhoumiaaw | Vector Multiply Halfwords, Odd, Unsigned, Modulo, | | EVX | 4 | 1293 | | | 292 | SP | evmhosmiaaw | Integer and Accumulate into Words Vector Multiply Halfwords, Odd, Signed, Modulo, Inte- | | LVA | 4 | 1293 | | | 292 | SF | eviiiiosiiiaaw | ger and Accumulate into Words | | EVX | 4 | 1295 | | | 291 | SP | evmhosmfaaw | Vector Multiply Halfwords, Odd, Signed, Modulo, Frac- | | EVX | 4 | 1320 | | | 281 | SP | evmhegumiaa | tional and Accumulate into Words Vector Multiply Halfwords, Even, Guarded, Unsigned, | | | | | | | | | | Modulo, Integer and Accumulate | | EVX | 4 | 1321 | | | 281 | SP | evmhegsmiaa | Vector Multiply Halfwords, Even, Guarded, Signed,<br>Modulo, Integer and Accumulate | | EVX | 4 | 1323 | | | 280 | SP | evmhegsmfaa | Vector Multiply Halfwords, Even, Guarded, Signed, | | EVX | 4 | 1324 | | | 290 | SP | evmhogumiaa | Modulo, Fractional and Accumulate Vector Multiply Halfwords, Odd, Guarded, Unsigned, | | LVX | 4 | 1324 | | | 290 | 31 | eviilloguillaa | Modulo, Integer and Accumulate | | EVX | 4 | 1325 | | | 289 | SP | evmhogsmiaa | Vector Multiply Halfwords, Odd, Guarded, Signed, Mod- | | EVX | 4 | 1327 | | | 289 | SP | evmhogsmfaa | ulo, Integer and Accumulate Vector Multiply Halfwords, Odd, Guarded, Signed, Mod- | | E) () ( | | 4044 | | | 004 | 0.5 | | ulo, Fractional and Accumulate | | EVX | 4 | 1344 | | | 301 | SP | evmwlusiaaw | Vector Multiply Word Low Unsigned, Saturate, Integer and Accumulate into Words | | EVX | 4 | 1345 | | | 299 | SP | evmwlssiaaw | Vector Multiply Word Low Signed, Saturate, Integer and | | VX | 4 | 1346 | | | 235 | V | vavgsh | Accumulate into Words Vector Average Signed Halfword | | EVX | 4 | 1352 | | | 300 | SP | evmwlumiaaw | Vector Multiply Word Low Unsigned, Modulo, Integer | | EVX | 4 | 1353 | | | 299 | SP | evmwlsmiaaw | and Accumulate into Words Vector Multiply Word Low Signed, Modulo, Integer and | | | 4 | | | | | | eviiiwisiiiiaaw | Accumulate into Words | | EVX | 4 | 1363 | | | 303 | SP | evmwssfaa | Vector Multiply Word Signed, Saturate, Fractional and Accumulate | | EVX | 4 | 1368 | | | 305 | SP | evmwumiaa | Vector Multiply Word Unsigned, Modulo, Integer and | | | , | 4000 | | | 000 | 0.0 | | Accumulate | | EVX | 4 | 1369 | | | 302 | SP | evmwsmiaa | Vector Multiply Word Signed, Modulo, Integer and Accumulate | | EVX | 4 | 1371 | | | 302 | SP | evmwsmfaa | Vector Multiply Word Signed, Modulo, Fractional and | | EVX | 4 | 1408 | | | 288 | SP | evmheusianw | Accumulate Vector Multiply Halfwords, Even, Unsigned, Saturate, | | | | | | | | | | Integer and Accumulate Negative into Words | | VX<br>EVX | 4 | 1408<br>1409 | | | 223<br>286 | V<br>SP | vsubcuw<br>evmhessianw | Vector Subtract and Write Carry-Out Unsigned Word Vector Multiply Halfwords, Even, Signed, Saturate, Inte- | | | - | 1703 | | | 200 | Oi. | Ovininessianiw | ger and Accumulate Negative into Words | | VX | 4 | 1410 | | | 235 | V | vavgsw | Vector Average Signed Word | | EVX | 4 | 1411 | | | 285 | SP | evmhessfanw | Vector Multiply Halfwords, Even, Signed, Saturate,<br>Fractional and Accumulate Negative into Words | | EVX | 4 | 1412 | | | 296 | SP | evmhousianw | Vector Multiply Halfwords, Odd, Unsigned, Saturate, | | | | | | | | | | Integer and Accumulate Negative into Words | | ٤ | Ор | code | e | ۲, | | _ 1 | | | |--------|-----|------|---------------------------|-----|----------|------------------|-----------------|----------------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | EVX | 4 | 1413 | | | 295 | SP | evmhossianw | Vector Multiply Halfwords, Odd, Signed, Saturate, Inte- | | EVX | 4 | 1415 | | | 294 | SP | evmhossfanw | ger and Accumulate Negative into Words Vector Multiply Halfwords, Odd, Signed, Saturate, Fractional and Accumulate Negative into Words | | EVX | 4 | 1416 | | | 287 | SP | evmheumianw | Vector Multiply Halfwords, Even, Unsigned, Modulo, | | EVX | 4 | 1417 | | | 283 | SP | evmhesmianw | Integer and Accumulate Negative into Words Vector Multiply Halfwords, Even, Signed, Modulo, Integer and Accumulate Negative into Words | | EVX | 4 | 1419 | | | 282 | SP | evmhesmfanw | Vector Multiply Halfwords, Even, Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX | 4 | 1420 | | | 292 | SP | evmhoumianw | Vector Multiply Halfwords, Odd, Unsigned, Modulo, Integer and Accumulate Negative into Words | | EVX | 4 | 1421 | | | 291 | SP | evmhosmianw | Vector Multiply Halfwords, Odd, Signed, Modulo, Integer and Accumulate Negative into Words | | EVX | 4 | 1423 | | | 291 | SP | evmhosmfanw | Vector Multiply Halfwords, Odd, Signed, Modulo, Fractional and Accumulate Negative into Words | | EVX | 4 | 1448 | | | 281 | SP | evmhegumian | Vector Multiply Halfwords, Even, Guarded, Unsigned, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1449 | | | 281 | SP | evmhegsmian | Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1451 | | | 280 | SP | evmhegsmfan | Vector Multiply Halfwords, Even, Guarded, Signed, Modulo, Fractional and Accumulate Negative | | EVX | 4 | 1452 | | | 290 | SP | evmhogumian | Vector Multiply Halfwords, Odd, Guarded, Unsigned, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1453 | | | 289 | SP | evmhogsmian | Vector Multiply Halfwords, Odd, Guarded, Signed, Mod-<br>ulo, Integer and Accumulate Negative | | EVX | 4 | 1455 | | | 289 | SP | evmhogsmfan | Vector Multiply Halfwords, Odd, Guarded, Signed, Mod-<br>ulo, Fractional and Accumulate Negative | | EVX | 4 | 1472 | | | 301 | SP | evmwlusianw | Vector Multiply Word Low Unsigned, Saturate, Integer and Accumulate Negative in Words | | EVX | 4 | 1473 | | | 299 | SP | evmwlssianw | Vector Multiply Word Low Signed, Saturate, Integer and Accumulate Negative in Words | | EVX | 4 | 1480 | | | 300 | SP | evmwlumianw | Vector Multiply Word Low Unsigned, Modulo, Integer and Accumulate Negative in Words | | EVX | 4 | 1481 | | | 299 | SP | evmwlsmianw | Vector Multiply Word Low Signed, Modulo, Integer and Accumulate Negative in Words | | EVX | 4 | 1491 | | | 304 | SP | evmwssfan | Vector Multiply Word Signed, Saturate, Fractional and Accumulate Negative | | EVX | 4 | 1496 | | | 305 | SP | evmwumian | Vector Multiply Word Unsigned, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1497 | | | 302 | SP | evmwsmian | Vector Multiply Word Signed, Modulo, Integer and Accumulate Negative | | EVX | 4 | 1499 | | | 302 | SP | evmwsmfan | Vector Multiply Word Signed, Modulo, Fractional and Accumulate Negative | | VX | 4 | 1536 | | | 225 | V | vsububs | Vector Subtract Unsigned Byte Saturate | | VX | 4 | 1540 | | | 259 | V | mfvscr | Move From Vector Status and Control Register | | VX | 4 | 1544 | | | 234 | V | vsum4ubs | Vector Sum across Quarter Unsigned Byte Saturate | | VX | 4 | 1600 | | | 224 | V | vsubuhs | Vector Subtract Unsigned Halfword Saturate | | VX | 4 | 1604 | | | 259 | V | mtvscr | Move To Vector Status and Control Register | | VX | 4 | 1608 | | | 234 | V | vsum4shs | Vector Sum across Quarter Signed Halfword Saturate | | VX | 4 | 1664 | | | 225 | V | vsubuws | Vector Subtract Unsigned Word Saturate | | VX | 4 | 1672 | | | 233 | V | vsum2sws | Vector Sum across Half Signed Word Saturate | | VX | 4 | 1792 | | | 223 | V | vsubsbs | Vector Subtract Signed Byte Saturate | | VX | 4 | 1800 | | | 234 | V | vsum4sbs | Vector Sum across Quarter Signed Byte Saturate | | VX | 4 | 1856 | | | 223 | V | vsubshs | Vector Subtract Signed Halfword Saturate | | VX | 4 | 1920 | | | 223 | V | vsubsws | Vector Subtract Signed Word Saturate | | VX | 4 | 1928 | | | 233 | V | vsumsws | Vector Sum across Signed Word Saturate | | D | 7 | | C.D. | | 67 | В | mulli | Multiply Low Immediate | | D<br>D | 8 | | SR | | 64<br>72 | B<br>B | subfic<br>cmpli | Subtract From Immediate Carrying Compare Logical Immediate | | וטו | 10 | l | | | 12 | ь | Curbu | Compare Logical inimediate | | E | Opcode | | .7 e | ٦, | | - 1 | | | |----------------------------------------|-------------------------------------------|-----------------------------------------------------------------------------------------------------------|----------------------------------------|----------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri∖ | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | D D D D B SC | 11<br>12<br>13<br>14<br>15<br>16<br>17 | | SR<br>SR<br>CT | | 71<br>63<br>63<br>62<br>62<br>35<br>39,<br>479, | B<br>B<br>B<br>B<br>B | cmpi<br>addic<br>addic.<br>addi<br>addis<br>bc[l][a]<br>sc | Compare Immediate Add Immediate Carrying Add Immediate Carrying and Record Add Immediate Add Immediate Add Immediate Shifted Branch Conditional System Call | | _ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX | 18 19 19 19 19 19 19 19 19 19 19 19 19 19 | 0 16 18 33 38 39 51 129 150 193 198 2257 274 249 449 466 498 528 0 1 2 3 8 9 0 4 6 7 8 9 10 11 5 19 20 21 | C CSSSS SSSSSSSSSSSSSSSSSSSSSSSSSSSSSS | P PPPP I I III | 479,<br>613<br>35<br>38<br>36<br>480<br>38<br>614<br>613<br>614<br>37<br>718<br>37<br>480<br>38<br>482<br>37<br>483<br>482<br>37<br>483<br>482<br>37<br>75<br>75<br>85<br>86<br>87<br>77<br>85<br>86<br>87<br>71<br>87<br>87<br>87<br>87<br>87<br>87<br>87<br>87<br>87<br>87<br>87<br>87<br>87 | B B B S B E E E E B B S B S B S B S B S | b[l][a] mcrf bclr[l] rfid crnor rfmci rfdi rfi rfci crandc isync crxor dnh crnand crand hrfid creqv doze crorc nap cror sleep rvwinkle bcctr[l] rlwimi[.] rlwimi[.] rlwim[.] rlwini isi mdi andis. rldic[.] rldic[.] rldic[.] rldic[.] rldic[.] rldic[.] rldic[.] rldcf[.] rldcf[ | Branch Move Condition Register Field Branch Conditional to Link Register Return From Interrupt Doubleword Condition Register NOR Return From Machine Check Interrupt Return From Debug Interrupt Return From Debug Interrupt Return From Critical Interrupt Condition Register AND with Complement Instruction Synchronize Condition Register XOR Debugger Notify Halt Condition Register NAND Condition Register AND Hypervisor Return From Interrupt Doubleword Condition Register Equivalent Doze Condition Register OR with Complement Nap Condition Register OR Sleep Rip Van Winkle Branch Conditional to Count Register Rotate Left Word Immediate then Mask Insert Rotate Left Word Immediate then AND with Mask OR Immediate OR Immediate OR Immediate AND Immediate AND Immediate Shifted AND Immediate AND Immediate Shifted Rotate Left Doubleword Immediate then Clear Right Rotate Left Doubleword Immediate then Clear Right Rotate Left Doubleword Immediate then Clear Ten L | | | ٤ | Ор | code | <u>o</u> ⊤. | ر1 | _ | - 1 | | | |---|---------------------|----------|------------|---------------------------|------|-----------|---------------------|---------------------------|-----------------------------------------------------------------------------------| | | X<br>X<br>X<br>Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | | X | 31 | 22 | | | 428 | Е | icbt | Instruction Cache Block Touch | | | X | 31 | 23 | | | 48 | В | lwzx | Load Word and Zero Indexed | | | X | 31 | 24 | SR | | 88 | В | slw[.] | Shift Left Word | | | X | 31 | 26 | SR | | 79 | В | cntlzw[.] | Count Leading Zeros Word | | | X<br>X | 31<br>31 | 27<br>28 | SR<br>SR | | 90<br>77 | 64<br>B | sld[.] | Shift Left Doubleword<br>AND | | | X | 31 | 29 | SK | Р | 628 | E.PD | and[.]<br>Idepx | Load Doubleword by External Process ID Indexed | | | X | 31 | 31 | | P | 628 | E.PD | lwepx | Load Word by External Process ID Indexed | | | X | 31 | 32 | | | 72 | В | cmpl | Compare Logical | | | X | 31 | 38 | | | 208 | V | lvsr | Load Vector for Shift Right Indexed | | | X | 31 | 39 | | | 203 | V | lvehx | Load Vector Element Halfword Indexed | | | XO | 31 | 40 | SR | | 63 | В | subf[o][.] | Subtract From | | | X | 31 | 53 | | | 50 | 64 | ldux | Load Doubleword with Update Indexed | | | X<br>X | 31 | 54 | | | 436 | В | dcbst | Data Cache Block Store | | | X | 31<br>31 | 55<br>58 | SR | | 48<br>81 | В<br>64 | lwzux<br>cntlzd[.] | Load Word and Zero with Update Indexed Count Leading Zeros Doubleword | | | X | 31 | 60 | SR | | 78 | B | andc[.] | AND with Complement | | | X | 31 | 62 | 0.1 | | 449 | WT | wait | Wait | | | Χ | 31 | 63 | | Р | 631 | E.PD | dcbstep | Data Cache Block Store by External PID | | - | X | 31 | 68 | | | 74 | 64 | td | Trap Doubleword | | | X | 31 | 71 | 0.0 | | 203 | V | lvewx | Load Vector Element Word Indexed | | | XO | 31 | 73 | SR<br>SR | | 69<br>405 | 64 | mulhd[.] | Multiply High Doubleword | | | XO<br>XO | 31<br>31 | 74<br>75 | SR | Н | 495<br>67 | BCDA<br>B | addg6s<br>mulhw[.] | Add and Generate Sixes Multiply High Word | | | X | 31 | 78 | SIX | | 349 | LMV | dlmzb[.] | Determine Leftmost Zero Byte | | | X | 31 | 83 | | Р | 503, | В | mfmsr | Move From Machine State Register | | | | | | | | 625 | | | 3 | | | X | 31 | 84 | | | 444 | 64 | Idarx | Load Doubleword And Reserve Indexed | | | X | 31 | 86 | | | 437 | В | dcbf | Data Cache Block Flush | | | X | 31 | 87 | | | 46 | В | lbzx | Load Byte and Zero Indexed | | | X | 31 | 95 | | Р | 627 | E.PD | lbepx | Load Byte by External Process ID Indexed | | | X<br>XO | 31<br>31 | 103<br>104 | SR | | 204<br>66 | V<br>B | lvx<br>neg[o][.] | Load Vector Indexed<br>Negate | | | X | 31 | 119 | SIX | | 45 | В | lbzux | Load Byte and Zero with Update Indexed | | | X | 31 | 122 | | | 81 | B.in | popentb | Population Count Bytes | | | X | 31 | 124 | SR | | 78 | В | nor[.] | NOR | | I | X | 31 | 127 | | Р | 632 | E.PD | dcbfep | Data Cache Block Flush by External PID | | | X | 31 | 131 | | Р | 626 | _E | wrtee | Write MSR External Enable | | | X | 31 | 134 | | М | 655 | ECL | dcbtstls | Data Cache Block Touch for Store and Lock Set | | | X<br>XO | 31<br>31 | 135<br>136 | SR | | 206<br>65 | V<br>B | stvebx | Store Vector Element Byte Indexed Subtract From Extended | | | XO | 31 | 138 | SR | | 65 | В | subfe[o][.]<br>adde[o][.] | Add Extended | | | XFX | 31 | 144 | 0.1 | | 95 | В | mtcrf | Move To Condition Register Fields | | | XFX | 31 | 144 | | | 96 | B.in | mtocrf | Move To One Condition Register Field | | | Χ | 31 | 146 | | Р | 625 | E | mtmsr | Move To Machine State Register | | | X | 31 | 146 | | Р | 501 | S | mtmsr | Move To Machine State Register | | | X | 31 | 149 | | | 54 | 64 | stdx | Store Doubleword Indexed Store Word Conditional Indexed | | | X<br>X | 31<br>31 | 150<br>151 | | | 442<br>53 | B<br>B | stwcx.<br>stwx | Store Word Conditional Indexed Store Word Indexed | | ı | X | 31 | 154 | | | 80 | В | prtyw | Parity Word | | | X | 31 | 157 | | Р | 630 | E.PD | stdepx | Store Doubleword by External Process ID Indexed | | | X | 31 | 159 | | Р | 630 | E.PD | stwepx | Store Word by External Process ID Indexed | | | Χ | 31 | 163 | | Р | 626 | E | wrteei | Write MSR External Enable Immediate | | | X | 31 | 166 | | М | 655 | ECL | dcbtls | Data Cache Block Touch and Lock Set | | | X | 31 | 167 | | Р | 206 | V | stvehx | Store Vector Element Halfword Indexed | | | X<br>X | 31<br>31 | 178<br>181 | | ۲ | 502<br>54 | S<br>64 | mtmsrd<br>stdux | Move To Machine State Register Doubleword<br>Store Doubleword with Update Indexed | | | X | 31 | 183 | | | 53 | 0 <del>4</del><br>В | stwux | Store Word with Update Indexed | | Ī | X | 31 | 186 | | | 80 | 64 | prtyd | Parity Doubleword | | | X | 31 | 199 | | | 207 | V | stvewx | Store Vector Element Word Indexed | | | XO | 31 | 200 | SR | | 66 | В | subfze[o][.] | Subtract From Zero Extended | | | _ | Op | code | e ⁻. | ۲, | | 4 | | | |---|------------|----------|------------|---------------------------|--------|------------------|--------------------|----------------------------|---------------------------------------------------------------------------------------------| | | Form | Pri | | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | _ | XO | 31 | 202 | SR | ) | 66 | В | addze[o][.] | Add to Zero Extended | | | X | 31<br>31 | 206<br>210 | 32 | P<br>P | 721<br>537 | E.PC<br>S | msgsnd | Message Send | | | X | 31 | 214 | 32 | г | 444 | 64 | mtsr<br>stdcx. | Move To Segment Register Store Doubleword Conditional Indexed | | | X | 31 | 215 | | | 51 | В | stbx | Store Byte Indexed | | I | X | 31 | 223 | | Ρ | 629 | E.PD | stbepx | Store Byte by External Process ID Indexed | | | X | 31 | 230 | | М | 657 | ECL | icblc | Instruction Cache Block Lock Clear | | | X | 31 | 231 | | | 204 | V | stvx | Store Vector Indexed | | | XO | 31 | 232 | SR | | 65 | B | subfme[o][.] | Subtract From Minus One Extended | | | XO<br>XO | 31<br>31 | 233<br>234 | SR<br>SR | | 69<br>65 | 64<br>B | mulld[o][.]<br>addme[o][.] | Multiply Low Doubleword Add to Minus One Extended | | | XO | 31 | 235 | SR | | 67 | В | mullw[o][.] | Multiply Low Word | | I | X | 31 | 238 | 0.1 | Р | 721 | E.PC | msgclr | Message Clear | | • | X | 31 | 242 | 32 | Ρ | 537 | S | mtsrin | Move To Segment Register Indirect | | | X | 31 | 246 | | | 435 | В | dcbtst | Data Cache Block Touch for Store | | _ | X | 31 | 247 | | _ | 51 | В | stbux | Store Byte with Update Indexed | | | X<br>X | 31 | 255 | | P<br>P | 633 | E.PD | dcbtstep | Data Cache Block Touch for Store by External PID | | | X | 31<br>31 | 259<br>263 | | Р | 625<br>637 | E<br>E.PD | mfdcrx<br>Ivepxl | Move From Device Control Register Indexed<br>Load Vector by External Process ID Indexed LRU | | L | XO | 31 | 266 | SR | • | 63 | В | add[o][.] | Add | | | X | 31 | 274 | 64 | Н | 541 | S | tlbiel | TLB Invalidate Entry Local | | | X | 31 | 275 | | | 97 | Ε | mfapidi | Move From APID Indirect | | | X | 31 | 278 | | | 434 | В | dcbt | Data Cache Block Touch | | | X<br>X | 31<br>31 | 279<br>282 | | Н | 46<br>494 | B<br>BCDA | lhzx<br>cdtbcd | Load Halfword and Zero Indexed | | ı | X | 31 | 284 | SR | П | 78 | В | eqv[.] | Convert Declets To Binary Coded Decimal<br>Equivalent | | ı | EVX | 31 | 285 | Oix | Р | 636 | E.PD | evlddepx | Vector Load Doubleword into Doubleword by External | | • | | | | | | | | | Process ID Indexed | | | X | 31 | 287 | | Р | 627 | E.PD | lhepx | Load Halfword by External Process ID Indexed | | | Х | 31 | 291 | | Р | 97 | E | mfdcrux | Move From Device Control Register User-mode Indexed | | ı | Χ | 31 | 295 | | Р | 637 | E.PD | lvepx | Load Vector by External Process ID Indexed | | • | X | 31 | 306 | 64 | Н | 539 | S | tlbie | TLB Invalidate Entry | | | X | 31 | 310 | | | 456 | EC | eciwx | External Control In Word Indexed | | | X<br>X | 31<br>31 | 311<br>314 | | Н | 46<br>494 | B<br>BCDA | lhzux<br>cbcdtd | Load Halfword and Zero with Update Indexed<br>Convert Binary Coded Decimal to Declets | | I | X | 31 | 316 | SR | ٠. | 77 | В | xor[.] | XOR | | ı | X | 31 | 319 | 0.1 | Ρ | 631 | E.PD | dcbtep | Data Cache Block Touch by External PID | | | XFX | 31 | 323 | | Р | 625 | Ε | mfdcr | Move From Device Control Register | | | X | 31 | 326 | | Р | 730 | E.CD | dcread | Data Cache Read [Alternative Encoding] | | ı | XFX<br>XFX | 31<br>31 | 334<br>339 | | 00 | 756<br>94,4 | E.PM<br>B | mfpmr<br>mfspr | Move From Performance Monitor Register Move From Special Purpose Register | | | AFA | 31 | 339 | | | 54, <del>4</del> | ь | шърг | INOVE FIORI Special Fulpose Register | | | Χ | 31 | 341 | | | 49 | 64 | lwax | Load Word Algebraic Indexed | | | X | 31 | 343 | | | 47 | В | lhax | Load Halfword Algebraic Indexed | | _ | X | 31 | 359 | | | 204 | V | lvxl | Load Vector Indexed LRU | | I | X | 31 | 370 | | Н | 542 | S | tlbia | TLB Invalidate All | | | XFX<br>X | 31<br>31 | 371<br>373 | | | 451<br>49 | S.out<br>64 | mftb<br>lwaux | Move From Time Base<br>Load Word Algebraic with Update Indexed | | | X | 31 | 375 | | | 49<br>47 | <del>04</del><br>В | Ihaux | Load Word Algebraic with Opdate Indexed | | | X | 31 | 387 | | Р | 624 | Ē | mtdcrx | Move To Device Control Register Indexed | | I | X | 31 | 390 | | М | 656 | ECL | dcblc | Data Cache Block Lock Clear | | _ | X | 31 | 402 | | Р | 533 | S | slbmte | SLB Move To Entry | | | X | 31 | 407 | SR | | 52<br>79 | B<br>B | sthx | Store Halfword Indexed | | | X<br>EVX | 31<br>31 | 412<br>413 | SK | Р | 78<br>636 | E.PD | orc[.]<br>evstddepx | OR with Complement Vector Store Doubleword into Doubleword by External | | ı | " | 01 | 710 | | ' | 000 | | σνοιαάσρα | Process ID Indexed | | | XS | 31 | 413 | SR | | 91 | 64 | sradi[.] | Shift Right Algebraic Doubleword Immediate | | | X | 31 | 415 | | Р | 629 | E.PD | sthepx | Store Halfword by External Process ID Indexed | | | X | 31 | 419 | | _ | 97 | E | mtdcrux | Move To Device Control Register User-mode Indexed | | | Χ | 31 | 434 | | Р | 531 | S | slbie | SLB Invalidate Entry | | | _ | Ор | code | ω | | | | | | |---|----------|----------|------------|---------------------------|--------|-------------|------------------|----------------------|----------------------------------------------------------------------------------------| | | × Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | riv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | | Υ<br>Υ | 31 | 438 | 2 0 | ъ. | 456 | EC | ecowx | External Control Out Word Indexed | | | X | 31 | 439 | | | 52 | В | sthux | Store Halfword with Update Indexed | | | Χ | 31 | 444 | SR | | 77 | В | or[.] | OR | | | XFX | 31 | 451 | | Р | 624 | Е | mtdcr | Move To Device Control Register | | ı | X | 31 | 454 | 0.0 | Р | 727 | E.CI | dci | Data Cache Invalidate | | | XO<br>XO | 31<br>31 | 457<br>459 | SR<br>SR | | 70<br>68 | 64<br>B | divdu[o][.] | Divide Doubleword Unsigned Divide Word Unsigned | | | XFX | 31 | 462 | SK | 0 | 756 | E.PM | divwu[o][.]<br>mtpmr | Move To Performance Monitor Register | | ı | XFX | 31 | 467 | | ŏ | 93 | B | mtspr | Move To Special Purpose Register | | | X | 31 | 470 | | P | 652 | E | dcbi | Data Cache Block Invalidate | | | X | 31 | 476 | SR | | 77 | В | nand[.] | NAND | | | X | 31 | 486 | | Р | 730 | E.CD | dcread | Data Cache Read | | ı | X<br>X | 31<br>31 | 486<br>487 | | M | 656<br>207 | ECL<br>V | icbtls<br>stvxl | Instruction Cache Block Touch and Lock Set<br>Store Vector Indexed LRU | | | хо | 31 | 489 | SR | | 70 | v<br>64 | divd[o][.] | Divide Doubleword | | | XO | 31 | 491 | SR | | 68 | В | divw[o][.] | Divide Word | | | X | 31 | 498 | | Р | 532 | S | slbia | SLB Invalidate All | | ı | Χ | 31 | 508 | | | 79 | В | cmpb | Compare Bytes | | | X | 31 | 512 | | | 97 | E | mcrxr | Move to Condition Register from XER | | | X<br>X | 31<br>31 | 533<br>534 | | | 59<br>55 | MA<br>B | Iswx | Load String Word Indexed<br>Load Word Byte-Reverse Indexed | | | X | 31 | 535 | | | 122 | FP | lwbrx<br>lfsx | Load Floating-Point Single Indexed | | | X | 31 | 536 | SR | | 88 | В | srw[.] | Shift Right Word | | | Χ | 31 | 539 | SR | | 90 | 64 | srd[.] | Shift Right Doubleword | | | Χ | 31 | 566 | | Н | 542, | В | tlbsync | TLB Synchronize | | | | | | | | 659, | | | | | | V | 24 | FC7 | | | 749 | ED | If a | Load Floating Daint Cingle with Undete Indexed | | | X<br>X | 31<br>31 | 567<br>595 | 32 | Р | 122<br>538 | FP<br>S | lfsux<br>mfsr | Load Floating-Point Single with Update Indexed Move From Segment Register | | | X | 31 | 597 | 32 | ' | 59 | MA | Iswi | Load String Word Immediate | | | Χ | 31 | 598 | | | 446 | В | sync | Synchronize | | | Χ | 31 | 599 | | | 119 | FP | lfdx | Load Floating-Point Double Indexed | | | Х | 31 | 607 | | Р | 635 | E.PD | Ifdepx | Load Floating-Point Double by External Process ID | | | Х | 31 | 631 | | | 119 | FP | lfdux | Indexed | | | x | 31 | 659 | 32 | Р | 538 | S | mfsrin | Load Floating-Point Double with Update Indexed Move From Segment Register Indirect | | | X | 31 | 661 | 02 | • | 60 | MA | stswx | Store String Word Indexed | | | X | 31 | 662 | | | 55 | В | stwbrx | Store Word Byte-Reverse Indexed | | | X | 31 | 663 | | | 122 | FP | stfsx | Store Floating-Point Single Indexed | | | X<br>X | 31<br>31 | 695<br>725 | | | 122<br>60 | FP<br>MA | stfsux | Store Floating-Point Single with Update Indexed | | | X | 31 | 725<br>727 | | | 123 | MA<br>FP | stswi | Store String Word Immediate Store Floating-Point Double Indexed | | 1 | X | 31 | 735 | | Р | 635 | E.PD | stfdepx | Store Floating-Point Double by External Process ID | | • | | | | | | | | ' | Indexed | | | X | 31 | 758 | | | 433 | E | dcba | Data Cache Block Allocate | | | X | 31 | 759<br>775 | | _ | 123 | FP | stfdux | Store Floating-Point Double with Update Indexed | | ı | X<br>X | 31<br>31 | 775<br>786 | | P<br>P | 638<br>658, | E.PD<br>E | stvepxl<br>tlbivax | Store Vector by External Process ID Indexed LRU TLB Invalidate Virtual Address Indexed | | | ^ | 31 | 700 | | ' | 747 | L | libivax | TED ITVAIIdate VIItual Address Itidexed | | ı | Х | 31 | 789 | | Н | 491 | S | lwzcix | Load Word and Zero Caching Inhibited Indexed | | - | Χ | 31 | 790 | | | 55 | В | Ihbrx | Load Halfword Byte-Reverse Indexed | | | X | 31 | 791 | | | 125 | FP.out | Ifdpx | Load Floating-Point Double Pair Indexed | | | X<br>X | 31 | 792<br>704 | SR | | 89<br>01 | B<br>64 | sraw[.] | Shift Right Algebraic Word | | ı | X | 31<br>31 | 794<br>807 | SR | Р | 91<br>638 | 64<br>E.PD | srad[.]<br>stvepx | Shift Right Algebraic Doubleword Store Vector by External Process ID Indexed | | | X | 31 | 821 | | H | 491 | S | Ihzcix | Load Halfword and Zero Caching Inhibited Indexed | | | Χ | 31 | 824 | SR | | 89 | В | srawi[.] | Shift Right Algebraic Word Immediate | | | X | 31 | 851 | | Р | 534 | S | slbmfev | SLB Move From Entry VSID | | | X | 31 | 853 | | Н | 491 | S | Ibzcix | Load Byte and Zero Caching Inhibited Indexed | | | X | 31 | 854 | | | 448 | S<br>E | eieio | Enforce In-order Execution of I/O | | | Χ | 31 | 854 | | | 448 | | mbar | Memory Barrier | | | _ | Opcode | | <u>o</u> ⊤. | ۲, | | | | | |---|----------|----------|------------|---------------------------|--------|-------------|------------------|--------------------|------------------------------------------------------------------------------| | | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri∿ | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | I | Χ | 31 | 855 | | | 120 | FP | Ifiwax | Load Floating-Point as Integer Word Algebraic Indexed | | I | X<br>X | 31<br>31 | 885<br>914 | | H<br>P | 491<br>659, | S<br>E | ldcix<br>tlbsx | Load Doubleword Caching Inhibited Indexed TLB Search Indexed | | | | 0. | 0 | | • | 748 | _ | libox | 125 Coal of Findoxoa | | | Χ | 31 | 915 | | Р | 534 | S | slbmfee | SLB Move From Entry ESID | | I | X | 31 | 917 | | Η | 492 | S | stwcix | Store Word Caching Inhibited Indexed | | | X<br>X | 31<br>31 | 918<br>919 | | | 55<br>125 | B<br>FP.out | sthbrx<br>stfdpx | Store Halfword Byte-Reverse Indexed Store Floating-Point Double Pair Indexed | | I | X | 31 | 922 | SR | | 79 | B | extsh[.] | Extend Sign Halfword | | | Χ | 31 | 946 | | Р | 658, | Е | tlbre | TLB Read Entry | | _ | \ | 0.4 | 0.40 | | | 748 | 0 | a · | | | ı | X | 31<br>31 | 949<br>954 | SR | Н | 492<br>79 | S<br>B | sthcix<br>extsb[.] | Store Halfword Caching Inhibited Indexed Extend Sign Byte | | ı | X | 31 | 966 | Six | Р | 727 | E.CI | ici | Instruction Cache Invalidate | | • | Χ | 31 | 978 | | Р | 660, | E | tlbwe | TLB Write Entry | | _ | | | | | | 749 | _ | | | | | X | 31 | 979 | SR | P<br>H | 535 | S<br>S | slbfee. | SLB Find Entry ESID | | ı | X | 31<br>31 | 981<br>982 | | П | 492<br>428 | S<br>B | stbcix<br>icbi | Store Byte Caching Inhibited Indexed<br>Instruction Cache Block Invalidate | | | X | 31 | 983 | | | 124 | FP | stfiwx | Store Floating-Point as Integer Word Indexed | | _ | X | 31 | 986 | SR | | 81 | 64 | extsw[.] | Extend Sign Word | | | X<br>X | 31<br>31 | 991<br>998 | | P<br>P | 634<br>731 | E.PD<br>E.CD | icbiep<br>icread | Instruction Cache Block Invalidate by External PID Instruction Cache Read | | | X | 31 | 1013 | | Н | 492 | S S | stdcix | Store Doubleword Caching Inhibited Indexed | | • | Χ | 31 | 1014 | | • • | 436 | B | dcbz | Data Cache Block set to Zero | | | X | 31 | 1023 | | Р | 634 | E.PD | dcbzep | Data Cache Block set to Zero by External PID | | | D<br>D | 32<br>33 | | | | 48<br>48 | B<br>B | lwz<br>lwzu | Load Word and Zero<br>Load Word and Zero with Update | | | D | 34 | | | | 45 | В | lbz | Load Byte and Zero | | | D | 35 | | | | 45 | В | lbzu | Load Byte and Zero with Update | | | D | 36 | | | | 53 | В | stw | Store Word | | | D<br>D | 37<br>38 | | | | 53<br>51 | B<br>B | stwu<br>stb | Store Word with Update<br>Store Byte | | | D | 39 | | | | 51 | В | stbu | Store Byte with Update | | | D | 40 | | | | 46 | В | lhz | Load Halfword and Zero | | | D<br>D | 41<br>42 | | | | 46<br>47 | B<br>B | lhzu<br>lha | Load Halfword Algebraia | | | D | 43 | | | | 47 | В | Ihau | Load Halfword Algebraic Load Halfword Algebraic with Update | | | D | 44 | | | | 52 | В | sth | Store Halfword | | | D | 45 | | | | 52 | В | sthu | Store Halfword with Update | | | D | 46<br>47 | | | | 56<br>57 | B<br>B | lmw<br>stmw | Load Multiple Word<br>Store Multiple Word | | | D<br>D | 47 | | | | 122 | Б<br>FP | Ifs | Load Floating-Point Single | | | D | 49 | | | | 122 | FP | Ifsu | Load Floating-Point Single with Update | | | D | 50 | | | | 119 | FP | lfd | Load Floating-Point Double | | | D<br>D | 51<br>52 | | | | 119<br>122 | FP<br>FP | Ifdu<br>stfs | Load Floating-Point Double with Update<br>Store Floating-Point Single | | | D | 53 | | | | 122 | FP | stfsu | Store Floating-Point Single with Update | | | D | 54 | | | | 123 | FP | stfd | Store Floating-Point Double | | | D | 55 | | | ) | 123 | FP | stfdu | Store Floating-Point Double with Update | | | DQ<br>DS | 56<br>57 | 0 | | Р | 493<br>125 | LSQ<br>FP.out | lq<br>lfdp | Load Quadword<br>Load Floating-Point Double Pair | | I | DS | 58 | 0 | | | 50 | 64 | Id | Load Ploating-Form Bouble Fair | | | DS | 58 | 1 | | | 50 | 64 | ldu | Load Doubleword with Update | | | DS | 58 | 2 | | | 49 | 64<br>DED | lwa | Load Word Algebraic | | | X<br>Z | 59<br>59 | 2<br>3 | | | 163<br>174 | DFP<br>DFP | dadd<br>dqua | DFP Add<br>DFP Quantize | | ı | A | 59 | 18 | | | 128 | FP[R] | fdivs[.] | Floating Divide Single | | | Α | 59 | 20 | | | 127 | FP[R] | fsubs[.] | Floating Subtract Single | | | A | 59 | 21 | | | 127 | FP[R] | fadds[.] | Floating Add Single | | | Α | 59 | 22 | | | 129 | FP[R] | fsqrts[.] | Floating Square Root Single | | ٤ | Op | code | .₁e | ر1 | _ | - 1 | | | |---------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|---------------------------|----------|-------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Pri | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | A<br>A<br>A<br>A<br>A<br>A<br>X<br>Z<br>Z23<br>Z23<br>Z<br>Z23<br>X<br>X<br>Z23<br>Z23<br>Z23 | 59 59 59 59 59 59 59 59 59 59 59 59 59 5 | 24<br>25<br>26<br>28<br>29<br>30<br>31<br>34<br>35<br>66<br>67<br>98<br>99<br>130<br>162<br>194<br>226 | | <b>4</b> | 129<br>128<br>130<br>132<br>132<br>133<br>133<br>165<br>176<br>190<br>173<br>190<br>179<br>169<br>171<br>170 | FP[R] FP[R] FP[R] DFP | fres[.] fmuls[.] frsqrtes[.] fmsubs[.] fmadds[.] fnmadds[.] dmul drrnd dscli[.] dquai[.] dscri drintx[.] dcmpo dtstex dtstdc dtstdg | Floating Reciprocal Estimate Single Floating Multiply Single Floating Reciprocal Square Root Estimate Single Floating Multiply-Subtract Single Floating Multiply-Add Single Floating Negative Multiply-Subtract Single Floating Negative Multiply-Add Single DFP Multiply DFP Reround DFP Shift Significand Left Immediate DFP Quantize Immediate DFP Shift Significand Right Immediate DFP Round To FP Integer With Inexact DFP Compare Ordered DFP Test Exponent DFP Test Data Class DFP Test Data Group | | Z23<br>X<br>X<br>X<br>X<br>X<br>X<br>X<br>X<br>X<br>X<br>DS<br>DS<br>DS<br>DS<br>X<br>X<br>X<br>X | 59<br>59<br>59<br>59<br>59<br>59<br>59<br>59<br>59<br>59<br>61<br>62<br>62<br>63<br>63<br>63<br>63<br>63<br>63 | 227<br>258<br>290<br>322<br>354<br>514<br>546<br>642<br>674<br>770<br>834<br>866<br>-<br>0<br>1<br>2<br>0<br>2<br>3<br>8<br>12<br>14 | | Р | 181<br>183<br>185<br>187<br>188<br>163<br>166<br>168<br>172<br>184<br>187<br>188<br>125<br>54<br>493<br>138<br>163<br>174<br>126<br>134<br>135<br>136 | DFP<br>DFP<br>DFP<br>DFP<br>DFP<br>DFP<br>DFP<br>DFP<br>DFP<br>FP.out<br>64<br>LSQ<br>FPFP<br>FP[R]<br>FP[R]<br>FP[R] | drintn[.] dctdp dctfix ddedpd dxex dsub ddiv dcmpu dtstsf drsp denbcd diex stfdp std stdu stq fcmpu daddq dquaq[.] fcpsgn[.] frsp[.] fctiwz[.] | DFP Round To FP Integer Without Inexact DFP Convert To DFP Long DFP Convert To Fixed DFP Decode DPD To BCD DFP Extract Biased Exponent DFP Subtract DFP Divide DFP Compare Unordered DFP Test Significance DFP Round To DFP Short DFP Encode BCD To DPD DFP Insert Biased Exponent Store Floating-Point Double Pair Store Doubleword Store Doubleword with Update Store Quadword Floating Compare Unordered DFP Add Quad DFP Quantize Quad Floating Copy Sign Floating Convert To Integer Word Floating Convert To Integer Word with round toward | | A<br>A<br>A<br>A<br>A<br>A<br>A<br>A<br>A<br>X<br>X<br>Z<br>Z<br>Z<br>Z<br>Z<br>Z<br>Z<br>Z<br>Z | 63 63 63 63 63 63 63 63 63 63 63 63 63 6 | 18<br>20<br>21<br>22<br>23<br>24<br>25<br>26<br>28<br>29<br>30<br>31<br>32<br>34<br>35<br>38<br>40<br>64<br>66<br>67 | | | 128<br>127<br>127<br>129<br>139<br>129<br>128<br>130<br>132<br>133<br>133<br>138<br>165<br>176<br>142<br>126<br>140<br>190<br>173 | FP[R] | fdiv[.] fsub[.] fsub[.] fsqrt[.] fsel[.] fre[.] fmul[.] frsqrte[.] fmsub[.] fmadd[.] fnmsub[.] fnmadd[.] fcmpo dmulq drrndq[.] mtfsb1[.] fneg[.] mcrfs dscliq[.] dquaiq[.] | Zero Floating Divide Floating Subtract Floating Add Floating Square Root Floating Select Floating Reciprocal Estimate Floating Multiply Floating Reciprocal Square Root Estimate Floating Multiply-Subtract Floating Multiply-Subtract Floating Multiply-Add Floating Negative Multiply-Subtract Floating Negative Multiply-Add Floating Compare Ordered DFP Multiply Quad DFP Reround Quad Move To FPSCR Bit 1 Floating Negate Move to Condition Register from FPSCR DFP Shift Significand Left Immediate Quad DFP Quantize Immediate Quad | | _ | Ор | code | <u>o</u> −. | - | | | | | |------|-----|------|---------------------------|------|------|------------------|------------|---------------------------------------------------------------| | Form | Pri | Ext | Mode<br>Dep. <sup>1</sup> | Priv | Page | Cat <sup>1</sup> | Mnemonic | Instruction | | X | 63 | 70 | | | 142 | FP[R] | mtfsb0[.] | Move To FPSCR Bit 0 | | X | 63 | 72 | | | 126 | FP[R] | fmr[.] | Floating Move Register | | Z | 63 | 98 | | | 190 | DFP | dscriq | DFP Shift Significand Right Immediate Quad | | Z23 | 63 | 99 | | | 179 | DFP | drintxq[.] | DFP Round To FP Integer With Inexact Quad | | X | 63 | 130 | | | 169 | DFP | dcmpoq | DFP Compare Ordered Quad | | X | 63 | 134 | | | 141 | FP[R] | mtfsfi[.] | Move To FPSCR Field Immediate | | X | 63 | 136 | | | 126 | FP[R] | fnabs[.] | Floating Negative Absolute Value | | X | 63 | 162 | | | 171 | DFP | dtstexq | DFP Test Exponent Quad | | Z23 | 63 | 194 | | | 170 | DFP | dtstdcq | DFP Test Data Class Quad | | Z23 | 63 | 226 | | | 170 | DFP | dtstdgq | DFP Test Data Group Quad | | Z23 | 63 | 227 | | | 181 | DFP | drintnq[.] | DFP Round To FP Integer Without Inexact Quad | | X | 63 | 258 | | | 183 | DFP | dctqpq | DFP Convert To DFP Extended | | X | 63 | 264 | | | 126 | FP[R] | fabs[.] | Floating Absolute Value | | X | 63 | 290 | | | 185 | DFP | dctfixq | DFP Convert To Fixed Quad | | X | 63 | 322 | | | 187 | DFP | ddedpdq | DFP Decode DPD To BCD Quad | | X | 63 | 354 | | | 188 | DFP | dxexq | DFP Extract Biased Exponent Quad | | X | 63 | 392 | | | 137 | FP[R].in | frin[.] | Floating Round to Integer Nearest | | X | 63 | 424 | | | 137 | FP[R].in | friz[.] | Floating Round to Integer Toward Zero | | Χ | 63 | 456 | | | 137 | FP[R].in | frip[.] | Floating Round to Integer Plus | | X | 63 | 488 | | | 137 | FP[R].in | frim[.] | Floating Round to Integer Minus | | X | 63 | 514 | | | 163 | DFP | dsubq | DFP Subtract Quad | | X | 63 | 546 | | | 166 | DFP | ddivq | DFP Divide Quad | | X | 63 | 583 | | | 140 | FP[R] | mffs[.] | Move From FPSCR | | X | 63 | 642 | | | 169 | DFP | dcmpuq | DFP Compare Unordered Quad | | X | 63 | 674 | | | 172 | DFP | dtstsfq | DFP Test Significance Quad | | XFL | 63 | 711 | | | 141 | FP[R] | mtfsf[.] | Move To FPSCR Fields | | X | 63 | 770 | | | 184 | DFP | drdpq | DFP Round To DFP Long | | X | 63 | 802 | | | 185 | DFP | dcffixq | DFP Convert From Fixed Quad | | X | 63 | 814 | | | 134 | FP[R] | fctid[.] | Floating Convert To Integer Doubleword | | Х | 63 | 815 | | | 135 | FP[R] | fctidz[.] | Floating Convert To Integer Doubleword with round toward Zero | | X | 63 | 834 | | | 187 | DFP | denbcdg | DFP Encode BCD To DPD Quad | | X | 63 | 846 | | | 136 | FP[R] | fcfid[.] | Floating Convert From Integer Doubleword | | X | 63 | 866 | | | 188 | DFP | diexq | DFP Insert Biased Exponent Quad | | | | | <del>., 1</del> | _ | | | | mps on page 905 and the key to the category column in | See the key to the mode dependency and privilege columns on page 905 and the key to the category column in Section 1.3.5 of Book I. ## Index | Α | BA instruction field 763, 764<br>BB field 18 | |-----------------------------------------------|-------------------------------------------------------------------| | a bit 32 | BC field 18 | | A-form 17 | BD field 18 | | AA field 18 | BD instruction field 764 | | address 23 | BE | | | | | effective 26 | See Machine State Register BF field 18 | | effective address 505, 639 | | | real 506, 640 | BF instruction field 764 | | address compare 506, 559, 566 | BFA field 18 | | address translation 522, 644 | BFA instruction field 764 | | EA to VA 508 | BH field 18 | | esid to vsid 508 | Bl field 18 | | overview 514 | block 408 | | PTE 540 500 | BO field 18, 32 | | page table entry 518, 522 | boundedly undefined 4 | | Reference bit 522 | Branch Trace 565 | | RPN | Bridge 536 | | real page number 517 | Segment Registers 536 | | VA to RA 517 | SR 536 | | VPN | brinc 268 | | virtual page number 517 | BT field 18 | | 32-bit mode 508 | bytes 4 | | address wrap 506, 640 | | | addresses | С | | accessed by processor 512 | • | | implicit accesses 512 | C 102 | | interrupt vectors 512 | CA 42 | | with defined uses 512 | cache management instructions 427 | | addressing mode | cache model 409 | | D-mode 767 | cache parameters 425 | | aliasing 413 | Caching Inhibited 411 | | alignment | Change bit 522 | | effect on performance 421, 581, 703 | CIA 7 | | Alignment interrupt 562, 601, 677 | Come-From Address Register 497, 859 | | assembler language | consistency 413 | | extended mnemonics 383, 589, 733 | context | | mnemonics 383, 589, 733 | definition 467, 607 | | symbols 383, 589, 733 | synchronization 469, 609 | | atomic operation 415 | Control Register 488 | | atomicity 409 | Count Register 497, 621, 774, 859 | | single-copy 409 | | | Auxiliary Processor 4 | CR 30<br>Critical Input interrupt 674 | | Auxiliary Processor Unavailable interrupt 679 | Critical Input interrupt 674 Critical Save/Restore Register 1 663 | | · · · · · · · · · · · · · · · · · · · | | | B | CSRR1 663 | | В | CTR 31, 774 | | D forms 44 | CTRL See Control Register | | B-form 14 | See Control Register | | BA field 18 | Current Instruction Address 479, 613 | | D | eciwx instruction 455, 456, 559, 562, 563, 566, 583 | |----------------------------------------------------------------------|-----------------------------------------------------| | | ecowx instruction 455, 456, 559, 562, 563, 566, 583 | | D field 18 | EE . | | D instruction field 764 | See Machine State Register | | D-form 15 | effective address 26, 505, 514, 639 | | D-mode addressing mode 767 | size 508 | | DABR interrupt 581 | translation 514 | | DABR(X) | eieio instruction 413, 448, 543 | | See Data Breakpoint Register (Extension) | emulation assist 468, 608 | | DAR | Endianness 412 | | See Data Address Register | EQ 30, 31 | | data access 506, 640 | ESR 665 | | Data Address Breakpoint Register (Extension) 475, 498, 581, 586, 861 | evabs 268<br>evaddiw 268 | | data address compare 559, 566 | evaddsmiaaw 268 | | Data Address Register 497, 548, 559, 560, 562, 566, | evaddssiaaw 269 | | 568, 859 | eviwhex 277 | | data cache instructions 429 | exception 662 | | Data Exception Address Register 664 | alignment exception 677 | | data exception address register 664 | critical input exception 674 | | Data Segment interrupt 560, 567 | data storage exception 675 | | data storage 407 | external input exception 676 | | Data Storage interrupt 559, 566, 675 | illegal instruction exception 678 | | Data Storage Interrupt Status Register 497, 548, 549, | instruction storage exception 676 | | 559, 562, 563, 566, 601, 859 | instruction TLB miss exception 681 | | Alignment interrupt 601 | machine check exception 674 | | Data TLB Error interrupt 681 | privileged instruction exception 678 | | dcba instruction 433, 652 | program exception 678 | | dcbf instruction 437 | system call exception 679 | | dcbst instruction 417, 436, 559, 566 | trap exception 678 | | dcbt instruction 434, 631, 655 | exception priorities 689 | | dcbtls 656 | system call instruction 691 | | dcbtst instruction 435, 633, 655 | trap instructions 690 | | dcbz instruction 436, 529, 559, 562, 566, 601, 634, | Exception Syndrome Register 665 | | 652 | exception syndrome register 665 | | DEAR 664 | exception vector prefix register 664 | | Debug Interrupt 682 | Exceptions 661 | | DEC | exceptions | | See Decrementer | address compare 506, 559, 566 | | Decrementer 497, 576, 621, 697, 859 | definition 467, 607 | | Decrementer Interrupt 680 | page fault 506, 521, 559, 566, 639 | | Decrementer interrupt 501, 502, 565 | protection 506, 639 | | defined instructions 21 | segment fault 506 | | denormalization 106 | storage 506, 639 execution synchronization 469, 609 | | denormalized number 104<br>double-precision 106 | extended mnemonics 457 | | doublewords 4 | External Access Register 497, 559, 566, 583, 586, | | DQ field 18 | 621, 859 | | DQ-form 15 | External Control 455 | | DR DR | External Control instructions | | See Machine State Register | eciwx 456 | | DS field 19 | ecowx 456 | | DS-form 15 | External Input interrupt 676 | | DSISR | External interrupt 501, 502, 561 | | See Data Storage Interrupt Status Register | , , , | | J , | F | | E | Γ | | L | FE 31, 102 | | E (Enable bit) 583 | FEX 101 | | EA 26 | FE0 | | | | | See Machine State Register | VXSNAN 102 | |----------------------------------------------------|---------------------------------------------------| | FE1 | VXSOFT 102 | | See Machine State Register | VXSQRT 102 | | FG 31, 102 | VXVC 102 | | FI 102 | VXZDZ 102 | | Fixed-Interval Timer interrupt 680 | XE 103 | | Fixed-Point Exception Register 497, 621, 859 | XX 101 | | FL 30, 102 | ZE 103 | | FLM field 19 | ZX 101 | | floating-point | FR 102 | | denormalization 106 | FRA field 19 | | double-precision 106 | FRB field 19 | | exceptions 100, 108 | FRC field 19 | | inexact 113 | FRS field 19 | | invalid operation 110 | FRT field 19 | | overflow 111 | FU 31, 102 | | underflow 112 | FX 101 | | zero divide 111 | FXM field 19 | | execution models 113 | FXM instruction field 764 | | normalization 106 | | | number | G | | denormalized 104 | | | infinity 105 | GPR 42 | | normalized 104 | GT 30, 31 | | not a number 105 | Guarded 411 | | zero 104 | | | rounding 107 | Н | | sign 105 | •• | | single-precision 106 | halfwords 4 | | Floating-Point Unavailable interrupt 564, 569, 679 | hardware | | forward progress 417 | definition 468, 608 | | FP | hardware description language 7 | | See Machine State Register | hashed page table | | FPCC 102<br>FPR 100 | size 519 | | FPRF 102 | HDEC | | FPSCR 101 | See Hypervisor Decrementer | | C 102 | HDICE | | FE 102 | See Logical Partitioning Control Register | | FEX 101 | HEIR | | FG 102 | See Hypervisor Emulated Instruction Register | | FI 102 | hrfid instruction 477, 572 | | FL 102 | HRMOR | | FPCC 102 | See Hypervisor Real Mode Offset Register | | FPRF 102 | HSPRGn | | FR 102 | See software-use SPRs | | FU 102 | HTABORG 520 | | FX 101 | HTABSIZE 520 | | NI 103 | HV | | OE 103 | See Machine State Register | | OX 101 | hypervisor 471 | | RN 103 | Hypervisor Decrementer 497, 577, 586, 860 | | UE 103 | Hypervisor Decrementer interrupt 565 | | UX 101 | Hypervisor Emulated Instruction Register 498, 549 | | VE 103 | 860 | | VX 101 | Hypervisor Machine Status Save Restore Register | | VXCVI 103 | See HSRR0, HSRR1 | | VXIDI 102 | Hypervisor Machine Status Save Restore Register | | VXIMZ 102 | 0 548 | | VXISI 102 | Hypervisor Real Mode Offset Register 43, 474, 488 | | 586 | PMRN 20<br>RA 20 | |-------------------------------------------|----------------------------------------------------| | 1 | RB 20 | | I | Rc 20 | | I-form 14 | RS 20 | | icbi instruction 417, 428, 559, 566 | RT 20 | | icbt instruction 428 | SH 20 | | ILE | SI 20 | | See Logical Partitioning Control Register | SPR 20 | | illegal instructions 21 | SR 20 | | implicit branch 506, 640 | TBR 20 | | imprecise interrupt 550, 669 | TH 20 | | in-order operations 506, 640 | TO 20 | | inexact 113 | U 20 | | infinity 105 | UI 20 | | instruction 559, 566 | XO 21 | | field | formats 14–?? | | BA 763, 764 | A-form 17<br>B-form 14 | | BD 764 | D-form 15 | | BF 764 | DQ-form 15 | | BFA 764 | DS-form 15 | | D 764 | I-form 14 | | FXM 764 | M-form 17 | | L 764 | MD-form 17 | | LK 764 | MDS-form 17 | | Rc 764<br>SH 764 | SC-form 15 | | SH 764<br>SI 764 | VA-form 17 | | UI 765 | VX-form 17 | | WS 765 | X-form 16 | | fields 18–21 | XFL-form 16 | | AA 18 | XFX-form 16 | | BA 18 | XL-form 16 | | BB 18 | XO-form 17 | | BC 18 | XS-form 17 | | BD 18 | interrupt control 778 | | BF 18 | mtmsr 625 | | BFA 18 | partially executed 686 | | BH 18 | rfci 779 | | BI 18 | sc 778 | | BO 18 | instruction cache instructions 428 | | BT 18 | instruction fetch 506, 640 | | D 18 | effective address 506, 640 | | DQ 18 | implicit branch 506, 640<br>Instruction Fields 763 | | DS 19 | instruction restart 423 | | FLM 19 | Instruction Segment interrupt 561, 568 | | FRA 19 | instruction storage 407 | | FRB 19 | Instruction Storage interrupt 560, 676 | | FRC 19 | Instruction TLB Error Interrupt 681 | | FRS 19<br>FRT 19 | instruction-caused interrupt 550 | | FXI 19<br>FXM 19 | Instructions | | - | brinc 268 | | L 19<br>LEV 19 | dcbtls 656 | | LEV 19<br>LI 19 | evabs 268 | | LK 19 | evaddiw 268 | | MB 19 | evaddsmiaaw 268 | | ME 19 | evaddssiaaw 269 | | NB 20 | evlwhex 277 | | OE 20 | instructions | | classes 21 | sync 417, 446, 469, 522, 551 | |-----------------------------------------------------------|----------------------------------------------------------------| | dcba 433, 652 | tlbia 521, 542 | | dcbf 437 | tlbie 521, 539, 542, 544, 659 | | dcbst 417, 436, 559, 566 | tlbiel 541 | | dcbt 434, 631, 655 | tlbsync 542, 543, 659 | | dcbtst 435, 633, 655 | wrtee 626 | | dcbz 436, 529, 562, 601, 634, 652 | wrteei 626 | | defined 21 | interrupt 662 | | forms 21 | Alignment 562, 601 | | eciwx 455, 456, 559, 562, 563, 566, 583 | alignment interrupt 677 | | ecowx 455, 456, 559, 562, 563, 566, 583 | DABR 581 | | eieio 413, 448, 543 | Data Segment 560, 567 | | hrfid 477, 572 | Data Storage 559, 566 | | icbi 417, 428, 559, 566 | data storage interrupt 675 | | icbt 428 | Decrementer 501, 502, 565 | | illegal 21 | definition 467, 608 | | invalid forms 21 | External 501, 502, 561 | | isync 417, 440, 551 | external input interrupt 676 | | Idarx 415, 444, 551, 559, 562, 563, 566 | Floating-Point Unavailable 564, 569 | | Imw 562 | Hypervisor Decrementer 565 | | lookaside buffer 529 | imprecise 550, 669 | | lq 493, 562 | instruction | | lwa 563 | partially executed 686 | | lwarx 415, 442, 551, 559, 562, 563, 566, 601<br>lwaux 563 | Instruction Segment 561, 568 | | lwsync 446 | Instruction Storage 560, 676 instruction storage interrupt 676 | | lwz 601 | instruction TLB miss interrupt 681 | | mbar 448 | instruction-caused 550 | | mfmsr 477, 503, 625 | Machine Check 557 | | mfspr 500, 624 | machine check interrupt 674 | | mfsr 538 | masking 687 | | mfsrin 538 | guidelines for system software 689 | | mftb 451 | new MSR 555 | | mtmsr 477, 501, 572 | ordering 687, 689 | | mtmsrd 477, 502, 572 | guidelines for system software 689 | | address wrap 506, 640 | overview 547 | | mtspr 499, 622 | Performance Monitor 569 | | mtsr 537 | precise 550, 669 | | mtsrin 537 | priorities 571 | | optional | processing 551 | | See optional instructions | Program 563 | | preferred forms 21 | program interrupt 678 | | ptesync 446, 469, 543 | illegal instruction exception 678 | | reserved 21 | privileged instruction exception 678 | | rfci 614 | trap exception 678 | | rfid 417, 477, 480, 554, 572 | recoverable 554 | | rfmci 615 | synchronization 550 | | sc 479, 482, 565, 613 | System Call 565 | | slbia 532, 535 | system call interrupt 679 | | slbie 531 | System Reset 556 | | slbmfee 534 | system-caused 550 | | slbmfev 534 | Trace 565 | | slbmte 533 | type | | stdcx. 415, 444, 551, 559, 562, 563, 566 | Alignment 677 | | stmw 562 | Auxiliary Processor Unavailable 679 | | storage control 425, 529, 652 | Critical Input 674 Data Storage 675 | | stq 493, 562<br>stw 601 | Data Storage 675 Data TLB Error 681 | | stwcx. 415, 442, 551, 559, 562, 563, 566 | Debug 682 | | stwx 601 | Decrementer 680 | | External Input 676 | Link Register 497, 621, 774, 859 | |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------| | Fixed-Interval Timer 680 | LK field 19 | | Floating-Point Unavailable 679 | LK instruction field 764 | | Instruction TLB Error 681 | Imw instruction 562 | | Machine Check 674 | Logical Partition Identification Register 474 | | Program interrupt 678 | Logical Partitioning 471 | | System Call 679 | Logical Partitioning Control Register 426, 471, 498, | | Watchdog Timer 680 | 530, 586, 860 | | vector 551, 556 | HDICE Hypervisor Decrementer Interrupt Condition- | | interrupt and exception handling registers | ally Enable 473, 476, 501, 502, 565, 587 | | DEAR 664 | ILE Interrupt Little-Endian 472, 555 | | ESR 665 | ISL Ignore Large Page Specification 472 | | ivpr 664 | ISL Ignore SLB Large Page Specification 472 | | interrupt classes | LPES Logical Partitioning Environment | | asynchronous 668 | Selector 473, 476, 479, 509, 510, 524, 526, 555, | | critical,non-critical 669 | 588 | | machine check 669 | RMI Real Mode Caching Inhibited Bit 473, 588 | | synchronous 668 | RMLS Real Mode Offset Selector 472, 588 | | interrupt control instructions 778 | VC 588 | | mtmsr 625 | VC Virtualization Control 472 | | rfci 779 | VPM Virtualization Control 472 VPM Virtualized Partition Memory 472 | | | • | | sc 778 | VRMASD 588 | | interrupt processing 670 | VRMASD Virtual Real Mode Area Segment | | interrupt vector 670 | Descriptor 472 | | interrupt vector 670 | lookaside buffer 529 | | Interrupt Vector Offset Register 36 622, 860 | LPAR (see Logical Partitioning) 471 | | Interrupt Vector Offset Register 37 622, 860 | LPCR | | Interrupt Vector Offset Registers 666 | See Logical Partitioning Control Register | | Interrupt Vector Prefix Register 664 | LPES | | Interrupts 661 | See Logical Partitioning Control Register | | invalid instruction forms 21 | LPIDR | | invalid operation 110 | See Logical Partition Identification Register | | IR Control of the Con | lq instruction 493, 562 | | See Machine State Register | LR 31, 774 | | ISL | LT 30 | | See Logical Partitioning Control Register | lwa instruction 563 | | isync instruction 417, 440, 551 | lwarx instruction 415, 442, 551, 559, 562, 563, 566, | | IVORs 666 | 601 | | IVPR 664 | lwaux instruction 563 | | ivpr 664 | lwsync instruction 446 | | | lwz instruction 601 | | K | | | · · | M | | K bits 524 | | | key, storage 524 | M-form 17 | | , ,g. | Machine 611 | | | Machine Check 669 | | L | Machine Check interrupt 557, 674 | | 116 550 500 | Machine State Register 477, 479, 501, 502, 503, 551, | | dcbf 559, 566 | 554, 555, 611, 625 | | instructions | BE Branch Trace Enable 478 | | dcbf 559, 566 | DR Data Relocate 478 | | L field 19 | EE External Interrupt Enable 477, 501, 502 | | L instruction field 764 | FE0 FP Exception Mode 478 | | language used for instruction operation description 7 | FE1 FP Exception Mode 478 | | ldarx instruction 415, 444, 551, 559, 562, 563, 566 | FP FP Available 478 | | LE | | | See Machine State Register | HV Hypervisor State 477 IR Instruction Relocate 478 | | LEV field 19 | | | LI field 19 | LE Little-Endian Mode 478 | | | ME Machine Check Enable 478 | | PMM Performance Monitor Mark 4/8, 592 | sible 531 | |------------------------------------------------------|-------------------------------------------------------| | PR Problem State 477 | tlbia 542 | | RI Recoverable Interrupt 478, 501, 502 | tlbie 539 | | SE Single-Step Trace Enable 478 | tlbiel 541 | | SF Sixty Four Bit mode 477, 506, 640 | tlbsync 542 | | VEC Vector Avaialable 477 | out-of-order operations 506, 640 | | Machine Status Save Restore Register | OV 42 | | See SRR0, SRR1 | overflow 111 | | Machine Status Save Restore Register 0 548, 551, 554 | OX 101 | | Machine Status Save Restore Register 1 551, 554, 564 | P | | main storage 407 | page 409 | | MB field 19 | page 408<br>size 508 | | mbar instruction 448 | page fault 506, 521, 559, 566, 639 | | MD-form 17 | • | | MDS-form 17 | page table<br>search 520 | | ME | update 543 | | See Machine State Register | · | | ME field 19 | page table entry 518, 522 | | memory barrier 413 | Change bit 522 | | Memory Coherence Required 411 | PP bits 524 | | mfmsr instruction 477, 503, 625 | Reference bit 522 | | mfspr instruction 500, 624 | update 543, 544 | | mfsr instruction 538 | partially executed instructions 686 | | mfsrin instruction 538 | partition 471 | | mftb instruction 451 | Performance Monitor interrupt 569 | | Mnemonics 762 | performed 408 | | mnemonics | PID 641 | | extended 383, 589, 733 | PMM | | mode change 506, 640 | See Machine State Register | | move to machine state register 625 | PMRN field 20 | | MSR | PP bits 524 | | See Machine State Register | PR C. M. Li. C. A. B. Li. | | mtmsr 625 | See Machine State Register | | mtmsr instruction 477, 501, 572 | precise interrupt 550, 669 | | mtmsrd instruction 477, 502, 572 | preferred instruction forms 21 | | mtspr instruction 499, 622 | priority of interrupts 571 | | mtsr instruction 537 | Process ID Register 641 | | mtsrin instruction 537 | Processor Utilization of Resources Register 497, 578, | | This in this indiction 557 | 860 | | | Processor Version Register 487, 617 | | N | Program interrupt 563, 678 | | | program order 407 | | NB field 20 | Program Priority Register 43, 488, 498, 861 | | Next Instruction Address 479, 480, 613, 614, 615 | protection boundary 524, 562 | | NI 103 | protection domain 524 | | NIA 7 | PTE 520 | | no-op 75 | See also page table entry | | normalization 106 | PTEG 520 | | normalized number 104 | ptesync instruction 446, 469, 543 | | not a number 105 | PURR | | | See Processor Utilization of Resources Register | | 0 | PVR | | | See Processor Version Register | | OE 103 | | | OE field 20 | Q | | opcode 0 601 | <b>~</b> | | optional instructions 529 | quadwords 4 | | elhia 532 535 | 4000.000 1 | | R | HDEC | |------------------------------------------------|-------------------------------------------------| | •• | Hypervisor Decrementer 497, 577, 586, 860 | | RA field 20 | HEIR | | RB field 20 | Hypervisor Emulated Instruction Register 498, | | RC bits 522 | 549, 860 | | Rc field 20 | HRMOR | | Rc instruction field 764 | Hypervisor Real Mode Offset Register 43, 474, | | real address 514 | 488, 586 | | Real Mode Offset Register 473, 586 | HSPRGn | | real page | software-use SPRs 489 | | definition 467, 607 | HSRR0 | | real page number 518 | Hypervisor Machine Status Save Restore Regis- | | recoverable interrupt 554 | ter 0 548 | | reference and change recording 522 | IVOR36 | | Reference bit 522 | Interrupt Vector Offset Register 36 622, 860 | | register | IVOR37 | | CSRR1 663 | Interrupt Vector Offset Register 37 622, 860 | | CTR 774 | Link Register 31 | | DEAR 664 | LPCR | | ESR 665 | Logical Partitioning Control Register 426, 471, | | IVORs 666 | 498, 530, 586, 860 | | IVPR 664 | LPIDR | | ivpr 664 | Logical Partition Identification Register 474 | | LR 774 | LR | | PID 641 | Link Register 497, 621, 859 | | SRR0 662 | MSR | | SRR1 662 | Machine State Register 477, 479, 501, 502, | | | 503, 551, 554, 555, 611, 625 | | register transfer level language 7 Registers | PPR | | implementation-specific | Program Prioirty Register 43, 488, 498, 861 | | MMCR1 754 | PURR | | | Processor Utilization of Resources | | supervisor-level | | | MMCR1 754 | Register 497, 578, 860<br>PVR | | registers<br>CFAR | | | - | Processor Version Register 487, 617 RMOR | | Come-From Address Register 497, 859 | | | Count Register 30 | Real Mode Offset Register 473, 586<br>SDR1 | | Count Register 31<br>CTR | | | | Storage Description Register 1 497, 520, 859 | | Count Register 497, 621, 859<br>CTRL | Storage DescriptionRegister 1 586 SPRGn | | | | | Control Register 488 | software-use SPRs 497, 621, 859<br>SPRs | | DABR(X) | | | Data Address Breakpoint Register | Special Purpose Registers 496<br>SRR0 | | (Extension) 475, 498, 581, 586, 861 | | | DAR | Machine Status Save Restore Register 0 548, | | Data Address Register 497, 548, 559, 560, 562, | 551, 554 | | 566, 568, 859<br>DEC | SRR1 | | DEC | Machine Status Save Restore Register 1 551, | | Decrementer 497, 576, 621, 697, 859 | 554, 564 | | DSISR | TB = 575, 005 | | Data Storage Interrupt Status Register 497, | Time Base 575, 695 | | 548, 549, 559, 562, 563, 566, 601, 859 | TBL | | EAR | Time Base Lower 497, 575, 621, 695, 859 | | External Access Register 497, 559, 566, 583, | TBU | | 586, 621, 859 | Time Base Upper 497, 575, 621, 695, 859 | | Fixed-Point Exception Register 42 | Time Base 451 | | Floating-Point Registers 100 | XER | | Floating-Point Status and Control Register 101 | Fixed-Point Exception Register 478, 497, 567, | | General Purpose Registers 42 | 621, 859 | | relocation | slbmfev instruction 534 | |---------------------------------------------------|------------------------------------------------------| | data 506, 640 | slbmte instruction 533 | | reserved field 5, 468 | SO 30, 31, 42 | | reserved instructions 21 | software-use SPRs 497, 621, 859 | | return from critical interrupt 779 | Special Purpose Registers 496 | | rfci 779 | speculative operations 506, 640 | | rfci instruction 614 | split field notation 14 | | rfid instruction 417, 477, 480, 554, 572 | SPR field 20 | | rfmci instruction 615 | SR 536 | | RI | SR field 20 | | See Machine State Register | SRR0 662 | | RID (Resource ID) 583 | SRR1 662 | | RMI | stdcx. instruction 415, 444, 551, 559, 562, 563, 566 | | See Logical Partitioning Control Register | stmw instruction 562 | | RMLS | storage | | See Logical Partitioning Control Register | access order 413 | | RMOR | accessed by processor 512 | | See Real Mode Offset Register | atomic operation 415 | | RN 103 | attributes | | rounding 107 | Endianness 412 | | RS field 20 | implicit accesses 512 | | RT field 20 | instruction restart 423 | | RTL 7 | interrupt vectors 512 | | | N 520 | | S | No-execute 520 | | | order 413 | | Save/Restore Register 0 662 | ordering 413, 446, 448 | | Save/Restore Register 1 662 | protection | | sc 778 | translation disabled 526 | | sc instruction 479, 482, 565, 613 | reservation 415 | | SC-form 15 | shared 413 | | SDR1 | with defined uses 512 | | See Storage Description Register 1 | storage access 407 | | SE | definitions | | See Machine State Register | program order 407 | | segment | floating-point 117 | | size 508 | storage access ordering 459 | | type 508 | storage address 23<br>storage control | | Segment Lookaside Buffer | instructions 529, 652 | | See SLB | storage control attributes 410 | | Segment Registers 536 | storage control instructions 425 | | Segment Table | Storage Description Register 1 497, 520, 586, 859 | | bridge 536 | storage key 524 | | sequential execution model 29 | storage location 407 | | definition 467, 608 | storage operations | | SF<br>Con Machine State Posietes | in-order 506, 640 | | See Machine State Register | out-of-order 506, 640 | | SH field 20 | speculative 506, 640 | | SH instruction field 764 SI field 20 | storage protection 524 | | | string instruction 648 | | SI instruction field 764 | TLB management 648 | | sign 105 | stg instruction 493, 562 | | single-copy atomicity 409<br>single-precision 106 | string instruction 648 | | Single-Step Trace 565 | stw instruction 601 | | SLB 514, 529 | stwcx. instruction 415, 442, 551, 559, 562, 563, 566 | | entry 515 | stwx instruction 601 | | slbia instruction 532, 535 | symbols 383, 589, 733 | | slbie instruction 531 | sync instruction 417, 446, 469, 522, 551 | | slbmfee instruction 534 | synchronization 469, 543, 609 | | context 469, 609 | virtual page number 518 | |---------------------------------------------------|-------------------------------------------| | execution 469, 609 | virtual storage 408 | | interrupts 550 | VPM | | Synchronize 413 | See Logical Partitioning Control Register | | Synchronous 668 | VRMASD | | system call 778 | See Logical Partitioning Control Register | | system call instruction 691 | VX 101 | | System Call interrupt 565, 679 | VX-form 17 | | System Reset interrupt 556 | VXCVI 103 | | system-caused interrupt 550 | VXIDI 102 | | | VXIMZ 102 | | Т | VXISI 102 | | | VXSNAN 102 | | t bit 32 | VXSOFT 102 | | table update 543 | VXSQRT 102 | | TB 451 | VXVC 102 | | TBL 451 | VXZDZ 102 | | TBR field 20 | | | TH field 20 | W | | Time Base 451, 575, 695 | | | Time Base Lower 497, 575, 621, 695, 859 | Watchdog Timer interrupt 680 | | Time Base Upper 497, 575, 621, 695, 859 | words 4 | | TLB 521, 529, 641 | Write Through Required 410 | | TLB management 648 | wrtee instruction 626 | | tlbia instruction 521, 542 | wrteei instruction 626 | | tlbie instruction 521, 539, 542, 544, 659 | WS instruction field 765 | | tlbiel instruction 541 | | | tlbsync instruction 542, 543, 659 | X | | TO field 20 | A | | Trace interrupt 565 | X-form 16 | | Translation Lookaside Buffer 641 | XE 103 | | translation lookaside buffer 521 | XER 42, 478, 567 | | trap instructions 690 | XFL-form 16 | | trap interrupt | XFX-form 16 | | definition 467, 608 | XL-form 16 | | | XO field 21 | | U | XO-form 17 | | | XS-form 17 | | U field 20 | XX 101 | | UE 103 | | | UI field 20 | 7 | | UI instruction field 765 | Z | | UMMCR1 (user monitor mode control register 1) 754 | z bit 32 | | undefined 7 | ZE 103 | | boundedly 4 | zero 104 | | underflow 112 | zero divide 111 | | UX 101 | ZX 101 | | | ZX 101 | | V | | | <b>v</b> | Numerics | | VA-form 17 | 2 472 | | VC | 2 472 | | See Logical Partitioning Control Register | 32-bit mode 508 | | VE 103 | | | VEC | | | See Machine State Register | | | virtual address 514, 517 | | | generation 514 | | | size 508 | | **Last Page - End of Document**