The Transmeta Code Morphing Software: Using Speculation, Recovery, and Adaptive Retranslation to Address Real-Life Challenges
- What is Code Morphing?
  - Code morphing is dynamic binary translation: it decouples the processor's internal ISA (the implementation) from the external ISA presented to software.
  - This approach allows a simple, compact, low-power microprocessor implementation, with the freedom to modify the internal ISA between generations, while still supporting the broad range of legacy x86 software.
- What is Transmeta Crusoe?
  - A VLIW processor combined with the Code Morphing Software (CMS), an approach unique among commercial architectures: a microprocessor system whose internal VLIW instruction set architecture (ISA) bears little resemblance to the external x86 ISA.
  - A VLIW instruction is called a molecule and consists of 2 or 4 atoms (RISC-like instructions).
CMS key features:
- Implements the complete x86 architecture: all instructions (including memory-mapped I/O), architectural registers, and complete exception behavior.
- Runs independently of any operating system (a system-level implementation).
- Must provide robust performance across a wide variety of systems and applications.
- Garbage-collects the translation cache.
- The larger the translated region, the more scope there is for optimization.
  - CMS therefore speculates in order to optimize the code further.
  - When speculation fails, a mix of hardware and software mechanisms (commit and rollback) restores the state that held before the translation was entered, and control passes to the interpreter, which is substantially slower.
  - If failures are frequent, interpretation adds a large overhead.
  - The remedy is to retranslate with a smaller region.
  - If failures persist, CMS generates a more conservative translation.
- Hardware support for speculation and recovery
  - All registers are shadowed: each has a working copy and a shadow copy.
  - The working copy is committed only after the translation finishes executing correctly.
  - On failure, the state rolls back to the last committed state held in the shadow registers.
  - Memory operations go through a gated store buffer, which releases stores at commit and drops all buffered updates on rollback.
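
The commit and rollback behavior described above can be pictured with a small software model. This is only a sketch: the real mechanism is in hardware, and the class name `SpeculativeMachine`, its methods, and the register and address values are all hypothetical names chosen for illustration.

```python
class SpeculativeMachine:
    """Toy model of shadowed registers plus a gated store buffer (illustrative only)."""

    def __init__(self, registers, memory):
        self.working = dict(registers)  # copy the translation reads and writes
        self.shadow = dict(registers)   # last committed register state
        self.memory = dict(memory)      # committed memory
        self.store_buffer = []          # stores held back until commit

    def write_reg(self, name, value):
        self.working[name] = value

    def store(self, addr, value):
        # Stores are buffered, not written to memory, until commit.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # The youngest buffered store to addr wins; otherwise read committed memory.
        for buffered_addr, value in reversed(self.store_buffer):
            if buffered_addr == addr:
                return value
        return self.memory.get(addr, 0)

    def commit(self):
        # The translation finished correctly: make its effects permanent.
        self.shadow = dict(self.working)
        for addr, value in self.store_buffer:
            self.memory[addr] = value
        self.store_buffer.clear()

    def rollback(self):
        # Speculation failed: restore registers and drop buffered stores.
        self.working = dict(self.shadow)
        self.store_buffer.clear()


m = SpeculativeMachine({"eax": 1}, {0x100: 7})
m.write_reg("eax", 42)
m.store(0x100, 99)
m.rollback()
state_after_rollback = (m.working["eax"], m.memory[0x100])

m.write_reg("eax", 5)
m.store(0x100, 8)
m.commit()
state_after_commit = (m.shadow["eax"], m.memory[0x100])
```

In this model a rollback costs nothing beyond discarding the working copy and the buffer, which is the property that makes aggressive speculation affordable.
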
Challenges CMS meets by applying the procedure of speculation, recovery, and adaptive retranslation:
1. Precise exception behavior without constraining scheduling:
  - Reuse the commit and rollback hardware that is already in place.
  - Place potentially faulting instructions before unconditional branches, and use the rollback feature if one faults.
  - Exceptions caused by failed speculation must be distinguished from genuine x86 exceptions.
  - If the interpreter executes the code without a fault after the rollback, the exception was an artifact of speculation and can be ignored unless it recurs often. If it does recur often, first try cutting down the size of the translation, or use a more conservative translation.
  - If it was a genuine x86 exception, narrow the translation so that less work is lost in the rollback. Ultimately CMS can run translations of everything but the faulting instruction, which becomes a zero-instruction translation that simply calls the interpreter to execute it.
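
The escalation described on this slide (tolerate rare failures, then shrink the translation, then translate conservatively) can be sketched as a small policy function. The thresholds, the halving step, and the names `RegionPolicy` and `on_speculation_failure` are assumptions made for illustration; the slides do not give CMS's actual heuristics.

```python
from dataclasses import dataclass


@dataclass
class RegionPolicy:
    max_instructions: int = 100  # hypothetical starting region size
    conservative: bool = False   # True once speculation is turned down
    failures: int = 0            # failures since the last policy change


def on_speculation_failure(policy, threshold=3, min_instructions=25):
    """Return the action taken after one more rollback in this region."""
    policy.failures += 1
    if policy.failures < threshold:
        return "keep"  # infrequent failure: just let the interpreter handle it
    policy.failures = 0
    if policy.max_instructions > min_instructions:
        # Frequent failures: retranslate with a smaller region first.
        policy.max_instructions = max(min_instructions, policy.max_instructions // 2)
        return "shrink"
    if not policy.conservative:
        # Still failing at the minimum size: give up the risky optimizations.
        policy.conservative = True
        return "conservative"
    return "interpret"


policy = RegionPolicy()
actions = [on_speculation_failure(policy) for _ in range(12)]
```
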
2. Accurate response to interrupts:
  - The commit and rollback mechanism is used here as well, so the interrupt is taken from a consistent target state.
  - The translated code itself needs to do nothing.
3. Data speculation:
  - The translator often cannot decide whether a load and a store overlap.
  - Reordering them enables very effective optimizations, and most of the time they do not overlap, so the reordering is carried out speculatively.
  - Hardware mechanisms detect an actual overlap, raise an exception, and roll back; the interpreter then executes the code.
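
A minimal sketch of the data speculation idea, assuming a translation that hoists one load above one store. The address-set check stands in for the hardware overlap detection; `AliasChecker`, `AliasFault`, and `run_translation` are hypothetical names, and the stored value 5 is arbitrary.

```python
class AliasFault(Exception):
    """Raised when a store hits an address read by a hoisted load."""


class AliasChecker:
    def __init__(self):
        self.protected = set()  # addresses read by loads moved above stores

    def hoisted_load(self, memory, addr):
        self.protected.add(addr)
        return memory.get(addr, 0)

    def checked_store(self, pending, addr, value):
        if addr in self.protected:
            raise AliasFault(hex(addr))
        pending[addr] = value  # buffered until the translation commits


def run_translation(memory, load_addr, store_addr):
    """Original order: store 5 to store_addr, then load from load_addr."""
    checker = AliasChecker()
    pending = {}
    try:
        value = checker.hoisted_load(memory, load_addr)  # load moved above the store
        checker.checked_store(pending, store_addr, 5)
    except AliasFault:
        # Rollback: pending stores are dropped, then the code is
        # re-executed in the original order, as the interpreter would.
        memory[store_addr] = 5
        return memory.get(load_addr, 0), "interpreted"
    memory.update(pending)  # commit
    return value, "translated"


no_overlap = run_translation({0x10: 1, 0x20: 2}, load_addr=0x10, store_addr=0x20)
overlap = run_translation({0x10: 1}, load_addr=0x10, store_addr=0x10)
```

Both paths return the value the original x86 order would produce; only the cost differs.
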
4. Memory-mapped I/O:
  - I/O accesses must happen in the original x86 order, because they are irrevocable interactions with external devices.
  - Suppressing all reordering of memory instructions would degrade overall performance.
  - Instead, reordered loads and stores are marked. If a marked access touches a memory page mapped to I/O, the hardware raises an exception and rolls back, and the interpreter takes over.
  - If such exceptions are frequent for a given code fragment, it is retranslated without reordering memory instructions.
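
The memory-mapped I/O handling above can be sketched as follows. The page size, the fault threshold, and the names `MmioFault`, `access`, and `execute_fragment` are assumptions for illustration, not details taken from CMS.

```python
PAGE_SIZE = 4096  # assumed page size


class MmioFault(Exception):
    """Raised when a reordered access touches an I/O page."""


def access(addr, io_pages, reordered):
    # Only accesses marked as reordered are checked against the I/O pages.
    if reordered and addr // PAGE_SIZE in io_pages:
        raise MmioFault(hex(addr))
    return "ok"


def execute_fragment(addresses, io_pages, state, threshold=2):
    """Run one fragment; fall back and adapt when an I/O page is hit."""
    reordered = not state["in_order"]
    try:
        for addr in addresses:
            access(addr, io_pages, reordered)
        return "translated"
    except MmioFault:
        state["faults"] += 1
        if state["faults"] >= threshold:
            # Frequent faults: retranslate without memory reordering.
            state["in_order"] = True
        return "interpreted"


state = {"faults": 0, "in_order": False}
io_pages = {0xA0}  # page number containing address 0xA0000
results = [execute_fragment([0x1000, 0xA0000], io_pages, state) for _ in range(3)]
```
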
5. Self-modifying code (SMC):
  - First attempt: write-protect an x86 page once code from it has been translated. Any write to the page raises a fault and forces retranslation (costly: invalidation and retranslation).
  - What if the page is shared between code and data?
    1. Fine-grain protection: Instead of write-protecting the whole page, protect it at a finer granularity. The granularity supported cannot always identify a single affected translation, but it typically narrows the impact to a few, reducing both the number of faults and the number of translations invalidated by each.
    2. Self-checking translations: The translation itself fetches the original x86 code and checks that it has not changed. The check is normally scheduled after the stores. This is less expensive than interpretation or retranslation, yet it adds a large overhead for big translations.
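
A self-checking translation can be pictured as a wrapper that compares the current x86 bytes against a snapshot taken at translation time. This is a sketch under assumptions: the helper name `make_self_checking`, the tuple return convention, and the byte values are hypothetical, and real CMS performs the check inside the translated code rather than in a wrapper.

```python
def make_self_checking(translation, memory, start, length):
    """Wrap a translation so it verifies its source bytes before each run."""
    snapshot = bytes(memory[start:start + length])  # x86 bytes at translation time

    def run(*args):
        if bytes(memory[start:start + length]) != snapshot:
            return ("retranslate", None)  # the source code changed
        return ("ok", translation(*args))

    return run


# Three code bytes followed by one data byte on the same page.
memory = bytearray(b"\x01\xd8\xc3\x00")
run = make_self_checking(lambda a, b: a + b, memory, start=0, length=3)

first = run(2, 3)
memory[3] = 0xFF              # data write next to the code: translation still valid
after_data_write = run(2, 3)
memory[1] = 0xC8              # code write: translation is stale
after_code_write = run(2, 3)
```

The check runs on every invocation, which is why the cost grows with the size of the translated region.
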
5. Self-modifying code (continued):
  - What if the page is shared between code and data? (continued)
    3. Self-revalidating translations: Used when CMS suspects that the faults are due to data writes rather than code writes. The translation gets a “prologue”, a code segment that executes just before the translation itself starts.
      - Once a candidate translation for self-revalidation is identified, it is flagged.
      - The next time it is encountered, it is retranslated in order to capture a copy of the x86 code that was translated (which is not preserved initially).
      - Later, if the handler for a fine-grain protection fault determines that the translation might be affected, it enables the prologue and turns off protection to avoid the cost of faulting again.
      - When the translation is next invoked, the prologue verifies that the x86 code corresponding to the translation has not changed, re-enables protection, re-verifies the x86 code, disables the prologue, and then executes the translation.
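
The prologue sequence (verify, re-enable protection, re-verify, disable the prologue, execute) can be sketched as a small state machine. The class `SelfRevalidating`, its fields, and the "invalidate" return value are hypothetical; protection is modeled as a plain flag rather than real page protection.

```python
class SelfRevalidating:
    """Toy model of a translation with a revalidation prologue (illustrative only)."""

    def __init__(self, memory, start, length, body):
        self.memory, self.start, self.length = memory, start, length
        self.saved_code = self._current()  # copy of the x86 code that was translated
        self.body = body
        self.protected = True
        self.prologue_enabled = False

    def _current(self):
        return bytes(self.memory[self.start:self.start + self.length])

    def on_protection_fault(self):
        # Fault handler: the translation might be affected, so arm the
        # prologue and drop protection to avoid faulting again.
        self.prologue_enabled = True
        self.protected = False

    def invoke(self):
        if self.prologue_enabled:
            if self._current() != self.saved_code:
                return "invalidate"
            self.protected = True  # re-enable protection
            if self._current() != self.saved_code:  # re-verify under protection
                return "invalidate"
            self.prologue_enabled = False
        return self.body()


memory = bytearray(b"\x90\x90\xc3\x00")  # 3 code bytes, 1 data byte
t = SelfRevalidating(memory, start=0, length=3, body=lambda: "ran")

memory[3] = 1                  # data write on the shared page
t.on_protection_fault()
first = t.invoke()
flags_after_first = (t.protected, t.prologue_enabled)

memory[0] = 0xCC               # code write
t.on_protection_fault()
second = t.invoke()
```

The second verification matters because a write could land between the first check and the moment protection is restored.
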
5. Self-modifying code (continued):
  - What if the protection faults are actually due to changes in code, not data?
    1. Stylized SMC: Many PC applications that rely on self-modifying code do so in very stylized ways. A common approach, for example, is to modify the immediate or offset fields of instructions inside an inner loop, just before entering that loop. Such code is translated so that the translation loads the immediate field from the code stream at runtime.
    2. Translation groups: SMC often writes and then executes one of a limited set of versions of the code. Different translations of the same code region are therefore kept together in a “translation group”. If one of them fails its self-check after a protection fault, the rest of the group is checked.
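
A translation group can be sketched as a list of (x86 code, translation) pairs that is searched when the active translation no longer matches the code in memory. The class `TranslationGroup`, the `select` method, and the stand-in `translate` function are hypothetical names for illustration.

```python
class TranslationGroup:
    """Translations of one x86 region, one per code version seen so far."""

    def __init__(self):
        self.members = []   # list of (saved x86 code, translation)
        self.active = None  # index of the translation currently in use

    def select(self, current_code, translate):
        current_code = bytes(current_code)
        # Self-check of the active translation.
        if self.active is not None and self.members[self.active][0] == current_code:
            return self.members[self.active][1]
        # The active one failed: check the rest of the group.
        for index, (saved, translation) in enumerate(self.members):
            if saved == current_code:
                self.active = index
                return translation
        # No member matches: translate this version and add it to the group.
        self.members.append((current_code, translate(current_code)))
        self.active = len(self.members) - 1
        return self.members[self.active][1]


calls = []


def translate(code):
    # Stand-in for the real translator; records how often it runs.
    calls.append(code)
    return "T%d" % len(calls)


group = TranslationGroup()
picked = [group.select(code, translate)
          for code in (b"\x05\x01", b"\x05\x02", b"\x05\x01", b"\x05\x02")]
```

Here four executions alternating between two code versions need only two translations.
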
Any questions?