Crinkler: compressing Windows 4k intros to EXE files

Crinkler - compressing Windows 4k intros to EXE files
Aske Simon Christensen
Rune L. H. Stubbe
Assembly 2005, Helsinki, July 2005

Overview
• Background
• Compression method
• Function import
• Header layout
• Demo
• Future plans

Why another one?
• Most common method: CAB dropping
  – EXE file → EXE optimizer → CAB compressor → BAT inserter → BAT file
• Dropping is a mess
• We want EXE files!

How is Crinkler different?
• The normal build process:
  – C/C++ files → Compiler → object / library files
  – ASM files → Assembler → object / library files
  – object / library files → Linker → Cruncher → EXE file

How is Crinkler different?
• The Crinkler way:
  – C/C++ files → Compiler → object / library files
  – ASM files → Assembler → object / library files
  – object / library files → Crinkler → EXE file

Why another one?
• Control over code and data placement
  – Choose base address
  – Optimize order for best compression
  – Separate code and data
  – Put in extra code
    • Import code
    • Code transformations

Compression method
• Context modelling
  + Much better compression ratio than LZX
  + Well suited for small amounts of data
  + Small decompression code (< 250 bytes)
  + Pays off even with the extra header
  - Extremely slow
  - Very memory-hungry

Data compression basics
• Take advantage of self-similarity
• Find patterns and eliminate them
• Dictionary compression
• Statistical compression

Dictionary compression
• LZ77: Refer repetitions back to original
  M I S S I P P I
• Reasonable compression ratio
• Fast compression
• Very fast decompression
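
The back-reference idea on this slide can be illustrated with a minimal greedy LZ77 sketch. It is not taken from the talk: the token format, the `window` and `min_len` parameters, and both function names are assumptions chosen for illustration.

```python
def lz77_compress(data: bytes, window: int = 255, min_len: int = 3):
    """Greedy LZ77 sketch: emit literal byte values or (distance, length) pairs."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Try every start position inside the sliding window.
        for j in range(max(0, i - window), i):
            length = 0
            # A match may run past position i (overlapping copy).
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_len:
            tokens.append((best_dist, best_len))  # refer back to earlier data
            i += best_len
        else:
            tokens.append(data[i])                # too short: keep the literal
            i += 1
    return tokens


def lz77_decompress(tokens) -> bytes:
    """Rebuild the data by copying literals and resolving back-references."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])            # byte-wise copy handles overlap
        else:
            out.append(tok)
    return bytes(out)
```

For b"MISSISSIPPI" this sketch emits four literals, one back-reference (distance 3, length 4), and three more literals. Decompression is only a copy loop, which is consistent with the "very fast decompression" point above.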

Statistical compression
• Estimate probability distribution of each symbol based on earlier data
• PPM:
  M I S S I P P I
• Problem: local
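
As a rough illustration of estimating each symbol's probability from earlier data, the sketch below sums the ideal code length, -log2(p), under a simple adaptive model keyed on the preceding bytes. It is a simplification rather than real PPM (there is no escape mechanism), and the function name and the add-one smoothing over 256 byte values are assumptions of this sketch.

```python
import math
from collections import defaultdict


def adaptive_cost_bits(data: bytes, order: int = 1) -> float:
    """Ideal coded size in bits under an adaptive order-`order` byte model."""
    # counts[context][symbol] = how often `symbol` followed `context` so far
    counts = defaultdict(lambda: defaultdict(int))
    total_bits = 0.0
    for i, sym in enumerate(data):
        ctx = data[max(0, i - order):i]
        seen = counts[ctx]
        total = sum(seen.values())
        # Add-one smoothing over all 256 byte values (assumption of this sketch).
        p = (seen[sym] + 1) / (total + 256)
        total_bits += -math.log2(p)
        seen[sym] += 1          # update the model only after coding the symbol
    return total_bits
```

On strongly patterned input such as b"ab" * 200, the order-1 estimate comes out smaller than the order-0 estimate, because the previous byte determines the next one.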

Context modelling
• Generalization of PPM
• Look at combinations of recent symbols
• A bit mask describes a model
  0 0 0 1 0 0
  M I S S I P P I
• Problem: Many masks to choose from
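
One way to picture "a bit mask describes a model" is a function that picks out the preceding bytes selected by the mask. The bit ordering, the zero padding before the start of the data, and the name `masked_context` are assumptions of this sketch, not details given on the slide.

```python
def masked_context(data: bytes, pos: int, mask: int, width: int = 8) -> tuple:
    """Return the bytes before `pos` that `mask` selects.

    Bit k of `mask` (k = 0 is the byte immediately before `pos`) selects
    data[pos - 1 - k]; positions before the start of the data read as 0.
    """
    ctx = []
    for k in range(width):
        if mask & (1 << k):
            idx = pos - 1 - k
            ctx.append(data[idx] if idx >= 0 else 0)
    return tuple(ctx)
```

With `width = 8` there are 256 possible masks, and a compressor is free to combine several of them, which illustrates why choosing masks is the hard part.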

Implementation
• Estimation for each single bit
• Context is current byte + selection of last 8 bytes
• Estimate the best collection of masks
• Estimate the best weights of the masks
• Keep track of contexts in a hash table
• Ignore hash collisions
• Find hash table size with few collisions
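
The points above can be sketched as a size estimator that predicts one bit at a time, keys each model on the bits already seen in the current byte plus the masked preceding bytes, and stores counters in a fixed-size table without checking for collisions. This is an illustrative sketch only: the counter scheme, the way weights are combined, Python's built-in `hash`, and the name `estimate_bits` are all assumptions, not Crinkler's actual algorithm.

```python
import math


def estimate_bits(data: bytes, masks, weights, table_bits: int = 16) -> float:
    """Estimated coded size in bits using weighted bit counters per context."""
    size = 1 << table_bits
    table = [[0, 0] for _ in range(size)]    # [zeros seen, ones seen] per slot
    total = 0.0
    for pos, byte in enumerate(data):
        partial = 1                          # leading 1 marks how many bits are known
        for bit_index in range(7, -1, -1):
            bit = (byte >> bit_index) & 1
            slots = []
            n0 = n1 = 0.0
            for mask, weight in zip(masks, weights):
                selected = tuple(
                    data[pos - 1 - k] if pos - 1 - k >= 0 else 0
                    for k in range(8) if mask & (1 << k)
                )
                # Colliding contexts simply share a slot; no collision check.
                slot = table[hash((mask, selected, partial)) % size]
                slots.append(slot)
                n0 += weight * slot[0]
                n1 += weight * slot[1]
            p_one = (n1 + 0.5) / (n0 + n1 + 1.0)
            total += -math.log2(p_one if bit else 1.0 - p_one)
            for slot in slots:
                slot[bit] += 1               # update every model after coding the bit
            partial = (partial << 1) | bit
    return total
```

A search for good masks and weights could call such an estimator repeatedly with different candidates and keep the cheapest, which suggests why the method is described as extremely slow.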

Function import
• Import by name: Name of each function
  – The import table is a big part of an EXE file
• Import by ordinal: Number instead of name
  – Much smaller but quite incompatible
• Import by hash: Hash code of each function
  – Small and compatible
  – Not supported directly
• Import by hashed ordinal range
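
Import by hash can be pictured as follows: the executable stores only a hash code per function, and at startup the loader code walks a DLL's exported names until one hashes to the stored value. The rotate-and-xor hash and both function names below are hypothetical; the slides do not specify the hash Crinkler uses.

```python
def name_hash(name: str) -> int:
    """Hypothetical 32-bit hash of an export name (rotate left by 5, then xor)."""
    h = 0
    for ch in name.encode("ascii"):
        h = ((h << 5) | (h >> 27)) & 0xFFFFFFFF
        h ^= ch
    return h


def resolve_by_hash(export_names, wanted_hash):
    """Return the index of the first export whose name hashes to `wanted_hash`."""
    for index, name in enumerate(export_names):
        if name_hash(name) == wanted_hash:
            return index
    return None                      # no export matched
```

In this sketch the stored data is four bytes per function regardless of name length, and lookup depends on names rather than ordinals, which matches the "small and compatible" point above.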

Header optimization
  DOS header | PE offset | DOS stub | PE header | Data directories | Section header
544 bytes!

Header optimization
  DOS header | PE offset | DOS stub | PE header | Data directories | Section header
Ignored

Header optimization
  DOS header | PE offset | DOS stub | PE header | Data directories | Section header
196 bytes!
Ignored

Header optimization
  DOS header | PE offset | DOS stub | PE header | Data directories | Section header | Hash code
124 bytes + 18 hash codes!
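
For readers unfamiliar with the layout named on these slides: in a standard PE file, the DOS header starts with the "MZ" signature and holds, at offset 0x3C, a 4-byte little-endian offset to the PE header, which starts with "PE\0\0". The sketch below only locates that header; the function name is an assumption, and it says nothing about how Crinkler itself arranges the fields.

```python
import struct


def pe_header_offset(image: bytes) -> int:
    """Follow the PE offset field in the DOS header to the PE signature."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The PE offset field (e_lfanew) sits at byte 0x3C of the DOS header.
    (offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[offset:offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    return offset
```

Because only the signature and this offset field tie the DOS header to the PE header, the space between them is a natural candidate for overlap or reuse when shrinking a header by hand.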

Demo

Future plans
• Windows 2000 compatibility
• Even better compression
• Section reordering
• Transformations
• More feedback
• 64k specialized version

Thank you
Questions? Comments? Suggestions?