HipHop Compiler for PHP: Transforming PHP into C++
HipHop Compiler Team, Facebook, Inc.
May 2010
Facebook 2010 (confidential)

PHP is easy to read

    <?php
    function tally($count) {
      $sum = 0;
      for ($i = 0; $i < $count; ++$i) {
        $sum += $i;
      }
      return $sum;
    }
    print tally(10) . "\n";

PHP syntax is similar to C++/Java

    <?php
    class Tool extends Object {
      public $name;
      public function use($target) {}
    }
    $tool = new Tool();
    $tool->name = 'hammer';
    $tool->use($nail);

PHP Statements and Expressions

Statements: FunctionStatement, ClassStatement, InterfaceStatement, ClassVariable, ClassConstant, MethodStatement, StatementList, BlockStatement, IfBranchStatement, IfStatement, WhileStatement, DoStatement, ForStatement, SwitchStatement, CaseStatement, BreakStatement, ContinueStatement, ReturnStatement, GlobalStatement, StaticStatement, EchoStatement, UnsetStatement, ExpStatement, ForEachStatement, CatchStatement, TryStatement, ThrowStatement

Expressions: ExpressionList, AssignmentExpression, SimpleVariable, DynamicVariable, StaticMemberExpression, ArrayElementExpression, DynamicFunctionCall, SimpleFunctionCall, ScalarExpression, ObjectPropertyExpression, ObjectMethodExpression, ListAssignment, NewObjectExpression, UnaryOpExpression, IncludeExpression, BinaryOpExpression, QOpExpression, ArrayPairExpression, ClassConstantExpression, ParameterExpression, ModifierExpression, ConstantExpression, EncapsListExpression

PHP is weakly typed

    <?php
    $a = 12345;
    $a = "hello";
    $a = array(12345, "hello", array());
    $a = new Object();
    $c = $a + $b; // integer or array
    $c = $a . $b; // implicit casting to strings

Core PHP library is small
- Most are in functional style
- ~200 to 500 basic functions

    <?php
    $len = strlen("hello");  // C library
    $ret = curl_exec($curl); // open source

PHP is easy to debug

    <?php
    function tally($count) {
      $sum = 0;
      for ($i = 0; $i < $count; ++$i) {
        $sum += $i;
        var_dump($sum);
      }
      return $sum;
    }

PHP is easy to learn
- easy to read
- easy to write
- easy to debug

Hello, World!

PHP is slow
http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=all
[Bar chart: CPU (axis 0 to 50) by language: C++, Java, C#, Erlang, Python, Perl, PHP]

Why is Zend Engine slow?
- Byte-code interpreter
- Dynamic symbol lookups
  - functions, variables, constants
  - class methods, properties, constants
- Weak typing
  - zval
  - array()

Transforming PHP into C++
- g++ is a native code compiler
- static binding
  - functions, variables, constants
  - class methods, properties, constants
- type inference
  - integers, strings, arrays, objects, variants
  - struct, vector, map, array

Static Binding - Function Calls

    <?php
    $ret = foo($a);

    // C++
    Variant v_ret;
    Variant v_a;
    v_ret = f_foo(v_a);

Dynamic Function Calls

    <?php
    $func = 'foo';
    $ret = $func($a);

    // C++
    Variant v_ret;
    Variant v_a;
    String v_func;
    v_func = "foo";
    v_ret = invoke(v_func, CREATE_VECTOR1(v_a));

Function Invoke Table

    Variant invoke(CStrRef func, CArrRef params) {
      int64 hash = hash_string(func);
      switch (hash) {
        case 1234:
          // the hash only narrows the search; the string compare confirms the match
          if (func == "foo") return foo(params[0]);
      }
      throw FatalError("function not found");
    }
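The hash, switch, compare pattern on this slide can be tried out in standard C++. The following is a minimal sketch and not HipHop's generated code: it assumes a compile-time FNV-1a hash so that hashes can be used as case labels, and `f_foo`, `f_bar`, and the `std::int64_t` parameter type are hypothetical stand-ins for generated functions and `Variant`.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for two compiled PHP functions.
static std::int64_t f_foo(std::int64_t a) { return a + 1; }
static std::int64_t f_bar(std::int64_t a) { return a * 2; }

// FNV-1a hash, constexpr so the results can be used as case labels (C++14).
constexpr std::uint64_t hash_string(const char* s) {
  std::uint64_t h = 14695981039346656037ULL;
  while (*s) {
    h ^= static_cast<unsigned char>(*s++);
    h *= 1099511628211ULL;
  }
  return h;
}

// Dispatch by name: switch on the hash, then confirm with a string compare
// because two different names could share a hash value.
std::int64_t invoke(const std::string& func,
                    const std::vector<std::int64_t>& params) {
  switch (hash_string(func.c_str())) {
    case hash_string("foo"):
      if (func == "foo") return f_foo(params.at(0));
      break;
    case hash_string("bar"):
      if (func == "bar") return f_bar(params.at(0));
      break;
  }
  throw std::runtime_error("function not found: " + func);
}
```

Calling `invoke("foo", {41})` reaches `f_foo` through the table, while an unknown name falls out of the switch and throws, mirroring the fatal error on the slide.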

Re-declared Functions

    <?php
    if ($condition) {
      function foo($a) { return $a + 1; }
    } else {
      function foo($a) { return $a + 2; }
    }
    $ret = foo($a);

    // C++
    if (v_condition) {
      g->i_foo = i_foo$$0;
    } else {
      g->i_foo = i_foo$$1;
    }
    g->i_foo(v_a);
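The idea of routing a re-declared function through a pointer stored in the globals object can be sketched in plain C++. This is an illustration under assumptions, not the generated code: `Globals`, `foo_v0`, `foo_v1`, and `run` are hypothetical names, and `std::int64_t` replaces `Variant`.

```cpp
#include <cstdint>

// Two compiled bodies for the same PHP name
// (they play the role of i_foo$$0 and i_foo$$1 on the slide).
static std::int64_t foo_v0(std::int64_t a) { return a + 1; }
static std::int64_t foo_v1(std::int64_t a) { return a + 2; }

// Per-request globals: one slot per re-declared function.
struct Globals {
  std::int64_t (*i_foo)(std::int64_t) = nullptr;
};

// Executing the declaration binds the slot; later calls go through it.
std::int64_t run(Globals& g, bool condition, std::int64_t a) {
  g.i_foo = condition ? foo_v0 : foo_v1;
  return g.i_foo(a);
}
```

The cost of a call is one indirect jump, which is still far cheaper than a name lookup at every call site.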

Volatile Functions

    <?php
    if (!function_exists('foo')) {
      bar($a);
    } else {
      foo($a);
    }
    function foo($a) {}

    // C++
    if (!f_function_exists("foo")) {
      f_bar(v_a);
    } else {
      f_foo(v_a);
    }
    g->declareFunction("foo");

Static Binding - Variables

    <?php
    $foo = 'hello';
    function foo($a) {
      global $foo;
      $bar = $foo . $a;
      return $bar;
    }

    // C++
    String f_foo(CStrRef v_a) {
      Variant &gv_foo = g->GV(foo);
      String v_bar;
      v_bar = concat(toString(gv_foo), v_a);
      return v_bar;
    }

GlobalVariables Class

    class GlobalVariables : public SystemGlobals {
    public:
      // Direct Global Variables
      Variant gv_foo;

      // Indirect Global Variables for large compilation
      enum _gv_enums {
        gv_foo,
      };
      Variant gv[1];
    };

Dynamic Variables

    <?php
    function foo() {
      $b = 10;
      $a = 'b';
      echo($$a);
    }

    // C++
    void f_foo() {
      class VariableTable : public RVariableTable {
      public:
        int64 &v_b;
        String &v_a;
        VariableTable(int64 &r_b, String &r_a) : v_b(r_b), v_a(r_a) {}
        virtual Variant getImpl(const char *s) {
          // hash - switch - strcmp
        }
      } variableTable(v_b, v_a);
      echo(variableTable.get("b"));
    }
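A small standalone C++ version may make the variable table easier to follow. It is a sketch under assumptions: the class keeps references to the locals as on the slide, but it returns `std::string` instead of `Variant`, uses plain `strcmp` chains rather than a hash switch, and adds an extra `v_b += 5` step (not on the slide) to show that the table sees live values.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Holds references to a function's locals so they can be looked up by name.
class VariableTable {
 public:
  VariableTable(std::int64_t& b, std::string& a) : v_b(b), v_a(a) {}

  // Name-based lookup; a real table would hash first, then compare.
  std::string get(const char* name) const {
    if (std::strcmp(name, "b") == 0) return std::to_string(v_b);
    if (std::strcmp(name, "a") == 0) return v_a;
    throw std::out_of_range("undefined variable");
  }

 private:
  std::int64_t& v_b;
  std::string& v_a;
};

// Mirrors the PHP function: $b = 10; $a = 'b'; then reads $$a.
std::string f_foo() {
  std::int64_t v_b = 10;
  std::string v_a = "b";
  VariableTable table(v_b, v_a);
  v_b += 5;                       // illustrative extra step: visible via the reference
  return table.get(v_a.c_str());  // $$a resolves to $b
}
```

Because the table stores references, the locals themselves stay ordinary typed C++ variables and only the dynamic access pays for the lookup.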

Static Binding - Constants

    <?php
    define('FOO', 'hello');
    echo FOO;

    // C++
    echo("hello" /* FOO */);

Dynamic Constants

    <?php
    if ($condition) {
      define('FOO', 'hello');
    } else {
      define('FOO', 'world');
    }
    echo FOO;

    // C++
    if (v_condition) {
      g->declareConstant("FOO", g->k_FOO, "hello");
    } else {
      g->declareConstant("FOO", g->k_FOO, "world");
    }
    echo(toString(g->k_FOO));

Static Binding with Classes
- Class methods
- Class properties
- Class constants
- Re-declared classes
- Deriving from re-declared classes
- Volatile classes

Summary - Dynamic Symbol Lookup
- Problem is nicely solved
- Rule of 90-10
- Dynamic binding is a general form of static binding
- Generated code is a super-set of static binding and dynamic binding

Problem 2. Weak Typing
- Type Inference
- Runtime Type Info (RTTI)-Guided Optimization
- Type Hints
- Strongly Typed Collection Classes

Type Coercions
[Diagram: coercions among Variant, Double, String, Array, Object, Integer, Boolean]

Type Inference Example

    <?php
    $a = 10;
    $a = 'string';

    // C++
    Variant v_a;

Why is strong typing faster?

    $a = $b + $c;

    // what a weakly typed runtime has to do
    if (is_integer($b) && is_integer($c)) {
      $a = (int)$b + (int)$c;
    } else if (is_array($b) && is_array($c)) {
      $a = array_merge((array)$b + (array)$c);
    } else {
      …
    }

    // C++ when both operands are inferred to be integers
    int64 v_a = v_b + v_c;
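The difference can be made concrete with a small C++ comparison. This is a sketch with a deliberately tiny, hypothetical `Variant` that only holds an integer or a double; HipHop's real `Variant` covers many more types.

```cpp
#include <cstdint>

// Hypothetical two-type variant: a tag plus storage for each alternative.
struct Variant {
  enum class Kind { Int, Double };
  Kind kind;
  std::int64_t i;
  double d;

  static Variant fromInt(std::int64_t v) { return Variant{Kind::Int, v, 0.0}; }
  static Variant fromDouble(double v) { return Variant{Kind::Double, 0, v}; }
};

// Weakly typed add: tags must be inspected on every call.
Variant add_variant(const Variant& b, const Variant& c) {
  if (b.kind == Variant::Kind::Int && c.kind == Variant::Kind::Int) {
    return Variant::fromInt(b.i + c.i);
  }
  const double x = b.kind == Variant::Kind::Int ? static_cast<double>(b.i) : b.d;
  const double y = c.kind == Variant::Kind::Int ? static_cast<double>(c.i) : c.d;
  return Variant::fromDouble(x + y);
}

// Strongly typed add: no tags, no branches, a single machine instruction.
inline std::int64_t add_int(std::int64_t b, std::int64_t c) { return b + c; }
```

When type inference proves both operands are integers, the compiler can emit the equivalent of `add_int` and skip the branching in `add_variant` entirely.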

Type Inference Blockers

    <?php
    function foo() {
      if ($success) return 10; // integer
      return false; // doh'
    }
    $arr[$a] = 10; // doh'
    ++$a; // $a can be a string actually!
    $a = $a + 1; // $a can become a double, ouch!

RTTI-Guided Optimization

    <?php
    function foo($x) { ... }
    foo(10);
    foo('test');

    // C++
    void foo(Variant x) { ... }

Type Specialization Method 1

    template<typename T>
    void foo(T x) {
      // generate code with generic T (tough!)
    }

- Pros: smaller generated code
- Cons: no type propagation

Type Specialization Method 2

    void foo(int64 x) {
      // generate code assuming x is integer
    }
    void foo(Variant x) {
      // generate code assuming x is variant
    }

- Pros: type propagation
- Cons: variant case is not optimized

Type Specialization Method 3

    void foo(int64 x) {
      // generate code assuming x is integer
    }
    void foo(Variant x) {
      if (is_integer(x)) {
        foo(x.toInt64());
        return;
      }
      // generate code assuming x is variant
    }

- Pros: optimized for integer case
- Cons: large code size
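Method 3 can be mimicked with ordinary C++ overloads. The sketch below is an illustration under assumptions: `Variant` is a hypothetical struct holding either an integer or a string, and the function bodies return tagged strings only so that the chosen path is observable.

```cpp
#include <cstdint>
#include <string>

// Hypothetical variant: either an integer or a string.
struct Variant {
  bool is_int;
  std::int64_t i;
  std::string s;
};

// Specialized body: compiled assuming x is an integer.
std::string foo(std::int64_t x) { return "int:" + std::to_string(x * 2); }

// Generic entry point: checks the runtime type and forwards to the
// specialization when possible, otherwise runs the variant code path.
std::string foo(const Variant& x) {
  if (x.is_int) return foo(x.i);
  return "variant:" + x.s;
}
```

Call sites with a statically known integer bind directly to the first overload, while callers that only have a `Variant` still reach the fast body after a single type check. Both bodies are emitted, which is where the code-size cost comes from.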

Type Hints

    <?php
    function foo(int $a) {
      string $b;
    }
    class bar {
      public array $c;
    }
    bar $d;

Strongly Typed Collection Classes
- That omnipotent "array" in PHP
- Swapping out underlying implementation:
  - Array escalation
- PHP classes:
  - Vector
  - Set
  - Map: un-ordered
  - Then Array: ordered map
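One possible reading of "array escalation" is a container that starts with a cheap dense representation and switches to a general map only when an operation requires it. The sketch below illustrates that reading and is not HipHop's implementation: `EscalatingArray` is a hypothetical name, values are limited to integers, and `std::map` (ordered by key) stands in for PHP's insertion-ordered array.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

class EscalatingArray {
 public:
  void set(std::int64_t key, std::int64_t value) {
    if (!escalated_) {
      const std::size_t n = dense_.size();
      if (key >= 0 && static_cast<std::size_t>(key) < n) {
        dense_[static_cast<std::size_t>(key)] = value;  // overwrite in place
        return;
      }
      if (key >= 0 && static_cast<std::size_t>(key) == n) {
        dense_.push_back(value);  // append keeps keys 0..n contiguous
        return;
      }
      escalate();  // non-contiguous key: vector form no longer fits
    }
    sparse_[key] = value;
  }

  // Throws std::out_of_range for a missing key in either representation.
  std::int64_t get(std::int64_t key) const {
    if (!escalated_) return dense_.at(static_cast<std::size_t>(key));
    return sparse_.at(key);
  }

  bool escalated() const { return escalated_; }

 private:
  // Copy the dense elements into the map and switch representations.
  void escalate() {
    for (std::size_t i = 0; i < dense_.size(); ++i) {
      sparse_[static_cast<std::int64_t>(i)] = dense_[i];
    }
    dense_.clear();
    escalated_ = true;
  }

  std::vector<std::int64_t> dense_;
  std::map<std::int64_t, std::int64_t> sparse_;
  bool escalated_ = false;
};
```

Code that only ever appends keeps vector-speed access, and the general map cost is paid only by arrays that actually use arbitrary keys.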

Compiler Friendly Scripting Language
If all problems described here are considered when designing a new scripting language, will it run faster than Java?