PHP Web Applications Data Management Martin Kruli by
PHP Web Applications Data Management Martin Kruliš by Martin Kruliš (v 2. 0) 14. 3. 2019 1
Select Your Charset � One Charset to Rule Them All ◦ HTML, PHP, database (connection), text files, … ◦ Determined by the language(s) used �Unicode covers almost every language ◦ Early incoming, late outgoing conversions � Charset in Meta-data ◦ Must be in HTTP headers header('Content-Type: text/html; charset=utf-8'); ◦ Do not use HTML meta element with http-equiv �Except special cases (like saving HTML file locally) by Martin Kruliš (v 2. 0) 14. 3. 2019 2
Multi-byte Strings � Multibyte Character Encoding ◦ Some charsets (e. g. , UTF-8, UTF-16, …) ◦ Standard string functions are ANSI based �They treat each byte as a char � Multibyte String Functions Library ◦ Standard library, often present in PHP ◦ Duplicates most of the standard string functions, but with prefix mb_ (mb_strlen, mb_strpos, …) ◦ Encoding conversions mb_convert_encoding() ◦ mb_internal_encoding() – specifies the internal encoding used in PHP by Martin Kruliš (v 2. 0) 14. 3. 2019 3
Comparisons and Conversions � Lexicographical Comparison of Strings ◦ Best to be done elsewhere (in DBMS for instance) ◦ The strcmp() function is binary safe ◦ The locale must be set correctly (setlocale()) � Iconv Library ◦ An alternative to Multibyte String Functions ◦ Fewer functions ◦ Easier for encoding conversions �Can deal with missing mappings and replacements by Martin Kruliš (v 2. 0) 14. 3. 2019 4
Regular Expressions (Revision) � String Search Patterns ◦ Special syntax that encodes a program (language) for regular automaton ◦ Simple to use �Encoding is (mostly) human readable ◦ POSIX and Perl Standards �POSIX is deprecated � Usage ◦ Searching strings, listing matches ◦ Find and replace ◦ Splitting a string into an array of strings by Martin Kruliš (v 2. 0) 14. 3. 2019 5
Data Serialization � Built-in De/Serialization Functions ◦ serialize(), unserialize() ◦ Converts (almost) any type to string (and back) �Except for resources and (some) objects $data = [ 'foo' => 42, 'bar' => 'Text: "quoted"', 'flag' => false, ]; echo serialize($data); a: 3: {s: 3: "foo"; i: 42; s: 3: "bar"; s: 14: "Text: "quoted""; s: 4: "flag"; b: 0; } by Martin Kruliš (v 2. 0) 14. 3. 2019 6
JSON - Revision � Java. Script Object Notation (JSON) ◦ Lightweight interchange format for structured data ◦ Based on subset of Java. Script language ◦ Otherwise language independent �Many parsers exist with frontends for many languages ◦ Intended for replacing XML in simple scenarios � Syntax ◦ Two basic structures: collections and lists ◦ Supports strings, numbers, bools, and null type ◦ Unicode safe by Martin Kruliš (v 2. 0) 14. 3. 2019 7
JSON – Revision � JSON Example Ordered list (PHP array) [ { Number (int) "Student. Id": 42, "Name": "John Smith" }, { } Unicode string Named collection (PHP object or associative array) "Student. Id": 54, "Name": "Jane Johnson", "Graduated": true Boolean literal ] by Martin Kruliš (v 2. 0) 14. 3. 2019 8
JSON in PHP � JSON Functions ◦ json_decode() – JSON string -> object/array struct. ◦ json_encode() – serialize any PHP type (except resource) into JSON string ◦ json_last_error(), json_last_error_msg() � Customizing Serialization Process ◦ Json. Serializable interface ◦ Abstract method json. Serialize() that converts object into anything that can be processed by json_encode() by Martin Kruliš (v 2. 0) 14. 3. 2019 9
Neon � Neon Structured Format ◦ Basically derived from JSON by loosening the rules �Omitting quotes �Removing braces and commas �Using bullets to create lists �Adding comments ◦ Much more better in scenarios, where users/coders have to write the data manually �E. g. , configuration files ◦ Library for PHP exists �And for Java. Script and Python by Martin Kruliš (v 2. 0) 14. 3. 2019 10
YAML � YAML Ain’t Markup Language ◦ Similar to Neon ◦ PECL package exists for PHP Array items are prefixed by dash - Student. Id: 42 Support for explicit typing Name: John Smith Avg. Mark: !!float 1 # nerd - Student. Id: 54 Name: Jane Johnson We can fall back to JSON syntax Graduated: true - { Student. Id: 19, Name: Jono Grant } by Martin Kruliš (v 2. 0) 14. 3. 2019 11
Extensible Markup Language <? xml version="1. 0" encoding="windows-1250"? > <contacts> <person id="person 1" typ="friend"> <surname>Smith</surname> <name>John</name> <email>smith@gmail. com</email> <phone>607987654</phone> </person > <person id="person 2"> <surname>Jones</surname> <name>Charlie</name> <email>jones@yahoo. com</email> </person> </contacts> by Martin Kruliš (v 2. 0) 14. 3. 2019 12
Parsing XML � Parsing Process ◦ Loading XML from text file into memory structures � Simple API for XML (SAX) ◦ The document is processed sequentially ◦ Important “events” are reported to the user �Opening/closing an element, attribute found, … � Document Object Model (DOM) ◦ Tree-based structure with object oriented API ◦ Nodes of the tree represent elements, attributes, text content, …. ◦ More suitable for mutable documents by Martin Kruliš (v 2. 0) 14. 3. 2019 13
PHP DOM Implementation � Document Object Model in PHP ◦ Set of PHP-native classes ◦ Implementing DOM 3 as defined by W 3 C �DOMDocument – entire XML document �DOMElement – tree node of one XML element �DOMAttr – attribute of an element �DOMNode – base class for all tree classes �DOMCharacter. Data – plain text contents �… ◦ Can be used to process HTML as well ◦ Supports schema validation by Martin Kruliš (v 2. 0) 14. 3. 2019 14
XPath and XSLT � XPath Query Language (DOMXPath) ◦ Inspired by filesys. paths (elements ~ directories) ◦ Various types of queries that could yield �Boolean, number, string, or set of DOM nodes ◦ Quite expressive (num. operations, conditions, …) � XSLT Transformations (XSLTProcessor) ◦ XSLT describes transformation of one XML document to another (or to HTML, or to plain-text) �XSLT rules are stored in XML document ◦ Uses XPath for search queries ◦ Turing complete language by Martin Kruliš (v 2. 0) 14. 3. 2019 15
Simple XML in PHP � Simplified DOM API ◦ Also loads the document into tree structure ◦ All nodes are Simple. XMLElement objects �Tree traversal is implemented by overloading class iterators and magic methods __get(), __set(), … ◦ The structure can be modified and saved ◦ Some advanced functions are also available �Integrated XPath query processor �Conversions to/from DOM structure by Martin Kruliš (v 2. 0) 14. 3. 2019 16
Working with Files � Low Level C-like API ◦ fopen(), fread(), fwrite(), fclose(), … ◦ Data are treated as strings � File-wide Operations ◦ fpassthru(), readfile() – whole file is spilled to the output ◦ file_get_contents(), file_put_contents() �Read/write the entire file to/from a string ◦ file() – reads entire file as an array of lines � Newer OOP API ◦ Spl. File. Object, Spl. Temp. File. Object by Martin Kruliš (v 2. 0) 14. 3. 2019 17
File Properties and Rights � File Properties (Size, Type, Creation Time, …) ◦ Dedicated functions for some properties �filesize(), filetype(), filectime() ◦ Reading all properties at once �stat(), lstat(), fstat(), Spl. File. Info � POSIX Access Rights (rwx) ◦ Testing functions �is_file(), is_readable(), is_writeable(), … ◦ Modification of rights and owner �chmod(), chown(), chgrp() by Martin Kruliš (v 2. 0) 14. 3. 2019 18
Directories and Paths � Traversing Directories ◦ Iterative – opendir(), readdir(), closedir() ◦ scandir() – returns array with directory contents ◦ dir() – returns object of Directory class � Directory ◦ ◦ Paths glob() – expanding wildcards dirname(), basename() – get part of pathinfo() – parses path and return components realpath() – converts symbolic to real path �Absolute path without links, . /, and. . / by Martin Kruliš (v 2. 0) 14. 3. 2019 19
Formatted Files � CSV Files ◦ Comma (semicolon) separated values ◦ API uses file handlers from fopen() �fgetcsv() – reads one CSV line �fputcsv() – writes one CSV line � INI Files ◦ Configuration files with “name = value” items ◦ parse_ini_file() – reads the entire file and yields an associative array �Optionally divides the data to sections by Martin Kruliš (v 2. 0) 14. 3. 2019 20
Compression � ZIP Files (*. zip) ◦ Rather complicated (ZIP is multi-file container) ◦ Represented by Zip. Archive class � Zlib – gzip library (*. gz) ◦ Similar functions to standard C-like API �gzopen(), gzread(), gzwrite(), gzclose(), … � BZip Library (*. bz 2) ◦ Similar to Zlib (bzopen(), …) � Other libraries are available (RAR, LZF, …) ◦ TAR only as PECL extension by Martin Kruliš (v 2. 0) 14. 3. 2019 21
Binary Data in PHP � Binary Data ◦ Represented as strings �ord() – converts character to int (its ordinal value) �Handling data byte by byte may be slow �pack(format, data, …), unpack() �Convert data to binary string according to format descriptor �Typically deals with numerical formats (ints, floats) �Also handles endianity �base 64_encode(), base 64_decode() �Base 64 presents secure way how to embed binary data in textual formats �Note that base 64 is not URL-safe by itself by Martin Kruliš (v 2. 0) 14. 3. 2019 22
Processing Images � GD 2 ◦ Original library in PHP, become obsolete � Cairo ◦ Fast 2 D drawing library written in C ◦ Supports transformations and Bezier curves � Gmagic ◦ A Swiss army knife for image manipulation ◦ Large variety of supported formats � Image. Magick ◦ Well established library for image manipulation ◦ Designed even for more complex operations by Martin Kruliš (v 2. 0) 14. 3. 2019 23
My. SQL � My. SQL Revision ◦ Original mysql API is deprecated (as of PHP 5. 5) ◦ My. SQL Improved (mysqli) API �Dual object/procedural interface �Procedural interface is similar to original (deprecated) API �Advanced connectivity features �Persistent connections, compression, encryption �Directly supports transactions ◦ My. SQL Native Driver (mysqlnd) extension �More direct access to My. SQL server �Additional features (e. g. , asynchronous queries) by Martin Kruliš (v 2. 0) 14. 3. 2019 24
My. SQL � My. SQLi Query Example $mysqli = new mysqli('server', 'login', . . . ); $stmt = $mysqli->prepare('SELECT * FROM subjects WHERE students = ? AND credits >= ? '); $stmt->bind_param('si', 'I 2', 3); $stmt->execute(); $res = $stmt->get_result(); while (($subj = $res->fetch_object())) { echo "$subj->fullnamen"; Really an assignment } $res->close(); by Martin Kruliš (v 2. 0) 14. 3. 2019 25
Other Database Systems � Postrgre. SQL ◦ Quite powerful alternative for My. SQL ◦ Similar API to My. SQL �pg_connect(), pg_query(), … � SQLite ◦ SQL engine running directly on filesystem ◦ Small projects, where regular DBMS is not available � Commercial Giants ◦ MSSQL, Oracle, d. Base, IBM DB 2, … by Martin Kruliš (v 2. 0) 14. 3. 2019 26
Database Abstraction � Database Abstraction Layers ◦ Solving the problem of portability �If the application refrains from using specific SQL statements or DBMS APIs � PHP Data Objects (PDO) ◦ Generic interface for various RDBMS ◦ Extension php_pdo, individual DB systems are in separate extensions (php_pdo_mysql, …) ◦ Simple object oriented API �PDO and PDOStatement classes �Minimalistic, no special functions by Martin Kruliš (v 2. 0) 14. 3. 2019 27
Database Abstraction � Database Abstraction Library (Di. Bi) ◦ Simpler work with statements, result extraction, … dibi: : connect([ 'driver' => 'mysqli', … ]); $r = dibi: : query('SELECT name FROM [users] Automated WHERE [login] = %s', $login); sanitization $username = $r->fetch. Single(); // $r->fetch. All(), $r->fetch. Pairs(), foreach($r. . . ) dibi: : query('INSERT INTO [table]', [ 'number' => 1, 'str' => "What’s up? " ]); Data fetching INSERT INTO `table` (`number`, `str`) VALUES (1, 'What's up? ') by Martin Kruliš (v 2. 0) 14. 3. 2019 28
Data Model � Object Oriented Approach ◦ Allows you express data entities naturally �Customer, Product, Order, Invoice, … ◦ Offers benefits like inheritance �Customer, Seller, Admin ~ derived from User ◦ Promotes encapsulation �Well defined interface �Hides the details of actual data representation/storage ◦ Much more readable �Self documented code �Easily decorated with doc comments by Martin Kruliš (v 2. 0) 14. 3. 2019 29
Data Model � Object Oriented Approach ◦ Object persistence �Explicit way to flush objects to persistent storage ◦ Object relations �Similar to database relations (1: 1, 1: N, M: N, …) ◦ Collections �Implicit �All customers �Defined by object relations (orders of a customer) �Explicit �Orders that are due today by Martin Kruliš (v 2. 0) 14. 3. 2019 30
ORM � Object-relational Mapping (ORM) ◦ Technique which creates object-oriented API over (relational) database ◦ Benefits �Much simpler for the programmer (no need for SQL) �Much less error prone ◦ Object-relational impedance mismatch �Set of difficulties which are encountered when matching relational DB and OOP API �E. g. , how to save a list of objects of a class User or derived classes (Admin, Editor, …) into relational database? �Mapping IS-A hierarchy, private members, references, … by Martin Kruliš (v 2. 0) 14. 3. 2019 31
ORM � Doctrine ◦ Very popular database framework for PHP �Contains both DB abstraction layer and O/R mapping ◦ Provides transparent persistence of PHP objects �The mapping must be provided by the means of XML, YAML, or PHP annotations �A DB schema is generated from the PHP classes ◦ The Entity. Manager frontend �Provides methods to load/store objects of specific type �Allows utilizing SQL for more complex cases and converts results into mapped objects by Martin Kruliš (v 2. 0) 14. 3. 2019 32
ORM � Doctrine Example /** @Entity @Table(name="subjects") **/ class Lecture { /** @Id @Column(type="integer") @Generated. Value **/ protected $id; /** @Column(type="string") **/ protected $fullname; /** @Many. To. One(target. Entity="User", inversed. By="teach_lectures") **/ protected $teacher; . . . public function get. Description. String() {. . . } public function get. Students() {. . . } } by Martin Kruliš (v 2. 0) 14. 3. 2019 33
ORM � Doctrine Example $entity. Manager = Entity. Manager: : create($conn, $config); $subj = $entity. Manager->find('Lecture', (int)$id); $subj->set. Name('Web Applications'); $entity. Manager->flush(); $subjs = $entity. Manager->get. Repository('Lecture') ->find. By([ 'programme' => 'I 2' ]); foreach ($subjs as $subj) { echo $subj->get. Description. String(); foreach ($subj->get. Students() as $student) {. . . } } by Martin Kruliš (v 2. 0) 14. 3. 2019 34
ORM � Doctrine Query Builder ◦ API for procedural query construction $qb->select('u') Employs chaining for simpler ->from('User', 'u') modifications composition ->where('u. id = ? 1') ->order. By('u. name', 'ASC'); � Doctrine Query Language ◦ Restricted SQL dialect ◦ Object (entity) oriented, rather than table oriented $query = $em->create. Query('SELECT u. id FROM User u WHERE : group. Id MEMBER OF u. groups'); $query->set. Parameter('group. Id', $group); by Martin Kruliš (v 2. 0) 14. 3. 2019 35
ORM � Doctrine SQL Preparation Process Entity Manager Query Builder AST DQL Statement Intermediate tree representation by Martin Kruliš (v 2. 0) 14. 3. 2019 36
Not. ORM � Not-ORM Abstraction ◦ Stands somewhere between simple abstraction like Di. Bi and full ORM system ◦ SQL queries are not written in strings, but semiautomatically procedurally composed �Programmer has to be familiar with SQL ◦ Unlike ORM, the API does not have to know complete DB schema �But knowing/inferring FK relations is very useful in order to generate table joins properly ◦ Original API was proposed and implemented by Jakub Vrána by Martin Kruliš (v 2. 0) 14. 3. 2019 37
Not. ORM � Not. ORM Example $db = new Not. ORM(. . . ); $subj = $db->lectures[42]; $subj->fullname = 'Web Applications'; $subj->update(); foreach ($db->lectures() ->where('programme', 'I 2') ->order('teacher. name') as $subj) { echo $subj->teacher['name'], ' ', $subj['fullname']; } by Martin Kruliš (v 2. 0) 14. 3. 2019 38
Integrity � Maintaining Data Integrity ◦ File locking ◦ Synchronization primitives (e. g. , mutexes) ◦ Database transactions �Perhaps the best solution �Begin, Commit, Rollback �Optimistic database transaction system may rollback your transactions as deadlock prevention �Restart rollback-ed transactions �Order your updates (if your DBMS uses 2 -phase locking) by Martin Kruliš (v 2. 0) 14. 3. 2019 39
Optimizations � Database ◦ Indices Optimizations �For columns often used in filters (WHERE clause) ◦ (Materialized) Views ◦ Stored Procedures/Functions �Common data handling routines (in SQL, PL/SQL) ◦ Triggers �Procedures triggered on insert/update/delete �Making writes more expensive, so that reads would be faster �There are typically much more reads than writes by Martin Kruliš (v 2. 0) 14. 3. 2019 40
Migrations � Database Schema Management ◦ One of the most essential and the most disregarded issues in IS database utilization ◦ Database schema evolves with the application �Application is deployed in releases �Well defined snapshots in time �Code can be updated by simple replace, but not the data ◦ Migration �Defines transition between two database schemas �Also handles data conversion �Some systems (like Doctrine) can generate and manage migrations (semi)automatically by Martin Kruliš (v 2. 0) 14. 3. 2019 41
Migrations � Doctrine Migrations ◦ A migration is a class �With well defined (sequential) ID �pre. Up(), up(), post. Up() methods �up() holds the SQL that modifies schema �pre. Down(), down(), post. Down() �Allow reverting the migration to previous state ◦ Management �Automated creation (model diff) �Automated deployment �Database holds a table where deployed migrations are recorded by Martin Kruliš (v 2. 0) 14. 3. 2019 42
Discussion by Martin Kruliš (v 2. 0) 14. 3. 2019 43
- Slides: 43