What is a Database A database is essentially


What is a Database? A database is essentially a highly organized collection of information, plus the system software needed to provide for its structural definition, creation, processing and retrieval. Before the mid-1960’s information was typically stored in separate files. The file structure was typically dictated by the needs for the application that processed the data, with the result that a change to the structure (e. g. need for adding a data item) required changing all application programs that made use of the file. One of the motivations for using a database is data independence, removing the need to replicate the entire structure of the data in each application that references it. It permits “need to know” access and control. 2

A VERY BRIEF HISTORY Two early models for database structure were “Hierarchical” and “CODASYL”. The latter was developed by a committee within the association responsible for the standardization and development of the COBOL language. Both were essentially ‘navigational’ models. Simply put, an application would start somewhere within the database and then follow links to other areas, eventually homing-in on the desired data item. Major problems existed in that it was very space in-efficient when it came to handling ‘sparse’ data, and it was processing in-efficient in that the maintenance of links was very time consuming. 3

A Different Approach Edwin Codd of IBM San Diego published papers outlining the use of multiple tables where each row in the table had a unique identifier known as the row’s primary key, and each column value in that row was tightly bound (related) to that unique value. If there was an optional and/or multiple value for that row, it was placed in a separate table with its own primary key, as well as the primary key of the tow in the table to which it was ‘joined’. This reference to the ‘parent’ primary key is known as a ‘foreign key’. This provided for a “parent/child” relationship as well as a mechanism for handling sparse or multi-valued information. This became the foundation for the Relational Database Management System – RDMS – and is the foundation for most current production database management system. 4

Components of a Database Management System Data Definition Language (DDL) Statements such as CREATE [DATABASE | TABLE | INDEX | RELATIONSHIP ] … ALTER [DATABASE | TABLE | INDEX | RELATIONSHIP ] … DELETE [DATABASE | TABLE | INDEX | RELATIONSHIP ] … Data Manipulation Language (DML) Statements such as SELECT … UPDATE … INSERT … DELETE … With the development of Graphic User Interface (GUI) environments, these languages are often hidden behind an “app” such as a designer supporting such things as ‘drag and drop’ objects representing tables and relationships. A few examples follow. 5
![DATA DEFINITION LANGUAGE (DDL) CREATE TABLE [dbo]. [tbl_REQUEST]( [pk_REQUEST] [int] IDENTITY(1, 1) NOT NULL, DATA DEFINITION LANGUAGE (DDL) CREATE TABLE [dbo]. [tbl_REQUEST]( [pk_REQUEST] [int] IDENTITY(1, 1) NOT NULL,](http://slidetodoc.com/presentation_image_h2/ba15d8c61c465da63012d3c2899e05fc/image-6.jpg)
DATA DEFINITION LANGUAGE (DDL) CREATE TABLE [dbo]. [tbl_REQUEST]( [pk_REQUEST] [int] IDENTITY(1, 1) NOT NULL, [Request_Date] [datetime] NOT NULL, [fk_PATIENT] [int] NOT NULL, [fk_TREATMENT_CTR] [int] NOT NULL, [HT_or_TH] [char](2) NOT NULL, [fk_ZIP_Origin] [varchar](10) NOT NULL, [fk_ZIP_DEST] [varchar](10) NOT NULL, [Earliest_Departure] [datetime](7) NOT NULL, [Latest_Arrival] [datetime](7) NOT NULL, [fk_STATUS] [int] NOT NULL, [Notes] [varchar](500) NULL, [fk_Entered_By] [int] NOT NULL, [Latest_Revision_Date] [datetime] NULL ) ON [PRIMARY]; My naming conventions • pk_ Primary Key • fk_ Foreign Key – points to other table which has this value as its primary key. 6

DATA TYPES Primitives Whole Numbers bit - occupies a single bit of memory Value of 0 is interpreted as 0, No, or False Value of 1 is interpreted as -1 or 1, Yes, or True. It is usually safest to test for "NOT 0" byte - typically consumes 8 bits, although some machines used 9. Most often used to store a single character, or a number of range 0 to 255, or -127 to 127 integer (or "int") - for 'whole' numbers, no fractional part. number of bits varies, thus maximum capacity. If supports sign then capacity is halved. long integer, or "long" - typically double the number of bits used by integer, thus increasing the maximum size. Again, may support sign. 7

Numbers with Fractional Parts single precision floating point - typically consumes 32 bits of memory Internally stored as +/- "0. " followed by 23 binary bits which can represent 6 digits of precision, with a multiplier of 10 raised to the +/- 0 to 127 double precision floating point - typically occupies 64 bits of memory Internally stored as +/- "0. " followed by 53 binary bits which can represent 15 digits of precision, with a multiplier of 10 raised to the +/- 0 to 1023 decimal - the number is stored using the base-10 system, i. e. the characters 0, 1, 2, . . . , 8, 9 and +, -, and '. '. 8

Constructed Types date - in some instances the number of days since "day 1" - for Unix file system it was Jan. 1, 1970 It is related to the adoption of the Julian calendar which introduced ''leap' days for February where a year is evenly divisible by 4, except for years evenly divided by 400 etc. Not all countries adopted at the same time as some continued to use the Gregorian calendar. datetime - a double word, the first is the date, the second elapsed micro-seconds since mid-night. money (or "currency") char a fixed-length character string - ASCII characters varchar a variable length character string - ASCII characters nvarchar - a variable length character string - UNICODE characters (two bytes per character) 9

NULL is a state, not a value There may be cases where it is necessary to differentiate between a measured value and the absence of data. For example, NULL and the number 0 are not the same. NULL indicates that there is no known value, or that it is not possible to have a value for some reason. 10

11

Late night TV offer – 44 volume set of Presidential twitters. For that I’m going to need an empty shelf so as to keep the 44 volumes together. 12

4 3 2 1 19 (NULL) 22 20 17 (NULL) 14 18 11 18 21 0 7 13 17 21 19 A B C D SELECT COUNT([BOOK_ID]) AS [HOW_MANY] FROM [BOOKS] GROUP BY [BOOKCASE], [SHELF] ORDER BY [BOOKCASE], [SHELF]; The above SQL SELECT statement will count the number of books on each shelf in each bookcase. A second query make use of the above query in the FROM clause of a ‘cross tab’ query and display it. 13

In the Relational Database Model there are two principle relationship types: The items in a row of the table must be single-valued and associated with the primary key of the row. Thus in the GAME table we have individual items related to the Phillies game #162 – teams, field, date and time, score, etc. In the second table shown, we have a row for each player in the batting order. It has his batting order sequence, his name, perhaps his starting field position. Again the row relationship rule is enforced. Not necessarily visible is the linkage mechanism which relates each batting order record to the particular game. In this case it would be “#162|PHL”, the foreign key value for the specific game. This illustrates the ‘Parent/Child’ relationship between tables. REFERENTIAL INTEGRITY: The system will not let you delete a PARENT row if any CHILD row is attached to it. 14

15

ALIAS Sometimes it is necessary to simultaneously reference two rows in the same table. To avoid ambiguity, this may be done via the ALIAS mechanism. In this example, the TEAM table only exists once but by two different names. SELECT t. G. DATETIME, t. V. [TEAM NAME] AS Visitor, t. H. [TEAM NAME] AS Home, t. G. SCORE FROM TEAM AS t. H INNER JOIN (TEAM AS t. V INNER JOIN GAME AS t. G ON t. V. pk_TEAM = t. G. [fk_VISITOR]) ON t. H. pk_TEAM = t. G. fk_HOME WHERE ( t. G. [GAME#] = 162 ); 16

17

A Light Look at a Serious Topic Hacking 18

A system is typically ‘hacked’ by somehow introducing program code (instructions) written by the hacker. A system that uses scripts is vulnerable to compromise if the contents of the script can be modified by a data stream provided by the hacker. It must be carefully constructed so that it doesn’t ‘break’ the script, or at least doesn’t break it until the damage has been done. There is a category of hacker known as an ‘ethical hacker’ – these are programmers who volunteer or are paid to hack an application upon invitation or challenge by the application owner. The following example is a simple but successful hack of a system developed to recognize motor vehicles on a police agency’s BOLO (Be On Look Out) list. 19

BOLO - Be On Look Out BOLO Van • • • 6 cameras / 6 computers 1 database server computer On-board Local Area Network Display for operator riding ‘shotgun’ Looks like delivery truck • Florist, Messenger etc. 20
![SELECT * FROM [BOLO] WHERE ( [TAG] = ‘%TAG%’ ); Optical Character Recognition software SELECT * FROM [BOLO] WHERE ( [TAG] = ‘%TAG%’ ); Optical Character Recognition software](http://slidetodoc.com/presentation_image_h2/ba15d8c61c465da63012d3c2899e05fc/image-21.jpg)
SELECT * FROM [BOLO] WHERE ( [TAG] = ‘%TAG%’ ); Optical Character Recognition software (OCR) scans the image for text and inserts it into the above script. The script then queries the database to see if the tag (license plate) exists in the database. SELECT * FROM [BOLO] WHERE ( [TAG] = ‘MI NI 5052’ ); 21

An ethical hacker decided that the automated reading of the tag’s value was a weak link and thus a good target for exploitation. The reasoning was that the OCR would read any and all text that it saw and try to process it. The scanner wasn’t likely to differentiate between the tag and various ornamentation such as vehicle brand, model. It might even read analyze a bumper sticker. The exploit is to have the scanner read a terse command in the (presumed) language used by the (presumed) database server and then execute it. The hacker printed up some bumper stickers and went for a ride. 22
![SELECT * FROM [BOLO] WHERE ( [TAG] = ‘%TAG%’ ); SELECT * FROM [BOLO] SELECT * FROM [BOLO] WHERE ( [TAG] = ‘%TAG%’ ); SELECT * FROM [BOLO]](http://slidetodoc.com/presentation_image_h2/ba15d8c61c465da63012d3c2899e05fc/image-23.jpg)
SELECT * FROM [BOLO] WHERE ( [TAG] = ‘%TAG%’ ); SELECT * FROM [BOLO] WHERE ( [TAG] = ‘X’ ); GO; DELETE * FROM [BOLO]; GO; ’ ); • • It may first check ‘MI NI 5052’. Then it won’t find tag “X” – not a problem. It will then delete all rows in the table BOLO. It will then try to execute a command consisting of the apostrophe and the right parenthesis. (green highlight) This is not a valid statement, a “syntax error” and will either be ignored, or crash the application. But the damage has already been done. 23

Here are several ways that the BOLO VAN application could have been hardened against SQL Injection exploits. 1. The computers running the application should have been restricted to READ ONLY access of the database, blocking DELETE. 2. The data stream from the OCR module should have been ‘scrubbed’ – it should not permit any SQL keywords or punctuation such as: • DELETE, UPDATE, INSERT, CREATE, MODIFY, DROP, ALTER • Strip commas, semi-colons, parenthesis etc. 3. The length of the data stream should be limited to ‘worst case’ for a valid tag. 4. Table name(s) and Column name(s) should be obfuscated or mangled to prevent educated guesses. 24

I understand from my source that the BOLO Van concept was developed in the U. K. in the late 1990’s. It was a ‘quick and dirty’ project, done as a ‘proof of concept’, and at that level it succeeded. It proved that it would work not only while slow-speed cruising streets and parking lots, but also while travelling on multi-line highways as the relative speed of adjacent vehicles was low enough to allow for reading the tags. In addition to the BOLO Vans, there are now static cameras linked to a server via encrypted, wired network connections. These are installed at high traffic volume points. 25

After 9/11 the New York City Police Department acquired about a half dozen BOLO Vans. These get the updated list just by passing a precinct house via a Wi-Fi connection. The static installations can put out an alert within seconds – “Wanted vehicle full description, photo etc. just crossed 7 th avenue at 42 nd St heading west. ” The description includes pertinent information – “armed and presumed dangerous” or “AMBER ALERT” etc. 26

When selecting a ‘Vanity Plate’ choose wisely. There may be unintended side effects. 27

Let’s wrap up with a highly-integrated database – the NYPD Real-Time Crime Center https: //www. youtube. com/watch? v=Qrp. Gvmm. Yh 1 U (Click to Full-Screen Video when it starts, ESC at end. ) 28
- Slides: 28