Introduction to Computer Science Databases and SQL Lecture

Introduction to Computer Science Databases and SQL Lecture a This material (Comp 4 Unit 5) was developed by Oregon Health & Science University, funded by the Department of Health and Human Services, Office of the National Coordinator for Health Information Technology under Award Number 90 WT 0001. This work is licensed under the Creative Commons Attribution-Non. Commercial-Share. Alike 4. 0 International License. To view a copy of this license, visit http: //creativecommons. org/licenses/by-nc-sa/4. 0/.

Databases and SQL Learning Objectives - 1 • Define and describe the purpose of databases (Lecture a) • Describe a relational database (Lecture a) • Describe data modeling and normalization (Lecture b) • Describe the structured query language (SQL) (Lecture c) 2

Databases and SQL Learning Objectives - 2 • Define the basic data operations for relational databases and how to implement them in SQL (Lecture c) • Design a simple relational database and create corresponding SQL commands (Lecture c) • Examine the structure of a health care database component (Lecture d) 3

Data Storage • Large component of computer systems is data management • Storing and retrieving data are important functions – Efficiency – Accessibility – Speed 4

Data Storage Options • Text/data files • Spreadsheets • Databases 5

Files • A collection of data stored electronically in a single location • Can store text or data • Files have different formats 6

Advantages/Disadvantages of Files • Advantages • Disadvantages – Easy to create and – Limited security store – Easy to share – Used by many applications – Multiple user access is not supported – Redundant and inconsistent data 7

Contact Information Example • File with contact information: – Sriveni Sharma, 1312 Main, Portland, OR, Community – – Hospital, Inc. Walter Chen, 14 12 th St. , Oakland, CA, Oakland Providers LLC Rachel Cohen, 1414 Main St. , Oakland, CA, Oakland Providers LLC Karthik Subramanian, 1312 Main Street, Portland, OR, Community Hospital Incorporated Kelly David, 14 12 th Street, Oakland, CA, Oakland Providers LLC 8

Quick! • Do Sriveni and Karthik work for the same company? • Is there an issue with Kelly’s and Walter’s information? • Can a computer application tell? • Give me a contact list sorted by last name • Imagine with 10, 000 contacts! 9

Quick! Answers • Sriveni and Karthik work for the same company, but it is represented differently • Kelly and Walter have the same address, but it is represented differently • It is hard for a computer application to tell • You CAN sort by hand – but it is a challenge 10

Another Problem • What do you do if “Community Hospital” becomes “Community General”? – Find every instance of “Community Hospital” or variation thereof – Change EVERY entry 11

Another Solution: Spreadsheets • Spreadsheet applications store, manipulate, and present data • Provide more functionality than plain text files – Calculations – Sorting – Filtering – Data analysis 12

Spreadsheet Example Figure 1: Open. Office Calc spreadsheet example. (Barratt, 2016, PD-US) 13

Advantages/Disadvantages of Spreadsheets • Advantages • Disadvantages – Widely available – Limited security – Powerful – Multiple user calculations – Basic sorting and filtering access is not supported – May contain redundant and inconsistent data 14

Databases • Definition – Structured data collection accessed electronically • Files can be seen as simple databases • Relational databases maintain relationships between data 15

Relational Databases • Introduced by Dr. Edgar Codd of IBM Research Laboratory in 1970 – “Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation)” • Definition – An organized collection of data accessible by electronic means where the information type and information relationships are maintained 16

Relational Database Contents • Tables contain multiple rows and columns of data • A relational database contains tables • Rows contain data of specified type(s) in a column order • Data and type are independent • Row order does not matter, but column order does 17

Advantages/Disadvantages of Relational Databases • Advantages • Disadvantages – Secure – Expertise required – Multiple user access – Limited data – Relationships calculations prevent redundancy and inconsistency – Optimized operations – Complex queries 18

Databases and SQL Summary – Lecture a • Data can be stored in files, spreadsheets, or databases – Files and spreadsheets o Widely available o Good for computations – Databases o Secure o Optimized for speed o Multiple user access o Store relationships 19

Databases and SQL References – Lecture a References American National Standards Institute. (2007). Information Systems - Coded Character Sets - 7 -Bit American National Standard Code for Information Interchange (7 -Bit ASCII). (No. ANSI INCITS 4 -1986 (R 2007)). Codd, E. F. (1970). A relational model of data for large shared data banks. Communications of the ACM, 13(6), 377 -387. Figures Figure 1: Open. Office Calc spreadsheet example. Baratt, A. (2016). Public Domain. 20

Introduction to Computer Science Databases and SQL Lecture a This material was developed by Oregon Health & Science University, funded by the Department of Health and Human Services, Office of the National Coordinator for Health Information Technology under Award Number 90 WT 0001. 21
- Slides: 21