Brief Introduction to Relational Database and Big Data Analysis
Kunihiko Kaneko
Relational Database
• Problems in data sharing
  – Data is encoded in data files
  – Can other users understand the data files?
• Relational Database is a standard for the following
  – data format (i.e. the way to encode data)
  – data operations (query and update)
  – the way to describe data format
  – the way to describe constraints
Describing the Data Format
• A relational database is a set of tables
• Data format description: table_name(attribute name 1, attribute name 2, ...)
• Examples
  – product(id, product_name, type, cost, created_at)
  – score(name, score, student_name, created_at, updated_at)
Describing Constraints
• Data format description: score(name, score, student_name, created_at, updated_at)
• Constraints description (SQL language)
  – Type keywords: INTEGER, REAL, TEXT, DATETIME
  – Constraint keywords: NOT NULL, UNIQUE, PRIMARY KEY, etc.
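As a minimal sketch of how such a description looks in practice, the following Python snippet declares the score table in SQLite and shows the database rejecting rows that violate the constraints. The column types, the choice of (name, student_name) as the primary key, and the sample rows are assumptions made for illustration; the slides give only the attribute names and the keyword list.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Column types and the composite primary key are assumptions for this sketch.
conn.execute("""
    CREATE TABLE score (
        name         TEXT    NOT NULL,
        score        INTEGER NOT NULL,
        student_name TEXT    NOT NULL,
        created_at   DATETIME,
        updated_at   DATETIME,
        PRIMARY KEY (name, student_name)
    )
""")


def try_insert(row):
    """Insert one row; report whether the constraints accepted it."""
    try:
        conn.execute("INSERT INTO score VALUES (?, ?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Made-up sample rows.
first = try_insert(("math", 85, "Alice", "2024-01-01 00:00:00", None))
duplicate = try_insert(("math", 90, "Alice", "2024-01-02 00:00:00", None))
missing_score = try_insert(("math", None, "Bob", "2024-01-02 00:00:00", None))
second = try_insert(("math", 70, "Bob", "2024-01-02 00:00:00", None))

row_count = conn.execute("SELECT COUNT(*) FROM score").fetchone()[0]
print(first, duplicate, missing_score, second, row_count)
```

The duplicate key and the missing score are refused by the database itself, so every program that shares the table sees the same rules.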
Data Format of Relational Database
• A relational database is a set of tables
• Each table is a set of rows
Database Browser (SQLiteman)
[Screenshot labels: list of the table names in a database; command editor; a table]
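What a database browser displays can also be obtained from a program. The sketch below assumes SQLite, where table names are stored in the built-in sqlite_master catalog and column descriptions come from PRAGMA table_info. The helper names list_tables and list_columns and the column types are hypothetical, chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two tables named in the slides; the column types are assumptions.
conn.execute("""
    CREATE TABLE product (
        id INTEGER PRIMARY KEY, product_name TEXT, type TEXT,
        cost REAL, created_at DATETIME
    )
""")
conn.execute("""
    CREATE TABLE score (
        name TEXT, score INTEGER, student_name TEXT,
        created_at DATETIME, updated_at DATETIME
    )
""")


def list_tables(conn):
    """Return the table names in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def list_columns(conn, table):
    """Return the column names of one table.

    The table name is interpolated into the statement, so pass only
    trusted names (for example, names returned by list_tables).
    """
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


tables = list_tables(conn)
product_columns = list_columns(conn, "product")
print(tables, product_columns)
```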
Relational Database for Data Storage
• Data sources (various data formats)
• Description of data formats and constraints
• Ways to use the database
  – Interactive commands (written in the SQL language)
  – Programs (SQL statements embedded in a programming language)
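The "embedded SQL" case can be sketched as follows: a Python program sends the same SQL statements that could be typed interactively, passing values through "?" placeholders. The product rows and the column types are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        type TEXT,
        cost REAL,
        created_at DATETIME
    )
""")

# Made-up sample rows; id is assigned automatically by SQLite.
conn.executemany(
    "INSERT INTO product (product_name, type, cost, created_at) "
    "VALUES (?, ?, ?, ?)",
    [
        ("apple", "fruit", 100.0, "2024-01-01 09:00:00"),
        ("melon", "fruit", 500.0, "2024-01-01 09:05:00"),
        ("pen", "stationery", 150.0, "2024-01-02 10:00:00"),
    ],
)

# Update: change the cost of one product.
conn.execute("UPDATE product SET cost = ? WHERE product_name = ?",
             (120.0, "apple"))
conn.commit()

# Query: products cheaper than a threshold, cheapest first.
cheap = conn.execute(
    "SELECT product_name, cost FROM product WHERE cost < ? ORDER BY cost",
    (200.0,),
).fetchall()
print(cheap)
```

Placeholders keep the values separate from the SQL text, which avoids quoting mistakes when values come from user input or a data file.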
Currency exchange data source

Description of data formats and constraints:

    cat >/tmp/a.$$.sql <<-SQL
    create table quote (
      seq INTEGER PRIMARY KEY NOT NULL,
      at datetime,
      USD real, GBP real, EUR real, CAD real, CHF real, SEK real,
      DKK real, NOK real, AUD real, NZD real, ZAR real, BHD real,
      IDR100 real, CNY real, HKD real, INR real, MYR real, PHP real,
      SGD real, KRW100 real, THB real, KWD real, SAR real, AED real,
      MXN real, PGK real, HUF real, CZK real, PLN real, RUB real,
      TRY real, a01 real, IDR100b real, CNYb real, MYRb real,
      KRW100b real, TWD real
    );
    SQL
    cat /tmp/a.$$.sql | sqlite3 /tmp/quotedb

A program to read the data source and store it into the database:

    cat >/tmp/a.$$.sql <<-SQL
    .mode csv
    .import /tmp/a.$$.csv quote
    SQL
    # tail -n +2 /tmp/Book1.csv > /tmp/a.$$.csv
    cat /tmp/a.$$.sql | sqlite3 /tmp/quotedb

Plot program (plots values by date):

    M <- table_to_melt(T, T$at, "%Y/%m/%d")
    # ggplot(M, aes(x=Date, y=Value, colour=factor(Attr.Num))) + geom_point(size=1);
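The load step of the shell script above (drop the CSV header line, then import the rows into the quote table) could also be written as a program with embedded SQL. The following is a sketch under stated assumptions, not the author's program: the table is cut down to two currency columns, the CSV layout (a header line, then at, USD, EUR) is assumed, the sample values are invented, and load_quotes is a hypothetical helper name.

```python
import csv
import io
import sqlite3

# Invented sample data standing in for the real CSV file.
SAMPLE_CSV = """at,USD,EUR
2014/01/06,104.5,142.1
2014/01/07,104.2,142.0
"""


def load_quotes(conn, csv_text):
    """Create a reduced quote table if needed and load CSV rows into it."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS quote (
            seq INTEGER PRIMARY KEY NOT NULL,
            at  DATETIME,
            USD REAL,
            EUR REAL
        )
    """)
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header line, like `tail -n +2`
    rows = [(at, float(usd), float(eur)) for at, usd, eur in reader]
    # seq is omitted, so SQLite assigns 1, 2, ... automatically.
    conn.executemany(
        "INSERT INTO quote (at, USD, EUR) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_quotes(conn, SAMPLE_CSV)
stored = conn.execute("SELECT seq, at, USD FROM quote ORDER BY seq").fetchall()
print(loaded, stored)
```

Converting each value with float() before inserting makes a malformed number fail at load time instead of being stored as text.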
Plot Examples using Relational Database
[Figures: Fukuoka City map data; digital elevation map data]
Three-dimensional Plot Examples using Relational Database
[Figures: point cloud data; polygon data]
Data Analysis Example – Future Prediction
Data Analysis Example – Trend and Outlier
Summary
• Relational Database is easy
  – Describing data format and constraints is easy
  – Database browser (such as SQLiteman)
• Relational Database can handle various types of data
  – Spatial
  – Temporal
• There are already many types of data analysis methods