5 Extraction Transformation and Loading ETL Loading Copyright

  • Slides: 16
Download presentation
5 Extraction, Transformation, and Loading (ETL) Loading Copyright © 2005, Oracle. All rights reserved.

5 Extraction, Transformation, and Loading (ETL) Loading Copyright © 2005, Oracle. All rights reserved.

Objectives After completing this lesson, you should be able to implement the following methods

Objectives After completing this lesson, you should be able to implement the following methods that are available for loading data: • SQL*Loader • External tables • OCI and direct-path APIs • Data Pump • Export/import 5 -2 Copyright © 2005, Oracle. All rights reserved.

Data-Loading Mechanisms You can use the following mechanisms for loading a data warehouse: •

Data-Loading Mechanisms You can use the following mechanisms for loading a data warehouse: • SQL*Loader • External tables • OCI and direct-path APIs • Export/import • Data Pump 5 -3 Copyright © 2005, Oracle. All rights reserved.

Loading Mechanisms • • 5 -4 SQL*Loader loads a formatted flat file into an

Loading Mechanisms • • 5 -4 SQL*Loader loads a formatted flat file into an existing table. It can perform basic transformations while loading. Direct-path loading may be used to decrease the load time. When you use this method, data in the flat file is not accessible until the data is loaded. Copyright © 2005, Oracle. All rights reserved.

SQL*Loader: Example • Control file used for loading the SALES table: LOAD DATA INFILE

SQL*Loader: Example • Control file used for loading the SALES table: LOAD DATA INFILE sh_sales. dat APPEND INTO TABLE sales FIELDS TERMINATED BY "|" (PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID, QUANTITY_SOLD, AMOUNT_SOLD) • The fact table can be loaded with the following command: $ sqlldr sh/sh control=sh_sales. ctl direct=true 5 -5 Copyright © 2005, Oracle. All rights reserved.

Loading Mechanisms • • • 5 -6 External tables are read-only tables where the

Loading Mechanisms • • • 5 -6 External tables are read-only tables where the data is stored outside the database in flat files. The data can be queried like a virtual table, using any supported language inside the database. No DML is allowed and no indexes can be created. The metadata for an external table is created using a CREATE TABLE statement. An external table describes how the external data should be presented to the database. Copyright © 2005, Oracle. All rights reserved.

Applications of External Tables External tables: • Allow external data to be queried and

Applications of External Tables External tables: • Allow external data to be queried and joined directly and in parallel without requiring it to be loaded into the database • Eliminate the need for staging the data within the database for ETL in data warehousing applications • Are useful in environments where an external source has to be joined with database objects and then transformed • Are useful when the external data is large and not queried frequently • Complement SQL*Loader functionalities: – Transparent parallelism – Full SQL capabilities for direct-path insertion 5 -7 Copyright © 2005, Oracle. All rights reserved.

Example of Defining External Tables CREATE TABLE sales_delta_xt ( prod_id NUMBER(6), cust_id NUMBER, time_id

Example of Defining External Tables CREATE TABLE sales_delta_xt ( prod_id NUMBER(6), cust_id NUMBER, time_id DATE, unit_cost, unit_price. . . ) ORGANIZATION external ( -- External Table TYPE oracle_loader –- Access Driver DEFAULT DIRECTORY data_dir –- Files Directory ACCESS PARAMETERS –- Similar to SQL*Loader ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US 7 ASCII BADFILE log_dir: 'sh_sales_%p. bad' LOGFILE log_dir: 'sh_sales_%p. log_xt' FIELDS TERMINATED BY "|" LDRTRIM ) location ( 'sales_delta. dat', data_dir 2: 'sales_delta 2. dat' )) PARALLEL 5 –- Independent from the number of files REJECT LIMIT UNLIMITED; 5 -9 Copyright © 2005, Oracle. All rights reserved.

Populating External Tables with Data Pump CREATE TABLE emp_ext (first_name, last_name, department_name) ORGANIZATION EXTERNAL

Populating External Tables with Data Pump CREATE TABLE emp_ext (first_name, last_name, department_name) ORGANIZATION EXTERNAL ( TYPE ORACLE_DATAPUMP DEFAULT DIRECTORY ext_dir LOCATION ('emp 1. exp', ) ) AS SELECT e. first_name, e. last_name, d. department_name FROM employees e, departments d WHERE e. department_id = d. department_id AND d. department_name in ('Marketing', 'Purchasing'); 5 -11 Copyright © 2005, Oracle. All rights reserved.

Privileges for External Tables Access to the external tables for other users requires: •

Privileges for External Tables Access to the external tables for other users requires: • SELECT on the table definition • READ access to the directory containing the file • WRITE access to the directory for the bad file and log file GRANT SELECT ON sh. sales_delta_xt TO oe; GRANT READ ON DIRECTORY data_dir TO oe; GRANT WRITE ON DIRECTORY log_dir TO oe; 5 -12 Copyright © 2005, Oracle. All rights reserved.

Defining External Tables Using SQL*Loader After creating a control file, SQL*Loader can generate a

Defining External Tables Using SQL*Loader After creating a control file, SQL*Loader can generate a log file with the SQL commands to: • Create the metadata for the external table • Insert the data into the target table • Drop the metadata for the external table sqlldr sh/sh control=sales_dec 00. ctl EXTERNAL_TABLE=GENERATE_ONLY LOG=sales_dec 00. sql 5 -13 Copyright © 2005, Oracle. All rights reserved.

Data Dictionary Information for External Tables DBA_EXTERNAL_LOCATIONS • OWNER • TABLE_NAME • LOCATION •

Data Dictionary Information for External Tables DBA_EXTERNAL_LOCATIONS • OWNER • TABLE_NAME • LOCATION • DIRECTORY_OWNER • DIRECTORY_NAME DBA_DIRECTORIES • OWNER • DIRECTORY_NAME • DIRECTORY_PATH 5 -14 DBA_EXTERNAL_TABLES • OWNER • NAME • TYPE_OWNER • TYPE_NAME • DEFAULT_DIRECTORY_ OWNER • DEFAULT_DIRECTORY_ NAME • REJECT_LIMIT Copyright © 2005, Oracle. All rights reserved.

Changing External Data Properties • Using the ALTER TABLE command, you can change: –

Changing External Data Properties • Using the ALTER TABLE command, you can change: – – – • DEFAULT DIRECTORY ACCESS PARAMETERS LOCATION REJECT_LIMIT Degree of parallelism Useful in situations where external files are changing: ALTER TABLE sales_delta_xt LOCATION ('newfile 1. dat') 5 -16 Copyright © 2005, Oracle. All rights reserved.

Other Loading Methods • OCI and direct-path APIs: – Allow transformation and loading at

Other Loading Methods • OCI and direct-path APIs: – Allow transformation and loading at the same time – Access an online source – Do not require an intermediary step such as a flat file • Export/import: – Is good for small loads – Allows for easy transfers between Oracle databases on different operating systems 5 -18 Copyright © 2005, Oracle. All rights reserved.

Summary In this lesson, you should have learned how to implement the following methods

Summary In this lesson, you should have learned how to implement the following methods that are available for loading data: • SQL*Loader • External tables • OCI and direct-path APIs • Data Pump • Export/import 5 -19 Copyright © 2005, Oracle. All rights reserved.

Practice 5: Overview This practice covers the following topics: • Loading data from a

Practice 5: Overview This practice covers the following topics: • Loading data from a flat file using SQL*Loader • Loading data from a flat file using external tables 5 -20 Copyright © 2005, Oracle. All rights reserved.