5 Extraction Transformation and Loading ETL Loading Copyright
- Slides: 16
5 Extraction, Transformation, and Loading (ETL) Loading Copyright © 2005, Oracle. All rights reserved.
Objectives After completing this lesson, you should be able to implement the following methods that are available for loading data: • SQL*Loader • External tables • OCI and direct-path APIs • Data Pump • Export/import 5 -2 Copyright © 2005, Oracle. All rights reserved.
Data-Loading Mechanisms You can use the following mechanisms for loading a data warehouse: • SQL*Loader • External tables • OCI and direct-path APIs • Export/import • Data Pump 5 -3 Copyright © 2005, Oracle. All rights reserved.
Loading Mechanisms • • 5 -4 SQL*Loader loads a formatted flat file into an existing table. It can perform basic transformations while loading. Direct-path loading may be used to decrease the load time. When you use this method, data in the flat file is not accessible until the data is loaded. Copyright © 2005, Oracle. All rights reserved.
SQL*Loader: Example • Control file used for loading the SALES table: LOAD DATA INFILE sh_sales. dat APPEND INTO TABLE sales FIELDS TERMINATED BY "|" (PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID, QUANTITY_SOLD, AMOUNT_SOLD) • The fact table can be loaded with the following command: $ sqlldr sh/sh control=sh_sales. ctl direct=true 5 -5 Copyright © 2005, Oracle. All rights reserved.
Loading Mechanisms • • • 5 -6 External tables are read-only tables where the data is stored outside the database in flat files. The data can be queried like a virtual table, using any supported language inside the database. No DML is allowed and no indexes can be created. The metadata for an external table is created using a CREATE TABLE statement. An external table describes how the external data should be presented to the database. Copyright © 2005, Oracle. All rights reserved.
Applications of External Tables External tables: • Allow external data to be queried and joined directly and in parallel without requiring it to be loaded into the database • Eliminate the need for staging the data within the database for ETL in data warehousing applications • Are useful in environments where an external source has to be joined with database objects and then transformed • Are useful when the external data is large and not queried frequently • Complement SQL*Loader functionalities: – Transparent parallelism – Full SQL capabilities for direct-path insertion 5 -7 Copyright © 2005, Oracle. All rights reserved.
Example of Defining External Tables CREATE TABLE sales_delta_xt ( prod_id NUMBER(6), cust_id NUMBER, time_id DATE, unit_cost, unit_price. . . ) ORGANIZATION external ( -- External Table TYPE oracle_loader –- Access Driver DEFAULT DIRECTORY data_dir –- Files Directory ACCESS PARAMETERS –- Similar to SQL*Loader ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US 7 ASCII BADFILE log_dir: 'sh_sales_%p. bad' LOGFILE log_dir: 'sh_sales_%p. log_xt' FIELDS TERMINATED BY "|" LDRTRIM ) location ( 'sales_delta. dat', data_dir 2: 'sales_delta 2. dat' )) PARALLEL 5 –- Independent from the number of files REJECT LIMIT UNLIMITED; 5 -9 Copyright © 2005, Oracle. All rights reserved.
Populating External Tables with Data Pump CREATE TABLE emp_ext (first_name, last_name, department_name) ORGANIZATION EXTERNAL ( TYPE ORACLE_DATAPUMP DEFAULT DIRECTORY ext_dir LOCATION ('emp 1. exp', ) ) AS SELECT e. first_name, e. last_name, d. department_name FROM employees e, departments d WHERE e. department_id = d. department_id AND d. department_name in ('Marketing', 'Purchasing'); 5 -11 Copyright © 2005, Oracle. All rights reserved.
Privileges for External Tables Access to the external tables for other users requires: • SELECT on the table definition • READ access to the directory containing the file • WRITE access to the directory for the bad file and log file GRANT SELECT ON sh. sales_delta_xt TO oe; GRANT READ ON DIRECTORY data_dir TO oe; GRANT WRITE ON DIRECTORY log_dir TO oe; 5 -12 Copyright © 2005, Oracle. All rights reserved.
Defining External Tables Using SQL*Loader After creating a control file, SQL*Loader can generate a log file with the SQL commands to: • Create the metadata for the external table • Insert the data into the target table • Drop the metadata for the external table sqlldr sh/sh control=sales_dec 00. ctl EXTERNAL_TABLE=GENERATE_ONLY LOG=sales_dec 00. sql 5 -13 Copyright © 2005, Oracle. All rights reserved.
Data Dictionary Information for External Tables DBA_EXTERNAL_LOCATIONS • OWNER • TABLE_NAME • LOCATION • DIRECTORY_OWNER • DIRECTORY_NAME DBA_DIRECTORIES • OWNER • DIRECTORY_NAME • DIRECTORY_PATH 5 -14 DBA_EXTERNAL_TABLES • OWNER • NAME • TYPE_OWNER • TYPE_NAME • DEFAULT_DIRECTORY_ OWNER • DEFAULT_DIRECTORY_ NAME • REJECT_LIMIT Copyright © 2005, Oracle. All rights reserved.
Changing External Data Properties • Using the ALTER TABLE command, you can change: – – – • DEFAULT DIRECTORY ACCESS PARAMETERS LOCATION REJECT_LIMIT Degree of parallelism Useful in situations where external files are changing: ALTER TABLE sales_delta_xt LOCATION ('newfile 1. dat') 5 -16 Copyright © 2005, Oracle. All rights reserved.
Other Loading Methods • OCI and direct-path APIs: – Allow transformation and loading at the same time – Access an online source – Do not require an intermediary step such as a flat file • Export/import: – Is good for small loads – Allows for easy transfers between Oracle databases on different operating systems 5 -18 Copyright © 2005, Oracle. All rights reserved.
Summary In this lesson, you should have learned how to implement the following methods that are available for loading data: • SQL*Loader • External tables • OCI and direct-path APIs • Data Pump • Export/import 5 -19 Copyright © 2005, Oracle. All rights reserved.
Practice 5: Overview This practice covers the following topics: • Loading data from a flat file using SQL*Loader • Loading data from a flat file using external tables 5 -20 Copyright © 2005, Oracle. All rights reserved.
- Static vs dynamic class loading
- Perbedaan granit single loading dan double loading
- Data extraction cleanup and transformation tools
- Etl design and development
- Data integration in data preprocessing
- It etl
- Procesos etl
- Etl prosessi
- David lexis
- Data modeling best practices for data warehousing
- Etl metadata
- Data services etl
- Omop etl
- Etl service manager
- Hpe nonstop enscribe to xml
- Etl process flow
- Etl acronimo