The LEAD Effort at Unidata
The Unidata Seminar will start at 1:30 PM MST

The LEAD Effort at Unidata
Tom Baltzer, Brian Kelly, Doug Lindholm, Anne Wilson
December 14, 2005

LEAD is funded by the National Science Foundation under the following Cooperative Agreements:
ATM-0331594, ATM-0331591, ATM-0331574, ATM-0331480, ATM-0331579, ATM-0331586, ATM-0331587, ATM-0331578

Outline
1. Setting the Stage: Introduction to LEAD and Unidata’s LEAD Efforts: Anne
2. Application of current technology on the LEAD testbeds: Tom
3. The LEAD Hardware at Unidata: Brian
4. The THREDDS Data Repository: Doug

Setting the Stage: Introduction to LEAD and Unidata’s LEAD Efforts
Anne Wilson

Current IT Barriers to Mesoscale Weather Research and Education
• Data and tools usable mainly by experts
• Researchers and educators constrained by hardware limitations
• Rigid, brittle technology can’t accommodate mesoscale weather research requirements:
  – real time, on demand, dynamic data processing and sensor steering

A Solution: Linked Environments for Atmospheric Discovery (LEAD)
• Funded by NSF Large Information Technology Research (ITR) award
• Produce a web service based, scalable framework for handling meteorological data and model output:
  – Identifying, accessing, preparing, assimilating, predicting, managing, analyzing, mining, visualizing
  – Independent of data format and physical location
• Dynamically adaptive workflows and steering of sensors

The LEAD Vision
• Data access via querying and browsing
• Analysis and forecast tools that can be composed into workflows
• Workflows and sensors that respond to the weather
• Support users ranging from grade 6 to experienced researchers

LEAD Objectives
• Lower the barrier for entry and increase the sophistication of problems that can be addressed by complex end-to-end weather analysis and forecasting/simulation tools
• Improve our understanding of and ability to detect, analyze and predict mesoscale atmospheric phenomena by interacting with weather in a dynamically adaptive manner
• Result: Paradigm change in how experiments are conceived and performed

LEAD Challenges
• Challenge: Disparate, high volume data sets
  – Requirements: Efficient transmission, remote subsetting and aggregation, reliable, robust storage, format independence
• Challenge: Huge computational demands, e.g. ensemble forecasting
  – Requirements: Distributed, load balanced computations
• Challenge: Use of existing complex numerical models and data assimilation systems
  – Requirements: Make existing tools work in web service environment
• Challenge: Lack of controlled vocabulary
  – Requirements: Ontology, dictionary
• Challenge: Support for 6–12, college, graduate, and advanced research
  – Requirements: Robust security, user aids, education modules, meaningful responses

Multidisciplinary Effort
• Meteorology
• Computer Science and Information Technology
• Education and Outreach

LEAD Institutions
> 100 scientists, students, technical staff

LEAD Thrust Groups
• Data*
• Orchestration
• Portal
• Meteorology
• Grid and Web Services Test Bed*
• Education and Outreach Test Bed
*Major Unidata areas

LEAD Data Subsystem
Diagram labels: LEAD Portal; Ontology Service; Query Service; myLEAD Catalog; Dictionary; Resource Catalog; LEAD Data Repository (LDR); Public Data (e.g. IDD data)

Unidata Technology Used in LEAD
• LDM/IDD Data Delivery: near real time data delivery
• THREDDS: catalogs of data and their associated metadata
• Common Data Model (CDM): single interface to multiple data formats
• THREDDS Data Server (TDS): integrated OPeNDAP and HTTP data access
• Integrated Data Viewer (IDV): visualization
• THREDDS Data Repository (TDR): data storage framework
• Decoders
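As a rough illustration of what "catalogs of data and their associated metadata" give a client, the sketch below parses a small, made-up THREDDS catalog with Python's standard library and lists each dataset with an access path. The catalog contents (service base, dataset names, paths) are assumptions invented for the example and are not taken from the LEAD testbeds; only the InvCatalog 1.0 namespace is the real one.

```python
import xml.etree.ElementTree as ET

# Namespace of THREDDS InvCatalog 1.0 documents.
NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical catalog: names, paths and the service base are invented.
SAMPLE_CATALOG = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Example testbed catalog">
  <service name="dods" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="Regional forecasts">
    <dataset name="WRF 2005-12-14 00Z" urlPath="wrf/2005121400.nc"/>
    <dataset name="WRF 2005-12-14 12Z" urlPath="wrf/2005121412.nc"/>
  </dataset>
</catalog>
"""


def list_datasets(catalog_xml):
    """Return (name, access path) pairs for every dataset that has a urlPath."""
    root = ET.fromstring(catalog_xml.strip())
    service = root.find(f"{{{NS}}}service")
    base = service.get("base") if service is not None else ""
    found = []
    # iter() walks nested <dataset> elements in document order.
    for dataset in root.iter(f"{{{NS}}}dataset"):
        url_path = dataset.get("urlPath")
        if url_path is not None:  # container datasets have no urlPath
            found.append((dataset.get("name"), base + url_path))
    return found


datasets = list_datasets(SAMPLE_CATALOG)
for name, path in datasets:
    print(name, "->", path)
```

The container dataset ("Regional forecasts") is skipped because it carries no `urlPath`; only the leaf datasets are directly accessible.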

Unidata and LEAD
• Unidata also brings:
  – Experience with atmospheric data
  – Community of users
  – Robust, fielded software

Recent LEAD-Related Efforts
Goal: Support both LEAD and our community
2. Application of current technology on our LEAD testbed: Tom
3. Structure of the LEAD testbed: Brian
4. THREDDS Data Repository: Doug

Application of Current Technologies on the LEAD Testbed Systems
Tom Baltzer

Acronyms for LEAD Tools
ADAS – ARPS Data Assimilation System (Center for Advanced Prediction of Storms at OU)
ADaM – Algorithm Development and Mining (University of Alabama at Huntsville)
IDV – Integrated Data Viewer (Unidata)
LDM/IDD – Local Data Manager/Internet Data Distribution (Unidata)
OPeNDAP – Open-source Project for a Network Data Access Protocol (OPeNDAP.org)
THREDDS – Thematic Real-time Environmental Distributed Data Services
TDS – THREDDS Data Server
TDR – THREDDS Data Repository (Unidata)
WRF – The Weather Research and Forecasting Model (ARW Core - NCAR)
Also: WS-Eta – Workstation Eta Model

LEAD Testbed Systems
• Testbed systems at several LEAD locations to provide:
  – Data
    • Near real-time data ingest, storage and access
    • LEAD data product storage and access
  – Data Processing
    • High Performance Computing
    • Grid and Web Services
• Allow each institution to develop methods by which their capabilities fit into the LEAD effort
• Single Web Portal system at Indiana Univ. to bring it all together and provide the User Interface

LEAD Grid
Map labels: MU; HU; CSU; Unidata; UI; IU; UNC; OU; UAH
Legend: Core Academic Partner + Grid Test Bed; Core Academic Partner + Education Test Bed; Core Academic Partner + Grid Test Bed + Education Test Bed

Data Aspects of LEAD Testbeds

LEAD Testbed Systems
• UPC technologies being leveraged to meet LEAD needs
  – LDM/IDD
  – THREDDS
  – IDV
  – NetCDF Decoders
  – OPeNDAP (Unidata supported)

Typical LEAD Testbed (Current Source Data Configuration)
Diagram labels: LEAD Grid System; Forecast Model Output; Weather station observations; Aircraft data; Radar data; IDD; Decoders; THREDDS Catalog; OPeNDAP; GridFTP; Testbed System

Typical LEAD “Data” Testbed (Future Source Data Configuration)
Diagram labels: LEAD Grid System; Forecast Model Output; Weather station observations; Aircraft data; Radar data; IDD; Decoders; THREDDS Catalog; OPeNDAP; TDS & TDR; GridFTP; Testbed System
Note: UPC plans a ~6 month store

LEAD Processing on the Unidata Testbed System

UPC Processing Testbed (Current Configuration) - WRF being Steered by Chiz’s GEMPAK precipitation locator
Diagram labels: NCEP NAM (Eta) Forecast; Initial and Boundary Conditions; Precipitation Locator; Center Lat/Lon; WRF; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access; Unidata LEAD Test Bed
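The steering idea on this slide (a precipitation locator hands a center latitude/longitude to WRF) can be sketched as follows. This is only an assumed stand-in: the slides do not describe how the GEMPAK-based locator actually chooses its point, so the "heaviest grid cell" rule, the function names, the request keys and the sample grid are all hypothetical.

```python
def locate_precipitation_center(lats, lons, precip):
    """Return (lat, lon) of the wettest grid cell, or None if the grid is dry.

    precip[i][j] is the precipitation at latitude lats[i], longitude lons[j].
    """
    best_value, best_center = 0.0, None
    for lat, row in zip(lats, precip):
        for lon, value in zip(lons, row):
            if value > best_value:
                best_value, best_center = value, (lat, lon)
    return best_center


def make_domain_request(center, half_width_deg=5.0):
    """Build a hypothetical regional-domain request centered on the given point."""
    lat, lon = center
    return {
        "center_lat": lat,
        "center_lon": lon,
        "south": lat - half_width_deg,
        "north": lat + half_width_deg,
        "west": lon - half_width_deg,
        "east": lon + half_width_deg,
    }


# Invented 3 x 3 precipitation field (arbitrary units).
lats = [35.0, 40.0, 45.0]
lons = [-105.0, -100.0, -95.0]
precip = [
    [0.0, 1.0, 0.5],
    [0.2, 3.0, 7.5],
    [0.0, 0.4, 2.0],
]

center = locate_precipitation_center(lats, lons, precip)
request = make_domain_request(center)
print(request)
```

Returning `None` for a dry grid lets the caller decide whether to skip the regional run or fall back to a default domain.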

Next Steps
Diagram labels: NCEP NAM (Eta) Forecast; Boundary Conditions; Initial Conditions; Precipitation Locator; Millersville ADaM Precip Locator; Center Lat/Lon; CAPS ADAS Assimilation; WRF; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access; Unidata LEAD Test Bed

Longer Term
IDD Datasets
• Radar
• Surface & Upper air
• Satellite
• NCEP NAM
Diagram labels: NCEP NAM (Eta) Forecast; Boundary Conditions; Precipitation Locator; ADaM; ADAS; WRF; Center Lat/Lon; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access; Unidata LEAD Test Bed

Ultimately
IDD Datasets
• Radar
• Surface & Upper air
• Satellite
• NCEP NAM
Diagram labels: LEAD Grid System; NCEP NAM (Eta) Forecast; Boundary Conditions; Precipitation Locator; ADaM; ADAS; WRF; Web Service (three labels); Center Lat/Lon; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access; Unidata LEAD Test Bed

Objectives for UPC Testbed
• Testing ground for integrating new UPC and LEAD technologies
• Determining ways to bring LEAD technologies to the Unidata community
• “Operational” environment for LEAD
• Processing cluster
• Data Storage
  – ~6 months of IDD data
  – LEAD product data

The LEAD Hardware at Unidata
Brian Kelly

Existing LEAD Infrastructure
Diagram labels: Lead3; Lead1; HTTP Server; THREDDS Server; OpenDAP Server; LDM Node; NFS Server; Cluster Node; GRID Server; Development Tools; NFS Server; Cluster Node; Lead4; TDS; LDM Node; NFS Server; Cluster Node; Lead2; GRID Server; NFS Server; Cluster Node; Cluster Monitoring; LeadStor; 8 TB of Disk; NFS Server

UCAR/Unidata LEAD Infrastructure
• Portal Servers for Web, TDS, Grid and LDM Services
• ~30 GFLOP Processing Cluster
• 40 TB Storage Cluster

LEAD Portal Systems
Diagram labels: HTTP, TDS and Grid Server; LDM Server; Test Server; Processing Cluster Head Node; Storage Cluster Gateway; Gigabit Network for NFS Storage Access

LEAD Processing Cluster
Beowulf Cluster Connected by a Gigabit Fibre Network
Each Node Contains Two Athlon 2400+ CPUs
Cluster Uses OSCAR with the MPICH MPD
Eight Nodes Deliver ~30 GFLOPS

LEAD Storage Cluster
Diagram labels: LEAD Storage Head Node; LEAD Storage Gigabit Network; LEAD Storage Nodes

LEAD Storage Node
● One (1) Guanghsing GHI-583 5U Case
  – 24 hot-swappable SATA trays
  – 1000 W 2+2 power supply
● One (1) Tyan Thunder K8SD Pro Motherboard
  – Dual Opteron CPUs
  – Four 64-bit 133/100 MHz PCI-X Slots
  – Two Gigabit Ethernet ports
● One (1) AMD Opteron 242 Processor
  – 1.6 GHz CPU
● Three (3) Broadcom RAIDCore BC4853
  – Eight SATA ports
  – Controller spanning
  – Advanced RAID
● Twenty-Four (24) Seagate Barracuda ST3400832AS
  – 7200 RPM 400 GB SATA Drives

LEAD Storage Node
Twenty-Four (24) 400 GB Drives
Divided into Two (2) Eleven Column RAID 5 Arrays and Two Hot Spares
Form Two (2) 4 TB LUNs Using bcraid
Each Node Publishes the Two LUNs over iSCSI
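The per-node numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes the usual RAID 5 rule that one member's worth of capacity in each array goes to parity, and uses decimal units (1 TB = 1000 GB), which is how the slide's "4 TB" appears to be counted; the variable names are illustrative only.

```python
DRIVE_GB = 400          # per-drive capacity from the slide
DRIVES_PER_NODE = 24
ARRAYS_PER_NODE = 2
COLUMNS_PER_ARRAY = 11  # drives in each RAID 5 array


def raid5_usable(columns, member_size):
    """Usable capacity of a RAID 5 array: one member's worth is parity."""
    return (columns - 1) * member_size


hot_spares = DRIVES_PER_NODE - ARRAYS_PER_NODE * COLUMNS_PER_ARRAY
lun_gb = raid5_usable(COLUMNS_PER_ARRAY, DRIVE_GB)
node_usable_gb = ARRAYS_PER_NODE * lun_gb

print(f"hot spares per node: {hot_spares}")
print(f"each LUN: {lun_gb / 1000:.0f} TB, node total: {node_usable_gb / 1000:.0f} TB")
```

So 22 of the 24 drives hold the two arrays, the remaining two are the hot spares, and each eleven-column array yields ten drives' worth of space, i.e. the 4 TB LUN quoted on the slide.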

LEAD Storage Gateway
● Mounts Each Node's Two (2) 4 TB LUNs Published via iSCSI
● Builds Two (2) 20 TB 6 column RAID 5 Meta-devices using mdadm
● Divides Each Meta-device into Volumes using LVM
● Each Volume is Formatted with an XFS Filesystem
● Each Filesystem is Published with NFS
Result: 40 TB of mid-performance double-redundant storage
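The gateway totals follow from the same RAID 5 rule applied a second time, on top of the per-node LUNs. In the sketch below the node count is an inference, not a figure from the slides: two 6-column meta-devices need twelve LUNs, and each node publishes two, which suggests six storage nodes. Decimal units (1 TB = 1000 GB) are assumed.

```python
LUN_GB = 4000                # each node-level LUN (4 TB)
LUNS_PER_NODE = 2
META_DEVICES = 2
COLUMNS_PER_META_DEVICE = 6  # LUNs striped into each meta-device
DRIVES_PER_NODE = 24
DRIVE_GB = 400


def raid5_usable(columns, member_size):
    """Usable capacity of a RAID 5 set: one member's worth is parity."""
    return (columns - 1) * member_size


meta_device_gb = raid5_usable(COLUMNS_PER_META_DEVICE, LUN_GB)
total_gb = META_DEVICES * meta_device_gb

# Inferred, not stated on the slides: number of storage nodes behind the gateway.
nodes = META_DEVICES * COLUMNS_PER_META_DEVICE // LUNS_PER_NODE
raw_gb = nodes * DRIVES_PER_NODE * DRIVE_GB
efficiency = total_gb / raw_gb

print(f"meta-device: {meta_device_gb / 1000:.0f} TB, total: {total_gb / 1000:.0f} TB")
print(f"inferred nodes: {nodes}, raw: {raw_gb / 1000:.1f} TB, usable: {efficiency:.0%}")
```

"Double-redundant" here corresponds to parity at two levels: each array inside a node has its own RAID 5 parity, and each meta-device adds RAID 5 parity across LUNs. If each meta-device takes one LUN from every node (again an assumption), losing a whole node costs each meta-device only one column.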

THREDDS Data Repository (TDR)
Doug Lindholm

LEAD Architecture
Data Storage Perspective
• LEAD Data Grid: Unidata, NCSA, OU, UAH, IU

LEAD Architecture
Data Storage Perspective
• LEAD Data Grid: Unidata, NCSA, OU, UAH, IU
• “Atomic” Capabilities: Storage Locator, Data Mover, ID Generator, Name Resolver, Metadata Generator, Metadata Crosswalk, Cataloger (myLEAD)

LEAD Architecture
Data Storage Perspective
• LEAD Data Grid: Unidata, NCSA, OU, UAH, IU
• “Atomic” Capabilities: Storage Locator, Data Mover, ID Generator, Name Resolver, Metadata Generator, Metadata Crosswalk, Cataloger (myLEAD)
• Application Services: Forecast Model (WRF), Data Assimilation (ADAS), Data Mining (ADaM), Visualization (IDV)

LEAD Architecture
Data Storage Perspective
• LEAD Data Grid: Unidata, NCSA, OU, UAH, IU
• “Atomic” Capabilities: Storage Locator, Data Mover, ID Generator, Name Resolver, Metadata Generator, Metadata Crosswalk, Cataloger (myLEAD)
• Application Services: Forecast Model (WRF), Data Assimilation (ADAS), Data Mining (ADaM), Visualization (IDV)
• Portal
• User

LEAD Architecture
Data Storage Perspective
• LEAD Data Grid: Unidata, NCSA, OU, UAH, IU
• Data Repository: THREDDS Data Repository
• “Atomic” Capabilities: Cataloger (myLEAD), Storage Locator, Data Mover, ID Generator, Name Resolver, Metadata Generator, Metadata Crosswalk
• Application Services: Forecast Model (WRF), Data Assimilation (ADAS), Data Mining (ADaM), Visualization (IDV)
• Portal
• User

THREDDS Data Repository Component Architecture
Components and their operations:
• Storage Locator: locateStorage()
• Unique ID Generator: generateUniqueID()
• Data Mover: moveData()
• Name Resolver: mapIDToURL()
• Metadata Generator: generateMetadata()
• Metadata Crosswalk: translateMetadata()
• Cataloger: catalogMetadata()
• Data Storage
THREDDS Data Repository operations: putData(), discoverData(), getData()
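One plausible reading of this architecture is that the repository's three public operations are compositions of the smaller component operations. The sketch below illustrates that reading with a toy, in-memory stand-in. It is not the TDR implementation: the method names are Python-style versions of the slide's names, the call order inside `put_data` is an assumption, and the ID format, `mem://` URLs and metadata fields are invented.

```python
import itertools


class InMemoryRepository:
    """Toy stand-in for the repository; every component is an in-memory placeholder."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._storage = {}  # url -> bytes     (Data Storage)
        self._urls = {}     # id -> url        (Name Resolver state)
        self._catalog = {}  # id -> metadata   (Cataloger state)

    # "Atomic" capabilities, one per component on the slide.
    def generate_unique_id(self):
        return f"tdr:{next(self._ids)}"

    def locate_storage(self, size):
        return "mem://store/"  # a real locator would choose among storage sites

    def move_data(self, data, location, name):
        url = location + name
        self._storage[url] = data
        return url

    def map_id_to_url(self, data_id):
        return self._urls[data_id]

    def generate_metadata(self, name, data):
        return {"name": name, "size": len(data)}

    def translate_metadata(self, metadata):
        # Crosswalk into a hypothetical target schema with different key names.
        return {"title": metadata["name"], "bytes": metadata["size"]}

    def catalog_metadata(self, data_id, metadata):
        self._catalog[data_id] = metadata

    # Repository operations composed from the capabilities above.
    def put_data(self, name, data):
        data_id = self.generate_unique_id()
        location = self.locate_storage(len(data))
        self._urls[data_id] = self.move_data(data, location, name)
        metadata = self.translate_metadata(self.generate_metadata(name, data))
        self.catalog_metadata(data_id, metadata)
        return data_id

    def discover_data(self, text):
        return [i for i, m in self._catalog.items() if text in m["title"]]

    def get_data(self, data_id):
        return self._storage[self.map_id_to_url(data_id)]


repo = InMemoryRepository()
forecast_id = repo.put_data("wrf_2005121400.nc", b"example bytes")
print(forecast_id, repo.discover_data("wrf"), repo.get_data(forecast_id))
```

Keeping each capability behind its own method is what allows a deployment to replace one of them (for example the cataloger) without touching `put_data`, `discover_data` or `get_data`.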

THREDDS Data Repository Component Architecture
LEAD Configuration
• Storage Locator: Resource Broker, locateStorage()
• Unique ID Generator: URLs, generateUniqueID()
• Data Mover: Trebuchet, moveData()
• Metadata Generator: THREDDS Metadata Generator, generateMetadata()
• Metadata Crosswalk: THREDDS to LEAD Crosswalk, translateMetadata()
• Cataloger: myLEAD, catalogMetadata()
• Data Storage
Other operations shown: mapIDToURL()
THREDDS Data Repository operations: putData(), discoverData(), getData()

THREDDS Data Repository Component Architecture
Alternate Configuration
• Storage Locator: locateStorage()
• Data Mover: moveData()
• Metadata Generator: THREDDS Metadata Generator, generateMetadata()
• Cataloger: THREDDS Catalog, catalogMetadata()
• Data Storage
Other operations shown: generateUniqueID(), mapIDToURL(), translateMetadata()
THREDDS Data Repository operations: putData(), discoverData(), getData()
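The LEAD and alternate configurations differ in which implementation fills a given component role while the repository operations stay the same. A minimal sketch of that pluggability, using only the cataloger role: the class names, the record layout and the generated XML snippet are all hypothetical stand-ins, and neither class talks to a real myLEAD or THREDDS service.

```python
from typing import Protocol


class Cataloger(Protocol):
    """Role interface: anything with catalog_metadata() can fill the slot."""

    def catalog_metadata(self, data_id: str, metadata: dict) -> None: ...


class MyLeadStandIn:
    """Placeholder for a myLEAD-style cataloger: keeps records keyed by ID."""

    def __init__(self):
        self.records = {}

    def catalog_metadata(self, data_id, metadata):
        self.records[data_id] = metadata


class ThreddsCatalogStandIn:
    """Placeholder for a THREDDS-catalog-style cataloger: appends dataset entries."""

    def __init__(self):
        self.entries = []

    def catalog_metadata(self, data_id, metadata):
        self.entries.append(f'<dataset ID="{data_id}" name="{metadata["name"]}"/>')


class Repository:
    """Repository facade that is handed its cataloger at construction time."""

    def __init__(self, cataloger: Cataloger):
        self.cataloger = cataloger

    def put_data(self, data_id, name):
        self.cataloger.catalog_metadata(data_id, {"name": name})


lead_cataloger = MyLeadStandIn()
Repository(lead_cataloger).put_data("id-1", "wrf.nc")

alternate_cataloger = ThreddsCatalogStandIn()
Repository(alternate_cataloger).put_data("id-1", "wrf.nc")

print(lead_cataloger.records)
print(alternate_cataloger.entries)
```

The same `put_data` call produces a keyed record in one configuration and a catalog entry in the other, which is the essence of swapping components behind a fixed interface.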

Unidata Architecture
• Components: Internet Data Distribution (IDD); Local Data Manager (LDM); Data Storage

Unidata Architecture
• Components: Internet Data Distribution (IDD); Local Data Manager (LDM); Data Storage
• Arrow labels: access

Unidata Architecture
• Components: Internet Data Distribution (IDD); Local Data Manager (LDM); Data Storage; THREDDS Catalog; THREDDS Client API
• Arrow labels: access; discover

Unidata Architecture
• Components: Internet Data Distribution (IDD); Local Data Manager (LDM); Data Storage; Common Data Model (CDM); THREDDS Catalog; THREDDS Client API
• Arrow labels: access; discover

Unidata Architecture
• Components: Internet Data Distribution (IDD); Local Data Manager (LDM); Data Storage; Common Data Model (CDM); THREDDS Catalog; THREDDS Data Server (TDS); THREDDS Client API
• Arrow labels: access; discover

Unidata Architecture
• Components: Internet Data Distribution (IDD); Local Data Manager (LDM); Data Storage; Common Data Model (CDM); THREDDS Catalog; THREDDS Data Server (TDS); THREDDS Data Repository (TDR); THREDDS Client API
• Arrow labels: access; discover; store

Unidata Architecture
• Components: Internet Data Distribution (IDD); Local Data Manager (LDM); Data Storage; Common Data Model (CDM); THREDDS Catalog; THREDDS Data Server (TDS); THREDDS Data Repository (TDR); Locally Generated Data; THREDDS Client API
• Arrow labels: access; discover; store

Unidata Architecture
• Components: Internet Data Distribution (IDD); Local Data Manager (LDM); Data Storage; Common Data Model (CDM); THREDDS Catalog; THREDDS Data Server (TDS); THREDDS Data Repository (TDR); Locally Generated Data; THREDDS Client API
• Notification targets: E-mail; Application (e.g. IDV); Service
• Arrow labels: access; discover; store; notify
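The "store" then "notify" arrows suggest a publish/subscribe arrangement in which e-mail, applications and services are told when new data lands in the repository. The slides do not say how notification is implemented, so the following is only a generic observer-pattern sketch with hypothetical names; the subscribers are plain Python callables rather than real e-mail or IDV hooks.

```python
class NotifyingStore:
    """Toy data store that tells every subscriber about each newly stored item."""

    def __init__(self):
        self._subscribers = []
        self.items = {}

    def subscribe(self, callback):
        """Register a callable that receives the name of each stored item."""
        self._subscribers.append(callback)

    def store(self, name, data):
        self.items[name] = data
        for callback in self._subscribers:
            callback(name)


store = NotifyingStore()

# Stand-ins for the three kinds of target on the slide.
email_outbox = []
application_inbox = []
service_log = []
store.subscribe(lambda name: email_outbox.append(f"New data stored: {name}"))
store.subscribe(application_inbox.append)
store.subscribe(lambda name: service_log.append(("stored", name)))

store.store("wrf_2005121400.nc", b"example bytes")
print(email_outbox, application_inbox, service_log)
```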

Questions?
