A Domain-Specific Modeling Language for Scientific Data Composition
- Slides: 22
A Domain-Specific Modeling Language for Scientific Data Composition and Interoperability
- Hyun Cho, University of Alabama at Birmingham
- Jeff Gray, University of Alabama
File Formats: Image Files
- Organize and store digital images that are composed of either pixel or vector (geometric) data
- Bitmap-based
  - Created by scanners and digital cameras
  - TIF, JPG, BMP
- Vector-based
  - Geometric description + bitmap
  - Resolution-independent and infinitely scalable
  - Font, DRW, CGM
File Formats: Music and Audio Files
- Store audio data produced by analog-to-digital converters
- Key parameters
  - Sample rate, resolution, number of channels
- Uncompressed formats
  - WAV, AIFF, and AU
- Lossless compression formats
  - FLAC, Lossless Windows Media Audio (WMA)
- Lossy compression formats
  - MP3, Lossy Windows Media Audio (WMA)
File Formats: Text Files
- File formats that are structured as plain text, representing a sequence of lines
- ASCII, TXT
File Formats: Compound File Formats
- Used to structure the contents of a document in the file
- Contain a number of independent data streams that are organized in a hierarchy
  - Stream: corresponds to a file in a file system
  - Storage: corresponds to a sub-directory in a file system
- MS Office, OpenOffice
Characteristics of Generic File Formats
- Can handle only one or two data types
  - Numeric data or alphanumeric data
- May limit the file size
  - Mostly limited to a maximum file size of 2 GB
- May increase file I/O time linearly as the file size grows
Source: An In-Depth Examination of Java I/O Performance and Possible Tuning Strategies, http://pages.cs.wisc.edu/~remzi/Classes/736/Fall2000/Project-Writeups/KaiHongfei.html
These generic file formats are not appropriate for storing and retrieving scientific data because they were not designed to maintain high volumes of complex scientific data, such as high-resolution images, massive numerical data, and graphs.
Scientific Data Format: NetCDF 3
- Network Common Data Form
- Machine-independent file format
  - Supports a wide variety of platforms, including Linux, MacOS, and Windows
- Represents multi-dimensional arrays with ancillary data
[Figure: a sequence of array slices from Time = 1 to Time = n]
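To make "multi-dimensional arrays with ancillary data" concrete, here is a minimal sketch in plain Python. It does not use the NetCDF library; the class names (`Dataset`, `Variable`), the `time_slice` helper, and the sample values are all hypothetical and only mirror the idea of named dimensions, variables, and attributes, with one array slice per time step.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """A named array plus ancillary attributes (hypothetical model class)."""
    name: str
    dimensions: tuple          # dimension names, outermost first
    data: list                 # flat values in row-major order
    attributes: dict = field(default_factory=dict)


@dataclass
class Dataset:
    """Holds named dimensions and the variables defined over them."""
    dimensions: dict = field(default_factory=dict)   # name -> length
    variables: dict = field(default_factory=dict)    # name -> Variable

    def add_variable(self, var):
        # The flat data must contain exactly one value per grid cell.
        expected = 1
        for dim in var.dimensions:
            expected *= self.dimensions[dim]
        if len(var.data) != expected:
            raise ValueError(f"{var.name}: expected {expected} values, got {len(var.data)}")
        self.variables[var.name] = var

    def time_slice(self, var_name, t):
        """Return the values of one time step (Time = t + 1 in the figure)."""
        var = self.variables[var_name]
        if var.dimensions[0] != "time":
            raise ValueError(f"{var_name} is not indexed by time")
        n_time = self.dimensions["time"]
        if not 0 <= t < n_time:
            raise IndexError(f"time index {t} out of range")
        step = len(var.data) // n_time
        return var.data[t * step:(t + 1) * step]


# Illustrative data only: 2 time steps over a 2 x 3 grid.
ds = Dataset(dimensions={"time": 2, "lat": 2, "lon": 3})
ds.add_variable(Variable("temperature", ("time", "lat", "lon"),
                         list(range(12)), {"units": "K"}))
print(ds.time_slice("temperature", 1))
```

The attributes dictionary plays the role of the ancillary data: it travels with the array, so a reader of the file does not need outside knowledge to interpret the values.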
Scientific Data Format: HDF5
- Hierarchical Data Format
- File format for managing any kind of data
- Supports high-volume and/or complex data
- Platform-independent
- Flexible, efficient storage and I/O
Characteristics of the Scientific Data File Formats
- Self-descriptive
  - Contain metadata describing the contained data types and their organization
- Directly accessible
  - Arbitrary data can be accessed through APIs
- Concurrently accessible
  - Multiple threads or processes can access data simultaneously
  - Enables high-performance computing and speedier access
- Archivable
  - Have their own archiving mechanism to back up and restore a high volume of data
Challenges in Using the Scientific Data File Formats
- Use different representations to organize the file structure
  - Each file format needs its own data visualization and composition
  - It is difficult to exchange data between two or more scientific data formats
- Manage the evolution of APIs
  - Challenging to verify that APIs evolve in accordance with the evolution of the file specification
- Maintain stability of existing applications under API evolution
  - User applications are subject to API changes
- Limited support for data integration among heterogeneous scientific data formats
Framework for Scientific Data File Management
Model-Driven Engineering (MDE) and Domain-Specific Modeling (DSM)
- MDE: specifies and generates software systems based on high-level models
- Domain-Specific Modeling (DSM): a paradigm of MDE that uses notations and rules from an application domain
- Metamodel: defines a Domain-Specific Modeling Language (DSML) by specifying the entities and their relationships in an application domain
- Model: an instance of the metamodel
- Model transformation: a process that converts one or more models to various levels of software artifacts (e.g., other models, source code)
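The relationship between metamodel, model, and model transformation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entity types (`File`, `Group`, `Dataset`), the containment rules, and the function names `conforms` and `transform` are hypothetical and are not taken from the authors' DSML or from any modeling tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metamodel:
    """Entities of the domain and the containment relationships among them."""
    entity_types: frozenset
    relationships: frozenset   # allowed (parent_type, child_type) pairs


@dataclass
class Element:
    """One node of a model; a model is a tree of elements."""
    type: str
    name: str
    children: list = field(default_factory=list)


def conforms(element, metamodel):
    """A model is an instance of the metamodel if every node obeys its rules."""
    if element.type not in metamodel.entity_types:
        return False
    for child in element.children:
        if (element.type, child.type) not in metamodel.relationships:
            return False
        if not conforms(child, metamodel):
            return False
    return True


def transform(element, indent=0):
    """Model transformation: turn a model into a textual artifact (an outline)."""
    lines = [" " * indent + f"{element.type} {element.name}"]
    for child in element.children:
        lines.extend(transform(child, indent + 2))
    return lines


# Hypothetical metamodel for a hierarchical scientific file.
FILE_METAMODEL = Metamodel(
    entity_types=frozenset({"File", "Group", "Dataset"}),
    relationships=frozenset({("File", "Group"), ("File", "Dataset"),
                             ("Group", "Group"), ("Group", "Dataset")}),
)

model = Element("File", "ocean", [
    Element("Group", "surface", [Element("Dataset", "temperature")]),
])

if conforms(model, FILE_METAMODEL):
    print("\n".join(transform(model)))
```

Here the transformation emits a plain-text outline; in an MDE tool chain the same step could just as well emit source code or another model.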
Unifying the representation of file structure organization
- Analyze the data model of each scientific file format
- Adapt a DSML to build a tool for visualizing and composing the scientific file format in a unified way
[Figure labels: Common Data Model, Variable Data Model, Feature Model, Define DSML from Feature Model, Grammar & Syntax, Implement DSML Tool]
Unifying the representation of file structure organization
- Feature Model for Scientific File Format
  - Describe some highlights here
  - And here
Unifying the representation of file structure organization
- Content Composer
  - DSML modeling tool for scientific data files
  - Implemented using GEMS
API Abstraction Layer
- Helps to protect user applications from the evolution of APIs
- Abstraction: createFile(const char *path, FileCreationProperty fileCreationProperty)
- NetCDF: int nc_create(const char *path, int cmode, int *ncidp)
- HDF5: H5File(const char *name, unsigned int flags)
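One way to picture the abstraction layer is as a single `create_file` entry point that dispatches to a format-specific backend. The sketch below is an assumption-laden illustration, not the authors' implementation: the backends are stubs that only return a string describing the native call they would make, the `overwrite` field of `FileCreationProperty` and the dispatch by file extension are invented for the example, and no NetCDF or HDF5 library is actually called. The flag names shown (`NC_CLOBBER`, `NC_NOCLOBBER`, `H5F_ACC_TRUNC`, `H5F_ACC_EXCL`) are the native libraries' create-mode constants.

```python
import os
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class FileCreationProperty:
    """Format-neutral creation options (the field is a hypothetical example)."""
    overwrite: bool = False


class Backend(ABC):
    @abstractmethod
    def create(self, path, prop):
        """Translate the neutral request into a format-specific call."""


class NetCDFBackend(Backend):
    def create(self, path, prop):
        # Stub: describes the native call instead of making it.
        mode = "NC_CLOBBER" if prop.overwrite else "NC_NOCLOBBER"
        return f"nc_create({path!r}, {mode}, &ncid)"


class HDF5Backend(Backend):
    def create(self, path, prop):
        # Stub: describes the native call instead of making it.
        flags = "H5F_ACC_TRUNC" if prop.overwrite else "H5F_ACC_EXCL"
        return f"H5File({path!r}, {flags})"


# Assumed convention for this sketch: pick the backend from the extension.
BACKENDS = {".nc": NetCDFBackend(), ".h5": HDF5Backend()}


def create_file(path, prop=FileCreationProperty()):
    """The only call a user application needs to know about."""
    ext = os.path.splitext(path)[1]
    if ext not in BACKENDS:
        raise ValueError(f"no backend registered for {ext!r}")
    return BACKENDS[ext].create(path, prop)


print(create_file("out.nc", FileCreationProperty(overwrite=True)))
print(create_file("out.h5"))
```

If a native signature changes, only the affected backend class needs to be updated; code that calls `create_file` stays as it is, which is the protection the slide describes.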
Integrating data among heterogeneous data formats
- Content Mapper
  - Defines rules for mapping data from one scientific data format to another
- Content Verifier
  - Verifies the correctness of the file composition
  - Verifies the correctness of the mapping rules
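A mapper and a verifier of this kind can be sketched as a rule table plus a consistency check. Everything below is a hypothetical illustration: the particular NetCDF-to-HDF5 correspondences, the sets of element kinds, and the function names `verify_rules` and `map_content` are assumptions made for the example and are not the rules used in the authors' framework.

```python
# Element kinds of each format, simplified for illustration.
NETCDF_KINDS = {"dimension", "variable", "attribute"}
HDF5_KINDS = {"group", "dataset", "attribute", "dataspace", "datatype"}

# Assumed mapping rules: source kind -> target kind.
MAPPING_RULES = {
    "dimension": "dataspace",
    "variable": "dataset",
    "attribute": "attribute",
}


def verify_rules(rules, source_kinds, target_kinds):
    """Content Verifier (sketch): report incomplete or ill-typed rules."""
    problems = []
    for kind in sorted(source_kinds):
        if kind not in rules:
            problems.append(f"no rule for source kind {kind!r}")
    for src, dst in sorted(rules.items()):
        if dst not in target_kinds:
            problems.append(f"rule {src!r} -> {dst!r} targets an unknown kind")
    return problems


def map_content(items, rules):
    """Content Mapper (sketch): rewrite (kind, name) pairs into the target format."""
    return [(rules[kind], name) for kind, name in items]


source_items = [("dimension", "time"), ("variable", "temperature"), ("attribute", "units")]

problems = verify_rules(MAPPING_RULES, NETCDF_KINDS, HDF5_KINDS)
if problems:
    print("\n".join(problems))
else:
    print(map_content(source_items, MAPPING_RULES))
```

Running the verifier before the mapper means an incomplete rule set is reported up front rather than surfacing as a failure partway through a conversion.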
Summary
- From the prototype of the framework
  - A DSML can help to build a graphical tool to compose and support interoperability across scientific file structures
  - Adoption of the layered architecture in the framework can help to maintain the independence of each layer
  - Both the API abstraction layer and the layered architecture are essential to develop and maintain user applications
- Future work
  - Create metamodels that include the full specification of each scientific file format
  - Categorize APIs according to their intended use for the API abstraction layer
  - Develop metamodels for managing API evolution
Thank you!
Example of Scientific Data Format: OPeNDAP
- Client-server protocol for scientific data access
- Targeted at oceanographic data management