MPEG-7 Multimedia Content Description Interface Presented by: Moustafa A. Hammad
Introduction
- A growing amount of digital audio-visual information exists. How quickly and easily can the desired information be made available?
- The Internet is increasingly popular.
- More audio-visual information processing systems have emerged:
  - MPEG-1: “Standard for storage and retrieval”
  - MPEG-2: “The digital television standard”
  - MPEG-4: “Multimedia production, distribution and content access”; developed a Syntactic Description Language.
- “Fine, but where are the semantics?”
Introduction (Cont.)
- MPEG-7: “Multimedia Content Description Interface”.
- Represents information about the content, not the content itself.
- Serves both the database and the signal processing communities.
- Goal: make audio-visual material as searchable as text.
- What is the standard? (To be finalized in mid-2001.)
Topics of Discussion
- Scope of the standard
- Terminology
- Interaction between MPEG-7 and applications
- Requirements
- Applications
- Case study: a proposal for an MPEG-7 Description Definition Language
Scope of the Standard
- MPEG-7 processing chain:
  - feature extraction (analysis)
  - the description itself
  - the search engines (application)
- What is in the standard?
[Diagram: description production → standard description → description consumption]
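As a rough illustration of the processing chain above, the following Python sketch separates the three stages: analysis, the description, and search. The intensity histogram, the `Description` record, and all function names are hypothetical choices made for this sketch; they are not taken from MPEG-7.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Description:
    """Stage 2: data about a content item, not the content itself."""
    content_id: str
    histogram: List[float]  # hypothetical descriptor value


def extract_features(pixels: List[int], bins: int = 4, max_value: int = 256) -> List[float]:
    """Stage 1 (analysis): normalized histogram of pixel intensities."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // max_value, bins - 1)] += 1
    total = len(pixels) or 1  # avoid division by zero for empty input
    return [c / total for c in counts]


def describe(content_id: str, pixels: List[int]) -> Description:
    """Produce the description that would be stored and exchanged."""
    return Description(content_id, extract_features(pixels))


def search(query: Description, database: List[Description]) -> List[str]:
    """Stage 3 (application): rank items by L1 distance between histograms."""
    def distance(d: Description) -> float:
        return sum(abs(a - b) for a, b in zip(query.histogram, d.histogram))
    return [d.content_id for d in sorted(database, key=distance)]
```

The search stage only ever touches `Description` objects, which mirrors the idea that the description sits between production and consumption.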
MPEG-7 Terminology
- Terms: Data, Feature, Descriptor (D), Descriptor value, Description Scheme (DS), Description, Coded Description, Description Definition Language (DDL).
[Diagram: UML-style relationships among the Description Definition Language (DDL), Description Scheme (DS), Descriptor, Feature, AV content item, and Data, using the relations “defines”, “describes”, and “signifies” (to a human or system), with multiplicities 0..*, 1..*, and *..1.]
Interaction between MPEG-7 and Applications
[Diagram: block diagram linking MM content, description generation, the Description Definition Language (DDL), Description Schemes (DS), Descriptors (D), the MPEG-7 description, the encoder, the MPEG-7 coded description, the decoder, the search/query engine, filter agents, and the human or system consumer (a user or data processing system).]
MPEG-7 Requirements
- Descriptors: cross-modality, direct data manipulation, data adaptation, language of text-based descriptions, linking, prioritization of related information, unique identification.
- Description Schemes: description scheme relationships, prioritization of descriptors, hierarchy of descriptors, scalability of descriptors, description of temporal range, data adaptation.
- DDL: compositional capabilities, unique identification, primitive and composite data types, multiple media types, relationships between description and data, grammar, Intellectual Property Management and Protection (IPMP), real-time support, etc.
- Descriptor requirements:
  - General: types of features (N-dimensional spatio-temporal structure, objective, subjective, production, composition, concepts), referencing analogue data, etc.
MPEG-7 Requirements (Cont.)
- Functional: retrieval effectiveness, similarity-based retrieval, etc.
- Coding: efficient representation of descriptions, description extraction, etc.
- Visual-specific: types of features (color, texture, sketch, ...), visual data formats, etc.
- Audio-specific: types of features (frequency contour, harmony, ...), auditory data formats, etc.
- Text-specific: text retrieval, consistency of text description tools.
- System requirements: multiplexing, temporal synchronization, file format, IPMP, etc.
Ref: MPEG Requirement Group, “MPEG-7 Requirement”, Doc. ISO/MPEG N2859, MPEG Vancouver Meeting, July 1999.
MPEG-7 Applications
- Pull applications:
  - Video retrieval: storage and retrieval of video databases, sound effects libraries, historical speech databases, etc.
- Push applications:
  - Video selection and filtering: personalized television services, information access facilities for people with special needs, etc.
- Specialized professional and control applications: remote sensing, surveillance, etc.
A Proposal for an MPEG-7 Description Definition Language (DDL)
- Reference: [J. Hunter (DSTC)]
- The schema draws on several existing schema languages: Resource Description Framework (RDF) Schema, XML Document Type Definitions (DTD), Document Content Description (DCD), and Schema for Object-Oriented XML (SOX).
- Satisfies the DDL requirements.
- Consists of classes, properties, and relations between classes.
- Uses Dublin Core (DC) attributes (Name, Identifier, Version, Registration Authority, Language, Definition, Obligation, Datatype, Maximum Occurrence, Comment).
The Description Scheme
[Diagram: hierarchical description scheme for a multimedia document. The labels below are listed in order of appearance; the attachment of each attribute group to a node is not recoverable from the extracted text.]
- Hierarchy: MM Document → Audio, Video
  - Audio: Speed Track, Music Track, SoundFX Track (with Phoneme List, Scope, MIDI, tempo, list of soundFX)
  - Video: Sequence 1, Sequence 2, Sequence 3 → Scene 1.1, Scene 1.2, Scene 1.3 → Shot 1.1.1, Shot 1.1.2, Shot 1.1.3 → Frame 1, Frame 120, Object 1, Object 2, Object 3
- Dublin Core attribute groups:
  - DC.Title, DC.Creator, DC.Subject, DC.Publisher, DC.Description, DC.Contributor, DC.Date, DC.Type, DC.Format, DC.Identifier, DC.Source, DC.Language, DC.Relation.HasPart, DC.Rights
  - DC.Subject, DC.Description, DC.Contributor.Presenter, DC.Type, DC.Format.Length, DC.Identifier, DC.Relation.HasPart, DC.Coverage.T.Min, DC.Coverage.T.Max
  - DC.Description, DC.Type, DC.Format.Type, DC.Identifier, DC.Relation.HasPart
  - DC.Description, DC.Type, DC.Identifier
- MPEG-7 attribute groups:
  - Text, Script, Transcript, EditList, KeyFrame, Locale, Cast, Objects
  - Text, Script, Transcript, EditList, KeyFrame, CameraDist, CameraAngle, CameraMotion, Lighting, OpenTrans, CloseTrans
  - Text, Image, Timestamp, colour, AnnoText, AnnoPosn
  - Text, Position, shape, Trajectory, Speed, Colour, Texture, Volume, AnnoText, AnnoPosn
Features of the Proposed MPEG-7 DDL
- Namespace declarations:
    <x xmlns:dc="http://purl.org/metadata/dublin_core#">
      <!-- the "dc" prefix is bound to http://purl.org/metadata/dublin_core for the "x" element and contents -->
      <dc:Title>CNN News</dc:Title>
    </x>
- Class type declarations and class hierarchies:
    <class id="MM_Document">
      <property type="#dc_attribs"/>
    </class>
    <class id="Video_Document">
      <subclassof type="#MM_Document"/>
      <property type="duration"/>
    </class>
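To make the two features above concrete, here is a minimal Python sketch that reads a namespaced element and resolves the properties of a subclass. It assumes that a subclass inherits the properties of its parent, which the slide suggests but does not state; the `<schema>` wrapper element and the function names are hypothetical additions for this sketch.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/metadata/dublin_core#"

NAMESPACED = f'<x xmlns:dc="{DC_NS}"><dc:Title>CNN News</dc:Title></x>'

# The <schema> wrapper is hypothetical; it only gives the fragment a single root.
DDL = """
<schema>
  <class id="MM_Document">
    <property type="#dc_attribs"/>
  </class>
  <class id="Video_Document">
    <subclassof type="#MM_Document"/>
    <property type="duration"/>
  </class>
</schema>
"""


def title_of(xml_text):
    """Look up dc:Title by its full namespace URI, independent of the prefix used."""
    return ET.fromstring(xml_text).find(f"{{{DC_NS}}}Title").text


def load_classes(xml_text):
    """Map each class id to (parent id or None, list of its own property types)."""
    classes = {}
    for cls in ET.fromstring(xml_text).findall("class"):
        parent = cls.find("subclassof")
        parent_id = parent.get("type").lstrip("#") if parent is not None else None
        props = [p.get("type").lstrip("#") for p in cls.findall("property")]
        classes[cls.get("id")] = (parent_id, props)
    return classes


def all_properties(classes, class_id):
    """Assumed semantics: inherited properties first, then the class's own."""
    parent_id, own = classes[class_id]
    inherited = all_properties(classes, parent_id) if parent_id else []
    return inherited + own
```

Under that inheritance assumption, `Video_Document` ends up with both the Dublin Core attribute group and its own `duration` property.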
Features (Cont.)
- Property type declarations:
    <propertyType id="frameNum" datatype="int"/>
    <propertyType id="secs" datatype="float"/>
    <propertyType id="timestamp">
      <Alt>
        <property type="#frameNum"/>
        <property type="#secs"/>
      </Alt>
    </propertyType>
- Relationship type declarations:
    <relationType id="contains" direction="uni" inverse="#contained_by">
      <domain type="#MM_Document"/>
      <range type="#MM_Document" occurs="zeroormore" order="Seq"/>
      <constraint type="boolean" value="((range[1].start >= domain.start) && (range[n].end <= domain.end))"/>
    </relationType>
    <class id="scene">
      <subclassof type="#MM_Document"/>
      <property type="#dc_attribs"/>
      <relation type="contains" range="#object"/>
    </class>
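The boolean constraint on the "contains" relation above says that the first contained item must not start before its container and the last must not end after it. A minimal Python sketch of that check follows; the `Segment` type and the function name are hypothetical, and the sketch assumes the contained items are already in sequence order, as `order="Seq"` suggests.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Segment:
    """A temporal extent; the unit (frames or seconds) is left to the caller."""
    start: float
    end: float


def contains_constraint_holds(domain: Segment, parts: Sequence[Segment]) -> bool:
    """Check range[1].start >= domain.start and range[n].end <= domain.end."""
    if not parts:
        # The range may occur zero or more times, so an empty range is accepted here.
        return True
    return parts[0].start >= domain.start and parts[-1].end <= domain.end
```

Note that the constraint as written only inspects the first and last items; it does not by itself rule out gaps or overlaps between intermediate items.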
Features (Cont.)
- Order and occurrence (Seq, Bag, Alt, Par)
- Data typing and user-defined datatypes
- Attribute definitions:
    <attributeType id="src" datatype="uri"/>
- Synchronization and temporal specification:
    <seq>
      <audio_track src="audio1"/>
      <audio_track begin="5s" src="audio2"/>
    </seq>
  [Timeline diagram: Audio 1, then a 5 s gap, then Audio 2]
- Spatial specification:
  - Both rectangle and polygon representations, with HTML syntax and semantics.
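The timeline for the sequential example above can be worked out with a short Python sketch. It assumes that, inside a sequence, each track starts when the previous one ends and that `begin` adds a delay on top of that, which matches the timeline on the slide. Track durations are not given in the source, so the ones used in the usage note are invented, and the function names are hypothetical.

```python
from typing import List, Tuple


def parse_clock(value: str) -> float:
    """Parse a clock value such as '5s' into seconds (only the 's' suffix is handled)."""
    return float(value[:-1]) if value.endswith("s") else float(value)


def schedule_seq(tracks: List[Tuple[str, str, float]]) -> List[Tuple[str, float, float]]:
    """Lay out (src, begin, duration) tracks one after another.

    Returns (src, start, end) in seconds, where each start is the end of the
    previous track plus the track's own begin offset (assumed semantics).
    """
    timeline = []
    cursor = 0.0
    for src, begin, duration in tracks:
        start = cursor + parse_clock(begin)
        end = start + duration
        timeline.append((src, start, end))
        cursor = end
    return timeline
```

For instance, with an invented 10-second first track and an 8-second second track, `schedule_seq([("audio1", "0s", 10.0), ("audio2", "5s", 8.0)])` places the second track from 15 s to 23 s.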
Example: MPEG-7 Description
    ………………….
    <MM_Document src="http://………./test.mpg">
      <!-- other properties -->
      <contains>
        <Par>
          <Seq id="video_sequences">
            <sequence id="seq1" src="http://……."/>
            <!-- other sequences -->
          </Seq>
          <Seq id="audio_tracks">
            <Audio id="speech" src="test.ra"/>
            <!-- other audios -->
          </Seq>
        </Par>
      </contains>
      <sequence id="seq1" src="http://….."/>
      ………
      <contains>
        <Seq>
          <Scene id="scene1" src="http://……."/>
          <!-- other scenes -->
        </Seq>
      </contains>
      …………………………..
      <!-- declaration of shots, frames and objects -->
      ……………..
      …………………..
      <Object id="#car" src="http://……………../test.jpg#car">
        ………..
        <DC.Description.text>"A red car which has been severely damaged by the explosion."</DC.Description.text>
        ………...
      </Object>
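Once a description like the one above exists, text search over it becomes straightforward, which is the stated goal of making audio-visual material as searchable as text. The Python sketch below runs a keyword search over object descriptions. The document is a simplified, well-formed stand-in: the `example.org` URLs, the second object, and the function name are hypothetical and do not fill in the elided parts of the original example.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a description; all values here are invented.
DESCRIPTION = """
<MM_Document src="http://example.org/test.mpg">
  <Object id="car" src="http://example.org/test.jpg#car">
    <DC.Description.text>A red car which has been severely damaged.</DC.Description.text>
  </Object>
  <Object id="tree" src="http://example.org/test.jpg#tree">
    <DC.Description.text>A tall tree beside the road.</DC.Description.text>
  </Object>
</MM_Document>
"""


def find_objects(xml_text: str, keyword: str) -> list:
    """Return ids of Object elements whose text description contains the keyword."""
    root = ET.fromstring(xml_text)
    needle = keyword.lower()
    hits = []
    for obj in root.iter("Object"):
        for desc in obj.iter("DC.Description.text"):
            if needle in (desc.text or "").lower():
                hits.append(obj.get("id"))
                break  # one match per object is enough
    return hits
```

The search never opens the image or video itself; it works only on the description, and the `src` attribute is what links a hit back to the content.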
Conclusion
- The proposed DDL meets most of the DDL requirements.
- Some remarks:
  - It lacks provisions for push applications (filtering and selection, real-time support).
  - It has no representation for subjective and concept features.
  - It offers only simple representation and support for spatial features.