MediaHub: An Intelligent MultiMedia Distributed Platform
MediaHub: An Intelligent MultiMedia Distributed Platform Hub
Glenn Campbell, Tom Lunney, Paul Mc Kevitt
School of Computing and Intelligent Systems, Faculty of Engineering
University of Ulster, Magee Campus, Northland Road, Derry
{Campbell-g8, TF.Lunney, P.McKevitt}@ulster.ac.uk
PGNET, Liverpool JMU, June 2005
Outline
- Goals and objectives
- Key research problems
- Distributed Processing
- Distributed Platforms
- Architecture of MediaHub
- Tools and future development
Project goals
The primary objectives of this research are to:
- Interpret/generate semantic representations of multimodal input/output
- Perform decision-making (fusion and synchronisation) over multimodal data
- Implement MediaHub, a multimodal platform hub
Project objectives
Focus on the following research questions:
- Will MediaHub use frames for semantic representation, or XML or one of its derivatives?
- How will MediaHub communicate with the various elements of a platform?
- Will MediaHub constitute a blackboard or non-blackboard model?
- What mechanism will be implemented for decision-making within MediaHub?
Key research problems
- Semantic Representation
  - Represent language and vision
  - Frames or XML?
- Semantic Storage
  - Blackboard model?
  - Non-blackboard model?
- Decision-making
  - Fusion and synchronisation
  - AI technique
Semantic representation
- Frames (CHAMELEON)
    [MODULE INPUT: input INTENTION: intention-type TIME: timestamp]
    [SPEECH-RECOGNISER UTTERANCE: (Point to Hanne’s office) INTENTION: instruction! TIME: timestamp]
    [GESTURE: coordinates (3, 2) INTENTION: pointing TIME: timestamp]
- XML (M3L, SmartKom)
    <presentationTask>
      <presentationGoal>
        <inform>
          <informFocus>
            <RealizationType>list</RealizationType>
          </informFocus>
        </inform>
        <abstractPresentationContent>
          <discourseTopic>
            <goal>epg_browse</goal>
          </discourseTopic>
          <informationSearch id="dim24">
            <tvProgram id="dim23">
              <broadcast>
                <timeDeictic id="dim16">now</timeDeictic>
                <between>2003-03-20T19:42:32 2003-03-20T22:00</between>
                <channel><channel id="dim13"/></channel>
              </broadcast>
            </tvProgram>
          </informationSearch>
          <result>
            <event>
              <pieceOfInformation>
                <tvProgram id="ap_3">
                  <broadcast>
                    <beginTime>2003-03-20T19:50:00</beginTime>
                    <endTime>2003-03-20T19:55:00</endTime>
                    <avMedium><title>Today’s Stock News</title></avMedium>
                    <channel>ARD</channel>
                  </broadcast>
                  ……
            </event>
          </result>
      </presentationGoal>
    </presentationTask>
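To make the frames-versus-XML question concrete, the following is a minimal sketch of how one frame could be held in memory and serialised to XML. It is not CHAMELEON's or SmartKom's actual format: the `Frame` class, the `<frame>` element name, the slot names and the numeric timestamps are all assumptions made for illustration.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Frame:
    """Hypothetical in-memory frame: a producing module, its slots, an intention and a time."""
    module: str
    slots: dict
    intention: str
    time: float  # assumed to be seconds; the slides only say "timestamp"

    def to_xml(self) -> str:
        """Serialise the frame to a flat XML element (an illustrative schema, not M3L)."""
        root = ET.Element("frame", {"module": self.module})
        for name, value in self.slots.items():
            ET.SubElement(root, name).text = str(value)
        ET.SubElement(root, "intention").text = self.intention
        ET.SubElement(root, "time").text = str(self.time)
        return ET.tostring(root, encoding="unicode")


# The two input frames from the slide, with made-up timestamps.
speech = Frame("SPEECH-RECOGNISER", {"utterance": "Point to Hanne's office"}, "instruction", 12.0)
gesture = Frame("GESTURE", {"coordinates": (3, 2)}, "pointing", 12.4)

speech_xml = speech.to_xml()
gesture_xml = gesture.to_xml()
print(speech_xml)
print(gesture_xml)
```

The same slot/value content survives either encoding; what XML adds is a standard parser and the option of schema validation, at the cost of more verbose messages.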
Semantic storage
- Blackboard or non-blackboard?
  - High coupling: blackboard?
  - Low coupling: distributed architecture?
- Communication
  - Via central blackboard?
  - Message passing between modules?
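The two communication styles on this slide can be contrasted in a few lines. This is only a sketch under simplifying assumptions (single process, no networking); the class names `Blackboard` and `Module` and their methods are hypothetical and do not correspond to any of the platforms surveyed.

```python
from collections import defaultdict


class Blackboard:
    """Central store: modules post entries and subscribe by entry kind."""

    def __init__(self):
        self._entries = []                     # everything ever posted, in order
        self._subscribers = defaultdict(list)  # kind -> callbacks

    def subscribe(self, kind, callback):
        self._subscribers[kind].append(callback)

    def post(self, kind, content):
        self._entries.append((kind, content))
        for callback in self._subscribers[kind]:
            callback(content)

    def read(self, kind):
        return [content for k, content in self._entries if k == kind]


class Module:
    """Non-blackboard style: a module sends messages directly to a known peer."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def send(self, other, content):
        other.inbox.append((self.name, content))


# Blackboard: the producer does not need to know who consumes its output.
board = Blackboard()
seen_by_fusion = []
board.subscribe("gesture", seen_by_fusion.append)
board.post("gesture", {"coordinates": (3, 2)})

# Message passing: the sender must hold a reference to the receiver.
gesture_module = Module("gesture")
fusion_module = Module("fusion")
gesture_module.send(fusion_module, {"coordinates": (3, 2)})
```

In the blackboard version every module depends on the shared store, while in the message-passing version each module depends on the specific peers it addresses.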
Decision-making (fusion & synchronisation)
- Rule-based
- Potential for other AI techniques
  - Fuzzy Logic
  - Neural Networks
  - Genetic Algorithms
  - Bayesian Networks (CPNs)
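As an illustration of the rule-based option, the sketch below fuses a spoken instruction with a pointing gesture when the two arrive close together in time. The dictionary keys, the one-second window and the fused `"point-to"` intention are assumptions for the example, not values taken from any of the cited systems.

```python
def fuse(speech, gesture, window=1.0):
    """Apply one hand-written fusion rule.

    Rule (hypothetical): a spoken instruction and a pointing gesture whose
    timestamps differ by at most `window` seconds describe the same user act.
    Returns the fused frame as a dict, or None if the rule does not fire.
    """
    close_in_time = abs(speech["time"] - gesture["time"]) <= window
    if (
        speech["intention"] == "instruction"
        and gesture["intention"] == "pointing"
        and close_in_time
    ):
        return {
            "intention": "point-to",
            "utterance": speech["utterance"],
            "coordinates": gesture["coordinates"],
            "time": max(speech["time"], gesture["time"]),
        }
    return None


speech = {"utterance": "Point to Hanne's office", "intention": "instruction", "time": 12.0}
near_gesture = {"coordinates": (3, 2), "intention": "pointing", "time": 12.4}
late_gesture = {"coordinates": (3, 2), "intention": "pointing", "time": 15.0}

fused = fuse(speech, near_gesture)    # the rule fires: inputs are 0.4 s apart
unfused = fuse(speech, late_gesture)  # the rule does not fire: inputs are 3 s apart
```

A rule like this is easy to read and debug, but the time window and the matching conditions are fixed by hand, which is one motivation for considering the other AI techniques listed on the slide.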
Distributed processing
- PVM (Parallel Virtual Machine) (Sunderam 1990, Fink et al. 1995)
- ICE (Amtrup 1995)
- DACS (Fink et al. 1995, 1996)
- Open Agent Architecture (OAA) (Cheyer et al. 1998, OAA 2004)
- JATLite (Kristensen 2001, Jeon et al. 2000)
- JavaSpaces (Freeman 2004)
- CORBA (Vinoski 1993)
Intelligent Multimedia Distributed Platforms
- Blackboard Model:
  - Ymir (Thórisson 1999)
  - CHAMELEON (Brøndsted et al. 1998, 2001)
  - SmartKom (Bühler et al. 2002, Wahlster et al. 2001, SmartKom 2004)
  - DARBS (Nolle et al. 2001)
  - DARPA Galaxy Communicator (Bayer et al. 2001)
  - Psyclone (Psyclone 2004)
  - Spoken Image/SONAS (Ó Nualláin et al. 1994, Ó Nualláin & Smith 1994, Kelleher et al. 2000)
Intelligent Multimedia Distributed Platforms
- Non-blackboard Model:
  - WAXHOLM (Carlson et al. 1996)
  - AESOPWORLD (Okada 1996)
  - COLLAGEN (Rich et al. 1997)
  - INTERACT (Waibel et al. 1996)
  - Oxygen (Oxygen 2004)
  - EMBASSI (Kirste 2001, EMBASSI 2004)
  - MIAMM (MIAMM 2004)
CHAMELEON
- Language & vision integration system
  - Consists of ten modules, mostly programmed in C and C++
  - DACS communication system used for communication
  - Blackboard stores semantic representations produced by other modules
  - Modules communicate by exchanging semantic representations with each other or with the blackboard
  - Semantic representation in the form of input, output and integration frames
Architecture of CHAMELEON
SmartKom
- User-adaptive interface for human-computer interaction
  - Mobile
  - Public
  - Home/Office
- Supports speech, gesture and facial expression input
- XML-based mark-up language, M3L, used for semantic representation
- Distributed multiple-blackboard model
Architecture of SmartKom
Project proposal
- Dialogue Manager
  - Acts as a blackboard module
  - Facilitates communication between other modules
  - Synchronisation
- Semantic Representation Database
  - Provides semantic representation of language and vision data
- Decision Making Module
  - AI technique for a unique form of decision-making
    - Bayesian Networks (CPNs)
    - Neural Networks, Genetic Algorithms, Fuzzy Logic
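To give a feel for what probabilistic decision-making over multimodal input could look like, here is a small sketch that combines speech and gesture evidence with Bayes' rule. It assumes the two modalities are conditionally independent given the user's intended target, which is the simplest possible Bayesian network structure; it is written in plain Python and does not use the HUGIN API. The hypotheses, observation labels and all probabilities are invented for illustration.

```python
def posterior(prior, likelihoods, evidence):
    """Posterior over hypotheses given one observation per modality.

    prior:       {hypothesis: P(hypothesis)}
    likelihoods: {modality: {hypothesis: {observation: P(observation | hypothesis)}}}
    evidence:    {modality: observation}
    Assumes modalities are conditionally independent given the hypothesis.
    """
    unnormalised = {}
    for hypothesis, p in prior.items():
        for modality, observed in evidence.items():
            p *= likelihoods[modality][hypothesis][observed]
        unnormalised[hypothesis] = p
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


# Hypothetical scenario: which of two offices does the user mean?
prior = {"office_A": 0.5, "office_B": 0.5}
likelihoods = {
    "speech": {
        "office_A": {"heard_A": 0.8, "heard_B": 0.2},
        "office_B": {"heard_A": 0.3, "heard_B": 0.7},
    },
    "gesture": {
        "office_A": {"points_near_A": 0.7, "points_near_B": 0.3},
        "office_B": {"points_near_A": 0.2, "points_near_B": 0.8},
    },
}
evidence = {"speech": "heard_A", "gesture": "points_near_A"}

beliefs = posterior(prior, likelihoods, evidence)
decision = max(beliefs, key=beliefs.get)
print(decision, beliefs)
```

Unlike a fixed rule, this formulation degrades gracefully when one modality is noisy: a weak speech hypothesis can still be outweighed by a confident gesture, and vice versa.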
Architecture of MediaHub
Comparison of Intelligent MultiMedia Platforms
Software Analysis
- Main Programming Language
  - Java
  - C++
- Semantic Representation
  - XML
  - XHTML + Voice
  - SMIL
  - RDF Schema
  - MPEG-7
- Decision Making
  - HUGIN (Bayesian Networks) (Hugin 2004)
  - FuzzyJ Toolkit (Fuzzy Logic) (NRC 2004)
Project Schedule
Conclusion
- An intelligent multimodal distributed platform hub called MediaHub will be developed
- MediaHub will interpret and generate semantic representations of multimodal input and output
- MediaHub will perform fusion and synchronisation of language and vision data
- The unique contribution of MediaHub is a new method of decision-making
- MediaHub will be tested within an existing multimodal platform (e.g. CONFUCIUS)