Bushman On-Line Dictionary Honours Project Proposal
Project Supervisor: Dr Hussein Suleman
Sanvir Manilal, Lebogang Molwantoa, Kyle Williams
22 May 2009
INTRODUCTION Ø The Bleek and Lloyd Collection is a set of indigenous artefacts. Ø The project aims to integrate a collection of digital scans of a dictionary into the existing Bleek and Lloyd Collection.
SCANNED IMAGES
IMAGE-BASED DICTIONARY Ø Image-based dictionary matches a word to an image. Ø Image-based dictionary for indigenous languages – used as live reference. Ø Aim to integrate collection of digital scans as image-based dictionary.
RESEARCH QUESTION Ø Is it possible to create a reusable, generic archival system that allows users to access an image-based dictionary?
PROPOSED SOLUTION Ø Key features: § Archive Collection Management (Lebogang Molwantoa) § Search and Browsing functionality (Sanvir Manilal) § Image-based translation (Kyle Williams)
PROPOSED SOLUTION: SYSTEM OVERVIEW Ø High-level overview of system design (figure labelled Sanvir Manilal, Lebogang Molwantoa, Kyle Williams)
ARCHIVE MANAGEMENT COLLECTION Ø Research question tackled: § Can we develop a useful and efficient archival system for an image-based dictionary? Ø The archive management collection component can be considered the back end of the system.
ARCHIVE MANAGEMENT COLLECTION Ø The archive is a repository for scanned images and associated metadata. Ø It must be extensible and allow easy updates via an API. Ø The size and complexity of the collection present a challenge.
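To make the idea of a repository that is updated through an API more concrete, here is a minimal in-memory sketch. The class names, fields and methods (`ScanRecord`, `Archive`, `add`, `update_metadata`) are hypothetical and chosen for illustration only; the proposal does not specify the archive's actual interface or storage.

```python
from dataclasses import dataclass, field


@dataclass
class ScanRecord:
    """One scanned image plus its metadata (hypothetical schema)."""
    scan_id: str
    image_path: str
    metadata: dict = field(default_factory=dict)


class Archive:
    """In-memory stand-in for the archive back end (illustrative only)."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        # Reject duplicates so a repeated import cannot overwrite a scan.
        if record.scan_id in self._records:
            raise KeyError(f"scan {record.scan_id!r} already archived")
        self._records[record.scan_id] = record

    def update_metadata(self, scan_id, **changes):
        # Merge new metadata fields into the existing record.
        self._records[scan_id].metadata.update(changes)

    def get(self, scan_id):
        return self._records[scan_id]

    def __len__(self):
        return len(self._records)
```

A real archive would persist records to disk or a database; the dictionary here only shows the shape of the add, update and lookup operations.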
ARCHIVE MANAGEMENT COLLECTION Ø Can we develop a useful and efficient archival system? Ø Efficient: the archive must be capable of processing and archiving large numbers of images.
IMAGE BASED TRANSLATION
IMAGE BASED TRANSLATION Ø Research question: Can image-based searching be done accurately and efficiently? Ø Needs to return correct results Ø Needs to return results in a short amount of time
CONTENT BASED IMAGE RETRIEVAL (CBIR)
WORD SPOTTING Ø Subset of CBIR specifically for handwritten historical documents Ø Performs image matching on images of words Ø Used to find repeat occurrences of words in manuscripts
IMAGE BASED TRANSLATION Ø Look at features of word/phrase Ø Create signature for word/phrase based on feature vector Ø Insert signature into database
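The three indexing steps above can be sketched as follows. The proposal does not say which features are used, so this example assumes a simple column projection profile (ink pixels per column of a binary word image) reduced to a fixed-length vector. The function names and the dictionary used as a "database" are hypothetical.

```python
def projection_profile(image):
    """Count ink pixels per column of a binary word image (1 = ink)."""
    return [sum(column) for column in zip(*image)]


def signature(image, bins=4):
    """Reduce the profile to a fixed-length, normalised feature vector."""
    profile = projection_profile(image)
    total = sum(profile) or 1  # avoid division by zero on blank images
    width = len(profile)
    sig = []
    for b in range(bins):
        # Each bin covers an equal share of the image width.
        start = b * width // bins
        end = (b + 1) * width // bins
        sig.append(sum(profile[start:end]) / total)
    return sig


# Assumed stand-in for the signature database: word id -> signature.
database = {}


def index_word(word_id, image):
    """Compute the signature of a word image and store it."""
    database[word_id] = signature(image)
```

Normalising by the total ink makes the signature independent of stroke thickness to some degree, and binning makes signatures of different image widths comparable.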
IMAGE BASED TRANSLATION Ø User selects word/phrase in collection Ø Signature is calculated Ø Signature is used to search for matches in archive
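The query steps above amount to a nearest-neighbour search over stored signatures. The sketch below assumes signatures are equal-length lists of numbers and uses Euclidean distance as the similarity measure; both are assumptions for illustration, and `find_matches` is a hypothetical name.

```python
import math


def distance(sig_a, sig_b):
    """Euclidean distance between two equal-length signatures."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)))


def find_matches(query_sig, archive_sigs, top_k=3):
    """Return the ids of the top_k archive entries closest to the query."""
    ranked = sorted(
        archive_sigs.items(),
        key=lambda item: distance(query_sig, item[1]),
    )
    return [word_id for word_id, _ in ranked[:top_k]]
```

A linear scan like this is simple but grows with the size of the archive; a larger collection would likely need an index structure to meet the response-time requirement.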
IMAGE BASED TRANSLATION Ø Accuracy? Ø Evaluation based on controlled tests as well as random tests Ø Key measure of success: the ability to provide image-based translation accurately and efficiently
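One common way to quantify accuracy in a retrieval test is precision and recall against a hand-labelled set of correct matches. The proposal does not name its metrics, so the following is only a sketch of how a controlled test could be scored; `precision_recall` is a hypothetical helper.

```python
def precision_recall(returned, relevant):
    """Score one query: returned ids versus the ids known to be correct."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    # Precision: fraction of returned results that are correct.
    precision = hits / len(returned) if returned else 0.0
    # Recall: fraction of correct results that were returned.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a query returns four results of which two are among three known occurrences of the word, precision is 0.5 and recall is about 0.67.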
METHODOLOGY Ø Requirements gathering § UCT Fine Arts Ø Iterative design and development § Three prototyping phases § Carrying out periodic evaluation with users Ø Final system integration § Carrying out evaluation of the overall system functioning and performance
SEARCH, DISPLAY AND BROWSE Ø Research question tackled: § Can searching and browsing a visual dictionary be done in an efficient and effective way? Ø Unique problem
SEARCH, DISPLAY AND BROWSE A few techniques: Ø Live Search Ø Inexact pattern matching Ø Thumbnails Ø Hyperlinks Ø Scrollable list of words
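Two of the techniques listed above, live search and inexact pattern matching, can be illustrated with Python's standard library. This is a sketch under assumptions: the word list contains English placeholders rather than real dictionary entries, and `live_search` and `inexact_search` are hypothetical helper names.

```python
import difflib


def live_search(prefix, words, limit=5):
    """Return words that start with what the user has typed so far."""
    prefix = prefix.lower()
    return [w for w in sorted(words) if w.lower().startswith(prefix)][:limit]


def inexact_search(query, words, limit=5, cutoff=0.6):
    """Fall back to approximate matching when the query is misspelled."""
    return difflib.get_close_matches(query, words, n=limit, cutoff=cutoff)


# Placeholder word list for illustration only.
words = ["water", "waterfall", "wind", "fire"]
print(live_search("wa", words))
print(inexact_search("watr", words))
```

Live search would be called on every keystroke, so the prefix filter needs to stay cheap; the approximate match is slower and is better reserved for queries that return no exact results.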
SEARCH, DISPLAY AND BROWSE Ø Evaluation: § Efficiency § Performance § Effectiveness § Aesthetics
DELIVERABLES & WORK ALLOCATION Ø Fully working system with three key components § Archive – Lebogang Molwantoa § Tools for searching and displaying – Sanvir Manilal § Image matching tool – Kyle Williams Ø Reports, website, poster, reflections – Everyone
CONCLUSION Ø Three-component project Ø Iterative design Ø Evaluation Ø Questions?