Introduction Mei-Chen Yeh 03/06/2012

Outline
• What is Multimedia?
• Multimedia System Evaluation
  – Data
  – Ground truth
  – Measurement
• QoE Framework
• Image Representation

What is Multimedia?
• A PC vendor
  – A PC that has sound capability, a DVD-ROM drive, and perhaps the superiority of multimedia-enabled microprocessors that understand additional multimedia instructions
• A consumer entertainment vendor
  – An interactive cable TV with hundreds of digital channels available, or a cable TV-like service delivered over a high-speed Internet connection
• A CS student
  – Applications that use multiple modalities, including text, images, drawings (graphics), animation, video, sound (including speech), and interactivity
Slides from Li & Drew ©Prentice Hall 2004

Multimedia systems: a multidisciplinary subject
• Graphics
• HCI
• Visualization
• Computer vision
• Data compression
• Graph theory
• Networking
• Database systems
• Signal processing
• Data mining
• Machine learning
• Pattern recognition
• …

Examples of multimedia systems (just a few)
• Video teleconferencing
• Distributed lectures for higher education
• Tele-medicine
• Co-operative work environments
• Searching in (very) large video and image databases for target visual objects
• Augmented reality: placing real-appearing computer graphics and video objects into scenes
• …

Resources of the readings
• Proceedings of ACM Multimedia Conference
  – The premier annual event on multimedia research, technology, and art
  – Held since 1993
  – Full papers (<20%), short papers (<30%)
  – Technical demonstrations, open source software competition, the doctoral symposium, tutorials (6), workshops (11), a brave new topic session, panels (2), Multimedia grand challenge
• IEEE Transactions on Multimedia
• IEEE Multimedia

11 areas in ACM MM 2012
• Media Content Analysis and Processing
• Multimedia Activity and Event Understanding
• Multimedia Search and Retrieval
• Mobile and Location-Based Media
• Social Media
• Multimedia Systems and Middleware
• Media Transport and Sharing
• Multimedia Security and Forensics
• Multimedia Authoring, Production and Consumption
• Multimedia Interaction and Applications
• Multimedia Art, Entertainment and Culture

Scientifically interesting Useful (Engineering, Industry) Novel

Multimedia System Evaluation • Data • Ground truth • Measurement

Examples
• How to evaluate a face recognition system?
  – Data: face images
  – Ground truth: names (one for each face image)
  – Metric: recognition rate
• How to evaluate a face detection system?
  – Data: two sets of images, with and without faces
  – Ground truth: the number and locations of faces in each image
    • Coarse vs. fine labeling
  – Metric: true detection rate and false positive rate
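The metrics named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the course's evaluation code: the function names are hypothetical, and the detection example assumes coarse, image-level labels (face present or not) rather than per-face locations.

```python
def recognition_rate(predicted_names, true_names):
    """Fraction of face images whose predicted name matches the ground truth."""
    if len(predicted_names) != len(true_names) or not true_names:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == t for p, t in zip(predicted_names, true_names))
    return correct / len(true_names)


def detection_rates(predicted_has_face, true_has_face):
    """Return (true detection rate, false positive rate) for image-level labels.

    Assumption: each image is labeled only as containing a face or not
    (coarse labeling); both classes must be present in the ground truth.
    """
    pairs = list(zip(predicted_has_face, true_has_face))
    tp = sum(p and t for p, t in pairs)          # faces correctly detected
    fn = sum((not p) and t for p, t in pairs)    # faces missed
    fp = sum(p and (not t) for p, t in pairs)    # false alarms
    tn = sum((not p) and (not t) for p, t in pairs)
    return tp / (tp + fn), fp / (fp + tn)


rate = recognition_rate(["ann", "bob", "cat", "dan"], ["ann", "bob", "cat", "eve"])
tdr, fpr = detection_rates([True, True, False, True, False],
                           [True, True, True, False, False])
```

With fine labeling (face locations), a detection would additionally have to overlap a ground-truth face to count as correct; that matching rule is not specified in the slides.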

Multimedia System Evaluation
• How to evaluate a visual instance retrieval system?
  – Bull's eye test
    Example: 1400 images in 70 categories (20 images per category)
    For each image, compare it to every other image and count the number of correct matches in the top 40
    Retrieval rate = # correct matches / # possible hits (1400 x 20 = 28,000 in this case)
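The bull's eye computation can be sketched as follows. This is a hedged illustration: `bulls_eye_rate` and its arguments are hypothetical names, the distance function is supplied by the caller, and the query is assumed to be included in its own ranked list so that the number of possible hits per query equals the category size (which is what makes the total 28,000 in the slide's example).

```python
from collections import Counter


def bulls_eye_rate(items, labels, distance, top_k):
    """Retrieval rate = correct matches in the top_k / possible hits.

    items    : list of feature representations
    labels   : category label of each item (the ground truth)
    distance : function(item_a, item_b) -> non-negative number
    top_k    : how many ranked results to inspect per query
               (40 in the slide's example, twice the category size)
    """
    category_size = Counter(labels)
    correct = 0
    possible = 0
    for query, query_label in zip(items, labels):
        # Rank every item (including the query itself) by distance to the query.
        ranked = sorted(range(len(items)), key=lambda i: distance(query, items[i]))
        correct += sum(labels[i] == query_label for i in ranked[:top_k])
        possible += category_size[query_label]
    return correct / possible


# Toy data: 1-D "features", two categories of two items each.
toy_items = [0, 10, 1, 11]
toy_labels = ["a", "a", "b", "b"]
toy_rate = bulls_eye_rate(toy_items, toy_labels, lambda p, q: abs(p - q), top_k=2)
```

In the toy data each query retrieves itself plus one item from the wrong category, so only half of the possible hits are found.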

Multimedia System Evaluation
• How to evaluate a singing performance?
  – A karaoke machine
  – A television competition
  – Ground truth?
• More examples?
  – A person's appearance (Hot or Not)
  – A recipe
  – An instructor's teaching performance…

Multimedia System Evaluation
• “Quality of experience”
  – User satisfaction in using a system (or enjoying multimedia content)
• Gathers ground truth from users
• Quantifies users’ perceptions

QoS and QoE
• Quality of Service
  – The quality level of native performance metrics
    • Pitch, tone, interpretation of a lyric…
• Quality of Experience
  – How users feel about a service
  – Affected by multi-dimensional factors and their tradeoffs
  – A unified scalar is desired!
Which QoS metric is most influential on users’ perceptions?
Slides from Dr. Kuan-Ta Chen

Amazon Mechanical Turk MTurk was launched publicly on November 2, 2005.

Mean Opinion Score (MOS) pros and cons?

MOS • Concepts of the scales cannot be concretely defined • Dissimilar interpretations of the scale among users • Difficult to verify users’ scores
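For reference, a MOS is simply the average of the scores that users assign to a stimulus. The sketch below assumes the common 1-to-5 rating scale, which the slides do not state; the function name and the sample ratings are hypothetical. It also reports the spread of the scores, since a large spread is one symptom of users interpreting the scale differently.

```python
from statistics import mean, stdev


def mean_opinion_scores(ratings):
    """Map each stimulus to (MOS, standard deviation of its scores).

    ratings: dict of stimulus name -> list of integer scores.
    Assumption: scores lie on a 1..5 scale.
    """
    summary = {}
    for stimulus, scores in ratings.items():
        if not scores or any(not 1 <= s <= 5 for s in scores):
            raise ValueError(f"invalid scores for {stimulus!r}")
        spread = stdev(scores) if len(scores) > 1 else 0.0
        summary[stimulus] = (mean(scores), spread)
    return summary


# Hypothetical ratings from four users for two clips.
mos_summary = mean_opinion_scores({"clip_a": [5, 4, 4, 3], "clip_b": [2, 1, 3, 2]})
```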

Paired Comparison A B

Examples: 4 instances, 10 experiments
A matrix of choice frequencies:

        A     B     C     D
  A     -     1     0     1
  B     9     -     3     2
  C    10     7     -     4
  D     9     8     6     -

The entry in row D, column C (6) is the # choices in which participants prefer C over D.

Examples: 4 instances, 10 experiments
A matrix of choice frequencies:

        A     B     C     D
  A     -    0.9   1.0   0.9
  B    0.1    -    0.7   0.8
  C    0     0.3    -    0.6
  D    0.1   0.2   0.4    -

The entry in row C, column D (0.6) is the estimated P(choose C over D).
Pij = f(si, sj)
Example: The Bradley-Terry-Luce (BTL) model
Estimated scores: D: 0, C: 0.63, B: 0.91, A: 1
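The slide's BTL formula did not survive extraction. In its standard form, the BTL model assigns each item a positive strength and sets P(choose i over j) = strength_i / (strength_i + strength_j). The sketch below fits such strengths to the win counts with a simple iterative (minorization-maximization) update; the fitting procedure, the function names, and the normalization (strengths sum to 1) are assumptions for illustration. The slides do not say how their scores were scaled, so the numbers produced here are not expected to match the listed scores digit for digit; only the ranking is comparable.

```python
def fit_btl(wins, iterations=200):
    """Estimate BTL strengths from a matrix of win counts.

    wins[i][j] = number of times item i was preferred over item j.
    Returns strengths normalized to sum to 1.
    """
    n = len(wins)
    strength = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pair (i, j) contributes its number of comparisons,
            # weighted by the current combined strength of the pair.
            denom = sum((wins[i][j] + wins[j][i]) / (strength[i] + strength[j])
                        for j in range(n) if j != i)
            updated.append(total_wins / denom)
        norm = sum(updated)
        strength = [s / norm for s in updated]
    return strength


def btl_probability(strength, i, j):
    """P(choose item i over item j) under the BTL model."""
    return strength[i] / (strength[i] + strength[j])


# Win counts for A, B, C, D (10 comparisons per pair), row preferred over column.
example_wins = [
    [0, 9, 10, 9],   # A
    [1, 0, 7, 8],    # B
    [0, 3, 0, 6],    # C
    [1, 2, 4, 0],    # D
]
example_strength = fit_btl(example_wins)
```

The fitted strengths rank the items A > B > C > D, the same order as the scores on the slide.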

Verification
• Not every user is trustworthy!
• Transitivity property
  – If A > B and B > C, then A > C
• Transitivity Satisfaction Rate (TSR)

Verification
• Detect inconsistent judgments from problematic users
  – TSR = 1: perfect consistency
  – TSR >= 0.8: generally consistent
  – TSR < 0.8: judgments are inconsistent
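A TSR check for one user's judgments might look like the sketch below. The slides do not give the exact formula, so this assumes one plausible reading: among all ordered triples (A, B, C) where the user chose A over B and B over C and also compared A with C, TSR is the fraction in which A was chosen over C. The function name and input format are hypothetical.

```python
from itertools import permutations


def transitivity_satisfaction_rate(preferences):
    """Compute TSR from one user's pairwise judgments.

    preferences: iterable of (winner, loser) pairs.
    Returns None when no triple exists to which transitivity applies.
    """
    prefers = set(preferences)
    items = {item for pair in prefers for item in pair}
    applicable = 0
    satisfied = 0
    for a, b, c in permutations(items, 3):
        if (a, b) in prefers and (b, c) in prefers:
            if (a, c) in prefers:        # transitivity holds
                applicable += 1
                satisfied += 1
            elif (c, a) in prefers:      # transitivity violated
                applicable += 1
    return satisfied / applicable if applicable else None


consistent = transitivity_satisfaction_rate([("A", "B"), ("B", "C"), ("A", "C")])
circular = transitivity_satisfaction_rate([("A", "B"), ("B", "C"), ("C", "A")])
```

Under this reading, a user whose TSR falls below the 0.8 threshold on the slide would be flagged as inconsistent.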

Image Representation

Multimedia file formats • A list of some formats used in the popular product “Macromedia Director” • These formats differ mainly in how data are compressed. • Features are normally extracted from raw data.

1-bit images
• Each pixel is stored as a single bit (0 or 1), so such an image is also referred to as a binary image.
• Also called a 1-bit monochrome image, since it contains no color.

8-bit gray-level images
• Each pixel has a gray value between 0 and 255 (0 => black, 255 => white).
• Image resolution refers to the number of pixels in a digital image.
• A 640 x 480 grayscale image requires ??? kB
  – One byte per pixel: 640 x 480 = 307,200 bytes ~ 300 kB

24-bit color images
• Each pixel is represented by three bytes, usually representing RGB.
• This format supports 256 x 256 x 256 (16,777,216) possible colors.
• A 640 x 480 24-bit color image would require 921.6 kB!
Lena: 1972, 1997
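The storage figures for the three image types can be checked with a few lines of arithmetic. The helper name below is hypothetical, and the sketch assumes uncompressed pixel data with no header and 1 kB = 1000 bytes.

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Size of uncompressed pixel data in bytes, rounded up to whole bytes."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8


binary_bytes = raw_image_bytes(640, 480, 1)    # 1-bit image: 38,400 bytes
gray_bytes = raw_image_bytes(640, 480, 8)      # 8-bit gray-level: 307,200 bytes
color_bytes = raw_image_bytes(640, 480, 24)    # 24-bit color: 921,600 bytes = 921.6 kB
color_count = 2 ** 24                          # 16,777,216 possible colors
```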

Original image; Y (luminance); U and V (chrominance)
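The decomposition into one luminance and two chrominance channels can be sketched per pixel as below. The slides do not give the conversion constants; this sketch assumes the widely used BT.601 weights (Y = 0.299 R + 0.587 G + 0.114 B, U = 0.492 (B - Y), V = 0.877 (R - Y)), and the function name is hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (0..255 per channel) to (Y, U, V).

    Assumption: BT.601 luma weights and analog-YUV chroma scaling.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue-difference chrominance
    v = 0.877 * (r - y)                     # red-difference chrominance
    return y, u, v


white_yuv = rgb_to_yuv(255, 255, 255)   # a gray-scale pixel has near-zero chrominance
red_yuv = rgb_to_yuv(255, 0, 0)
```

Any pixel with R = G = B carries (nearly) zero U and V, which is why the Y channel alone looks like a gray-level version of the image.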

Image Features
• Raw data occupy a lot of memory and may be noisy.
• Summarize raw data into descriptive feature values.

Feature types
• Global features
  – Color
  – Shape
  – Texture
• Local features
  – SIFT
  – SURF
  – Self similarity
  – Shape context
  – HOG…
  – …
A fixed-length feature vector

Color histogram
• A color histogram counts pixels with a given pixel value in Red, Green, and Blue (RGB).
• An example of a histogram that has 256^3 (16,777,216) bins, for 24-bit color images:

Color histogram (cont.)
• Quantization
Figure labels: cyan, magenta, yellow
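Quantization reduces the 256^3 possible bins to a manageable number by grouping nearby values in each channel. A minimal sketch, assuming uniform quantization with the same number of levels per channel (the slides do not specify the scheme) and a hypothetical function name:

```python
def color_histogram(pixels, bins_per_channel=4):
    """Count pixels per quantized RGB bin.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    Returns a list of bins_per_channel ** 3 counts.
    """
    n = bins_per_channel
    histogram = [0] * (n ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a level in 0..n-1.
        r_level, g_level, b_level = r * n // 256, g * n // 256, b * n // 256
        histogram[(r_level * n + g_level) * n + b_level] += 1
    return histogram


# Two reddish pixels fall into the same bin; the blue pixel into another.
sample_histogram = color_histogram([(255, 0, 0), (250, 10, 5), (0, 0, 255)],
                                   bins_per_channel=2)
```

With 4 levels per channel the histogram has only 64 bins instead of 16,777,216.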

Color histogram (cont.)
• Problems of such a representation
  – Case 1: SAME!
  – Case 2: SAME!
  – Case 3: SAME!

Search by color histograms
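Searching by color histogram amounts to ranking database images by how similar their histograms are to the query's. The slides do not name a similarity measure; the sketch below assumes histogram intersection on normalized histograms, and all names and the toy 3-bin histograms are hypothetical.

```python
def normalize(histogram):
    """Scale a histogram so its bins sum to 1 (removes the effect of image size)."""
    total = sum(histogram)
    if total == 0:
        raise ValueError("empty histogram")
    return [count / total for count in histogram]


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1]: the overlapping mass of two normalized histograms."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def rank_by_histogram(query_hist, database):
    """Return database image names sorted from most to least similar."""
    query = normalize(query_hist)
    scores = {name: histogram_intersection(query, normalize(hist))
              for name, hist in database.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy 3-bin histograms (e.g. reddish, greenish, bluish pixel counts).
toy_database = {"sunset": [8, 1, 1], "forest": [1, 8, 1], "sea": [1, 1, 8]}
ranking = rank_by_histogram([7, 2, 1], toy_database)
```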

Regional color
• Divide the image into regions
• Extract a color histogram for each region
• Concatenate those color histograms into one long feature vector
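The three steps above can be sketched as follows, assuming the regions form a regular grid and each region uses a uniformly quantized RGB histogram; both choices, and the function names, are illustrative assumptions rather than the method prescribed by the slides.

```python
def quantized_bin(pixel, bins_per_channel):
    """Index of the quantized RGB bin that a pixel falls into."""
    n = bins_per_channel
    r, g, b = pixel
    return ((r * n // 256) * n + g * n // 256) * n + b * n // 256


def regional_color_feature(image, grid_rows=2, grid_cols=2, bins_per_channel=2):
    """Concatenate one color histogram per grid cell into a single vector.

    image: list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(image), len(image[0])
    feature = []
    for gr in range(grid_rows):
        top, bottom = gr * height // grid_rows, (gr + 1) * height // grid_rows
        for gc in range(grid_cols):
            left, right = gc * width // grid_cols, (gc + 1) * width // grid_cols
            histogram = [0] * (bins_per_channel ** 3)
            for row in image[top:bottom]:
                for pixel in row[left:right]:
                    histogram[quantized_bin(pixel, bins_per_channel)] += 1
            feature.extend(histogram)
    return feature


RED, BLUE = (255, 0, 0), (0, 0, 255)
image_a = [[RED, BLUE], [BLUE, RED]]
image_b = [[BLUE, RED], [RED, BLUE]]
```

The two toy images contain the same colors in different positions: their global histograms (a 1 x 1 grid) are identical, while their 2 x 2 regional features differ.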

Textures
• Many natural and man-made objects are distinguished by their texture.
• Man-made textures
  – Walls, clothes, rugs…
• Natural textures
  – Water, clouds, sand, grass, …
What is this?

Examples
More: http://www.ux.uis.no/~tranden/brodatz.html

Texture Properties
• Coarseness: coarse vs. fine
• Contrast: high vs. low
• Orientation: directional vs. non-directional
• Edge: line-like vs. blob-like
• Regularity: regular vs. random
• Roughness: rough vs. smooth

Texture features
• Structural
  – Describe arrangement of texture elements
  – E.g., “texton model”, “texel model”
• Statistical
  – Characterize texture in terms of statistics
  – E.g., co-occurrence matrix, Markov random field
• Spectral
  – Analyze in spatial-frequency domain
  – E.g., Fourier transform, Gabor filter, wavelets
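As an example of a statistical texture feature, a gray-level co-occurrence matrix counts how often gray level i is followed by gray level j at a fixed pixel offset. The sketch below is a minimal, unnormalized version with a hypothetical contrast statistic on top; the function names, the default offset (one pixel to the right), and the toy image are assumptions.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count gray-level pairs (image[y][x], image[y + dy][x + dx]).

    image : list of rows of integer gray levels in 0..levels-1
    levels: number of gray levels
    """
    matrix = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                matrix[image[y][x]][image[ny][nx]] += 1
    return matrix


def contrast(matrix):
    """Average squared gray-level difference over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


toy_texture = [[0, 0, 1],
               [0, 1, 1],
               [1, 1, 0]]
toy_matrix = cooccurrence_matrix(toy_texture, levels=2)
```

A smooth texture concentrates counts on the diagonal (low contrast); a busy texture spreads them away from it.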

Shape
• Boundary-based feature
  – Use only the outer boundary of the shape
  – E.g., Fourier descriptor, shape context descriptor
• Region-based feature
  – Use the entire shape region
  – Local descriptors

Shape: Fourier descriptor

Properties • Invariant to translation, scale, and rotation
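These invariances can be illustrated with one common formulation of the Fourier descriptor, which may differ from the one shown on the original slides. Boundary points are treated as complex numbers and transformed with a DFT; dropping the zero-frequency term removes translation, dividing by the magnitude of the first coefficient removes scale, and keeping only magnitudes removes rotation. All function names and the sample boundary are hypothetical.

```python
import cmath
import math


def fourier_descriptor(boundary, num_coefficients=4):
    """Translation-, scale-, and rotation-invariant descriptor of a closed boundary.

    boundary: list of (x, y) points traced in order around the shape.
    """
    n = len(boundary)
    if n < num_coefficients + 2:
        raise ValueError("boundary has too few points")
    z = [complex(x, y) for x, y in boundary]
    # Plain DFT of the complex boundary signal.
    coeffs = [sum(z[k] * cmath.exp(-2j * math.pi * u * k / n) for k in range(n))
              for u in range(n)]
    scale = abs(coeffs[1])
    if scale == 0:
        raise ValueError("degenerate boundary")
    # Skip coeffs[0] (position) and normalize by |coeffs[1]| (size).
    return [abs(coeffs[u]) / scale for u in range(2, 2 + num_coefficients)]


def similarity_transform(boundary, scale, angle, shift):
    """Scale, rotate (radians), and translate a boundary."""
    rotation = cmath.exp(1j * angle)
    moved = [scale * rotation * complex(x, y) + complex(*shift) for x, y in boundary]
    return [(p.real, p.imag) for p in moved]


square = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]
original = fourier_descriptor(square)
transformed = fourier_descriptor(similarity_transform(square, 2.5, 0.7, (3.0, -4.0)))
```

The two descriptors agree up to floating-point error even though the second square is larger, rotated, and shifted.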

Assignment #0
• Upload one of your photos with your name on Moodle (http://moodle.ntnu.edu.tw/)
  – Size: < 640 * 480
  – Type in your name
  – Filename: your_student_id.jpg
• Due on 03/12 (Mon.) 11:55 pm

Assignment #1
• Collect landmark photos
  – At least 5 sets
  – Each set contains 8 images
  – Resize the images to a standard size (around 800*600)
  – Upload the images to Moodle
  – Format: your_student_id.zip
    • /Taipei_101/001.jpg (no spaces; use underscores instead)

Reading
• David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints, IJCV, 2004.
