Video Surveillance is Useless for Identification
Peter Kovesi
School of Computer Science & Software Engineering
The University of Western Australia

Questions
• What image quality do we need for identification?
• How do you measure image quality?
• What is the image quality from a video camera?
• What is the effect on image quality when you:
  • record to video tape?
  • use frame grabbers of different quality?
  • use image compression?

Humans are very bad at recognizing unfamiliar faces
• Kemp, Towell and Pike (1997) tested the value of having photos on credit cards. When a user presented a card with a photograph of someone else that had some resemblance to the user, they were challenged less than 40% of the time.
• Bruce et al. (1999, 2001) have tested the ability of people to match good quality CCTV images of unfamiliar faces under a variety of scenarios. Correct recognition rates are typically only 70-80%.

Bruce et al. (1999): Is this person in the array? If they are present, match the person.
[Figure: good quality photograph of the target beside an array of 10 good quality CCTV images]
• When the target was present in the array, 12% picked the wrong person and 18% said they were not present (overall only 70% correct).
• When the target was not present in the array, 70% still matched the target to someone in the array.

Automated Face Recognition is improving, but there is some way to go.
Face Recognition Vendor Test 2002 identification performance results:
• ~90% for a database of about 100 individuals
• ~65-75% for a database of about 37,000 individuals
(US visa application photos taken with standardized equipment and with white backgrounds)

Identification performance appears to decline with database size in a log-linear manner.
[Plot: system performance parameter against database size]
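As a rough illustration of what a log-linear decline implies, the sketch below draws a straight line in log10(database size) through the two FRVT 2002 figures quoted earlier. It is only a sketch: 0.70 is the midpoint of the quoted 65-75% range (an assumption), the function names are hypothetical, and two points cannot validate the model.

```python
import math

# Two operating points quoted from FRVT 2002 in the slides.
# 0.70 is the midpoint of the quoted 65-75% range (an assumption).
POINTS = [(100, 0.90), (37_000, 0.70)]


def fit_log_linear(points):
    """Return (a, b) such that rate = a + b * log10(database_size)."""
    (n1, r1), (n2, r2) = points
    b = (r2 - r1) / (math.log10(n2) - math.log10(n1))
    a = r1 - b * math.log10(n1)
    return a, b


def predicted_rate(size, a, b):
    """Identification rate predicted by the fitted line for a database size."""
    return a + b * math.log10(size)


a, b = fit_log_linear(POINTS)
for size in (100, 1_000, 10_000, 37_000, 1_000_000):
    print(f"{size:>9,d} individuals: {predicted_rate(size, a, b):.0%}")
```

Under this model every tenfold increase in database size costs the same number of percentage points, which is why large databases are the worrying case.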

• Even with good quality images, face recognition performance, by human or machine, is poor.
• Surveillance video rarely provides good quality images.
What image quality is needed for face identification?

Image quality is defined by many attributes
• Minimum feature size that can be resolved
• Noise level
• Quality of luminance reproduction
• Quality of colour reproduction

Minimum feature size that can be resolved…
Frequency Analysis of images
It is possible to build up any waveform by adding up a series of sine waves. The human visual system analyses many aspects of images in terms of different frequency components.
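A minimal sketch of building a waveform from sine waves, using the standard Fourier series of a unit square wave as the example. The square wave and the function name are illustrative choices, not taken from the slides.

```python
import math


def square_wave_partial_sum(x, n_terms):
    """Sum the first n_terms odd sine harmonics of a unit square wave at x (radians)."""
    total = 0.0
    for k in range(n_terms):
        harmonic = 2 * k + 1          # only odd harmonics contribute
        total += math.sin(harmonic * x) / harmonic
    return 4.0 / math.pi * total


# At x = pi/2 the square wave equals 1; more harmonics give a closer approximation.
x = math.pi / 2
for n_terms in (1, 3, 10, 100):
    print(n_terms, round(square_wave_partial_sum(x, n_terms), 4))
```

The high-frequency terms carry the sharp edges, so a system that cannot pass them blurs fine detail first.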

Human Face Recognition
In humans it has been found that face recognition is tuned to a set of spatial frequencies ranging from about 20 cycles per face width down to about 5 cycles per face width. The most important spatial frequency for face recognition corresponds to about 10 cycles/face width. To be able to recognize with some confidence you need to be able to resolve 20 cycles/face width (Nasanen 1999).
[Figure: a face about 160 mm wide shown at 20, 10 and 5 cycles per face width; one cycle spans 8 mm at 20 cycles and 16 mm at 10 cycles]
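The cycle lengths follow from dividing the face width by the number of cycles. The sketch below reproduces that arithmetic and adds a pixel estimate; the two-samples-per-cycle figure is the idealized Nyquist lower bound, an assumption added here rather than something stated in the slides, and the function names are hypothetical.

```python
FACE_WIDTH_MM = 160.0  # approximate face width given in the slides


def cycle_length_mm(cycles_per_face, face_width_mm=FACE_WIDTH_MM):
    """Physical length of one cycle for a given spatial frequency."""
    return face_width_mm / cycles_per_face


def min_pixels_across_face(cycles_per_face, samples_per_cycle=2):
    """Idealized lower bound on pixels across the face (Nyquist: 2 samples per cycle)."""
    return cycles_per_face * samples_per_cycle


for cycles in (20, 10, 5):
    print(f"{cycles:>2} cycles/face: {cycle_length_mm(cycles):.0f} mm per cycle, "
          f"at least {min_pixels_across_face(cycles)} pixels across the face")
```

Real cameras, lenses and recorders fall short of this bound, so in practice more pixels across the face are needed than the figure printed here.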

1951 USAF Chart
Groupings of 6 pairs of bars. Each successive set is half the size of the previous.
[Figure: USAF chart annotated with 8 mm and 16 mm markers]

Eye charts also provide a simple way of measuring the minimum feature size that can be resolved.

20/20 Vision… or in metric, 6/6 vision
Minimum Angle of Resolution
Snellen fraction = (distance at which you can read the line on the chart) / (distance at which you should be able to read the line), for example 6/6.

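logMAR Chart
Letter heights, largest to smallest, with the Snellen fraction where the slide gives one:
• 88 mm
• 72 mm (6/48)
• 58 mm
• 44 mm
• 36 mm (6/24)
• 18 mm (6/12)
• 9 mm (6/6)
For comparison: number plate letters are 80 mm high and average eye spacing is 65 mm.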
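The labelled letter heights can be checked with a little trigonometry. The sketch below assumes the usual eye-chart convention that a letter on the 6/X line subtends 5 minutes of arc when viewed from X metres; that convention and the function name are assumptions added here, not stated in the slides.

```python
import math

ARCMIN_PER_LETTER = 5.0  # assumed convention: letter subtends 5 arcmin at its rated distance


def letter_height_mm(rated_distance_m):
    """Height of a letter on the 6/X line, where X = rated_distance_m."""
    angle_rad = math.radians(ARCMIN_PER_LETTER / 60.0)
    return rated_distance_m * 1000.0 * math.tan(angle_rad)


for denominator in (6, 12, 24, 48):
    mar_arcmin = denominator / 6.0           # minimum angle of resolution at 6 m
    log_mar = math.log10(mar_arcmin)
    print(f"6/{denominator}: {letter_height_mm(denominator):.1f} mm, "
          f"MAR {mar_arcmin:.0f} arcmin, logMAR {log_mar:.2f}")
```

The computed heights come out slightly below the chart values, which is consistent with the slide rounding the 6/6 letter up to 9 mm and doubling from there.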

Tests conducted with Pulnix TM 6 CN 1/2” CCD camera positioned 6 m from the target.
C-mount lenses: 4 mm, 6 mm, 8.5 mm, 12.5 mm, 16 mm
Images were digitized directly from the camera using a Data Translation 3155 frame grabber (a good quality monochrome digitizer).
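To get a feel for what each lens delivers at 6 m, the following sketch uses a simple pinhole-camera model. The 6.4 mm sensor width is the nominal figure for the 1/2” format, and the 768-pixel digitized width is a hypothetical value chosen for illustration; neither number comes from the slides, and the function names are likewise illustrative.

```python
SENSOR_WIDTH_MM = 6.4    # nominal width of a 1/2" format sensor (assumption)
PIXELS_ACROSS = 768      # hypothetical digitized image width in pixels
DISTANCE_MM = 6000.0     # camera-to-target distance from the slides
FACE_WIDTH_MM = 160.0    # approximate face width from the slides
LENSES_MM = (4.0, 6.0, 8.5, 12.5, 16.0)


def scene_width_mm(focal_length_mm):
    """Width of the scene covered at the target distance (pinhole model)."""
    return DISTANCE_MM * SENSOR_WIDTH_MM / focal_length_mm


def mm_per_pixel(focal_length_mm):
    """Scene distance covered by one pixel at the target."""
    return scene_width_mm(focal_length_mm) / PIXELS_ACROSS


def pixels_across_face(focal_length_mm):
    """Number of pixels spanning a face at the target distance."""
    return FACE_WIDTH_MM / mm_per_pixel(focal_length_mm)


for f in LENSES_MM:
    print(f"{f:>4} mm lens: scene {scene_width_mm(f) / 1000:.1f} m wide, "
          f"{mm_per_pixel(f):.1f} mm/pixel, "
          f"{pixels_across_face(f):.0f} pixels across a face")
```

Comparing the printed pixel counts with the 20 cycles per face width requirement shows why wide-angle lenses give little usable facial detail at this distance.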

4 mm lens

6 mm lens

8.5 mm lens

12.5 mm lens

16 mm lens

Expect to lose quality when images are recorded to video
[Figure: camera image recorded to video, then played back and digitized, beside the camera image digitized directly. Look at the USAF chart. Cropped images taken with the 12.5 mm lens.]

Compression is problematic. Test targets survive compression well, but faces do not.
JPEG images compressed using Photoshop. Image ‘quality’ can range from 0 to 12.
• Original PNG image (190 kB)
• JPEG image quality 0 (14 kB)
• JPEG image quality 4 (24 kB)
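For scale, the file sizes above translate into the following size reductions. This is plain arithmetic on the quoted figures; note that the PNG is itself losslessly compressed, so the ratios are relative to the PNG file and not to the raw pixel data. The names used are illustrative.

```python
ORIGINAL_PNG_KB = 190.0
JPEG_SIZES_KB = {0: 14.0, 4: 24.0}  # Photoshop quality setting -> file size


def compression_ratio(original_kb, compressed_kb):
    """How many times smaller the compressed file is than the original."""
    return original_kb / compressed_kb


for quality, size_kb in sorted(JPEG_SIZES_KB.items()):
    ratio = compression_ratio(ORIGINAL_PNG_KB, size_kb)
    print(f"quality {quality}: {size_kb:.0f} kB, about {ratio:.1f}x smaller than the PNG")
```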

Faces do not survive compression well
[Figure: original face image beside JPEG versions at 14 kB and 24 kB]

Image from a Logitech webcam (640 x 480 image)

A Real Surveillance Camera Installation…

[Figure annotation: 4.8 m]

Image quality is defined by many attributes
• Minimum feature size that can be resolved
• Noise level
• Quality of luminance reproduction
• Quality of colour reproduction

Luminance and colour cues are at least as important as shape cues
People perform about equally well using just shape information or just pigmentation cues (O’Toole et al. 1999; see also Russell et al. 2004).
Image compression typically quantizes colour information very heavily…
[Figure: original faces; fixed shape with varying pigmentation; fixed pigmentation with varying shape]

Conclusions
• Surveillance video, as it is currently used, is almost useless for identification.
• Surveillance cameras should be sited strategically to capture close-ups of people as they pass through a constrained point, such as a doorway.
• Image quality standards should be developed for surveillance camera installations.
• Work is needed to develop measures of image degradation caused by compression.

Conclusions
• Identification errors will occur in biometric systems. As database sizes grow this will be an increasing problem. Claims of misidentification have to be taken seriously.
• Face recognition performance is poor. It should probably only be used for verification of identity (a one-to-one test), and not for identification (a one-to-many test).
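One way to see why one-to-many identification scales badly is to compute the chance of at least one false match when a probe is compared against every entry in a database. The sketch below assumes each comparison is independent with the same false match rate; both the independence assumption and the 1-in-1000 rate are illustrative, not figures from the slides.

```python
def prob_any_false_match(false_match_rate, database_size):
    """Chance of at least one false match over database_size independent comparisons."""
    return 1.0 - (1.0 - false_match_rate) ** database_size


FALSE_MATCH_RATE = 0.001  # hypothetical per-comparison false match rate

# database_size = 1 corresponds to verification (a one-to-one test).
for size in (1, 100, 1_000, 37_000):
    p = prob_any_false_match(FALSE_MATCH_RATE, size)
    print(f"database of {size:>6,d}: P(at least one false match) = {p:.3f}")
```

With these assumed numbers a single verification is rarely wrong, while a search over tens of thousands of entries almost always produces some false match.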