Steganography and Digital Watermarking Jonathan Cummins, Patrick Diskin, Samuel Lau, Robert Parlett, Mark Ryan

Introduction
• Steganography (covered writing, covert channels)
  – Protection against detection (data hiding)
  – Protection against removal (document marking)
    - Watermarking (all objects are marked in the same way)
    - Fingerprinting (identify all objects, every object is marked specifically)
Source: Richard Popa.

Steganography and encryption
• Confidentiality
  – Encryption (hide the content of the message, but not the existence of the message)
    - Anybody can see both parties are communicating in secret.
    - Suspicious.
  – Steganography (hide the content of the message and even the existence of the message)
    - Ideally nobody can see that both parties are secretly communicating.
    - Innocent.

History
• 440 B.C.: Histiaeus shaved the head of his most trusted slave and tattooed it with a message, which was hidden once the hair had regrown. The aim was to instigate a revolt against the Persians.
• 1st World War: Prisoners of war would hide messages in letters home. Censors intercepting the messages often changed the phrasing to try to disrupt any hidden content. A message reading “Father is dead” was modified to read “Father is deceased”, and when the reply came back “Is Father dead or deceased?”, the censor was alerted to the hidden message.
• 2nd World War: Germans would hide data as microdots. This involved photographing the message to be hidden and reducing its size so that it could be used as a period within another document.
• Current: Industry demands digital watermarking and fingerprinting of audio and video.

Hiding information digitally
Requirements
• The quality of the marked object must not noticeably degrade upon the addition of the mark.
• Marks should be undetectable without some secret knowledge (typically, a key).
• The marks should survive transformations which don’t degrade the perceived quality of the work.
• If multiple marks are present, they should not interfere with each other.

How steganography works
[Diagram: Secret Image + Cover Image → Encoder (Key) → Stego Object → Communications Channel → Decoder (Key, Original Cover) → Secret Image]

Robustness and fragility
• Fragile
  – Hidden information is destroyed as soon as the object is modified.
  – Protocols tend to be easy to implement.
  – Useful in proving objects have not been manipulated and changed, e.g. evidence in a court of law.
• Robust
  – It should be infeasible to remove the hidden data without degrading the perceived quality of the data.
  – Protocols are more complex.
  – One single protocol may not withstand all object manipulations.
  – Useful in copyright watermarking.

Steganography techniques
• Program files
• Text
• Images
  – Least significant bit
  – Discrete cosine transform
• Audio
  – MP3

Information hiding in program files
• If we change something in a program's source, its execution could be different.
• We can use a serial key or the author’s logo to achieve copyright protection.
• Method for watermarking the binary: reorder instructions that do not depend on each other. For example, these orderings all compute the same result:
  – a = 2; b = 3; c = b + 3; d = b + c;
  – b = 3; a = 2; c = b + 3; d = b + c;
  – b = 3; c = b + 3; d = b + c; a = 2;
• b, c, d must be done in the same order, but a can be executed at any time.

Information hiding in program files
• W = (w1, w2, w3, w4, ..., wn) (watermark), with wi ∈ {0, 1}
• Divide the program into n blocks.
• If wi = 0, the code of block i is left unchanged; if wi = 1, two instructions are switched in block i.
• To decode we need the original binary file. Comparing the original and marked binary files, we can recover W.
• Not resistant to attacks: if the attacker has enough copies, he can recover W.
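
The block-swapping scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each block is modelled as a list of instruction strings, and the first two instructions of every block are assumed to be independent so that swapping them is safe. The names embed and extract are hypothetical, not from the slides.

```python
def embed(blocks, bits):
    """Return a marked copy of the program: block i is changed only when bits[i] == 1."""
    if len(blocks) != len(bits):
        raise ValueError("need exactly one watermark bit per block")
    marked = []
    for block, bit in zip(blocks, bits):
        block = list(block)
        if bit == 1:
            # Assumption: the first two instructions of the block do not depend
            # on each other, so swapping them leaves the program's result unchanged.
            block[0], block[1] = block[1], block[0]
        marked.append(block)
    return marked


def extract(original, marked):
    """Recover W by comparing each marked block with the original block."""
    return [0 if o == m else 1 for o, m in zip(original, marked)]


original = [
    ["a = 2", "b = 3", "c = b + 3"],
    ["x = 1", "y = 5", "z = y * 2"],
    ["p = 0", "q = 7", "r = q - 1"],
]
W = [1, 0, 1]
marked = embed(original, W)
print(extract(original, marked))  # [1, 0, 1]
```

As the slide notes, decoding needs the original file, and anyone holding several differently marked copies can compare them in the same way.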

Information hiding in documents
• How to embed marks in documents?
  – The goal is to hide information (or a mark)
  – Without changing the meaning or the appearance of the document
  – The info or the mark can only be extracted by the key holder

Information hiding in documents
– Line shift coding: vertical shifting of lines
  - Shifts lines slightly up or down
  - Lines to be shifted are decided by a codebook
– Word shift coding: horizontal spacing between words
  - Shifts words slightly left or right, as decided by a codebook
[Figure: an example of word shift coding]

Text techniques
• White space manipulation in plain text files
  – Text viewers don’t display white space at the end of lines.
• Using a document’s sentence structure and choice of words to hide information
  – “The auto drives fast on a slippery road over the hill” changed to “Over the slope the car travels quickly on an ice-covered street”.
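
A minimal sketch of the trailing white space idea, assuming a one-bit-per-line convention that is not specified in the slides: a line ending in a single space carries a 1, a line with no trailing space carries a 0. The function names hide and reveal are hypothetical.

```python
def hide(cover_text, bits):
    """Append one trailing space to line i when bits[i] == 1 (assumed convention)."""
    lines = cover_text.split("\n")
    if len(bits) > len(lines):
        raise ValueError("cover text has too few lines for this message")
    out = []
    for i, line in enumerate(lines):
        line = line.rstrip(" ")  # start from a clean line
        if i < len(bits) and bits[i] == 1:
            line += " "
        out.append(line)
    return "\n".join(out)


def reveal(stego_text, n_bits):
    """Read the first n_bits lines: trailing space means 1, otherwise 0."""
    lines = stego_text.split("\n")
    return [1 if line.endswith(" ") else 0 for line in lines[:n_bits]]


cover = "Dear Bob,\nthe meeting is at noon.\nBring the report.\nRegards, Alice"
stego = hide(cover, [1, 0, 1, 1])
print(reveal(stego, 4))  # [1, 0, 1, 1]
```

The capacity is tiny (one bit per line here), and any editor that strips trailing white space destroys the message.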

Text techniques
• www.SpamMimic.com takes a message and generates some apparent spam that encodes it.
• In the example below, the text being hidden is: “I'm having a great time learning about computer security”.
• SpamMimic’s website doesn’t reveal how it works, but it is probably not hard to figure out.

Dear Friend , Especially for you - this red-hot intelligence. We will comply with all removal requests. This mail is being sent in compliance with Senate bill 2116 , Title 9 ; Section 303 ! THIS IS NOT A GET RICH SCHEME. Why work for somebody else when you can become rich inside 57 weeks. Have you ever noticed most everyone has a cellphone & people love convenience. Well, now is your chance to capitalize on this. WE will help YOU SELL MORE and sell more ! You are guaranteed to succeed because we take all the risk ! But don't believe us. Ms Simpson of Washington tried us and says "My only problem now is where to park all my cars". This offer is 100% legal. You will blame yourself forever if you don't order now ! Sign up a friend and you'll get a discount of 50%. Thank-you for your serious consideration of our offer. Dear Decision maker ; Thank-you for your interest in our briefing. If you are not interested in our publications and wish to. . .

Hiding info in XML documents
• Using tag structure to hide information
Stego key:
  <img></img> … 0
  <img/> … 1
Bit string: 01110
Stego data:
  <img src="foo1.jpg"></img>
  <img src="foo2.jpg"/>
  <img src="foo3.jpg"/>
  <img src="foo4.jpg"/>
  <img src="foo5.jpg"></img>
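
The tag-structure key above can be sketched as a pair of Python functions. This is a rough illustration, not a full XML parser: it assumes only img elements carry bits and that attribute values contain no ">" character. The names encode and decode are hypothetical.

```python
import re


def encode(srcs, bits):
    """Emit one img element per bit: <img ...></img> for 0, <img .../> for 1."""
    elements = []
    for src, bit in zip(srcs, bits):
        if bit == "0":
            elements.append(f'<img src="{src}"></img>')
        else:
            elements.append(f'<img src="{src}"/>')
    return "\n".join(elements)


def decode(xml_text):
    """Read the bits back: a self-closing img tag is 1, an open/close pair is 0."""
    bits = []
    # Matches opening or self-closing img tags; "</img>" never matches "<img".
    for match in re.finditer(r"<img\b[^>]*?(/?)>", xml_text):
        bits.append("1" if match.group(1) == "/" else "0")
    return "".join(bits)


stego = encode([f"foo{i}.jpg" for i in range(1, 6)], "01110")
print(decode(stego))  # 01110
```

Both forms are equivalent to an XML parser, which is why the choice between them is free to carry information.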

XML documents
• Using white space in tags
Stego key:
  <tag>, </tag>, or <tag/> … 0
  <tag >, </tag >, or <tag /> … 1
Bit string: 101100
Stego data:
  <user >
    <name>Alice</name >
    <id >01</id>
  </user>
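
A small sketch of the white-space-in-tags key, again using a regular expression rather than a real XML parser and assuming no ">" appears inside attribute values. Each tag carries one bit, in document order; encode and decode are hypothetical names.

```python
import re

TAG = re.compile(r"<[^<>]+>")


def encode(xml_text, bits):
    """Pad each tag with a space before '>' or '/>' when its bit is 1."""
    remaining = iter(bits)

    def pad(match):
        tag = match.group(0)
        bit = next(remaining, "0")  # tags beyond the message carry 0
        closer = "/>" if tag.endswith("/>") else ">"
        body = tag[1:-len(closer)].rstrip()
        return "<" + body + (" " if bit == "1" else "") + closer

    return TAG.sub(pad, xml_text)


def decode(xml_text):
    """One bit per tag: 1 if white space precedes the closing '>' or '/>'."""
    bits = []
    for tag in TAG.findall(xml_text):
        inner = tag[1:-1]
        if inner.endswith("/"):
            inner = inner[:-1]
        bits.append("1" if inner != inner.rstrip() else "0")
    return "".join(bits)


plain = "<user><name>Alice</name><id>01</id></user>"
stego = encode(plain, "101100")
print(stego)          # <user ><name>Alice</name ><id >01</id></user>
print(decode(stego))  # 101100
```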

Watermarking images
• A simple way of watermarking an image is to superimpose another image onto it.
  – The superimposed image might be a company logo or name, etc.
  – In this case, the aim is not to hide anything, but simply to watermark the image.
[Figure: image + logo = watermarked image]

Hiding data in images: LSB
• LSB – least significant bits
  – A simple yet effective way of hiding data in an image for any purpose.
  – The least significant bits of the host image are used to hide the most significant bits of the hidden data.
  – The hidden data can be another image, or something else.
  – The next example shows how image-in-image hiding works via this method.

Hiding data in images: LSB
• Pick the number of bits in which you wish to hide the hidden image.
• Scan through the host image and replace its LSBs with the hidden image’s MSBs. So when 4 bits are used to hide information:
  Host pixel: 10110001
  Secret pixel: 00111111
  New image pixel: 10110011
• To extract the hidden image, take the LSBs from the host image and create a new image from them.
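
The pixel arithmetic above can be checked with a short Python sketch. It assumes 8-bit greyscale pixels stored as integers and images stored as lists of rows; the function names are hypothetical.

```python
def embed_pixel(host, secret, n_bits):
    """Replace the n_bits lowest bits of host with the n_bits highest bits of secret."""
    keep_mask = (0xFF << n_bits) & 0xFF  # selects the host bits that survive
    return (host & keep_mask) | (secret >> (8 - n_bits))


def extract_pixel(stego, n_bits):
    """Shift the hidden bits back to the top; the discarded low bits return as zeros."""
    return (stego << (8 - n_bits)) & 0xFF


def embed_image(host, secret, n_bits):
    """Apply embed_pixel to every pair of pixels in two equally sized images."""
    return [[embed_pixel(h, s, n_bits) for h, s in zip(host_row, secret_row)]
            for host_row, secret_row in zip(host, secret)]


stego = embed_pixel(0b10110001, 0b00111111, 4)
print(format(stego, "08b"))                    # 10110011
print(format(extract_pixel(stego, 4), "08b"))  # 00110000
```

The recovered pixel 00110000 is only an approximation of the secret pixel 00111111: the four low bits of the secret are lost, which is the price of hiding it in four bits.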

Hiding APS inside MDR
[Figure: original images and the results at bit levels 1, 4 and 7]

Hiding in images: LSB
• As an image-in-image method, it appears to work best when the hidden image and host image are given equal priority in terms of the number of bits used.
• Not a very good way of watermarking, as it is easy to remove the hidden data.
• The hidden data can easily be corrupted by noise.
• The LSBs can be used to store other information, like text, but the limitation is the size of the data you wish to store.

Hiding in images: DCT
• The DCT (Discrete Cosine Transform) is closely related to the Fourier transform.
• DCTs convert image pixel values from the spatial domain to the frequency domain.
  – High frequencies correspond to rapidly changing pixel values.
  – Low frequencies correspond to slowly changing pixel values.
• Used in JPEG image compression, and can be used as part of an information hiding technique.
• The frequency domain exposes the aspects of the image which humans cannot easily perceive.

JPEG compression works roughly as follows:
1. Split the image into 8 x 8 squares of 64 pixels.
2. Transform each square via a DCT, which outputs a new 8 x 8 array of coefficients.
3. Round (“quantize”) each of these coefficients. This is the lossy compression stage. Small unimportant coefficients are rounded to 0 while larger ones lose some of their precision.
4. Further compress the array via a Huffman encoding scheme or similar.
Reconstruction of the image is done by inverting the compression and then applying an inverse DCT.
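
Steps 2 and 3 can be sketched in plain Python. This is a simplified illustration, not the JPEG codec: it uses an orthonormal 2-D DCT-II and, as an assumption, a single uniform quantization step where real JPEG uses a per-coefficient quantization table. All function names are hypothetical.

```python
import math

N = 8  # block size from step 1


def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(i, k):
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Step 2: 8 x 8 pixel block -> 8 x 8 array of frequency coefficients."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse transform used when reconstructing the image."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs, step):
    """Step 3: lossy rounding; small coefficients become 0."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(quantized, step):
    return [[q * step for q in row] for row in quantized]


flat = [[100] * N for _ in range(N)]  # a block with no detail at all
q = quantize(dct2(flat), 16)
print(q[0][0], sum(abs(c) for row in q for c in row))  # 50 50
```

A featureless block ends up with a single non-zero coefficient, which is what makes the later entropy-coding step (step 4) so effective.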

JPEG compression
• A quantizer is used as part of the JPEG compression technique.
• It lowers the accuracy of the DCT coefficients, but in a way which the user hopefully can’t perceive.
• To hide information, these values can be tweaked to be all even or all odd.
  – All even = 1
  – All odd = 0
• An image can store 1 bit of information per 8 x 8 block.
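
A minimal sketch of the even/odd convention, operating on a list of already quantized integer coefficients from one block. The slides do not say which coefficients are adjusted or how, so two assumptions are made here: the caller passes in whichever coefficients are to carry the mark, and a coefficient with the wrong parity is moved one step away from zero. The function names are hypothetical.

```python
def embed_bit(coeffs, bit):
    """Force every coefficient to be even (bit 1) or odd (bit 0)."""
    want_odd = (bit == 0)
    marked = []
    for c in coeffs:
        is_odd = (c % 2 == 1)  # Python's % gives 1 for negative odd numbers too
        if is_odd != want_odd:
            c += 1 if c >= 0 else -1  # assumed adjustment: one step away from zero
        marked.append(c)
    return marked


def extract_bit(coeffs):
    """Return 1 if all even, 0 if all odd, None if the parities are mixed."""
    parities = {c % 2 for c in coeffs}
    if parities == {0}:
        return 1
    if parities == {1}:
        return 0
    return None  # no mark present, or the mark has been damaged


block = [12, -3, 5, 0]
print(embed_bit(block, 1))  # [12, -4, 6, 0]
print(embed_bit(block, 0))  # [13, -3, 5, 1]
print(extract_bit(embed_bit(block, 0)))  # 0
```

Because the change is made after quantization, it survives the lossless entropy-coding step, but recompressing with a different quantizer would likely destroy it.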

DCT example
[Figure: original, watermarked and JPEG-compressed images. Pretty much no difference!]

Wavelet transformation
– Wavelets are mathematical functions used in image compression and digital signal processing.
– Used in the JPEG 2000 standard.
– Wavelets are better for higher compression levels than the DCT method.
– Generally wavelets are more robust and are a good way of hiding data.

Hiding in wavelets
• Wavelets are used to store the “detail” in images.
• They store the high frequency information while the low frequency information is stored separately.
• This allows for high compression, as the detail is never lost and yet the low frequency image parts can be compressed continually.
• Same techniques as used with DCT during the quantizer step.
• Currently an ongoing research area.

Hiding in MP3s
• MP3
  – The data to be hidden is stored as the MP3 file is created, in the compression stage.
  – As the sound file is being compressed, data is selectively lost depending on the bit rate the user has specified.
  – The hidden data can be encoded in the parity bit of this information.
  – To retrieve the data, all you need to do is decompress the MP3 file and read the parity bits.

Limitations and attacks
• Five categories of attacks:
  - Basic attacks take advantage of weaknesses in the embedding technique.
  - Robustness attacks attempt to diminish or remove the watermark.
  - Presentation attacks modify the content of the file to prevent detection of the mark.
  - Interpretation attacks involve finding a situation which prevents assertion of ownership.
  - Implementation attacks take advantage of poorly implemented software.

Limitations
• Most techniques are quite fragile.
• Successive marks interfere with each other.
• Marks can affect the quality of the original file.

Robustness attacks
• The attacker transforms the file, exploiting the fragility of the mark, in order to try to remove it.
• Many techniques can survive individual transformations but are vulnerable to combinations of them.
• Defence: try to anticipate the pirate’s actions and design to cope with them.

Robustness attacks - StirMark
• Performs a series of almost unnoticeable distortions to attempt to remove the mark:
  – Geometric distortion
  – Low frequency deviation
  – Transfer function
• Applying StirMark to (a) and (c) produces (b) and (d).

Presentation attacks - Mosaic
• Takes advantage of minimum size requirements for embedding.
• Split the image into small tiles to prevent detection of the mark.
• Recombine when displaying.
• Attempt to remove a mark inserted by Digimarc.
• Tiles bordered in red show no sign of the mark.
• Even with 16 tiles, 6 still contain the mark.
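
The split-and-recombine step of the mosaic attack can be sketched as follows, assuming an image stored as a list of pixel rows. No detector is modelled; the point is only that the tiles, served separately, rebuild exactly the same picture on display. The function names are hypothetical.

```python
def split_tiles(image, tile_h, tile_w):
    """Cut the image into tiles, keyed by the (row, column) of their top-left corner."""
    tiles = {}
    for r in range(0, len(image), tile_h):
        for c in range(0, len(image[0]), tile_w):
            tiles[(r, c)] = [row[c:c + tile_w] for row in image[r:r + tile_h]]
    return tiles


def recombine(tiles, height, width):
    """Place every tile back at its recorded position, as a browser would when rendering."""
    image = [[None] * width for _ in range(height)]
    for (r, c), tile in tiles.items():
        for dr, row in enumerate(tile):
            image[r + dr][c:c + len(row)] = row
    return image


image = [[r * 6 + c for c in range(6)] for r in range(4)]  # 4 x 6 test image
tiles = split_tiles(image, 2, 3)
print(len(tiles))                        # 4
print(recombine(tiles, 4, 6) == image)   # True
```

Each tile on its own may be too small for the detector to find the mark, yet the viewer sees the complete, unmodified image.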

Interpretation attacks
1. Cannot tell which watermark is inserted first.
2. Copyright owner publishes document d with watermark w, i.e. d + w.
3. Pirate adds watermark w’ and claims that the original is d + w – w’.
4. Clear that someone is lying, but no way of telling who is the genuine owner.
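
The arithmetic behind steps 2 and 3 can be made concrete with a toy example, assuming documents and watermarks are equal-length lists of numbers combined by element-wise addition. The values below are made up for illustration.

```python
def add(a, b):
    return [x + y for x, y in zip(a, b)]


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


d = [120, 64, 200, 33]   # owner's true original (illustrative values)
w = [1, -1, 1, -1]       # owner's watermark
published = add(d, w)    # step 2: d + w

w_fake = [-1, -1, 1, 1]                    # pirate's watermark w'
fake_original = sub(published, w_fake)     # step 3: d + w - w'

# Each party's claimed original plus their own mark yields the published document.
print(add(d, w) == published)                    # True
print(add(fake_original, w_fake) == published)   # True
```

Both claims are internally consistent, so the published document alone cannot settle who marked it first.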

Implementation attacks
• If the software implementation is poor, it can allow some attacks.
• Digimarc requires users to register an ID and password.
• An attacker broke into the software and disabled the password checks.
• The attacker could then change the ID, affecting already marked images, and bypass checks for existing marks in order to overwrite them.

Conclusion
• Steganography is likely to become increasingly important as more copyrighted material becomes available online.
• Many techniques are not robust enough to prevent detection of, and tampering with, embedded data.
• For a technique to be considered robust:
  – The quality of the media should not noticeably degrade upon embedding data.
  – Data should be undetectable without secret knowledge, typically the key.
  – If multiple marks are present they should not interfere with each other.
  – The marks should survive attacks that don’t degrade the perceived quality of the work.
• Methods of embedding and detecting are likely to continue to improve.

Questions
