Data Security and Cryptology XVI Basics of Steganography
Data Security and Cryptology, XVI Basics of Steganography December 16th, 2015 Valdo Praust mois@mois.ee Lecture Course in Estonian IT College Autumn 2015
Main Legal Acts Regulating Data Security in Estonia • Personal Data Protection Act – regulates processing of personal data • Public Information Act – regulates databases of public sector, including security standard ISKE and secure data exchange layer X-road (X-tee) • Digital Signature Act – regulates components of PKI necessary for successful operating of digital signature
Public Information Act The aim of the Public Information Act (avaliku teabe seadus) is to ensure that the public and every person has the opportunity to access information intended for public use, based on the principles of a democratic and social rule of law and an open society, and to create opportunities for the public to monitor the performance of public duties. It also regulates public sector databases (avaliku sektori andmekogud), including the principles of establishment (asutamine), management and supervision (järelevalve) of these databases. Earlier there was a special Data Collection Act (andmekogude seadus) in Estonia. Its topics are now included in the Public Information Act
(Legal) Database A (legal) database (andmekogu) is a structured body of data processed within an information system of the state, local government or other person in public law or person in private law performing public duties which is established and used for the performance of functions provided in an Act, legislation issued on the basis thereof or an international agreement. In other words, a (legal) database (andmekogu) is a (technical) database (andmebaas) together with the necessary administrative and legal components
Chief and Authorized Processor An authorized processor is required to comply with the instructions of the chief processor in the processing of data and housing of the database, and shall ensure the security of the database The chief processor of a database shall organize the establishment and administration of the central technological environment of a database established for the performance of the tasks imposed on or delegated to a local government by the state. Chief and authorized processor may coincide or may not coincide. There might be several different authorized processors of one database, but only one chief processor
X-Road Project: Essence The data exchange layer of the State Information System (X-Road, X-tee) is a platform-independent, secure standard interface that connects the databases and information systems of the public sector. Technically it consists of the X-Road central system and a TLS-based secure data exchange protocol. It can be considered a special case of a VPN structure that is controlled and managed by the state
Implementation of Personal Data Protection Act The following are excluded from the scope of Act: • processing of personal data by natural persons for personal purposes • transmission of personal data through the Estonian territory without any other processing of such data in Estonia The Act applies to criminal proceedings and court procedure with the specifications provided by procedural law
Essence of Personal Data Personal data (isikuandmed) are any data concerning an identified or identifiable natural person, regardless of the form or format in which such data exist. In other words, any data about a person are personal data whenever they allow the person to be identified uniquely
Processing of Personal Data Processing of personal data (isikuandmete töötlemine) is any act performed with personal data, including the collection, recording, organisation, storage, alteration, disclosure, granting access to personal data, consultation and retrieval, use of personal data, communication, cross-usage, combination, closure, erasure or destruction of personal data or several of the aforementioned operations, regardless of the manner in which the operations are carried out or the means used NB! Take into account that processing is not only the changing of data!
Classification of Personal Data The Estonian Personal Data Protection Act divides all personal data into two main categories with different protection conditions: • sensitive personal data (delikaatsed isikuandmed) • other (ordinary) personal data The 2nd version of the Act also defined private personal data (eraelulised isikuandmed) as an additional class; the current version no longer does
Principles of Processing Personal Data, I 1. Principle of legality - personal data shall be collected only in an honest and legal manner 2. Principle of purposefulness - personal data shall be collected only for the achievement of determined and lawful objectives, and they shall not be processed in a manner not conforming to the objectives of data processing 3. Principle of minimalism - personal data shall be collected only to the extent necessary for the achievement of determined purposes
Principles of Processing Personal Data, II 4. Principle of restricted use - personal data shall be used for other purposes only with the consent of the data subject or with the permission of a competent authority 5. Principle of data quality - personal data shall be up-to-date, complete and necessary for the achievement of the purpose of data processing
Principles of Processing Personal Data, III 6. Principle of security - security measures shall be applied in order to protect personal data from involuntary or unauthorised processing, disclosure or destruction 7. Principle of individual participation - the data subject shall be notified of data collected concerning him or her, the data subject shall be granted access to the data concerning him or her and the data subject has the right to demand the correction of inaccurate or misleading data
Processor(s) of Personal Data A processor (chief processor, vastutav töötleja) of personal data is a natural or legal person, a branch of a foreign company or a state or local government agency who processes personal data or on whose assignment personal data are processed • A processor of personal data shall determine the purposes of processing of personal data, the categories of personal data to be processed, the procedure for and manner of processing personal data, and permission for communication of personal data to third persons • A processor of personal data (hereinafter chief processor) may authorise, by an administrative act or contract, another person or agency (hereinafter authorized processor, volitatud töötleja) to process personal data, unless otherwise prescribed by an Act or regulation
Permission for Processing Personal Data General rule: personal data can be processed only with the consent of a data subject An administrative authority shall process personal data only in the course of performance of public duties in order to perform obligations prescribed by law, an international agreement or directly applicable legislation of the Council of the European Union or the European Commission
Security Demands to Processing Environment A processor of personal data is required to take organisational, physical and information technology security measures (safeguards) to protect personal data: • against accidental or intentional unauthorised alteration of the data, in the part of the integrity of data • against accidental or intentional destruction and prevention of access to the data by entitled persons, in the part of the availability of data • against unauthorised processing, in the part of confidentiality of the data
Personal Data versus ISKE: Confidentiality The Personal Data Protection Act determines the set of persons who can access (process) personal data. Compared with the security class and subclass definitions of ISKE, this corresponds to confidentiality subclass S2
Personal Data versus ISKE: Integrity The Personal Data Protection Act states that “personal data shall be up-to-date, complete and necessary for the achievement of the purpose of data processing” and requires processors to “prevent unauthorised recording, alteration and deleting of personal data and to ensure that it be subsequently possible to determine when, by whom and which personal data were recorded, altered or deleted or when, by whom and which data were accessed in the data processing system”. This corresponds to the definition of ISKE integrity subclass T2
Necessity to Hide Semantics Demand from practice: sometimes it’s necessary to hide some information in situations where it’s impossible to hide the presence of (some) data itself. A typical example: possessing or transmitting some type of information is prohibited, and all (stored or transferred) data can be wiretapped (eavesdropped). Sometimes only a special type (special format) of information may be possessed or transferred, and no other formats or types
The Limits of Cryptography Encryption doesn’t always solve the hiding problem completely: the result is typically a random-looking sequence (white noise), which clearly indicates that something is hidden (and encrypted). Cryptography successfully hides the information but doesn’t hide the fact that something is hidden. Solution: together with cryptography we should also use other techniques that hide the fact of hiding itself
Principles of Steganography Steganography (steganograafia) is a technique of hiding (more accurately, a technique of hiding the fact of hiding itself). The term comes from Greek and means “hidden (covered) writing”. Unlike cryptography, steganography considers data not only as a bitstream but as a bitstream having some (semantic) meaning. Steganography usually complements cryptography: in many cases the two techniques are used together
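The "white noise" observation can be illustrated with a rough entropy measurement. The following is a minimal sketch, not part of the original lecture: `byte_entropy` is a hypothetical helper name, the plaintext is made up, and `os.urandom` is used only as a stand-in for real ciphertext, on the assumption that the output of a good cipher is statistically indistinguishable from random bytes.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0..8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up plaintext, repeated so that both inputs have the same length.
plain = b"Meet me at the harbour at midnight. " * 200

# Assumption: random bytes stand in for the output of a real cipher.
cipher_like = os.urandom(len(plain))

print(round(byte_entropy(plain), 2))        # clearly below 8 bits per byte
print(round(byte_entropy(cipher_like), 2))  # close to 8 bits per byte
```

An eavesdropper who sees a near-maximal entropy value in a place where ordinary text or pictures are expected has good reason to suspect encrypted content, which is exactly the problem steganography tries to address.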
Two Main Methods of Steganography 1. Hiding one data set (file) inside another data set (file). This is the classical use of steganography: we hide the presence of the hidden information 2. Adding an irremovable digital watermark to a file. This is a very important technique for copyright purposes: the (often invisible) watermark, which points to the copyright owner, cannot be removed without losing the utilitarian value of the data
Hiding the Semantics: Main Principles Main principle: one data set (file) is hidden inside another data set (file) by some systematic and reversible method • The data set (file) to be hidden is called a message (sõnum) • The data set (file) in which the message is hidden is called a container or container file (konteiner) The aim is to hide not the content of the message but the very existence of a hidden message
Demands to a Container A container is usually a (huge) file with certain (allowed) semantic content Additionally, the container can also be: • printed material • a broadcast • an analog (voice or video) recording etc. Baseline: certain kinds of minor changes inside a file (container) usually leave the semantic content intact and don’t hinder the use of the file (container) for its main purposes
Demands to a Message A message is typically treated as a given bitstream; its format and semantic meaning play no role here • The message is often encrypted before it is put into the container: as encrypted data is indistinguishable from white noise, the fact of hiding can be concealed from anyone who doesn’t have the decryption key • The container/message length ratio depends mainly on the container file format. Typically the container is some hundreds or thousands of times longer than the message
Picture Container Baseline: a human usually doesn’t perceive ultra-fine geometric details, dimensions, white balance and similar differences Possible solutions for use as a steganography container: • changing the lower bits of R, G and B for uncompressed RGB pictures (TIFF, RAW) • Palette-based RGB pictures (GIF, BMP): the previous technique + palette manipulation • Lossy compressed pictures (JPEG): using patterns and white balances that remain intact during the compression process • A combination of different methods + the use of spectra, Fourier transforms and stochastic pattern blocks (currently the most widespread techniques)
Voice Container Baseline: a human doesn’t perceive small changes in the pitch, timbre and/or duration of sounds Possible solutions for use as a steganography container: • Changing the lower bits of uncompressed voice (CD, WAV, AU etc.) • Additional problem: only weak signals that are neighbored by strong signals can be used. Solution: more complex algorithms for changing the lower bits • Lossy compressed voice (MPEG Layer 3, i.e. MP3, WMA etc.): more complex methods whose changes pass through all filters unchanged (based on frequency analysis and similar methods)
Video Container Baseline: a human doesn’t perceive the technical details of frame changes Possible solutions for use as a steganography container: • Uncompressed video: all techniques for pictures and voice + details of combining sequential frames • Lossy compressed video (MPEG, AVI, RM): all techniques for lossy compressed pictures and voice + details of combining sequential frames
Text Container Baseline: in most cases formatted text with layout is used; using plain text is very difficult Possible solutions for use as a steganography container: • Manipulation of blanks, line spacing and other unnoticeable parameters • Semantic methods (for example, alternating between synonyms, such as “ja” and “ning” in Estonian)
Code Container Baseline: the instruction sets of the most widespread processors usually allow the same functionality to be implemented in different ways Main method: modification of the code that leaves its functional behaviour intact Simple practical solutions: • adding NOPs (empty instructions) at arbitrary places • replacing one loop type with another • adding unreachable (random) code
Comparison of Container Types, I • Picture container: the shortcoming is the complexity of the methods (lossy compression is typical), the advantage is wide use in practice • Uncompressed voice (for example CD) is an ideal steganographic container • Compressed voice: the shortcoming is the complexity of the methods, the advantages are very wide use and the huge amount of data available
Comparison of Container Types, II • Video container has the same shortcomings as picture containers; a great advantage is the huge amount of data, which makes it possible to hide long messages • Text container is generally quite a bad container: the container must be extremely large relative to the message • Code container has the great advantage that contemporary programs are big; the shortcoming is a high risk of detection (the code can easily be compared with the genuine software)
Hiding Conditions Main condition: when we hide an encrypted message, it must remain unnoticeable without the decryption key Reality: a lot of simple methods don’t satisfy this condition However, most complex and sophisticated methods do satisfy it
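The first technique in the list (changing the lower bits of R, G and B) can be sketched as follows. This is a simplified illustration under stated assumptions, not a tool from the lecture: the function names `embed_lsb` and `extract_lsb` are hypothetical, the "picture" is just a made-up sequence of raw RGB bytes with no file header, the message is written sequentially from the first byte, and the receiver is assumed to know the message length.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytes:
    """Write the message bits into the lowest bit of successive pixel bytes."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("container too small for message")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return bytes(out)


def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back from the lowest bits (most significant bit first)."""
    out = bytearray()
    for k in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[8 * k + i] & 1)
        out.append(value)
    return bytes(out)


# Made-up container: 64 RGB pixels, i.e. 192 raw colour bytes.
container = bytes((37 * i) % 256 for i in range(3 * 64))
stego = embed_lsb(container, b"hi")
print(extract_lsb(stego, 2))
```

Each colour value changes by at most 1, which is far below what the eye can perceive. With one bit per colour byte this scheme needs 8 container bytes per message byte. Sequential, unencrypted embedding like this is also one of the "simple methods" that statistical tests can detect, so it serves only to show the principle.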
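The idea of changing lower bits only where the change is masked by a stronger signal can be sketched very roughly. The example below is an assumption-laden simplification: it uses a plain amplitude threshold instead of a real psychoacoustic model, the names `loud`, `embed_audio` and `extract_audio` are hypothetical, and the "recording" is a synthetic sine wave of 16-bit sample values.

```python
import math


def loud(sample: int, threshold: int) -> bool:
    # Decide from the upper bits only, so that changing the lowest bit
    # never changes which samples are selected.
    return abs(sample >> 1) >= threshold


def embed_audio(samples, bits, threshold=500):
    """Store one bit in the lowest bit of each sufficiently loud sample."""
    out = list(samples)
    slots = [i for i, s in enumerate(out) if loud(s, threshold)]
    if len(bits) > len(slots):
        raise ValueError("not enough loud samples")
    for i, bit in zip(slots, bits):
        out[i] = (out[i] & ~1) | bit
    return out


def extract_audio(samples, n_bits, threshold=500):
    """Read the bits back from the same loud samples."""
    slots = [i for i, s in enumerate(samples) if loud(s, threshold)]
    return [samples[i] & 1 for i in slots[:n_bits]]


# Made-up signal: 200 samples of a sine wave with amplitude 8000.
samples = [int(8000 * math.sin(i / 5)) for i in range(200)]
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_audio(samples, bits)
print(extract_audio(stego, len(bits)))
```

Quiet samples are left untouched, so the changes occur only where the signal is strong enough to mask them. Real methods select positions with frequency analysis rather than a fixed threshold.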
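The synonym method can be sketched with the “ja”/“ning” pair mentioned above. This is a toy illustration only: the function names, the bit assignment (0 for “ja”, 1 for “ning”) and the Estonian cover sentence are assumptions made for the example, and the receiver is assumed to know how many bits were hidden.

```python
import re

SYNONYMS = ("ja", "ning")  # assumed mapping: bit 0 -> "ja", bit 1 -> "ning"
SLOT = re.compile(r"\b(?:ja|ning)\b")


def embed_text(cover: str, bits) -> str:
    """Replace successive conjunctions with the synonym that encodes each bit."""
    if len(bits) > len(SLOT.findall(cover)):
        raise ValueError("cover text has too few conjunctions")
    remaining = iter(bits)

    def choose(match):
        bit = next(remaining, None)
        # Conjunctions after the last message bit are left as they were.
        return match.group(0) if bit is None else SYNONYMS[bit]

    return SLOT.sub(choose, cover)


def extract_text(stego: str, n_bits: int):
    """Read the bits back from the first n_bits conjunctions."""
    words = SLOT.findall(stego)
    return [SYNONYMS.index(w) for w in words[:n_bits]]


cover = "Ostsin piima ja leiba ja võid ja juustu."
stego = embed_text(cover, [1, 0, 1])
print(stego)
print(extract_text(stego, 3))
```

One sentence carries only three bits here, which shows why a text container must be very large compared with the message.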
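The NOP technique can be sketched on a toy instruction listing. The sketch below makes several assumptions that do not come from the lecture: the program is a made-up list of assembly-like strings, the original code contains no NOPs of its own, one bit is stored per original instruction (a NOP after it means 1, no NOP means 0), and the names `embed_nops` and `extract_nops` are hypothetical.

```python
def embed_nops(instructions, bits):
    """Insert a NOP after the i-th instruction when the i-th bit is 1."""
    if len(bits) > len(instructions):
        raise ValueError("program too short for message")
    out = []
    for i, ins in enumerate(instructions):
        out.append(ins)
        if i < len(bits) and bits[i] == 1:
            out.append("nop")
    return out


def extract_nops(instructions, n_bits):
    """Recover the bits by checking which instructions are followed by a NOP."""
    bits = []
    i = 0
    while i < len(instructions) and len(bits) < n_bits:
        followed = i + 1 < len(instructions) and instructions[i + 1] == "nop"
        bits.append(1 if followed else 0)
        i += 2 if followed else 1
    return bits


# Made-up program; assumed to contain no NOPs before embedding.
program = ["mov eax, 1", "add eax, 2", "mov ebx, eax", "ret"]
stego = embed_nops(program, [1, 0, 1])
print(stego)
print(extract_nops(stego, 3))
```

Removing the NOPs gives back the original listing, so the functional behaviour is unchanged. It also shows the weakness of this container type: anyone who has the genuine program can find the differences by a simple comparison.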
An Example of Hiding Naissaar Mine Factory photo (NATO intelligence satellite, 1984, right); Naissaar harbour photo (same source, left). Both are 30 KB long
An Example of Hiding (cont.) Here both pictures are inserted into the Tallinn St. Nicholas Church photo (307 KB JPEG, original left). After the messages were inserted, the Fourier transforms have reduced the quality of the church photo (right, JPEG)
Digital Watermark A digital watermark (digivesimärk) is a security-oriented pattern inserted into a data set that makes it possible to identify the owner/user of the data or similar information (usually data related to copyright permissions) Digital watermarks can be • visible (nähtav), i.e. known to the user/owner of the data set • invisible (nähtamatu), i.e. unknown to and usually undetectable by the user/owner of the data set Often, a visible watermark can point to an invisible watermark
Watermarkable Documents Invisible watermarks can be added to: • pictures • voice • videos • broadcasts and analog records • codes (programs) Adding watermarks to text with layout (PS, PDF, RTF) is a little more difficult but still possible The main aim of a digital watermark is copyright protection
Typical Demands to a Suitable Watermark Algorithm • A watermark must not be removable without essentially harming the original content • A watermark must remain intact during typical transformations (picture zooming/cropping, voice mixing etc.) • An invisible watermark must remain undetectable without the corresponding decryption key • Authorized persons/subjects must be able to easily detect both the presence and the content of a watermark (usually by having the appropriate decryption key) • The cost of the watermarking technique must be acceptable
Watermarking - Practical Solutions • There are a lot of solutions that are practically usable, i.e. that satisfy the above-mentioned conditions • A lot of algorithms/solutions have been broken, and more are being broken all the time Before actually using a certain watermarking technique, it’s obligatory to make sure that the method has not been broken and doesn’t have essential weaknesses
Forecast: a Rapid Development of Watermark Techniques Reason: there’s a real need to protect digital products against uncontrolled and unauthorized copying and distribution Example: we can personalize all software products so that only authorized subjects (who have the corresponding key) can check the watermark containing the licence information
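The demand that a watermark be detectable only with the corresponding key can be illustrated with a keyed pseudo-random pattern. The lecture does not name a specific algorithm. The following is a minimal sketch in the style of spread-spectrum watermarking, and every name and number in it is an assumption: `pattern`, `add_watermark`, `correlation` and `detect` are hypothetical helpers, the key is a plain integer used as a seed, the host signal is synthetic, and Python's `random` module is used only for illustration (it is not a cryptographically secure generator).

```python
import math
import random


def pattern(key: int, n: int):
    """Keyed pseudo-random sequence of -1/+1 values."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def add_watermark(samples, key: int, strength: int = 4):
    """Add the keyed pattern, scaled by strength, to the host samples."""
    return [s + strength * p for s, p in zip(samples, pattern(key, len(samples)))]


def correlation(samples, key: int) -> float:
    """Average product of the mean-removed samples and the keyed pattern."""
    mean = sum(samples) / len(samples)
    pat = pattern(key, len(samples))
    return sum((s - mean) * p for s, p in zip(samples, pat)) / len(samples)


def detect(samples, key: int, strength: int = 4) -> bool:
    """Report a watermark when the correlation exceeds half the strength."""
    return correlation(samples, key) > strength / 2


# Made-up host signal: 5000 smooth grey-level-like values around 128.
host = [128 + int(20 * math.sin(i / 7)) for i in range(5000)]
marked = add_watermark(host, key=2015)

print(detect(marked, key=2015))  # holder of the right key
print(detect(marked, key=9999))  # someone guessing a wrong key
print(detect(host, key=2015))    # unmarked original
```

With the right key the correlation is close to the embedding strength, while for a wrong key or an unmarked signal it stays near zero. The sketch does not show robustness against cropping, scaling or mixing, which practical algorithms must also provide.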