Object Serialization in Java
Or: The Persistence of Memory…
Originally from: http://www.cs.unm.edu/~terran/

So you want to save your data…
- Common problem:
  - You’ve built a large, complex object
    - Spam/Normal statistics tables
    - Game state
    - Database of student records
    - Etc…
  - Want to store it on disk and retrieve it later
  - Or: want to send it over the network to another Java process
- In general: want your objects to be persistent, i.e. to outlive the current Java process

Answer I: customized file formats
- Write a set of methods for saving/loading each instance of a class that you care about

    public class MyClass {
        public void saveYourself(Writer o) throws IOException { ... }
        public static MyClass loadYourself(Reader r) throws IOException { ... }
    }

Coolnesses of Approach 1:
- Can produce arbitrary file formats
- Know exactly what you want to store and get back/don’t store extraneous stuff
- Can build file formats to interface with other codes/programs
  - XML
  - Pure text file
  - Etc.
- If your classes are nicely hierarchical, makes saving/loading simple
  - What will happen with inheritance?

Make Things Saveable/Loadable

    public interface Saveable {
        public void saveYourself(Writer w) throws IOException;
        // should also have this
        //   public static Object loadYourself(Reader r)
        //       throws IOException;
        // but a Java interface can’t require implementing
        // classes to provide a static method
    }

Saving, cont’d

    public class MyClassA implements Saveable {
        public MyClassA(int arg) {
            // initialize private data members of A
        }
        public void saveYourself(Writer w) throws IOException {
            // write MyClassA identifier and private data on
            // stream w
        }
        public static MyClassA loadYourself(Reader r) throws IOException {
            // parse the MyClassA identifier and its value (data)
            // from the data stream r
            MyClassA tmp = new MyClassA(data);
            return tmp;
        }
    }

Saving, cont’d

    public class MyClassB implements Saveable {
        // constructor (no return type) taking the contained MyClassA
        public MyClassB(MyClassA stuff) { ... }
        private MyClassA _stuff;
        public void saveYourself(Writer w) throws IOException {
            // write ID for MyClassB
            _stuff.saveYourself(w);
            // write other private data for MyClassB
            w.flush();
        }
        public static MyClassB loadYourself(Reader r) throws IOException {
            // parse MyClassB ID from r
            MyClassA tmp = MyClassA.loadYourself(r);
            // parse other private data for MyClassB
            return new MyClassB(tmp);
        }
    }

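As a rough, runnable sketch of Approach 1, assume a toy text format with one item per line. The classes Inner and Outer are hypothetical stand-ins for MyClassA and MyClassB, and the format itself is an assumption, not something the slides prescribe. The loaders share one BufferedReader so that no buffered input is lost between the parent and the child.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class SaveLoadDemo {
    // Save to an in-memory string, then parse it back.
    static Outer roundTrip(Outer original) {
        try {
            StringWriter w = new StringWriter();
            original.saveYourself(w);
            return Outer.loadYourself(new BufferedReader(new StringReader(w.toString())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Outer copy = roundTrip(new Outer(new Inner(42), "demo"));
        System.out.println(copy.stuff().value() + " " + copy.label()); // prints "42 demo"
    }
}

// Hypothetical stand-in for MyClassA: a single int of private state.
class Inner {
    private final int value;
    Inner(int value) { this.value = value; }
    int value() { return value; }

    void saveYourself(Writer w) throws IOException {
        w.write("Inner\n" + value + "\n");   // class tag, then the data
    }

    static Inner loadYourself(BufferedReader r) throws IOException {
        if (!"Inner".equals(r.readLine())) throw new IOException("expected Inner tag");
        return new Inner(Integer.parseInt(r.readLine()));
    }
}

// Hypothetical stand-in for MyClassB: owns an Inner plus a label.
// The label must not contain a newline in this toy format.
class Outer {
    private final Inner stuff;
    private final String label;
    Outer(Inner stuff, String label) { this.stuff = stuff; this.label = label; }
    Inner stuff() { return stuff; }
    String label() { return label; }

    void saveYourself(Writer w) throws IOException {
        w.write("Outer\n");
        stuff.saveYourself(w);               // delegate to the contained object
        w.write(label + "\n");
        w.flush();
    }

    static Outer loadYourself(BufferedReader r) throws IOException {
        if (!"Outer".equals(r.readLine())) throw new IOException("expected Outer tag");
        Inner inner = Inner.loadYourself(r); // descend into the child first
        return new Outer(inner, r.readLine());
    }
}
```

Every class needs its own matching save/load pair, and the two must agree on field order; that bookkeeping is the cost of this approach.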
Painfulnesses of Approach 1:
- This is called recursive descent parsing
- Actually, there are plenty of places in the real world where it’s terribly useful.
- But... it’s also a pain (why?)
  - If all you want to do is store/retrieve data, do you really need to go to all of that effort?
- Fortunately, no. Java provides a shortcut that takes a lot of the work out.

Approach 2: Enter Serialization...
- Java provides the serialization mechanism for object persistence
- It essentially automates the grunt work for you
- Short form:

    public class MyClassA implements Serializable { ... }

    // in some other code elsewhere...
    MyClassA tmp = new MyClassA(arg);
    FileOutputStream fos = new FileOutputStream("some.obj");
    ObjectOutputStream out = new ObjectOutputStream(fos);
    out.writeObject(tmp);
    out.flush();
    out.close();

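The short form covers only the writing half. The sketch below adds the reading half with ObjectInputStream, using an in-memory byte array instead of a file so it runs anywhere. GridPoint and RoundTripDemo are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RoundTripDemo {
    // Serialize obj into a byte array.
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuild an object from bytes produced by toBytes.
    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Convenience wrapper that hides the checked exceptions.
    static Object roundTrip(Object obj) {
        try {
            return fromBytes(toBytes(obj));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        GridPoint copy = (GridPoint) roundTrip(new GridPoint(3, 4));
        System.out.println(copy.x + "," + copy.y); // prints "3,4"
    }
}

// Hypothetical example class; implementing Serializable is all it needs.
class GridPoint implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;
    GridPoint(int x, int y) { this.x = x; this.y = y; }
}
```

readObject returns Object, so the caller casts to the expected type; swapping the byte-array streams for FileOutputStream and FileInputStream gives the on-disk version.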
In a bit more detail...
- To (de-)serialize an object, its class must implement Serializable
  - All of its non-static, non-transient data members must also be serializable
  - And so on, recursively...
  - Primitive types (int, char, etc.) are all serializable automatically
  - So are Strings, most classes in java.util, etc.
- This saves/retrieves the entire object graph, including ensuring uniqueness of objects: an object referenced from several places is written once and restored as one shared object

The object graph and uniqueness
[Figure: object graph containing a MondoHashTable, a Vector, and the Entry objects "tyromancy" and "zygopleural"]

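A small sketch of what uniqueness means in practice: if two slots of a list refer to the same object before serialization, they still refer to one (new) object afterwards. WordEntry and SharedGraphDemo are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class SharedGraphDemo {
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if a doubly-referenced object is restored as a single shared copy.
    static boolean sharingSurvives() {
        WordEntry shared = new WordEntry("tyromancy");
        List<WordEntry> pair = new ArrayList<>();
        pair.add(shared);
        pair.add(shared);                       // same object, referenced twice
        List<?> copy = (List<?>) roundTrip(pair);
        boolean stillShared = copy.get(0) == copy.get(1);
        boolean isNewObject = copy.get(0) != shared;
        return stillShared && isNewObject;
    }

    public static void main(String[] args) {
        System.out.println(sharingSurvives()); // prints "true"
    }
}

// Hypothetical stand-in for the Entry objects in the figure.
class WordEntry implements Serializable {
    private static final long serialVersionUID = 1L;
    final String word;
    WordEntry(String word) { this.word = word; }
}
```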
Now some problems…
- static fields are not automatically serialized
  - Not possible to automatically serialize them because they’re owned by an entire class, not an object
- Options:
  - final static fields are automatically initialized (once) the first time a class is loaded
  - static fields initialized in the static {} block will be initialized the first time a class is loaded
  - But what about other static fields?

When default serialization isn’t enough
- Java allows a class to define private writeObject() and readObject() methods to customize output
- If a class provides these methods, the serialization/deserialization mechanism calls them instead of doing the default thing

writeObject() in action

    public class DemoClass implements Serializable {
        private int _dat = 3;
        private static int _sdat = 2;
        private void writeObject(ObjectOutputStream o) throws IOException {
            o.writeInt(_dat);
            o.writeInt(_sdat);
        }
        private void readObject(ObjectInputStream i)
                throws IOException, ClassNotFoundException {
            _dat = i.readInt();
            _sdat = i.readInt();
        }
    }

Things that you don’t want to save
- Sometimes, you want to explicitly not store some non-static data
  - Computed vals that are cached simply for convenience/speed
  - Passwords or other “secret” data that shouldn’t be written to disk
- Java provides the “transient” keyword. transient foo means don’t save foo

    public class MyClass implements Serializable {
        private int _primaryVal = 3;  // is serialized
        private transient int _cachedVal = _primaryVal * 2;
        // _cachedVal is not serialized
    }

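One consequence worth seeing in code: deserialization does not rerun the field initializers of a Serializable class, so a transient int comes back as 0 unless readObject recomputes it. The sketch below contrasts the two cases; PlainCache, RebuiltCache, and TransientDemo are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PlainCache plain = (PlainCache) roundTrip(new PlainCache());
        RebuiltCache rebuilt = (RebuiltCache) roundTrip(new RebuiltCache());
        System.out.println(plain.cachedVal() + " " + rebuilt.cachedVal()); // prints "0 6"
    }
}

// Transient cache with no custom hook: the cache is lost on deserialization.
class PlainCache implements Serializable {
    private static final long serialVersionUID = 1L;
    private int primaryVal = 3;
    private transient int cachedVal = primaryVal * 2;
    int cachedVal() { return cachedVal; }
}

// Same fields, but readObject rebuilds the cache from the saved state.
class RebuiltCache implements Serializable {
    private static final long serialVersionUID = 1L;
    private int primaryVal = 3;
    private transient int cachedVal = primaryVal * 2;
    int cachedVal() { return cachedVal; }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();        // restores primaryVal
        cachedVal = primaryVal * 2;    // recompute what was not stored
    }
}
```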
Issue: #0 -- non-Serializable fields
- What happens if class Foo has a field of type Bar, but Bar isn’t serializable?
- If you just do this:

    Foo tmp = new Foo();
    ObjectOutputStream out = new ObjectOutputStream(fos);  // fos: some OutputStream
    out.writeObject(tmp);

- You get a NotSerializableException
- Answer: mark the field transient and use readObject/writeObject to explicitly serialize the parts that can’t be handled otherwise
- Need some way to get/set the necessary state

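The following sketch shows both the failure and one possible fix, assuming the non-serializable class exposes enough of its state to rebuild it. PlainBar, BrokenFoo, and FixedFoo are hypothetical stand-ins for Bar and Foo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NonSerializableFieldDemo {
    // True if writing obj fails with NotSerializableException.
    static boolean failsToSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return false;
        } catch (NotSerializableException e) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(failsToSerialize(new BrokenFoo()));                      // prints "true"
        System.out.println(((FixedFoo) roundTrip(new FixedFoo(7))).bar().level());  // prints "7"
    }
}

// Stand-in for Bar: deliberately not Serializable.
class PlainBar {
    private final int level;
    PlainBar(int level) { this.level = level; }
    int level() { return level; }
}

// Holds a PlainBar directly, so default serialization fails.
class BrokenFoo implements Serializable {
    private static final long serialVersionUID = 1L;
    private final PlainBar bar = new PlainBar(1);
}

// Marks the field transient and saves/restores its state by hand.
class FixedFoo implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient PlainBar bar;
    FixedFoo(int level) { this.bar = new PlainBar(level); }
    PlainBar bar() { return bar; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(bar.level());          // just enough state to rebuild bar
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        bar = new PlainBar(in.readInt());
    }
}
```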
Issue: #0.5 -- non-Ser. superclasses
- Suppose
  - class Foo extends Bar implements Serializable
  - But Bar itself isn’t serializable
- What happens?
[Figure: class hierarchy with Bar (not serializable) as the parent of Foo (serializable)]

Non-Serializable superclasses, cont’d
- Bar must provide a no-arg constructor
- Foo must use readObject/writeObject to take care of Bar’s private data
- Java helps a bit with defaultReadObject and defaultWriteObject
- Order of operations (for deserialization)
  - Java creates a new Foo object
  - Java calls Bar’s no-arg constructor
  - Java calls Foo’s readObject
    - Foo’s readObject explicitly reads Bar’s state data
    - Foo reads its own data
    - Foo reads its children’s data

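A minimal sketch of this arrangement, assuming Bar exposes a getter and setter for its state. BaseBar and DerivedFoo are hypothetical stand-ins for Bar and Foo. In this sketch the default fields are written first and the superclass state after them, and readObject reads in the same order that writeObject wrote.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SuperclassDemo {
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DerivedFoo copy = (DerivedFoo) roundTrip(new DerivedFoo(7, 9));
        System.out.println(copy.getBarState() + " " + copy.fooState()); // prints "7 9"
    }
}

// Stand-in for Bar: not Serializable, but has a no-arg constructor.
class BaseBar {
    private int barState;
    BaseBar() { barState = -1; }              // run by Java during deserialization
    BaseBar(int barState) { this.barState = barState; }
    int getBarState() { return barState; }
    void setBarState(int barState) { this.barState = barState; }
}

// Stand-in for Foo: saves its own fields by default and Bar's by hand.
class DerivedFoo extends BaseBar implements Serializable {
    private static final long serialVersionUID = 1L;
    private int fooState;
    DerivedFoo(int barState, int fooState) { super(barState); this.fooState = fooState; }
    int fooState() { return fooState; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();             // Foo's own fields
        out.writeInt(getBarState());          // Bar's state, written explicitly
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        setBarState(in.readInt());            // overwrite the -1 set by BaseBar()
    }
}
```

Without the custom methods, the copy would come back with barState equal to -1, the value assigned by the no-arg constructor.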
With a non-serializable parent
- Class ZipFile does not implement Serializable, and it does not have a no-arg constructor

    public class ZipFile implements java.util.zip.ZipConstants
    public ZipFile(String filename) throws IOException
    public ZipFile(File file) throws ZipException, IOException

- What can we do?
  - Can anyone answer me?

Issue: #1 -- Efficiency
- For your MondoHashTable, you can just serialize/deserialize it with the default methods
- But that’s not necessarily efficient, and may even be wrong
- By default, Java will store the entire internal _table, including all of its null entries!
- Now you’re wasting space/time to load/save all those empty cells
- Plus, the hashCode()s of the keys may not be the same after deserialization -- should explicitly rehash them to check.
  - hashCode() is defined in java.lang.Object
  - Address is usually used in the default implementation

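One way to address both points, sketched under the assumption of a simple open-addressing set of strings: mark the backing array transient, write only the occupied cells, and re-insert (and thereby rehash) every key on load. SparseTable is a hypothetical stand-in for MondoHashTable, not its actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SparseTableDemo {
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SparseTable t = new SparseTable();
        t.put("tyromancy");
        t.put("zygopleural");
        SparseTable copy = (SparseTable) roundTrip(t);
        System.out.println(copy.contains("tyromancy") + " " + copy.contains("absent"));
        // prints "true false"
    }
}

// Fixed-capacity set of strings using linear probing (illustration only).
class SparseTable implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient String[] table = new String[1024]; // mostly null cells
    private transient int count = 0;

    void put(String key) {
        if (count == table.length) throw new IllegalStateException("table full");
        int i = Math.floorMod(key.hashCode(), table.length);
        while (table[i] != null) {
            if (table[i].equals(key)) return;  // already present
            i = (i + 1) % table.length;
        }
        table[i] = key;
        count++;
    }

    boolean contains(String key) {
        int i = Math.floorMod(key.hashCode(), table.length);
        for (int probes = 0; probes < table.length && table[i] != null; probes++) {
            if (table[i].equals(key)) return true;
            i = (i + 1) % table.length;
        }
        return false;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(table.length);            // capacity
        out.writeInt(count);                   // number of real entries
        for (String s : table) {
            if (s != null) out.writeObject(s); // skip the empty cells
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        table = new String[in.readInt()];
        int n = in.readInt();
        count = 0;
        for (int k = 0; k < n; k++) {
            put((String) in.readObject());     // rehash each key on load
        }
    }
}
```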
Issue: #2 -- Backward compatibility
- Suppose that you have two versions of class Foo: Foo v. 1.0 and Foo v. 1.1
- The public and protected members of 1.0 and 1.1 are the same; the semantics of both are the same
- So Foo 1.0 and 1.1 should behave the same and be interchangeable
- BUT... the private fields and implementation of 1.0 and 1.1 are different
- What happens if you serialize with a 1.0 object and deserialize with a 1.1? Or vice versa?

Backward compat, cont’d.
- Issue is that in code, only changes to the public or protected members matter
- With serialization, all of a sudden, the private data members (and methods) count too
  - The serialization machinery reads and writes private fields directly, which ordinary code outside the class cannot do
  - This is a kind of privilege
- Have to be very careful to not muck up internals in a way that’s inconsistent with previous versions
  - E.g., changing the meaning, but not the name, of some data field

Backward compat, cont’d
- Example:

    // version 1.0
    public class MyClass {
        MyClass(int arg) { _dat = arg * 2; }
        private int _dat;
    }

    // version 1.1
    public class MyClass {
        MyClass(int arg) { _dat = arg * 3; }  // NO-NO!
        private int _dat;
    }

Backward compat, cont’d:
- Java helps as much as it can
- Java tracks a “version number” of a class that changes when the class changes “substantially”
  - Fields changed to/from static or transient
  - Field or method names changed
  - Data types change
  - Class moves up or down in the class hierarchy
- Trying to deserialize a class of a different version than the one currently in memory throws InvalidClassException

Yet more on backward compat
- Java’s version number comes from the names of all data and method members of a class
- If they don’t change, the version number won’t change
- If you want Java to detect that something about your class has changed, change a name
- But, if all you’ve done is change names (or refactor functionality), you want to be able to tell Java that nothing has changed
- Can lie to Java about the version number:

    static final long serialVersionUID = 3530053329164698194L;

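To see which version number Java associates with a class, one option is ObjectStreamClass.lookup from java.io. The sketch below compares a class that pins its serialVersionUID with one that lets Java compute it. Pinned, Computed, and VersionDemo are hypothetical names, and the computed value is not shown because it depends on the class definition.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class VersionDemo {
    // Returns the version number Java will write for a Serializable class.
    static long uidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Pinned:   " + uidOf(Pinned.class));
        System.out.println("Computed: " + uidOf(Computed.class));
    }
}

// Declares the version explicitly, so edits to the class leave it unchanged.
class Pinned implements Serializable {
    private static final long serialVersionUID = 3530053329164698194L;
    private int dat;
}

// No declaration: Java derives the number from the class definition.
class Computed implements Serializable {
    private int dat;
}
```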
The detailed list of compatibility rules
- You have to check the following rules
  - http://java.sun.com/javase/6/docs/platform/serialization/spec/version.html
- One of the key ideas is that
  - When restoring an object, new things are allowed, and old things should be kept

Issue: #3 -- The Singleton pattern
- When you are restoring a Singleton object, you need to check whether there is an existing singleton object in the system
  - This is logical correctness, and you need to check and guarantee it by yourself!

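One common way to enforce this check, offered here as a sketch rather than the only option, is the readResolve hook: if a Serializable class defines it, the serialization mechanism calls it after reading and uses the returned object in place of the fresh copy. Registry and SingletonDemo are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SingletonDemo {
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Registry.INSTANCE) == Registry.INSTANCE); // prints "true"
    }
}

// A serializable singleton that refuses to be duplicated by deserialization.
class Registry implements Serializable {
    private static final long serialVersionUID = 1L;
    static final Registry INSTANCE = new Registry();
    private Registry() { }

    // Called after the object is read; the returned object replaces
    // the newly deserialized one, so callers always get INSTANCE.
    private Object readResolve() {
        return INSTANCE;
    }
}
```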
Default Write/Read Object
- Sometimes, we want to add some additional information
- For example

    public class NetworkWindow implements Serializable {
        private Socket theSocket;
        // and many other fields and methods
    }

Recover the state

    public class NetworkWindow implements Serializable {
        private transient Socket theSocket;
        // and many other fields and methods

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeObject(theSocket.getInetAddress());
            out.writeInt(theSocket.getPort());
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            InetAddress ia = (InetAddress) in.readObject();
            int thePort = in.readInt();
            this.theSocket = new Socket(ia, thePort);
        }
    }

Preventing Serialization
- Sometimes you don’t want objects of your class to be serialized, but your parent implements Serializable…
- You can define your own private writeObject and readObject, and throw exceptions
  - throw new NotSerializableException();

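A short sketch of this technique, with hypothetical class names SerializableParent and SealedChild: the child inherits Serializable from its parent but rejects every attempt to write or read it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class PreventDemo {
    // True if writing obj is rejected with NotSerializableException.
    static boolean refusesSerialization(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return false;
        } catch (NotSerializableException e) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(refusesSerialization(new SerializableParent())); // prints "false"
        System.out.println(refusesSerialization(new SealedChild()));        // prints "true"
    }
}

// Parent that opts in to serialization.
class SerializableParent implements Serializable {
    private static final long serialVersionUID = 1L;
    protected int shared = 1;
}

// Child that inherits Serializable but blocks it for its own instances.
class SealedChild extends SerializableParent {
    private static final long serialVersionUID = 1L;

    private void writeObject(ObjectOutputStream out) throws IOException {
        throw new NotSerializableException("SealedChild");
    }

    private void readObject(ObjectInputStream in) throws IOException {
        throw new NotSerializableException("SealedChild");
    }
}
```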
Summary
- Make things sequential, and so writable
  - Serialization is difficult and technical; you need to be aware of the whole class hierarchy that you are going to serialize
- You can define your own serialization process
- You can add additional information when serializing
- You can prevent an instance from being serialized