Programming for Geographical Information Analysis: Core Skills Lecture 7: Core Packages: File Input/Output

This lecture Files Text files Binary files

Files and file types
Dealing with files starts with encapsulating the idea of a file in an object.

File locations
Captured in two classes:
    java.io.File: encapsulates a file on a drive.
    java.net.URL: encapsulates a Uniform Resource Locator (URL), which could include internet addresses.

java.io.File
Before we can read or write files we need to capture them. The File class represents an external file.
    File(String pathname);
    File f = new File("e:/myFile.txt");
However, we must remember that different OSs have different file systems. Note the use of a forward slash: Java copes with most separator differences, but a drive letter such as "e:" won't work on *NIX, Mac, mobile devices, etc.
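One way to avoid hardwiring a drive letter is to build the path from parts. The following is a minimal sketch, not part of the original slides: the class name PortablePath and the folder name "data" are assumptions made for illustration, and it relies only on the standard File(File parent, String child) constructor and the "user.home" system property.

```java
import java.io.File;

public class PortablePath {
    // Builds <user home>/<folder>/<fileName> without naming a drive
    // or choosing a separator character by hand.
    static File inUserHome(String folder, String fileName) {
        File home = new File(System.getProperty("user.home"));
        File dir = new File(home, folder);   // parent + child constructor
        return new File(dir, fileName);
    }

    public static void main(String[] args) {
        File f = inUserHome("data", "myFile.txt");
        System.out.println(f.getPath());
        System.out.println("Separator on this OS: " + File.separator);
    }
}
```

The printed path differs between operating systems, which is the point: the same code runs unchanged on each of them.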

Getting file locations
java.awt.FileDialog opens an "Open file" box with a directory tree in it. This stays open until the user chooses a file or cancels. Once a file is chosen, use FileDialog's getDirectory() and getFile() methods to get the directory and filename.

Getting file locations
    import java.awt.*;
    import java.io.*;

    FileDialog fd = new FileDialog(new Frame());
    fd.setVisible(true);
    File f = null;
    // Both values are needed to build the path, so require both.
    if ((fd.getDirectory() != null) && (fd.getFile() != null)) {
        f = new File(fd.getDirectory() + fd.getFile());
    }

The application directory
Each object has a java.lang.Class object associated with it. This represents the class loaded into the JVM. One use is to get resources local to the class, i.e. in the same directory as the .class file. We use a java.net.URL object to do this.
    Class<?> thisClass = getClass();
    URL url = thisClass.getResource("myFile.txt");
We can then use URL's getPath() to return the file path as a String for the File constructor.

Useful File methods
exists(), canRead() and canWrite(): test whether the file exists and can be read or written to.
createNewFile() and createTempFile(): create a new file, and create a new file in "temp" or "tmp".
delete() and deleteOnExit(): delete the file (if permissions are correct), or delete it when the JVM shuts down.
isDirectory() and listFiles(): check whether the File is a directory, and return an array of Files representing the files in the directory. A FilenameFilter object can be used to limit the returned Files.
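The methods listed above can be tried together in a few lines. This is a hedged sketch rather than course code: the class name FileMethodsDemo and the prefix "gisdemo" are made up for the example, and the filter is written as a lambda, which assumes Java 8 or later.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;
import java.io.UncheckedIOException;

public class FileMethodsDemo {
    // Hypothetical prefix used to recognise this demo's own temp files.
    static final String PREFIX = "gisdemo";

    // Creates a file in the system temp directory and marks it for
    // deletion when the JVM shuts down.
    static File makeTempFile() {
        try {
            File temp = File.createTempFile(PREFIX, ".txt");
            temp.deleteOnExit();
            return temp;
        } catch (IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
    }

    // Lists only the files in dir that match the demo prefix and suffix.
    static File[] listDemoFiles(File dir) {
        FilenameFilter filter =
                (d, name) -> name.startsWith(PREFIX) && name.endsWith(".txt");
        File[] found = dir.listFiles(filter);
        return (found == null) ? new File[0] : found; // null if dir is unreadable
    }

    public static void main(String[] args) {
        File temp = makeTempFile();
        System.out.println(temp.exists() + " " + temp.canRead() + " " + temp.canWrite());
        File dir = temp.getParentFile();
        System.out.println(dir.isDirectory() + ", matching files: " + listDemoFiles(dir).length);
        System.out.println("deleted: " + temp.delete());
    }
}
```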

Files and file types
As we'll see, the type of the file has a big effect on how we handle it.

Binary vs. Text files
All files are really just binary 0 and 1 bits (8 bits = 1 byte). In 'binary' files, data is stored in binary representations of the primitive types:
    00000000 00000000 00000000 00000000 = int 0
    00000000 00000000 00000000 00000001 = int 1
    00000000 00000000 00000000 00000010 = int 2
    00000000 00000000 00000000 00000100 = int 4
    00000000 00000000 00000000 00110001 = int 49
    00000000 00000000 00000000 01000001 = int 65
    00000000 00000000 00000000 11111111 = int 255

Binary vs. Text files
In text files, which can be read in Notepad++ etc., characters are stored in smaller 2-byte areas by code number:
    00000000 01000001 = code 65 = char "A"
    00000000 01100001 = code 97 = char "a"
(Two bytes is the size of a Java char; the size of a character on disk depends on the file's character encoding.)

Characters
In Java, a char is a 16 bit value from the international character set called Unicode. Unicode extends the American Standard Code for Information Interchange (ASCII), whose characters are represented by the ints 0 to 127, and its superset, the 8 bit ISO-Latin 1 character set (0 to 255).
There are some invisible characters used for things like the end of lines.
    char back = 8; // Try 7, as well!
    System.out.println("hello" + back + "world");
The easiest way to use characters like newline is to use escape characters.
    System.out.println("hello\nworld");

Binary vs. Text files
Note that:
    00000000 00110001 = code 49 = char "1"
seems much smaller: it only uses 2 bytes to store the character "1", whereas storing the int 1 takes 4 bytes. However, each character takes this much, so:
    00000000 00110001 = code 49 = char "1"
    00000000 00110001 00000000 00110010 = codes 49, 50 = chars "1" "2"
    00000000 00110001 00000000 00110010 00000000 00110111 = codes 49, 50, 55 = chars "1" "2" "7"
Whereas:
    00000000 00000000 00000000 01111111 = int 127
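The size comparison can be checked directly. The sketch below is illustrative only: the class name BinaryVsText is an assumption, the binary size is measured with DataOutputStream.writeInt(), and the text size is measured by encoding the digits as 2-byte UTF-16 characters to match the slides. Other encodings (for example UTF-8) would give different text sizes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BinaryVsText {
    // Number of bytes used when the value is written as a binary int.
    static int binarySize(int value) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(baos)) {
            out.writeInt(value);
        } catch (IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
        return baos.size();
    }

    // Number of bytes used when the value is written as 2-byte characters.
    static int textSize(int value) {
        return String.valueOf(value).getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        int[] samples = {1, 127, 1234567};
        for (int v : samples) {
            System.out.println(v + ": binary " + binarySize(v)
                    + " bytes, text " + textSize(v) + " bytes");
        }
    }
}
```

An int always costs 4 bytes, while the text form costs 2 bytes per digit in this encoding, so the text form is larger for any number with three or more digits.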

Binary vs. Text files
In short, it is much more efficient to store anything with a lot of numbers as binary (not text). However, as disk space is cheap, networks are fast, and it is useful to be able to read data in Notepad etc., people are increasingly using text formats like XML. As we'll see, the file type determines how we deal with files.

Review
    File f = new File("e:/myFile.txt");
Three methods of getting file locations:
    Hardwiring
    FileDialog
    Class getResource()
We need to decide the kind of file we want to deal with.

This lecture Files Text files Binary files

Input and Output (I/O)
So, how do we deal with files (and other types of I/O)? In Java we use address-encapsulating objects, and input and output "Streams". Streams are objects which represent the external resources which we can read from or write to. We don't need to worry about "how". Input streams are used to get stuff into the program. Output streams are used to output from the program.

Streams are based on four abstract classes
java.io.Reader and Writer: work on character streams, that is, treat everything like it's going to be a character.
java.io.InputStream and OutputStream: work on byte streams, that is, treat everything like it's binary data.

Character based streams
Two abstract superclasses: Reader and Writer. These are used for a variety of character streams. Most important are:
    FileReader: for reading files.
    FileWriter: for writing files.

Example
    File f = new File("myFile.txt");
    FileReader fr = null;
    try {
        fr = new FileReader(f);
    } catch (FileNotFoundException fnfe) {
        fnfe.printStackTrace();
    }
    try {
        // Read one character out of the file (returned as an int; -1 at the end).
        int char1 = fr.read();
        // Close the connection to the file so others can use it.
        fr.close();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }

Example
    File f = new File("myFile.txt");
    FileWriter fw = null;
    try {
        // Note this boolean is optional and sets whether to append to the
        // file (true) or overwrite it (false). Default is overwrite.
        fw = new FileWriter(f, true);
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
    try {
        fw.write("A");
        // Make sure everything in the stream is written out.
        fw.flush();
        fw.close();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
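The reader and writer examples can be combined into a single round trip. The following sketch is not from the slides: it assumes Java 7 or later so that try-with-resources can close the streams automatically, and the class name TextRoundTrip and the temp-file prefix are invented for the example.

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

public class TextRoundTrip {
    // Writes text to a temporary file, then reads it back one character at a time.
    static String writeThenRead(String text) {
        try {
            File f = File.createTempFile("roundtrip", ".txt");
            f.deleteOnExit();
            // false = overwrite rather than append.
            try (FileWriter fw = new FileWriter(f, false)) {
                fw.write(text);
            } // fw is flushed and closed here, even if write() throws.
            StringBuilder sb = new StringBuilder();
            try (FileReader fr = new FileReader(f)) {
                int c;
                while ((c = fr.read()) != -1) { // -1 marks the end of the file
                    sb.append((char) c);
                }
            }
            return sb.toString();
        } catch (IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead("A"));
    }
}
```

The design choice here is that each stream is declared in the try header, so there is no need for a null check or a separate close() call.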

Buffers
Plainly it is a pain to read a character at a time. It is also possible that the filesystem may be slow or intermittent, which causes issues. It is common to wrap streams in buffer streams to cope with these two issues.
    BufferedReader br = new BufferedReader(fr);
    BufferedWriter bw = new BufferedWriter(fw);

Example
    BufferedReader br = new BufferedReader(fr); // Remember fr is a FileReader not a File.
    int lines = -1;
    String textIn = " ";
    String[] file = null;
    try {
        // Run through the file once to count the lines and
        // make a String array the right size.
        while (textIn != null) {
            textIn = br.readLine();
            lines++;
        }
        file = new String[lines];
        // Close the buffer here and remake both FileReader and
        // buffer to set it back to the file start.
        // Then read the file into the array.
        for (int i = 0; i < lines; i++) {
            file[i] = br.readLine();
        }
        br.close();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
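The two-pass approach needs the reader to be reopened between passes. A possible single-pass alternative, offered here as a sketch and not as the course's method, collects lines in a java.util.ArrayList and converts to an array at the end. The class name LineReaderDemo is an assumption; the method takes any Reader so it works with a FileReader or, for testing, a StringReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class LineReaderDemo {
    // Reads every line from the source in one pass.
    static String[] readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) { // null marks the end
                lines.add(line);
            }
        } catch (IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
        return lines.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // A StringReader stands in for a FileReader here.
        String[] file = readLines(new StringReader("first\nsecond\nthird"));
        for (String line : file) {
            System.out.println(line);
        }
    }
}
```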

Example
    String[][] strData = getStringArray();
    BufferedWriter bw = new BufferedWriter(fw); // Remember fw is a FileWriter not a File.
    try {
        for (int i = 0; i < strData.length; i++) {
            for (int j = 0; j < strData[i].length; j++) {
                bw.write(strData[i][j] + ", ");
            }
            bw.newLine();
        }
        bw.close();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }

Processing data
This is fine for text, but what if we want values and we have text representations of the values? There is a difference between 0.5 and "0.5". The computer understands the first as a number, but not the second.
First, parse (split and process) the file to get each individual String representing a number.
Second, turn those Strings into real numbers.

java.util.StringTokenizer
    String line = "Call me Dave";
    StringTokenizer st = new StringTokenizer(line);
    while (st.hasMoreTokens()) {
        System.out.println(st.nextToken());
    }
prints the following output:
    Call
    me
    Dave
Default separators: space, tab, newline, carriage-return character, and form-feed.

Processing data
There are wrapper classes for each primitive that will do the conversion from text:
    double d = Double.parseDouble("0.5");
    int i = Integer.parseInt("1");
    boolean b = Boolean.parseBoolean("true");
On the other hand, for writing, String can convert most things to itself:
    String str = String.valueOf(0.5);
    String str2 = String.valueOf(data[i][j]);

Example
    for (int i = 0; i < lines; i++) {
        file[i] = br.readLine();
    }
    br.close();
    double[][] data = new double[lines][];
    for (int i = 0; i < lines; i++) {
        // Comma and space separated data.
        StringTokenizer st = new StringTokenizer(file[i], ", ");
        data[i] = new double[st.countTokens()];
        int j = 0;
        while (st.hasMoreTokens()) {
            data[i][j] = Double.parseDouble(st.nextToken());
            j++;
        }
    }
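The same parsing step can also be written with String.split(), which returns the tokens as an array so their count is known up front. This swaps in a different technique from the StringTokenizer used in the slides; the class name SplitParse and the sample data are assumptions made for the sketch.

```java
public class SplitParse {
    // Converts comma separated lines of text into rows of doubles.
    static double[][] parseLines(String[] lines) {
        double[][] data = new double[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            String[] tokens = lines[i].split(",");
            data[i] = new double[tokens.length];
            for (int j = 0; j < tokens.length; j++) {
                // trim() removes the space left after each comma.
                data[i][j] = Double.parseDouble(tokens[j].trim());
            }
        }
        return data;
    }

    public static void main(String[] args) {
        double[][] data = parseLines(new String[] {"0.5, 1", "2, 3, 4"});
        System.out.println(data[0][0] + " " + data[1][2]);
    }
}
```

Note that rows may have different lengths, just as in the StringTokenizer version.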

Example
    double[][] dataIn = getdata();
    BufferedWriter bw = new BufferedWriter(fw);
    String tempStr = "";
    try {
        for (int i = 0; i < dataIn.length; i++) {
            for (int j = 0; j < dataIn[i].length; j++) {
                // Converts the double to a String.
                tempStr = String.valueOf(dataIn[i][j]);
                bw.write(tempStr + ", ");
            }
            bw.newLine();
        }
        bw.close();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }

java.util.Scanner
Wraps around all this to make reading easy:
    Scanner s = null;
    try {
        s = new Scanner(new BufferedReader(new FileReader("myText.txt")));
        while (s.hasNext()) {
            System.out.println(s.next());
        }
        if (s != null) {
            s.close();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
However, there is no token counter, so it is not great for reading into arrays.

Scanners
By default a Scanner looks for whitespace to tokenise on. It can be set up with a regular expression to look for instead. Comma followed by optional space:
    s.useDelimiter(",\\s*");
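A short sketch of the delimiter in use is given below. It is illustrative only: the class name ScannerSum is an assumption, the Scanner reads from a String rather than a file to keep the example self-contained, and Locale.US is set explicitly because Scanner's number parsing depends on the locale (some locales expect a decimal comma).

```java
import java.util.Locale;
import java.util.Scanner;

public class ScannerSum {
    // Adds up comma separated numbers, stopping at the first non-number.
    static double sum(String text) {
        double total = 0;
        try (Scanner s = new Scanner(text)) {
            s.useLocale(Locale.US);   // so "0.5" parses regardless of machine settings
            s.useDelimiter(",\\s*");  // comma followed by optional whitespace
            while (s.hasNextDouble()) {
                total += s.nextDouble();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum("0.5, 1, 2.5"));
    }
}
```

Checking hasNextDouble() before each nextDouble() avoids the InputMismatchException that a mismatched token would otherwise cause.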

Data conversion
    s.next() / s.hasNext()                  String
    nextBoolean() / hasNextBoolean()        boolean
    nextDouble() / hasNextDouble()          double
    nextInt() / hasNextInt()                int
    nextLine() / hasNextLine()              String
If the type doesn't match, the next method throws an InputMismatchException.

Reading from keyboard
    Scanner s = new Scanner(System.in);
    int i = s.nextInt();
    String str = s.nextLine(); // Returns the rest of the current line, after the int.

Parsing Strings
Usually with text we want to extract useful information, or search and replace.

String searches
startsWith(String prefix), endsWith(String suffix): return a boolean.
indexOf(int ch), indexOf(int ch, int fromIndex): return an int representing the position of the first instance of a given Unicode character integer.
indexOf(String str), indexOf(String str, int fromIndex): return an int representing the position of the first instance of a given String.
lastIndexOf: same as indexOf, but last rather than first.

String manipulation
replace(char oldChar, char newChar): replaces one character with another.
substring(int beginIndex, int endIndex), substring(int beginIndex): pulls out part of the String and returns it.
toLowerCase(), toUpperCase(): changes the case of the String.
trim(): cuts white space off the front and back of a String.

Example
    String str = "old pond; frog leaping; splash";
    int start = str.indexOf("leaping");
    int end = str.indexOf("; ", start);
    String startStr = str.substring(0, start);
    String endStr = str.substring(end);
    str = startStr + "jumping" + endStr;
str is now "old pond; frog jumping; splash".
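The same indexOf() and substring() steps can be wrapped in a reusable method. This is a sketch under stated assumptions: the class and method names are hypothetical, and the method handles the case where the word is missing, in which indexOf() returns -1 and substring() would otherwise throw an exception.

```java
public class StringReplaceDemo {
    // Replaces the first occurrence of oldWord with newWord.
    static String replaceFirstWord(String text, String oldWord, String newWord) {
        int start = text.indexOf(oldWord);
        if (start < 0) {
            return text; // indexOf returns -1 when nothing is found
        }
        int end = start + oldWord.length();
        return text.substring(0, start) + newWord + text.substring(end);
    }

    public static void main(String[] args) {
        String str = "old pond; frog leaping; splash";
        System.out.println(replaceFirstWord(str, "leaping", "jumping"));
    }
}
```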

Review
Use a java.util.Scanner where possible. Otherwise use a FileWriter/FileReader, but remember to buffer both.

This lecture Files Text files Binary files

Byte streams
InputStream: read methods return -1 at the end of the resource.
    FileInputStream(File fileObject): allows us to read bytes from a file.
OutputStream: used to write to resources.
    FileOutputStream(File fileObject): used to write to a file if the user has permission. Overwrites old material in the file.
    FileOutputStream(File fileObject, boolean append): only overwrites if append is false.

Example
    FileInputStream ourStream = null;
    File f = new File("e:/myFile.bin");
    try {
        ourStream = new FileInputStream(f);
    } catch (FileNotFoundException fnfe) {
        // Do something.
    }
The stream is then usually used in the following fashion:
    int c = 0;
    while ((c = ourStream.read()) >= 0) {
        // Add c to a byte array (more on this shortly).
    }
    ourStream.close();

Byte streams II
There are cases where we want to read from and write to arrays using streams. These are usually used as a convenient way of reading and writing a byte array from other streams and over the network.
    ByteArrayInputStream
    ByteArrayOutputStream

Example
    FileInputStream fin = new FileInputStream(file);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    int c;
    while ((c = fin.read()) >= 0) {
        baos.write(c);
    }
    byte[] b = baos.toByteArray();
This saves us having to find out the size of the byte array, as ByteArrayOutputStream has a toByteArray() method.

Buffering streams
As with the FileReader/FileWriter, there are:
    BufferedInputStream
    BufferedOutputStream
You wrap the classes using the buffer's constructors.
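Wrapping works the same way for byte streams as for readers. The sketch below, which is an illustration and not course code, wraps any InputStream in a BufferedInputStream and collects its bytes in a ByteArrayOutputStream, reading a chunk at a time instead of a byte at a time. The class name ByteCopy and the 4096-byte chunk size are arbitrary choices.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ByteCopy {
    // Reads everything from the stream into a byte array.
    static byte[] readAll(InputStream in) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try (BufferedInputStream bin = new BufferedInputStream(in)) {
            int n;
            // read(byte[]) returns how many bytes were read, or -1 at the end.
            while ((n = bin.read(chunk)) != -1) {
                baos.write(chunk, 0, n);
            }
        } catch (IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) {
        // A ByteArrayInputStream stands in for a FileInputStream here.
        byte[] source = {1, 2, 3, (byte) 255};
        System.out.println(readAll(new ByteArrayInputStream(source)).length);
    }
}
```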

Other byte streams
RandomAccessFile: used for reading and writing to files when you need to write into the middle of files as opposed to the end.
PrintStream: was used in Java 1.0 to write characters, but didn't do a very good job of it. The class itself is not deprecated, but PrintWriter is generally preferred for writing characters. System.out is a static final object of this type.
Object streams.
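Writing into the middle of a file with RandomAccessFile might look like the following. This is a minimal sketch: the class name, the temp-file prefix and the sample text are assumptions, and ASCII is used so that each character is exactly one byte and the seek position is easy to follow.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class RandomAccessDemo {
    // Writes "hello world", overwrites the second word in place, and reads it all back.
    static String demo() {
        try {
            File f = File.createTempFile("random", ".bin");
            f.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                raf.write("hello world".getBytes(StandardCharsets.US_ASCII));
                raf.seek(6); // jump to byte 6, the start of "world"
                raf.write("WORLD".getBytes(StandardCharsets.US_ASCII));
                raf.seek(0); // back to the start to read everything
                byte[] all = new byte[(int) raf.length()];
                raf.readFully(all);
                return new String(all, StandardCharsets.US_ASCII);
            }
        } catch (IOException ioe) {
            throw new UncheckedIOException(ioe);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Writing at a seek position overwrites the bytes already there; it does not insert and shift the rest of the file.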

Serialization
Given that we can read and write bytes to streams, there's nothing to stop us writing objects themselves from memory to a stream. This lets us transmit objects across the network and save the state of objects in files. This is known as object serialization.
More details at: http://www.tutorialspoint.com/java_serialization.htm
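A small round trip shows the idea. The sketch below is illustrative, not course code: the Site class and its fields are hypothetical, and a byte array stands in for the file or network connection. It uses the standard ObjectOutputStream and ObjectInputStream classes, and the object's class must implement java.io.Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationDemo {
    // Hypothetical data class; Serializable marks it as safe to write out.
    static class Site implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final double x;
        final double y;

        Site(String name, double x, double y) {
            this.name = name;
            this.x = x;
            this.y = y;
        }
    }

    // Writes the object to bytes and rebuilds a copy from those bytes.
    static Site roundTrip(Site original) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(baos)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(baos.toByteArray()))) {
                return (Site) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Site copy = roundTrip(new Site("SiteA", 1.5, 2.5));
        System.out.println(copy.name + " " + copy.x + " " + copy.y);
    }
}
```

Replacing the byte array streams with a FileOutputStream and FileInputStream would save the object's state to a file instead.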

Summary
We can represent and explore the files on a machine with the File class.
To save us having to understand how external info is produced, Java uses streams.
We can read and write bytes to files or arrays.
We can store or send objects using streams.
We can read and write characters to files or arrays.
We should always try to use buffers around our streams to ensure access.