Building Java Programs Chapter 6: File Processing

File input using Scanner
Reading: 6.1 - 6.2
Token-Based File Processing
Almost every piece of software you use handles files. Eclipse handles .java and other files. Word handles .doc files. Music apps use .mp3 or other format files.

File Objects
• The File class in the java.io package represents files.
    import java.io.*;
  I/O stands for Input/Output.
• You can create a File object to get information about a file.
    File f = new File("example.txt");
    if (f.exists())
        System.out.printf("found file, size is %d\n", f.length());
    else
        System.out.println("File named example.txt not found.");
• Creating a File object does not create a new file.

File object methods:
    File f = new File("example.txt");
Creating a File object does not create a new file.
    Method name            Description
    f.canRead()            returns whether file is able to be read
    f.delete()             removes file from disk
    f.exists()             whether this file exists on disk
    f.getName()            returns file's name
    f.length()             returns number of bytes in file
    f.renameTo(file)       changes name of file
    f.isFile()             Is this a normal file?
    f.isDirectory()        Is this a directory?
    f.getAbsolutePath()    The full path name, e.g. C:\docs\bob\160\L20\example.txt
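As a rough illustration of the query methods in the table, the sketch below builds a one-line report about a path. The class name FileInfoDemo and the helper describe are made up for this example; only the File methods themselves come from the slides.

```java
import java.io.File;

class FileInfoDemo {
    // Summarizes a path using only query methods; nothing is created or deleted.
    static String describe(File f) {
        if (!f.exists()) {
            return f.getName() + ": not found";
        }
        String kind = f.isDirectory() ? "directory" : "file";
        return f.getName() + ": " + kind + ", " + f.length()
                + " bytes, readable=" + f.canRead();
    }

    public static void main(String[] args) {
        File f = new File("example.txt");   // same file name the slides use
        System.out.println(describe(f));
        System.out.println("full path: " + f.getAbsolutePath());
    }
}
```

Running main in a directory without example.txt reports "not found", which is a quick way to confirm that constructing a File object did not create the file.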

Relative vs. Absolute Paths
• When you specify just a file name, Java looks in your current directory (the Java project):
    "example.txt"
• You can specify file names with relative path names:
    "src/FileTest.java"
    "../Paintings/spiral.gif"    (Go up one directory, then down to Paintings/spiral.gif.)
• or absolute path names:
    "C:/docs/bob/160/L20/example.txt"
• Windows uses backslashes but Unix machines use forward slashes. Java accepts forward slashes on both. A backslash inside a Java string literal must be written as \\ and is a path separator only on Windows.
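To see where a relative name actually points, a small sketch like the following can help. PathDemo and resolve are hypothetical names; the example relies only on File.isAbsolute(), File.getAbsolutePath() and the standard user.dir system property, which holds the current directory.

```java
import java.io.File;

class PathDemo {
    // Shows how Java resolves a path string; the file does not need to exist.
    static String resolve(String path) {
        return new File(path).getAbsolutePath();
    }

    public static void main(String[] args) {
        System.out.println("current directory: " + System.getProperty("user.dir"));
        System.out.println("example.txt is absolute? " + new File("example.txt").isAbsolute());
        System.out.println("example.txt resolves to " + resolve("example.txt"));
        System.out.println("../Paintings/spiral.gif resolves to " + resolve("../Paintings/spiral.gif"));
    }
}
```

If a program cannot find its data file, printing the current directory this way usually shows whether the file was put in the wrong folder.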

Use a Scanner to read a File
• To read a file, create a File object and pass it as a parameter when constructing a Scanner. This Scanner will have all the functionality we had before and more.
• General syntax:
    File <fname> = new File("<file name>");
    Scanner <sname> = new Scanner(<fname>);
• Example:
    File f = new File("example.txt");
    Scanner input = new Scanner(f);
  or just:
    Scanner input = new Scanner(new File("example.txt"));
Scanner objects can connect to:
• System.in (the console, as done before)
• A file (as we are seeing now)
• A String (discussed later)

• Methods of Scanner for a file are the same as we have seen for console input:
    Method             Description
    nextInt()          reads and returns an int value
    nextDouble()       reads and returns a double value
    next()             reads and returns the next token* as a String
    nextLine()         reads and returns the next line of input as a String
    Method name        Description
    hasNext()          whether any more tokens remain
    hasNextDouble()    whether the next token can be interpreted as type double
    hasNextInt()       whether the next token can be interpreted as type int
    hasNextLine()      whether any more lines remain in the file
* A token on input is any contiguous data separated by whitespace. We will see examples later.
For a file, hasNext and hasNextLine return false once the end of the file is reached. With interactive console input they normally wait for more typing instead.

Almost correct program to read a file
    import java.io.*;    // for File
    import java.util.*;  // for Scanner
    public class FileTest {
        public static void main(String[] args) {
            File f = new File("example.txt");
            Scanner input = new Scanner(f);
            while (input.hasNext()) {
                System.out.println(input.next());
            }
        }
    }
Result:
    Exception in thread "main" java.lang.Error: Unresolved compilation problem:
        Unhandled exception type FileNotFoundException
        at FileTest.main(FileTest.java:8)

Checked Exceptions
• Earlier we saw some common exceptions:
    IllegalArgumentException
    ArithmeticException
    InputMismatchException
    StringIndexOutOfBoundsException
• The idea of these is: "A bad thing has happened. Kill the program and print an error message." Usually these are problems that a good programmer prevents by testing first. The compiler assumes that the programmer will handle these and compiles the program as usual.
    if (n != 0 && h / n == 3)    // do not divide if n is 0
    if (s1.length() >= 2 && s2.length() >= 2    // check the string lengths before indexing
            && s1.endsWith(s2.substring(s2.length() - 2))) {
• Java also has some exceptions designated as "checked exceptions":
    FileNotFoundException
• The idea of these is: "A good programmer should handle these situations using a catch clause or by throwing the exception up to the next level. Don't compile their program if they don't."
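The two guard conditions above work because && evaluates its right side only when the left side is true. Here is a minimal sketch of both tests as methods; GuardDemo, quotientIsThree and endsWithLastTwo are illustrative names that do not appear in the slides.

```java
class GuardDemo {
    // Returns true only when n is nonzero and h / n equals 3.
    // The && operator skips the division when n == 0.
    static boolean quotientIsThree(int h, int n) {
        return n != 0 && h / n == 3;
    }

    // Returns true when s1 ends with the last two characters of s2.
    // The length checks run first, so substring never receives a negative index.
    static boolean endsWithLastTwo(String s1, String s2) {
        return s1.length() >= 2 && s2.length() >= 2
                && s1.endsWith(s2.substring(s2.length() - 2));
    }

    public static void main(String[] args) {
        System.out.println(quotientIsThree(9, 3));             // true
        System.out.println(quotientIsThree(9, 0));             // false, no ArithmeticException
        System.out.println(endsWithLastTwo("banana", "na"));   // true
        System.out.println(endsWithLastTwo("banana", "a"));    // false, no StringIndexOutOfBoundsException
    }
}
```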

Easiest way to handle a checked exception: throw the exception up to the next method to handle it if it occurs. If no method handles it, we get the usual FileNotFoundException.
    import java.io.*;    // for File
    import java.util.*;  // for Scanner
    public class FileTest {
        public static void main(String[] args) throws FileNotFoundException {
            File f = new File("example.txt");
            Scanner input = new Scanner(f);
            while (input.hasNext()) {
                System.out.println(input.next());
            }
        }
    }
example.txt:
    10 20 30 40 50 60
Here main() does not handle the exception, so it throws it up to the Java runtime, which prints the usual message if the error occurs.
Output:
    10
    20
    30
    40
    50
    60
A file named example.txt does exist, so no exception occurs.

Easiest way to handle a checked exception: throw the exception up to the next method to handle it if it occurs. If no method handles it, we get the usual FileNotFoundException.
    import java.io.*;    // for File
    import java.util.*;  // for Scanner
    public class FileTest {
        public static void main(String[] args) throws FileNotFoundException {
            File f = new File("grab.txt");     // changed the file name to grab.txt, which does not exist
            Scanner input = new Scanner(f);    // error occurs here, when trying to attach a Scanner to the nonexistent file
            while (input.hasNext()) {
                System.out.println(input.next());
            }
        }
    }
Here main() does not handle the exception, so it throws it up to the Java runtime, which prints the usual message if the error occurs.
    Exception in thread "main" java.io.FileNotFoundException: grab.txt (The system cannot find the file specified)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.util.Scanner.<init>(Unknown Source)
        at FileTest.main(FileTest.java:8)

Exceptions
• exception: An object that represents a program error.
– Programs that contain invalid logic cause (or "throw") exceptions.
– Trying to read a file that does not exist will throw an exception.
• checked exception: An error that Java forces us to handle or explicitly choose not to handle in our program (otherwise it will not compile).
– We must specify what our program will do to handle any potential file I/O failures. We must either:
  • declare that our program will handle ("catch") the exception, using a try/catch block, or
  • explicitly state that we choose not to handle the exception (and accept that our program will crash if an exception occurs) by adding a throws clause. We will use the throws clause for a while.
• throws clause, general syntax for a method that could throw an exception:
    public static <type> <name>(<params>) throws <type> {
– When doing file I/O, we use FileNotFoundException:
    public static void main(String[] args) throws FileNotFoundException {
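The slides use the throws clause from here on. For comparison, the other option (catching the exception) might look roughly like the sketch below. CatchDemo and countTokens are hypothetical names, and returning -1 for a missing file is just one possible choice for this illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class CatchDemo {
    // Counts the tokens in a file, or returns -1 if the file cannot be opened.
    static int countTokens(String filename) {
        try (Scanner input = new Scanner(new File(filename))) {
            int count = 0;
            while (input.hasNext()) {
                input.next();
                count++;
            }
            return count;
        } catch (FileNotFoundException e) {
            // Handle the checked exception here instead of declaring "throws".
            System.out.println("Could not open " + filename);
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(countTokens("example.txt"));
    }
}
```

Because the exception is caught inside countTokens, neither countTokens nor main needs a throws clause, and the program keeps running when the file is missing.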

• Finding these exceptions:
– Read the exception text for line numbers in your code (the first line that mentions your method; often near the bottom). This listing is called the runtime stack:
    Exception in thread "main" java.io.FileNotFoundException
        at java.util.Scanner.throwFor(Scanner.java:838)        (Note 1)
        at java.util.Scanner.next(Scanner.java:1347)           (Note 2)
        at MyProgram.myMethodName(MyProgram.java:19)           (Note 3)
        at MyProgram.main(MyProgram.java:6)                    (Note 4)
Note 1: This is the Scanner method that originally found and threw the exception.
Note 2: Then this method was returned to and threw the exception up to the next level.
Note 3: Then this method.
Note 4: Finally main() threw it to the Java runtime, which printed the message.

Scanner exceptions
    Exception in thread "main" java.io.FileNotFoundException
        at java.util.Scanner.throwFor(Scanner.java:838)
        at java.util.Scanner.next(Scanner.java:1347)
        at MyProgram.myMethodName(MyProgram.java:19)
        at MyProgram.main(MyProgram.java:6)
This is called the runtime stack. Invert the runtime stack for the bubble analogy:
    at MyProgram.main(MyProgram.java:6)                Finally bubbles up to the Java runtime, which prints the message.
    at MyProgram.myMethodName(MyProgram.java:19)       Continues bubbling.
    at java.util.Scanner.next(Scanner.java:1347)       Does not handle it, so it bubbles up to the next method.
    at java.util.Scanner.throwFor(Scanner.java:838)    Exception occurs here.
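The bubbling can be observed directly by catching the exception at the top and asking it which methods it passed through. This is only a sketch: BubbleDemo, open, firstToken and methodsOnTrace are invented names, while getStackTrace() and StackTraceElement are standard Java.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class BubbleDemo {
    // Innermost method of ours: the Scanner constructor throws from inside here.
    static Scanner open(String filename) throws FileNotFoundException {
        return new Scanner(new File(filename));
    }

    // Does not catch the exception, so it bubbles up to the caller.
    static String firstToken(String filename) throws FileNotFoundException {
        Scanner input = open(filename);
        String token = input.hasNext() ? input.next() : "";
        input.close();
        return token;
    }

    // Catches the exception and lists which of this class's methods it passed
    // through, innermost first (the same order a printed stack trace uses).
    static List<String> methodsOnTrace(String filename) {
        List<String> names = new ArrayList<>();
        try {
            firstToken(filename);
        } catch (FileNotFoundException e) {
            for (StackTraceElement frame : e.getStackTrace()) {
                if (frame.getClassName().equals(BubbleDemo.class.getName())) {
                    names.add(frame.getMethodName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // If grab.txt is missing, the list starts with open, then firstToken.
        System.out.println(methodsOnTrace("grab.txt"));
    }
}
```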

Throwing exception syntax (Skip?)
• throws clause, general syntax:
    public static <type> <name>(<params>) throws <type> {
– When doing file I/O, we use FileNotFoundException:
    public static void main(String[] args) throws FileNotFoundException {
• Like saying, "I know this method may cause the program to crash but I am not going to handle the problem. Some method that called me should handle this if it happens or the program will crash."
• In this case, main() is throwing the exception, which means the Java runtime will handle it and you will get this if the file is not found:
    Exception in thread "main" java.io.FileNotFoundException

    import java.io.*;    // for File
    import java.util.*;  // for Scanner
    public class FileTest {
        public static void main(String[] args) throws FileNotFoundException {
            File f = new File("example.txt");
            Scanner input = new Scanner(f);
            while (input.hasNext()) {
                System.out.println(input.next());
            }
        }
    }
Contents of file example.txt:
    16.2 23.5 19.1 7.4 22.8 18.5 -1.8 14.9
Output:
    16.2
    23.5
    19.1
    7.4
    22.8
    18.5
    -1.8
    14.9
Note: When input.hasNext() is false, the end of the file has been reached. Thus we have no need for a sentinel value: the EOF (End Of File) marker is the built-in sentinel that hasNext() recognizes, and it returns false there.

6.2 Details of Token-Based Processing
If the file consists only of tokens that can be processed individually, token-based processing is the best bet. If you find yourself using input.nextLine(), that is line-based processing, and it should not be mixed with token-based processing.
Files and input cursor
• Consider a file numbers.txt that contains this text (with two blank lines after the second line):
    308.2
    14.9 7.4 2.8
    3.9 4.7 -15.4
    2.8
• A Scanner views all input as a stream of characters, which it processes with its input cursor:
    308.2\n14.9 7.4 2.8\n\n\n3.9 4.7 -15.4\n2.8\n
    ^
– When you call the methods of the Scanner such as next(), nextInt() or nextDouble(), the Scanner returns the next token.

Adding a data file to an Eclipse project
Just copy the file and paste it into the project. It will automatically go into the JRE System Library [jre6] because its extension is not .java. You do not open the file and copy the contents.
• If the file is on your disk, open Windows Explorer, find the file, right-click on it and choose Copy. Then right-click on the project in Eclipse and choose Paste.
• If you have a link to it on the web, right-click on the link and choose Copy, then Paste it into the appropriate Eclipse project, at the top level of the project, just as you pasted programs into the project.
Example: hamlet.txt
If the above were on a web page, you could right-click on it, copy the file, then paste it into your Java project, just as you do with .java files.
Do not open the data file and try to copy and paste the contents into your Java project. That will not work. Java knows by the file contents whether it is a Java file because it contains a line such as public class FileTest. Your data file will not have that line, so Java does not know what it is.

Input tokens
• token: A unit of user input. Tokens are separated by whitespace (spaces, tabs, new lines).
• Example: If an input file contains the following:
    23 3.14 "John Smith"
– The tokens in the input are the following, and can be interpreted as the given types:
    Token       Type(s)
    1. 23       int, double, String
    2. 3.14     double, String
    3. "John    String
    4. Smith"   String
Note: The double quotes are just input characters; outside Java source code they have no meaning as string delimiters.
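The table can be reproduced by asking a Scanner what each token could be read as. The sketch below reads from a String rather than a file so it is self-contained; TokenTypes and classify are hypothetical names, and the call to useLocale(Locale.US) is an added assumption so that "." is always treated as the decimal separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class TokenTypes {
    // Lists, for each whitespace-separated token, the types it could be read as.
    static List<String> classify(String text) {
        List<String> result = new ArrayList<>();
        Scanner input = new Scanner(text);
        input.useLocale(Locale.US);   // assumption: '.' is the decimal separator
        while (input.hasNext()) {
            String types;
            if (input.hasNextInt()) {
                types = "int, double, String";
            } else if (input.hasNextDouble()) {
                types = "double, String";
            } else {
                types = "String";
            }
            result.add(input.next() + " -> " + types);   // next() consumes the token
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : classify("23 3.14 \"John Smith\"")) {
            System.out.println(line);
        }
    }
}
```

The quoted name comes out as two separate tokens, "John and Smith", matching rows 3 and 4 of the table.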

Consuming tokens
• Each call to next, nextInt, nextDouble, etc. skips over any leading whitespace and advances the cursor to the position just after the end of the token it reads. We call this consuming input.
    308.2\n14.9 7.4 2.8\n\n\n3.9 4.7 -15.4\n2.8\n
    ^
    double q = input.nextDouble();    // q contains 308.2
    308.2\n14.9 7.4 2.8\n\n\n3.9 4.7 -15.4\n2.8\n
         ^
    double r = input.nextDouble();    // r contains 14.9
    308.2\n14.9 7.4 2.8\n\n\n3.9 4.7 -15.4\n2.8\n
               ^
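The same sequence can be tried without a data file by giving the Scanner the character stream as a String. This is a sketch under that assumption; CursorDemo, STREAM and firstDoubles are illustrative names, and Locale.US is set so the decimal points parse the same way on any machine.

```java
import java.util.Locale;
import java.util.Scanner;

class CursorDemo {
    // The file contents from the slides, written as one Java string.
    static final String STREAM = "308.2\n14.9 7.4 2.8\n\n\n3.9 4.7 -15.4\n2.8\n";

    // Reads the first count tokens as doubles; each call consumes one token.
    static double[] firstDoubles(int count) {
        Scanner input = new Scanner(STREAM);
        input.useLocale(Locale.US);
        double[] values = new double[count];
        for (int i = 0; i < count; i++) {
            values[i] = input.nextDouble();   // skips whitespace, including \n
        }
        return values;
    }

    public static void main(String[] args) {
        double[] values = firstDoubles(2);
        System.out.println("q = " + values[0]);   // q = 308.2
        System.out.println("r = " + values[1]);   // r = 14.9
    }
}
```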

File input question
• Consider an input file named numbers.txt that contains the following text (with two blank lines after the second line):
    308.2
    14.9 7.4 2.8
    3.9 4.7 -15.4
    2.8
• Write a program that reads the first 5 values from this file and prints them along with their sum. Its output:
    number = 308.2
    number = 14.9
    number = 7.4
    number = 2.8
    number = 3.9
    Sum = 337.19999993
Note: Round-off error causes the Sum to look like this.

File input answer
    // Displays the first 5 numbers in the given file,
    // and displays their sum at the end.
    import java.io.*;    // for File, FileNotFoundException
    import java.util.*;  // for Scanner
    public class Echo {
        public static void main(String[] args) throws FileNotFoundException {
            Scanner input = new Scanner(new File("numbers.txt"));
            double sum = 0.0;
            for (int i = 1; i <= 5; i++) {
                double next = input.nextDouble();
                System.out.println("number = " + next);
                sum += next;
            }
            System.out.println("Sum = " + sum);
        }
    }
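The sum of these five values is mathematically 337.2, but doubles cannot store most decimal fractions exactly, so the raw value can print with a long tail of digits. One common way to tidy the display, shown here as a sketch, is to format with one decimal place. SumFormat, sum and oneDecimal are names invented for this example.

```java
import java.util.Locale;

class SumFormat {
    // Adds the values left to right, the same way the loop in Echo does.
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // Rounds for display only; the stored double is unchanged.
    static String oneDecimal(double value) {
        return String.format(Locale.US, "%.1f", value);
    }

    public static void main(String[] args) {
        double[] firstFive = {308.2, 14.9, 7.4, 2.8, 3.9};
        double total = sum(firstFive);
        System.out.println("Sum = " + total);               // raw double, may show round-off
        System.out.println("Sum = " + oneDecimal(total));   // Sum = 337.2
    }
}
```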

What if the file only contained 3 values?
    308.2 14.9 7.4
Output:
    number = 308.2
    number = 14.9
    number = 7.4
    Exception in thread "main" java.util.NoSuchElementException
        at java.util.Scanner.throwFor(Unknown Source)
        at java.util.Scanner.nextDouble(Unknown Source)
        at Echo.main(Echo.java:13)
This basically means that we attempted to read beyond the end-of-file marker. We hit the end-of-file marker when there is no data left in the file.

Testing before reading
• The preceding program is impractical because it only processes exactly 5 values from the input file.
– A better program would read the entire file, regardless of how many values it contains.
• Reminder: The Scanner has useful methods for testing to see what the next input token will be:
    Method name        Description
    hasNext()          whether any more tokens remain
    hasNextDouble()    whether the next token can be interpreted as type double
    hasNextInt()       whether the next token can be interpreted as type int
    hasNextLine()      whether any more lines remain

Test existence of value before reading it: question
• Rewrite the previous program so that it reads the entire file. Assume that the file contains only double values. Its output:
    number = 308.2
    number = 14.9
    number = 7.4
    number = 2.8
    number = 3.9
    number = 4.7
    number = -15.4
    number = 2.8
    Sum = 329.29999995

Test before read: answer
    // Displays each number in the given file,
    // and displays their sum at the end.
    import java.io.*;
    import java.util.*;
    public class Echo2 {
        public static void main(String[] args) throws FileNotFoundException {
            Scanner input = new Scanner(new File("numbers.dat"));
            double sum = 0.0;
            while (input.hasNextDouble()) {
                double next = input.nextDouble();
                System.out.println("number = " + next);
                sum += next;
            }
            System.out.println("Sum = " + sum);
        }
    }

File processing question
• Modify the preceding program again so that it will handle files that contain non-numeric tokens.
– The program should skip any such tokens.
• For example, the program should produce the same output as before when given this input file:
    308.2 hello 14.9 7.4 bad stuff 2.8 3.9 4.7 oops -15.4 :-) 2.8 @#*($&
Output:
    number = 308.2
    number = 14.9
    number = 7.4
    number = 2.8
    number = 3.9
    number = 4.7
    number = -15.4
    number = 2.8
    Sum = 329.29999995

File processing answer
    // Displays each number in the given file,
    // and displays their sum at the end.
    import java.io.*;
    import java.util.*;
    public class Echo3 {
        public static void main(String[] args) throws FileNotFoundException {
            Scanner input = new Scanner(new File("numbers.dat"));
            double sum = 0.0;
            while (input.hasNext()) {
                if (input.hasNextDouble()) {
                    double next = input.nextDouble();
                    System.out.println("number = " + next);
                    sum += next;
                } else {
                    input.next();    // consume & throw away bad token
                }
            }
            System.out.println("Sum = " + sum);
        }
    }

File processing question
• Write a program that accepts an input file containing integers representing daily high temperatures. Example input file weather.dat:
    42 45 37 49 38 50 46 48 48 30 45 42 45 40 48
• Your program should print the difference between each adjacent pair of temperatures, such as the following:
    Temperature changed by 3 deg F
    Temperature changed by -8 deg F
    Temperature changed by 12 deg F
    Temperature changed by -11 deg F
    Temperature changed by 12 deg F
    Temperature changed by -4 deg F
    Temperature changed by 2 deg F
    Temperature changed by 0 deg F
    Temperature changed by -18 deg F
    Temperature changed by 15 deg F
    Temperature changed by -3 deg F
    Temperature changed by 3 deg F
    Temperature changed by -5 deg F
    Temperature changed by 8 deg F

File processing answer
    import java.io.*;
    import java.util.*;
    public class TemperaturesPrevCurrent {
        public static void main(String[] args) throws FileNotFoundException {
            Scanner input = new Scanner(new File("weather.dat"));
            int prevTemp = input.nextInt();
            while (input.hasNextInt()) {
                int currentTemp = input.nextInt();
                // current minus previous, so a rise from 42 to 45 prints 3
                System.out.println("Temperature changed by "
                        + (currentTemp - prevTemp) + " deg F");
                prevTemp = currentTemp;
            }
        }
    }
This pattern of keeping a current value and the previous value is common in programming.

Input Cursor
• A Scanner views all input as a stream of characters.
• The current position is called the input cursor.
    10 20 30\n40 50\n\n60\n
    ^
    input.next() -> 10
    10 20 30\n40 50\n\n60\n
      ^
    input.next() -> 20
    10 20 30\n40 50\n\n60\n
         ^
    input.next() -> 30
    10 20 30\n40 50\n\n60\n
            ^
Calling next() is called "consuming input".

Example: Reading in the name of the file to be processed. This program finds the average of the numbers in the file.
    public static void main(String[] args) throws FileNotFoundException {
        // get the filename
        Scanner console = new Scanner(System.in);
        System.out.print("File: ");
        String filename = console.next();
        // read and process the file
        int sum = 0;
        int count = 0;
        File f = new File(filename);
        Scanner input = new Scanner(f);
        while (input.hasNextInt()) {
            int d = input.nextInt();
            sum += d;
            count++;
        }
        System.out.println("Sum is: " + sum);
        System.out.println("Count is: " + count);
        System.out.println("Average is: " + (double) sum / count);
    }
example.txt:
    10 20 30
    40 50
    60
Console:
    File: example.txt
    Sum is: 210
    Count is: 6
    Average is: 35.0
Review reading the file
    public static void main(String[] args) throws FileNotFoundException {
        File f = new File("example.txt");
        Scanner input = new Scanner(f);
        while (input.hasNext()) {
            System.out.println(input.next());
        }
    }
example.txt:
    10 20 30
    40 50
    60
Output:
    10
    20
    30
    40
    50
    60
Notice: No need for a sentinel. hasNext is false if there is no more data in the file.

Review: Working With Files
• Create a File object:
    File f = new File("example.txt");
• Open the file for reading with a Scanner object:
    Scanner input = new Scanner(f);
• For now, just throw the FileNotFoundException to whoever called you:
    ... method(...) throws FileNotFoundException {
        ...
        Scanner input = new Scanner(f);
        ...
    }

Mixing tokens and lines (do not do this) (skip these notes)
• Using nextLine() in conjunction with the token-based methods (nextInt(), nextDouble(), next()) on the same Scanner can cause bad results.
    23   3.14
    Joe  "Hello" world
              45.2 19
– You'd think you could read 23 and 3.14 with nextInt and nextDouble, then read Joe "Hello" world with nextLine.
    System.out.println(input.nextInt());       // 23
    System.out.println(input.nextDouble());    // 3.14
    System.out.println(input.nextLine());      // (an empty line)
– But the nextLine call prints only an empty line! Why? The first line is actually 23 3.14\n. After the 3.14 is read, the input cursor is at the \n. input.nextLine() reads up to but not including the \n, so it gets the empty string and prints that. input.next() would get "Joe".

Mixing lines and tokens (skip these notes)
• Don't read both tokens and lines from the same Scanner:
    23   3.14
    Joe  "Hello" world
              45.2 19
    input.nextInt()       // 23
    23\t3.14\nJoe\t"Hello" world\n\t\t45.2 19\n
      ^
    input.nextDouble()    // 3.14
    23\t3.14\nJoe\t"Hello" world\n\t\t45.2 19\n
            ^
    input.nextLine()      // "" (empty!)
    23\t3.14\nJoe\t"Hello" world\n\t\t45.2 19\n
              ^
    input.nextLine()      // "Joe\t\"Hello\" world"
    23\t3.14\nJoe\t"Hello" world\n\t\t45.2 19\n
                                  ^
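The empty string is easy to confirm by running the same four calls against the character stream as a String. This is a sketch with invented names (MixDemo, STREAM, linesAfterTokens); Locale.US is set as an added assumption so 3.14 parses on any machine.

```java
import java.util.Locale;
import java.util.Scanner;

class MixDemo {
    // The three-line input from the slides, with tabs and newlines spelled out.
    static final String STREAM = "23\t3.14\nJoe\t\"Hello\" world\n\t\t45.2 19\n";

    // Returns what two nextLine() calls produce after nextInt() and nextDouble().
    static String[] linesAfterTokens() {
        Scanner input = new Scanner(STREAM);
        input.useLocale(Locale.US);
        input.nextInt();                     // consumes 23
        input.nextDouble();                  // consumes 3.14, cursor stops before \n
        String first = input.nextLine();     // rest of the current line: ""
        String second = input.nextLine();    // the whole second line
        return new String[] { first, second };
    }

    public static void main(String[] args) {
        String[] lines = linesAfterTokens();
        System.out.println("first nextLine length: " + lines[0].length());   // 0
        System.out.println("second nextLine: " + lines[1]);
    }
}
```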

A more complex example (skip these notes)
• Processing a file of names and hours worked each day to compute weekly totals.
Input file hours.txt:
    Aaron Aardvark
    8 8 8
    Bob Baboon
    4 4
    Chucky Cheetah Jr.
    6 6 2 2
    Donald Duck
    8 4 8
Desired output:
    24 hours Aaron Aardvark
     8 hours Bob Baboon
    16 hours Chucky Cheetah Jr.
    20 hours Donald Duck

Plan: (skip these notes)
• Outer loop: for each employee, process 2 lines:
1. Use nextLine() to read the name. nextLine() consumes and returns the string, and also consumes and throws away the following \n.
2. Use an inner loop to read the line of numbers using nextInt().
    Aaron Aardvark          nextLine() -> "Aaron Aardvark"
    8 8 8                   nextInt() -> 8
    Bob Baboon
    4 4
    Chucky Cheetah Jr.
    6 6 2 2
    Donald Duck
    8 4 8
Wait a minute: Welty said not to mix line-based processing (nextLine()) with token-based processing. We can't really handle this situation with the mix. We must go to the next section named, strangely enough, line-based processing.

Ignore this if not mixing token- and line-based processing.
    public static void hoursWorkedV1(String filename) throws FileNotFoundException {
        // read and process the file
        File f = new File(filename);
        Scanner input = new Scanner(f);
        while (input.hasNextLine()) {
            String name = input.nextLine();    // read name, line based
            int sum = 0;
            // read and add data
            while (input.hasNextInt()) {       // token based
                sum += input.nextInt();        // token based
            }
            System.out.printf("%2d hours %s\n", sum, name);
        }
    }
hours.txt:
    Aaron Aardvark          nextLine() -> "Aaron Aardvark"
    8 8 8                   nextInt() -> 8
    Bob Baboon
    4 4
    Chucky Cheetah Jr.
    6 6 2 2
    Donald Duck
    8 4 8

Ignore this if not mixing token- and line-based processing.
    public static void hoursWorkedV1(String filename) throws FileNotFoundException {
        // read and process the file
        File f = new File(filename);
        Scanner input = new Scanner(f);
        while (input.hasNextLine()) {
            String name = input.nextLine();    // read name, line based
            int sum = 0;
            // read and add data
            while (input.hasNextInt()) {       // token based
                sum += input.nextInt();        // token based
            }
            System.out.printf("%2d hours %s\n", sum, name);
        }
    }
hours.txt:
    Aaron Aardvark
    8 8 8
    Bob Baboon
    4 4
    Chucky Cheetah Jr.
    6 6 2 2
    Donald Duck
    8 4 8
Actual output (every second line has an empty name):
    24 hours Aaron Aardvark
     0 hours
     8 hours Bob Baboon
     0 hours
    16 hours Chucky Cheetah Jr.
     0 hours
    20 hours Donald Duck
     0 hours

Ignore this if not mixing token- and line-based processing.
Don't mix line-based input, nextLine(), with token-based input: next(), nextInt(), nextDouble().
    Aaron Aardvark\n8 8 8\nBob Baboon\n4 4\nChucky
    ^
    nextLine() -> "Aaron Aardvark"
    Aaron Aardvark\n8 8 8\nBob Baboon\n4 4\nChucky
                    ^
    nextInt() -> 8
    Aaron Aardvark\n8 8 8\nBob Baboon\n4 4\nChucky
                         ^
    nextLine() -> ""

Ignore this if not mixing token- and line-based processing.
Why do we get the above?
• nextLine() consumes the line it is on through the end-of-line marker ("\n"). It returns the string containing everything up to but not including the end-of-line marker, and leaves the cursor just after the \n.
• nextInt() consumes the next integer, including all preceding whitespace, and leaves the cursor at the character just following the integer.
Example: we have just consumed the last 8 of Aaron's hours:
    Aaron Aardvark\n8 8 8\nBob Baboon\n4 4\nChucky
                         ^
The cursor sees nothing followed by a \n. nextLine() returns the empty string and consumes through the \n:
    nextLine() -> ""
    Aaron Aardvark\n8 8 8\nBob Baboon\n4 4\nChucky
                           ^
There are no ints after the \n, so the inner loop leaves the sum at 0, and the next nextLine() then reads Bob Baboon.
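One way around the problem, sketched here ahead of the line-based section, is to read both the name line and the hours line with nextLine(), and then break the hours line into tokens with a second Scanner built on that String. This is only an illustration, not the course's official solution; HoursLineBased and totals are made-up names, and the input is passed as a String so the example runs without hours.txt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class HoursLineBased {
    // Expects alternating lines: a name, then that person's hours.
    static List<String> totals(String text) {
        List<String> report = new ArrayList<>();
        Scanner input = new Scanner(text);
        while (input.hasNextLine()) {
            String name = input.nextLine();       // line based
            if (!input.hasNextLine()) {
                break;                            // a name with no hours line
            }
            // Tokenize just this one line with its own Scanner.
            Scanner hours = new Scanner(input.nextLine());
            int sum = 0;
            while (hours.hasNextInt()) {
                sum += hours.nextInt();
            }
            report.add(String.format(Locale.US, "%2d hours %s", sum, name));
        }
        return report;
    }

    public static void main(String[] args) {
        String data = "Aaron Aardvark\n8 8 8\nBob Baboon\n4 4\n"
                + "Chucky Cheetah Jr.\n6 6 2 2\nDonald Duck\n8 4 8\n";
        for (String line : totals(data)) {
            System.out.println(line);
        }
    }
}
```

Because the outer Scanner only ever calls nextLine(), its cursor always sits at the start of a line, so no stray empty strings appear.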
