Advanced UNIX
240-491 Special Topics in Comp. Eng. 1
Semester 2, 2000-2001

10. File Processing

• Objectives of these slides:
  – a more detailed look at file processing in C

240-491 Adv. UNIX: fp/10

Overview

1. Background
2. Text Files
3. Error Handling
4. Binary Files
5. Direct Access
6. Temporary Files
7. Renaming & Removing
8. Character Pushback
9. Buffering
10. Redirecting I/O

1. Background

• Two types of file: text, binary
• Two access methods: sequential, direct (also called random access)
• UNIX I/O is line buffered when attached to a terminal (streams attached to disk files are normally fully buffered)
  – input is processed a line at a time
  – output may not appear until a newline is output

2. Text Files

  Standard I/O    File I/O
  printf()        fprintf()
  scanf()         fscanf()
  gets()          fgets()
  puts()          fputs()
  getchar()       getc()
  putchar()       putc()

Most just add an 'f'.

Function Prototypes

• int fscanf(FILE *fp, char *format, ...);
• int fprintf(FILE *fp, char *format, ...);
• char *fgets(char *str, int max, FILE *fp);
• int fputs(char *str, FILE *fp);
• int getc(FILE *fp);
• int putc(int ch, FILE *fp);

The new argument is the file pointer fp.

2.1 Standard FILE* Constants

  Name      Meaning
  stdin     standard input
  stdout    standard output
  stderr    standard error

• e.g.
    if (len >= MAX_LEN)
        fprintf(stderr, "String is too long\n");

2.2 Opening / Closing

• FILE *fopen(char *filename, char *mode);
• int fclose(FILE *fp);
• fopen() modes:

  Mode    Meaning
  "r"     read mode
  "w"     write mode
  "a"     append mode

Careful Opening

    FILE *fp;     /* file pointer */
    char *fname = "myfile.dat";

    if ((fp = fopen(fname, "r")) == NULL) {
        fprintf(stderr, "Error opening %s\n", fname);
        exit(1);
    }
    ...   /* file opened okay */

2.3 Text I/O

• As with standard I/O:
  – formatted I/O   (fprintf, fscanf)
  – line I/O        (fgets, fputs)
  – character I/O   (getc, putc)

2.3.1 Formatted I/O

• int fscanf(FILE *fp, char *format, ...);
• int fprintf(FILE *fp, char *format, ...);
• fscanf() returns EOF if an error or end-of-file occurs before anything is read; fprintf() returns a negative value on error.
• If okay, fscanf() returns the number of variables assigned, fprintf() returns the number of output characters.
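
The fscanf() return value is usually compared against the number of items expected. A minimal sketch, assuming a hypothetical helper sum_ints() that is not part of the slides:

```c
#include <stdio.h>

/* Hypothetical helper: sum the whitespace-separated integers in fp.
 * Returns the number of integers read; stores their sum in *total. */
int sum_ints(FILE *fp, long *total)
{
    int n, count = 0;

    *total = 0;
    /* fscanf() returns the number of items assigned: 1 on success here,
     * 0 on a matching failure, EOF at end-of-file or on a read error */
    while (fscanf(fp, "%d", &n) == 1) {
        *total += n;
        count++;
    }
    return count;
}
```

Testing for == 1 rather than != EOF also stops the loop on malformed input, which would otherwise repeat forever.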

2.3.2 Line I/O

• char *fgets(char *str, int max, FILE *fp);
• int fputs(char *str, FILE *fp);
• If an error or EOF occurs, fgets() returns NULL, fputs() returns EOF.
• If okay, fgets() returns a pointer to the string, fputs() returns a non-negative integer.

Differences between fgets() and gets()

• Use of max argument: fgets() reads in at most max-1 chars (so there is room for '\0').
• fgets() retains the input '\n'.
• Deleting the '\n':
    len = strlen(line);
    if (len > 0 && line[len-1] == '\n')   /* to be safe */
        line[len-1] = '\0';
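
The newline removal can also be written with strcspn(). This is a sketch only; the helper name chomp() is an assumption, not something defined in the slides:

```c
#include <string.h>

/* Hypothetical helper: remove a trailing newline left by fgets(), if any.
 * strcspn() returns the index of the first '\n', or the string length
 * when there is none, so this is safe for empty strings too. */
void chomp(char *line)
{
    line[strcspn(line, "\n")] = '\0';
}
```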

Difference between fputs() and puts()

• fputs() does not add a '\n' to the output.

Line-by-line Echo

    #define MAX 100    /* max line length */
      :
    void output_file(char *fname)
    {
        FILE *fp;
        char line[MAX];

        if ((fp = fopen(fname, "r")) == NULL) {
            fprintf(stderr, "Error opening %s\n", fname);
            exit(1);
        }
        while (fgets(line, MAX, fp) != NULL)
            fputs(line, stdout);
        fclose(fp);
    }

2.3.3 Character I/O

• int getc(FILE *fp);
• int putc(int ch, FILE *fp);
• Both return EOF if an error or end-of-file occurs.
• Can also use fgetc() and fputc().

Char-by-char Echo

    #define MAX 100    /* max line length */
      :
    void output_file(char *fname)
    {
        FILE *fp;
        int ch;

        if ((fp = fopen(fname, "r")) == NULL) {
            fprintf(stderr, "Error opening %s\n", fname);
            exit(1);
        }
        while ((ch = getc(fp)) != EOF)
            putc(ch, stdout);
        fclose(fp);
    }

Using feof()

• Rewrite the previous while-loop as:
    while (!feof(fp)) {
        ch = getc(fp);
        if (ch != EOF)    /* feof() is only true after a read has hit EOF */
            putc(ch, stdout);
    }
  – not a common coding style; without the test on ch, the final EOF value would be passed to putc().

3. Error Handling

• int ferror(FILE *fp);
  – check error status of file stream
  – it returns non-zero if there is an error
• void clearerr(FILE *fp);
  – reset error status
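
Since getc() returns EOF both at end-of-file and on an error, ferror() is what tells the two apart once a read loop stops. A minimal sketch, using a hypothetical helper copy_stream() that is not part of the slides:

```c
#include <stdio.h>

/* Hypothetical helper: copy in to out one character at a time.
 * Returns 0 if the loop stopped at end-of-file, -1 on an I/O error. */
int copy_stream(FILE *in, FILE *out)
{
    int ch;

    while ((ch = getc(in)) != EOF)
        if (putc(ch, out) == EOF)
            return -1;          /* write error */

    if (ferror(in)) {           /* EOF was caused by a read error */
        clearerr(in);           /* reset the stream's error status */
        return -1;
    }
    return 0;                   /* genuine end-of-file */
}
```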

perror()

• void perror(char *str);
  – print str (usually a filename) followed by a colon and a system-defined error message
  – common in advanced coding
• e.g.
    ...
    fp = fopen(fname, "r");
    if (fp == NULL) {
        perror(fname);
        exit(1);
    }

errno

• The system error message is based on a system error number (errno) which is set when a library function returns an error.
• e.g.
    #include <errno.h>
    ...
    fp = fopen(fname, "r");
    if (errno == ...)
        ...

• Many errno integer constants are defined in errno.h
  – it is better style to use the constant name instead of the number
  – Linux distributions usually put most errno constants in asm/errno.h
• Example errno constants:

  EPERM     operation not permitted
  ENOENT    no such file / directory
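
A sketch of testing errno by constant name. It assumes a POSIX system, where fopen() sets errno on failure; the helper name open_or_report() is hypothetical. errno is only meaningful after a call has reported failure, so it is tested inside the NULL check:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open fname for reading; on failure report why.
 * Returns NULL on failure, leaving errno as set by fopen(). */
FILE *open_or_report(const char *fname)
{
    FILE *fp = fopen(fname, "r");

    if (fp == NULL) {
        int saved = errno;      /* save: later calls may change errno */

        if (saved == ENOENT)
            fprintf(stderr, "%s: file does not exist\n", fname);
        else
            fprintf(stderr, "%s: %s\n", fname, strerror(saved));
        errno = saved;
    }
    return fp;
}
```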

4. Binary Files

• For storing non-character data
  – arrays, structs, integers (as bytes), GIFs, compressed data
• Not portable across different systems
  – unless you have cross-platform reading/writing utilities, such as gzip
• For portability, use text files

fopen() modes for Binary Files

  Mode    Meaning
  "rb"    read binary file
  "wb"    write binary file
  "ab"    append to binary file

Add a "b" to the text file modes.

Reading / Writing

• size_t fread(void *buffer, size_t size, size_t num, FILE *fp);
  size_t fwrite(void *buffer, size_t size, size_t num, FILE *fp);
• Returns the number of things read/written; a count smaller than num means an error or end-of-file occurred (check with ferror() or feof()).

Example

• The code will write to a binary file containing employee records with the following type structure:

    #define MAX_NAME_LEN 50

    struct employee {
        int salary;
        char name[MAX_NAME_LEN + 1];
    };

    struct employee e1, emps[MAX];
      :
      :
    /* write the struct to fp */
    fwrite(&e1, sizeof(struct employee), 1, fp);

    /* write all of the array with 1 op */
    fwrite(emps, sizeof(struct employee), MAX, fp);
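
Reading the records back mirrors the writes. The following is a sketch under the assumption that the caller knows how many records to expect; save_emps() and load_emps() are hypothetical names, and the struct is repeated so the block stands alone:

```c
#include <stdio.h>

#define MAX_NAME_LEN 50

struct employee {
    int salary;
    char name[MAX_NAME_LEN + 1];
};

/* Hypothetical helpers: write/read n records, returning 0 on success */
int save_emps(FILE *fp, const struct employee *emps, size_t n)
{
    /* fwrite() returns the number of complete items written */
    return fwrite(emps, sizeof(struct employee), n, fp) == n ? 0 : -1;
}

int load_emps(FILE *fp, struct employee *emps, size_t n)
{
    /* a short count from fread() means end-of-file or an error */
    return fread(emps, sizeof(struct employee), n, fp) == n ? 0 : -1;
}
```

Comparing the returned count with n catches partial reads and writes that would otherwise go unnoticed.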

5. Direct Access

• Direct access: move to any record in the binary file and then read (you do not have to read the others before it).
• e.g. a move to the 5th employee record would mean a move of size:
    4 * sizeof(struct employee)
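
The offset arithmetic can be captured in one small function. This is an illustrative sketch; record_offset() and its header_size parameter (bytes stored before the first record) are assumptions rather than part of the slides:

```c
#include <stddef.h>

/* Hypothetical helper: byte offset of record number posn (counting from 1)
 * in a file that has header_size bytes before the first record. */
long record_offset(int posn, size_t header_size, size_t rec_size)
{
    return (long)(header_size + (size_t)(posn - 1) * rec_size);
}
```

With no header, record 5 of 56-byte records starts at 4 * 56 = 224 bytes from the start of the file.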

fopen() Modes for Direct Access (+)

  Mode     Meaning
  "rb+"    open binary file for read/write
  "wb+"    create/clear binary file for read/write
  "ab+"    open/create binary file for read/write at the end

Employees Example

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define DF "employees.dat"
    #define MAX_NAME_LEN 50

    struct employee {
        int salary;
        char name[MAX_NAME_LEN + 1];
    };

    int num_emps = 0;    /* num of employees in DF */
    FILE *fp;
      :

Poor style: global variables.

Data Format

[Figure: layout of employees.dat, labelled: number | e1 | e2 | e3 | e4 | .... (empty space of the right size)]

• The basic coding technique is to store the number of employees currently in the file (e.g. 4)
  – some functions will need this number in order to know where the end of the data occurs

Open the Data File

    void open_file(void)
    {
        if ((fp = fopen(DF, "rb+")) == NULL) {
            fp = fopen(DF, "wb+");    /* create file */
            num_emps = 0;             /* initial num. */
        }
        else    /* opened file, read in num. */
            fread(&num_emps, sizeof(num_emps), 1, fp);
    }

Move with fseek()

• int fseek(FILE *fp, long offset, int origin);
• Movement is specified with a starting position and offset from there.
• The current position in the file is indicated with the file position pointer (not the same as fp).

Origin and Offset

• fseek() origin values:

  Name        Value    Meaning
  SEEK_SET    0        beginning of file
  SEEK_CUR    1        current position
  SEEK_END    2        end of file

• Offset is a long integer
  – can be negative (i.e. move backwards)
  – equals the number of bytes to move
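
A negative offset is most often combined with SEEK_END. The sketch below assumes a binary stream on a UNIX system; read_tail() is a hypothetical helper, not part of the slides:

```c
#include <stdio.h>

/* Hypothetical helper: read the last len bytes of fp into buf.
 * Returns 0 on success, -1 on failure. */
int read_tail(FILE *fp, char *buf, long len)
{
    /* a negative offset from SEEK_END moves backwards from the end */
    if (fseek(fp, -len, SEEK_END) != 0)
        return -1;
    return fread(buf, 1, (size_t)len, fp) == (size_t)len ? 0 : -1;
}
```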

Employees Continued

    void put_rec(int posn, struct employee *ep)
    /* write an employee at position posn */
    {
        long loc;

        loc = sizeof(num_emps) + ((posn-1)*sizeof(struct employee));
        fseek(fp, loc, SEEK_SET);
        fwrite(ep, sizeof(struct employee), 1, fp);
    }

• Can write anywhere.
• No checking to avoid over-writing.

Read in an Employee

    void get_rec(int posn, struct employee *ep)
    /* read in employee at position posn */
    {
        long loc;

        loc = sizeof(num_emps) + ((posn-1)*sizeof(struct employee));
        fseek(fp, loc, SEEK_SET);
        fread(ep, sizeof(struct employee), 1, fp);
    }

• Should really check if ep contains something.

Close Employees File

    void close_file(void)
    {
        rewind(fp);    /* same as fseek(fp, 0, 0); */

        /* update num. of employees */
        fwrite(&num_emps, sizeof(num_emps), 1, fp);
        fclose(fp);
    }

ftell()

• Return current position of the file position pointer (i.e. its offset in bytes from the start of the file):
    long ftell(FILE *fp);
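
One common use of ftell() is finding the size of a file. A sketch under the assumption of a binary stream on a UNIX system; file_size() is a hypothetical name:

```c
#include <stdio.h>

/* Hypothetical helper: size of an open binary file in bytes, or -1L.
 * The original file position is restored before returning. */
long file_size(FILE *fp)
{
    long here, size;

    if ((here = ftell(fp)) == -1L)      /* remember where we are */
        return -1L;
    if (fseek(fp, 0L, SEEK_END) != 0)   /* jump to the end */
        return -1L;
    size = ftell(fp);                   /* offset of the end = size */
    if (fseek(fp, here, SEEK_SET) != 0) /* go back */
        return -1L;
    return size;
}
```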

6. Temporary Files

• FILE *tmpfile(void);         /* create a temp file */
  char *tmpnam(char *name);    /* create a unique name */
• tmpfile() opens the file with "wb+" mode; it is removed when the program exits
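
Because tmpfile() opens the file for both writing and reading, it can be used as scratch space. A minimal sketch; scratch_roundtrip() is a hypothetical helper made up for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: write text to a temporary file, then read its
 * first line back into buf. Returns 0 on success, -1 on failure. */
int scratch_roundtrip(const char *text, char *buf, int max)
{
    FILE *tmp = tmpfile();      /* opened in "wb+" mode */
    int ok;

    if (tmp == NULL)
        return -1;
    fputs(text, tmp);
    rewind(tmp);                /* back to the start before reading */
    ok = (fgets(buf, max, tmp) != NULL);
    fclose(tmp);                /* the file is removed automatically */
    return ok ? 0 : -1;
}
```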

7. Renaming & Removing

• int rename(char *old_name, char *new_name);
  – like mv in UNIX
• int remove(char *filename);
  – like rm in UNIX
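
The two calls are often used together: write new contents under a scratch name, then rename it over the real file. The sketch below assumes both names are on the same file system; replace_file() and its parameters are hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: write text to scratch, then rename scratch to
 * target. On any failure the scratch file is removed.
 * Returns 0 on success, -1 on failure. */
int replace_file(const char *target, const char *scratch, const char *text)
{
    FILE *fp = fopen(scratch, "w");

    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        remove(scratch);
        return -1;
    }
    if (fclose(fp) == EOF || rename(scratch, target) != 0) {
        remove(scratch);
        return -1;
    }
    return 0;
}
```

Both rename() and remove() return 0 on success and non-zero on failure.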

8. Character Pushback

• int ungetc(int ch, FILE *fp);
• Overcomes some problems with reading too much
  – 1 character lookahead can be coded
• ungetc() only works once between getc() calls
• Cannot push back EOF
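
One-character lookahead is typically needed when reading a token whose end is only known after reading one character too many. A sketch with a hypothetical helper read_number():

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read a run of decimal digits from fp.
 * Returns the value read (0 if the next character is not a digit).
 * The first non-digit is pushed back so the caller can read it again. */
long read_number(FILE *fp)
{
    long value = 0;
    int ch;

    while ((ch = getc(fp)) != EOF && isdigit(ch))
        value = value * 10 + (ch - '0');

    if (ch != EOF)
        ungetc(ch, fp);         /* one character of lookahead */
    return value;
}
```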

9. Buffering

• int fflush(FILE *fp);
  – e.g. fflush(stdout);
• Flush partial lines
  – overcomes output line buffering
• stderr is not buffered.
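
A typical partial line is a prompt with no trailing newline. A minimal sketch; the helper name prompt() is an assumption:

```c
#include <stdio.h>

/* Hypothetical helper: write a prompt with no newline and flush it so
 * that it is output before the program waits for input.
 * Returns 0 on success, EOF on error. */
int prompt(FILE *out, const char *msg)
{
    if (fputs(msg, out) == EOF)
        return EOF;
    return fflush(out);
}
```

It might be called as prompt(stdout, "Name: ") just before reading the reply with fgets().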

setbuf()

• void setbuf(FILE *fp, char *buffer);
• Most common use is to switch off buffering:
    setbuf(stdout, NULL);
  – equivalent to fflush() after every output function call

10. Redirecting I/O

• FILE *freopen(char *filename, char *mode, FILE *fp);
  – opens the file with the mode and associates the stream with it
• Most common use is to redirect stdin, stdout, stderr to mean the file
• It is better style (usually) to use I/O redirection at the UNIX level.

• e.g.
    FILE *in;
    int n;

    in = freopen("infile", "r", stdin);
    if (in == NULL) {
        perror("infile");
        exit(1);
    }
    scanf("%d", &n);    /* read from infile */
      :
    fclose(in);