Operating Systems: Kernel File Interface (or Programming I/O in Unix)


Input and Output

When programming in C on Unix, there are two very different I/O libraries you can use:

The C language libraries:
  o Buffered
  o Part of the C language
  o The basic unit is a FILE*

The kernel I/O calls:
  o Unbuffered
  o System calls, not part of C
  o The basic unit is a file descriptor

Standard C I/O

As in C++, the fundamental notion used in doing I/O is the stream (but it is not an object as it is in C++; it is a data structure).

When a file is created or opened in C, the system associates a stream with the file. When a stream is opened, the fopen() call returns a pointer to a FILE data structure.

The FILE data structure contains all of the information necessary for the I/O library to manage the stream:
  * a file descriptor
  * a pointer to the I/O buffer
  * error flags
  * etc.

The original C I/O library was written around 1975 by Dennis Ritchie. Little has changed since then.

Standard Streams

Three streams are predefined and available to a process. These standard streams are referenced through the predefined FILE pointers stdin, stdout, and stderr. These pointers are defined in <stdio.h>.

Buffering I/O

One of the keys of the C I/O library is that I/O is normally buffered to minimize the number of read and write system calls, and with them the context switches.

Fully buffered: I/O takes place when a buffer is full. Disk files are normally fully buffered. The buffer is allocated by the I/O library itself by doing a malloc.

Line buffered: I/O takes place when a newline character is encountered. Line buffering is used for terminal I/O. Note that I/O may take place before a newline character is encountered, because the buffer has a fixed size and can fill up first.

Unbuffered: No buffering is done. Data is output immediately.

Most Unix systems default to the following:
  * Standard error is always unbuffered.
  * Streams referring to a terminal device are line buffered (stdin and stdout).
  * All other streams are fully buffered.

Flushing a Stream

You can force a stream to be flushed (all unwritten bytes are passed to the kernel):

    #include <stdio.h>
    int fflush (FILE *fp);

I've not seen an issue in Windows, but in Unix you may not see output when you expect to if you don't flush the buffers.

fopen

    #include <stdio.h>
    FILE *fopen (const char *filename, const char *mode);

Return value: a pointer to the FILE structure holding the internal state information about the connection to the associated file. Returns a NULL pointer if the open fails.

filename: path name of the file to be opened.

mode: one of the following strings:

    Mode     Meaning
    "r"      open text file for reading
    "rb"     open binary file for reading
    "w"      open text file for writing (truncate)
    "wb"     open binary file for writing (truncate)
    "a"      open text file for writing (append)
    "ab"     open binary file for writing (append)
    "r+"     open text file to read and write (file must exist)
    "rb+"    open binary file to read and write (file must exist)
    "w+"     open text file to read and write (truncate)
    "wb+"    open binary file to read and write (truncate)
    "a+"     open text file to read and write (append)
    "ab+"    open binary file to read and write (append)

When a file is opened for reading and writing:
  * input cannot immediately follow output without an intervening fflush, fseek, fsetpos, or rewind.
  * output cannot immediately follow input without an intervening fseek, fsetpos, or rewind.

Opening a Stream

    Restriction                          r    w    a    r+   w+   a+
    file must already exist              *              *
    previous contents are discarded           *              *
    stream can be read                   *              *    *    *
    stream can be written                     *    *    *    *    *
    stream can only be written at end              *              *

You cannot set the permissions of a file that is created by opening it with w or a.

Example of using fopen

    FILE *in;

    if ((in = fopen("file1.txt", "r")) == NULL)
        perror("could not open file1.txt");

Related Calls

    FILE *freopen (const char *pathname, const char *mode, FILE *fp);

Opens a specified file on a specified stream. Closes the stream first, if it is already open. Most typically used with stdin, stdout, and stderr to open a file as one of these streams.

    FILE *fdopen (int filedes, const char *mode);

Takes a file descriptor as a parameter and associates an I/O stream with the descriptor. Used with pipes and network connections, because these use file descriptors.

fclose

    #include <stdio.h>
    int fclose (FILE *stream);

Returns zero if the close is successful. Otherwise it returns EOF.

All files are closed when the program terminates normally, but this allows no opportunity to do error recovery if termination is not normal. Therefore, it is recommended that all files be closed explicitly.

Binary I/O

Binary I/O is commonly used to read or write arrays or to read and write structures, because both deal with fixed-size blocks of information.

Note: Binary files are not necessarily interchangeable across systems!
  * compilers change how data is packed
  * binary formats are different on different CPU architectures

Unformatted I/O

There are three types of unformatted I/O:
  * Character at a time
  * Line at a time
  * Direct I/O (fread and fwrite for binary data)

Stream Positioning

For binary files, and for text files on GNU systems:

    #include <stdio.h>
    long ftell (FILE *fp);
    int fseek (FILE *fp, long offset, int whence);
    void rewind (FILE *fp);

ftell returns the current byte offset, or -1L on error.

fseek returns 0 if successful and nonzero on error. The whence argument is one of:
  * SEEK_SET: from the beginning of the file
  * SEEK_CUR: from the current position
  * SEEK_END: from the end of the file

For portability across systems use:

    int fgetpos (FILE *fp, fpos_t *pos);
    int fsetpos (FILE *fp, const fpos_t *pos);

Both return 0 if successful.

The position is passed in the pos parameter. Its type, fpos_t, is an opaque data type defined by the ISO C standard, so it is also available on POSIX systems. The position value in an fsetpos call must have been obtained in a previous fgetpos call.

fread

fread is used to read binary data and text in fixed-size blocks.

    #include <stdio.h>
    size_t fread (void *ptr, size_t size, size_t nblocks, FILE *stream);

  * ptr: address where the first byte is to be stored
  * size: the size of each block or record
  * nblocks: the number of blocks to read
  * stream: the stream to read from

Return value: the number of items read. It could be less than nblocks if there is an error or end of file is reached.

Interpreting Binary Data

If the data that you are reading has some record structure:

    struct record_fmt ... data_buf;

    fread(&data_buf, sizeof(char), sizeof(data_buf), file_handle);

    struct record_fmt {
        int a;
        float b;
        char id[8];
        char pw[8];
    };

(Figure: the raw bits read from the file fill data_buf and are interpreted through the layout of struct record_fmt.)

    printf("%.8s\n", data_buf.id);   /* id may not be null terminated */

fwrite

    #include <stdio.h>
    size_t fwrite (const void *ptr, size_t size, size_t nblocks, FILE *stream);

  * ptr: address of the first byte to write
  * size: the size of each block or record
  * nblocks: the number of blocks to write
  * stream: the stream to write to

Return value: the number of blocks written. If it is not the same as nblocks, some error has occurred.

Character at a Time Input

    #include <stdio.h>
    int fgetc (FILE *stream);

fgetc gets the next character in the stream as an unsigned char and returns it converted to an int. If end of file or an error is encountered, the constant EOF (usually -1) is returned instead. This call is guaranteed to be written as a function.

    int getc (FILE *stream);

Highly optimized; the best function for reading a single character. Usually implemented as a macro.

    int getchar (void);

Equivalent to getc(stdin).

In most implementations, each stream maintains
  * an error flag
  * an end-of-file flag

To distinguish between EOF and an error, call one of the following functions:

    #include <stdio.h>
    int ferror (FILE *fp);

Returns nonzero (true) if the error flag is set; otherwise returns 0.

    int feof (FILE *fp);

Returns nonzero (true) if the end-of-file flag is set; otherwise returns 0.

Clear the flags by calling

    void clearerr (FILE *fp);

After reading a character from a stream, it can be pushed back into the stream.

    #include <stdio.h>
    int ungetc (int c, FILE *fp);

c is the character to push back. Note that it is not required that you push back the same character that you read. You cannot push back EOF.

Implementations are not required to support more than a single character of pushback, so don't count on it.

Character Output

    int fputc (int c, FILE *stream);

fputc converts c to an unsigned char and writes it to the stream. EOF is returned if there is an error.

    int putc (int c, FILE *stream);

Optimized for single-character output.

    int putchar (int c);

Assumes stdout is the output stream.

Line at a Time Input

    #include <stdio.h>
    char *fgets (char *buf, int n, FILE *fp);

Returns buf if successful and NULL on end of file or error.

Reads up through and including the next newline character, but no more than n-1 characters. The buffer is terminated with a null byte. If the line is longer than n-1 characters, a partial line is returned; the buffer is still null terminated. If the input itself contains a null byte, you can't tell.

    char *gets (char *buf);

Warning: gets has been deprecated (and was removed from the language in C11) because it does not allow the size of the buffer to be specified. This allows buffer overflow!

String Output

    #include <stdio.h>
    int fputs (const char *str, FILE *fp);

Writes a null-terminated string to the stream. It does not write the terminating null character. It does not write a newline character. Returns EOF if the function fails.

    int puts (const char *str);

Writes the null-terminated string to standard output, replacing the terminating null character with a newline character. If successful, the function returns a non-negative value. If the function fails, it returns EOF.

I/O Efficiency

Char at a time

    #include <stdio.h>
    #include <stdlib.h>

    int main (void)
    {
        int c;

        while ((c = getc(stdin)) != EOF)
            if (putc(c, stdout) == EOF)
                perror("Error writing output");

        if (ferror(stdin))
            perror("Error reading input");

        exit(0);
    }

At a terminal, end of file is signaled with Ctrl-D.

Line at a time

    #include <stdio.h>
    #include <stdlib.h>

    #define MAXLINE 4096

    int main (void)
    {
        char buf[MAXLINE];

        while (fgets(buf, MAXLINE, stdin) != NULL)
            if (fputs(buf, stdout) == EOF)
                perror("Output Error");

        if (ferror(stdin))
            perror("Input Error");

        exit(0);
    }

For copying a file of 1.5 M bytes in 30,000 lines:

    Function        User CPU       Loop iterations
    fgets, fputs    2.2 seconds    30,000
    getc, putc      4.3 seconds    1.5 M
    fgetc, fputc    4.6 seconds    1.5 M

Formatted Output

    int printf (const char *format-spec, print-data ...);

Writes to stdout.

    int fprintf (FILE *fp, const char *format-spec, print-data ...);
    int sprintf (char *s, const char *format-spec, print-data ...);

sprintf writes to a buffer and appends a null byte at the end.

A format specification has the following format:

    %[flags][width][.precision]type

  * %: introduces the format specification
  * flags:
      -        left align (the default is to right align)
      +        prefix the value with a sign
      0        pad output with zeros
      (blank)  prefix positive values with a blank
  * width: minimum field width. If width is prefixed with 0, zeros are added until the minimum width is reached.
  * precision: digits after the decimal point. This can truncate data.
  * type:
      d   signed decimal integer
      i   signed decimal integer
      u   unsigned decimal integer
      o   unsigned octal integer
      x   unsigned hex integer
      f   double in fixed-point notation
      e   double in exponent notation
      c   single character, an int
      s   a string

Example Format Specification

    "%-10.8f"

  * %    introduces the format specification
  * -    left justify the output
  * 10   the output field is 10 chars wide as a minimum. It is padded if there are fewer characters in the output. Data is never truncated.
  * .8   print 8 digits after the decimal point

Example

    int n = 3;
    double cost_per_item = 3.25;

    printf("Cost of %3d items at $%4.2f each = $%6.2f\n",
           n, cost_per_item, n * cost_per_item);

Output:

    Cost of   3 items at $3.25 each = $  9.75

  * The first field is 3 characters wide; the data is right justified.
  * The second field is 4 characters wide with two characters after the decimal point.
  * The third field is 6 characters wide with 2 characters after the decimal point, right justified.

Formatted Input

    #include <stdio.h>
    int scanf (const char *format-spec, data fields);
    int fscanf (FILE *fp, const char *format-spec, data fields);
    int sscanf (const char *buf, const char *format-spec, data fields);

scanf reads formatted data from stdin into the data fields given in the argument list. Each argument must be a pointer to a variable that corresponds to a type specifier in the format specification.

The format specification can contain:
  * White space characters. A white space character causes scanf to read in, but not store, all consecutive white space characters in the input stream, up to the next non-white space character.
  * Non-white space characters, except the % sign. These cause scanf to read but not store a matching non-white space character. If the character does not match, scanf terminates.
  * A format specification, introduced by %. This causes scanf to read in and convert characters in the input into values of the specified type. The resulting value is assigned to the next data field in the argument list.

Temporary Files

    #include <stdio.h>
    FILE *tmpfile (void);

Creates a temporary file (type wb+) that is automatically deleted when the file is closed or the program terminates.

Sample Program

Write a simple version of the cat command. It takes an optional parameter, a file name.
  * If a file name is given, it copies the file to stdout.
  * If no file name is given, it copies stdin to stdout.

Preliminaries

    #include <stdio.h>                /* header file for I/O */
    #include <stdlib.h>               /* defines NULL and declares exit */

    #define LINELEN 256

    void send_to_stdout (FILE *);     /* function prototype */

C programmers use #define to define constants. It works like a macro: the value 256 gets inserted wherever the name LINELEN appears in the code. There is no type checking!

Main declaration

    int main (int argc, char *argv[]) { ... }

  * argc: the number of arguments on the command line
  * argv: array that contains the command line arguments

Body of main

    int main (int argc, char *argv[])
    {
        FILE *fp;                     /* holds the file handle */

        if (argc == 1)                /* the only argument is the command */
            send_to_stdout (stdin);   /* so copy from stdin */

If there are two command line arguments, the second one is the file name.

        else if (argc == 2)
        {
            if ((fp = fopen(*++argv, "r")) != NULL)
            {
                send_to_stdout (fp);
                fclose (fp);
            }

Handle the error when the file won't open.

            else
            {
                perror("could not open the file");
                exit(1);
            }
        }

Handle the case where there are too many arguments on the command line.

        else
        {
            /* Not a system call failure, so errno is not meaningful here. */
            fprintf(stderr, "Invalid command: too many arguments\n");
            exit(1);
        }
        return 0;
    }

send_to_stdout function

    void send_to_stdout (FILE *fp)
    {
        char line[LINELEN];

        while (fgets (line, LINELEN, fp))
        {
            if (fputs (line, stdout) == EOF)
            {
                perror("Write to stdout failed");
                exit(1);
            }
        }
    }