CPP09 W2 M1 Streams and Files
Learning Objective
- Understand streams
- Understand the stream class hierarchy
- Understand the concepts of stream insertion and extraction
- Use streams for file input and output
- Distinguish between text and binary file input and output
- Write programs for random access of data files using:
- get pointer
- put pointer
- seekg()
- tellg()
- seekp()
- tellp()
Streams and Files
Introduction
In simple terms, a stream is a sequence of bytes. In input operations, the bytes are transferred from a device (a keyboard, a disk drive, a network connection, etc.) to main memory. In output operations the direction is reversed: the bytes are transferred from main memory to a device such as a display screen, a printer, a disk drive, a network connection, or a tape (a file on tape).
Streams
In C++, a stream is a source or destination for a collection of characters. Streams are of two types:
- Output stream
- Input stream
The Stream Class Hierarchy
The stream classes are arranged in a rather complex hierarchy. The figure below shows the arrangement of the most important of these classes.
We've already made extensive use of some stream classes. The extraction operator >> is a member of the istream class, and the insertion operator << is a member of the ostream class. Both of these classes are derived from the ios class. The cout object, representing the standard output stream, which is usually directed to the video display, is a predefined object of the ostream_withassign class, which is derived from the ostream class. Similarly, cin is an object of the istream_withassign class, which is derived from istream. (The _withassign classes belong to older, pre-standard iostream libraries; in standard C++, cout and cin are simply objects of ostream and istream.)
The istream and ostream classes are derived from ios and are dedicated to input and output, respectively. The istream class contains such functions as get(), getline(), read(), and the overloaded extraction (>>) operators, while ostream contains put() and write(), and the overloaded insertion (<<) operators.
The iostream class is derived from both istream and ostream by multiple inheritance. Classes derived from it can be used with devices, such as disk files, that may be opened for both input and output at the same time. Three classes—istream_withassign, ostream_withassign, and iostream_withassign—are inherited from istream, ostream, and iostream, respectively. They add assignment operators to these classes.
The ios class is the granddaddy of all the stream classes, and contains the majority of the features you need to operate C++ streams. The three most important features are the formatting flags, the error-status flags, and the file operation mode.
Stream Insertion and Extraction
Stream Insertion
Stream classes have their own member data and function definitions. Class ostream contains the functions defined for output operations. These operations are called stream insertions. The << operator, called the inserter, performs them. The statement
cout << "Hello World\n";
translates as: the text string to the right of the inserter is to be stored in the stream object on the left. The << operator is overloaded for all the fundamental data types.
Stream Extraction
The opposite of insertion is extraction, which is the fetching of data from an input stream. Input stream operations are defined in the istream class. The >> operator, called the extractor, can accept any fundamental data type. The most important point to note is that the extractor skips leading whitespace (spaces, '\n', and '\t') while extracting any of these data types. cin is the predefined object of the istream class attached to the standard input device, which is usually the keyboard.
User defined Streams
The input and output streams discussed so far dealt only with standard input and output. C++ also provides specific classes that deal with user-defined streams. User-defined streams are in the form of files. In C++ a file is linked to a stream: before a file can be opened, a stream must be obtained. These streams are more powerful than the predefined streams.
File I/O with Streams
Most programs need to save data to disk files and read it back in. Working with disk files requires another set of classes: ifstream for input, fstream for both input and output, and ofstream for output. Objects of these classes can be associated with disk files, and we can use their member functions to read and write to the files.
Referring back to the earlier figure, you can see that ifstream is derived from istream, fstream is derived from iostream, and ofstream is derived from ostream. These ancestor classes are in turn derived from ios. Thus the file-oriented classes derive many of their member functions from more general classes. The file-oriented classes are also derived, by multiple inheritance, from the fstreambase class. This class contains an object of class filebuf, which is a file-oriented buffer, and its associated member functions, derived from the more general streambuf class. You don't usually need to worry about these buffer classes.
The ifstream, ofstream, and fstream classes are declared in the <fstream> header.
Formatted File I/O
In formatted I/O, numbers are stored on disk as a series of characters. Thus 6.02, rather than being stored as a 4-byte type float or an 8-byte type double, is stored as the characters '6', '.', '0', and '2'. This can be inefficient for numbers with many digits, but it's appropriate in many situations and easy to implement. Characters and strings are stored more or less normally.
Writing Data
The following program writes a character, an integer, a type double, and two string objects to a disk file. There is no output to the screen other than a confirmation message. Here's the listing for FORMATO:
// formato.cpp
// writes formatted output to a file, using <<
#include <fstream>                  //for file I/O
#include <iostream>
#include <string>
using namespace std;

int main()
{
    char ch = 'x';
    int j = 77;
    double d = 6.02;
    string str1 = "Kafka";          //strings without
    string str2 = "Proust";         //   embedded spaces

    ofstream outfile("fdata.txt");  //create ofstream object
    if(!outfile)
    {
        cerr << "Error opening file\n";
        return 1;
    }
    outfile << ch                   //insert (write) data
            << j
            << ' '                  //needs space between numbers
            << d
            << str1
            << ' '                  //needs spaces between strings
            << str2;
    cout << "File written\n";
    return 0;
}
When the program terminates, the outfile object goes out of scope. This calls its destructor, which closes the file, so we don't need to close the file explicitly.
There are several potential formatting glitches. First, you must separate numbers (such as 77 and 6.02) with nonnumeric characters. Since numbers are stored as a sequence of characters, rather than as a fixed-length field, this is the only way the extraction operator will know, when the data is read back from the file, where one number stops and the next one begins. Second, strings must be separated with whitespace for the same reason. This implies that strings cannot contain embedded blanks. In this example we use the space character (' ') for both kinds of delimiters. Characters need no delimiters, since they have a fixed length.
You can verify that FORMATO has indeed written the data by examining the fdata.txt file with a text editor, the Windows WORDPAD accessory, or the DOS command TYPE.
Reading Data
We can read the file generated by FORMATO by using an ifstream object, initialized to the name of the file. The file is automatically opened when the object is created. We can then read from it using the extraction (>>) operator.
Here's the listing for the FORMATI program, which reads the data back in from the fdata.txt file:
// formati.cpp
// reads formatted output from a file, using >>
#include <fstream>                  //for file I/O
#include <iostream>
#include <string>
using namespace std;

int main()
{
    char ch;
    int j;
    double d;
    string str1;
    string str2;

    ifstream infile("fdata.txt");   //create ifstream object
    if(!infile)
    {
        cerr << "File cannot be opened\n";
        return 1;
    }
                                    //extract (read) data from it
    infile >> ch >> j >> d >> str1 >> str2;

    cout << ch << endl              //display the data
         << j << endl
         << d << endl
         << str1 << endl
         << str2 << endl;
    return 0;
}
The program displays:
x
77
6.02
Kafka
Proust
Of course the numbers are converted back to their binary representations for storage in the program. That is, the 77 is stored in the variable j as a type int, not as two characters, and the 6.02 is stored as a double.
Character I/O
The character I/O functions such as get() and put() can be used when the programmer also wishes to read or write whitespace characters. So character I/O solves the problem of reading and writing characters and strings that contain whitespace.
The put() and get() functions, which are members of ostream and istream, respectively, can be used to output and input single characters. Here's a program, OCHAR, that outputs a string, one character at a time:
// ochar.cpp
// file output with characters
#include <fstream>                  //for file functions
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string str = "Time is a great teacher, but unfortunately "
                 "it kills all its pupils. Berlioz";

    ofstream outfile("TEST.TXT");   //create file for output
    if(!outfile)
    {
        cerr << "File cannot be opened\n";
        return 1;
    }
    for(size_t j=0; j<str.size(); j++)  //for each character,
        outfile.put( str[j] );          //write it to file
    cout << "File written\n";
    return 0;
}
We can read this file back in and display it using the ICHAR program.
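// ichar.cpp
// file input with characters
#include <fstream>                  //for file functions
#include <iostream>
using namespace std;

int main()
{
    char ch;                        //character to read
    ifstream infile("TEST.TXT");    //create file for input
    if(!infile)
    {
        cerr << "File cannot be opened\n";
        return 1;
    }
    while( infile.get(ch) )         //read until EOF or error
        cout << ch;                 //display each character read
    cout << endl;
    return 0;
}
Testing the result of get() directly, rather than testing the stream before each read, avoids displaying the last character a second time when the final get() fails at end of file.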
Binary Input and Output
All the input and output operations you have seen so far are text or character based. That is, all information is stored in the same format as it would be displayed on the screen.
You can write a few numbers to disk using formatted I/O, but if you're storing a large amount of numerical data it's more efficient to use binary I/O, in which numbers are stored as they are in the computer's memory, rather than as strings of characters. In binary I/O an int is typically stored in 4 bytes, whereas its text version might be "12345", requiring 5 bytes. Similarly, a float is typically stored in 4 bytes, while its formatted version might be "6.02314e13", requiring 10 bytes.
Our next example shows how an array of integers is written to disk and then read back into memory, using binary format. We use two new functions: write(), which ofstream inherits from ostream; and read(), which ifstream inherits from istream. These functions think about data in terms of bytes (type char). They don't care how the data is formatted; they simply transfer a buffer full of bytes from and to a disk file. The parameters to write() and read() are the address of the data buffer and its length. The address must be cast, using reinterpret_cast, to type char*, and the length is the length in bytes (characters), not the number of data items in the buffer. Here's the listing for BINIO:
// binio.cpp
// binary input and output with integers
#include <fstream>                  //for file streams
#include <iostream>
using namespace std;

const int MAX = 100;                //size of buffer
int buff[MAX];                      //buffer for integers

int main()
{
    for(int j=0; j<MAX; j++)        //fill buffer with data
        buff[j] = j;                //(0, 1, 2, ...)
                                    //create output stream
    ofstream os("edata.dat", ios::binary);
    if(!os)
    {
        cerr << "File cannot be opened\n";
        return 1;
    }
                                    //write to it
    os.write( reinterpret_cast<char*>(buff), MAX*sizeof(int) );
    os.close();                     //must close it

    for(int j=0; j<MAX; j++)        //erase buffer
        buff[j] = 0;
                                    //create input stream
    ifstream is("edata.dat", ios::binary);
    if(!is)
    {
        cerr << "File cannot be opened\n";
        return 1;
    }
                                    //read from it
    is.read( reinterpret_cast<char*>(buff), MAX*sizeof(int) );

    for(int j=0; j<MAX; j++)        //check data
        if( buff[j] != j )
        {
            cerr << "Data is incorrect\n";
            return 1;
        }
    cout << "Data is correct\n";
    return 0;
}
The reinterpret_cast Operator
In the BINIO program (and many others to follow) we use the reinterpret_cast operator to make it possible for a buffer of type int to look to the read() and write() functions like a buffer of type char.
is.read( reinterpret_cast<char*>(buff), MAX*sizeof(int) );
The reinterpret_cast operator is how you tell the compiler, “I know you won’t like this, but I want to do it anyway.” It changes the type of a section of memory without caring whether it makes sense, so it’s up to you to use it judiciously. You can also use reinterpret_cast to change pointer values into integers and vice versa. This is a dangerous practice, but one which is sometimes necessary.
Closing Files
So far in our example programs there has been no need to close streams explicitly because they are closed automatically when they go out of scope; this invokes their destructors and closes the associated file. However, in BINIO, since both the output stream os and the input stream is are associated with the same file, edata.dat, the first stream must be closed before the second is opened. We use the close() member function for this. You may want to use an explicit close() every time you close a file, without relying on the stream's destructor. This is potentially more reliable, and certainly makes the listing more readable.
Object I/O
Since C++ is an object-oriented language, it's reasonable to wonder how objects can be written to and read from disk. The next examples show the process.
Writing an Object to Disk
When writing an object, we generally want to use binary mode. This writes the same bit configuration to disk that was stored in memory, and ensures that numerical data contained in objects is handled properly. Here's the listing for OPERS, which asks the user for information about an object of class person, and then writes this object to the disk file PERSON.DAT:
// opers.cpp
// saves person object to disk
#include <fstream>                  //for file streams
#include <iomanip>                  //for setw
#include <iostream>
using namespace std;
////////////////////////////////////////////////////////////////
class person                        //class of persons
{
protected:
    char name[80];                  //person's name
    short age;                      //person's age
public:
    void getData()                  //get person's data
    {
        cout << "Enter name: "; cin >> setw(80) >> name;  //setw keeps input within name[]
        cout << "Enter age: "; cin >> age;
    }
};
////////////////////////////////////////////////////////////////
int main()
{
    person pers;                    //create a person
    pers.getData();                 //get data for person
                                    //create ofstream object
    ofstream outfile("PERSON.DAT", ios::binary);
    if(!outfile)
    {
        cerr << "File cannot be opened\n";
        return 1;
    }
                                    //write to it
    outfile.write(reinterpret_cast<char*>(&pers), sizeof(pers));
    return 0;
}
Here's a sample interaction:
Enter name: Coleridge
Enter age: 62
The contents of the pers object are then written to disk, using the write() function. We use the sizeof operator to find the length of the pers object.
Reading an Object from Disk
Reading an object back from the PERSON.DAT file requires the read() member function. Here's the listing for IPERS:
// ipers.cpp
// reads person object from disk
#include <fstream>                  //for file streams
#include <iostream>
using namespace std;
////////////////////////////////////////////////////////////////
class person                        //class of persons
{
protected:
    char name[80];                  //person's name
    short age;                      //person's age
public:
    void showData()                 //display person's data
    {
        cout << "Name: " << name << endl;
        cout << "Age: " << age << endl;
    }
};
////////////////////////////////////////////////////////////////
int main()
{
    person pers;                    //create person variable
    ifstream infile("PERSON.DAT", ios::binary);  //create stream
    if(!infile)
    {
        cerr << "File cannot be opened\n";
        return 1;
    }
                                    //read stream
    infile.read( reinterpret_cast<char*>(&pers), sizeof(pers) );
    pers.showData();                //display person
    return 0;
}
The output reflects whatever data OPERS placed in the PERSON.DAT file:
Name: Coleridge
Age: 62
Compatible Data Structures
To work correctly, programs that read and write objects to files, as do OPERS and IPERS, must be talking about the same class of objects. With typical compilers, objects of class person in these programs are 82 bytes long: The first 80 are occupied by a string representing the person's name, and the last 2 contain an integer of type short, representing the person's age. If two programs thought the name field was a different length, for example, neither could accurately read a file generated by the other.
Notice, however, that while the person classes in OPERS and IPERS have the same data, they may have different member functions. The first includes the single function getData(), while the second has only showData(). It doesn't matter what member functions you use, since they are not written to disk along with the object's data. The data must have the same format, but inconsistencies in the member functions have no effect. However, this is true only in simple classes that don't use virtual functions.
If you read and write objects of derived classes to a file, you must be more careful. Objects of classes that use virtual functions (as derived classes often do) include a hidden value, typically a pointer to the class's virtual function table, stored alongside the object's data in memory. This value identifies the object's class when virtual functions are called. When you write such an object to disk, this value is written along with the object's other data, and it can change when the class's member functions change or the program is rebuilt. If you write an object of one class to a file, and then read it back into an object of a class that has identical data but a different member function, you'll encounter big trouble if you try to use virtual functions on the object. The moral: Make sure a class used to read an object is identical to the class used to write it.
You should also not attempt disk I/O with objects that have pointer data members. As you might expect, the pointer values won’t be correct when the object is read back into a different place in memory.
I/O with Multiple Objects
The OPERS and IPERS programs wrote and read only one object at a time. Our next example opens a file and writes as many objects as the user wants. Then it reads and displays the entire contents of the file. Here's the listing for DISKFUN:
// diskfun.cpp
// reads and writes several objects to disk
#include <fstream>                  //for file streams
#include <iomanip>                  //for setw
#include <iostream>
using namespace std;
////////////////////////////////////////////////////////////////
class person                        //class of persons
{
protected:
    char name[80];                  //person's name
    int age;                        //person's age
public:
    void getData()                  //get person's data
    {
        cout << "\n Enter name: "; cin >> setw(80) >> name;  //setw keeps input within name[]
        cout << " Enter age: "; cin >> age;
    }
    void showData()                 //display person's data
    {
        cout << "\n Name: " << name;
        cout << "\n Age: " << age;
    }
};
////////////////////////////////////////////////////////////////
int main()
{
    char ch;
    person pers;                    //create person object
    fstream file;                   //create input/output file
                                    //open for append
    file.open("GROUP.DAT", ios::app | ios::out | ios::in | ios::binary );
    if(!file)
    {
        cerr << "File cannot be opened\n";
        return 1;
    }
    do                              //data from user to file
    {
        cout << "\nEnter person's data:";
        pers.getData();             //get one person's data
                                    //write to file
        file.write( reinterpret_cast<char*>(&pers), sizeof(pers) );
        cout << "Enter another person (y/n)? ";
        cin >> ch;
    }
    while(ch=='y');                 //quit on 'n'

    file.seekg(0);                  //reset to start of file
                                    //read first person
    file.read( reinterpret_cast<char*>(&pers), sizeof(pers) );
    while( !file.eof() )            //quit on EOF
    {
        cout << "\nPerson:";        //display person
        pers.showData();
                                    //read another person
        file.read( reinterpret_cast<char*>(&pers), sizeof(pers) );
    }
    cout << endl;
    return 0;
}
Here's a sample interaction, run when the file already holds two person objects from an earlier run:
Enter person's data:
 Enter name: McKinley
 Enter age: 22
Enter another person (y/n)? n

Person:
 Name: Whitney
 Age: 20
Person:
 Name: Rainier
 Age: 21
Person:
 Name: McKinley
 Age: 22
Here one additional object is added to the file, and the entire contents, consisting of three objects, are then displayed.
The fstream Class
So far in this chapter the file objects we have created have been for either input or output. In DISKFUN we want to create a file that can be used for both input and output. This requires an object of the fstream class, which is derived from iostream, which is derived from both istream and ostream so it can handle both input and output.
The open() Function
In previous examples we created a file object and initialized it in the same statement:
ofstream outfile("TEST.TXT");
In DISKFUN we use a different approach: We create the file in one statement and open it in another, using the open() function, which is a member of the fstream class. This is a useful approach in situations where the open may fail. You can create a stream object once, and then try repeatedly to open it, without the overhead of creating a new stream object each time.
The Mode Bits
We've seen the mode bit ios::binary before. In the open() function we include several new mode bits. The mode bits, defined in ios, specify various aspects of how a stream object will be opened.
The mode bits for the open() function are:
- in Open for reading (default for ifstream)
- out Open for writing (default for ofstream)
- ate Start reading or writing at end of file (AT End)
- app Start writing at end of file (APPend)
- trunc Truncate file to zero length if it exists (TRUNCate)
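- nocreate Error when opening if file does not already exist (pre-standard libraries only; not part of standard C++)
- noreplace Error when opening for output if file already exists, unless ate or app is set (pre-standard libraries; standard C++ gained a noreplace flag only in C++23)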
- binary Open file in binary (not text) mode
We write one person object at a time to the file, using the write() function. When we’ve finished writing, we want to read the entire file. Before doing this we must reset the file’s current position. We do this with the seekg() function, which we’ll examine in the next section. It ensures we’ll start reading at the beginning of the file. Then, in a while loop, we repeatedly read a person object from the file and display it on the screen. This continues until we’ve read all the person objects—a state that we discover using the eof() function, which returns the state of the ios::eofbit.
File Pointers
Each file object has associated with it two integer values called the get pointer and the put pointer. These are also called the current get position and the current put position, or simply the current position when it's clear which one is meant. These values specify the byte number in the file where writing or reading will take place. (The term pointer in this context should not be confused with normal C++ pointers used as address variables.)
Often you want to start reading an existing file at the beginning and continue until the end. When writing, you may want to start at the beginning, deleting any existing contents, or at the end, in which case you can open the file with the ios::app mode specifier. In these cases the way the file is opened positions the stream for you, so no manipulation of the file pointers is necessary. However, there are times when you must take control of the file pointers yourself so that you can read from and write to an arbitrary location in the file. The seekg() and tellg() functions allow you to set and examine the get pointer, and the seekp() and tellp() functions perform these same actions on the put pointer.
Resources
PPT: Chapter 14, File Processing, from C++ How to Program by Deitel and Deitel
- 14.1 Introduction
- 14.2 The Data Hierarchy
- 14.3 Files and Streams
- 14.4 Creating a Sequential-Access File
- 14.5 Reading Data from a Sequential-Access File
- 14.6 Updating Sequential-Access Files
- 14.7 Random-Access Files
- 14.8 Creating a Random-Access File
- 14.9 Writing Data Randomly to a Random-Access File
- 14.10 Reading Data Sequentially from a Random-Access File
- 14.11 Example: A Transaction-Processing Program
- 14.12 Input/Output of Objects
- http://www.arachnoid.com/cpptutor/student3.html ---> Output Formatting
- http://www.cplusplus.com/doc/tutorial/files/ ---> Input/Output with files
- http://www.cplusplus.com/reference/iostream/ios_base/ ---> ios_base
- http://www.cplusplus.com/reference/iostream/ios/ ---> ios
- http://www.cplusplus.com/reference/iostream/ ---> IO Stream Library
- http://www.cplusplus.com/reference/iostream/fstream/ ---> fstream
Steps to solve the problem sets
- Step 1: Read about Streams and Files given
- Step 2: Try the sample programs given (you can copy and paste the programs and try them yourself).
- Step 3: Go through the PPT on file processing (you can copy and paste the programs and try them yourself).
- Step 4: Go through the additional resources given.
- Step 5: Attack the problem sets in the given order (Problem Set A, then Problem Set B, and so on).
- Step 6: If you are not clear about the problem sets, contact your mentor for clarification.
Try the programs given in the PPT for Random access files.
Problem Sets
Problem Set A
1. Write a C++ program that uses a binary file, cricket.binary. The binary file holds the statistics of Indian cricket players. The binary file is divided into 2 parts.
- 1st part(Header part) is the header, having information of number of records in the file.
- 2nd part(Data part) contains the data of the players.
The layout of the file is
- Header Part is of 4 Bytes( Size of Integer) -- Number of records(N).
- Data Part is of N * Size of Player Bytes -- N Records.
//Player Class
class Player
{
private:
    int id;         //#ID -- Unique ID given to each player and starts from 1.
    char name[50];  //#Name -- Name of the Player
    int career;     //#Career -- Career Started year
    int matches;    //#Matches -- Number of Matches Played
    int runs;       //#Runs -- Total runs scored
    int highest;    //#Highest -- Highest score of the player
    int r50;        //#50's -- Number of half centuries of the player.
    int r100;       //#100's -- Number of Centuries of the player.
    float average;  //#Average -- Average score of the player. (Total Runs/Number of Matches played)
public:
    //Declare the implementation-specific functions here and define them.
};
The program should display a menu with the following options:
- Print all player details
  - Display the details of all the players in a tabular format.
- Print individual player details
  - Display the individual player details based on the ID entered by the user.
- Update a player record for the match
  - The user enters the ID of the player to be updated.
  - If the player is not found, print "Player not found" and redisplay the menu.
  - Display the player details.
  - Ask for the runs scored in the most recent match played by the player.
  - Update the record of the entered player in the same file with the following details:
    - matches: incremented by one
    - runs: runs + runs scored
    - highest: compare the highest score with the runs scored and keep whichever is higher
    - r50: if the runs scored is a half century, increment r50 by 1
    - r100: if the runs scored is a century, increment r100 by 1
    - average: recalculate the average with the runs scored
- Add a new player
  - Generate a new ID for the player (the ID will be the next highest ID of all the records).
  - Ask the user for all the details of the player except the ID.
  - Update the number of players in the file.
  - Append the new player details at the end of the file.
- Quit
The program should open the binary file in Random Access Mode for reading and writing. Use seekg() to move the cursor position in the file.
You are not supposed to use Array of Players or Dynamic memory allocation.
You can use the ID to seek to the location of a record in the file. Since IDs start from 1, the record for a given ID begins at 4 bytes + ((ID - 1) * size of Player) from the start of the file.
The data file should be passed as command line argument.
A sample program cricket_read.cpp is given.
- Compile the cricket_read.cpp
- Execute the program as $> cricket_read.exe in the directory where you saved the binary file.
- Prints all the player details