Wed May 13 17:28:44 PDT 2009

Contents


    CSci202 Computer Science II, Session 11 Streams


      (previous): Review of formatted I/O and the arguments to main: [ 10.html ]

      Preparation -- 15 Stream Input/Output 776

      1. 15.1 Introduction 777
      2. 15.2 Streams 778
      3. 15.2.1 Classic Streams vs. Standard Streams 779
      4. 15.2.2 iostream Library Header Files 779
      5. 15.2.3 Stream Input/Output Classes and Objects 779
      6. 15.3 Stream Output 782
      7. 15.3.1 Output of char * Variables 782
      8. 15.3.2 Character Output Using Member Function put 782
      9. 15.4 Stream Input 783
      10. 15.4.1 get and getline Member Functions 784
      11. 15.4.2 istream Member Functions peek, putback and ignore 787
      12. 15.4.3 Type-Safe I/O 787
      13. 15.5 Unformatted I/O Using read, write and gcount 787
      14. 15.6 Introduction to Stream Manipulators 788
      15. 15.6.1 Integral Stream Base: dec, oct, hex and setbase 789
      16. 15.6.2 Floating-Point Precision (precision, setprecision) 790
      17. 15.6.3 Field Width (width, setw) 791
      18. 15.6.4 User-Defined Output Stream Manipulators 793
      19. 15.7 Stream Format States and Stream Manipulators 794
      20. 15.7.1 Trailing Zeros and Decimal Points (showpoint) 795
      21. 15.7.2 Justification (left, right and internal) 796
      22. 15.7.3 Padding (fill, setfill) 798
      23. 15.7.4 Integral Stream Base (dec, oct, hex, showbase) 799
      24. 15.7.5 Floating-Point Numbers; Scientific and Fixed Notation (scientific, fixed) 800
      25. 15.7.6 Uppercase/Lowercase Control (uppercase) 800
      26. 15.7.7 Specifying Boolean Format (boolalpha) 802
      27. 15.7.8 Setting and Resetting the Format State via Member Function flags 803
      28. 15.8 Stream Error States 804
      29. 15.9 Tying an Output Stream to an Input Stream 807
      30. 15.10 Wrap-Up 807
      Here is an example based on Horstmann's book [ tss2004.cpp ]

      Assigned Work Due -- Send me a Question

      Input

        Looking inside files on UNIX -- file wc cat od

        When working with binary data you need tools to look at it when things go wrong with your code and the file is in a mess.

        Download [ example ] (save it as source) and then try these commands.

        The file command will make a good guess at the kind of file you have:

         	file example

        The wc command will count the lines, words, and characters in a file:

         	wc -l example
         	wc -w example
         	wc -c example

        The cat command outputs text files to the terminal (among other things).

         	cat example
        For long files use more to see the file one screen at a time, and tap space for the next page, and 'q' to quit.

        The od command will tell you precisely what is in a file -- character by character or byte by byte. It makes unprintable and "white space" characters explicit.

         	od -c example
        (Characters)
         	od -x example
        (hexadecimal bytes)
         	od -cx example
        (both).

        My Experience of Binary files

        Use unformatted binary files only when you need something special: security, efficiency, or to interface with existing data. It is nearly always easier to develop applications using normal line-oriented formatted files. Why? Because you can edit text files with normal editors! When you have unformatted data you have to write code to create the data, to delete the data, to view the data, and also to edit it. This was confirmed when I developed the code for lab06!

        Here is an example -- a program to read any 10 characters from a file: [ 11peek.cpp ] (Download and try it.... there may be bugs).

        Unformatted data is efficient because our program can go directly to any place in the file: direct access, or in IBM-speak "DASD" (dazz-dee). We have seek functions to move to a given character in a random access file and tell functions to find out where the file is currently positioned. Call them like this:

         		fstreamvar.tellp();
        Returns the position where characters will be put.
         		fstreamvar.seekp(position);
        Go to position to put characters out. The book shows the other versions: all are useful.

         		fstreamvar.tellg();
        Position for getting characters in.
         		fstreamvar.seekg(position);
        Go to position to get characters in. The book shows the other versions: all are useful.

        Demonstration of seek and tell with formatted data [ 11seek.cpp ]

        But to be able to use direct access, your program must be able to calculate where the data is.

        Start by designing the layout of the data in the unformatted file. Draw a simple picture. Tabulate the data and calculate where it is in a simple case. Work out formulas that calculate where the general data is and hence code for reading and writing it. Document any special constraints. For example the file in lab06 must always start with zero or more real records followed by zero or more blank records. Another example would be to require that the file is always sorted in a particular way. A third trick is to place data near a position calculated by a hash function, that way it is quick to find it again.

        It is also wise not to mix formatted input/output with unformatted input/output. As a rule, use 'read' and 'write' to handle the data. Stick to the following:

         		fstreamvar.read(reinterpret_cast<char*>(&object), sizeof(object));
         		fstreamvar.write(reinterpret_cast<const char*>(&object), sizeof(object));
        These copy bytes into and out of primary memory. Read copies them from disk into RAM and write does the reverse. No transformation occurs, the bytes are copied as is. [ passwd.dat ] (file) [ 11read.cpp ] (read file) [ 11write.cpp ] (change file).

        Another rule that preserves sanity: never, ever switch from reading to writing without first executing a seekp. Similarly when you have written some data and want to read from a direct access file, always take care to do a seekg before reading. However you can safely execute a series of reads (or a series of writes) and get each item from the file in turn.

        A direct access file is a repository for pieces of RAM. We can store them, keep them and then read them back in again. But when read they are not necessarily anywhere near their old address.

        Never store pointers or pointer-based data in an unformatted file. This means that you cannot store any of the Standard Library classes (vectors, strings, deques, etc.) in objects that are written to and read from unformatted files.

        It is better to use a data base to handle persistent data than invent your own data format for a direct access file!

        More Experience with Random Access Files

        The fifth lab for this course worked nicely last time I taught this course. But in 2007 the programs started to misbehave -- badly. First, the compiler started whining about adding a "sizeof" to a "streampos". Secondly, the listing program started to output garbage and terminated with a "Segmentation" fault. I checked the old compiled code and it worked. But the moment I recompiled the program it went badly wrong -- accessing the 123,456th character in the file (or some such silly number).

        Reason: the updates to the Gnu compiler changed the internal coding of objects like the user password in the lab. I guess that it added typeid information at the start of the object. So, what my code thought was a length was something else....

        The solution was to reconstruct the "passwd" file from scratch using newly compiled programs throughout.

        Now, I'm hoping that they do not upgrade the compiler before the next lab.

        So, add another reason for avoiding direct access binary files. The format of the data is compiler dependent.

        stringstream and strstream -- real cool stuff

        [ tss2004.cpp ]

        Handy for converting an argument to the main program into a numeric value. Also useful for interpreting a buffer of direct access data, or placing numeric data into character format in a direct access file.

      Chapter 15 pages -- streams

      what do we use input streams for?

      Getting data into the computer.

      You have been using streams since the first "Hello, World" program!

      Chapter 15 pages 777-807 -- Stream Error States

      Can you please explain in what situation we would use Stream Error States and elaborate on what their functionality is?

      Things go wrong when we try to use streams. The user does not supply the data we need. For example, we ask for a month and they type "Dec" when we want "12". Or they don't supply any data at all and just terminate the input.

      When these things happen the stream enters what is called an "error" state. The stream libraries provide functions that can test the state of the stream and report it.

      Chapter 15 pages 784-785 -- Stream Input/Output State Bits

      Could you give a better explanation of "state bits?"

      A bit has two values: 1 and 0, true and false. The state of the stream is described by a collection of bits -- in a byte, I guess.

      To decode the state you either need to use binary or the ready made functions. Example code [ iostates.cpp ] [ iostates2.cpp ]

      I do anything I can to avoid using them.

      Chapter 15 pages 779 -- Streams

      Can you explain the difference between Classic Streams and Standard Streams?

      The so-called classic streams input and output single-byte characters, which these days means ASCII. Or perhaps IBM still uses EBCDIC? The standard library, however, can also handle the newer Unicode characters through wide-character streams. Unicode attaches a unique number to just about every character used in any language in the world; it began as a 16-bit code and has since grown beyond 16 bits.

      Never forget that ASCII is an American standard. Even countries that speak English and use the English alphabet will use special symbols with special meanings -- for example the "#" in ASCII is the symbol for a Pound Sterling in England.

      Variations on ASCII are used in many languages in Europe -- but, again, some symbols have different glyphs -- extra accents and letters in the alphabet.

      This is enough for CSci202 -- we will stick with ASCII.

      Chapter 15 pages 779-783 -- Stream

      What are streams used for?

      Transmitting data into and out of programs. Also for sending data between programs. Finally, some streams can be used to format and parse data inside a program.

      It is all about manipulating data.

      Can you discuss cerr, clog, eof, get, and put and show examples using them?

      cerr and clog are next.

      Be careful to distinguish "EOF" (a constant) and "eof()" a member function.

      Example: [ tio2009.cpp ]

      Chapter 15 pages 780 -- Input/Output Classes and Objects

      What are the differences and similarities between iostream and ostream? When is it necessary to use them?

      Use ostream only when you have a program with output but no input. iostream has both istream (input) and ostream (output) in it.

      Chapter 15 pages 781 -- cerr and clog

      Can you elaborate more on the usage of cerr and clog?

      They are both used to report errors. The stream cerr goes directly to the person/console/terminal that ran the program. There is no delay. The stream clog is buffered: it accumulates data and outputs it when the buffer fills, when the program ends normally, or when you call "clog.flush()". Using this is OK for non-urgent records of what is going on (a log). Use cerr for all urgent messages. Otherwise you may have a program that dies before it reports the problem. Very confusing.

      Chapter 15 pages 782 -- memory address

      Why would different compilers display different memory addresses for a pointer?

      Because the compiler chooses how much space to give to each variable. How many bytes are used for an 'int'? How many bytes for a vector? It also can organize the data in any way it likes. Once compilers would leave spaces between some types of data to get faster performance. Or it (normally) packs data in tightly with no gaps.

      Basically there is nothing to stop the programmers who write the compilers doing anything they want with addresses!

      Chapter 15 pages 784-786 -- get and getline function

      Can you further explain the get and getline functions? Are they used to get information from another file of any type?

      All files look like a stream or array of bytes in C++.

      "get" is an overloaded member function that reads characters: one character at a time, or a run of characters into a character array, stopping at a delimiter (by default the end of the line).

      "getline" assumes that you have a text file with lines that are not too long.... It transfers the whole line (if possible) into a character array, or (safer) into a C++ string.

      To input into an array of characters we use

       		input.getline(char* buffer, int MAXLENGTH)
      If the line is longer than MAXLENGTH it is difficult to get the rest of the line.

      To input from a stream into a C++ string use

       		getline(aStream, aStringVariable);
      not
       		aStream.getline(aStringVariable);
      Example [ outin.cpp ]

      Chapter 15 pages 785 -- cin and cin.get

      In fig 15.5, how does cin get the string "Contrasting" while the cin.get get the string "string input with cin and cin.get" from the sentence "Contrasting string input with cin and cin.get"?

      The first input is

      		cin>> buffer1;
      which stops just before the first space that it meets. This is how ">>" is defined to work.

      The next input is

       		cin.get(buffer2, SIZE);
      which treats spaces as part of the input and reads until the end of line. Notice that it even reads the space that was not read by the previous ">>" operator.

      By the way.... inputting data into a char[...] is a recipe for an unreliable and insecure program. Using the C++ "string" is safer and more secure.

      Chapter 15 pages 787 -- stream input

      what is the proper type safety for stream input

      In the old days of C you could input a decimal number and store it in a char* character string. It did not check that the format you specified matched the variables you gave the function.... C++ uses the type of the variable to specify the format of the input data.... int's are input in decimal and so on....

      Type safety is provided by the compiler. It stops you writing code that does stupid things. This is important because input comes from users who tend to do unexpected things. As a result the input operators either get something of the right type from the input, or they enter an error state.

      Chapter 15 pages 788-794 -- stream manipulators

      Is there a single library that contains all stream manipulators or do you need to simply list each one you are using?

      You can get away with "iostream" for normal cin/cout/cerr/clog work and "fstream" (later) to handle files. Manipulators that take an argument, such as setw and setprecision, also need "iomanip".

      Chapter 15 pages 792 -- Field width

      Could you better explain fig. 15.10?

      This is a nice example of evil programming -- the clever use of features to produce hard-to-explain output.

      By setting the width of cin to 5 on line 14 and line 21 we get no more than 4 characters per input (the fifth place is kept for the terminating null character)... and if a space comes along we get fewer characters in.

      Meanwhile the cout width starts at 4 and grows, so the input (1, 2, 3, or 4 characters) appears in an increasing piece of the output line. Character strings are, by default, right justified and filled (if needed) with spaces. So the output seems to move across the page.

      will stream manipulators dec, oct, and hex ever come in handy?

      Not in CS202. But useful if you do any system programming or computer engineering.

      Chapter 15 pages 793 -- User-Defined Output Stream Manipulators

      In the following function header, why are the & necessary?
       ostream& bell( ostream& output ) ...

      The "&" indicates passing the ostream by reference not by value.

      The << and >> operators expect to be given references to the streams, not a copy of the stream.

      Further -- a copy of a stream may not do what you expect.

      Finally -- "large" data structures (and a stream probably has a large buffer inside it) should always be passed by reference.

      Chapter 15 pages 788-794 -- Stream Manipulators

      What are stream manipulators and how do they work?

      They are commands that you write after a "<<" or a ">>" which change the behavior of the input and output. Typically they change the format.

      C++ provides too many stream manipulators. It is worth putting a sticky note on a page in a C++ book that lists them. I don't expect you to memorize them. I do expect you to use them in labs and projects.

      Chapter 15 pages 800 -- Tying an output stream to an input stream

      What is tying output streams to input streams used for?

      To make sure that prompts are sent to the user before you try to read the response to the prompt.

      Luckily 'cin' and 'cout' are already tied.

      Chapter 15 pages 807 -- Tying

      Is tying useful? Would we ever need to use it if we just put the cout << prompt before the cin >> prompt?

      I have never used it, or heard of it being used. It is not part of CS202.

      You need it when an input stream and an output stream are connected to a single place or person, and need to have a conversation.

      With typical luck.... you will need it in your first paid program:-(

      Chapter 15 pages 807 -- graphing

      Can you stream a graph? If not, how would you output graph coordinates?

      The standard libraries don't handle graphic windows. To do interactive graphics needs a different kind of programming. However you can fake graphics using "cout". The resolution is not good and animation needs the old C "curses" library.

      One other option is to use a special library developed by "Horstmann" for his C++ books. Here [ stickman.cpp ] is an example of code that produces graphics -- if you can figure out where the special library is kept. My Q script seems to do this well.

      Here [ ../cs201/ccc.html ] , [ ../cs201/ccc.gif ] , and [ ../cs201/grid.gif ] is some documentation for using these classes.

      By the way ... don't try to do graphics from a remote MS Windows machine. It is possible with UNIX computers.... they run the X Windows system which assumes you are running on the Internet and can export to remote display.... but the process is arcane... More.... JBH3-1 does not have the special libraries for Horstmann or XWindows.

      Exercises

      Depends on the questions!

    . . . . . . . . . ( end of section CSci202 Computer Science II, Session 11 Streams)

    Laboratory 6 -- Streams and Counting

    [ lab06.html ]

    Next -- Exceptions -- Chapter 16 -- Project 3 -- Quiz 3

    See [ 12.html ] Project 3 is due in. There will be a quiz on polymorphism, template functions, streams, and the project.

    Abbreviations

  1. Algorithm::=A precise description of a series of steps to attain a goal, [ Algorithm ] (Wikipedia).
  2. class::="A description of a set of similar objects that have similar data plus the functions needed to manipulate the data".
  3. Data_Structure::=A small data base.
  4. Function::programming=A self-contained and named piece of program that knows how to do something.
  5. Gnu::="Gnu's Not Unix", a long running open source project that supplies a very popular and free C++ compiler.
  6. KDE::="K Desktop Environment".
  7. object::="A little bit of knowledge -- some data and some know how", an instance of a class.
  8. OOP::="Object-Oriented Programming", Current paradigm for programming.
  9. Semantics::=Rules determining the meaning of correct statements in a language.
  10. SP::="Structured Programming", a previous paradigm for programming.
  11. STL::="The standard C++ library of classes and functions" -- also called the "Standard Template Library" because many of the classes and functions will work with any kind of data.
  12. Syntax::=The rules determining the correctness and structure of statements in a language, grammar.
  13. Q::software="A program I wrote to make software easier to develop",
  14. TBA::="To Be Announced", something I should do.
  15. TBD::="To Be Done", something you have to do.
  16. UML::="Unified Modeling Language".
  17. void::C++Keyword="Indicates a function that has no return".

End