Posted on Saturday, October 27, 2007 at 9:27 PM
The next program counts the words in a file. K&P define a word as "a sequence of any characters except blanks, tabs and newlines." Since we are not using the NEWLINE character we'll redefine it as "a sequence of any characters except blanks, tabs, line feeds and carriage returns."
The algorithm used by K&P is, in my opinion, too complex. It uses a flag, inword, to keep track of whether the file pointer is "in a word", tests that flag, and then changes it, or not, depending on the results of that test.
Consider TAB, LF, CR and BLANK. Let’s call these delimiters. The signal to increment the count is a transition from a delimiter to a non-delimiter. Thus, we need to keep track of the previous character in the stream. The problem of "what was the previous character when we read the first character" is trivial. We need only assign it some value. That value must be a delimiter; otherwise the first word in the file will not get counted.
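The transition rule described above can be sketched as follows. This is a minimal sketch, not necessarily the listing from the post: the program name WordCnt, the helper IsDelim and the variable Prev are names assumed here for illustration.

```pascal
Program WordCnt ;
{
  count words in standard input
  (a sketch of the delimiter-transition idea; names are assumed)
}
Const
  TAB   = #9 ;
  LF    = #10 ;
  CR    = #13 ;
  BLANK = ' ' ;
Var
  Count    : LongInt ;
  Ch, Prev : Char ;

{ true if C is one of the four delimiters }
function IsDelim (C : Char) : Boolean ;
begin
  IsDelim := C in [TAB, LF, CR, BLANK]
end ;

begin { WordCnt }
  Count := 0 ;
  Prev := BLANK ; { a delimiter, so the first word gets counted }
  while NOT eof do begin
    Read (Ch) ;
    { a word starts where a delimiter is followed by a non-delimiter }
    if IsDelim (Prev) AND NOT IsDelim (Ch) then
      Count := Count + 1 ;
    Prev := Ch
  end ;
  WriteLn (Count:0)
end. { WordCnt }
```

Because Prev starts out as BLANK, a file that begins with a non-delimiter registers a transition on its very first character, and no separate "in a word" flag is needed.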
...
Posted on Saturday, October 27, 2007 at 3:59 PM
Before proceeding with the next program let’s skip forward to the index of the book to page 338. Here K&P list the code for their "wrapper." The wrapper is a program which defines constants, types, functions and procedures that are used by their other programs. Here is the explanation of why K&P’s programs are not programs at all but procedures. One implements a program such as charcount by including it in the wrapper and calling it from the wrapper. The wrapper for UCSD Pascal, probably the closest pre-Turbo implementation to Turbo Pascal, is given on pages 338 .. 341. Turbo Pascal’s units, i.e., its provision for separate compilation, make the rigmarole of the wrapper superfluous, although it was probably a good strategy in 1981, a couple of years before Turbo Pascal made the scene.
Instead of a wrapper we’ll create a Turbo Pascal unit called Tools. We’ll start with only what we need and expand it as we go. In anticipation of what we’ll need to write our next program here is our first cut of Tools...
Tags: unit, wrapper
Posted on Wednesday, October 24, 2007 at 7:59 PM
Ascii writes all the classic ascii characters to standard output or, if redirected, to a file. If run without redirection the program seems to perform as expected. We get all the ascii characters including a line feed, chr(10), and a carriage return, chr(13). However, when we redirect the output to ascii.all and then display ascii.all using type we get considerably less output.
DOS displays various "non printing" characters as various symbols. Chr(1) is a smiley face. Chr(2) is a reverse smiley face. The next four characters, chr(3) .. chr(6), are the four suits of a deck of cards, etc.
The output of ascii.all via type is two lines. The first is short, ending with a diamond. Other characters do not print but actually do something. Chr(7) gives us a beep. Chr(8) backspaces.
Of particular interest is chr(10), a line feed, which causes the cursor to drop down a line, and chr(13), a carriage return, which sends the cursor to the left edge of the screen causing subsequent characters to overwrite whatever was there...
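One way to see every character without the control characters acting on the screen is to print their codes instead. The following is a hypothetical variant of Ascii (the name AsciiDmp and the <n> notation are assumptions made for illustration, not part of the original post) that writes chr(0) .. chr(31) as numbers in angle brackets and everything else as is.

```pascal
Program AsciiDmp ;
{
  outputs the ascii chars, showing control chars as <n>
  (a hypothetical variant of Ascii, for illustration only)
}
Var
  i : Byte ;
begin { AsciiDmp }
  for i := 0 to 127 do
    if i < 32 then
      Write ('<', i, '>')   { control character: show its code }
    else
      Write (Chr(i)) ;      { everything else: write it unchanged }
  WriteLn
end. { AsciiDmp }
```

Since no raw line feed, carriage return, backspace or bell reaches the output, the result should look the same on the screen and in a redirected file.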
Posted on Tuesday, October 23, 2007 at 11:24 PM
This may seem like a digression but it’s not. I have a point to make regarding our progress up to now.
Consider the following program.
Program Ascii ;
{
  outputs all the ascii chars ... maybe
}
Var
  i : Byte ;
begin { Ascii }
  for i := 0 to 127 do
    Write (Chr(i))
end. { Ascii }
Compile and run the program. What is the output?
Now redirect the output to a file, then use the type command to list the output.
ascii > ascii.all
type ascii.all
Why is the output different? Answer tomorrow.
Posted on Monday, October 22, 2007 at 5:50 PM
The next program counts the number of lines in a text file. My version takes advantage of the fact that ReadLn without a parameter list will move the file pointer past the next CR,LF (new line). Thus the program does not even have to know about the intervening chars. The result is a shorter and more elegant program.
Program LineCnt ;
{
  count lines in standard input
}
Var
  Count : LongInt ;
begin { LineCnt }
  Count := 0 ;
  while NOT eof do begin
    ReadLn ;
    Count := Count + 1
  end ;
  WriteLn (Count:0)
end. { LineCnt }
I declare Count as a LongInt, a type that K&P did not have access to. I also used LongInt as the counter in CharCnt. It allows the programs to work with much larger files.
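To make "much larger" concrete: in Turbo Pascal, Integer is a 16-bit type, while LongInt is 32 bits. A small sketch that prints the two limits, assuming a Turbo Pascal version that provides the predefined constants MaxInt and MaxLongInt (the program name Ranges is made up for this example):

```pascal
Program Ranges ;
{
  show the largest values Integer and LongInt can hold
}
begin { Ranges }
  WriteLn ('MaxInt     = ', MaxInt) ;
  WriteLn ('MaxLongInt = ', MaxLongInt)
end. { Ranges }
```

With a 16-bit Integer the first limit is 32767, so an Integer counter would overflow on any file with more lines, or characters, than that. Other compilers may define Integer differently, so the first figure can vary.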
As an interesting exercise try omitting the line
Count := 0
...
Tags: initialize, default
Posted on Monday, October 22, 2007 at 10:54 AM
The next program counts the number of characters in a file. The most noteworthy difference between the K&P version and this one is that a new line counts as 2 characters because in DOS it is two characters. Also my program is named CharCnt because the K&P name, charcount, is too long for a DOS name.
Program CharCnt ;
{
  count characters in standard input
}
Var
  Count : LongInt ;
  Ch : Char ;
begin { CharCnt }
  Count := 0 ;
  while NOT eof do begin
    Read (Ch) ;
    Count := Count + 1
  end ;
  WriteLn (Count:0)
end. { CharCnt }
I have used Pascal's WriteLn procedure for output instead of K&P's putdec mainly because at this point, page 13, K&P have not gotten around to defining putdec.
Look up the K&P version at http://cm.bell-labs.com/cm/cs/who/bwk/pascaltools.txt and search for "charcount".
Posted on Sunday, October 21, 2007 at 1:41 PM
If you run Kopy and input data from the keyboard you get an output something like this:
Now is the time for all good men
Now is the time for all good men
to come to the aid of their country.
to come to the aid of their country.
^Z
The lines come in pairs. The first line is the echoing of the keyboard input; the second line is the program output. The program terminates on end of file, chr(26), which is CTRL Z on the keyboard and is echoed as ^Z. The program does not output this.
Using Kopy to copy files requires redirection.
Kopy < kopy.pas > kopy.txt
This copies kopy.pas to kopy.txt.
fc kopy.pas kopy.txt
reveals that the two files are identical.
I will test all programs but will not always post the results.
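The listing for Kopy does not appear in this excerpt. As a minimal sketch, assuming Kopy simply reads standard input one character at a time and writes each character to standard output, in the same style as CharCnt, it might look like this:

```pascal
Program Kopy ;
{
  copy standard input to standard output
  (an assumed reconstruction, not the author's listing)
}
Var
  Ch : Char ;
begin { Kopy }
  while NOT eof do begin
    Read (Ch) ;
    Write (Ch)
  end
end. { Kopy }
```

Nothing in the loop treats the end of a line specially: the carriage return and line feed are read and written like any other characters, which is consistent with fc reporting the copy as identical.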
Posted on Thursday, October 18, 2007 at 11:24 PM
At the outset Kernighan and Plauger present their readers with a user-defined type character, which is defined as
type
  character = -1..127; { ASCII, plus ENDFILE }
The only purpose of type character is to defeat Pascal's strong typing by mapping the ASCII character set onto a subset of integers. I do not think this is a good idea. K&P apparently do not like strong typing. I disagree. I believe that strong typing is a virtue, not a vice; therefore all the programs which I present will use Pascal’s type char, not character.
Having established type character K&P then define two constants
const
  ENDFILE = -1;
  NEWLINE = 10; { ASCII value }
DOS uses chr(26) for end of file and marks end of line with chr(13) (followed by chr(10)). If I have occasion to use either ENDFILE or NEWLINE they will be defined accordingly (as char, not integer).
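A sketch of what such char definitions could look like in Turbo Pascal. The exact declarations are an assumption based on the values named above, and the program name Consts exists only to make the fragment compilable.

```pascal
Program Consts ;
{
  ENDFILE and NEWLINE as char constants (assumed definitions)
}
Const
  ENDFILE = #26 ;  { CTRL-Z, DOS end of file }
  NEWLINE = #13 ;  { carriage return, DOS end of line }
Var
  Ch : Char ;
begin { Consts }
  Ch := NEWLINE ;
  { a Char compares directly with a char constant; no Ord or Chr needed }
  if Ch = NEWLINE then
    WriteLn ('Ch is a NEWLINE, ordinal ', Ord(Ch)) ;
  WriteLn ('ENDFILE has ordinal ', Ord(ENDFILE))
end. { Consts }
```

Because the constants are of type Char, the compiler rejects an accidental comparison with an integer, which is exactly the strong typing that K&P's character subrange gives up.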
K&P then define two "primitive" functions: getc and putc. More precisely
...
Posted on Thursday, October 18, 2007 at 10:24 PM
Software Tools in Pascal is a rewrite of Software Tools by Brian W. Kernighan and P. J. Plauger. In the first book Kernighan and Plauger wrote their programs in RatFor, a dialect of Fortran implemented via a preprocessor, which gave Fortran a C-like syntax. In attempting to repeat this effort using Pascal, Kernighan apparently developed a dislike for the language, and in the same year Software Tools in Pascal was published he also published his famous rant Why Pascal is Not My Favorite Programming Language. I will address this rant in a later post.
My intention here is to rewrite the programs in Software Tools in Pascal using Turbo Pascal, an implementation of the language that became a de facto standard. I will admit up front that this effort is entirely academic. I'm doing it for fun. Still, some people might find it useful. The tools K&P presented were, even at that time, pretty much available pre-packaged in DOS and Unix, so even the original books were something of an academic effort...