Counting Words

Software Tools in Turbo Pascal

I intend to go through SOFTWARE TOOLS IN PASCAL by Kernighan & Plauger and to re-write the programs they presented using Turbo Pascal, taking advantage of Turbo Pascal's improvements over the...

Posted on Saturday, October 27, 2007 at 9:27 PM

The next program counts the words in a file. K&P define a word as "a sequence of any characters except blanks, tabs and newlines." Since we are not using the NEWLINE character, we'll redefine a word as "a sequence of any characters except blanks, tabs, line feeds and carriage returns."

The algorithm used by K&P is, in my opinion, too complex. It uses a flag, inword, to keep track of whether the file pointer is "in a word", tests that flag, and then changes it, or not, depending on the results of that test.
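For comparison, here is a minimal sketch of a flag-based count, reconstructed only from the description above. It is not K&P's listing: the function name CountWordsFlag, the locally defined character constants (standing in for the ones the Tools unit provides) and the use of a string instead of standard input are all assumptions made to keep the example self-contained.

```pascal
program InWordSketch;
{ Flag-based word count, reconstructed from the prose description.
  Not K&P's code; all names here are assumptions for illustration. }
const
  TAB   = #9;
  LF    = #10;
  CR    = #13;
  BLANK = ' ';

function CountWordsFlag(const S: string): LongInt;
var
  I      : Integer;
  InWord : Boolean;
  Count  : LongInt;
begin
  InWord := False;
  Count  := 0;
  for I := 1 to Length(S) do
    if (S[I] = BLANK) or (S[I] = TAB) or (S[I] = LF) or (S[I] = CR) then
      InWord := False            { a delimiter ends any current word }
    else if not InWord then
    begin
      InWord := True;            { first character of a new word }
      Count  := Count + 1
    end;
  CountWordsFlag := Count
end;

begin
  { the, quick, brown, fox: prints 4 }
  WriteLn(CountWordsFlag('the quick  brown' + TAB + 'fox'))
end.
```

The flag has to be tested and updated on every character, which is the bookkeeping the paragraph above objects to.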

Consider TAB, LF, CR and BLANK. Let’s call these delimiters. The signal to increment the count is a transition from a delimiter to a non-delimiter. Thus, we need to keep track of the previous character in the stream. The problem of "what was the previous character when we read the first character" is trivial: we need only assign it some initial value. That value must be a delimiter; otherwise, a word that starts at the very first character of the file will not get counted.

Program WordCnt ;
{
      WordCnt -- count words in standard input
}
Uses
   Tools ;
CONST
   DELIMITERS  :  Set of Char = [TAB, LF, CR, BLANK] ;
Var
   Ch,
   Prev  :  Char ;
   Count :  LongInt ;
begin { WordCnt }
   Prev  := BLANK ;
   Count := 0 ;
   while NOT eof do begin
      Read (Ch) ;
      if (Prev in DELIMITERS) AND (NOT(Ch in DELIMITERS)) then
         Count := Count + 1 ;
      Prev  := Ch
   end ;
   WriteLn (Count:0)
end.  { WordCnt }
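
The transition rule is easy to check by hand on a few edge cases. The sketch below applies the same test to a string rather than to standard input. The function name CountWords is hypothetical, and the character constants are defined locally on the assumption that they match the ones in the Tools unit.

```pascal
program TransitionDemo;
{ Same delimiter-to-non-delimiter rule as WordCnt, applied to a string.
  Constants are local stand-ins (an assumption) for those in Tools. }
const
  TAB   = #9;
  LF    = #10;
  CR    = #13;
  BLANK = ' ';
  DELIMITERS : set of Char = [TAB, LF, CR, BLANK];

function CountWords(const S: string): LongInt;
var
  I     : Integer;
  Prev  : Char;
  Count : LongInt;
begin
  Prev  := BLANK;                { pretend a delimiter precedes the text }
  Count := 0;
  for I := 1 to Length(S) do
  begin
    if (Prev in DELIMITERS) and not (S[I] in DELIMITERS) then
      Count := Count + 1;        { a word starts here }
    Prev := S[I]
  end;
  CountWords := Count
end;

begin
  WriteLn(CountWords(''));                                     { 0 }
  WriteLn(CountWords('word'));                                 { 1 }
  WriteLn(CountWords('  leading and trailing  '));             { 3 }
  WriteLn(CountWords('one' + CR + LF + 'two' + TAB + 'three')) { 3 }
end.
```

The second case shows why Prev must start out as a delimiter: with any other initial value, 'word' would be counted as zero words.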


Another departure from K&P is the use of a set and a typed constant. Typed constants let a structured type, such as a set, be given an initial value, a Turbo Pascal feature that K&P did not have access to. The use of a set makes the program easier to change (more maintainable), since the choice of delimiters can be changed by editing the single definition of DELIMITERS in the CONST section (line 8 of the program). For example:
CONST
   NUL = Chr(0) ;
   DELIMITERS  :  Set of Char = [NUL .. BLANK] ;


In this case I would add NUL to the unit Tools (and I will later). DELIMITERS could also be added to Tools, but I did not do so since I do not anticipate using it again. I also expect that other programs that use the concept of delimiters would need a different definition, so I kept this one local.
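
To see what the wider definition changes in practice, the sketch below counts the same text with both delimiter sets. The type name TCharSet, the constant names NARROW, WIDE and FF, and the function CountWords are all hypothetical and are not part of the Tools unit as described here.

```pascal
program DelimiterSets;
{ Compares the two delimiter definitions on the same input.
  All names are assumptions made for this illustration. }
type
  TCharSet = set of Char;
const
  NUL   = #0;
  TAB   = #9;
  LF    = #10;
  FF    = #12;                   { form feed: a control character }
  CR    = #13;
  BLANK = ' ';
  NARROW : TCharSet = [TAB, LF, CR, BLANK];
  WIDE   : TCharSet = [NUL .. BLANK];

function CountWords(const S: string; const Delims: TCharSet): LongInt;
var
  I     : Integer;
  Prev  : Char;
  Count : LongInt;
begin
  Prev  := BLANK;                { BLANK is a member of both sets }
  Count := 0;
  for I := 1 to Length(S) do
  begin
    if (Prev in Delims) and not (S[I] in Delims) then
      Count := Count + 1;
    Prev := S[I]
  end;
  CountWords := Count
end;

begin
  { NARROW: "one<FF>page" is a single word, so this prints 3 }
  WriteLn(CountWords('page one' + FF + 'page two', NARROW));
  { WIDE: the form feed separates words, so this prints 4 }
  WriteLn(CountWords('page one' + FF + 'page two', WIDE))
end.
```

Only the set changes between the two calls; the counting logic is untouched, which is the maintainability point made above.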
