| File encryption and decryption, Part II | Posted on Sunday, February 10, 2008 at 12:33 AM | One more post on encryption before moving on to other things.
In Part I, I submitted an encryption program that "should prevent the casual snoop from decoding your files." The program's security can be increased tremendously by using the random number generator to generate the key. As it stands the key must be typed in on the command line, a feature that encourages the use of short keys. When the program gets to the end of the key it goes back to the key's beginning and reuses it. In our example we used "buttermilk" as the key. In a file of 500 characters a codebreaker would then get 50 clues to the key.
The random number generator will produce sequences of tens of thousands of characters before it begins to repeat itself. The first problem we have to solve is how to get the random number generator to generate the same sequence during decoding. We solve this by giving the program a "pin" of four characters. Since each number is 8 bits we have a 32 bit seed or 4,294,967,296 possible keys. Of course the user is most likely to use pins made up of alphanumeric characters so there are more like 15,000,000 most probable keys. A user who limits himself to keys made up of easily remembered words limits himself even further. Nevertheless, the number of most probable keys is still in the 100s of thousands.
Read More |
| Bit-manipulation | Posted on Sunday, January 27, 2008 at 6:39 PM | In his infamous rant Why Pascal Is Not My Favorite Programming Language Brian Kernighan says
"There is a paucity of operators ... In particular, there are no bit-manipulation operators (AND, OR, XOR, etc.). I simply gave up trying to write the following trivial encryption program in Pascal:
i := 1;
while getc(c) <> ENDFILE do begin
putc(xor(c, key[i]));
i := i mod keylen + 1
end
because I couldn't write a sensible 'xor' function."
The amazing part of this statement is that I cannot fathom why he found it so hard to write an xor function. It's simply not that hard. In this post we take up the challenge.
We will assume that Kernighan's compiler did not have the type Byte and this may one source of his problem. However, Byte is simply a subrange of the integers (0 .. 255) so this is easily surmounted. We can also safely assume that he had access to the Boolean operators NOT, OR and AND.
Read More |
| File encryption and decryption, Part I | Posted on Saturday, January 19, 2008 at 5:42 PM | The next program is one which appeared in the original Software Tools but not in Software Tools in Pascal. The program is Crypt which encrypts and decrypts a text file.
The algorithm for the encryption is what cryptologists call a "one time pad" (google it). When properly used it is theoretically unbreakable. What we are doing here does not include all the steps entailed in properly using the algorithm but it should prevent the casual snoop from decoding your files.
The algorithm consists of merging the input stream of data with a second stream, the key, to produce an output stream of encrypted data. We extract the ascii codes from each char of input and from each char of the key. A bitwise XOR operation gives us the ascii code for the output which is converted back to a char. Here is the code for Crypt.
Program Crypt ;
{
Crypt -- encrypt and decrypt
}
var
Key : String ;
KeyLen : Byte ;
Read More |
| Command line parameters | Posted on Sunday, January 13, 2008 at 1:39 PM | The Turbo Pascal functions ParamStr and ParamCount give us access to command line arguments. ParamCount returns the number of command line arguments while ParamStr(i : Word) returns the individual parameter strings. Ekho illustrates the use of these two functions. We use the spelling Ekho instead of Echo to avoid colliding with the DOS command ECHO.
Program Ekho ;
{
echo command line arguments to output.
}
Uses
Tools ;
Var
i : Byte ;
begin
for i := 0 to ParamCount do
Write (ParamStr(i), BLANK) ;
WriteLn
end.
Note that even if no parameters are included on the command line we still get output. ParamStr(0) is the executable file itself. Thus:
c:\pascal\tools\ekho
C:\PASCAL\TOOLS\EKHO.EXE
The next few programs we write will use command line parameters.
|
| Trimming strings | Posted on Wednesday, January 02, 2008 at 2:43 AM | Here, for completeness without further comment, is the code for RTrim, LTrim and Trim which are included in the unit Tools.
Function RTrim (Var S : String) : String ;
{
RTrim -- remove trailing blanks
}
Var
i : Byte ;
begin { RTrim }
for i := Length(S) downto 1 do
if S[i] > BLANK then begin
S := Copy(S,1,i) ;
RTrim := S ;
Exit
end ;
{
if we reach this point then S contains only BLANKS
so we return a null string
}
S := '' ;
RTrim := S
end ; { RTrim }
Function LTrim (Var S : String) : String ;
{
LTrim -- remove leading blanks
}
Var
i : Byte ;
begin { LTrim }
for i := 1 to Length(S) do
if S[i] > BLANK then begin
S := Copy(S,i,MAXSTR) ;
Read More |
| Detabbing | Posted on Tuesday, January 01, 2008 at 9:08 AM | DeTab is the inverse of EnTab. Here is the specification:
{
PROGRAM
DeTab -- convert tabs to blanks
USAGE
DeTab
FUNCTION
DeTab copies its input to its output, expanding horizontal tabs to
blanks along the way, so that the output is visually the same as the
input, but contains no tab chars. Tab stops are assumed to be set
every three columns (i.e., 1, 4, 7, 10, ...), so that each tab char
is replaced by from one to three blanks.
BUGS
1. DeTab is naive about backspace, vertical motions, and
non-printing chars.
2. Each line of output will be truncated at MAXSTR chars.
3. Trailing BLANKS will be trimmed from each line of output.
}
DeTab's main routine echos EnTab's, reducing the problem to one of detabbing a single string. To detab the string we could try to simply apply EnTab's strategy in reverse, however, this is another chance to apply the "how would a human do it?" approach.
Read More |
| Entabbing | Posted on Friday, December 28, 2007 at 3:54 AM | The next program in our project is EnTab which replaces runs of blanks in a text file by tabs and blanks. Here is the specification, i.e., the "manual"
PROGRAM
EnTab -- convert runs of blanks into tabs
USAGE
EnTab
FUNCTION
EnTab copies its input to its output, replacing strings of blanks by
tabs so that the output is visually the same as the input but contains
fewer characters. Tab stops are assumed to be set every 3 columns
(i.e., 1, 4, 7, ...), so that each sequence of one to four blanks
ending on a tab stop is replaced by a tab character.
BUGS
1. EnTab is naive about backspaces, vertical motions and non-printing
characters.
2. EnTab will convert a single blank to a tab if it occurs at a tab
stop, thus EnTab is not an exact inverse of DeTab.
3. if any record in the input is longer than 255 char Entab will
truncate that record to 255 char.
Read More |
| Elements of Programming Style | Posted on Monday, December 03, 2007 at 11:43 AM | I've always recommended that programmers make it a point to read this classic by Kernighan and Plauger. I'm now in the process of reading it again. I had to request an interlibrary load to get it.
Upon re-reading it (I'm now up to chapter 3) I find it terribly dated. The examples are all in Fortran and PL/1 (PL/I?). One of the "rules" set forth is "avoid the Fortran arithmetic IF," a good piece of advise if you happen to be programming in Fortran but of little use if you are using C++ or Pascal.
The first chapters of the book seem to deal almost exclusively with when and when not to use GOTO and how to use it when you do. Again advise aimed at Fortran programmers who, lacking WHILE and IF..THEN..ELSE, had no choice but to use GOTO, and at PL/1 programs who could avoid the GOTO but often didn't, producing "Fortran with semicolons."
Read More |
| Showing tabs | Posted on Saturday, November 10, 2007 at 11:03 PM | Some files contain TABs, Chr(9), as a form of modest compression. When a program displaying the file encounters a TAB it prints out a series of one or more blanks. The next two programs insert and remove TABs. To test these programs we will need a program that displays the tabs. This program substitutes a ^ for the tab so we can see the file's true contents.
Program ShowTabs ;
{
ShowTabs -- display tabs in a file.
}
Uses
Tools ; { TAB and CARET ( ^ )}
Var
Ch : Char ;
begin { ShowTabs }
while NOT eof do begin
Read (Ch) ;
if Ch = TAB then
Ch := CARET ;
Write (Ch)
end
end. { ShowTabs }
Tags: None |
| Counting Words | Posted on Saturday, October 27, 2007 at 9:27 PM | The next program counts the words in a file. K&P define a word as "a sequence of any characters except blanks, tabs and newlines." Since we are not using the NEWLINE character we'll redefine it as "a sequence of any characters except blanks, tabs, line feeds and carriage returns."
The algorithm used by K&P is, in my opinion, too complex. It uses a flag, inword, to keep track of whether the file pointer is "in a word", tests that flag, and then changes it, or not, depending on the results of that test.
Consider TAB, LF, CR and BLANK. Let’s call these delimiters. The signal to increment the count is a transition from a delimiter to a non-delimiter. Thus, we need to keep track of the previous character in the stream. The problem of "what was the previous character when we read the first character" is trivial. We need only assign it some value. That value must be a delimiter otherwise the first word in the file will not get counted.
Read More Tags: None |
| The Tools Unit | Posted on Saturday, October 27, 2007 at 3:59 PM | Before proceeding with the next program let’s skip forward to the index of the book to page 338. Here K&P list the code for their "wrapper." The wrapper is a program which defines constants, types, functions and procedures that are used by their other programs. Here is the explanation of why K&P’s programs are not programs at all but procedures. One implements a program such as charcount by including it in the wrapper and calling it from the wrapper. The wrapper for UCSD Pascal, probably the closest pre Turbo implementation to Turbo Pascal, is given on pages 338 .. 341. Turbo Pascal’s units, i.e., its provision for separate compilation, makes the rigmarole of the wrapper superfluous, although it was probably a good strategy in 1981 a couple of years before Turbo Pascal made the scene.
Instead of a wrapper we’ll create a Turbo Pascal unit called Tools. We’ll start with only what we need and expand it as we go. In anticipation of what we’ll need to write out next program here is our first cut of Tools.
Read More |
| Testing LineCnt | Posted on Wednesday, October 24, 2007 at 7:59 PM | Ascii writes all the classic ascii characters to standard output or, if redirected, to a file. If run without redirection the programs seems to perform as expected. We get all the ascii characters including a line feed, chr(10), and a carriage return, chr(13). However, when we redirect the output to ascii.all and then display ascii.all using type we get considerably less output.
DOS displays various "non printing" characters as various symbols. Chr(1) is a smiley face. Chr(2) is a reverse smiley face. The next four characters, chr(3) .. chr(6) are the four suites of a deck of cards, etc.
The output of ascii.all via type is two lines. The first is short, ending with a diamond. Then other characters do not print but actually do something. Chr(7) gives us a beep. Chr(8) backspaces.
Of particular interest is chr(10), a line feed, which causes the cursor to drop down a line, and chr(13), a carriage return, which sends the cursor to the left edge of the screen causing subsequent characters to overwrite whatever was there.
Read More Tags: None |
| Mystery program | Posted on Tuesday, October 23, 2007 at 11:24 PM | This may seem like a digression but it’s not. I have a point to make regarding our progress up to now.
Consider the following program.
Program Ascii ;
{
outputs all the ascii chars ... maybe
}
Var
i : Byte ;
begin { Ascii }
for i := 0 to 127 do
Write (Chr(i))
end. { Ascii }
Compile and run the program. What is the output?
Now redirect the output to a file, then use the type command to list the output.
ascii > ascii.all
type ascii.all
Why is the output different? Answer tomorrow.
Tags: None |
| Counting Lines | Posted on Monday, October 22, 2007 at 5:50 PM | The next program counts the number of lines in a text file. My version takes advantage of the fact that ReadLn without a parameter list will move the file pointer past the next CR,LF (new line). Thus the program does not even have to know about the intervening chars. The result is a shorter and more elegant program.
Program LineCnt ;
{
count lines in standard input
}
Var
Count : LongInt ;
begin { LineCnt }
Count := 0 ;
while NOT eof do begin
ReadLn ;
Count := Count + 1
end ;
WriteLn (Count:0)
end. { LineCnt }
I declare Count as a LongInt, a type that K&P did not have access to. I also used LongInt as the counter in CharCnt. It allows the programs to work with much larger files.
As an interesting exercise try omitting the line
Count := 0
Read More |
| Counting Characters | Posted on Monday, October 22, 2007 at 10:54 AM | The next program counts the number of characters in a file. The most noteworthy difference between the K&P version and this one is that a new line counts as 2 characters because in DOS it is two characters. Also my program is named CharCnt because the K&P name, charcount, is too long for a DOS name.
Program CharCnt ;
{
count characters in standard input
}
Var
Count : LongInt ;
Ch : Char ;
begin { CharCnt }
Count := 0 ;
while NOT eof do begin
Read (Ch) ;
Count := Count + 1
end ;
WriteLn (Count:0)
end. { CharCnt }
I have used Pascal's WriteLn procedure for output instead of K&P's putdec mainly because at this point, page 13, K&B have not gotten around to defining putdec.
Look up the K&P version at
http://cm.bell-labs.com/cm/cs/who/bwk/pascaltools.txt
and search for "charcount"
Tags: None |
| Testing | Posted on Sunday, October 21, 2007 at 1:41 PM | If you run Kopy and input data from the keyboard you get an output something like this:
Now is the time for all good men
Now is the time for all good men
to come to the aid of their country.
to come to the aid of their country.
^Z
The lines come in pairs. The first line is the echoing of the keyboard input; the second line is the program output. The program terminates on end of file, chr(26), which is CTRL Z on the keyboard and is echoed as ^Z. The program does not output this.
Using Kopy to copy files requires redirection.
Kopy < kopy.pas > kopy.txt
This copies kopy.pas to kopy.txt.
fc kopy.pas kopy.txt
reveals that the two files are identical.
I will test all programs but will not always post the results.
Tags: None |
| Getting Started - File Copying | Posted on Thursday, October 18, 2007 at 11:24 PM | At the outset Kernighan and Plauger present their readers with a user defined type character which is defined as
type
character = -1..127; { ASCII, plus ENDFILE }
The only purpose of type character is to defeat Pascal strong typing by mapping the ASCII character set onto a subset of integers. I do not think this is a good idea. K&P apparently do not like strong typing. I disagree. I believe that strong typing is a virtue, not a vice, therefore all the programs which I present will use Pascal’s type char, not character.
Having established type character K&P then define two constants
const
ENDFILE = -1;
NEWLINE = 10; { ASCII value }
DOS uses chr(26) for end of file and chr(13) for end of line. If I have occasion to use either ENDFILE or NEWLINE they will be defined accordingly (as char, not integer).
K&P then define two "primitive" functions: getc and putc. More precisely
Read More Tags: None |
| In which I name the game | Posted on Thursday, October 18, 2007 at 10:24 PM | Software Tools in Pascal is a rewrite of Software Tools by Brian W. Kernighan and P. J. Plauger. In the first book Kernighan and Plauger wrote their programs in RatFor, a dialog of Fortran implemented via a preprocessor, which gave Fortran a C-like syntax. In attempting to repeat this effort using Pascal, Kernighan apparently developed a dislike for the language and in the same year Software Tools in Pascal was published he also published his famous rant Why Pascal is Not My Favorite Programming Language. I will address this rant in a later post.
My intention here is to re-write the programs in Software Tools in Pascal using Turbo Pascal, an implementation of the language that became a de-facto standard. I will admit up front that this effort is entirely academic. I'm doing it for fun. Still, some people might find it useful. The tools K&P presented were, even at that time, pretty much available pre-packaged in DOS and Unix, so even the original books were something of an academic effort.
Read More |
|
Subscribe
RSS Feed
By Tag
and argument command line crypt decrypt encrypt eof Kernighan MS-DOS not one time pad or paramcount parameter paramstr Pascal String Tabs Trim xor
By Month
February 2008
January 2008
December 2007
November 2007
October 2007
Help
Check out the Blog FAQ.
|