Python

Reading chars in a text file Posted by XeonicXpressio on 25 Feb 2005 at 7:31 PM
OK, here is what I need to do.
Read in a text file and calculate:
1. The total number of lines in the file, including blank lines.
2. The number of blank lines in the file.
3. The number of sentences in the file of text. You may assume that sentences must end with a period, a question mark, or an exclamation point.
4. The number of words in the file. (Think about how you can determine when a word ends.)
5. The number of non-blank characters in the file, including punctuation.
def main():
    myInFile = open("inn.txt", "r")
    TotalLines = 0
    BlankLines = 0
    Words = 0
    for ch in myInFile:
        TotalLines = TotalLines + 1
        InLine = myInFile.readline()
        if (InLine == "\n"):
            BlankLines = BlankLines + 1
    print "Total Words", Words
    print "Total Lines", TotalLines
    print "Total Blank Lines", BlankLines
main()

That is what I have so far, but doing for ch in myInFile doesn't look at each character and I can't figure out how to make it do that. Also, shouldn't if (InLine == "\n"): give me the total number of blank lines? It doesn't seem to be working. I guess until I can figure out how to look at the characters in each line I'm kind of at a dead end. Any help would be appreciated. If someone could help me get started here and tell me if I am even close to being on the right track, I would appreciate it.
Re: Reading chars in a text file Posted by Drost on 26 Feb 2005 at 3:49 AM
This message was edited by Drost at 2005-2-26 3:57:39

Hi

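In your code, the for ch in myInFile: loop already iterates through the lines of the file (speaking of Python 2.3+), so ch is a whole line rather than a single character. Deliberately reading another line inside the loop with myInFile.readline() is therefore superfluous, and it interferes with the loop's own reading of the file, so the counts come out wrong.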
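As a rough illustration of that point, here is a minimal sketch of getting at individual characters: loop over the file to get lines, then loop over each line to get characters. It is written in Python 3 syntax, the function name count_lines_and_chars is made up for this example, and io.StringIO is only a stand-in for a real file opened with open(), so treat it as a sketch rather than a finished solution.

```python
import io

def count_lines_and_chars(fileobj):
    """Return (line_count, non_blank_char_count) for an open text file."""
    line_count = 0
    char_count = 0
    for line in fileobj:          # a file object yields one line per pass
        line_count += 1
        for ch in line:           # a string yields one character per pass
            if not ch.isspace():  # skip spaces, tabs and the newline
                char_count += 1
    return line_count, char_count

# io.StringIO stands in for open("inn.txt", "r") so no file on disk is needed.
sample = io.StringIO("Hi there.\n\nBye!\n")
print(count_lines_and_chars(sample))
```

Note that there is no readline() call anywhere: the outer for loop is the only thing that reads from the file.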

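# running totals for the five counts
lines, blanks, sentences, words, nonwhite = 0, 0, 0, 0, 0

textf = open('file.txt', 'r')
for l in textf: # iterating over the file yields one line at a time
  lines += 1
  if l.startswith('\n'):
    blanks += 1 # a line holding only a newline; sorry MACs
  else:
    # every '.', '!' or '?' is taken as the end of a sentence
    sentences += l.count('.') + l.count('!') + l.count('?')
    tempwords = l.split(None) # split on any run of whitespace
    words += len(tempwords)
    nonwhite += sum(map(len, tempwords)) # total length of the words in this line
textf.close()

print "Lines: ", lines
print "Blanks: ", blanks
print "Sentences: ", sentences
print "Words: ", words
print "non whitespace characters: ", nonwhite
raw_input('Press Enter...')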


The sentence count is fooled by things like '...' or '?!?', since every mark is counted as a separate sentence end.
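One possible way around that, sketched here in Python 3 with the standard re module, is to count each run of end marks once instead of counting every mark. The helper name count_sentences is made up for this example, and this is still only a heuristic: abbreviations such as "e.g." would be miscounted.

```python
import re

def count_sentences(text):
    """Count runs of '.', '!' or '?' so that '...' or '?!?' count once."""
    return len(re.findall(r"[.!?]+", text))

line = "Wait... Really?!? Yes."
naive = line.count('.') + line.count('!') + line.count('?')
print("naive:", naive)
print("runs: ", count_sentences(line))
```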

Drost

Re: Reading chars in a text file Posted by XeonicXpressio on 26 Feb 2005 at 11:46 AM
Thank you so much Drost! I guess the whole read line was what was screwing me up. I was wondering if you could explain a few things to me.

l.count('.'), l.count('!'), l.count('?')


Since l is a whole line, is count reading every character in the line?

 tempwords = l.split(None)


This creates a list and the None omits the blank characters, right?

words += len(tempwords)
nonwhite += sum(map(len, tempwords))


Why is len(tempwords), when assigned to words, counting the number of cells, but when it is in map it counts the number of characters in each cell and creates a new list from that? I read about map and I thought that it would just be applying len to tempwords again, like the line above it. I understand what it is doing, I just don't understand why it is doing it.

Thank you for your help, that gave me a much better understanding of how the syntax of python works. I knew how I wanted to do this I just couldn't get the syntax down at all.
Re: Reading chars in a text file Posted by Drost on 26 Feb 2005 at 1:48 PM
: Thank you so much Drost! I guess the whole read line was what was screwing me up. I was wondering if you could explain a few things to me.
:
: l.count('.'), l.count('!'), l.count('?')
:
: Since l is a whole line, is count reading every character in the line?

#1

: tempwords = l.split(None)
:
: This creates a list and the None omits the blank characters, right?

#2

: words += len(tempwords)
: nonwhite += sum(map(len, tempwords))
:
: Why is len(tempwords), when assigned to words, counting the number of cells, but when it is in map it counts the number of characters in each cell and creates a new list from that? I read about map and I thought that it would just be applying len to tempwords again, like the line above it. I understand what it is doing, I just don't understand why it is doing it.

#3

: Thank you for your help, that gave me a much better understanding of how the syntax of python works. I knew how I wanted to do this I just couldn't get the syntax down at all.

#4

---------
#1:
Strings have a count method which takes a substring as a parameter and returns the number of times it can be found in the string.

So basically we count how many end-of-sentence marks we can find in the line.
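A small sketch of count on a single line (Python 3 syntax; the sample text is invented for illustration):

```python
line = "Is it done? Yes. Good!\n"

# count looks through the whole string and returns how many times
# the given substring occurs in it
periods = line.count('.')
questions = line.count('?')
exclamations = line.count('!')

ends = periods + questions + exclamations
print(periods, questions, exclamations, ends)
```

So yes, the whole line is searched, but count does the character-by-character work for you.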

#2:
Strings also have a split method, which returns a list made by splitting the string at the boundary given as its parameter. The special thing about passing None is that the boundary becomes any run of whitespace (spaces, tabs, newlines, etc.), whatever its length.

"asd fgh jkl".split(' ')

would result in ["asd", "fgh", "jkl"]

but

"asd fgh  jkl".split(' ') # note the extra space

would result in ["asd", "fgh", "", "jkl"]

So to get only real words I used .split(None). :)
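For comparison, a minimal sketch (Python 3 syntax, sample string taken from the examples above plus a trailing newline, as a line read from a file would have) putting both forms side by side:

```python
line = "asd fgh  jkl\n"  # two spaces before "jkl", newline at the end

by_space = line.split(' ')    # splits at every single space
by_whitespace = line.split()  # same as line.split(None)

print(by_space)        # keeps the empty string and the newline
print(by_whitespace)   # only the words
print("\n".split())    # a blank line gives an empty list
```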

#3:
tempwords is a list whose elements are the words of the current line, so len(tempwords) is the number of words. map(len, tempwords) instead calls len on each member of the list (each word), which gives a new list of numbers, the word lengths. Their sum is how many 'useful' characters there are in the line.
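The difference can be seen in a short sketch (Python 3 syntax, where map returns an iterator, hence the list() call; the sample sentence is invented for illustration):

```python
tempwords = "The quick fox.".split()

word_count = len(tempwords)             # len of the list: how many words
word_lengths = list(map(len, tempwords))  # len of each word, one by one
char_count = sum(word_lengths)          # all non-whitespace characters

print(tempwords)
print(word_count, word_lengths, char_count)
```

len(tempwords) is one call on the list itself, while map(len, tempwords) is one call per element.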

#4:
The manual is a great help. :)

Drost



 
