Counting lines, words, and characters in a text file using Python

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4783899/

Time: 2020-08-18 17:29:53  Source: igfitidea

Counting lines, words, and characters within a text file using Python

python

Asked by Alex Karpowitsch

I'm having a bit of a rough time working out how to count certain elements within a text file using Python. I'm a few months into Python and I'm familiar with the following functions:

  • raw_input
  • open
  • split
  • len
  • print
  • rsplit()

Here's my code so far:

fname = "feed.txt"
fname = open('feed.txt', 'r')

num_lines = 0
num_words = 0
num_chars = 0

for line in feed:
    lines = line.split('\n')

At this point I'm not sure what to do next. I feel the most logical approach would be to first count the lines, then count the words within each line, and then count the number of characters within each word. But one of the issues I ran into was trying to perform all of the necessary operations in one pass, without having to re-open the file to perform each one separately.

Accepted answer by eumiro

Try this:

fname = "feed.txt"

num_lines = 0
num_words = 0
num_chars = 0

with open(fname, 'r') as f:
    for line in f:
        words = line.split()

        num_lines += 1
        num_words += len(words)
        num_chars += len(line)
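
The same single-pass loop can be wrapped in a reusable function. This is a minimal sketch, not part of the original answer: the function name count_file and the temporary sample file are assumptions made for illustration. Note that len(line) includes the trailing newline character of each line.

```python
import os
import tempfile


def count_file(path):
    """Return (lines, words, chars) for a text file in a single pass."""
    num_lines = num_words = num_chars = 0
    with open(path, 'r') as f:
        for line in f:
            num_lines += 1
            num_words += len(line.split())   # split() with no argument splits on any whitespace
            num_chars += len(line)           # includes the trailing '\n'
    return num_lines, num_words, num_chars


# Hypothetical sample data written to a temporary file for demonstration
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("hello world\nfoo bar baz\n")
    sample_path = tmp.name

result = count_file(sample_path)
os.remove(sample_path)
print(result)   # (2, 5, 24)
```

The file is opened only once, which addresses the concern in the question about re-opening it for each count.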

Back to your code:

fname = "feed.txt"
fname = open('feed.txt', 'r')

What's the point of this? fname is a string first and then a file object. You don't really use the string defined in the first line, and you should use one variable for one thing only: either a string or a file object.

for line in feed:
    lines = line.split('\n')

line is one line from the file. It does not make sense to split('\n') it.
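
A quick check illustrates this. Iterating over a file yields one line at a time, each already ending in '\n', so splitting it on '\n' only produces the line itself plus an empty string. The sample string below is made up for illustration.

```python
line = "hello world\n"      # what one iteration of `for line in f` typically yields

parts = line.split('\n')
print(parts)                # ['hello world', '']

words = line.split()        # splitting on whitespace gives the words instead
print(words)                # ['hello', 'world']
```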

Answer by kynnysmatto

Functions that might be helpful:

  • open("file").read(), which reads the contents of the whole file at once
  • 'string'.splitlines(), which separates lines from each other (dropping the line-break characters; empty lines are kept as empty strings)

By using len() together with those functions, you could accomplish what you're trying to do.
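
A minimal sketch of how these pieces could fit together, assuming the file is small enough to read into memory at once. The sample text stands in for the result of open("file").read(), and the variable names are illustrative.

```python
# Stand-in for: text = open("feed.txt").read()
text = "first line\n\nthird line here\n"

lines = text.splitlines()       # ['first line', '', 'third line here']
num_lines = len(lines)          # the empty line is counted too
num_words = len(text.split())   # split() with no argument splits on any whitespace
num_chars = len(text)           # includes newline characters

print(num_lines, num_words, num_chars)   # 3 5 28
```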

Answer by Stephen Paulger

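fname = "feed.txt"

# Read the whole file as a string; a file object has no splitlines() method
with open(fname, 'r') as f:
    feed = f.read()

lines = feed.splitlines()
num_lines = len(lines)
num_words = 0
num_chars = 0

for line in lines:
    num_words += len(line.split())
    num_chars += len(line)  # characters, excluding line breaks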

Answer by sirus

One way I like is the following, though it is probably only suitable for small files, since it reads the whole file into memory:

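import re

with open(fileName, 'r') as content_file:
    content = content_file.read()
    lineCount = len(re.split("\n", content))  # one higher than expected if the file ends with a newline
    # Split on runs of non-word characters; drop empty strings left at the edges
    words = [w for w in re.split(r"\W+", content.lower()) if w]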

There are two ways to count words. If you don't care about repetition, you can just do:

words_count = len(words)

If you want the count of each word, you can do:

import collections
words_count = collections.Counter(words) #Count the occurrence of each word
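
For illustration, here is how the resulting Counter could be queried. The sample text is made up, and the filtering of empty strings is an added guard against the empty pieces that re.split can leave at the edges of the text.

```python
import collections
import re

content = "The cat and the hat. The end!"

# Split on runs of non-word characters; drop empty strings left at the edges
words = [w for w in re.split(r"\W+", content.lower()) if w]

words_count = collections.Counter(words)
print(len(words))                   # total number of words: 7
print(words_count["the"])           # 3
print(words_count.most_common(1))   # [('the', 3)]
```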

Answer by Ozzius

file__IO = input('\nEnter file name here to analyze with path:: ')
with open(file__IO, 'r') as f:
    data = f.read()
    line = data.splitlines()
    words = data.split()
    spaces = data.count(" ")     # number of space characters (split(" ") would give one more)
    charc = len(data) - spaces   # characters excluding spaces; line breaks are still counted

    print('\n Line number ::', len(line), '\n Words number ::', len(words), '\n Spaces ::', spaces, '\n Characters ::', charc)

I tried this code & it works as expected.
