Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/38401099/

Date: 2020-08-19 20:45:06  Source: igfitidea

How to count one specific word in Python?

python

Asked by pluvki

I want to count a specific word in the file.

For example, how many times does 'apple' appear in the file? I tried this:

#!/usr/bin/env python
import re 

logfile = open("log_file", "r") 

wordcount={}
for word in logfile.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
for k, v in wordcount.items():
    print(k, v)

I tried replacing 'word' with 'apple', but it still counts every word in my file.

Any advice would be greatly appreciated. :)

Answered by Eugene Yarmash

You could just use str.count(), since you only care about occurrences of a single word:

with open("log_file") as f:
    contents = f.read()
    count = contents.count("apple")

However, str.count() matches substrings, so it would also count words like "applejack". To avoid such corner cases, I suggest that you use a regex:

import re

with open("log_file") as f:
    contents = f.read()
    count = sum(1 for match in re.finditer(r"\bapple\b", contents))

\b in the regex ensures that the pattern begins and ends on a word boundary (as opposed to matching a substring within a longer string).
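
To make the difference concrete, here is a minimal sketch that compares the two approaches on an in-memory string rather than a file. The sample text and the helper name count_word are made up for illustration, and re.escape is added on the assumption that the search word might contain regex metacharacters.

```python
import re


def count_word(text, word):
    """Count whole-word occurrences of `word` in `text` (case-sensitive)."""
    # re.escape guards against words containing regex metacharacters.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text))


# Hypothetical sample text, not taken from the original log file.
sample = "apple pie, applejack and pineapple; one more apple."

print(sample.count("apple"))        # 4: substring matches
print(count_word(sample, "apple"))  # 2: whole-word matches only
```

The substring count also picks up "applejack" and "pineapple", while the word-boundary pattern counts only the standalone word.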

Answered by Wajahat

If you only care about one word then you do not need to create a dictionary to keep track of every word count. You can just iterate over the file line-by-line and find the occurrences of the word you are interested in.

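#!/usr/bin/env python

my_word = "apple"
wordcount = 0

# read the file line by line; the with block closes it afterwards
with open("log_file", "r") as logfile:
    for line in logfile:
        # count every occurrence on the line, not just whether the word is present
        wordcount += line.split().count(my_word)

print(my_word, wordcount)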

However, if you also want to count all the words and print only the count for the word you are interested in, then these minor changes to your code should work:

#!/usr/bin/env python

wordcount = {}
with open("log_file", "r") as logfile:
    for word in logfile.read().split():
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1

# print only the count for my_word instead of iterating over entire dictionary
my_word = "apple"
# .get() returns 0 instead of raising KeyError when the word never appears
print(my_word, wordcount.get(my_word, 0))

Answered by Brendan Abel

You can use a Counter dictionary for this:

from collections import Counter

with open("log_file", "r") as logfile:
    word_counts = Counter(logfile.read().split())

# indexing a Counter returns 0 for a missing word, whereas .get() returns None
print(word_counts['apple'])
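
Splitting on whitespace treats "Apple", "apple," and "apple" as three different words. If that matters, one possible variation (an assumption on my part, not part of the answer above) is to lowercase the text and extract words with a regex before handing them to Counter. The function name word_counts and the sample text below are hypothetical.

```python
import re
from collections import Counter


def word_counts(text):
    """Return a Counter of lowercased words, ignoring punctuation."""
    # \w+ matches runs of letters, digits and underscores.
    return Counter(re.findall(r"\w+", text.lower()))


# Hypothetical sample text standing in for the file contents.
counts = word_counts("Apple, apple! An apple a day; no kiwi.")

print(counts["apple"])   # 3
print(counts["banana"])  # 0: a Counter returns 0 for missing keys
```

Whether to fold case and strip punctuation depends on what the log file contains, so treat this as a starting point rather than a drop-in replacement.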

Answered by Yhlas

This is an example of counting a word in a list of words. I assume that reading the words from a file would work in much the same way.

def count(word, array):
    n = 0
    for x in array:
        if x == word:
            n += 1
    return n

text = 'apple orange kiwi apple orange grape kiwi apple apple'
ar = text.split()

print(count('apple', ar))
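
Adapting this to a file could look roughly like the sketch below, which reads line by line so that a large log is never held in memory all at once. The function name count_in_file is hypothetical, and the temporary file only stands in for log_file so that the example runs on its own.

```python
import os
import tempfile


def count_in_file(path, word):
    """Count whitespace-separated tokens equal to `word` in the file at `path`."""
    total = 0
    with open(path) as f:
        for line in f:  # reads one line at a time
            total += line.split().count(word)
    return total


# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("apple orange kiwi apple\norange grape kiwi apple apple\n")
    tmp_path = tmp.name

result = count_in_file(tmp_path, "apple")
os.remove(tmp_path)
print(result)  # 4
```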

Answered by Hemo Syrai

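def Freq(x, y):
    # x is the file path; y is assumed to be the word to look up (it was unused before)
    d = {}
    with open(x, "r") as open_file:
        for line in open_file:
            # lowercase each line so the count is case-insensitive
            for i in line.lower().split():
                if i in d:
                    d[i] = d[i] + 1
                else:
                    d[i] = 1
    print(y, d.get(y.lower(), 0))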

Answered by Narendra

cash = 0
visa = 0
amex = 0
# each line is counted once, under the first payment type found on it
with open("text.txt", "r") as fi:
    for line in fi:
        k = line.split()
        if 'Cash' in k:
            cash += 1
        elif 'Visa' in k:
            visa += 1
        elif 'Amex' in k:
            amex += 1

print("# persons paid by cash are:", cash)
print("# persons paid by Visa card are:", visa)
print("# persons paid by Amex card are:", amex)