How to count one specific word in Python?

Question

Asked by pluvki

I want to count a specific word in the file.

For example how many times does 'apple' appear in the file. I tried this:

#!/usr/bin/env python
import re 

logfile = open("log_file", "r") 

wordcount={}
for word in logfile.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
for k,v in wordcount.items():
    print k, v

by replacing 'word' with 'apple', but it still counts all possible words in my file.

Any advice would be greatly appreciated. :)

Answer 1

Answered by Eugene Yarmash

You could just use str.count() since you only care about occurrences of a single word:

with open("log_file") as f:
    contents = f.read()
    count = contents.count("apple")

However, to avoid some corner cases, such as erroneously counting words like "appleHyman", I suggest that you use a regex:

import re

with open("log_file") as f:
    contents = f.read()
    count = sum(1 for match in re.finditer(r"\bapple\b", contents))

The \b in the regex ensures that the pattern begins and ends on a word boundary (as opposed to a substring within a longer string).
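
To make the difference concrete, here is a minimal sketch comparing the two approaches on an in-memory string. The sample text is made up for illustration and simply stands in for the contents of log_file.

```python
import re

# Hypothetical sample text standing in for the contents of log_file.
contents = "apple pie, appleHyman, pineapple and an apple."

# Plain substring counting also matches "apple" inside longer words.
substring_count = contents.count("apple")

# \b only matches at a word boundary, so only the standalone "apple" is counted.
whole_word_count = len(re.findall(r"\bapple\b", contents))

print(substring_count)   # 4
print(whole_word_count)  # 2
```

Note that a trailing period or comma still counts as a word boundary, so "apple." is matched by the regex version.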

Answer 2

Answered by Wajahat

If you only care about one word then you do not need to create a dictionary to keep track of every word count. You can just iterate over the file line-by-line and find the occurrences of the word you are interested in.

#!/usr/bin/env python

wordcount = 0
my_word = "apple"
with open("log_file", "r") as logfile:
    for line in logfile:
        # count every occurrence on the line, not just whether the line contains the word
        wordcount += line.split().count(my_word)

print(my_word, wordcount)

However, if you also want to count all the words, and just print the word count for the word you are interested in then these minor changes to your code should work:

#!/usr/bin/env python

wordcount = {}
with open("log_file", "r") as logfile:
    for word in logfile.read().split():
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1
# print only the count for my_word instead of iterating over entire dictionary
my_word = "apple"
# .get() returns 0 instead of raising KeyError when the word never appears
print(my_word, wordcount.get(my_word, 0))

Answer 3

Answered by Brendan Abel

You can use the Counter dictionary for this:

from collections import Counter

with open("log_file", "r") as logfile:
    word_counts = Counter(logfile.read().split())

# A Counter returns 0 for a missing key, whereas .get() would return None
print(word_counts['apple'])
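
One thing to watch for with split() is that punctuation and capitalization stay attached to each token. The sketch below is one possible way to normalize the tokens before counting. The sample text is hypothetical, and the choice of \w+ as the definition of a "word" is an assumption rather than part of the answer above.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for logfile.read().
text = "Apple, apple! An apple a day; pineapple is not an apple."

# split() keeps punctuation attached, so "apple!" and "apple." are separate keys.
split_counts = Counter(text.split())

# Lowercasing and extracting runs of word characters normalizes the tokens first.
token_counts = Counter(re.findall(r"\w+", text.lower()))

print(split_counts["apple"])   # 1
print(token_counts["apple"])   # 4
print(token_counts["banana"])  # 0, a Counter returns 0 for missing keys
```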

Answer 4

Answered by Yhlas

This is an example of counting a word in an array of words. I assume reading from a file would be much the same.

def count(word, array):
    n=0
    for x in array:
        if x== word:
            n+=1
    return n

text= 'apple orange kiwi apple orange grape kiwi apple apple'
ar = text.split()

print(count('apple', ar))
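
As a rough sketch of how the same idea might carry over to a file, the version below reads line by line and counts matching tokens. The helper name count_in_file and the temporary file are assumptions made so the example runs on its own; in practice the path would point at the real log file.

```python
import os
import tempfile

def count_in_file(word, path):
    """Count whitespace-separated tokens in the file that equal word."""
    n = 0
    with open(path) as f:
        for line in f:  # read one line at a time
            n += line.split().count(word)
    return n

# Write a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("apple orange kiwi\napple orange grape\nkiwi apple apple\n")

result = count_in_file("apple", tmp.name)
print(result)  # 4
os.remove(tmp.name)
```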

Answer 5

Answered by Hemo Syrai

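def Freq(x, y):
    # x is the file path; y is assumed to be the word to look up (it was unused in the original)
    d = {}
    with open(x, "r") as open_file:
        for line in open_file:
            # lowercase each line so the count is case-insensitive
            for i in line.lower().split():
                if i in d:
                    d[i] = d[i] + 1
                else:
                    d[i] = 1
    print(d)
    return d.get(y.lower(), 0)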

Answer 6

Answered by Narendra

fi=open("text.txt","r")
cash=0
visa=0
amex=0
for line in fi:
    k=line.split()
    print(k)
    if 'Cash' in k:
        cash=cash+1
    elif 'Visa' in k:
        visa=visa+1
    elif 'Amex' in k:
        amex=amex+1

print("# persons paid by cash are:", cash)
print("# persons paid by Visa card are:", visa)
print("# persons paid by Amex card are:", amex)
fi.close()
