python - find the occurrence of the word in a file
Original question: http://stackoverflow.com/questions/15083119/
This page reproduces a StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Ashwin
I am trying to count how many times each word occurs in a file. I have a text file (TEST.txt) with the following content:
ashwin programmer india
amith programmer india
The result I expect is:
{'ashwin': 1, 'programmer': 2, 'india': 2, 'amith': 1}
The code I am using is:
from collections import Counter

for line in open('TEST.txt', 'r'):
    word = Counter(line.split())
    print(word)
The result I get is:
Counter({'ashwin': 1, 'programmer': 1, 'india': 1})
Counter({'amith': 1, 'programmer': 1, 'india': 1})
Can anyone please help me? Thanks in advance.
Accepted answer by Mark Tolonen
Use the update method of Counter, which adds to the existing counts instead of replacing them. Example:
from collections import Counter
data = '''\
ashwin programmer india
amith programmer india'''
c = Counter()
for line in data.splitlines():
    c.update(line.split())
print(c)
Output:
Counter({'india': 2, 'programmer': 2, 'amith': 1, 'ashwin': 1})
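The same update pattern can be applied to a real file rather than an inline string. The following is a minimal sketch, not part of the original answer: the helper name count_words is hypothetical, and the temporary file only stands in for the TEST.txt from the question so the example runs on its own.

```python
from collections import Counter
import os
import tempfile

def count_words(path):
    """Count whitespace-separated words across every line of a file."""
    counts = Counter()
    with open(path, "r") as f:
        for line in f:
            # update() adds to the running totals instead of replacing them
            counts.update(line.split())
    return counts

# Write the sample data from the question to a temporary file (illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "TEST.txt")
    with open(sample_path, "w") as f:
        f.write("ashwin programmer india\namith programmer india\n")
    result = count_words(sample_path)

print(dict(result))
```

Reading line by line keeps memory use low for large files, since only one line is held at a time.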
Answer by Anorov
You're iterating over every line and creating a new Counter each time. You want a single Counter to run over the entire file. Try:
from collections import Counter

with open("TEST.txt", "r") as f:
    # Read the whole file and split it on whitespace into a list of words
    contents = f.read().split()
print(Counter(contents))
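The question asked for a plain dictionary, while the code above prints a Counter. As a small sketch (the sample text is inlined here as an assumption, so the example does not depend on TEST.txt), a Counter can be converted with dict(), and most_common gives the highest counts first:

```python
from collections import Counter

text = "ashwin programmer india\namith programmer india"
counts = Counter(text.split())

# Counter is a dict subclass, so dict() yields a plain mapping of word to count.
as_dict = dict(counts)
print(as_dict)

# most_common(n) returns the n highest counts as (word, count) pairs.
top_two = counts.most_common(2)
print(top_two)
```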
Answer by Mikhail Vladimirov
from collections import Counter

cnt = Counter()
with open('TEST.txt', 'r') as f:
    for line in f:
        for word in line.split():
            cnt[word] += 1
print(cnt)
Answer by GrilledTuna
Using a defaultdict:
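from collections import defaultdict

def read_file(fname):
    """Return a dict mapping each word in the file to its count."""
    words_dict = defaultdict(int)
    with open(fname, 'r') as fp:
        for line in fp:
            # split() with no argument drops the trailing newline and
            # collapses runs of whitespace, unlike split(' ')
            for word in line.split():
                words_dict[word] += 1
    return words_dict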
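When counting words line by line, the choice between split(' ') and split() affects the result. The sketch below uses a made-up sample line (an assumption, with a deliberate double space) to show the difference:

```python
line = "amith  programmer india\n"

# split(' ') splits on every single space: the double space yields an
# empty string, and the newline stays attached to the last word.
by_space = line.split(' ')

# split() with no argument splits on any run of whitespace and drops
# leading and trailing whitespace, including the newline.
by_whitespace = line.split()

print(by_space)
print(by_whitespace)
```

With split(' '), 'india\n' and 'india' would be counted as two different words, so split() is usually the safer choice for word counts.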
Answer by Fuji Komalan
FILE_NAME = 'file.txt'

wordCounter = {}

with open(FILE_NAME, 'r') as fh:
    for line in fh:
        # Strip commas, apostrophes, and periods, then lowercase the line.
        # split() breaks the line into a list of words.
        word_list = line.replace(',', '').replace('\'', '').replace('.', '').lower().split()
        for word in word_list:
            # Add the word to the wordCounter dictionary.
            if word not in wordCounter:
                wordCounter[word] = 1
            else:
                # The word is already in the dictionary, so increment its count.
                wordCounter[word] = wordCounter[word] + 1

print('{:15}{:3}'.format('Word', 'Count'))
print('-' * 18)

# Print each word and its number of occurrences.
for (word, occurrence) in wordCounter.items():
    print('{:15}{:3}'.format(word, occurrence))
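Chained replace calls cover only the characters listed explicitly. One possible alternative, sketched here under the assumption that removing all ASCII punctuation is acceptable, is a translation table built from string.punctuation. The helper name normalized_words and the sample sentence are hypothetical.

```python
import string
from collections import Counter

# Translation table that deletes every ASCII punctuation character.
strip_punct = str.maketrans('', '', string.punctuation)

def normalized_words(line):
    """Lowercase a line, drop ASCII punctuation, and split on whitespace."""
    return line.translate(strip_punct).lower().split()

sample = "India, india. Ashwin's code; INDIA!"
counts = Counter(normalized_words(sample))
print(counts)
```

Note that this also removes apostrophes and hyphens inside words, so "Ashwin's" becomes "ashwins"; whether that is desirable depends on the text being counted.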
Answer by Karthic Kannan
with open('input.txt', 'r') as f:
    data = f.read().lower()
list1 = data.split()

# Start every distinct word at zero, then count each occurrence.
d = {}
for i in set(list1):
    d[i] = 0
for i in list1:
    d[i] = d[i] + 1
print(d)
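A plain dict tally like this can also be built in a single pass with dict.get, which avoids initializing every key first. This is a minimal sketch, not the answer author's code: the function name count_words and the inlined sample text are assumptions for illustration.

```python
def count_words(text):
    """Case-insensitive word tally using a plain dict."""
    counts = {}
    for word in text.lower().split():
        # get() returns 0 for a word not seen yet, so no separate
        # initialization pass is needed.
        counts[word] = counts.get(word, 0) + 1
    return counts

result = count_words("Ashwin programmer India\namith Programmer india")
print(result)
```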

