Python read log files and get lines containing specific words
Notice: this page is a Chinese-English side-by-side translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/16017419/
Asked by James H
I have log files (named in the format YYMMDD) and I'd like to create a script that gets only the important information from the files (like the lines that contain "O:NVS:VOICE"). I have never used Python before, so please help!
Accepted answer by Gareth Webber
This should get you started nicely:
infile = r"D:\Documents and Settings\xxxx\Desktop\test_log.txt"

important = []
keep_phrases = ["test",
                "important",
                "keep me"]

# Read the whole file into memory as a list of lines.
with open(infile) as f:
    lines = f.readlines()

for line in lines:
    for phrase in keep_phrases:
        if phrase in line:
            # Keep the line once, even if several phrases match.
            important.append(line)
            break

print(important)
It's by no means perfect: for example, there is no exception handling or pattern matching, but you can add these quite easily. Look into regular expressions, which may be better than phrase matching. If your files are very big, read them line by line instead of calling readlines(), to avoid a MemoryError.
Input file:
This line is super important!
don't need this one...
keep me!
bla bla
not bothered
ALWAYS include this test line
Output:
['This line is super important!\n', 'keep me!\n', 'ALWAYS include this test line']
Note: This is Python 3.3.
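The improvements suggested in this answer (regular expressions, reading line by line, and basic exception handling) could be sketched roughly as follows. This is only one possible arrangement, not the answerer's own code: the function name `important_lines` and the way the pattern is built from `keep_phrases` are assumptions made for illustration.

```python
import re

keep_phrases = ["test", "important", "keep me"]

# Assumption: treat each phrase literally and combine them into one pattern.
# re.escape keeps characters such as "." or ":" from acting as regex syntax.
KEEP_PATTERN = re.compile("|".join(re.escape(p) for p in keep_phrases))


def important_lines(path, pattern=KEEP_PATTERN):
    """Yield each line of the file at `path` that matches `pattern`."""
    try:
        with open(path) as f:
            # Iterating over the file object reads one line at a time,
            # so the whole file never has to fit in memory.
            for line in f:
                if pattern.search(line):
                    yield line.rstrip("\n")
    except OSError as exc:
        # Covers a missing file, a permission error, and similar problems.
        print("Could not read {}: {}".format(path, exc))


if __name__ == "__main__":
    # Hypothetical file name; replace it with a real log file path.
    for line in important_lines("test_log.txt"):
        print(line)
```

Because `important_lines` is a generator, matches are produced as the file is read, and a file that cannot be opened simply yields nothing after the error message is printed.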
Answer by John
You'll need to know how to loop over the files in a directory, how to use regular expressions to make sure the name of each file you are looping over matches your log file format, how to open a file, how to loop over the lines in the open file, and how to check whether one of those lines contains what you are looking for.
And here is some code to get you started.
with open("log.log", "r") as f:
    for line in f:
        if "O:NVS:VOICE" in line:
            # Strip the trailing newline so the output is not double-spaced.
            print(line.rstrip("\n"))
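The steps listed in this answer could be combined along the following lines. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the log file names are exactly six digits (YYMMDD) with an optional extension, and the function name `find_lines` and its parameters are hypothetical.

```python
import os
import re

# Assumption: a log file name is six digits (YYMMDD), optionally followed
# by an extension such as ".log" or ".txt".
LOG_NAME = re.compile(r"^\d{6}(\.\w+)?$")


def find_lines(directory, phrase="O:NVS:VOICE"):
    """Return (file name, line) pairs for every line containing `phrase`."""
    matches = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip anything that is not a regular file with a YYMMDD-style name.
        if not LOG_NAME.match(name) or not os.path.isfile(path):
            continue
        with open(path, "r") as f:
            for line in f:
                if phrase in line:
                    matches.append((name, line.rstrip("\n")))
    return matches
```

Calling `find_lines(".")` would scan the current directory. The pattern only checks that the name is six digits long, not that the digits form a valid date, so tighten it if stray numeric file names are a concern.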

