Python RegEx - Getting multiple pieces of information out of a string

Question

Asked by Joshua

I'm trying to use Python to parse a log file and match four pieces of information in one regex (epoch time, SERVICE NOTIFICATION, hostname and CRITICAL), but I can't seem to get this to work. So far I've only been able to match two of the four. Is it possible to do this? Below is an example of a string from the log file and the code I've gotten to work thus far. Any help would make me a happy noob.

[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call


hostname = options.hostname

n = open('/var/tmp/nagios.log', 'r')
n.readline()
l = [str(x) for x in n]
for line in l:
    match = re.match (r'^\[(\d+)\] SERVICE NOTIFICATION: ', line)
    if match:
       timestamp = int(match.groups()[0])
       print timestamp

Answer 1

Answered by Alex Martelli

You can use | to match any one of several possible things, and re.findall to get all non-overlapping matches of some RE.
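As a minimal sketch of that idea, the snippet below runs re.findall with an alternation over the sample line from the question. The pattern itself is an assumption made for illustration (a 10-digit epoch, "SERVICE" plus one uppercase word, a dotted lowercase hostname, and the literal CRITICAL); it is not taken from the answer.

```python
import re

# Sample line taken from the question.
line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

# Hypothetical pattern: each alternative after a | matches one kind of item.
# The group is non-capturing, so findall returns the whole matched text.
pattern = re.compile(
    r'\d{10}'                          # epoch timestamp
    r'|SERVICE [A-Z]+'                 # SERVICE ALERT, SERVICE NOTIFICATION, ...
    r'|[a-z0-9-]+(?:\.[a-z0-9-]+)+'    # dotted hostname
    r'|CRITICAL'                       # state keyword
)

print(pattern.findall(line))
```

Note that findall returns every match in order, so CRITICAL appears twice for this line because it occurs twice in the input.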

Answer 2

Answered by Dietrich Epp

The question is a bit confusing. But you don't need to do everything with regular expressions; there are some good plain old string functions you might want to try, like 'split'.

This version will also refrain from loading the entire file in memory at once, and it will close the file even when an exception is thrown.


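import re

regexp = re.compile(r'\[(\d+)\] SERVICE NOTIFICATION: (.+)')
# The with block closes the file even if an exception is raised.
with open('/var/tmp/nagios.log', 'r') as log_file:
    # Iterating over the file object reads one line at a time.
    for line in log_file:
        # Everything before the first ';' holds the timestamp, type and hostname.
        fields = line.split(';')
        match = regexp.match(fields[0])
        if match:
            timestamp = int(match.group(1))
            hostname = match.group(2)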
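For comparison, here is a rough sketch that pulls the same four values out of the sample line using only string methods and no regex at all. The variable names (head, kind, state) are made up for illustration, and the sketch assumes every line has the same layout as the sample in the question.

```python
# Sample line taken from the question.
line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

fields = line.split(';')

# fields[0] is '[1242248375] SERVICE ALERT: myhostname.com'
head = fields[0]
# Assumption: the third ';'-separated field always carries the state.
state = fields[2]

# Split the head on the first '] ' and then on the first ': '.
stamp, _, rest = head.partition('] ')
timestamp = int(stamp.lstrip('['))
kind, _, hostname = rest.partition(': ')

print(timestamp, kind, hostname, state)
```

This trades the flexibility of a pattern for readability, and it would need extra checks before being used on lines that might not follow this layout.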

Answer 3

Answered by Mike Kale

You can use more than one group at a time, e.g.:


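import re

logstring = '[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call'
# Raw string so the backslash escapes reach the regex engine unchanged.
exp = re.compile(r'^\[(\d+)\] ([A-Z ]+): ([A-Za-z0-9.\-]+);[^;]+;([A-Z]+);')
m = exp.search(logstring)

# One group each for the timestamp, alert type, hostname and state.
for s in m.groups():
    print(s)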
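One possible variation, not part of the answer itself, is to give each group a name with the (?P<name>...) syntax so the values can be looked up by name rather than by position. The group names below (timestamp, kind, hostname, state) are chosen here purely for illustration.

```python
import re

# Sample line taken from the question.
logstring = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
             'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

# Same structure as the pattern above, with hypothetical group names added.
exp = re.compile(
    r'^\[(?P<timestamp>\d+)\] (?P<kind>[A-Z ]+): '
    r'(?P<hostname>[A-Za-z0-9.\-]+);[^;]+;(?P<state>[A-Z]+);'
)

m = exp.search(logstring)
if m:
    # groupdict() maps each group name to the text it captured.
    info = m.groupdict()
    print(info['timestamp'], info['kind'], info['hostname'], info['state'])
```

Checking that m is not None before using it avoids an AttributeError on lines that do not match.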

Answer 4

Answered by user114075

If you are looking to split out those particular parts of the line, then something along the lines of:

match = re.match(r'^\[(\d+)\] (.*?): (.*?);.*?;(.*?);',line)

That should give you each of those parts at its respective index in match.groups().
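To make that concrete, here is a small sketch that applies the same pattern to the sample line from the question and unpacks the groups. The variable names on the left of the unpacking are assumptions chosen for readability.

```python
import re

# Sample line taken from the question.
line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

match = re.match(r'^\[(\d+)\] (.*?): (.*?);.*?;(.*?);', line)
if match:
    # groups() returns the four captured strings in pattern order.
    timestamp, kind, hostname, state = match.groups()
    print(int(timestamp), kind, hostname, state)
```

The lazy quantifiers (.*?) make each group stop at the first delimiter that lets the rest of the pattern match, which is what keeps the hostname from swallowing the later fields.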

Answer 5

Answered by Oddthinking

Could it be as simple as this: the "SERVICE NOTIFICATION" in your pattern doesn't match the "SERVICE ALERT" in your example?
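A quick way to check this is to run both variants against the sample line. In the sketch below, the first pattern is the one from the question; the second, which accepts either NOTIFICATION or ALERT, is a hypothetical relaxation and only makes sense if both kinds of line are wanted.

```python
import re

# Sample line taken from the question.
line = ('[1242248375] SERVICE ALERT: myhostname.com;DNS: Recursive;'
        'CRITICAL;SOFT;1;CRITICAL - Plugin timed out while executing system call')

# Pattern as written in the question.
strict = re.compile(r'^\[(\d+)\] SERVICE NOTIFICATION: ')
# Hypothetical variant that accepts either kind of line.
relaxed = re.compile(r'^\[(\d+)\] SERVICE (?:NOTIFICATION|ALERT): ')

# re.match returns None when the pattern does not match at the start.
print(strict.match(line))
print(relaxed.match(line).group(1))
```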
