Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/6107700/

Time: 2020-10-30 14:21:20  Source: igfitidea

Tool to extract Java stack traces from log files

Tags: java, exception, logging, stack-trace

Asked by Andrey Adamovich

Is there any tool that can extract a list of the stack traces appearing in a log file and possibly count the unique ones?

EDIT: I would prefer something that is not GUI-based, can run in the background, and gives some kind of report back. I have quite a lot of logs gathered from several environments and would just like to get a quick overview.

Accepted answer by Andrey Adamovich

I have come up with the following Groovy script. It is, of course, very much adjusted to my needs, but I hope it helps someone.

def traceMap = [:]

// Number of lines to keep in buffer
def BUFFER_SIZE = 100

// Pattern for stack trace line
def TRACE_LINE_PATTERN = '^[\\s\\t]+at .*$'

// Log line pattern between which we try to capture full trace
def LOG_LINE_PATTERN = '^([<#][^/]|\\d\\d).*$'

// List of patterns to replace in final captured stack trace line 
// (e.g. replace date and transaction information that may make similar traces to look as different)
def REPLACE_PATTERNS = [
  '^\\d+-\\d+\\@.*?tksId: [^\\]]+\\]',
  '^<\\w+ \\d+, \\d+ [^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <',
  '^####<[^>]+?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <',
  '<([\\w:]+)?TransaktionsID>[^<]+?</([\\w:]+)?TransaktionsID>',
  '<([\\w:]+)?TransaktionsTid>[^<]+?</([\\w:]+)?TransaktionsTid>'
]

new File('.').eachFile { File file ->
  if (file.name.contains('.log') || file.name.contains('.out')) {
    def bufferLines = []
    file.withReader { Reader reader ->
      while (reader.ready()) {      
        def String line = reader.readLine()
        if (line.matches(TRACE_LINE_PATTERN)) {
          def trace = []
          for(def i = bufferLines.size() - 1; i >= 0; i--) {
            if (!bufferLines[i].matches(LOG_LINE_PATTERN)) {
              trace.add(0, bufferLines[i])
            } else {
              trace.add(0, bufferLines[i])
              break
            }
          }
          trace.add(line)
          if (reader.ready()) {
            line = reader.readLine()
            while (!line.matches(LOG_LINE_PATTERN)) {
              trace.add(line)
              if (reader.ready()) {
                line = reader.readLine()
              } else {
                break;
              }
            }
          }
          def traceString = trace.join("\n")
          REPLACE_PATTERNS.each { pattern ->
            traceString = traceString.replaceAll(pattern, '')
          }
          if (traceMap.containsKey(traceString)) {
            traceMap.put(traceString, traceMap.get(traceString) + 1)
          } else {
            traceMap.put(traceString, 1)
          }
        }
        // Keep the buffer of last lines.
        bufferLines.add(line)
        if (bufferLines.size() > BUFFER_SIZE) {
          bufferLines.remove(0)
        }
      }
    }
  }
}

traceMap = traceMap.sort { it.value }

traceMap.reverseEach { trace, number ->
  println "-- Occurred $number times -----------------------------------------"
  println trace
}

Answer by Raman

Here is a quick-and-dirty grep expression. If you are using a logger such as log4j, then the first line of the exception will generally contain WARN or ERROR, the next line will contain the exception name and, optionally, a message, and the subsequent stack trace will begin with one of the following:

  1. "\tat" (tab + at)
  2. "Caused by: "
  3. "\t... <some number> more" (these are the lines that indicate the number of frames in the stack not shown in a "Caused by" exception)
  4. An Exception name (and perhaps message) before the stack

We want to get all of the above lines, so the grep expression is:

grep -P "(WARN|ERROR|^\tat |Exception|^Caused by: |\t\.\.\. \d+ more)"

It assumes that the name of an exception class always contains the word Exception, which may or may not be true, but this is quick-and-dirty after all.

Adjust as necessary for your specific case.
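
For anyone who would rather post-process the matches in a script, the same filter can be sketched in Python. This is only an illustration of the expression above, not part of the original answer: the function name filter_trace_lines and the sample log lines are made up, and the dots of "... N more" are escaped so they match literal dots.

```python
import re

# Same alternatives as the grep expression; the dots in "... N more" are escaped.
TRACE_RELATED = re.compile(
    r"(WARN|ERROR|^\tat |Exception|^Caused by: |\t\.\.\. \d+ more)"
)

def filter_trace_lines(lines):
    """Return only the lines that look like part of a logged exception."""
    return [line for line in lines if TRACE_RELATED.search(line)]

# Hypothetical sample log, for illustration only.
sample = [
    "2020-01-01 10:00:00 INFO  Starting up",
    "2020-01-01 10:00:01 ERROR Request failed",
    "java.lang.IllegalStateException: boom",
    "\tat com.example.Service.run(Service.java:42)",
    "Caused by: java.io.IOException: disk full",
    "\t... 12 more",
    "2020-01-01 10:00:02 INFO  Still running",
]

for line in filter_trace_lines(sample):
    print(line)
```

Like the grep expression, this keeps any line that merely mentions WARN, ERROR or Exception, so it is a coarse filter rather than a parser.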

Answer by Aaron Digulla

You can write this yourself pretty easily. Here is the pattern:

  1. Open file
  2. Search for the string "\n\tat " (that's new line, tab, at, blank). This is a pretty uncommon string outside of stack traces.

Now all you need to do is find the first line that doesn't start with \t to find the end of the stack trace. You may want to skip 1-3 lines after that to catch chained exceptions.

Also, add a couple of lines (say 10 or 50) before the first line of the stack trace to get some context.
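
A minimal Python sketch of this recipe might look as follows. It is one possible reading, not the answerer's code: the function name extract_traces, the context parameter and the sample lines are hypothetical, and instead of skipping a fixed 1-3 lines it treats a line starting with "Caused by: " as part of the trace so that chained exceptions stay together.

```python
def extract_traces(lines, context=10):
    """Return a list of traces, each a list of consecutive log lines.

    A trace is detected at the first line that starts with tab + "at ",
    extended backwards by up to `context` lines, and ended at the first
    line that starts with neither a tab nor "Caused by: ".
    """
    traces = []
    prev_end = 0  # index just past the previous trace, so context never overlaps it
    i = 0
    while i < len(lines):
        if lines[i].startswith("\tat "):
            start = max(prev_end, i - context)
            end = i
            while end < len(lines) and lines[end].startswith(("\t", "Caused by: ")):
                end += 1
            traces.append(lines[start:end])
            prev_end = end
            i = end
        else:
            i += 1
    return traces

# Hypothetical sample log, for illustration only.
sample = [
    "10:00:00 INFO ok",
    "10:00:01 ERROR failed",
    "java.lang.RuntimeException: boom",
    "\tat com.example.A.run(A.java:1)",
    "\tat com.example.B.run(B.java:2)",
    "Caused by: java.io.IOException: disk full",
    "\tat com.example.C.run(C.java:3)",
    "\t... 2 more",
    "10:00:02 INFO ok again",
]

for trace in extract_traces(sample, context=2):
    print("\n".join(trace))
    print("-----")
```

With context=2 the sample yields a single trace that starts at the ERROR line and ends at the "... 2 more" line. Counting unique traces would then be a matter of joining each trace into one string and feeding it to a dictionary or collections.Counter.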

Answer by daniel kullmann

I wrote a tool in Python. It manages to split two stack traces even if they come right after each other in the log.

#!/usr/bin/env python
#
# Extracts exceptions from log files.
#

import sys
import re
from collections import defaultdict

REGEX = re.compile(r"(^\tat |^Caused by: |^\t\.\.\. \d+ more)")
# Usually, all inner lines of a stack trace will be "at" or "Caused by" lines.
# With one exception: the line following a "nested exception is" line does not
# follow that convention. Due to that, this line is handled separately.
CONT = re.compile("; nested exception is: *$")

exceptions = defaultdict(int)

def registerException(exc):
  exceptions[exc] += 1

def processFile(fileName):
  with open(fileName, "r") as fh:
    currentMatch = None
    lastLine = None
    addNextLine = False
    for line in fh.readlines():
      if addNextLine and currentMatch != None:
        addNextLine = False
        currentMatch += line
        continue
      match = REGEX.search(line) != None
      if match and currentMatch != None:
        currentMatch += line
      elif match:
        # lastLine is None when the very first line of the file matches
        currentMatch = (lastLine or "") + line
      else:
        if currentMatch != None:
          registerException(currentMatch)
        currentMatch = None
      lastLine = line
      addNextLine = CONT.search(line) != None
    # If last line in file was a stack trace
    if currentMatch != None:
      registerException(currentMatch)

for f in sys.argv[1:]:
  processFile(f)

for item in sorted(exceptions.items(), key=lambda e: e[1], reverse=True):
  # Works under both Python 2 and Python 3
  print("%d : %s" % (item[1], item[0]))

Answer by Ani

Here's some nice code that does the same: http://www.techiedelight.com/java-program-search-exceptions-huge-log-file-on-server/

It basically reads the log file line by line and searches for the keyword "Exception" in each line. Once found, it prints the next 10 lines (the exception trace) to a separate output file.
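
The linked program is written in Java; the idea itself can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function name copy_exception_context, its parameters, the file names and the sample log content. The sketch also copies the matching line itself, which the description above does not state explicitly.

```python
import os
import tempfile

def copy_exception_context(log_path, out_path, keyword="Exception", following=10):
    """Copy each line containing `keyword`, plus the next `following` lines, to out_path."""
    remaining = 0  # number of follow-up lines still to be copied
    with open(log_path, "r") as log, open(out_path, "w") as out:
        for line in log:
            if keyword in line:
                out.write(line)
                remaining = following
            elif remaining > 0:
                out.write(line)
                remaining -= 1

# Demo on a throwaway file with hypothetical log content.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")
    out_path = os.path.join(tmp, "exceptions.txt")
    with open(log_path, "w") as f:
        f.write(
            "INFO ok\n"
            "java.lang.RuntimeException: boom\n"
            "\tat A.run(A.java:1)\n"
            "INFO later\n"
            "INFO much later\n"
        )
    copy_exception_context(log_path, out_path, following=2)
    with open(out_path) as f:
        copied = f.read()

print(copied)
```

A fixed window is simple but imprecise: short traces drag unrelated lines along, and long traces get cut off after the chosen number of lines.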

Answer by Rudy

I use Baretail.
