Bash tail on a live log file, counting uniq lines with the same date/time

Question

Asked by zapp

I'm looking for a good way to tail a live log file and display the number of lines that share the same date/time.

Currently this is working:

 tail -F /var/logs/request.log | [cut the date-time] | uniq -c

BUT the performance is not good enough: there is a delay of more than a minute, and the output arrives in bursts of a few lines at a time.

Any ideas?

Answer 1

Answered by Floris

Your problem is most likely related to buffering in your system, not anything intrinsically wrong with your line of code. I was able to create a test scenario where I could reproduce it - then make it go away. I hope it will work for you too.

Here is my test scenario. First I write a short script that writes the time to a file every 100 ms (approximately). This is my "log file", which generates enough data that uniq -c should give me an interesting output every second:

#!/bin/ksh
while :
do
  echo The time is `date` >> a.txt
  sleep 0.1
done

(Note: I had to use ksh, which has the ability to do a sub-second sleep.)

In another window, I type

tail -f a.txt | uniq -c

Sure enough, you get the following output appearing every second:

   9 The time is Thu Dec 12 21:01:05 EST 2013
  10 The time is Thu Dec 12 21:01:06 EST 2013
  10 The time is Thu Dec 12 21:01:07 EST 2013
   9 The time is Thu Dec 12 21:01:08 EST 2013
  10 The time is Thu Dec 12 21:01:09 EST 2013
   9 The time is Thu Dec 12 21:01:10 EST 2013
  10 The time is Thu Dec 12 21:01:11 EST 2013
  10 The time is Thu Dec 12 21:01:12 EST 2013

And so on, with no delays. Important to note: I did not attempt to cut out the time. Next, I ran

tail -f a.txt | cut -f7 -d' ' | uniq -c

And your problem reproduced - it would "hang" for quite a while (until there was 4k of characters in the buffer, and then it would vomit it all out at once).

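The difference between the two pipelines comes down to where the output of cut goes. C programs that use stdio typically line-buffer stdout when it is a terminal and block-buffer it when it is a pipe or a file, and in the second pipeline cut writes into a pipe. The sketch below shows how a process can tell the two cases apart with the standard test -t check. The function name describe_stdout is made up for illustration, and the snippet demonstrates only the detection, not the buffering itself.

```shell
#!/bin/sh
# describe_stdout: report whether file descriptor 1 is a terminal.
# (Hypothetical helper name, used only for this illustration.)
describe_stdout() {
  if [ -t 1 ]; then
    echo "stdout is a terminal"
  else
    echo "stdout is not a terminal"
  fi
}

# Piped into another command, stdout is a pipe, just as it is for
# cut in `tail -f a.txt | cut -f7 -d' ' | uniq -c`.
describe_stdout | cat
```

Run on its own from an interactive shell, describe_stdout reports a terminal instead, which is the situation uniq -c is in at the end of the pipeline.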

A bit of searching online (https://stackoverflow.com/a/16823549/1967396) told me of a utility called stdbuf. That answer describes almost exactly your scenario and provides the following workaround (paraphrased to match my scenario above):

tail -f a.txt | stdbuf -oL cut -f7 -d' ' | uniq -c

And that would be great… except that this utility doesn't exist on my machine (Mac OS) - it is specific to GNU coreutils. This left me unable to test - although it may be a good solution for you.

Never fear - I found the following workaround, based on the socat command (which I honestly barely understand, but I adapted it from the answer given at https://unix.stackexchange.com/a/25377).

Make a small file called tailcut.sh (this is the "long_running_command" from the link above):

#!/bin/ksh
tail -f a.txt | cut -f7 -d' '

Give it execute permissions with chmod 755 tailcut.sh. Then issue the following command:

socat EXEC:./tailcut.sh,pty,ctty STDIO | uniq -c

And hey presto - your lumpy output is lumpy no more. socat sends the output from the script straight to the next pipe, and uniq can do its thing.

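If neither stdbuf nor socat is available, another option is to replace cut with awk and flush after every line. This is a minimal sketch and not part of the answer above: the function names extract_field and sample are made up for illustration, and it assumes an awk that supports fflush() (gawk, mawk, and the BSD awk shipped with macOS do). The printf input is a finite stand-in for tail -f a.txt so that the example terminates.

```shell
#!/bin/sh
# extract_field N: print the Nth whitespace-separated field of each input
# line, flushing stdout after every line so the next pipe stage sees it
# immediately. (Hypothetical helper; assumes awk supports fflush().)
extract_field() {
  awk -v n="$1" '{ print $n; fflush() }'
}

# Finite sample standing in for `tail -f a.txt`.
sample() {
  printf '%s\n' \
    'The time is Thu Dec 12 21:01:05 EST 2013' \
    'The time is Thu Dec 12 21:01:05 EST 2013' \
    'The time is Thu Dec 12 21:01:06 EST 2013'
}

# Field 7 is the HH:MM:SS part; uniq -c counts lines per second.
sample | extract_field 7 | uniq -c
```

Against the live file, the equivalent pipeline would be tail -f a.txt | awk '{ print $7; fflush() }' | uniq -c. Note that awk splits on runs of whitespace, whereas cut -d' ' treats every single space as a delimiter, so field numbers can differ on lines that contain consecutive spaces.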

Answer 2

Answered by Julien Palard

You may try logtop (apt-get install logtop):

Usage:

tail -F /var/logs/request.log | [cut the date-time] | logtop

Example:

$ tail -f /var/log/varnish/varnishncsa.log  | awk '{print }' | logtop
5585 elements in 10 seconds (558.50 elements/s)
   1  690 69.00/s [28/Mar/2015:23:13:48
   2  676 67.60/s [28/Mar/2015:23:13:47
   3  620 62.00/s [28/Mar/2015:23:13:49
   4  576 57.60/s [28/Mar/2015:23:13:53
   5  541 54.10/s [28/Mar/2015:23:13:54
   6  540 54.00/s [28/Mar/2015:23:13:55
   7  511 51.10/s [28/Mar/2015:23:13:51
   8  484 48.40/s [28/Mar/2015:23:13:52
   9  468 46.80/s [28/Mar/2015:23:13:50

Columns are, from left to right:

Row number
Quantity seen (number of matching lines)
Hits per second
The actual line

Answer 3

Answered by yaccz

Consider how uniq -c works.

In order to print a count, it has to read every line in a run of identical lines. Only when it reads a line that differs from the previous one can it print that previous line and its number of occurrences.

That's just how the algorithm fundamentally works and there is no way around it.

You can test this by running

touch a
tail -F a | uniq -c

And then, in another terminal, run these one after another:

echo 1 >> a
echo 1 >> a
echo 1 >> a

Nothing happens. Only after you run

echo 2 >> a

can uniq print that there were 3 occurrences of "1\n".

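The behavior described above can be sketched in a few lines of awk. This is an illustrative re-implementation, not the actual uniq source, and the function name count_runs is made up. It shows why a count can only be emitted once a different line (or the end of the input) arrives.

```shell
#!/bin/sh
# count_runs: a simplified stand-in for `uniq -c` (illustration only).
count_runs() {
  awk '
    # A line that differs from the previous one closes the current run,
    # so only now can that run be reported.
    NR > 1 && $0 != prev { print n, prev; fflush(); n = 0 }
    # Every line extends the run it belongs to.
    { prev = $0; n++ }
    # End of input closes the last run; under tail -F this never happens.
    END { if (NR > 0) print n, prev }
  '
}

printf '1\n1\n1\n2\n' | count_runs
# prints:
# 3 1
# 1 2
```

On a live log, the count for a given timestamp therefore shows up when the first line carrying the next timestamp arrives.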
