Original question: http://stackoverflow.com/questions/20554032/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
bash tail on a live log file, counting uniq lines with same date/time
Asked by zapp
I'm looking for a good way to tail a live log file and display the number of lines that share the same date/time.
Currently this is working:
tail -F /var/logs/request.log | [cut the date-time] | uniq -c
But the performance is not good enough: there is a delay of more than one minute, and the output arrives in bursts of a few lines at a time.
Any ideas?
Answer by Floris
Your problem is most likely related to buffering in your system, not anything intrinsically wrong with your line of code. I was able to create a test scenario where I could reproduce it - then make it go away. I hope it will work for you too.
Here is my test scenario. First I write a short script that writes the time to a file every 100 ms (approx) - this is my "log file" that generates enough data that uniq -c should give me an interesting output every second:
#!/bin/ksh
# Append a timestamped line to a.txt roughly every 100 ms.
while :
do
    echo The time is $(date) >> a.txt
    sleep 0.1
done
(Note - I had to use ksh, which has the ability to do a sub-second sleep.)
In another window, I type
tail -f a.txt | uniq -c
Sure enough, you get the following output appearing every second:
9 The time is Thu Dec 12 21:01:05 EST 2013
10 The time is Thu Dec 12 21:01:06 EST 2013
10 The time is Thu Dec 12 21:01:07 EST 2013
9 The time is Thu Dec 12 21:01:08 EST 2013
10 The time is Thu Dec 12 21:01:09 EST 2013
9 The time is Thu Dec 12 21:01:10 EST 2013
10 The time is Thu Dec 12 21:01:11 EST 2013
10 The time is Thu Dec 12 21:01:12 EST 2013
etc. No delays. Important to note - I did not attempt to cut out the time. Next, I did
tail -f a.txt | cut -f7 -d' ' | uniq -c
And your problem reproduced - it would "hang" for quite a while (until there was 4k of characters in the buffer, and then it would vomit it all out at once).
A bit of searching online (https://stackoverflow.com/a/16823549/1967396) told me of a utility called stdbuf. That answer describes almost exactly your scenario and provides the following workaround (paraphrased to match my scenario above):
tail -f a.txt | stdbuf -oL cut -f7 -d' ' | uniq -c
And that would be great… except that this utility doesn't exist on my machine (Mac OS) - it is specific to GNU coreutils. This left me unable to test - although it may be a good solution for you.
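Applied to the pipeline from the question, the stdbuf approach might look like the sketch below. It assumes GNU coreutils is installed and that the timestamp occupies the first two space-separated fields of each log line. The question does not show the log format, so the field numbers, the sample lines and the function name count_per_timestamp are all placeholders.

```shell
#!/bin/sh
# Hypothetical helper: count consecutive lines that share a timestamp.
# Assumes the timestamp is fields 1-2 (space-separated); adjust -f to fit.
count_per_timestamp() {
    # -oL asks cut to line-buffer its stdout, so each key reaches uniq
    # as soon as the line is read instead of when a full buffer is ready.
    stdbuf -oL cut -d' ' -f1,2 | uniq -c
}

# Finite demonstration with made-up sample lines.
printf '%s\n' \
    '2013-12-12 21:01:05 GET /a' \
    '2013-12-12 21:01:05 GET /b' \
    '2013-12-12 21:01:06 GET /c' | count_per_timestamp
```

On a live file the input would come from tail instead, for example tail -F /var/logs/request.log | count_per_timestamp.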
Never fear - I found the following workaround, based on the socat command (which I honestly barely understand, but I adapted from the answer given at https://unix.stackexchange.com/a/25377).
Make a small file called tailcut.sh (this is the "long_running_command" from the link above):
#!/bin/ksh
tail -f a.txt | cut -f7 -d' '
Give it execute permissions with chmod 755 tailcut.sh. Then issue the following command:
socat EXEC:./tailcut.sh,pty,ctty STDIO | uniq -c
And hey presto - your lumpy output is lumpy no more. The socat command sends the output from the script straight to the next pipe, and uniq can do its thing.
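If neither stdbuf nor socat is at hand, another route, which is not taken from the answer above, is to let awk do the field extraction and flush after every line. This is only a sketch: it assumes an awk that provides fflush() (gawk, mawk and the awk shipped with macOS do), and the function name extract_field is a placeholder. Some awk implementations may also buffer their input when reading from a pipe, so it is worth testing on your system.

```shell
#!/bin/sh
# Hypothetical helper: print one field per input line and flush awk's
# output after each line rather than waiting for a full buffer.
# Note: awk splits on runs of whitespace by default, whereas cut -d' '
# treats every single space as a delimiter.
extract_field() {
    # $1 is the field number (7 matches the cut -f7 -d' ' used above).
    awk -v n="$1" '{ print $n; fflush() }'
}

# Finite demonstration using the line format from the test log.
printf '%s\n' 'The time is Thu Dec 12 21:01:05 EST 2013' | extract_field 7
```

In the test scenario this would be used as tail -f a.txt | extract_field 7 | uniq -c.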
Answer by Julien Palard
You may try logtop (apt-get install logtop):
Usage:
tail -F /var/logs/request.log | [cut the date-time] | logtop
Example:
$ tail -f /var/log/varnish/varnishncsa.log | awk '{print $4}' | logtop
5585 elements in 10 seconds (558.50 elements/s)
1 690 69.00/s [28/Mar/2015:23:13:48
2 676 67.60/s [28/Mar/2015:23:13:47
3 620 62.00/s [28/Mar/2015:23:13:49
4 576 57.60/s [28/Mar/2015:23:13:53
5 541 54.10/s [28/Mar/2015:23:13:54
6 540 54.00/s [28/Mar/2015:23:13:55
7 511 51.10/s [28/Mar/2015:23:13:51
8 484 48.40/s [28/Mar/2015:23:13:52
9 468 46.80/s [28/Mar/2015:23:13:50
Columns are, from left to right:
- Row number
- Number of occurrences seen
- Hits per second
- The actual line
Answer by yaccz
Consider how uniq -c works.
In order to print a count, it has to read every line in a run of identical lines. Only when it reads a line that differs from the previous one can it print that previous line and its number of occurrences.
That's just how the algorithm fundamentally works and there is no way around it.
You can test this by running
touch a
tail -F a | uniq -c
And then one after another
echo 1 >> a
echo 1 >> a
echo 1 >> a
Nothing happens. Only after you run
echo 2 >> a
can uniq print that there were 3 occurrences of "1\n".
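The counting behaviour described in this answer can be illustrated with a small shell loop. This is a re-implementation for explanation only, not uniq's actual source, and the function name count_runs is made up. It shows why a count can only be printed once a different line (or end of input) arrives.

```shell
#!/bin/sh
# Illustrative re-implementation of run counting (not uniq's real code).
count_runs() {
    prev='' count=0
    while IFS= read -r line; do
        if [ "$count" -gt 0 ] && [ "$line" != "$prev" ]; then
            # A different line arrived: only now is the previous run complete.
            printf '%d %s\n' "$count" "$prev"
            count=0
        fi
        prev=$line
        count=$((count + 1))
    done
    # End of input closes the final run; with tail -F this point is
    # never reached, so the last run stays pending.
    if [ "$count" -gt 0 ]; then
        printf '%d %s\n' "$count" "$prev"
    fi
}

# Same input as the echo commands above, fed all at once.
printf '1\n1\n1\n2\n' | count_runs
```

With this finite input the loop reports the run of three "1" lines when it reads the "2", and the run of one "2" line only at end of input.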