bash: sort unique IP addresses from an Apache log

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/18682308/

Time: 2020-09-10 00:09:49  Source: igfitidea

Sort unique IP addresses from an Apache log

Tags: apache, bash, sorting, logging

Asked by Arthur

I'm trying to extract IP addresses from my apache log, count them, and sort them.

And for whatever reason, the sorting part is horrible.

Here is the command:

cat access.* | awk '{ print $1 }' | sort | uniq -c | sort -n

Output example:

  16789 65.X.X.X
  19448 65.X.X.X
   1995 138.X.X.X
   2407 213.X.X.X
   2728 213.X.X.X
   5478 188.X.X.X
   6496 176.X.X.X
  11332 130.X.X.X

I don't understand why these values aren't really sorted. I've also tried to remove blanks at the start of the line (sed 's/^[\t ]*//g') and using sort -n -t" " -k1, which doesn't change anything.

Any hints?

Answered by linsort

This may be late, but using a numeric sort (sort -n) for the first sort will give you the desired result:

cat access.log | awk '{ print $1 }' | sort -n | uniq -c | sort -nr | head -20

Output:

  29877 93.xxx.xxx.xxx
  17538 80.xxx.xxx.xxx
   5895 198.xxx.xxx.xxx
   3042 37.xxx.xxx.xxx
   2956 208.xxx.xxx.xxx
   2613 94.xxx.xxx.xxx
   2572 89.xxx.xxx.xxx
   2268 94.xxx.xxx.xxx
   1896 89.xxx.xxx.xxx
   1584 46.xxx.xxx.xxx
   1402 208.xxx.xxx.xxx
   1273 93.xxx.xxx.xxx
   1054 208.xxx.xxx.xxx
    860 162.xxx.xxx.xxx
    830 208.xxx.xxx.xxx
    606 162.xxx.xxx.xxx
    545 94.xxx.xxx.xxx
    480 37.xxx.xxx.xxx
    446 162.xxx.xxx.xxx
    398 162.xxx.xxx.xxx
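
As a small self-contained check of the pipeline above, here is a sketch that runs it on a few made-up log lines. The sample log content and the variable names (sample_log, top_ips) are assumptions for illustration only. Note that the first sort only has to bring identical IPs next to each other so that uniq -c can count them; the final sort -nr is what orders the counts.

```shell
# Hypothetical sample log lines; only the first field (the client IP) matters here.
sample_log='10.0.0.2 - - [10/Sep/2020:00:00:01 +0000] "GET / HTTP/1.1" 200 512
10.0.0.1 - - [10/Sep/2020:00:00:02 +0000] "GET /a HTTP/1.1" 200 128
10.0.0.2 - - [10/Sep/2020:00:00:03 +0000] "GET /b HTTP/1.1" 404 0
10.0.0.3 - - [10/Sep/2020:00:00:04 +0000] "GET / HTTP/1.1" 200 512
10.0.0.2 - - [10/Sep/2020:00:00:05 +0000] "GET /c HTTP/1.1" 200 64
10.0.0.1 - - [10/Sep/2020:00:00:06 +0000] "GET / HTTP/1.1" 200 512'

# Extract the IP, group identical IPs, count each group,
# then order by count (largest first) and keep the top 20.
top_ips=$(printf '%s\n' "$sample_log" \
  | awk '{ print $1 }' \
  | sort -n \
  | uniq -c \
  | sort -nr \
  | head -20)
printf '%s\n' "$top_ips"
```

With this sample, 10.0.0.2 is listed first with a count of 3, followed by 10.0.0.1 with 2 and 10.0.0.3 with 1.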

Answered by Benjamin Dupuis

Why use cat | awk? You only need to use awk:

awk '{ print $1 }' /var/log/*access*log | sort -n | uniq -c | sort -nr | head -20

Answered by Arthur

I don't know why a simple sort -n didn't work, but adding a non-numeric character between the counter and the IP solved my issue.

cat access.* | awk '{ print $1 }' | sort | uniq -c | sed -r 's/^[ \t]*([0-9]+) (.*)$/\1 --- \2/' | sort -rn
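
To see what the sed step does in isolation, here is a minimal sketch on three lines shaped like uniq -c output. The sample IPs and the variable names are made up for illustration, and sed -E is used in place of sed -r on the assumption that it is the more widely supported spelling of the same option.

```shell
# Sample input mimicking `uniq -c` output: leading padding, a count, then an IP.
counted='   1995 138.0.0.1
  16789 65.0.0.1
   2407 213.0.0.1'

# Strip the padding and place " --- " between the count and the IP,
# so the numeric key ends at a clearly non-numeric character; then sort by count.
separated=$(printf '%s\n' "$counted" \
  | sed -E 's/^[[:space:]]*([0-9]+) (.*)$/\1 --- \2/' \
  | LC_ALL=C sort -rn)
printf '%s\n' "$separated"
```

The first line printed is "16789 --- 65.0.0.1" and the last is "1995 --- 138.0.0.1".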

Answered by tue

This should work:

cat access.* | awk '{ print $1 }' | sort | awk '{print $1 " " $2;}' | sort -n

I can't see a problem.

Control characters in the files?

File system full (temp files)?
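
Both guesses can be checked quickly. The following is only a sketch: the sample text is made up, and the temp-directory check assumes GNU sort, which writes its temporary files to $TMPDIR or, if that is unset, to /tmp.

```shell
# Hypothetical sample: the second line carries a stray carriage return.
sample=$(printf '10.0.0.1 - - ok\n10.0.0.2 - - bad\r\n10.0.0.3 - - ok\n')

# Count the lines that contain any control character
# (tabs are control characters too, so they are counted as well).
suspect_lines=$(printf '%s\n' "$sample" | LC_ALL=C grep -c '[[:cntrl:]]')
echo "lines with control characters: $suspect_lines"

# Show the free space where sort would put its temporary files.
df -h "${TMPDIR:-/tmp}"
```

On a real log, replace the printf with the log file itself, for example by passing the file name to grep.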

Answered by Antony Gibbs

If sort doesn't produce the expected result, it's probably due to a locale issue.

| LC_ALL=C sort -rn
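
The output shown in the question is consistent with this diagnosis. One possible reading, assuming the shell ran under a locale whose thousands separator is a space (the question does not state the locale, so this is an assumption): sort -n may have read "16789 65" as the single number 1678965. The sketch below imitates that reading by gluing each count to the first octet, then sorts the same data under LC_ALL=C. The variable names are illustrative.

```shell
# Counts and masked IPs copied from the question's output example.
counted='  16789 65.X.X.X
  19448 65.X.X.X
   1995 138.X.X.X
   2407 213.X.X.X
   2728 213.X.X.X
   5478 188.X.X.X
   6496 176.X.X.X
  11332 130.X.X.X'

# Imitate the suspected misreading: join the count and the first octet into one number.
glued=$(printf '%s\n' "$counted" \
  | awk '{ split($2, octets, "."); print $1 octets[1] }' \
  | LC_ALL=C sort -n)
printf '%s\n' "$glued"

# Under the C locale only the leading digits form the key, so the counts sort as intended.
fixed=$(printf '%s\n' "$counted" | LC_ALL=C sort -rn)
printf '%s\n' "$fixed"
```

The glued numbers come out as 1678965, 1944865, 1995138, 2407213, 2728213, 5478188, 6496176, 11332130, which is the same line order as in the question's output. The LC_ALL=C sort puts 19448 first and 1995 last.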

awk '{array[$1]++}END{ for (ip in array) print array[ip] " " ip}' <path/to/apache/*.log> | LC_ALL=C sort -rn
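
The awk version counts with an associative array, so no sort | uniq -c step is needed before the final sort. A minimal sketch on made-up log lines follows; the sample data and the names sample_log, hits, and ip_counts are assumptions for illustration.

```shell
# Hypothetical sample log lines; only the first field (the client IP) is used.
sample_log='10.0.0.2 - - "GET / HTTP/1.1" 200
10.0.0.1 - - "GET /a HTTP/1.1" 200
10.0.0.2 - - "GET /b HTTP/1.1" 404
10.0.0.3 - - "GET / HTTP/1.1" 200
10.0.0.2 - - "GET /c HTTP/1.1" 200
10.0.0.1 - - "GET / HTTP/1.1" 200'

# hits[ip] is incremented once per line; the END block prints "count ip" pairs
# in arbitrary order, and the C-locale numeric sort orders them by count.
ip_counts=$(printf '%s\n' "$sample_log" \
  | awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip] " " ip }' \
  | LC_ALL=C sort -rn)
printf '%s\n' "$ip_counts"
```

Because awk prints the count without the padding that uniq -c adds, each output line starts directly with the number, here "3 10.0.0.2", then "2 10.0.0.1", then "1 10.0.0.3".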

Sources:

sort not sorting as expected (space and locale)

https://www.commandlinefu.com/commands/view/9744/sort-ip-by-count-quickly-with-awk-from-apache-logs
