Using grep to get the next WORD after a match in each line
Original question: http://stackoverflow.com/questions/10971765/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow (not to this page).
Asked by aditya.gupta
I want to get the "GET" queries from my server logs.
For example, this is the server log
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:32:27] code 404, message File not fo$
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:32:27] "GET /hello HTTP/1.1" 404 -
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:41:57] code 404, message File not fo$
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:41:57] "GET /ss HTTP/1.1" 404 -
When I try with simple grep or awk,
Adi:~ adi$ awk '/GET/, /HTTP/' serverlogs.txt
it prints:
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:32:27] "GET /hello HTTP/1.1" 404 -
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:41:57] "GET /ss HTTP/1.1" 404 -
I just want to display: hello and ss
Is there any way this could be done?
Accepted answer by Tim Pote
Assuming you have GNU grep, you can use a Perl-style regex with a positive lookbehind:
grep -oP '(?<=GET\s/)\w+' file
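As a rough illustration of the lookbehind, the sketch below runs the command against two log lines copied from the question. The temp file and the variable names are hypothetical, and it assumes a GNU grep built with PCRE support. The `\K` form is not part of the original answer; it is shown only as another common way to drop a prefix from the reported match.

```shell
# Sample log lines copied from the question, written to a temp file.
log=$(mktemp)
cat > "$log" <<'EOF'
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:32:27] "GET /hello HTTP/1.1" 404 -
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:41:57] "GET /ss HTTP/1.1" 404 -
EOF

# Lookbehind form from the answer: word characters preceded by "GET /".
lookbehind=$(grep -oP '(?<=GET\s/)\w+' "$log")

# \K form (an addition): match "GET", whitespace and "/", then drop
# everything matched so far from the reported match.
keep_out=$(grep -oP 'GET\s+/\K\w+' "$log")

printf '%s\n' "$lookbehind"
rm -f "$log"
```

Both variables should end up holding the same two words, one per line.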
If you don't have GNU grep, then I'd advise just using sed:
sed -n '/^.*GET[[:space:]]\{1,\}\/\([-_[:alnum:]]\{1,\}\).*$/s//\1/p' file
If you happen to have GNU sed, that can be greatly simplified:
sed -n '/^.*GET\s\+\/\(\w\+\).*$/s//\1/p' file
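The sed approach relies on two things: a capture group around the word, and the fact that an empty regex in `s//.../` reuses the most recently used pattern (here, the address). A minimal sketch, using one sample line from the question and hypothetical variable names, compares the explicit and the short spelling:

```shell
# One sample line from the question, held in a variable.
line='1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:41:57] "GET /ss HTTP/1.1" 404 -'

# Explicit form: the s command carries the whole pattern itself.
explicit=$(printf '%s\n' "$line" |
  sed -n 's/^.*GET[[:space:]]\{1,\}\/\([-_[:alnum:]]\{1,\}\).*$/\1/p')

# Short form: the address selects the line, and the empty regex in s//\1/
# reuses that same pattern, so \1 is the captured path segment.
reused=$(printf '%s\n' "$line" |
  sed -n '/^.*GET[[:space:]]\{1,\}\/\([-_[:alnum:]]\{1,\}\).*$/s//\1/p')

printf '%s\n' "$explicit" "$reused"
```

Because the pattern spans the whole line, replacing it with `\1` leaves only the captured word, and `-n` with the `p` flag prints only lines where the substitution happened.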
The bottom line here is that you certainly don't need pipes to accomplish this: grep or sed alone will suffice.
Answer by John Carter
Answer by Todd A. Jacobs
It's often easier to use a pipeline rather than a single complex regular expression. This works on the data you provided:
fgrep GET /tmp/foo |
egrep -o 'GET (.*) HTTP' |
sed -r 's/^GET \/(.+) HTTP/\1/'
This pipeline returns the following results:
hello
ss
There are certainly other ways to get the job done, but this patently works on the provided corpus.
Answer by Charles Chow
Use a pipe if you use grep:
grep -o '/he.*' log.txt | grep -o '[^/].*'
grep -o '/ss' log.txt | grep -o '[^/].*'
[^/] is a negated bracket expression: it matches any single character other than /, so the second grep's match starts at the first character after the leading slash.
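To see the bracket expression in isolation, the small sketch below feeds a sample path (a hypothetical input, standing in for the first grep's output) through the second grep:

```shell
# '[^/].*' starts matching at the first character that is not "/",
# so the leading slash is left out of the printed match.
stripped=$(printf '%s\n' '/hello' | grep -o '[^/].*')
printf '%s\n' "$stripped"
```

Note that `.*` runs to the end of the line, so anything following the word on the same line is kept as well.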
Answer by P....
gawk '{match($0,/GET \/(\w+)/,a);} length(a[1]){print a[1]}' log.txt
hello
ss
If you have gawk, the above command uses the match function to select the desired value with a regex and store it in the array a.
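The three-argument `match` is a gawk extension. For systems without gawk, a possible alternative (not from the original answer) is the two-argument `match`, which sets `RSTART` and `RLENGTH`, combined with `substr`. The sketch below assumes the word consists of letters, digits, and underscores; the variable names are hypothetical.

```shell
# Sample log lines copied from the question.
log_lines='1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:32:27] code 404, message File not fo$
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:32:27] "GET /hello HTTP/1.1" 404 -
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:41:57] "GET /ss HTTP/1.1" 404 -'

words=$(printf '%s\n' "$log_lines" | awk '
  # match() sets RSTART and RLENGTH to the position and length of the match.
  match($0, /GET \/[A-Za-z0-9_]+/) {
    # Skip the 5 characters of "GET /" and print the rest of the match.
    print substr($0, RSTART + 5, RLENGTH - 5)
  }')

printf '%s\n' "$words"
```

Lines without a GET request do not match, so nothing is printed for them.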
Answer by ajp619
I was trying to do this and came across this link: https://www.unix.com/shell-programming-and-scripting/153101-print-next-word-after-found-pattern.html
Summary: use grep to find matching lines, then use awk to find the pattern and print the next field:
grep pattern logfile | \
awk '{for(i=1; i<=NF; i++) if($i~/pattern/) print $(i+1)}'
If you want to count the unique occurrences:
grep pattern logfile | \
awk '{for(i=1; i<=NF; i++) if($i~/pattern/) print $(i+1)}' | \
sort | \
uniq -c
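Applied to the log format from the question, with GET as the pattern, the field loop can be sketched as follows. The sample data (with /hello repeated to make the counts visible) and the variable names are made up for illustration, and the loop stops before the last field, a small change from the commands above so that `$(i + 1)` always exists.

```shell
# Sample lines in the question's log format; /hello is requested twice.
log_lines='1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:32:27] "GET /hello HTTP/1.1" 404 -
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:41:57] "GET /ss HTTP/1.1" 404 -
1.0.0.127.in-addr.arpa - - [10/Jun/2012 15:45:02] "GET /hello HTTP/1.1" 404 -'

# Print the whitespace-separated field that follows any field containing GET.
paths=$(printf '%s\n' "$log_lines" |
  awk '{ for (i = 1; i < NF; i++) if ($i ~ /GET/) print $(i + 1) }')

# Count how often each distinct value occurs.
printf '%s\n' "$paths" | sort | uniq -c
```

Since awk splits on whitespace, the next field here is the full path including its leading slash (for example /hello), not the bare word.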