Detect repeated characters using grep (bash)
Original question: http://stackoverflow.com/questions/13033226/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Unknown
I'm trying to write a grep (or egrep) command that will find and print any lines in "words.txt" which contain the same lower-case letter three times in a row. The three occurrences of the letter may appear consecutively (as in "mooo") or separated by one or more spaces (as in "x x x") but not separated by any other characters.
words.txt contains:
The monster said "grrr"!
He lived in an igloo only in the winter.
He looked like an aardvark.
Here's what I think the command should look like:
grep -E '\b[^ ]*[[:alpha:]]{3}[^ ]*\b' 'words.txt'
I know this is wrong, but I don't know enough of the syntax to figure it out. Could someone please help me do this with grep?
Answered by choroba
Does this work for you?
grep '\([[:lower:]]\) *\1 *\1'
It takes a lowercase character, [[:lower:]], and remembers it with \( ... \). It then tries to match any number of spaces (" *", zero included), the remembered character (\1), any number of spaces again, and the remembered character once more. And that's it.
You can try running it with --color=auto to see which parts of the input it matched.
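As a rough check of the explanation above, the sketch below recreates the sample file from the question and runs the backreference pattern against it. The -o option is swapped in here as a script-friendly stand-in for --color=auto (it prints only the matched text), and the variable names are illustrative, not part of the original answer.

```shell
# Recreate the sample file from the question.
cat > words.txt <<'EOF'
The monster said "grrr"!
He lived in an igloo only in the winter.
He looked like an aardvark.
EOF

# \( ... \) captures one lowercase letter; each \1 must repeat that same
# letter, and ' *' allows zero or more spaces in between.
matches=$(grep '\([[:lower:]]\) *\1 *\1' words.txt)
printf '%s\n' "$matches"

# -o prints only the part of each line that matched (here "rrr" and "oo o").
matched_parts=$(grep -o '\([[:lower:]]\) *\1 *\1' words.txt)
printf '%s\n' "$matched_parts"
```

The "aardvark" line is not printed because it only contains a double letter, while "igloo only" qualifies through the space-separated "oo o".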
Answered by nshew
Try this. Note that this will not match "mooo", as the word boundary (\b) occurs before the "m".
grep -E '\b([[:alpha:]]) *\1 *\1 *\b' words.txt
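To illustrate the note about word boundaries, here is a small sketch comparing the pattern with and without the \b anchors. It assumes GNU grep, where \b and backreferences are available with -E; the file samples.txt and its contents are made up for this comparison.

```shell
# Hypothetical test inputs, one per line.
printf '%s\n' 'mooo' 'x x x' 'aaa' 'igloo only' > samples.txt

# Pattern from this answer: the first letter must sit at a word boundary,
# and the run must end at one too.
anchored=$(grep -E '\b([[:alpha:]]) *\1 *\1 *\b' samples.txt)
printf 'anchored:\n%s\n' "$anchored"

# Same pattern without the \b anchors, for comparison.
unanchored=$(grep -E '([[:alpha:]]) *\1 *\1' samples.txt)
printf 'unanchored:\n%s\n' "$unanchored"
```

With the anchors, only "x x x" and "aaa" are printed. Without them, all four lines match, including "mooo" and "igloo only".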
[:alpha:] is a character class expression. To use it as a regex character set, it needs the extra pair of brackets: [[:alpha:]]. You may have already known this, as it looks like you started to do it, but left the open bracket unclosed.
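A minimal sketch of the bracket rule: the class name goes inside an enclosing bracket expression, and several classes can share one. The input lines are made up for illustration. A bare '[:alpha:]' is not shown because its behavior varies by implementation: GNU grep is known to reject it with an error, while other versions may treat it as a set of the literal characters ':', 'a', 'l', 'p' and 'h'.

```shell
# [[:alpha:]] is a bracket expression containing the class [:alpha:].
letters=$(printf '%s\n' 'abc' '123' ':::' | grep '[[:alpha:]]')
printf 'letters:\n%s\n' "$letters"

# Several classes can be combined in one bracket expression,
# here "any letter or digit".
alnum=$(printf '%s\n' 'abc' '123' ':::' | grep '[[:alpha:][:digit:]]')
printf 'letters or digits:\n%s\n' "$alnum"
```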

