bash: Is there an easy way to pass a "raw" string to grep?

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/11856054/

Date: 2020-09-09 22:31:04  Source: igfitidea

Is there an easy way to pass a "raw" string to grep?

Tags: bash, escaping, grep, command-line-interface

Asked by slezica

grep can't be fed "raw" strings when used from the command line, since some characters need to be escaped so that they are not treated as literals. For example:

$ grep '(hello|bye)' # WON'T MATCH 'hello'
$ grep '\(hello\|bye\)' # GOOD, BUT QUICKLY BECOMES UNREADABLE

I was using printf to auto-escape strings:

$ printf '%q' '(some|group)\n'
\(some\|group\)\\n

This produces a bash-escaped version of the string, and using backticks, this can easily be passed to a grep call:

$ grep `printf '%q' '(a|b|c)'`

However, it's clearly not meant for this: some characters in the output are not escaped, and some are unnecessarily so. For example:

$ printf '%q' '(^#)'
\(\^#\)

The ^ character should not be escaped when passed to grep.

Is there a CLI tool that takes a raw string and returns a bash-escaped version of it that can be used directly as a pattern with grep? If not, how can I achieve this in pure bash?

Accepted answer by tripleee

If you are attempting to get grep to use Extended Regular Expression syntax, the way to do that is to use grep -E (aka egrep). You should also know about grep -F (aka fgrep) and, in newer versions of GNU grep, grep -P.
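
As a quick illustration of how the default mode, -E, and -F differ on the pattern from the question, here is a minimal sketch. The sample input lines are invented for this example, and grep -P is left out because it is not available everywhere.

```bash
# Sample input, invented for this illustration.
input=$(printf '%s\n' 'hello world' 'goodbye' '(hello|bye)')

# BRE (default): ( | ) are literal, so only the third line matches.
bre=$(printf '%s\n' "$input" | grep '(hello|bye)')

# ERE: ( | ) group and alternate, so all three lines match.
ere=$(printf '%s\n' "$input" | grep -E '(hello|bye)')

# Fixed string: no regex interpretation at all, so only the third line matches.
fixed=$(printf '%s\n' "$input" | grep -F '(hello|bye)')

printf 'BRE:\n%s\nERE:\n%s\nFixed:\n%s\n' "$bre" "$ere" "$fixed"
```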

Background: The original grep had a fairly small set of regex operators; it was Ken Thompson's original regular expression implementation. A new version with an extended repertoire was developed later, and for compatibility reasons, got a different name. With GNU grep, there is only one binary, which understands the traditional, basic RE syntax if invoked as grep, and ERE if invoked as egrep. Some constructs from egrep are available in grep by using a backslash escape to introduce special meaning.

Subsequently, the Perl programming language has extended the formalism even further; this regex dialect seems to be what most newcomers erroneously expect grep, too, to support. With grep -P, it does; but this is not yet widely supported on all platforms.

So, in grep, the following characters have a special meaning: ^$[]*.\

In egrep, the following characters also have a special meaning: ()|+?{}. (The braces for repetition were not in the original egrep.) The grouping parentheses also enable backreferences with \1, \2, etc.
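
To see what the two lists mean in practice, the sketch below runs the same pattern in both modes. The input lines are made up for the illustration; + is ordinary in grep but special in egrep.

```bash
# In BRE, + is an ordinary character, so this matches the literal text "a+b".
printf '%s\n' 'a+b' 'aab' | grep 'a+b'      # prints: a+b

# In ERE, + means "one or more", so the same pattern matches "aab" instead.
printf '%s\n' 'a+b' 'aab' | grep -E 'a+b'   # prints: aab

# Escaping it in ERE restores the literal meaning.
printf '%s\n' 'a+b' 'aab' | grep -E 'a\+b'  # prints: a+b
```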

In many versions of grep, you can get the egrep behavior by putting a backslash before the egrep specials. There are also special sequences like \< and \>.

In Perl, a huge number of additional escapes like \w, \s, and \d were introduced. In Perl 5, the regex facility was substantially extended, with non-greedy matching (*?, +?, etc.), non-grouping parentheses (?:...), lookaheads, lookbehinds, etc.

... Having said that, if you really do want to convert egrep regular expressions to grep regular expressions without invoking any external process, try ${regex/pattern/substitution} for each of the egrep special characters; but recognize that this does not handle character classes, negated character classes, or backslash escapes correctly.
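
A minimal sketch of that substitution approach might look like the following. The function name ere_to_bre is hypothetical, and the sketch inherits the limitations just described: it blindly prefixes every egrep special with a backslash, including inside bracket expressions and after existing backslashes. It also assumes a grep that accepts \|, \+ and \? in basic syntax, which GNU grep does as an extension.

```bash
# ere_to_bre (hypothetical name): naive ERE-to-BRE conversion in pure bash.
# Every ( ) | + ? { } gets a backslash, with no awareness of context.
ere_to_bre() {
    local re=$1 ch
    for ch in '(' ')' '|' '+' '?' '{' '}'; do
        # "$ch" is quoted so it is matched literally, not as a glob.
        re=${re//"$ch"/\\$ch}
    done
    printf '%s\n' "$re"
}

ere_to_bre '(hello|bye)+'   # prints: \(hello\|bye\)\+
```

The result can then be passed to plain grep, for example grep "$(ere_to_bre '(hello|bye)')" file, although using grep -E directly is simpler whenever it is available.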

Answer by ephemient

If you want to search for an exact string,

grep -F '(some|group)\n' ...

-F tells grep to treat the pattern as is, with no interpretation as a regex.

(This is often available as fgrep as well.)
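
A small sketch of the effect, using invented sample lines: with -F, characters such as $, . and ( ) need no escaping because nothing in the pattern is interpreted.

```bash
# Sample lines, invented for this illustration.
lines=$(printf '%s\n' 'price: $5.00 (approx)' 'price: $5x00 approx')

# As a regex, "." would match any character and "$" would be an anchor.
# With -F every character in the pattern is taken literally.
printf '%s\n' "$lines" | grep -F '$5.00 (approx)'   # prints only the first line
```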

Answer by Riccardo Galli

When I use grep -E with user-provided strings, I escape them with this:

ere_quote() {
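    # Prefix every ERE special character with a backslash.
    # \\& in the replacement means "a literal backslash, then the matched character".
    sed 's/[][\.|$(){}?+*^]/\\&/g' <<< "$*"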
}

example run

ere_quote ' \ $ [ ] ( ) { } | ^ . ? + *'
# output
#  \\ \$ \[ \] \( \) \{ \} \| \^ \. \? \+ \*

This way you may safely insert the quoted string in your regular expression.

e.g. if you wanted to find each line starting with the user content, with the user providing funny strings such as .*

userdata=".*"
grep -E -- "^$(ere_quote "$userdata")" <<< ".*hello"
# if you have colors in grep you'll see only ".*" in red

Answer by LLL

I think the previous answers are incomplete because they miss one important thing: strings that begin with a dash (-). So while this won't work:

echo "A-B-C" | grep -F "-B-"

This one will:

echo "A-B-C" | grep -F -- "-B-"
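
As a side note not taken from the answer itself, grep's standard -e option is another way to mark the next argument as a pattern. A short sketch of both forms, with the variable name pattern chosen just for this example:

```bash
# -e marks the next argument as a pattern, even if it starts with a dash.
echo "A-B-C" | grep -F -e "-B-"        # prints: A-B-C

# Equivalent, using the end-of-options marker shown above; handy when the
# pattern comes from a variable whose contents are not known in advance.
pattern="-B-"
echo "A-B-C" | grep -F -- "$pattern"   # prints: A-B-C
```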