bash: Find files containing a given text

Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must comply with the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/6153152/

Time: 2020-09-09 20:34:14  Source: igfitidea

Find files containing a given text

bash find

Asked by Owen

In bash, I want to return the file name (and the path to the file) for every file of type .php, .html, or .js containing the case-insensitive string "document.cookie" or "setcookie".

How would I do that?

Answered by bear24rw

egrep -ir --include=*.{php,html,js} "(document.cookie|setcookie)" .

The -r flag means search recursively (search subdirectories). The -i flag means case-insensitive.

If you just want file names, add the -l (lowercase L) flag:

egrep -lir --include=*.{php,html,js} "(document.cookie|setcookie)" .
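
As a hedged sketch of how the command above behaves, the snippet below runs it against a scratch tree. The directory and file names are made up for illustration. It uses grep -E in place of egrep (equivalent, but recent GNU grep releases warn that egrep is obsolescent), writes one quoted --include per extension (which is what the brace expression expands to in bash), and escapes the dot so that it matches only a literal period.

```shell
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'var c = Document.Cookie;\n' > "$demo/a.js"
printf '<?php SetCookie("x", "1"); ?>\n' > "$demo/sub/b.php"
printf 'document.cookie in a text file\n' > "$demo/notes.txt"
printf '<p>nothing here</p>\n' > "$demo/c.html"

# -r recursive, -i case-insensitive, -l file names only, -E extended regex.
matches=$(cd "$demo" && grep -rilE --include='*.php' --include='*.html' --include='*.js' \
  'document\.cookie|setcookie' . | sort)
printf '%s\n' "$matches"
rm -rf "$demo"
```

Only a.js and sub/b.php should be listed: notes.txt is filtered out by --include, and c.html contains neither string.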

Answered by Raoul

Try something like grep -r -n -i --include="*.html" --include="*.php" --include="*.js" searchstringhere .

The -i makes it case-insensitive.

The . at the end means you want to start from your current directory; it can be replaced with any directory.

The -r means do this recursively, all the way down the directory tree.

The -n prints the line number for each match.

The --include option lets you specify file names or extensions to search. Wildcards are accepted.
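
To illustrate the flags described above, here is a small sketch on a made-up file (the scratch directory and its contents are assumptions for the demo). With -n, each match is printed as path:line-number:line.

```shell
# Hypothetical scratch file, for illustration only.
demo=$(mktemp -d)
printf 'line one\nSetCookie("a", "b");\nline three\n' > "$demo/page.php"

# -r recurse, -n show line numbers, -i ignore case; one --include per pattern.
out=$(cd "$demo" && grep -r -n -i --include='*.html' --include='*.php' --include='*.js' 'setcookie' .)
printf '%s\n' "$out"
rm -rf "$demo"
```

The match is on the second line of page.php, so the output starts with ./page.php:2: followed by the matching line.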

For more info see: http://www.gnu.org/software/grep/

Answered by Michael Berkowski

find them and grep for the string:

This will find all files of your 3 types in /starting/path and grep for the regular expression '(document\.cookie|setcookie)'. Split over 2 lines with the backslash just for readability...

find /starting/path -type f \( -name "*.php" -o -name "*.html" -o -name "*.js" \) | \
 xargs egrep -i '(document\.cookie|setcookie)'
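
A plain find | xargs pipeline splits file names on whitespace, so a name containing a space gets broken apart. One possible variation, sketched here on made-up files, passes the names NUL-separated with -print0 and xargs -0. Both are GNU/BSD extensions rather than POSIX, so this assumes such an environment.

```shell
# Hypothetical scratch tree; note the space in one file name.
demo=$(mktemp -d)
printf 'document.cookie = "a=1";\n' > "$demo/my script.js"
printf 'setcookie("a", "1");\n' > "$demo/index.php"

# -print0 / -0 pass names NUL-separated, so whitespace in names is safe.
# -l prints only the names of matching files.
found=$(cd "$demo" && find . -type f \( -name '*.php' -o -name '*.html' -o -name '*.js' \) -print0 |
  xargs -0 grep -liE 'document\.cookie|setcookie' | sort)
printf '%s\n' "$found"
rm -rf "$demo"
```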

Answered by Fredrik Pihl

Sounds like a perfect job for grep or perhaps ack.

Or this wonderful construction:

find . -type f \( -name '*.php' -o -name '*.html' -o -name '*.js' \) -exec grep "document.cookie\|setcookie" /dev/null {} \;
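
The /dev/null argument in the command above is there so that grep always sees more than one file and therefore prefixes each match with its file name. A hedged sketch of two common variations, run on made-up files: -H asks for the file name directly (supported by GNU and BSD grep), and ending -exec with + instead of \; hands many files to each grep invocation.

```shell
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
printf 'x = document.cookie;\n' > "$demo/app.js"
printf '<p>no match</p>\n' > "$demo/page.html"

# Quoted patterns keep the shell from expanding *.php etc. before find runs.
# {} + batches file names into as few grep invocations as possible.
hits=$(cd "$demo" && find . -type f \( -name '*.php' -o -name '*.html' -o -name '*.js' \) \
  -exec grep -H 'document\.cookie\|setcookie' {} +)
printf '%s\n' "$hits"
rm -rf "$demo"
```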

Answered by nos

find . -type f \( -name '*.php' -o -name '*.js' -o -name '*.html' \) |\
xargs grep -liE 'document\.cookie|setcookie'
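
One detail worth checking in find commands of this shape is operator precedence: -o binds more loosely than the implicit "and" between tests, so the -name alternatives need to be grouped with \( \) for -type f to apply to all of them. A small sketch with made-up names shows the difference.

```shell
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir "$demo/vendor.js"          # a directory whose name ends in .js
printf 'setcookie();\n' > "$demo/index.php"

# Without parentheses, -type f applies only to the first -name test,
# so the directory vendor.js is listed too.
loose=$(cd "$demo" && find . -type f -name '*.php' -o -name '*.js' | sort)
# With parentheses, -type f applies to every alternative.
strict=$(cd "$demo" && find . -type f \( -name '*.php' -o -name '*.js' \) | sort)
printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
rm -rf "$demo"
```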

Answered by Pedro Vernetti

Just to include one more alternative, you could also use this:

find "/starting/path" -type f -regextype posix-extended -regex "^.*\.(php|html|js)$" -exec grep -EH '(document\.cookie|setcookie)' {} \;

Where:

  • -regextype posix-extended tells find what kind of regex to expect
  • -regex "^.*\.(php|html|js)$" tells find the regex itself, which file paths must match
  • -exec grep -EH '(document\.cookie|setcookie)' {} \; tells find to run the command (with its options and arguments) specified between the -exec option and the \; for each file it finds, where {} marks where the file path goes in this command.

    while

    • The -E option tells grep to use extended regex (to support the parentheses) and...
    • The -H option tells grep to print file paths before the matches.
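
The options above can be tried out on a scratch tree; the directory and file names below are made up for the demo, and -regextype is assumed to be available (it is a GNU find option).

```shell
# Hypothetical scratch tree, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/src"
printf 'setcookie("a", "1");\n' > "$demo/src/login.php"
printf 'setcookie in a readme\n' > "$demo/src/README.md"

# -regex is matched against the whole path (e.g. ./src/login.php),
# which is why the pattern starts with .* rather than the bare file name.
out=$(cd "$demo" && find . -type f -regextype posix-extended \
  -regex '^.*\.(php|html|js)$' -exec grep -EH '(document\.cookie|setcookie)' {} \;)
printf '%s\n' "$out"
rm -rf "$demo"
```

README.md contains the string but is skipped because its path does not match the regex.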

And, given this, if you only want file paths, you may use:

find "/starting/path" -type f -regextype posix-extended -regex "^.*\.(php|html|js)$" -exec grep -EH '(document\.cookie|setcookie)' {} \; | sed -r 's/(^.*):.*$/\1/' | sort -u

Where

  • | [pipe] sends the output of find to the next command after it (which is sed, then sort)
  • The -r option tells sed to use extended regex.
  • s/HI/BYE/ tells sed to replace the first occurrence (on each line) of "HI" with "BYE" and...
  • s/(^.*):.*$/\1/ tells it to replace whatever matches the regex (^.*):.*$ with the first group [\1] of that match. The regex means: a group [the part enclosed by ()] containing everything [.* = zero or more of any character] from the beginning of the line [^] up to a ':', followed by anything until the end of the line [$]. Because .* is greedy, the group extends to the last ':' on the line.
  • -u tells sort to remove duplicate entries (treat sort -u as optional).
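
The greedy behaviour of .* matters when the matched text itself contains a colon. A hedged sketch on a made-up output line compares the expression above with a variant that stops at the first colon; sed -E is used here as the more portable spelling of -r. The variant assumes the file path itself contains no ':'.

```shell
# A made-up grep -H style line whose matched text itself contains a colon.
line='./a.php:setcookie("k", "a:b");'

# Greedy .* runs to the LAST colon, so part of the match leaks through.
greedy=$(printf '%s\n' "$line" | sed -E 's/(^.*):.*$/\1/')
# [^:]* stops at the FIRST colon, leaving just the path.
first=$(printf '%s\n' "$line" | sed -E 's/^([^:]*):.*$/\1/')
printf 'greedy: %s\nfirst:  %s\n' "$greedy" "$first"
```

When only paths are needed, the -l flag of grep used in the earlier answers avoids this post-processing altogether.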

...FAR from being the most elegant way. As I said, my intention is to increase the range of possibilities (and also to give more complete explanations on some tools you could use).
