Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/277546/
Can I use Git to search for matching filenames in a repository?
Asked by Newbie Git
Just say I have a file: "HelloWorld.pm" in multiple subdirectories within a Git repository.
I would like to issue a command to find the full paths of all the files matching "HelloWorld.pm":
For example:
/path/to/repository/HelloWorld.pm
/path/to/repository/but/much/deeper/down/HelloWorld.pm
/path/to/repository/please/dont/make/me/search/through/the/lot/HelloWorld.pm
How can I use Git to efficiently find all the full paths that match a given filename?
I realise I can do this with the Linux/Unix find command, but I was hoping to avoid scanning all subdirectories looking for instances of the filename.
Answered by Brian Campbell
git ls-files will give you a listing of all files in the current state of the repository (the cache or index). You can pass a pattern in to get files matching that pattern:
git ls-files HelloWorld.pm '**/HelloWorld.pm'
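To see how the two patterns behave, here is a minimal sketch that builds a throwaway repository. It assumes a POSIX shell and a reasonably recent Git; the directory and file names are made up for illustration.

```shell
# Build a scratch repository (all names below are illustrative).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir -p but/much/deeper
touch HelloWorld.pm but/much/deeper/HelloWorld.pm NotHelloWorld.pm
git add .

# The bare name matches the top-level file; the '**/' form matches nested ones.
matches=$(git ls-files HelloWorld.pm '**/HelloWorld.pm')
printf '%s\n' "$matches"
```

The paths are printed relative to the current directory, and NotHelloWorld.pm is not listed because neither pattern matches it.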
If you would like to find a set of files and grep through their contents, you can do that with git grep:
git grep some-string -- HelloWorld.pm '**/HelloWorld.pm'
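As a rough illustration of the git grep form above, the following sketch uses a scratch repository with invented file contents. The -l option (list only file names) is added here for brevity and is not part of the original command.

```shell
# Scratch repository with invented contents.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir -p lib/deep
printf 'print "some-string";\n' > HelloWorld.pm
printf 'print "other";\n' > lib/deep/HelloWorld.pm
printf 'some-string\n' > README
git add .

# Pathspecs after "--" limit which tracked files are searched;
# -l prints only the names of files that contain a match.
hits=$(git grep -l some-string -- HelloWorld.pm '**/HelloWorld.pm')
printf '%s\n' "$hits"
```

README contains the string but is not searched, because it does not match either pathspec.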
Answered by Uwe Geuder
Hmm, the original question was about the repository. A repository contains more than one commit (in the general case at least), but the answers given before search through only one commit.
Because I could not find an answer that really searches the whole commit history, I wrote a quick brute-force script, git-find-by-name, that takes (nearly) all commits into consideration.
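#!/bin/sh
# Usage: git-find-by-name <pattern>
tmpdir=$(mktemp -td git-find.XXXX) || exit 1
trap 'rm -r "$tmpdir"' EXIT INT TERM
allrevs=$(git rev-list --all)
# well, nearly all revs, we could still check the log if we have
# dangling commits and we could include the index to be perfect...
# Dump the full file listing of every revision into its own file.
for rev in $allrevs
do
    git ls-tree --full-tree -r "$rev" >"$tmpdir/$rev"
done
# Search all dumps for the pattern given as the first argument.
cd "$tmpdir" || exit 1
grep -- "$1" *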
Maybe there is a more elegant way.
Note that the parameter is passed straight to grep, so it also matches parts of a filename. If that is not desired, anchor your search expression and/or add suitable grep options.
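The difference between a loose and an anchored expression can be sketched as follows. This assumes the input is one path per line (as printed by git ls-files), and the sample paths are invented; raw git ls-tree output has extra columns in front of the path, so the anchor would need adjusting there.

```shell
# Sample paths standing in for a file listing (illustrative names).
paths='HelloWorld.pm
lib/HelloWorld.pm
lib/NotHelloWorld.pm
lib/HelloWorld.pm.bak
lib/HelloWorldXpm'

# Unanchored: "." matches any character and the name may appear anywhere.
loose=$(printf '%s\n' "$paths" | grep -c 'HelloWorld.pm')

# Anchored: the name must follow the start of the line or a "/",
# the dot is literal, and the name must end the line.
strict=$(printf '%s\n' "$paths" | grep -E '(^|/)HelloWorld\.pm$')

echo "loose matches: $loose"
printf '%s\n' "$strict"
```

The loose pattern matches all five sample lines, while the anchored one keeps only HelloWorld.pm and lib/HelloWorld.pm.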
For deep histories the output might be too noisy. I thought about a script that converts a list of revisions into a range, the opposite of what git rev-list does, but so far it has remained a thought.
Answered by Greg Hewgill
Try:
git ls-tree -r HEAD | grep HelloWorld.pm
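A small variation, offered as a sketch rather than as part of the original answer: adding --name-only makes git ls-tree print just the paths, so grep cannot accidentally match the mode or object-ID columns. The repository, file names, and commit identity below are placeholders.

```shell
# Scratch repository with one commit (names and identity are placeholders).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir -p deep/down
touch HelloWorld.pm deep/down/HelloWorld.pm other.txt
git add .
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'initial commit'

# List only the paths recorded in HEAD, then filter with a fixed string.
found=$(git ls-tree -r --name-only HEAD | grep -F 'HelloWorld.pm')
printf '%s\n' "$found"
```

Unlike git ls-files, this reads the committed tree, so files that are staged but not yet committed do not appear.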
Answered by Bull
git ls-files | grep -i HelloWorld.pm
The -i option makes grep case-insensitive.
Answered by Dean Hall
[It's a bit of comment abuse, I admit, but I can't comment yet and thought I would improve @uwe-geuder's answer.]
#!/bin/bash
#
# Search every revision for paths containing the given file name.
#
# I'm using a fixed string here, not a regular expression, but you can easily
# use a regular expression by altering the call to grep below.
name="$1"
# Verify usage.
if [[ -z "$name" ]]
then
    echo "Usage: $(basename "$0") <file name>" 1>&2
    exit 100
fi
# Search all revisions; get unique results.
while IFS= read -r rev
do
    # Find $name in $rev's tree and only use its path (the fourth field).
    grep -F -- "$name" \
        <(git ls-tree --full-tree -r "$rev" | awk '{ print $4 }')
done < \
    <(git rev-list --all) \
    | sort -u
Again, +1 to @uwe-geuder for a great answer.
If you're interested in the Bash details:
Unless you can rely on how word splitting behaves in a for loop (as when using an array like this: for item in "${array[@]}"), I highly recommend using while IFS= read var ; do ... ; done < <(command) when the command output you're looping over is separated by newlines (or read -d '' when the output is separated by the null character $'\0'). While git rev-list --all is guaranteed to use 40-byte hexadecimal strings (without spaces), I never like to take chances. I can now easily change the command from git rev-list --all to any command that produces lines.
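The loop idioms described above can be compared side by side. This is a minimal sketch that assumes Bash (process substitution is not available in plain sh); the sample file names are invented.

```shell
#!/usr/bin/env bash
# Two "lines" of input, the first of which contains a space.
lines=$(printf 'one file.pm\ntwo.pm\n')

# for loop over unquoted output: word splitting breaks on the space too.
for_count=0
for item in $lines; do
    for_count=$((for_count + 1))
done

# while/read loop: one iteration per line, spaces preserved.
while_count=0
while IFS= read -r item; do
    while_count=$((while_count + 1))
done < <(printf '%s\n' "$lines")

# NUL-separated input: read -d '' consumes up to each NUL byte.
nul_count=0
while IFS= read -r -d '' item; do
    nul_count=$((nul_count + 1))
done < <(printf 'one file.pm\0two.pm\0')

echo "for: $for_count, while: $while_count, nul: $nul_count"
```

Because the loops read from a process substitution rather than a pipe, they run in the current shell and the counters keep their values after the loop ends.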
I also recommend using built-in BASH mechanisms to inject input and filter output instead of temporary files.
Answered by dirkjot
The script by Uwe Geuder (@uwe-geuder) is great, but there really is no need to dump each of the ls-tree outputs, unfiltered, into its own file.
This is much faster and uses less storage: run grep on each output first and store only the matches, as shown in this gist
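The gist itself is not reproduced here, so the following is only a sketch of the filter-first idea under my own assumptions: each revision's listing is piped through grep, and only the matching paths are kept, prefixed with the revision. The repository, commit identity, and the sed-based prefixing are illustrative choices.

```shell
# Scratch repository: add a file in one commit, remove it in the next.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
touch HelloWorld.pm
git add .
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'add file'
git rm -q HelloWorld.pm
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'remove file'

# Filter each revision's listing before keeping anything.
results=$(
    git rev-list --all | while IFS= read -r rev; do
        git ls-tree --full-tree -r --name-only "$rev" |
            grep -F -- 'HelloWorld.pm' |
            sed "s|^|$rev:|"
    done
)
printf '%s\n' "$results"
```

The file no longer exists in the latest commit, yet the older revision that contained it is still reported, and nothing is written to disk along the way.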