Original source: http://stackoverflow.com/questions/4110652/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to substitute text from files in git history?
Asked by Tom
I've always used an interface based git client (smartGit) and thus don't have much experience with the git console.
However, I now face the need to substitute a string in all .txt files from history (so, not erasing the whole file but just substituting a string). I found the following command:
git filter-branch --tree-filter 'git ls-files -z "*.php" |xargs -0 perl -p -i -e "s#(PASSWORD1|PASSWORD2|PASSWORD3)#xXxXxXxXxXx#g"' -- --all
I tried this, and unfortunately noticed that while the password did get changed, all binary files got corrupted. Images, etc. would all be corrupted.
Is there a better way to do this that won't corrupt my binary files?
Thanks.
EDIT:
I got mixed up with something. The actual code that caused binary files to get corrupted was:
$ git filter-branch --tree-filter "find . -type f -exec sed -i -e 's/originalpassword/newpassword/g' {} \;"
Strangely enough, the code at the top actually removed all files containing my password.
Accepted answer by jweyrich
You can avoid touching undesired files by passing -name "pattern" to find.
This works for me:
git filter-branch --tree-filter "find . -name '*.php' -exec sed -i -e \
's/originalpassword/newpassword/g' {} \;"
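If the files to change cannot be picked out by name, another way to leave binaries alone is to let grep decide which files are text. The following is only a sketch run on a scratch directory, not inside filter-branch: the directory and the file names config.php and image.bin are made up for the demo, and it assumes a grep that supports -r, -I and -l (GNU and BSD grep both do). File names containing newlines would break the loop.

```shell
#!/bin/sh
# Sketch: replace a string only in files that grep considers text.
# The scratch directory and file names are hypothetical demo values.
demo=$(mktemp -d)
cd "$demo" || exit 1

printf 'pass=originalpassword\n' > config.php
# A fake "binary" file: it holds the same string plus a NUL byte.
printf 'originalpassword\000\001\002' > image.bin
cksum image.bin > before.sum

# -r recurses, -I treats binary files as non-matching,
# -l prints only the names of matching files.
grep -rIl 'originalpassword' . | while IFS= read -r f; do
    # -i.bak is accepted by both GNU and BSD sed; remove the backup afterwards.
    sed -i.bak -e 's/originalpassword/newpassword/g' "$f" && rm -f "$f.bak"
done

cat config.php    # prints: pass=newpassword
cksum image.bin | cmp -s - before.sum && echo "image.bin untouched"
```

The grep/sed loop could presumably be saved in a script and passed to --tree-filter in the same way as the find command above, but that combination is untested here.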
Answer by Roberto Tyley
I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch specifically designed for rewriting files from Git history.
You should carefully follow the steps here: https://rtyley.github.io/bfg-repo-cleaner/#usage, but the core bit is just this: download the BFG's jar (requires Java 7 or above) and run this command:
$ java -jar bfg.jar --replace-text replacements.txt -fi '*.php' my-repo.git
The replacements.txt file should contain all the substitutions you want to do, in a format like this (one entry per line; note that the comments shouldn't be included):
PASSWORD1 # Replace literal string 'PASSWORD1' with '***REMOVED***' (default)
PASSWORD2==>examplePass # replace with 'examplePass' instead
PASSWORD3==> # replace with the empty string
regex:password=\w+==>password= # Replace, using a regex
regex:\r(\n)==>$1 # Replace Windows newlines with Unix newlines
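Since the comments must not end up in the real file, one way to create it is a quoted here-document. This is a small sketch that reuses four of the entries above; the scratch directory is an assumption made only to keep the demo self-contained.

```shell
#!/bin/sh
# Sketch: write a comment-free replacements.txt into a scratch directory.
demo=$(mktemp -d)

# Quoting 'EOF' stops the shell from touching backslashes such as \w.
cat > "$demo/replacements.txt" <<'EOF'
PASSWORD1
PASSWORD2==>examplePass
PASSWORD3==>
regex:password=\w+==>password=
EOF

cat "$demo/replacements.txt"
```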
Your entire repository history will be scanned, and .php files (under 1MB in size) will have the substitutions performed: any matching string (that isn't in your latest commit) will be replaced.
Full disclosure: I'm the author of the BFG Repo-Cleaner.
Answer by Nay
I created a file at /usr/local/git/findsed.sh with the following contents:
find . -name 'githubDirToSubmodule.sh' -exec sed -i '' -e 's/What I want to remove//g' {} \;
I ran the command:
git filter-branch --tree-filter "sh /usr/local/git/findsed.sh"
Explanation of commands
When you run git filter-branch, this goes through each revision that you ever committed, one by one. --tree-filter runs the findsed.sh script on each committed revision, saves it, then progresses to the next revision.
The find command finds a specific file or set of files and executes (-exec) the sed editor on each one. sed takes the regex after s/ and replaces every match with the string between the next / and /g (blank in my example). {} is a placeholder for the file path that find produced. The file path is fed to sed, so that sed knows what to work on. \; just ends the -exec command.
Separating the shell script and the command into separate pieces makes the quoting ('' versus "") less complicated.
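The find and sed behavior described above can be tried outside Git first. The sketch below runs in a scratch directory; the directory layout and the second file other.sh are made up for the demo, and it uses sed -i.bak instead of sed -i '' so that the same line runs with both GNU and BSD sed.

```shell
#!/bin/sh
# Sketch: find picks files by name, sed edits each one in place.
# The scratch directory and other.sh are hypothetical demo values.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'keep What I want to remove this\n' > "$demo/sub/githubDirToSubmodule.sh"
printf 'What I want to remove\n' > "$demo/other.sh"

# {} is replaced by each path that find matched; \; ends the -exec command.
find "$demo" -name 'githubDirToSubmodule.sh' \
    -exec sed -i.bak -e 's/What I want to remove//g' {} \;

# Clean up the backups that -i.bak left behind.
find "$demo" -name '*.bak' -exec rm -f {} \;

cat "$demo/sub/githubDirToSubmodule.sh"   # the phrase is gone
cat "$demo/other.sh"                      # not matched by -name, so unchanged
```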
Peculiarities
I successfully implemented this on a Mac, and apparently sed is a particular (older?) version on Macs. This matters, as it sometimes behaves differently. Make sure to use sed -i ''; otherwise sed created backup files whose names ended in "-e", thinking that was the suffix I wanted for my backup files. -i '' says: don't make backup files, just edit the files in place.
Specifying -name 'filename.sh' helped me avoid another issue that I could not solve. There was another .sh file that ended without a newline character. For some reason, sed would add a newline character to the end, even though 's/blah/blah/g' did not match anything in that file. So instead of figuring out that issue, I just told find to ignore all other files.
Additional commands that work
Additionally, I found these commands to work in the findsed.sh file (only one command at a time, not multiple, so comment the others out with #):
find . -name '.publishNewZenPackFromGithub.sh.swp' -exec rm -f {} \;
find . -name '*' -exec grep -H PassToRemove {} \;
Enjoy!
Answer by Ben Hymanson
Could be a shell expansion issue. If filter-branch is losing the quotes around "*.php" by the time it evaluates the command, it may be expanding to nothing, thus git ls-files -z listing all files.
You could check the filter-branch source or try different quoting tricks, but what I'd do is just make a one-line shell script that does your tree-filter and pass that script instead.
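To see what the shell does to a glob once its quotes are lost, the following sketch compares the two cases in a scratch directory. The file names are made up, and it only illustrates ordinary shell expansion; whether this is what actually happened inside filter-branch in the question is not confirmed.

```shell
#!/bin/sh
# Sketch: an unquoted glob is expanded by the shell before the command runs.
# The scratch directory and file names are hypothetical demo values.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch a.php b.php notes.txt

# Unquoted: the shell replaces *.php with the matching file names.
set -- *.php
unquoted_count=$#
echo "unquoted: $unquoted_count arguments: $*"

# Quoted: the pattern is passed through literally as a single argument.
set -- "*.php"
quoted_count=$#
quoted_first=$1
echo "quoted: $quoted_count argument: $quoted_first"
```

Putting the filter in its own script file, as suggested above, means the pattern only has to survive one level of quoting.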
Answer by VonC
With Git 2.24 (Q4 2019), git filter-branch (and BFG) is deprecated.
The equivalent, using newren/git-filter-repo and its example section, would be:
cd repo
git filter-repo --path-glob '*.txt' --replace-text expressions.txt
with expressions.txt:
literal:originalpassword==>newpassword