Source: http://stackoverflow.com/questions/12427779/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How do you make Git ignore spaces and tabs?
Asked by CommaToast
I have a small scripting project that consists of five different source files in one directory called "Droid XX-XX-XX". Each time I created a new backup copy of the source directory, I put the date in the X's. So there are about 15 different versions from different dates. I want to add each of these to my bare new Git repository starting from the earliest.
However I have run into several problems.
1. One problem is that some of the files use tabs for indentation, while others use spaces -- but Git treats a whole line as different even when the only difference is the tab vs. space issue. How can I make Git ignore indentation formatting?
2. Another problem is that some filenames have no spaces while others have spaces between the words -- but Git treats them as different files. Worse, sometimes the filename was changed to something different (like "PatrolPlan" changed to just "Patrol") for no real reason. When I'm adding a new set of files, how can I tell Git that even though the filename is different, it's really just a new version of a certain older file? Or better yet, can I set it to auto-detect when this happens?
3. The last problem is that at certain points during development, we merged two source files into one, or split one into two -- but Git doesn't automatically detect the similarities and deduce what happened. How can I tell Git what happened? Or better yet, how can I set it to auto-detect when two source files were combined or when one was split up?
I realize questions (2) and (3) are highly related. Thanks for any assistance!
Answered by Kelvin
It's sounding like you need more control and standardization of the development process. The one who commits changes should be the same person who modifies the files. Or at least the committer should know exactly what changed.
Carefully examine the output of git diff, and use the -w flag to ignore whitespace. There are also options to show differences within a line; see "Diffs within a line" below.
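As a rough illustration of the -w flag, the sketch below builds a throwaway repository and re-indents one line from a tab to four spaces. The directory, file name, and file contents are invented for the demo. It counts changed lines rather than relying on the diff being completely empty.

```shell
# Throwaway repository; the path, file name, and contents are made up for this demo.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Commit a file whose second line is indented with a tab.
printf 'if ready\n\tpatrol\n' > plan.txt
git add plan.txt
git commit -q -m "Add plan (tab indentation)"

# Re-indent the same line with four spaces; nothing else changes.
printf 'if ready\n    patrol\n' > plan.txt

# Count added/removed lines in a diff, skipping the ---/+++ file headers.
count_changed() { grep -c '^[-+][^-+]' || true; }

plain_changed=$(git diff | count_changed)    # the line counts as removed and re-added
ws_changed=$(git diff -w | count_changed)    # -w ignores the tab-vs-space difference

echo "changed lines without -w: $plain_changed"
echo "changed lines with -w:    $ws_changed"
```

This only changes what git diff displays; the whitespace change is still part of what gets committed.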
Note that you won't be able to tell git to skip the space changes when committing. I suggest using GitX (I prefer the "brotherbard" fork), which allows you to interactively discard hunks before committing.
Use descriptive messages when committing. For example, if a file was split, say so. Make your commits small. If you find yourself writing long commit messages, break up the commit into smaller parts. That way when you examine the logs a long time later, it will make more sense what changed.
Diffs within a line
Git has some ability to show "word" differences within a single line. The simplest way is to just use git diff --color-words.
However, I like customizing the meaning of a "word" using the diff.wordRegex config. I also like the plain word-diff format because it shows more clearly where the differences are (it inserts brackets around the changes in addition to using color).
Command:
git diff --word-diff=plain
along with this in my config:
[diff]
wordRegex = [[:alnum:]_]+|[^[:alnum:]_[:space:]]+
This regex treats these as "words":
- consecutive strings of alphanumerics and underscores
- consecutive strings of non-alphanumerics, non-underscores, and non-spaces (good for detecting operators)
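To see how that regex behaves, here is a small sketch that stores the same diff.wordRegex value in a scratch repository's local config and changes only an operator in one line. The repository path, file name, and file contents are assumptions made up for the example.

```shell
# Scratch repository; names and contents are invented for illustration.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Same value as the [diff] wordRegex setting above, stored for this repository only.
git config diff.wordRegex '[[:alnum:]_]+|[^[:alnum:]_[:space:]]+'

printf 'total = count+1\n' > calc.txt
git add calc.txt
git commit -q -m "Add calc"

# Change only the operator between "count" and "1".
printf 'total = count-1\n' > calc.txt

# In the plain format, removed words are wrapped in [-...-] and added words in {+...+}.
word_diff=$(git diff --word-diff=plain)
printf '%s\n' "$word_diff"
```

Because the regex treats runs of punctuation as their own words, only the operator should be marked as changed, not the whole "count+1" token.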
You must have a recent version of git to use wordRegex. Check your git-config man page to see whether the option is listed.
UPDATE
If you use git mv to rename a file (which is preferable to using another tool or the OS to rename), you can see git detecting the rename. I highly recommend committing a rename independently of any edits to the contents of the file. That's because git doesn't actually store the fact that you renamed - it uses a heuristic based on how much the file has changed to guess whether it was the same file. The less you change it during the rename-commit, the better.
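A minimal sketch of the rename-only commit, assuming a scratch repository and the file names from the question with a made-up .txt extension. It commits the rename by itself, edits the file in a second commit, and then asks git what it detected.

```shell
# Scratch repository; file names borrow "PatrolPlan"/"Patrol" from the question,
# the .txt extension and contents are invented.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

seq 1 40 > PatrolPlan.txt
git add PatrolPlan.txt
git commit -q -m "Add PatrolPlan"

# Commit 1: the rename only, contents untouched.
git mv PatrolPlan.txt Patrol.txt
git commit -q -m "Rename PatrolPlan.txt to Patrol.txt"

# Commit 2: edit the contents under the new name.
echo "41" >> Patrol.txt
git commit -q -am "Extend patrol list"

# -M turns on rename detection; an unmodified rename is reported with status R100.
rename_status=$(git show -M --name-status --format= HEAD~1)
printf '%s\n' "$rename_status"

# --follow keeps walking the file's history across the rename.
followed=$(git log --follow --format=%s -- Patrol.txt)
printf '%s\n' "$followed"
```

Nothing about the rename is recorded in the commit itself; the R100 status is computed when the diff is displayed.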
If you did change the file contents slightly, you can pass the -C option to git diff and git log to make them try harder to detect copies and renames. Add a percentage (e.g. -C75%) to set the similarity threshold. The percentage represents how similar the contents have to be to be considered a match, so a lower value makes git more lenient about differences.
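The effect of the percentage can be sketched with a file that is renamed and partly rewritten in the same commit. The file names, contents, and the two thresholds (90% and 50%) are arbitrary choices for this demo, picked so that the file's similarity falls between them.

```shell
# Scratch repository; file names, contents, and thresholds are invented for the demo.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

seq 1 100 > Plan.txt
git add Plan.txt
git commit -q -m "Add Plan"

# Rename AND rewrite the last 20 of 100 lines in a single commit.
git mv Plan.txt Patrol.txt
{ seq 1 80; seq 81 100 | sed 's/^/x/'; } > Patrol.txt
git add Patrol.txt
git commit -q -m "Rename and edit in one step"

# With a strict threshold the files are too different: reported as a delete plus an add.
strict=$(git show -C90% --name-status --format= HEAD)
# With a more lenient threshold the same commit is reported as a rename (status R...).
lenient=$(git show -C50% --name-status --format= HEAD)

printf 'strict (-C90%%):\n%s\n' "$strict"
printf 'lenient (-C50%%):\n%s\n' "$lenient"
```

The same commit is displayed two different ways, which is a reminder that rename detection happens at display time rather than at commit time.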
Answered by CommaToast
Now that I know a lot more about Git, I can answer my own questions.
1. It would be better to do a global search-replace using regex to standardize the whitespace in all the files across the different versions of the project, so that when they are committed in sequence, the whitespace changes never show up in a commit. That being said, Atlassian SourceTree's diff tool allows you to hide whitespace changes, so at least you won't see those.
2. The key to dealing with filename changes is to make a commit where only the file's name changes (don't stage any other changes). Then make a commit where its contents change. That way, normal diff tools that don't do a ton of heuristics and deep digging can make sense of what happened. The problem is that if too much changes about a file, like the name AND a lot of the contents, then most diff tools will treat it as a deletion plus a new file (as mentioned in the correct answer).
3. This is a tougher one; there's no really good way around it. If you split a file into two, or merge two, it will just be ugly in the diff. Try not to make lots of changes at the same time as doing the split, so that the split will be one thing and subsequent changes will be another.
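The whitespace standardization from the first point could look roughly like the sketch below, run over the backup directories before any of them is committed. The directory dates, the file name, and the choice of four spaces per tab are all assumptions. It writes to a temporary file instead of using sed -i, whose syntax differs between platforms.

```shell
# Two hypothetical backup directories, one indented with tabs and one with spaces.
work=$(mktemp -d)
cd "$work"
mkdir "Droid 01-15-12" "Droid 02-20-12"
printf 'if ready\n\tpatrol\n' > "Droid 01-15-12/plan.txt"
printf 'if ready\n    patrol\n' > "Droid 02-20-12/plan.txt"

tab=$(printf '\t')

# Replace every tab with four spaces (an assumed width), via a temporary file.
normalize_tabs() {
    sed "s/${tab}/    /g" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

for f in Droid*/*.txt; do
    normalize_tabs "$f"
done

# After normalization the two versions should be byte-for-byte identical.
if cmp -s "Droid 01-15-12/plan.txt" "Droid 02-20-12/plan.txt"; then
    same_after=yes
else
    same_after=no
fi
echo "identical after normalization: $same_after"
```

This blunt version also rewrites tabs that are not indentation, so it is worth checking the result before committing.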
Answered by trojanfoe
You won't be able to make git ignore tabs/spaces as git creates a hash of each file and if the hash is different the file is considered different.
Git treats trees (directories) the same way as files; if their content changes then they are different trees.
I don't think these changes are anything to worry about, however; they happen during any development. I think the best approach for you is to replay your development using git. In other words, start with your initial version and then make the necessary changes (as you did originally), and git will remember what you are doing.
Optional: If you want the recorded date/time of the changes to roughly match when they were originally made, you can use the --date command line option to git commit to tell git when these changes were made.
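As a sketch of replaying history with backdated commits, the example below commits two versions of a file and passes --date each time. The dates, file name, and contents are hypothetical stand-ins for the real backup snapshots. Note that --date sets the author date of the commit.

```shell
# Scratch repository; dates and file contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# First snapshot, recorded with the date of the original backup.
printf 'version 1\n' > plan.txt
git add plan.txt
git commit -q --date="2012-01-15T12:00:00" -m "Snapshot from 2012-01-15"

# Second snapshot, again backdated.
printf 'version 2\n' > plan.txt
git add plan.txt
git commit -q --date="2012-02-20T12:00:00" -m "Snapshot from 2012-02-20"

# %ad prints the author date; --date=short formats it as YYYY-MM-DD.
first_date=$(git log --format=%ad --date=short HEAD~1 -1)
latest_date=$(git log --format=%ad --date=short HEAD -1)
echo "first: $first_date, latest: $latest_date"
```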