Why does Git treat this text file as a binary file?
Original source: http://stackoverflow.com/questions/6855712/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by nacho4d
I wonder why Git tells me this:
$ git diff MyFile.txt
diff --git a/MyFile.txt b/MyFile.txt
index d41a4f3..15dcfa2 100644
Binary files a/MyFile.txt and b/MyFile.txt differ
Aren't they text files?
I have checked the .gitattributes and it is empty. Why am I getting this message? I can no longer get diffs as I used to.
ADDED:
I've noticed there is an @ in the file permissions. What is this? Could this be the reason?
$ ls -all
drwxr-xr-x 5 nacho4d staff 170 28 Jul 17:07 .
drwxr-xr-x 16 nacho4d staff 544 28 Jul 16:39 ..
-rw-r--r--@ 1 nacho4d staff 6148 28 Jul 16:15 .DS_Store
-rw-r--r--@ 1 nacho4d staff 746 28 Jul 17:07 MyFile.txt
-rw-r--r-- 1 nacho4d staff 22538 5 Apr 16:18 OtherFile.txt
Accepted answer by Philip Oakley
It simply means that Git inspects the actual content of the file. It doesn't know that any given extension is not a binary file; you can use the attributes file if you want to tell it explicitly (see the man pages).
Having inspected the file's contents, it has seen stuff that isn't basic ASCII characters. Being UTF-16, I expect that it will have 'funny' characters, so Git thinks it's binary.
There are ways of telling Git if you have internationalisation (i18n) or extended character formats for the file. I'm not sufficiently up on the exact method for setting that; you may need to RT[Full]M ;-)
Edit: a quick search of SO found can-i-make-git-recognize-a-utf-16-file-as-text, which should give you a few clues.
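To make the UTF-16 point concrete, here is a minimal Python sketch (an illustration, not Git's own code). It encodes one ASCII-only string as UTF-8 and as UTF-16 and counts the zero bytes in each. The sample string and the helper name count_nul are made up for this example. Git's built-in content check looks for a NUL byte near the start of the data, so the UTF-16 bytes are the ones likely to be flagged as binary.

```python
# Illustrative only: the sample text is made up, not taken from the question.
sample = "Hello, Git\n"

utf8_bytes = sample.encode("utf-8")
utf16_bytes = sample.encode("utf-16-le")  # little-endian UTF-16, no BOM


def count_nul(data: bytes) -> int:
    """Return how many zero (NUL) bytes the buffer contains."""
    return data.count(b"\x00")


# Every ASCII character becomes two bytes in UTF-16, one of which is NUL.
print("UTF-8 NUL bytes: ", count_nul(utf8_bytes))
print("UTF-16 NUL bytes:", count_nul(utf16_bytes))
```

The UTF-8 bytes contain no NUL at all, while the UTF-16 bytes contain one per character, which is why plain-looking text can still be reported as "Binary files differ".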
Answer by naitsirch
If you have not set the type of a file, Git tries to determine it automatically, and a file with really long lines and maybe some wide characters (e.g. Unicode) is treated as binary. With the .gitattributes file you can define how Git interprets the file. Setting the diff attribute manually makes Git interpret the file content as text and produce a usual diff.
Just add a .gitattributes file to your repository root folder and set the diff attribute on the paths or files. Here's an example:
src/Acme/DemoBundle/Resources/public/js/i18n/* diff
doc/Help/NothingToSay.yml diff
*.css diff
If you want to check whether there are attributes set on a file, you can do that with the help of git check-attr:
git check-attr --all -- src/my_file.txt
Another nice reference about Git attributes can be found here.
Answer by Hemant
I was having this issue where Git GUI and SourceTree were treating Java/JS files as binary, so I couldn't see the differences.
Creating a file named "attributes" in the .git\info folder with the following content solved the problem:
*.java diff
*.js diff
*.pl diff
*.txt diff
*.ts diff
*.html diff
If you would like to make this change for all repositories, you can add an attributes file in the following location: $HOME/.config/git/attributes
Answer by Chris Murphy
Git will even determine that it is binary if you have one super-long line in your text file. I broke up a long String, turning it into several source code lines, and suddenly the file went from being 'binary' to a text file that I could see (in SmartGit).
So don't keep typing too far to the right without hitting 'Enter' in your editor - otherwise later on Git will think you have created a binary file.
Answer by deadlydog
I had this same problem after editing one of my files in a new editor. It turns out the new editor used a different encoding (Unicode) than my old editor (UTF-8). So I simply told my new editor to save my files as UTF-8, and then Git showed my changes properly again and didn't see the file as binary.
I think the problem was simply that Git doesn't know how to compare files of different encoding types. So the encoding type that you use really doesn't matter, as long as it remains consistent.
I didn't test it, but I'm sure that if I had just committed my file with the new Unicode encoding, the next time I made changes to that file Git would have shown the changes properly and not detected it as binary, since it would then have been comparing two Unicode-encoded files, and not a UTF-8 file to a Unicode file.
You can use an app like Notepad++ to easily see and change the encoding type of a text file. Open the file in Notepad++ and use the Encoding menu in the toolbar.
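If the new editor's "Unicode" option means UTF-16 (an assumption; that is what the label usually means in Windows editors), the re-save step can also be scripted. The sketch below is one hedged way to do it in Python. The function name reencode_to_utf8 and the demo file are hypothetical, not part of the original answer.

```python
import tempfile
from pathlib import Path


def reencode_to_utf8(path: Path, source_encoding: str = "utf-16") -> None:
    """Rewrite the file at `path` as UTF-8, decoding it with `source_encoding` first.

    Bytes are read and written directly so that line endings are left untouched.
    """
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))


# Demo on a throwaway file (hypothetical name) in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "MyFile.txt"
    demo.write_bytes("some text\r\n".encode("utf-16"))  # written with a BOM
    reencode_to_utf8(demo)
    print(demo.read_bytes())
```

The function overwrites the file in place, so it is safer to try it on a copy first.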
Answer by howard
I had the same problem. I found this thread when I searched for a solution on Google, but I still didn't find any clue. After some study, though, I think I found the reason; the example below should explain it clearly.
echo "new text" > new.txt
git add new.txt
git commit -m "dummy"
For now, the file new.txt is considered a text file.
echo -e "newer text\000" > new.txt
git diff
You will get this result:
diff --git a/new.txt b/new.txt
index fa49b07..410428c 100644
Binary files a/new.txt and b/new.txt differ
Now try this:
git diff -a
You will get the following:
diff --git a/new.txt b/new.txt
index fa49b07..9664e3f 100644
--- a/new.txt
+++ b/new.txt
@@ -1 +1 @@
-new text
+newer text^@
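The experiment above suggests the rule Git applies when no attribute is set: a NUL byte early in the content marks the file as binary. A rough Python approximation of that check, assuming the 8000-byte window used in Git's source, could look like this. The function name looks_binary is made up, and GUI clients may apply different rules of their own.

```python
# Size of the leading window that is examined (assumption mirroring Git's source).
FIRST_FEW_BYTES = 8000


def looks_binary(data: bytes) -> bool:
    """Approximate Git's check: a NUL byte in the leading bytes means binary."""
    return b"\x00" in data[:FIRST_FEW_BYTES]


# The two versions of new.txt from the example above.
print(looks_binary(b"new text\n"))
print(looks_binary(b"newer text\x00\n"))
```

Only the second call reports a binary file, matching the "Binary files differ" message; `git diff -a` simply skips this check and treats the content as text.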
Answer by StuFF mc
We had this case where an .html file was seen as binary whenever we tried to make changes in it. Very uncool to not see diffs. To be honest, I didn't check all the solutions here, but what worked for us was the following:
1. Removed the file (actually moved it to my Desktop) and committed the git deletion. Git says: Deleted file with mode 100644 (Regular) Binary file differs
2. Re-added the file (actually moved it from my Desktop back into the project). Git says: New file with mode 100644 (Regular) 1 chunk, 135 insertions, 0 deletions. The file is now added as a regular text file.
From now on, any change I make in the file is seen as a regular text diff. You could also squash these commits (1, 2, and 3 being the actual change you make), but I prefer to be able to see in the future what I did. Squashing 1 & 2 will show a binary change.
Answer by patricktokeeffe
Per this helpful answer, you can use the file utility to see how a file is classified, which hints at why Git treats it in a particular way:
cd directory/of/interest
file *
It produces useful output like this:
$ file *
CR6Series_stats resaved.dat: ASCII text, with very long lines, with CRLF line terminators
CR6Series_stats utf8.dat: UTF-8 Unicode (with BOM) text, with very long lines, with CRLF line terminators
CR6Series_stats.dat: ASCII text, with very long lines, with CRLF line terminators
readme.md: ASCII text, with CRLF line terminators
Answer by Robba
This is also caused (on Windows at least) by text files that have UTF-8 with BOM encoding. Changing the encoding to regular UTF-8 immediately made Git see the file as type=text.
Answer by Erik Zivkovic
I had an instance where .gitignore contained a double \r (carriage return) sequence on purpose.
That file was identified as binary by Git. Adding a .gitattributes file helped.
# .gitattributes file
.gitignore diff
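For the UTF-8 with BOM case described in Robba's answer above, the conversion to regular UTF-8 amounts to dropping the three-byte marker at the start of the file. A minimal Python sketch of that step follows. The function name strip_utf8_bom and the sample bytes are hypothetical, and whether removing the BOM changes how a given Git client classifies the file is something to verify locally.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Made-up sample content: one buffer with a BOM, one without.
with_bom = codecs.BOM_UTF8 + b"readme\r\n"
print(strip_utf8_bom(with_bom))
print(strip_utf8_bom(b"readme\r\n"))
```

Both calls print the same bytes, since content without a BOM is returned unchanged.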